cs.LG(2024-10-12)

📊 17 papers in total | 🔗 2 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (8 🔗1) · Pillar 2: RL & Architecture (6 🔗1) · Pillar 1: Robot Control (2) · Pillar 5: Interaction & Reaction (1)

🔬 Pillar 9: Embodied Foundation Models (8 papers)

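| # | Title | One-sentence summary | Tags | 🔗 |
|---|-------|----------------------|------|----|
| 1 | Multimodal Physical Activity Forecasting in Free-Living Clinical Settings: Hunting Opportunities for Just-in-Time Interventions | MoveSense: uses a multimodal LSTM to forecast patients' physical activity, creating opportunities for just-in-time interventions. | multimodal | |
| 2 | ReLU's Revival: On the Entropic Overload in Normalization-Free Large Language Models | ReLU outperforms GELU in LayerNorm-free large language models, improving perplexity. | large language model | |
| 3 | Mastering AI: Big Data, Deep Learning, and the Evolution of Large Language Models -- AutoML from Basics to State-of-the-Art Techniques | AutoML survey: from the basics to state-of-the-art techniques, supporting automated construction of AI models. | large language model | |
| 4 | Training Dynamics of Transformers to Recognize Word Co-occurrence via Gradient Flow Analysis | Studies the training dynamics of Transformers learning to recognize word co-occurrence, using gradient flow analysis. | large language model | |
| 5 | Towards Scalable Semantic Representation for Recommendation | Proposes Mixture-of-Codes to improve the scalability and performance of semantic representations in recommender systems. | large language model | |
| 6 | AT-MoE: Adaptive Task-planning Mixture of Experts via LoRA Approach | Proposes AT-MoE, a LoRA-based adaptive task-planning mixture-of-experts model that improves task-specific performance and interpretability. | large language model | |
| 7 | Towards the Effect of Examples on In-Context Learning: A Theoretical Case Study | Theoretically analyzes how examples affect in-context learning on binary classification, revealing how pre-trained knowledge interacts with the examples. | large language model | |
| 8 | Fine-grained Attention I/O Complexity: Comprehensive Analysis for Backward Passes | Presents a fine-grained I/O complexity analysis of the attention backward pass, aimed at more efficient LLM training. | large language model | |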

🔬 Pillar 2: RL & Architecture (6 papers)

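| # | Title | One-sentence summary | Tags | 🔗 |
|---|-------|----------------------|------|----|
| 9 | SeRA: Self-Reviewing and Alignment of Large Language Models using Implicit Reward Margins | SeRA: self-reviewing and alignment of large language models using implicit reward margins. | reinforcement learning, RLHF, DPO | |
| 10 | Mamba4Cast: Efficient Zero-Shot Time Series Forecasting with State Space Models | Mamba4Cast: efficient zero-shot time series forecasting based on state space models. | Mamba, state space model, foundation model | |
| 11 | Boosting Deductive Reasoning with Step Signals In RLHF | Proposes MuseD, which uses RLHF to improve LLMs' multi-step deductive reasoning. | RLHF, large language model | |
| 12 | ActSafe: Active Exploration with Safety Constraints for Reinforcement Learning | Proposes ActSafe to address safe exploration in reinforcement learning. | reinforcement learning, model-based RL | |
| 13 | TOP-ERL: Transformer-based Off-Policy Episodic Reinforcement Learning | TOP-ERL: Transformer-based off-policy episodic reinforcement learning that improves robot learning performance. | reinforcement learning | |
| 14 | Reinforcement Learning in Hyperbolic Spaces: Models and Experiments | Proposes a reinforcement learning framework in hyperbolic space for exploring unknown environments. | reinforcement learning | |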

🔬 Pillar 1: Robot Control (2 papers)

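| # | Title | One-sentence summary | Tags | 🔗 |
|---|-------|----------------------|------|----|
| 15 | HG2P: Hippocampus-inspired High-reward Graph and Model-Free Q-Gradient Penalty for Path Planning and Motion Control | Proposes HG2P, combining a high-reward graph with a model-free Q-gradient penalty to improve performance on long-horizon navigation and manipulation tasks. | manipulation, reinforcement learning | |
| 16 | Timeseria: an object-oriented time series processing library | Timeseria: an object-oriented Python library for time series processing that aims to simplify time series data manipulation and model building. | manipulation | |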

🔬 Pillar 5: Interaction & Reaction (1 paper)

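| # | Title | One-sentence summary | Tags | 🔗 |
|---|-------|----------------------|------|----|
| 17 | Power-Softmax: Towards Secure LLM Inference over Encrypted Data | Proposes Power-Softmax, enabling secure LLM inference over encrypted data for models with over one billion parameters. | OMOMO | |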
