cs.LG (2024-10-12)

📊 17 papers in total | 🔗 2 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (8 🔗1) Pillar 2: RL & Architecture (6 🔗1) Pillar 1: Robot Control (2) Pillar 5: Interaction & Reaction (1)

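🔬 Pillar 9: Embodied Foundation Models (8 papers)

#	Title	One-line summary	Tags	🔗	⭐
1	Multimodal Physical Activity Forecasting in Free-Living Clinical Settings: Hunting Opportunities for Just-in-Time Interventions	MoveSense: uses a multimodal LSTM to forecast patients' physical activity, creating opportunities for just-in-time interventions	multimodal
2	ReLU's Revival: On the Entropic Overload in Normalization-Free Large Language Models	ReLU outperforms GELU in LayerNorm-free large language models, improving perplexity.	large language model	✅
3	Mastering AI: Big Data, Deep Learning, and the Evolution of Large Language Models -- AutoML from Basics to State-of-the-Art Techniques	AutoML survey: from fundamentals to state-of-the-art techniques for automating AI model construction	large language model
4	Training Dynamics of Transformers to Recognize Word Co-occurrence via Gradient Flow Analysis	Studies the training dynamics of Transformers learning to recognize word co-occurrence, using gradient flow analysis	large language model
5	Towards Scalable Semantic Representation for Recommendation	Proposes Mixture-of-Codes to improve the scalability and performance of semantic representations in recommender systems.	large language model
6	AT-MoE: Adaptive Task-planning Mixture of Experts via LoRA Approach	Proposes AT-MoE, a LoRA-based adaptive task-planning mixture-of-experts model that improves task-specific performance and interpretability.	large language model
7	Towards the Effect of Examples on In-Context Learning: A Theoretical Case Study	Theoretically analyzes how examples affect binary classification in in-context learning, revealing how pre-training knowledge interacts with the examples	large language model
8	Fine-grained Attention I/O Complexity: Comprehensive Analysis for Backward Passes	Presents a fine-grained I/O complexity analysis of the attention backward pass, aimed at more efficient LLM training.	large language model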

🔬 Pillar 2: RL & Architecture (6 papers)

#	Title	One-line summary	Tags	🔗	⭐
9	SeRA: Self-Reviewing and Alignment of Large Language Models using Implicit Reward Margins	SeRA: self-reviewing and alignment of large language models using implicit reward margins	reinforcement learning RLHF DPO
10	Mamba4Cast: Efficient Zero-Shot Time Series Forecasting with State Space Models	Mamba4Cast: efficient zero-shot time series forecasting based on state space models	Mamba state space model foundation model	✅
11	Boosting Deductive Reasoning with Step Signals In RLHF	Proposes MuseD, which uses RLHF to improve LLM performance on multi-step deductive reasoning	RLHF large language model
12	ActSafe: Active Exploration with Safety Constraints for Reinforcement Learning	Proposes ActSafe to address safe exploration in reinforcement learning	reinforcement learning model-based RL
13	TOP-ERL: Transformer-based Off-Policy Episodic Reinforcement Learning	TOP-ERL: Transformer-based off-policy episodic reinforcement learning that improves robot learning performance	reinforcement learning
14	Reinforcement Learning in Hyperbolic Spaces: Models and Experiments	Proposes a reinforcement learning framework in hyperbolic space to address exploration of unknown environments	reinforcement learning

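🔬 Pillar 1: Robot Control (2 papers)

#	Title	One-line summary	Tags	🔗	⭐
15	HG2P: Hippocampus-inspired High-reward Graph and Model-Free Q-Gradient Penalty for Path Planning and Motion Control	Proposes HG2P, which combines a high-reward graph with a model-free Q-gradient penalty to improve performance on long-horizon navigation and manipulation tasks	manipulation reinforcement learning
16	Timeseria: an object-oriented time series processing library	Timeseria: an object-oriented Python library for time series processing, designed to simplify time series data handling and model building.	manipulation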

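🔬 Pillar 5: Interaction & Reaction (1 paper)

#	Title	One-line summary	Tags	🔗	⭐
17	Power-Softmax: Towards Secure LLM Inference over Encrypted Data	Proposes Power-Softmax, enabling secure LLM inference over encrypted data for models with more than one billion parameters	OMOMO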
