cs.LG (2024-10-17)

📊 17 papers in total | 🔗 2 with code

🎯 Browse by Interest Area

Pillar 2: RL Algorithms & Architecture (RL & Architecture) (8) · Pillar 9: Embodied Foundation Models (6, 🔗 1) · Pillar 8: Physics-based Animation (2, 🔗 1) · Pillar 1: Robot Control (1)

🔬 Pillar 2: RL Algorithms & Architecture (RL & Architecture) (8 papers)

| # | Title | One-line summary | Tags |
|---|-------|------------------|------|
| 1 | Reward-free World Models for Online Imitation Learning | Proposes an online imitation learning method based on reward-free world models, improving performance on complex tasks | reinforcement learning, imitation learning, world model |
| 2 | Personalized Adaptation via In-Context Preference Learning | Proposes the Preference Pretrained Transformer (PPT), achieving personalized adaptation through in-context preference learning | reinforcement learning, preference learning, RLHF |
| 3 | ORSO: Accelerating Reward Design via Online Reward Selection and Policy Optimization | ORSO accelerates reward design in reinforcement learning via online reward selection and policy optimization | reinforcement learning, reward design, reward shaping |
| 4 | Provable Benefits of Complex Parameterizations for Structured State Space Models | Proves that complex-parameterized structured state space models outperform their real-parameterized counterparts | Mamba, SSM, state space model |
| 5 | An Evolved Universal Transformer Memory | Proposes a neural attention memory model, improving Transformer efficiency and performance on long texts | reinforcement learning, foundation model, zero-shot transfer |
| 6 | CFTS-GAN: Continual Few-Shot Teacher Student for Generative Adversarial Networks | Proposes CFTS-GAN to address overfitting and catastrophic forgetting in continual few-shot learning for GANs | teacher-student |
| 7 | Is Prior-Free Black-Box Non-Stationary Reinforcement Learning Feasible? | Reveals limitations of MASTER, a prior-free black-box non-stationary reinforcement learning algorithm | reinforcement learning |
| 8 | A Common Pitfall of Margin-based Language Model Alignment: Gradient Entanglement | Identifies gradient entanglement as a common pitfall of margin-based language model alignment | reinforcement learning, RLHF |

🔬 Pillar 9: Embodied Foundation Models (6 papers)

| # | Title | One-line summary | Tags |
|---|-------|------------------|------|
| 9 | Unearthing Skill-Level Insights for Understanding Trade-Offs of Foundation Models | Mines skill-level insights from model-generated rationales to understand the trade-offs of foundation models | foundation model |
| 10 | AgentDrug: Utilizing Large Language Models in an Agentic Workflow for Zero-Shot Molecular Optimization | AgentDrug uses large language models in an agentic workflow to achieve zero-shot molecular optimization | large language model |
| 11 | Automatically Interpreting Millions of Features in Large Language Models | Proposes an automated framework that uses large language models to explain the vast feature sets of sparse autoencoders, improving interpretability | large language model |
| 12 | MathGAP: Out-of-Distribution Evaluation on Problems with Arbitrarily Complex Proofs | MathGAP is a dataset and framework for evaluating LLM generalization on problems with arbitrarily complex proofs | large language model, chain-of-thought |
| 13 | How Numerical Precision Affects Arithmetical Reasoning Capabilities of LLMs | Shows that numerical precision significantly affects the arithmetic reasoning capabilities of LLMs | large language model |
| 14 | Progressive Mixed-Precision Decoding for Efficient LLM Inference | Proposes Progressive Mixed-Precision Decoding (PMPD) to accelerate LLM inference while reducing resource requirements | large language model |

🔬 Pillar 8: Physics-based Animation (2 papers)

| # | Title | One-line summary | Tags |
|---|-------|------------------|------|
| 15 | Precipitation Nowcasting Using Diffusion Transformer with Causal Attention | Proposes a precipitation nowcasting model based on a diffusion Transformer with causal attention, significantly improving heavy-rainfall prediction accuracy | spatiotemporal |
| 16 | Artificial Kuramoto Oscillatory Neurons | Proposes Artificial Kuramoto Oscillatory Neurons (AKOrN), using neuronal synchronization dynamics to improve performance across diverse tasks | spatiotemporal |

🔬 Pillar 1: Robot Control (1 paper)

| # | Title | One-line summary | Tags |
|---|-------|------------------|------|
| 17 | WARPD: World model Assisted Reactive Policy Diffusion | Proposes WARPD to address extended action horizons and inference cost | locomotion, manipulation, imitation learning |
