cs.LG (2024-10-17)
📊 17 papers total | 🔗 2 with code
🎯 Interest Area Navigation
Pillar 2: RL & Architecture (8)
Pillar 9: Embodied Foundation Models (6 🔗1)
Pillar 8: Physics-based Animation (2 🔗1)
Pillar 1: Robot Control (1)
🔬 Pillar 2: RL & Architecture (8 papers)
| # | Title | One-line Summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 1 | Reward-free World Models for Online Imitation Learning | Proposes an online imitation learning method based on reward-free world models, improving performance on complex tasks | reinforcement learning, imitation learning, world model | | |
| 2 | Personalized Adaptation via In-Context Preference Learning | Proposes the Preference Pretrained Transformer (PPT), achieving personalized adaptation through in-context preference learning | reinforcement learning, preference learning, RLHF | | |
| 3 | ORSO: Accelerating Reward Design via Online Reward Selection and Policy Optimization | ORSO accelerates reward design in reinforcement learning through online reward selection and policy optimization | reinforcement learning, reward design, reward shaping | | |
| 4 | Provable Benefits of Complex Parameterizations for Structured State Space Models | Proves that complex-valued parameterizations of structured state space models outperform real-valued ones | Mamba, SSM, state space model | | |
| 5 | An Evolved Universal Transformer Memory | Proposes a neural attention memory model that improves Transformer efficiency and performance on long contexts | reinforcement learning, foundation model, zero-shot transfer | | |
| 6 | CFTS-GAN: Continual Few-Shot Teacher Student for Generative Adversarial Networks | Proposes CFTS-GAN to address overfitting and catastrophic forgetting in continual few-shot learning for GANs | teacher-student | | |
| 7 | Is Prior-Free Black-Box Non-Stationary Reinforcement Learning Feasible? | Reveals limitations of MASTER, a prior-free black-box algorithm for non-stationary reinforcement learning | reinforcement learning | | |
| 8 | A Common Pitfall of Margin-based Language Model Alignment: Gradient Entanglement | Reveals gradient entanglement, a common pitfall of margin-based language model alignment | reinforcement learning, RLHF | | |
🔬 Pillar 9: Embodied Foundation Models (6 papers)
| # | Title | One-line Summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 9 | Unearthing Skill-Level Insights for Understanding Trade-Offs of Foundation Models | Mines skill-level insights from model-generated rationales to understand the trade-offs of foundation models | foundation model | | |
| 10 | AgentDrug: Utilizing Large Language Models in an Agentic Workflow for Zero-Shot Molecular Optimization | AgentDrug uses large language models in an agentic workflow for zero-shot molecular optimization | large language model | | |
| 11 | Automatically Interpreting Millions of Features in Large Language Models | Proposes an automated framework that uses large language models to interpret vast numbers of sparse autoencoder features, improving interpretability | large language model | ✅ | |
| 12 | MathGAP: Out-of-Distribution Evaluation on Problems with Arbitrarily Complex Proofs | MathGAP: a dataset and framework for evaluating LLM generalization on problems with arbitrarily complex proofs | large language model, chain-of-thought | | |
| 13 | How Numerical Precision Affects Arithmetical Reasoning Capabilities of LLMs | Shows that numerical precision significantly affects the arithmetic reasoning capabilities of LLMs | large language model | | |
| 14 | Progressive Mixed-Precision Decoding for Efficient LLM Inference | Proposes Progressive Mixed-Precision Decoding (PMPD) to accelerate LLM inference and reduce resource requirements | large language model | | |
🔬 Pillar 8: Physics-based Animation (2 papers)
| # | Title | One-line Summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 15 | Precipitation Nowcasting Using Diffusion Transformer with Causal Attention | Proposes a precipitation nowcasting model based on a diffusion Transformer with causal attention, significantly improving heavy-precipitation forecast accuracy | spatiotemporal | | |
| 16 | Artificial Kuramoto Oscillatory Neurons | Proposes Artificial Kuramoto Oscillatory Neurons (AKOrN), improving performance on a range of tasks via neuronal synchronization dynamics | spatiotemporal | ✅ | |
🔬 Pillar 1: Robot Control (1 paper)
| # | Title | One-line Summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 17 | WARPD: World model Assisted Reactive Policy Diffusion | Proposes WARPD to address extended action horizons and inference cost | locomotion, manipulation, imitation learning | | |