cs.LG(2026-01-23)
📊 10 papers in total | 🔗 1 with code
🎯 Interest Area Navigation
Pillar 9: Embodied Foundation Models (4 🔗1)
Pillar 2: RL & Architecture (4)
Pillar 1: Robot Control (1)
Pillar 4: Generative Motion (1)
🔬 Pillar 9: Embodied Foundation Models (4 papers)
| # | Title | One-Sentence Summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 1 | Rethinking Large Language Models For Irregular Time Series Classification In Critical Care | Studies and optimizes the encoder and alignment strategies of large language models for irregular ICU time series classification. | large language model multimodal | ✅ | |
| 2 | Beyond Superficial Unlearning: Sharpness-Aware Robust Erasure of Hallucinations in Multimodal LLMs | Proposes SARE, which uses adversarial perturbations to make hallucination erasure in multimodal LLMs more robust. | multimodal | | |
| 3 | Predicting Startup Success Using Large Language Models: A Novel In-Context Learning Approach | Proposes the kNN-ICL framework, which uses large language models to address data scarcity in predicting early-stage startup success. | large language model | | |
| 4 | DANCE: Dynamic, Available, Neighbor-gated Condensation for Federated Text-Attributed Graphs | DANCE: dynamic, available, neighbor-gated condensation for federated text-attributed graphs. | large language model | | |
🔬 Pillar 2: RL & Architecture (4 papers)
| # | Title | One-Sentence Summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 5 | Towards a Theoretical Understanding to the Generalization of RLHF | Proposes a theoretical framework for RLHF that addresses generalization in high-dimensional settings. | reinforcement learning RLHF large language model | | |
| 6 | A Regularized Actor-Critic Algorithm for Bi-Level Reinforcement Learning | Proposes a regularized actor-critic algorithm for solving bi-level reinforcement learning problems. | reinforcement learning RLHF | | |
| 7 | The Trajectory Alignment Coefficient in Two Acts: From Reward Tuning to Reward Learning | Proposes Soft-TAC for learning reward functions from human preference data to improve reinforcement learning performance. | reinforcement learning reward design | | |
| 8 | Endless Terminals: Scaling RL Environments for Terminal Agents | Proposes Endless Terminals for generating terminal tasks at scale to train reinforcement learning agents. | reinforcement learning PPO | | |
🔬 Pillar 1: Robot Control (1 paper)
| # | Title | One-Sentence Summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 9 | GRIP: Algorithm-Agnostic Machine Unlearning for Mixture-of-Experts via Geometric Router Constraints | GRIP: algorithm-agnostic machine unlearning for MoE models via geometric router constraints. | manipulation large language model | | |
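🔬 Pillar 4: Generative Motion (1 paper)
| # | Title | One-Sentence Summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 10 | Auto-Regressive Masked Diffusion Models | Proposes the Auto-Regressive Masked Diffusion (ARMD) model, improving language modeling efficiency and parallel generation capability. | MDM | | |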