cs.LG (2026-01-02)

📊 10 papers in total | 🔗 1 with code

🎯 Browse by interest area

Pillar 2: RL Algorithms & Architecture (5) · Pillar 9: Embodied Foundation Models (4, 🔗 1) · Pillar 1: Robot Control (1)

🔬 Pillar 2: RL Algorithms & Architecture (5 papers)

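# | Title | One-line summary | Tags | 🔗
1 | IRPO: Scaling the Bradley-Terry Model via Reinforcement Learning | Proposes IRPO, which scales the Bradley-Terry model via reinforcement learning to improve the efficiency of generative reward models. | reinforcement learning, chain-of-thought |
2 | Avatar Forcing: Real-Time Interactive Head Avatar Generation for Natural Conversation | Proposes the Avatar Forcing framework for real-time interactive head avatar generation in natural conversation. | direct preference optimization, multimodal |
3 | The Reasoning-Creativity Trade-off: Toward Creativity-Driven Problem Solving | Proposes the DCR framework to address the reasoning-creativity trade-off in LLMs, unifying correctness and creativity. | DPO, large language model |
4 | Traffic-Aware Optimal Taxi Placement Using Graph Neural Network-Based Reinforcement Learning | Proposes a traffic-aware taxi dispatch optimization method based on graph neural network reinforcement learning to improve urban travel efficiency. | reinforcement learning |
5 | ARISE: Adaptive Reinforcement Integrated with Swarm Exploration | ARISE: an adaptive reinforcement learning framework integrated with swarm exploration that improves exploration capability. | reinforcement learning, PPO |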

🔬 Pillar 9: Embodied Foundation Models (4 papers)

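# | Title | One-line summary | Tags | 🔗
6 | Memory Bank Compression for Continual Adaptation of Large Language Models | Proposes the MBC model, which compresses the memory bank to enable continual adaptation of large language models and significantly reduces storage cost. | large language model |
7 | Bayesian Inverse Games with High-Dimensional Multi-Modal Observations | Proposes a Bayesian inverse game framework based on variational autoencoders for multi-agent objective inference and uncertainty quantification. | multimodal |
8 | Geometry of Reason: Spectral Signatures of Valid Mathematical Reasoning | Proposes a training-free spectral analysis method that detects the validity of mathematical reasoning by analyzing LLM attention patterns. | large language model |
9 | HFedMoE: Resource-aware Heterogeneous Federated Learning with Mixture-of-Experts | HFedMoE: a heterogeneous federated MoE learning framework for resource-constrained devices. | large language model |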

🔬 Pillar 1: Robot Control (1 paper)

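# | Title | One-line summary | Tags | 🔗
10 | Adversarial Samples Are Not Created Equal | Distinguishes adversarial samples by whether they exploit brittle features, and re-evaluates the adversarial robustness of deep networks. | manipulation |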
