cs.LG (2026-04-17)

📊 19 papers in total | 🔗 2 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (9) · Pillar 2: RL & Architecture (8 🔗1) · Pillar 1: Robot Control (1) · Pillar 8: Physics-based Animation (1 🔗1)

🔬 Pillar 9: Embodied Foundation Models (9 papers)

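| # | Title | One-line summary | Tags | 🔗 |
|---|-------|------------------|------|----|
| 1 | Sketching the Readout of Large Language Models for Scalable Data Attribution and Valuation | Proposes RISE, which sketches influence hotspots at the LLM output layer to make data attribution and valuation scalable. | large language model | |
| 2 | JumpLoRA: Sparse Adapters for Continual Learning in Large Language Models | JumpLoRA: a sparse-adapter method for continual learning in LLMs. | large language model | |
| 3 | Tabular foundation models for in-context prediction of molecular properties | Proposes in-context prediction of molecular properties with tabular pretrained models, efficient and without fine-tuning. | foundation model | |
| 4 | Joint-Centric Dual Contrastive Alignment with Structure-Preserving and Information-Balanced Regularization | Proposes the HILBERT framework for representation learning of long-sequence audio-visual documents in low-resource settings. | multimodal | |
| 5 | SCRIPT: Implementing an Intelligent Tutoring System for Programming in a German University Context | Builds SCRIPT, an intelligent tutoring system for Python programming at German universities that supports personalized guidance and research. | large language model | |
| 6 | QuantSightBench: Evaluating LLM Quantitative Forecasting with Prediction Intervals | QuantSightBench: evaluates LLM quantitative forecasting through prediction intervals and reveals model calibration problems. | large language model | |
| 7 | DPrivBench: Benchmarking LLMs' Reasoning for Differential Privacy | Proposes DPrivBench, targeting automated reasoning about differential privacy. | large language model | |
| 8 | Breaking the Training Barrier of Billion-Parameter Universal Machine Learning Interatomic Potentials | Proposes MatRIS-MoE and the Janus framework to accelerate training of billion-parameter universal machine learning interatomic potentials. | foundation model | |
| 9 | Faster LLM Inference via Sequential Monte Carlo | Proposes a speculative decoding method based on sequential Monte Carlo that speeds up LLM inference and raises throughput. | instruction following | |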

🔬 Pillar 2: RL & Architecture (8 papers)

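| # | Title | One-line summary | Tags | 🔗 |
|---|-------|------------------|------|----|
| 10 | AEGIS: Anchor-Enforced Gradient Isolation for Knowledge-Preserving Vision-Language-Action Fine-Tuning | AEGIS: anchor-enforced gradient isolation for knowledge-preserving vision-language-action fine-tuning. | flow matching, vision-language-action | |
| 11 | Evaluating the Progression of Large Language Model Capabilities for Small-Molecule Drug Design | Proposes an LLM evaluation framework based on reinforcement-learning post-training, improving small-molecule drug design capability. | reinforcement learning, large language model | |
| 12 | Placing Puzzle Pieces Where They Matter: A Question Augmentation Framework for Reinforcement Learning | Proposes the PieceHint framework, which uses question augmentation to improve the performance and generalization of reinforcement learning on mathematical reasoning. | reinforcement learning, large language model | |
| 13 | Self-Distillation as a Performance Recovery Mechanism for LLMs: Counteracting Compression and Catastrophic Forgetting | Proposes an LLM performance-recovery framework based on self-distillation fine-tuning that counteracts compression and catastrophic forgetting. | distillation, large language model | |
| 14 | Zero-Shot Scalable Resilience in UAV Swarms: A Decentralized Imitation Learning Framework with Physics-Informed Graph Interactions | Proposes a physics-informed graph adversarial imitation learning algorithm for zero-shot, scalable resilience recovery in UAV swarms. | reinforcement learning, imitation learning | |
| 15 | Detecting and Suppressing Reward Hacking with Gradient Fingerprints | Proposes GRIFT, which uses gradient fingerprints to detect and suppress reward hacking in reinforcement learning. | reinforcement learning, chain-of-thought | |
| 16 | Multi-objective Reinforcement Learning With Augmented States Requires Rewards After Deployment | Shows that multi-objective reinforcement learning with augmented states still requires reward signals after deployment. | reinforcement learning | |
| 17 | Majority Voting for Code Generation | Proposes functional majority voting based on consensus over runtime behavior to improve code generation performance. | reinforcement learning, large language model | |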

🔬 Pillar 1: Robot Control (1 paper)

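| # | Title | One-line summary | Tags | 🔗 |
|---|-------|------------------|------|----|
| 18 | Flexible Empowerment at Reasoning with Extended Best-of-N Sampling | Proposes a flexible empowerment method based on extended Best-of-N sampling to address the exploration-exploitation dilemma in reinforcement learning. | locomotion, reinforcement learning, foundation model | |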

🔬 Pillar 8: Physics-based Animation (1 paper)

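| # | Title | One-line summary | Tags | 🔗 |
|---|-------|------------------|------|----|
| 19 | Modern Structure-Aware Simplicial Spatiotemporal Neural Network | Proposes ModernSASST, which uses simplicial complexes for structured spatiotemporal modeling and improves computational efficiency. | spatiotemporal | |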
