cs.LG (2026-04-17)

📊 19 papers in total | 🔗 2 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (9) · Pillar 2: RL Algorithms & Architecture (8 🔗1) · Pillar 1: Robot Control (1) · Pillar 8: Physics-based Animation (1 🔗1)

🔬 Pillar 9: Embodied Foundation Models (9 papers)

#	Title	Key Point	Tags	🔗	⭐
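1	Sketching the Readout of Large Language Models for Scalable Data Attribution and Valuation	Proposes RISE, which sketches influence hotspots at the LLM output (readout) layer to enable scalable data attribution and valuation.	large language model
2	JumpLoRA: Sparse Adapters for Continual Learning in Large Language Models	JumpLoRA: a sparse-adapter method for continual learning in LLMs.	large language model
3	Tabular foundation models for in-context prediction of molecular properties	Proposes in-context prediction of molecular properties with pretrained tabular models; efficient and requires no fine-tuning.	foundation model
4	Joint-Centric Dual Contrastive Alignment with Structure-Preserving and Information-Balanced Regularization	Proposes the HILBERT framework for representation learning on long-sequence audio-video documents in low-resource settings.	multimodal
5	SCRIPT: Implementing an Intelligent Tutoring System for Programming in a German University Context	Builds SCRIPT, an intelligent tutoring system for Python programming at a German university that supports personalized guidance and research.	large language model
6	QuantSightBench: Evaluating LLM Quantitative Forecasting with Prediction Intervals	QuantSightBench: evaluates LLM quantitative forecasting ability with prediction intervals and reveals model calibration problems.	large language model
7	DPrivBench: Benchmarking LLMs' Reasoning for Differential Privacy	Proposes DPrivBench to benchmark LLMs on automating differential privacy reasoning.	large language model
8	Breaking the Training Barrier of Billion-Parameter Universal Machine Learning Interatomic Potentials	Proposes MatRIS-MoE and the Janus framework to accelerate training of billion-parameter universal machine learning interatomic potentials.	foundation model
9	Faster LLM Inference via Sequential Monte Carlo	Proposes a speculative decoding method based on sequential Monte Carlo that accelerates LLM inference and raises throughput.	instruction following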

🔬 Pillar 2: RL Algorithms & Architecture (8 papers)

#	Title	Key Point	Tags	🔗	⭐
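10	AEGIS: Anchor-Enforced Gradient Isolation for Knowledge-Preserving Vision-Language-Action Fine-Tuning	AEGIS: anchor-enforced gradient isolation for knowledge-preserving vision-language-action fine-tuning.	flow matching vision-language-action
11	Evaluating the Progression of Large Language Model Capabilities for Small-Molecule Drug Design	Proposes an LLM evaluation framework built on reinforcement learning post-training, improving small-molecule drug design capability.	reinforcement learning large language model
12	Placing Puzzle Pieces Where They Matter: A Question Augmentation Framework for Reinforcement Learning	Proposes the PieceHint framework, whose question augmentation strategy improves the performance and generalization of reinforcement learning on mathematical reasoning.	reinforcement learning large language model
13	Self-Distillation as a Performance Recovery Mechanism for LLMs: Counteracting Compression and Catastrophic Forgetting	Proposes an LLM performance recovery framework based on self-distillation fine-tuning that counteracts compression and catastrophic forgetting.	distillation large language model
14	Zero-Shot Scalable Resilience in UAV Swarms: A Decentralized Imitation Learning Framework with Physics-Informed Graph Interactions	Proposes a physics-informed graph adversarial imitation learning algorithm for zero-shot, scalable resilience recovery in UAV swarms.	reinforcement learning imitation learning
15	Detecting and Suppressing Reward Hacking with Gradient Fingerprints	Proposes GRIFT, which uses gradient fingerprints to detect and suppress reward hacking in reinforcement learning.	reinforcement learning chain-of-thought	✅
16	Multi-objective Reinforcement Learning With Augmented States Requires Rewards After Deployment	Shows that multi-objective reinforcement learning with augmented states still requires reward signals after deployment.	reinforcement learning
17	Majority Voting for Code Generation	Proposes functional majority voting based on consensus over runtime behavior, improving code generation performance.	reinforcement learning large language model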

🔬 Pillar 1: Robot Control (1 paper)

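#	Title	Key Point	Tags	🔗	⭐
18	Flexible Empowerment at Reasoning with Extended Best-of-N Sampling	Proposes a flexible empowerment method based on extended Best-of-N sampling to address the exploration-exploitation dilemma in reinforcement learning.	locomotion reinforcement learning foundation model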

🔬 Pillar 8: Physics-based Animation (1 paper)

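#	Title	Key Point	Tags	🔗	⭐
19	Modern Structure-Aware Simplicial Spatiotemporal Neural Network	Proposes ModernSASST, which uses simplicial complexes for structured spatiotemporal modeling and improves computational efficiency.	spatiotemporal	✅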
