cs.LG (2025-02-27)

📊 43 papers in total | 🔗 9 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (22 🔗5)
Pillar 2: RL & Architecture (13 🔗3)
Pillar 1: Robot Control (5 🔗1)
Pillar 8: Physics-based Animation (2)
Pillar 3: Perception & Semantics (1)

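🔬 Pillar 9: Embodied Foundation Models (22 papers)

#	Title	One-line summary	Tags	🔗
1	When Continue Learning Meets Multimodal Large Language Model: A Survey	Surveys continual learning for multimodal large language models, addressing catastrophic forgetting.	large language model multimodal
2	R2-T2: Re-Routing in Test-Time for Multimodal Mixture-of-Experts	R2-T2: proposes test-time re-routing for multimodal mixture-of-experts models to improve downstream task performance.	large language model multimodal
3	MMSciBench: Benchmarking Language Models on Chinese Multimodal Scientific Problems	MMSciBench: a benchmark for evaluating language models on Chinese multimodal scientific problems.	large language model multimodal
4	SeisMoLLM: Advancing Seismic Monitoring via Cross-modal Transfer with Pre-trained Large Language Model	SeisMoLLM: advances seismic monitoring via cross-modal transfer with a pre-trained large language model.	large language model foundation model
5	Evaluating System 1 vs. 2 Reasoning Approaches for Zero-Shot Time Series Forecasting: A Benchmark and Insights	ReC4TS: the first benchmark for evaluating reasoning approaches in zero-shot time series forecasting, with insights.	large language model foundation model multimodal	✅
6	Conformal Tail Risk Control for Large Language Model Alignment	Proposes an LLM alignment framework based on conformal risk control to control tail risk caused by discrepancies between human and machine scoring.	large language model
7	Large Language Models as Attribution Regularizers for Efficient Model Training	Proposes LLM-based attribution regularization for efficient model training, improving small models' few-shot performance.	large language model
8	Mixtera: A Data Plane for Foundation Model Training	Mixtera: a data plane for foundation model training that supports declarative data mixing and dynamic adjustment.	foundation model
9	Tokens for Learning, Tokens for Unlearning: Mitigating Membership Inference Attacks in Large Language Models via Dual-Purpose Training	Proposes dual-purpose training that separates tokens for learning from tokens for unlearning to mitigate membership inference attacks on large language models.	large language model
10	Taxonomy, Opportunities, and Challenges of Representation Engineering for Large Language Models	Presents a taxonomy, opportunities, and challenges of representation engineering for large language models, toward more effective and interpretable behavior control.	large language model
11	Walking the Web of Concept-Class Relationships in Incrementally Trained Interpretable Models	Proposes MuCIL to preserve and strengthen concept-class relationships in incremental learning.	multimodal
12	Judge a Book by its Cover: Investigating Multi-Modal LLMs for Multi-Page Handwritten Document Transcription	Investigates multimodal large language models for transcribing multi-page handwritten documents.	large language model
13	Stochastic Rounding for LLM Training: Theory and Practice	Proposes a BF16 training strategy based on stochastic rounding to improve LLM training efficiency and stability.	large language model
14	SoS1: O1 and R1-Like Reasoning LLMs are Sum-of-Square Solvers	SoS1: O1- and R1-like reasoning LLMs act as sum-of-squares solvers, markedly improving polynomial nonnegativity determination.	large language model
15	Why Are Web AI Agents More Vulnerable Than Standalone LLMs? A Security Analysis	Analyzes why Web AI agents are more vulnerable to attack than standalone LLMs.	large language model
16	PhantomWiki: On-Demand Datasets for Reasoning and Retrieval Evaluation	Proposes PhantomWiki, which generates datasets on demand to evaluate LLM reasoning and retrieval.	large language model	✅
17	Mixture of Experts for Recognizing Depression from Interview and Reading Tasks	Proposes a mixture-of-experts method for recognizing depression from speech, combining interview and reading-task recordings.	multimodal
18	AutoHete: An Automatic and Efficient Heterogeneous Training System for LLMs	AutoHete: an automatic and efficient heterogeneous training system for LLMs that raises training throughput.	large language model
19	SkipPipe: Partial and Reordered Pipelining Framework for Training LLMs in Heterogeneous Networks	SkipPipe: a partial and reordered pipelining framework for LLM training in heterogeneous networks.	large language model	✅
20	MobiLLM: Enabling LLM Fine-Tuning on the Mobile Device via Server Assisted Side Tuning	MobiLLM: enables LLM fine-tuning on mobile devices via server-assisted side tuning.	large language model
21	Implicit Search via Discrete Diffusion: A Study on Chess	Proposes DiffuSearch, which performs implicit search with discrete diffusion models to improve planning in board games.	large language model	✅
22	Adaptive Attacks Break Defenses Against Indirect Prompt Injection Attacks on LLM Agents	Adaptive attacks break defenses against indirect prompt injection attacks on LLM agents.	large language model	✅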

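🔬 Pillar 2: RL & Architecture (13 papers)

#	Title	One-line summary	Tags	🔗
23	Improving the Efficiency of a Deep Reinforcement Learning-Based Power Management System for HPC Clusters Using Curriculum Learning	Uses curriculum learning to improve the efficiency of a deep-reinforcement-learning-based power management system for HPC clusters.	reinforcement learning deep reinforcement learning DRL
24	Pokemon Red via Reinforcement Learning	Proposes a deep reinforcement learning approach to playing through Pokemon Red and demonstrates the fragility of reward shaping.	reinforcement learning deep reinforcement learning DRL	✅
25	On the Importance of Reward Design in Reinforcement Learning-based Dynamic Algorithm Configuration: A Case Study on OneMax with (1+($λ$,$λ$))-GA	Proposes a reward design mechanism to improve reinforcement learning performance in dynamic algorithm configuration.	reinforcement learning reward design reward shaping
26	ChatMol: A Versatile Molecule Designer Based on the Numerically Enhanced Large Language Model	ChatMol: a versatile molecule designer based on a numerically enhanced large language model.	reinforcement learning large language model
27	Highly Parallelized Reinforcement Learning Training with Relaxed Assignment Dependencies	Proposes TianJi, which accelerates highly parallel reinforcement learning training by relaxing assignment dependencies.	reinforcement learning deep reinforcement learning DRL	✅
28	A Generative Model Enhanced Multi-Agent Reinforcement Learning Method for Electric Vehicle Charging Navigation	Proposes a generative-model-enhanced multi-agent reinforcement learning method for electric vehicle charging navigation.	reinforcement learning deep reinforcement learning DRL
29	Safety Representations for Safer Policy Learning	Proposes a reinforcement learning method based on safety representations to improve policy learning efficiency in safety-critical settings.	reinforcement learning policy learning
30	Enhancing Transformer with GNN Structural Knowledge via Distillation: A Novel Approach	Proposes a knowledge distillation framework to enhance Transformers with graph structural knowledge.	representation learning distillation
31	Contrastive MIM: A Contrastive Mutual Information Framework for Unified Generative and Discriminative Representation Learning	Proposes cMIM, a contrastive mutual information machine that unifies generative and discriminative representation learning.	representation learning contrastive learning
32	$Q\sharp$: Provably Optimal Distributional RL for LLM Post-Training	Proposes Q# to address the KL-regularization problem in LLM post-training.	reinforcement learning PPO DPO	✅
33	Sanity Checking Causal Representation Learning on a Simple Real-World System	On a real optical system, causal representation learning methods fail to effectively recover the latent causal factors.	representation learning
34	RouteRL: Multi-agent reinforcement learning framework for urban route choice with autonomous vehicles	RouteRL: a multi-agent reinforcement learning framework for urban route choice with autonomous vehicles.	reinforcement learning
35	IL-SOAR : Imitation Learning with Soft Optimistic Actor cRitic	Proposes SOAR, an imitation learning framework based on a soft optimistic actor-critic, to improve policy exploration efficiency.	imitation learning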

🔬 Pillar 1: Robot Control (5 papers)

#	Title	One-line summary	Tags	🔗
36	Offline Reinforcement Learning via Inverse Optimization	Proposes an offline reinforcement learning algorithm based on inverse optimization to address distribution shift in continuous state spaces.	MPC model predictive control reinforcement learning
37	RIZE: Adaptive Regularization for Imitation Learning	RIZE: an imitation learning method with adaptive regularization that improves decision robustness in complex environments.	humanoid reinforcement learning imitation learning	✅
38	Unifying Model Predictive Path Integral Control, Reinforcement Learning, and Diffusion Models for Optimal Control and Planning	Unifies MPPI control, reinforcement learning, and diffusion models for optimal control and planning.	trajectory optimization motion planning reinforcement learning
39	Robust Gymnasium: A Unified Modular Benchmark for Robust Reinforcement Learning	Proposes Robust-Gymnasium, a unified modular benchmark for robust reinforcement learning.	sim-to-real reinforcement learning
40	Your contrastive learning problem is secretly a distribution alignment problem	Recasts contrastive learning as a distribution alignment problem to improve representation learning.	manipulation contrastive learning

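🔬 Pillar 8: Physics-based Animation (2 papers)

#	Title	One-line summary	Tags	🔗
41	Regional climate projections using a deep-learning-based model-ranking and downscaling framework: Application to European climate zones	Proposes a deep-learning-based model-ranking and downscaling framework to improve the accuracy of regional climate projections.	spatiotemporal
42	Asymptotics of Non-Convex Generalized Linear Models in High-Dimensions: A proof of the replica formula	Proposes a framework for the high-dimensional asymptotics of non-convex generalized linear models, rigorously proving the replica formula from statistical physics.	AMP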

🔬 Pillar 3: Perception & Semantics (1 paper)

#	Title	One-line summary	Tags	🔗
43	NeRFCom: Feature Transform Coding Meets Neural Radiance Field for Free-View 3D Scene Semantic Transmission	NeRFCom: a feature transform coding method for semantic transmission of free-view 3D scenes.	NeRF neural radiance field
