cs.LG (2025-07-17)

📊 26 papers in total | 🔗 3 with code

🎯 Interest Area Navigation

Pillar 2: RL & Architecture (11 🔗1) · Pillar 9: Embodied Foundation Models (11 🔗2) · Pillar 1: Robot Control (2) · Pillar 8: Physics-based Animation (2)

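🔬 Pillar 2: RL & Architecture (11 papers)

#	Title	One-line summary	Tags	🔗	⭐
1	Inverse Reinforcement Learning Meets Large Language Model Post-Training: Basics, Advances, and Opportunities	LLM alignment via inverse reinforcement learning: a survey of basics, advances, and opportunities.	reinforcement learning, inverse reinforcement learning, large language model
2	Uncertainty-Aware Cross-Modal Knowledge Distillation with Prototype Learning for Multimodal Brain-Computer Interfaces	Proposes an uncertainty-aware cross-modal knowledge distillation framework that improves multimodal brain-computer interface performance.	distillation, multimodal
3	From Novelty to Imitation: Self-Distilled Rewards for Offline Reinforcement Learning	Proposes ReLOAD, which uses self-distilled rewards to address the reward-annotation problem in offline reinforcement learning.	reinforcement learning, policy learning, offline RL
4	Learning to summarize user information for personalized reinforcement learning from human feedback	Proposes the PLUS framework, which summarizes user preferences to personalize reinforcement learning and improve LLM alignment with individual users.	reinforcement learning, preference learning, RLHF
5	Supervised Fine Tuning on Curated Data is Reinforcement Learning (and can be improved)	Frames supervised fine-tuning as reinforcement learning and proposes an importance-weighted variant, iw-SFT, to improve performance.	reinforcement learning, imitation learning, behavior cloning
6	Apple Intelligence Foundation Language Models: Tech Report 2025	Apple presents the Apple Intelligence foundation language models, including an on-device 3B model and a server-side PT-MoE model, which power Apple devices and services.	reinforcement learning, foundation model, multimodal
7	Enhancing Spatiotemporal Networks with xLSTM: A Scalar LSTM Approach for Cellular Traffic Forecasting	Proposes a spatiotemporal network based on scalar LSTM to improve cellular traffic forecasting.	MAE, spatiotemporal
8	Improving Out-of-distribution Human Activity Recognition via IMU-Video Cross-modal Representation Learning	Proposes an out-of-distribution (OOD) human activity recognition method based on IMU-video cross-modal representation learning, improving generalization.	representation learning	✅
9	Boosting Team Modeling through Tempo-Relational Representation Learning	Proposes TRENN and MT-TRENN, which improve team modeling through tempo-relational representation learning.	representation learning
10	Autonomous Resource Management in Microservice Systems via Reinforcement Learning	Proposes a reinforcement-learning-based method for autonomous resource management in microservice systems that optimizes resource scheduling.	reinforcement learning
11	Spectral Bellman Method: Unifying Representation and Exploration in RL	Proposes Spectral Bellman Representation, which unifies representation and exploration in reinforcement learning.	reinforcement learning, representation learning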

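🔬 Pillar 9: Embodied Foundation Models (11 papers)

#	Title	One-line summary	Tags	🔗	⭐
12	Insights into a radiology-specialised multimodal large language model with sparse autoencoders	Uses sparse autoencoders to interpret the internal mechanisms of MAIRA-2, a radiology multimodal large language model.	large language model, multimodal	✅
13	A Comprehensive Survey of Electronic Health Record Modeling: From Deep Learning Approaches to Large Language Models	Surveys electronic health record modeling, from deep learning to large language models, and explores AI applications in healthcare.	large language model, foundation model, multimodal	✅
14	Multimodal-Guided Dynamic Dataset Pruning for Robust and Efficient Data-Centric Learning	Proposes a multimodal-guided dynamic dataset pruning framework that improves the robustness and efficiency of data-centric learning.	foundation model, multimodal
15	A Collaborative Framework Integrating Large Language Model and Chemical Fragment Space: Mutual Inspiration for Lead Design	Proposes the AutoLeadDesign framework, which combines large language models with chemical fragment space for lead compound design.	large language model
16	MoTM: Towards a Foundation Model for Time Series Imputation based on Continuous Modeling	MoTM: a foundation model for time series imputation based on continuous modeling, improving out-of-domain generalization.	foundation model
17	Change of Thought: Adaptive Test-Time Computation	Proposes SELF-Transformer, which improves encoder Transformer performance by adaptively iterating on attention weights.	chain-of-thought
18	Provable Low-Frequency Bias of In-Context Learning of Representations	Proposes a double-convergence framework that reveals the low-frequency bias of in-context learning (ICL) of representations.	large language model
19	Fake or Real: The Impostor Hunt in Texts for Space Operations	For space missions, proposes a method for distinguishing genuine LLM outputs from maliciously tampered ones.	large language model
20	Teach Old SAEs New Domain Tricks with Boosting	Proposes a boosting-based residual learning method that improves the ability of sparse autoencoders to interpret LLM internal representations in specific domains.	large language model
21	Probabilistic Soundness Guarantees in LLM Reasoning Chains	Proposes the ARES framework, which uses probabilistic inference to guarantee the soundness of LLM reasoning chains and address error propagation.	large language model
22	PolyServe: Efficient Multi-SLO Serving at Scale	PolyServe: an efficient large-scale serving system for multiple service-level objectives (SLOs).	large language model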

🔬 Pillar 1: Robot Control (2 papers)

#	Title	One-line summary	Tags	🔗	⭐
23	Model-free Reinforcement Learning for Model-based Control: Towards Safe, Interpretable and Sample-efficient Agents	Combines model-based and model-free reinforcement learning to improve agent safety, interpretability, and sample efficiency.	model predictive control, reinforcement learning, policy learning
24	Vidar: Embodied Video Diffusion Model for Generalist Manipulation	Vidar: a generalist robot manipulation framework based on an embodied video diffusion model.	manipulation

🔬 Pillar 8: Physics-based Animation (2 papers)

#	Title	One-line summary	Tags	🔗	⭐
25	Bridging the Gap: Leveraging Retrieval-Augmented Generation to Better Understand Public Concerns about Vaccines	Uses retrieval-augmented generation (RAG) to better understand public concerns about vaccines.	PULSE, large language model
26	Bayesian Modeling and Estimation of Linear Time-Variant Systems using Neural Networks and Gaussian Processes	Proposes a method for modeling and estimating linear time-variant systems based on Bayesian neural networks and Gaussian processes.	PULSE
