cs.LG (2025-07-17)

📊 26 papers in total | 🔗 3 with code

🎯 Interest Area Navigation

Pillar 2: RL Algorithms & Architecture (11, 🔗 1) · Pillar 9: Embodied Foundation Models (11, 🔗 2) · Pillar 1: Robot Control (2) · Pillar 8: Physics-based Animation (2)

🔬 Pillar 2: RL Algorithms & Architecture (11 papers)

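| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 1 | Inverse Reinforcement Learning Meets Large Language Model Post-Training: Basics, Advances, and Opportunities | LLM alignment via inverse reinforcement learning: a survey of basics, advances, and opportunities. | reinforcement learning, inverse reinforcement learning, large language model | |
| 2 | Uncertainty-Aware Cross-Modal Knowledge Distillation with Prototype Learning for Multimodal Brain-Computer Interfaces | Proposes an uncertainty-aware cross-modal knowledge distillation framework to improve multimodal brain-computer interface performance. | distillation, multimodal | |
| 3 | From Novelty to Imitation: Self-Distilled Rewards for Offline Reinforcement Learning | Proposes ReLOAD, which uses self-distilled rewards to address the reward-annotation problem in offline reinforcement learning. | reinforcement learning, policy learning, offline RL | |
| 4 | Learning to summarize user information for personalized reinforcement learning from human feedback | Proposes the PLUS framework, which uses summaries of user preferences to enable personalized reinforcement learning, improving LLM alignment with users. | reinforcement learning, preference learning, RLHF | |
| 5 | Supervised Fine Tuning on Curated Data is Reinforcement Learning (and can be improved) | Frames supervised fine-tuning as reinforcement learning and proposes an importance-weighted variant, iw-SFT, to improve performance. | reinforcement learning, imitation learning, behavior cloning | |
| 6 | Apple Intelligence Foundation Language Models: Tech Report 2025 | Apple presents its Apple Intelligence foundation language models, including an on-device 3B model and a server-side PT-MoE model, which power Apple devices and services. | reinforcement learning, foundation model, multimodal | |
| 7 | Enhancing Spatiotemporal Networks with xLSTM: A Scalar LSTM Approach for Cellular Traffic Forecasting | Proposes a scalar-LSTM-based spatiotemporal network to improve cellular traffic forecasting. | MAE, spatiotemporal | |
| 8 | Improving Out-of-distribution Human Activity Recognition via IMU-Video Cross-modal Representation Learning | Proposes an out-of-distribution (OOD) human activity recognition method based on IMU-video cross-modal representation learning, improving generalization. | representation learning | |
| 9 | Boosting Team Modeling through Tempo-Relational Representation Learning | Proposes TRENN and MT-TRENN, which improve team modeling through tempo-relational representation learning. | representation learning | |
| 10 | Autonomous Resource Management in Microservice Systems via Reinforcement Learning | Proposes a reinforcement-learning-based method for autonomous resource management in microservice systems, optimizing resource scheduling. | reinforcement learning | |
| 11 | Spectral Bellman Method: Unifying Representation and Exploration in RL | Proposes the Spectral Bellman Representation, unifying representation and exploration in reinforcement learning. | reinforcement learning, representation learning | |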

🔬 Pillar 9: Embodied Foundation Models (11 papers)

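| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 12 | Insights into a radiology-specialised multimodal large language model with sparse autoencoders | Uses sparse autoencoders to analyze the internal mechanisms of MAIRA-2, a radiology multimodal large language model. | large language model, multimodal | |
| 13 | A Comprehensive Survey of Electronic Health Record Modeling: From Deep Learning Approaches to Large Language Models | Surveys electronic health record modeling, from deep learning to large language models, exploring applications of AI in healthcare. | large language model, foundation model, multimodal | |
| 14 | Multimodal-Guided Dynamic Dataset Pruning for Robust and Efficient Data-Centric Learning | Proposes a multimodal-guided dynamic dataset pruning framework that improves the robustness and efficiency of data-centric learning. | foundation model, multimodal | |
| 15 | A Collaborative Framework Integrating Large Language Model and Chemical Fragment Space: Mutual Inspiration for Lead Design | Proposes the AutoLeadDesign framework, which combines a large language model with chemical fragment space for lead compound design. | large language model | |
| 16 | MoTM: Towards a Foundation Model for Time Series Imputation based on Continuous Modeling | MoTM: a foundation model for time series imputation based on continuous modeling, improving out-of-domain generalization. | foundation model | |
| 17 | Change of Thought: Adaptive Test-Time Computation | Proposes SELF-Transformer, which improves encoder Transformer performance by adaptively iterating attention weights. | chain-of-thought | |
| 18 | Provable Low-Frequency Bias of In-Context Learning of Representations | Proposes a double-convergence framework that reveals the low-frequency bias of in-context learning (ICL) of representations. | large language model | |
| 19 | Fake or Real: The Impostor Hunt in Texts for Space Operations | For space operations, proposes a method for distinguishing genuine LLM outputs from maliciously tampered ones. | large language model | |
| 20 | Teach Old SAEs New Domain Tricks with Boosting | Proposes a boosting-based residual learning method that improves how well sparse autoencoders explain LLM internal representations in specific domains. | large language model | |
| 21 | Probabilistic Soundness Guarantees in LLM Reasoning Chains | Proposes the ARES framework, which uses probabilistic inference to provide soundness guarantees for LLM reasoning chains, addressing error propagation. | large language model | |
| 22 | PolyServe: Efficient Multi-SLO Serving at Scale | PolyServe: an efficient, large-scale serving system for multiple service-level objectives (SLOs). | large language model | |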

🔬 Pillar 1: Robot Control (2 papers)

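| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 23 | Model-free Reinforcement Learning for Model-based Control: Towards Safe, Interpretable and Sample-efficient Agents | Combines model-based and model-free reinforcement learning to improve agent safety, interpretability, and sample efficiency. | model predictive control, reinforcement learning, policy learning | |
| 24 | Vidar: Embodied Video Diffusion Model for Generalist Manipulation | Vidar: a generalist robot manipulation framework based on an embodied video diffusion model. | manipulation | |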

🔬 Pillar 8: Physics-based Animation (2 papers)

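| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 25 | Bridging the Gap: Leveraging Retrieval-Augmented Generation to Better Understand Public Concerns about Vaccines | Uses retrieval-augmented generation (RAG) to better understand public concerns about vaccines. | PULSE, large language model | |
| 26 | Bayesian Modeling and Estimation of Linear Time-Variant Systems using Neural Networks and Gaussian Processes | Proposes a method for modeling and estimating linear time-variant systems based on Bayesian neural networks and Gaussian processes. | PULSE | |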
