cs.LG (2025-02-27)

📊 43 papers in total | 🔗 9 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (22 🔗5) · Pillar 2: RL & Architecture (13 🔗3) · Pillar 1: Robot Control (5 🔗1) · Pillar 8: Physics-based Animation (2) · Pillar 3: Perception & Semantics (1)

🔬 Pillar 9: Embodied Foundation Models (22 papers)

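| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 1 | When Continue Learning Meets Multimodal Large Language Model: A Survey | Surveys continual learning for multimodal large language models, focusing on the problem of catastrophic forgetting. | large language model, multimodal | |
| 2 | R2-T2: Re-Routing in Test-Time for Multimodal Mixture-of-Experts | R2-T2: proposes test-time re-routing for multimodal mixture-of-experts models to improve downstream task performance. | large language model, multimodal | |
| 3 | MMSciBench: Benchmarking Language Models on Chinese Multimodal Scientific Problems | MMSciBench: a benchmark for evaluating language models on Chinese multimodal scientific problems. | large language model, multimodal | |
| 4 | SeisMoLLM: Advancing Seismic Monitoring via Cross-modal Transfer with Pre-trained Large Language Model | SeisMoLLM: advances seismic monitoring through cross-modal transfer with a pre-trained large language model. | large language model, foundation model | |
| 5 | Evaluating System 1 vs. 2 Reasoning Approaches for Zero-Shot Time Series Forecasting: A Benchmark and Insights | ReC4TS: the first benchmark, with insights, for evaluating reasoning ability in zero-shot time series forecasting. | large language model, foundation model, multimodal | |
| 6 | Conformal Tail Risk Control for Large Language Model Alignment | Proposes an LLM alignment framework based on conformal risk control to handle tail-risk control under discrepancies between human and machine scoring. | large language model | |
| 7 | Large Language Models as Attribution Regularizers for Efficient Model Training | Proposes efficient model training with LLM-based attribution regularization, improving small-model performance in few-shot learning. | large language model | |
| 8 | Mixtera: A Data Plane for Foundation Model Training | Mixtera: a data plane for foundation model training that supports declarative data mixing and dynamic adjustment. | foundation model | |
| 9 | Tokens for Learning, Tokens for Unlearning: Mitigating Membership Inference Attacks in Large Language Models via Dual-Purpose Training | Proposes a dual-purpose training method that separates tokens for learning from tokens for unlearning to mitigate membership inference attacks on large language models. | large language model | |
| 10 | Taxonomy, Opportunities, and Challenges of Representation Engineering for Large Language Models | Presents a taxonomy, opportunities, and challenges of representation engineering for large language models, aiming at more effective and interpretable behavior control. | large language model | |
| 11 | Walking the Web of Concept-Class Relationships in Incrementally Trained Interpretable Models | Proposes the MuCIL model to preserve and strengthen concept-class relationships in incremental learning. | multimodal | |
| 12 | Judge a Book by its Cover: Investigating Multi-Modal LLMs for Multi-Page Handwritten Document Transcription | Investigates multimodal large language models for multi-page handwritten document transcription. | large language model | |
| 13 | Stochastic Rounding for LLM Training: Theory and Practice | Proposes a BF16 training strategy based on stochastic rounding to improve LLM training efficiency and stability. | large language model | |
| 14 | SoS1: O1 and R1-Like Reasoning LLMs are Sum-of-Square Solvers | SoS1: O1- and R1-like reasoning LLMs act as sum-of-squares solvers, substantially improving polynomial nonnegativity determination. | large language model | |
| 15 | Why Are Web AI Agents More Vulnerable Than Standalone LLMs? A Security Analysis | Analyzes the vulnerability of Web AI agents, identifying why they are more susceptible to attack than standalone LLMs. | large language model | |
| 16 | PhantomWiki: On-Demand Datasets for Reasoning and Retrieval Evaluation | Proposes PhantomWiki, which generates datasets on demand to evaluate the reasoning and retrieval abilities of LLMs. | large language model | |
| 17 | Mixture of Experts for Recognizing Depression from Interview and Reading Tasks | Proposes a mixture-of-experts method for recognizing depression from speech, combining speech from interview and reading tasks. | multimodal | |
| 18 | AutoHete: An Automatic and Efficient Heterogeneous Training System for LLMs | AutoHete: an automatic and efficient heterogeneous training system for LLMs that improves training throughput. | large language model | |
| 19 | SkipPipe: Partial and Reordered Pipelining Framework for Training LLMs in Heterogeneous Networks | SkipPipe: a partial and reordered pipelining framework for LLM training in heterogeneous networks. | large language model | |
| 20 | MobiLLM: Enabling LLM Fine-Tuning on the Mobile Device via Server Assisted Side Tuning | MobiLLM: enables LLM fine-tuning on mobile devices through server-assisted side tuning. | large language model | |
| 21 | Implicit Search via Discrete Diffusion: A Study on Chess | Proposes DiffuSearch, which performs implicit search with a discrete diffusion model to improve planning in chess-like board games. | large language model | |
| 22 | Adaptive Attacks Break Defenses Against Indirect Prompt Injection Attacks on LLM Agents | Shows that adaptive attacks break defenses against indirect prompt injection attacks on LLM agents. | large language model | |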

🔬 Pillar 2: RL & Architecture (13 papers)

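| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 23 | Improving the Efficiency of a Deep Reinforcement Learning-Based Power Management System for HPC Clusters Using Curriculum Learning | Uses curriculum learning to improve the efficiency of a deep-reinforcement-learning-based power management system for HPC clusters. | reinforcement learning, deep reinforcement learning, DRL | |
| 24 | Pokemon Red via Reinforcement Learning | Proposes a deep reinforcement learning approach to playing through Pokemon Red automatically and demonstrates the fragility of reward shaping. | reinforcement learning, deep reinforcement learning, DRL | |
| 25 | On the Importance of Reward Design in Reinforcement Learning-based Dynamic Algorithm Configuration: A Case Study on OneMax with (1+($λ$,$λ$))-GA | Proposes a reward design mechanism to improve reinforcement learning performance in dynamic algorithm configuration. | reinforcement learning, reward design, reward shaping | |
| 26 | ChatMol: A Versatile Molecule Designer Based on the Numerically Enhanced Large Language Model | ChatMol: a versatile molecule designer based on a numerically enhanced large language model. | reinforcement learning, large language model | |
| 27 | Highly Parallelized Reinforcement Learning Training with Relaxed Assignment Dependencies | Proposes TianJi, which accelerates highly parallel reinforcement learning training by relaxing assignment dependencies. | reinforcement learning, deep reinforcement learning, DRL | |
| 28 | A Generative Model Enhanced Multi-Agent Reinforcement Learning Method for Electric Vehicle Charging Navigation | Proposes a generative-model-enhanced multi-agent reinforcement learning method for electric vehicle charging navigation. | reinforcement learning, deep reinforcement learning, DRL | |
| 29 | Safety Representations for Safer Policy Learning | Proposes a reinforcement learning method based on safety representations to improve policy learning efficiency in safety-critical settings. | reinforcement learning, policy learning | |
| 30 | Enhancing Transformer with GNN Structural Knowledge via Distillation: A Novel Approach | Proposes a knowledge distillation framework to strengthen the graph structural knowledge of Transformers. | representation learning, distillation | |
| 31 | Contrastive MIM: A Contrastive Mutual Information Framework for Unified Generative and Discriminative Representation Learning | Proposes cMIM, a contrastive mutual information machine that unifies generative and discriminative representation learning. | representation learning, contrastive learning | |
| 32 | $Q\sharp$: Provably Optimal Distributional RL for LLM Post-Training | Proposes Q# to address the KL-regularization problem in LLM post-training. | reinforcement learning, PPO, DPO | |
| 33 | Sanity Checking Causal Representation Learning on a Simple Real-World System | On a real optical system, causal representation learning methods fail to recover the latent causal factors effectively. | representation learning | |
| 34 | RouteRL: Multi-agent reinforcement learning framework for urban route choice with autonomous vehicles | RouteRL: a multi-agent reinforcement learning framework for urban route choice with autonomous vehicles. | reinforcement learning | |
| 35 | IL-SOAR : Imitation Learning with Soft Optimistic Actor cRitic | Proposes SOAR, an imitation learning framework based on a soft optimistic actor-critic, to improve policy exploration efficiency. | imitation learning | |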

🔬 Pillar 1: Robot Control (5 papers)

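| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 36 | Offline Reinforcement Learning via Inverse Optimization | Proposes an offline reinforcement learning algorithm based on inverse optimization that addresses distribution shift in continuous state spaces. | MPC, model predictive control, reinforcement learning | |
| 37 | RIZE: Adaptive Regularization for Imitation Learning | RIZE: an imitation learning method based on adaptive regularization that improves decision robustness in complex environments. | humanoid, reinforcement learning, imitation learning | |
| 38 | Unifying Model Predictive Path Integral Control, Reinforcement Learning, and Diffusion Models for Optimal Control and Planning | Unifies MPPI control, reinforcement learning, and diffusion models for optimal control and planning. | trajectory optimization, motion planning, reinforcement learning | |
| 39 | Robust Gymnasium: A Unified Modular Benchmark for Robust Reinforcement Learning | Proposes Robust-Gymnasium, a unified modular benchmark platform for robust reinforcement learning. | sim-to-real, reinforcement learning | |
| 40 | Your contrastive learning problem is secretly a distribution alignment problem | Recasts contrastive learning as a distribution alignment problem to improve representation learning. | manipulation, contrastive learning | |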

🔬 Pillar 8: Physics-based Animation (2 papers)

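| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 41 | Regional climate projections using a deep-learning-based model-ranking and downscaling framework: Application to European climate zones | Proposes a deep-learning-based model-ranking and downscaling framework to improve the accuracy of regional climate projections. | spatiotemporal | |
| 42 | Asymptotics of Non-Convex Generalized Linear Models in High-Dimensions: A proof of the replica formula | Proposes a high-dimensional asymptotic analysis framework for non-convex generalized linear models, rigorously proving the replica formula from statistical physics. | AMP | |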

🔬 Pillar 3: Perception & Semantics (1 paper)

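| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 43 | NeRFCom: Feature Transform Coding Meets Neural Radiance Field for Free-View 3D Scene Semantic Transmission | NeRFCom: a feature transform coding method for free-view 3D scene semantic transmission. | NeRF, neural radiance field | |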
