cs.LG (2025-10-20)

📊 42 papers in total | 🔗 5 with code

🎯 Interest Area Navigation

Pillar 2: RL Algorithms & Architecture (RL & Architecture) (18, 🔗2) · Pillar 9: Embodied Foundation Models (14, 🔗1) · Pillar 8: Physics-based Animation (4, 🔗1) · Pillar 1: Robot Control (2) · Pillar 5: Interaction & Reaction (2, 🔗1) · Pillar 7: Motion Retargeting (1) · Pillar 3: Perception & Semantics (1)

🔬 Pillar 2: RL Algorithms & Architecture (RL & Architecture) (18 papers)

# | Title | Summary | Tags | 🔗
1 | An Enhanced Dual Transformer Contrastive Network for Multimodal Sentiment Analysis | Proposes DTCN, a dual-Transformer contrastive network for stronger multimodal sentiment analysis. | representation learning, contrastive learning, multimodal
2 | Inference-Time Compute Scaling For Flow Matching | Proposes an inference-time compute scaling method for flow matching that preserves linear interpolation, improving generation quality. | flow matching, large language model
3 | UniRL-Zero: Reinforcement Learning on Unified Models with Joint Language Model and Diffusion Model Experts | UniRL-Zero: a unified reinforcement-learning framework combining language-model and diffusion-model experts. | reinforcement learning, multimodal
4 | Plasma Shape Control via Zero-shot Generative Reinforcement Learning | Proposes a plasma shape control method based on zero-shot generative reinforcement learning. | reinforcement learning, imitation learning, representation learning
5 | Diffusion Models as Dataset Distillation Priors | Proposes DAP: uses diffusion-model priors to improve the representativeness of dataset distillation without extra training. | distillation, foundation model
6 | Rewarding the Journey, Not Just the Destination: A Composite Path and Answer Self-Scoring Reward Mechanism for Test-Time Reinforcement Learning | Proposes COMPASS, addressing the difficulty of reward estimation for LLMs in unsupervised test-time reinforcement learning. | reinforcement learning, large language model
7 | Fine-tuning Flow Matching Generative Models with Intermediate Feedback | Proposes the AC-Flow framework, which fine-tunes flow-matching generative models with intermediate feedback to improve text-image alignment. | flow matching, reward shaping
8 | TrajMamba: An Efficient and Semantic-rich Vehicle Trajectory Pre-training Model | TrajMamba: an efficient, semantically rich vehicle-trajectory pre-training model that tackles the difficulty of exploiting trajectory data. | Mamba, distillation
9 | Provably Optimal Reinforcement Learning under Safety Filtering | Proposes a provably optimal reinforcement-learning method under safety filtering, addressing the performance loss induced by safety constraints. | reinforcement learning
10 | An Empirical Study of Lagrangian Methods in Safe Reinforcement Learning | Studies the performance and stability of Lagrangian methods in safe RL, revealing the challenges of automatically updated multipliers and directions for improvement. | reinforcement learning
11 | Demystifying Transition Matching: When and Why It Can Beat Flow Matching | Explains the advantage of Transition Matching: it beats Flow Matching under separated modes and non-zero-variance target distributions. | flow matching
12 | Batch Distillation Data for Developing Machine Learning Anomaly Detection Methods | Builds a large-scale open experimental dataset for developing machine-learning anomaly-detection methods for batch distillation. | distillation
13 | R2L: Reliable Reinforcement Learning: Guaranteed Return & Reliable Policies in Reinforcement Learning | Proposes R2L: a reliable reinforcement-learning method that guarantees return and optimizes policies under uncertainty. | reinforcement learning
14 | Efficient Algorithms for Mitigating Uncertainty and Risk in Reinforcement Learning | Proposes efficient algorithms for reinforcement learning under uncertainty that optimize risk measures and improve policy performance. | reinforcement learning
15 | Certified Self-Consistency: Statistical Guarantees and Test-Time Training for Reliable Reasoning in LLMs | Proposes a certified self-consistency framework providing statistical guarantees and test-time training for LLM reasoning. | reinforcement learning, large language model
16 | TabR1: Taming GRPO for tabular reasoning LLMs | TabR1: a GRPO-based tabular-reasoning LLM that improves zero-shot and few-shot performance. | reinforcement learning, large language model
17 | Optimizing Energy Management of Smart Grid using Reinforcement Learning aided by Surrogate models built using Physics-informed Neural Networks | Optimizes smart-grid energy management with reinforcement learning aided by surrogate models built from physics-informed neural networks. | reinforcement learning
18 | EvoSyn: Generalizable Evolutionary Data Synthesis for Verifiable Learning | EvoSyn: a generalizable evolutionary data-synthesis framework for verifiable learning. | reinforcement learning, distillation

🔬 Pillar 9: Embodied Foundation Models (14 papers)

# | Title | Summary | Tags | 🔗
19 | MILES: Modality-Informed Learning Rate Scheduler for Balancing Multimodal Learning | Proposes MILES: a modality-informed learning-rate scheduler for balancing multimodal learning. | multimodal
20 | Foundation Models for Discovery and Exploration in Chemical Space | MIST molecular foundation model: enabling chemical-space exploration and materials discovery. | foundation model
21 | Quantifying Multimodal Imbalance: A GMM-Guided Adaptive Loss for Audio-Visual Learning | Proposes a GMM-guided adaptive loss that quantifies multimodal imbalance and improves audio-visual learning. | multimodal
22 | Deeper with Riemannian Geometry: Overcoming Oversmoothing and Oversquashing for Graph Foundation Models | Proposes the Riemannian-geometry-based GBN network to overcome oversmoothing and oversquashing in graph neural networks. | foundation model
23 | Efficient Long-context Language Model Training by Core Attention Disaggregation | Proposes core attention disaggregation (CAD) for efficient long-context language-model training. | large language model
24 | Any-Depth Alignment: Unlocking Innate Safety Alignment of LLMs to Any-Depth | Any-Depth Alignment (ADA) unlocks LLMs' innate safety alignment at any depth without modifying model parameters. | large language model
25 | Benchmarking Probabilistic Time Series Forecasting Models on Neural Activity | Evaluates probabilistic time-series forecasting models on neural-activity prediction, laying groundwork for closed-loop control. | foundation model
26 | Unbiased Gradient Low-Rank Projection | Proposes GUM: an unbiased gradient low-rank projection optimizer based on layer sampling and the Muon algorithm. | large language model
27 | MARS-M: When Variance Reduction Meets Matrices | Proposes the MARS-M optimizer to improve training efficiency for large-scale neural networks. | large language model
28 | Enabling Fine-Grained Operating Points for Black-Box LLMs | Proposes an effective method for fine-grained operating points on black-box LLMs without sacrificing performance. | large language model
29 | LILO: Bayesian Optimization with Interactive Natural Language Feedback | Proposes the LILO framework, which uses natural-language feedback in Bayesian optimization to make human interaction more efficient. | large language model
30 | I-RAVEN-X: Benchmarking Generalization and Robustness of Analogical and Mathematical Reasoning in Large Language and Reasoning Models | I-RAVEN-X: a benchmark for evaluating the generalization and robustness of analogical and mathematical reasoning in LLMs/LRMs. | large language model
31 | Localist LLMs with Recruitment Learning | Proposes a localist LLM framework based on recruitment learning, dynamically balancing interpretability and performance. | large language model
32 | Auto-Rubric: Learning to Extract Generalizable Criteria for Reward Modeling | Proposes the Auto-Rubric framework, which learns generalizable criteria to improve data efficiency and interpretability in reward modeling. | large language model

🔬 Pillar 8: Physics-based Animation (4 papers)

# | Title | Summary | Tags | 🔗
33 | MEG-GPT: A transformer-based foundation model for magnetoencephalography data | MEG-GPT: a Transformer-based foundation model for magnetoencephalography data that improves neural decoding. | spatiotemporal, foundation model
34 | CEPerFed: Communication-Efficient Personalized Federated Learning for Multi-Pulse MRI Classification | Proposes CEPerFed, a communication-efficient personalized federated-learning method for multi-pulse MRI classification. | PULSE
35 | SAFE-D: A Spatiotemporal Detection Framework for Abnormal Driving Among Parkinson's Disease-like Drivers | SAFE-D: a spatiotemporal framework for detecting abnormal driving among Parkinson's-disease-like drivers. | spatiotemporal
36 | Cross-Domain Long-Term Forecasting: Radiation Dose from Sparse Neutron Sensor via Spatio-Temporal Operator Network | Proposes STONe, a spatio-temporal operator network for cross-domain long-term radiation-dose forecasting from sparse neutron sensors. | spatiotemporal

🔬 Pillar 1: Robot Control (2 papers)

# | Title | Summary | Tags | 🔗
37 | D2C-HRHR: Discrete Actions with Double Distributional Critics for High-Risk-High-Return Tasks | Proposes the D2C-HRHR framework, addressing reinforcement learning with multimodal action distributions in high-risk, high-return tasks. | locomotion, manipulation, reinforcement learning
38 | Closing the Sim2Real Performance Gap in RL | Proposes a bilevel reinforcement-learning framework that directly optimizes simulation parameters to close the sim2real performance gap. | sim2real

🔬 Pillar 5: Interaction & Reaction (2 papers)

# | Title | Summary | Tags | 🔗
39 | Quantum Federated Learning: Architectural Elements and Future Directions | Surveys quantum federated learning: architectural elements, taxonomy, and future directions. | OMOMO
40 | RINS-T: Robust Implicit Neural Solvers for Time Series Linear Inverse Problems | RINS-T: robust implicit neural solvers for time-series linear inverse problems, requiring no pre-training. | IMoS

🔬 Pillar 7: Motion Retargeting (1 paper)

# | Title | Summary | Tags | 🔗
41 | Attention-Guided Deep Adversarial Temporal Subspace Clustering (A-DATSC) Model for multivariate spatiotemporal data | Proposes the A-DATSC model for deep subspace clustering of multivariate spatiotemporal data. | spatial relationship, spatiotemporal

🔬 Pillar 3: Perception & Semantics (1 paper)

# | Title | Summary | Tags | 🔗
42 | On-the-Fly OVD Adaptation with FLAME: Few-shot Localization via Active Marginal-Samples Exploration | Proposes the FLAME framework, achieving fast domain adaptation for open-vocabulary object detection via active marginal-sample exploration. | open-vocabulary, foundation model
