cs.LG (2025-10-14)

📊 37 papers in total | 🔗 2 with code

🎯 Interest Area Navigation

Pillar 2: RL Algorithms & Architecture (21 🔗2) · Pillar 9: Embodied Foundation Models (12) · Pillar 8: Physics-based Animation (2) · Pillar 5: Interaction & Reaction (1) · Pillar 1: Robot Control (1)

🔬 Pillar 2: RL Algorithms & Architecture (21 papers)

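| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 1 | A Multimodal XAI Framework for Trustworthy CNNs and Bias Detection in Deep Representation Learning | Proposes a multimodal XAI framework to improve CNN trustworthiness and detect bias in deep representation learning. | representation learning, multimodal | |
| 2 | Expert or not? assessing data quality in offline reinforcement learning | Proposes the Bellman Wasserstein distance (BWD) for assessing the quality of offline reinforcement learning datasets. | reinforcement learning, offline RL, offline reinforcement learning | |
| 3 | Deep SPI: Safe Policy Improvement via World Models | Proposes the DeepSPI algorithm, which achieves safe policy improvement via world models and improves online reinforcement learning performance. | reinforcement learning, PPO, offline RL | |
| 4 | Rethinking the Role of Dynamic Sparse Training for Scalable Deep Reinforcement Learning | Proposes MST, a modular dynamic sparse training framework that improves the scalability of deep reinforcement learning models. | reinforcement learning, deep reinforcement learning, DRL | |
| 5 | Shielded RecRL: Explanation Generation for Recommender Systems without Ranking Degradation | Proposes Shielded RecRL, which generates explanations for recommender systems without degrading ranking performance. | reinforcement learning, PPO, RLHF | |
| 6 | Stratos: An End-to-End Distillation Pipeline for Customized LLMs under Distributed Cloud Environments | Stratos: an end-to-end distillation pipeline for customized LLMs in distributed cloud environments. | teacher-student, distillation, large language model | |
| 7 | GraphShaper: Geometry-aware Alignment for Improving Transfer Learning in Text-Attributed Graphs | Proposes GraphShaper to address transfer learning problems caused by the diversity of graph structures. | contrastive learning, large language model, foundation model | |
| 8 | K-frames: Scene-Driven Any-k Keyframe Selection for long video understanding | Proposes K-frames, a scene-driven method for selecting any number of keyframes for long video understanding. | reinforcement learning, large language model, multimodal | |
| 9 | Self-Verifying Reflection Helps Transformers with CoT Reasoning | Proposes a self-verifying reflection framework that improves the performance of small Transformers on CoT reasoning. | reinforcement learning, large language model, chain-of-thought | |
| 10 | Escaping Local Optima in the Waddington Landscape: A Two-Stage TRPO-PPO Approach for Single-Cell Perturbation Analysis | Proposes a two-stage TRPO-PPO algorithm for escaping local optima in the Waddington landscape in single-cell perturbation analysis. | reinforcement learning, PPO | |
| 11 | MEASURE: Multi-scale Minimal Sufficient Representation Learning for Domain Generalization in Sleep Staging | Proposes the MEASURE framework, which improves domain generalization in sleep staging through multi-scale minimal sufficient representation learning. | representation learning, contrastive learning | |
| 12 | Pruning Cannot Hurt Robustness: Certified Trade-offs in Reinforcement Learning | Proposes a pruning strategy that improves the robustness of reinforcement learning in adversarial settings, with theoretical guarantees. | reinforcement learning | |
| 13 | Laminar: A Scalable Asynchronous RL Post-Training Framework | Laminar: a scalable asynchronous RL post-training framework that addresses low GPU utilization. | reinforcement learning, large language model | |
| 14 | Rethinking Knowledge Distillation: A Data Dependent Regulariser With a Negative Asymmetric Payoff | Rethinks knowledge distillation as a data-dependent regulariser with a negative asymmetric payoff. | distillation | |
| 15 | Finite-time Convergence Analysis of Actor-Critic with Evolving Reward | Provides a finite-time convergence analysis of actor-critic to address the problem of evolving rewards. | reinforcement learning, curriculum learning, reward shaping | |
| 16 | Heterogeneous RBCs via deep multi-agent reinforcement learning | Proposes the MARL-BC framework, which combines deep multi-agent reinforcement learning with RBC models to simulate heterogeneous macroeconomies. | reinforcement learning | |
| 17 | Diffusion Models for Reinforcement Learning: Foundations, Taxonomy, and Development | Surveys the use of diffusion models in reinforcement learning: theoretical foundations, taxonomy, and development. | reinforcement learning | |
| 18 | Chimera: State Space Models Beyond Sequences | Chimera: proposes a state space model that goes beyond sequence modeling and handles data with different topologies in a unified way. | state space model | |
| 19 | Can GRPO Help LLMs Transcend Their Pretraining Origin? | Finds that GRPO's enhancement of LLMs is limited by pretraining bias: it can only fine-tune existing capabilities rather than create new ones. | reinforcement learning, large language model | |
| 20 | Mamba Can Learn Low-Dimensional Targets In-Context via Test-Time Feature Learning | Shows that Mamba can learn low-dimensional targets in context via test-time feature learning. | Mamba | |
| 21 | Towards Fast Coarse-graining and Equation Discovery with Foundation Inference Models | Uses pretrained Foundation Inference Models to accelerate coarse-graining and equation discovery. | latent dynamics, representation learning | |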

🔬 Pillar 9: Embodied Foundation Models (12 papers)

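| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 22 | Adaptive vector steering: A training-free, layer-wise intervention for hallucination mitigation in large audio and multimodal models | Proposes adaptive vector steering (AVS), a training-free method for mitigating hallucination in large audio and multimodal models. | large language model, multimodal | |
| 23 | An Investigation of Memorization Risk in Healthcare Foundation Models | Proposes black-box evaluation tests to address memorization risk in healthcare foundation models. | foundation model | |
| 24 | CoRA: Covariate-Aware Adaptation of Time Series Foundation Models | Proposes the CoRA framework, which improves time series foundation models on practical forecasting tasks through covariate-aware adaptation. | foundation model | |
| 25 | On Foundation Models for Temporal Point Processes to Accelerate Scientific Discovery | Proposes a foundation model for temporal point processes to accelerate scientific discovery in fields such as medicine and seismology. | foundation model | |
| 26 | Max It or Miss It: Benchmarking LLM On Solving Extremal Problems | Proposes the ExtremBench benchmark for evaluating the ability of LLMs to solve extremal problems. | large language model, chain-of-thought | |
| 27 | Simulation-Based Pretraining and Domain Adaptation for Astronomical Time Series with Minimal Labeled Data | Uses pretraining on simulated data and domain adaptation to address the scarcity of labeled data in astronomical time series analysis. | zero-shot transfer | |
| 28 | Data-Model Co-Evolution: Growing Test Sets to Refine LLM Behavior | Proposes a data-model co-evolution framework that refines LLM behavior by growing test sets. | large language model | |
| 29 | CARVQ: Corrective Adaptor with Group Residual Vector Quantization for LLM Embedding Compression | Proposes CARVQ, an LLM embedding compression method based on a corrective adaptor and group residual vector quantization. | large language model | |
| 30 | Same model, better performance: the impact of shuffling on DNA Language Models benchmarking | Pre-shuffling the data improves the stability and reliability of DNA language model benchmarking. | large language model | |
| 31 | HiLoRA: Adaptive Hierarchical LoRA Routing for Training-Free Domain Generalization | Proposes HiLoRA, a training-free adaptive hierarchical LoRA routing framework for domain generalization. | large language model | |
| 32 | Lifting Manifolds to Mitigate Pseudo-Alignment in LLM4TS | Proposes TimeSUP, which mitigates pseudo-alignment in LLM4TS by lifting the manifold dimension. | large language model | |
| 33 | FedLoDrop: Federated LoRA with Dropout for Generalized LLM Fine-tuning | Proposes FedLoDrop to address overfitting in large language model fine-tuning. | large language model | |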

🔬 Pillar 8: Physics-based Animation (2 papers)

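| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 34 | Bridging Idealized and Operational Models: An Explainable AI Framework for Earth System Emulators | Proposes an explainable-AI-based framework for Earth system emulators that combines the strengths of idealized and operational models. | spatiotemporal | |
| 35 | Leveraging Teleconnections with Physics-Informed Graph Attention Networks for Long-Range Extreme Rainfall Forecasting in Thailand | Proposes physics-informed graph attention networks, combined with extreme value analysis, to improve the accuracy of long-range extreme rainfall forecasting in Thailand. | spatiotemporal | |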

🔬 Pillar 5: Interaction & Reaction (1 paper)

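| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 36 | Traveling Salesman-Based Token Ordering Improves Stability in Homomorphically Encrypted Language Models | Proposes a traveling-salesman-based token reordering method that improves the stability of homomorphically encrypted language models. | OMOMO, large language model | |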

🔬 Pillar 1: Robot Control (1 paper)

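| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 37 | Keep Calm and Avoid Harmful Content: Concept Alignment and Latent Manipulation Towards Safer Answers | Proposes CALM, which improves the safety of large language models through concept alignment and latent-space manipulation. | manipulation, large language model | |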
