cs.LG (2025-08-28)

📊 29 papers in total | 🔗 5 with code

🎯 Interest Area Navigation

Pillar 2: RL Algorithms & Architecture (13 🔗4) · Pillar 9: Embodied Foundation Models (12 🔗1) · Pillar 1: Robot Control (3) · Pillar 8: Physics-based Animation (1)

🔬 Pillar 2: RL Algorithms & Architecture (13 papers)

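| # | Title | Key point | Tags |
|---|---|---|---|
| 1 | Inference-Time Alignment Control for Diffusion Models with Reinforcement Learning Guidance | Proposes RLG, an inference-time alignment control method for diffusion models based on reinforcement learning guidance | reinforcement learning, flow matching, RLHF |
| 2 | Masked Autoencoders for Ultrasound Signals: Robust Representation Learning for Downstream Applications | Proposes a masked-autoencoder-based representation learning method for ultrasound signals that improves downstream task performance | representation learning, masked autoencoder, MAE |
| 3 | cMALC-D: Contextual Multi-Agent LLM-Guided Curriculum Learning with Diversity-Based Context Blending | Proposes the cMALC-D framework, which uses LLM-guided curriculum learning to improve generalization in contextual multi-agent reinforcement learning | reinforcement learning, curriculum learning, large language model |
| 4 | Token Buncher: Shielding LLMs from Harmful Reinforcement Learning Fine-Tuning | TokenBuncher: defends large language models against harmful reinforcement-learning-based fine-tuning | reinforcement learning, large language model |
| 5 | Beyond Prediction: Reinforcement Learning as the Defining Leap in Healthcare AI | Explores reinforcement learning in healthcare AI as a paradigm shift from prediction to active intervention | reinforcement learning, policy learning, reward design |
| 6 | GSTBench: A Benchmark Study on the Transferability of Graph Self-Supervised Learning | GSTBench: a benchmark for the transferability of graph self-supervised learning, revealing that existing methods generalize poorly | representation learning, masked autoencoder, foundation model |
| 7 | Learning Robust Spatial Representations from Binaural Audio through Feature Distillation | Proposes a feature-distillation-based pretraining method that improves the robustness of spatial representations of binaural audio | representation learning, distillation |
| 8 | QTMRL: An Agent for Quantitative Trading Decision-Making Based on Multi-Indicator Guided Reinforcement Learning | Proposes QTMRL, a quantitative trading decision-making agent based on multi-indicator guided reinforcement learning | reinforcement learning, policy learning |
| 9 | Uncovering the Spectral Bias in Diagonal State Space Models | Proposes S4D-DFouT, uncovering the spectral bias in diagonal state space models and improving long-sequence modeling performance | SSM, state space model |
| 10 | Mirage or Method? How Model-Task Alignment Induces Divergent RL Conclusions | Reveals how model-task alignment affects RL conclusions for LLMs and delineates the conditions under which counterintuitive phenomena hold | reinforcement learning, large language model |
| 11 | Automating the Deep Space Network Data Systems; A Case Study in Adaptive Anomaly Detection through Agentic AI | Uses agentic AI to automate anomaly detection in Deep Space Network data systems | reinforcement learning, large language model |
| 12 | VarDiU: A Variational Diffusive Upper Bound for One-Step Diffusion Distillation | Proposes VarDiU, a variational diffusive upper bound for one-step diffusion distillation | distillation |
| 13 | Rethinking Transformer Connectivity: TLinFormer, A Path to Exact, Full Context-Aware Linear Attention | TLinFormer: an exact linear attention mechanism with full context awareness that addresses the Transformer long-sequence bottleneck | linear attention |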

🔬 Pillar 9: Embodied Foundation Models (12 papers)

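| # | Title | Key point | Tags |
|---|---|---|---|
| 14 | Turning Tabular Foundation Models into Graph Foundation Models | Proposes the G2T-FM framework, which uses tabular foundation models to handle heterogeneous node features in graph machine learning | foundation model |
| 15 | Provable Benefits of In-Tool Learning for Large Language Models | Proves that tool learning outperforms in-weight memorization in large language models, enabling unbounded factual recall | large language model |
| 16 | On Identifying Why and When Foundation Models Perform Well on Time-Series Forecasting Using Automated Explanations and Rating | Combines explainable AI with rating-driven explanations to analyze the strengths and weaknesses of different model types in time-series forecasting | foundation model |
| 17 | GDS Agent for Graph Algorithmic Reasoning | Proposes GDS Agent, which uses graph algorithm tools to strengthen LLM reasoning over graph-structured data | large language model, multimodal |
| 18 | LLM Chatbot-Creation Approaches | Compares low-code platforms with custom solutions and explores LLM chatbots in educational settings | large language model, multimodal |
| 19 | Manifold Trajectories in Next-Token Prediction: From Replicator Dynamics to Softmax Equilibrium | Studies probability-simplex trajectories during Transformer decoding, revealing the dynamics of softmax equilibrium | large language model |
| 20 | Adaptive LLM Routing under Budget Constraints | Proposes PILOT, a preference-prior-based adaptive LLM routing method under budget constraints | large language model |
| 21 | SemSR: Semantics aware robust Session-based Recommendations | SemSR: a semantics-aware, robust session-based recommendation model that combines LLMs with data-driven methods | large language model |
| 22 | MERIT: Maximum-normalized Element-wise Ratio for Language Model Large-batch Training | Proposes the MERIT optimizer, which uses max-norm-normalized element-wise ratios to improve large-batch training of language models | large language model |
| 23 | Towards Mitigating Excessive Forgetting in LLM Unlearning via Entanglement-Guidance with Proxy Constraint | Proposes the EGUP framework, which mitigates excessive forgetting in LLM unlearning through entanglement guidance and a proxy constraint | large language model |
| 24 | Developing a Multi-Modal Machine Learning Model For Predicting Performance of Automotive Hood Frames | Proposes a multimodal machine learning model that speeds up performance prediction and design iteration for automotive hood frames | multimodal |
| 25 | Poison Once, Refuse Forever: Weaponizing Alignment for Injecting Bias in LLMs | Proposes Subversive Alignment Injection, which exploits alignment mechanisms to inject bias into LLMs and enforce targeted censorship | large language model |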

🔬 Pillar 1: Robot Control (3 papers)

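| # | Title | Key point | Tags |
|---|---|---|---|
| 26 | Train-Once Plan-Anywhere Kinodynamic Motion Planning via Diffusion Trees | DiTree: combines diffusion trees with sampling-based planning to achieve motion planning that is trained once and generalizes across scenarios | sim-to-real, motion planning, diffusion policy |
| 27 | An Explainable, Attention-Enhanced, Bidirectional Long Short-Term Memory Neural Network for Joint 48-Hour Forecasting of Temperature, Irradiance, and Relative Humidity | Proposes an explainable, attention-enhanced BiLSTM network for joint 48-hour forecasting of meteorological data, supporting model predictive control of smart HVAC systems | MPC, model predictive control |
| 28 | Enhancing Resilience for IoE: A Perspective of Networking-Level Safeguard | Proposes a graph-structure-learning-based IoE network security defense framework that strengthens resilience against adversarial attacks | manipulation |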

🔬 Pillar 8: Physics-based Animation (1 paper)

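| # | Title | Key point | Tags |
|---|---|---|---|
| 29 | Spatiotemporal EEG-Based Emotion Recognition Using SAM Ratings from Serious Games with Hybrid Deep Learning | Proposes a unified multi-granularity EEG emotion classification framework to address the limitations of existing methods | spatiotemporal |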
