cs.LG (2025-08-14)

📊 28 papers in total | 🔗 2 with code

🎯 Interest Area Navigation

Pillar 2: RL Algorithms & Architecture (12, 🔗 2) · Pillar 9: Embodied Foundation Models (11) · Pillar 1: Robot Control (3) · Pillar 8: Physics-based Animation (2)

🔬 Pillar 2: RL Algorithms & Architecture (12 papers)

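# | Title | One-line summary | Tags | 🔗
1 | Predictive Multimodal Modeling of Diagnoses and Treatments in EHR | Proposes a multimodal predictive model for early prediction of diagnoses and treatments in electronic health records. | predictive model, multimodal
2 | REFN: A Reinforcement-Learning-From-Network Framework against 1-day/n-day Exploitations | Proposes the REFN framework, which uses reinforcement learning to train LLMs to autonomously generate network filters that defend against 1-day/n-day exploits. | reinforcement learning, RLHF, distillation
3 | eMamba: Efficient Acceleration Framework for Mamba Models in Edge Computing | eMamba: an efficient acceleration framework for Mamba models in edge computing. | Mamba, SSM, state space model
4 | Multi-Agent Reinforcement Learning for Adaptive Resource Orchestration in Cloud-Native Clusters | Proposes a multi-agent reinforcement learning method for adaptive resource orchestration that addresses resource dynamics and scheduling complexity in cloud-native clusters. | reinforcement learning, policy learning, reward shaping
5 | Stabilizing Long-term Multi-turn Reinforcement Learning with Gated Rewards | Proposes Gated Reward Accumulation to address reward sparsity in long-horizon reinforcement learning. | reinforcement learning, reward shaping
6 | Nonlocal Monte Carlo via Reinforcement Learning | Proposes a reinforcement-learning-based nonlocal Monte Carlo method to accelerate solving combinatorial optimization problems. | reinforcement learning, deep reinforcement learning
7 | SynBrain: Enhancing Visual-to-fMRI Synthesis via Probabilistic Representation Learning | SynBrain: a visual-to-fMRI synthesis framework based on probabilistic representation learning that improves neural decoding performance. | representation learning
8 | CURE: Critical-Token-Guided Re-Concatenation for Entropy-Collapse Prevention | CURE: a reinforcement learning method that prevents entropy collapse through critical-token-guided re-concatenation. | reinforcement learning, large language model
9 | Retro-Expert: Collaborative Reasoning for Interpretable Retrosynthesis | Proposes Retro-Expert, which achieves interpretable retrosynthesis prediction through collaborative reasoning. | reinforcement learning, large language model
10 | Variance Reduced Policy Gradient Method for Multi-Objective Reinforcement Learning | Proposes a variance-reduced policy gradient method that improves sample efficiency in multi-objective reinforcement learning. | reinforcement learning
11 | Geospatial Diffusion for Land Cover Imperviousness Change Forecasting | Proposes a geospatial diffusion model approach for forecasting land cover imperviousness change. | MAE, spatiotemporal
12 | Physics-Informed Reward Machines | Proposes physics-informed reward machines (pRMs) to improve expressiveness and learning efficiency for complex tasks in reinforcement learning. | reinforcement learning, reward shaping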

🔬 Pillar 9: Embodied Foundation Models (11 papers)

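# | Title | One-line summary | Tags | 🔗
13 | A Unified Multi-Agent Framework for Universal Multimodal Understanding and Generation | Proposes MAGUS, a unified multi-agent framework for universal multimodal understanding and generation. | multimodal, instruction following
14 | Conditional Information Bottleneck for Multimodal Fusion: Overcoming Shortcut Learning in Sarcasm Detection | Proposes a multimodal conditional information bottleneck model to address shortcut learning in sarcasm detection. | multimodal
15 | Flexible Personalized Split Federated Learning for On-Device Fine-Tuning of Foundation Models | Proposes FlexP-SFL for flexible, personalized split federated fine-tuning of large models on-device. | foundation model
16 | SC2Arena and StarEvolve: Benchmark and Self-Improvement Framework for LLMs in Complex Decision-Making Tasks | Proposes SC2Arena and StarEvolve to evaluate and improve LLM capabilities in complex decision-making tasks. | generalist agent, large language model
17 | A Vision-Language Pre-training Model-Guided Approach for Mitigating Backdoor Attacks in Federated Learning | Proposes CLIP-Fed to address backdoor attacks in federated learning. | large language model, multimodal
18 | Hybrid-Hierarchical Fashion Graph Attention Network for Compatibility-Oriented and Personalized Outfit Recommendation | Proposes FGAT, a hybrid-hierarchical fashion graph attention network for compatibility-oriented and personalized outfit recommendation. | multimodal
19 | BeyondWeb: Lessons from Scaling Synthetic Data for Trillion-scale Pretraining | BeyondWeb: lessons learned from scaling synthetic data for trillion-scale pretraining. | large language model
20 | APFL: Analytic Personalized Federated Learning via Dual-Stream Least Squares | Proposes APFL, analytic personalized federated learning via dual-stream least squares, which addresses personalized modeling under non-IID data. | foundation model
21 | Advancing Autonomous Incident Response: Leveraging LLMs and Cyber Threat Intelligence | Proposes a RAG-based LLM framework that leverages cyber threat intelligence to improve the efficiency of automated incident response. | large language model
22 | Technical Report: Facilitating the Adoption of Causal Inference Methods Through LLM-Empowered Co-Pilot | CATE-B: an LLM-based causal inference co-pilot that lowers the barrier to treatment effect estimation. | large language model
23 | RealAC: A Domain-Agnostic Framework for Realistic and Actionable Counterfactual Explanations | RealAC: a domain-agnostic framework for realistic and actionable counterfactual explanations. | large language model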

🔬 Pillar 1: Robot Control (3 papers)

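# | Title | One-line summary | Tags | 🔗
24 | Projected Coupled Diffusion for Test-Time Constrained Joint Generation | Proposes the Projected Coupled Diffusion (PCD) framework for joint generation with multiple diffusion models under test-time constraints. | manipulation, motion planning
25 | SHLIME: Foiling adversarial attacks fooling SHAP and LIME | SHLIME: uses adversarial attacks to expose attempts to fool SHAP and LIME, and defends against them. | manipulation
26 | Pinet: Optimizing hard-constrained neural networks with orthogonal projection layers | Proposes Πnet, which optimizes hard-constrained neural networks via orthogonal projection layers. | motion planning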

🔬 Pillar 8: Physics-based Animation (2 papers)

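# | Title | One-line summary | Tags | 🔗
27 | STRelay: A Universal Spatio-Temporal Relaying Framework for Location Prediction over Human Trajectory Data | STRelay: a universal spatio-temporal relaying framework that improves location prediction performance on human trajectory data. | spatiotemporal
28 | STRelay: A Universal Spatio-Temporal Relaying Framework for Location Prediction with Future Spatiotemporal Contexts | Proposes the STRelay framework to improve location prediction accuracy. | spatiotemporal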
