cs.LG (2025-07-31)

📊 23 papers in total | 🔗 2 with code

🎯 Interest Area Navigation

Pillar 2: RL Algorithms & Architecture (11) | Pillar 9: Embodied Foundation Models (9 🔗2) | Pillar 1: Robot Control (2) | Pillar 8: Physics-based Animation (1)

🔬 Pillar 2: RL Algorithms & Architecture (11 papers)

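# | Title | One-line summary | Tags | 🔗
1 | CX-Mind: A Pioneering Multimodal Large Language Model for Interleaved Reasoning in Chest X-ray via Curriculum-Guided Reinforcement Learning | CX-Mind: a multimodal large language model for chest X-rays that achieves interleaved reasoning via curriculum-guided reinforcement learning. | reinforcement learning, spatiotemporal, large language model
2 | DepMicroDiff: Diffusion-Based Dependency-Aware Multimodal Imputation for Microbiome Data | DepMicroDiff: a dependency-aware diffusion model for multimodal imputation of microbiome data. | MAE, large language model, multimodal
3 | INSPIRE-GNN: Intelligent Sensor Placement to Improve Sparse Bicycling Network Prediction via Reinforcement Learning Boosted Graph Neural Networks | Proposes INSPIRE-GNN, which uses reinforcement-learning-optimized graph neural networks to address sparse bicycling network prediction. | reinforcement learning, MAE
4 | One-Step Flow Policy Mirror Descent | Proposes the FPMD algorithm, which enables one-step sampling for flow policies and speeds up inference in online reinforcement learning. | reinforcement learning, diffusion policy, flow matching
5 | Merging Memory and Space: A State Space Neural Operator | Proposes the State Space Neural Operator (SS-NO) for efficiently learning solution operators of time-dependent partial differential equations. | SSM, state space model, spatiotemporal
6 | RecoMind: A Reinforcement Learning Framework for Optimizing In-Session User Satisfaction in Recommendation Systems | RecoMind: a reinforcement learning framework for optimizing in-session user satisfaction in recommendation systems. | reinforcement learning
7 | RL as Regressor: A Reinforcement Learning Approach for Function Approximation | Proposes a reinforcement-learning-based regression method that addresses the limitations of traditional regression loss functions. | reinforcement learning
8 | Benchmarking Partial Observability in Reinforcement Learning with a Suite of Memory-Improvable Domains | Proposes POBAX: an open-source JAX library for benchmarking partial observability in reinforcement learning. | reinforcement learning
9 | Hierarchical Message-Passing Policies for Multi-Agent Reinforcement Learning | Proposes hierarchical message-passing policies to address coordination in multi-agent reinforcement learning. | reinforcement learning
10 | GraphRAG-R1: Graph Retrieval-Augmented Generation with Process-Constrained Reinforcement Learning | GraphRAG-R1: a graph retrieval-augmented generation framework based on process-constrained reinforcement learning that improves LLM multi-hop reasoning. | reinforcement learning
11 | Predicting Large-scale Urban Network Dynamics with Energy-informed Graph Neural Diffusion | Proposes an energy-informed graph neural diffusion model for predicting large-scale urban network dynamics. | predictive model, spatiotemporal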

🔬 Pillar 9: Embodied Foundation Models (9 papers)

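# | Title | One-line summary | Tags | 🔗
12 | TriP-LLM: A Tri-Branch Patch-wise Large Language Model Framework for Time-Series Anomaly Detection | Proposes TriP-LLM, a tri-branch patch-wise large language model framework for time-series anomaly detection. | large language model, multimodal
13 | MoLAN: A Unified Modality-Aware Noise Dynamic Editing Framework for Multimodal Sentiment Analysis | Proposes the MoLAN framework, which improves multimodal sentiment analysis through modality-aware dynamic noise editing. | multimodal
14 | Robust Classification under Noisy Labels: A Geometry-Aware Reliability Framework for Foundation Models | Proposes a geometry-aware reliability framework that improves the robustness of foundation-model classification under noisy labels. | foundation model
15 | A Bayesian Hybrid Parameter-Efficient Fine-Tuning Method for Large Language Models | Proposes a Bayesian hybrid parameter-efficient fine-tuning method that improves the reliability and adaptability of large language models in business settings. | large language model
16 | Zero-Shot Document Understanding using Pseudo Table of Contents-Guided Retrieval-Augmented Generation | Proposes DocsRay to address complex document understanding. | large language model, multimodal
17 | From LLMs to Edge: Parameter-Efficient Fine-Tuning on Edge Devices | Explores parameter-efficient fine-tuning on edge devices: a study of LoRA, DoRA, and GaLore for convolutional neural networks. | large language model
18 | Differentially Private Clipped-SGD: High-Probability Convergence with Arbitrary Clipping Level | Proposes differentially private clipped SGD to address high-probability convergence under a fixed clipping level. | large language model
19 | Learning Like Humans: Resource-Efficient Federated Fine-Tuning through Cognitive Developmental Stages | Proposes DevFT, which performs federated fine-tuning through cognitive developmental stages for resource-efficient edge deployment of large language models. | large language model
20 | OKG-LLM: Aligning Ocean Knowledge Graph with Observation Data via LLMs for Global Sea Surface Temperature Prediction | OKG-LLM: uses large language models to align an ocean knowledge graph with observation data for global sea surface temperature prediction. | large language model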

🔬 Pillar 1: Robot Control (2 papers)

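# | Title | One-line summary | Tags | 🔗
21 | Policy Learning from Large Vision-Language Model Feedback without Reward Modeling | Proposes PLARE: offline reinforcement learning from vision-language model feedback, with no reward modeling required. | manipulation, reinforcement learning, policy learning
22 | NaN-Propagation: A Novel Method for Sparsity Detection in Black-Box Computational Functions | Proposes the NaN-propagation method for sparsity detection in black-box computational functions, improving the efficiency of gradient computation. | manipulation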

🔬 Pillar 8: Physics-based Animation (1 paper)

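# | Title | One-line summary | Tags | 🔗
23 | Designing Dynamic Pricing for Bike-sharing Systems via Differentiable Agent-based Simulation | Proposes a dynamic pricing method based on differentiable agent-based simulation to balance supply and demand in bike-sharing systems. | spatiotemporal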
