cs.LG (2025-10-27)

📊 24 papers | 🔗 2 with code

🎯 Interest Area Navigation

Pillar 2: RL Algorithms & Architecture (12) · Pillar 9: Embodied Foundation Models (8, 🔗 2) · Pillar 8: Physics-based Animation (2) · Pillar 4: Generative Motion (1) · Pillar 1: Robot Control (1)

🔬 Pillar 2: RL Algorithms & Architecture (12 papers)

| # | Title | One-line Takeaway | Tags | 🔗 |
|---|-------|-------------------|------|----|
| 1 | ZeroFlood: A Geospatial Foundation Model for Data-Efficient Flood Susceptibility Mapping | A geospatial foundation model for data-efficient flood susceptibility mapping | representation learning, foundation model | |
| 2 | Adapting Interleaved Encoders with PPO for Language-Guided Reinforcement Learning in BabyAI | Adapts interleaved encoders with PPO for language-guided reinforcement learning in BabyAI | reinforcement learning, deep reinforcement learning, PPO | |
| 3 | Debiasing Reward Models by Representation Learning with Guarantees | A representation-learning-based debiasing method that improves reward-model robustness | reinforcement learning, representation learning, large language model | |
| 4 | Lightweight Robust Direct Preference Optimization | Proposes DPO-PRO, improving DPO under noisy preferences via lightweight distributionally robust optimization | DPO, direct preference optimization, large language model | |
| 5 | On the Fundamental Limitations of Decentralized Learnable Reward Shaping in Cooperative Multi-Agent Reinforcement Learning | DMARL-RSA reveals the limitations of decentralized learnable reward shaping in cooperative multi-agent RL | reinforcement learning, reward shaping | |
| 6 | GIFT: Group-relative Implicit Fine Tuning Integrates GRPO with DPO and UNA | The GIFT framework combines the strengths of GRPO, DPO, and UNA for efficient LLM alignment | reinforcement learning, PPO, DPO | |
| 7 | The Best of N Worlds: Aligning Reinforcement Learning with Best-of-N Sampling via max@k Optimisation | An RL method based on max@k optimisation that improves LLM performance under Best-of-N sampling | reinforcement learning, large language model | |
| 8 | Offline Preference Optimization via Maximum Marginal Likelihood Estimation | MMPO, an offline preference-optimization method based on maximum marginal likelihood estimation, simplifies the LLM alignment pipeline | reinforcement learning, RLHF, large language model | |
| 9 | Learning to Reason Efficiently with Discounted Reinforcement Learning | An efficient reasoning method based on discounted RL that shortens reasoning chains while preserving accuracy | reinforcement learning | |
| 10 | Towards Stable and Effective Reinforcement Learning for Mixture-of-Experts | A routing-aware resampling method that stabilizes RL training of MoE models | reinforcement learning | |
| 11 | Sentinel: Dynamic Knowledge Distillation for Personalized Federated Intrusion Detection in Heterogeneous IoT Networks | Personalized federated intrusion detection in heterogeneous IoT networks via dynamic knowledge distillation | distillation | |
| 12 | Coupled Flow Matching | Coupled Flow Matching (CPFM) enables controllable dimensionality reduction and high-fidelity reconstruction | flow matching | |

🔬 Pillar 9: Embodied Foundation Models (8 papers)

| # | Title | One-line Takeaway | Tags | 🔗 |
|---|-------|-------------------|------|----|
| 13 | MUStReason: A Benchmark for Diagnosing Pragmatic Reasoning in Video-LMs for Multimodal Sarcasm Detection | The MUStReason benchmark diagnoses the pragmatic-reasoning ability of video language models in multimodal sarcasm detection | multimodal | |
| 14 | ScaLoRA: Optimally Scaled Low-Rank Adaptation for Efficient High-Rank Fine-Tuning | Optimally scaled low-rank adaptation for efficient high-rank fine-tuning | large language model | |
| 15 | Schrodinger Neural Network and Uncertainty Quantification: Quantum Machine | Proposes a Schrödinger neural network to address uncertainty quantification | multimodal | |
| 16 | Beyond Prompt Engineering: Neuro-Symbolic-Causal Architecture for Robust Multi-Objective AI Agents | Chimera, a neuro-symbolic-causal architecture that improves the robustness of multi-objective AI agents in e-commerce settings | large language model | |
| 17 | PAHQ: Accelerating Automated Circuit Discovery through Mixed-Precision Inference Optimization | PAHQ accelerates automated circuit discovery via mixed-precision quantization, improving LLM interpretability | large language model | |
| 18 | Increasing LLM Coding Capabilities through Diverse Synthetic Coding Tasks | Improves LLM coding capability through diverse synthetic coding tasks | large language model | |
| 19 | LLM Meets Diffusion: A Hybrid Framework for Crystal Material Generation | CrysLLMGen, a hybrid framework fusing LLMs and diffusion models for crystal material generation | large language model | |
| 20 | Can Language Models Compose Skills In-Context? | Shows that language models struggle to compose skills in-context and proposes improvements | chain-of-thought | |

🔬 Pillar 8: Physics-based Animation (2 papers)

| # | Title | One-line Takeaway | Tags | 🔗 |
|---|-------|-------------------|------|----|
| 21 | Modeling Biological Multifunctionality with Echo State Networks | Uses echo state networks to model biological multifunctionality, reproducing the spatiotemporal dynamics of biological systems | spatiotemporal | |
| 22 | A Physics-informed Multi-resolution Neural Operator | A physics-informed multi-resolution neural operator addressing data scarcity and resolution mismatch | spatiotemporal | |

🔬 Pillar 4: Generative Motion (1 paper)

| # | Title | One-line Takeaway | Tags | 🔗 |
|---|-------|-------------------|------|----|
| 23 | Introducing physics-informed generative models for targeting structural novelty in the exploration of chemical space | Physics-informed generative models targeting structural novelty in chemical-space exploration | physics-informed, diffusion, physically plausible | |

🔬 Pillar 1: Robot Control (1 paper)

| # | Title | One-line Takeaway | Tags | 🔗 |
|---|-------|-------------------|------|----|
| 24 | Learning Interpretable Features in Audio Latent Spaces via Sparse Autoencoders | A sparse-autoencoder framework for learning interpretable features in audio latent spaces, used to analyze and steer AI music generation | manipulation | |
