| # | Title | Summary | Keywords |
|---|-------|---------|----------|
| 1 | ZeroFlood: A Geospatial Foundation Model for Data-Efficient Flood Susceptibility Mapping | A geospatial foundation model for data-efficient flood susceptibility mapping. | representation learning, foundation model |
| 2 | Adapting Interleaved Encoders with PPO for Language-Guided Reinforcement Learning in BabyAI | Adapts interleaved encoders with PPO for language-guided reinforcement learning in BabyAI. | reinforcement learning, deep reinforcement learning, PPO |
| 3 | Debiasing Reward Models by Representation Learning with Guarantees | Proposes a representation-learning-based debiasing method to improve the robustness of reward models. | reinforcement learning, representation learning, large language model |
| 4 | Lightweight Robust Direct Preference Optimization | Proposes DPO-PRO, which improves DPO's performance under noise via lightweight distributionally robust optimization. | DPO, direct preference optimization, large language model |
| 5 | On the Fundamental Limitations of Decentralized Learnable Reward Shaping in Cooperative Multi-Agent Reinforcement Learning | DMARL-RSA reveals the limitations of decentralized learnable reward shaping in cooperative multi-agent reinforcement learning. | reinforcement learning, reward shaping |
| 6 | GIFT: Group-relative Implicit Fine Tuning Integrates GRPO with DPO and UNA | Proposes the GIFT framework, combining the strengths of GRPO, DPO, and UNA for efficient LLM alignment. | reinforcement learning, PPO, DPO |
| 7 | The Best of N Worlds: Aligning Reinforcement Learning with Best-of-N Sampling via max@k Optimisation | Proposes a reinforcement learning method based on max@k optimization to improve LLM performance under Best-of-N sampling. | reinforcement learning, large language model |
| 8 | Offline Preference Optimization via Maximum Marginal Likelihood Estimation | Proposes MMPO, an offline preference optimization method based on maximum marginal likelihood estimation that simplifies the LLM alignment pipeline. | reinforcement learning, RLHF, large language model |
| 9 | Learning to Reason Efficiently with Discounted Reinforcement Learning | Proposes an efficient reasoning method based on discounted reinforcement learning that shortens reasoning chains while preserving accuracy. | reinforcement learning |
| 10 | Towards Stable and Effective Reinforcement Learning for Mixture-of-Experts | Proposes a routing-aware resampling method to stabilize reinforcement learning training of MoE models. | reinforcement learning |
| 11 | Sentinel: Dynamic Knowledge Distillation for Personalized Federated Intrusion Detection in Heterogeneous IoT Networks | Dynamic knowledge distillation for personalized federated intrusion detection in heterogeneous IoT networks. | distillation |
| 12 | Coupled Flow Matching | Proposes Coupled Flow Matching (CPFM) for controllable dimensionality reduction and high-fidelity reconstruction. | flow matching |