cs.LG (2026-03-30)

📊 22 papers in total | 🔗 3 with code

🎯 Interest Area Navigation

Pillar 2: RL Algorithms & Architecture (8) · Pillar 9: Embodied Foundation Models (8 🔗1) · Pillar 1: Robot Control (5 🔗2) · Pillar 8: Physics-based Animation (1)

🔬 Pillar 2: RL Algorithms & Architecture (8 papers)

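#	Title	One-Sentence Summary	Tags	🔗	⭐
1	Evolutionary Discovery of Reinforcement Learning Algorithms via Large Language Models	Uses large language models to evolve and discover reinforcement learning algorithms without hand-designed update rules.	reinforcement learning PPO SAC
2	Critic-Free Deep Reinforcement Learning for Maritime Coverage Path Planning on Irregular Hexagonal Grids	Proposes a critic-free deep reinforcement learning method for coverage path planning in complex maritime areas.	reinforcement learning deep reinforcement learning DRL
3	Principal Prototype Analysis on Manifold for Interpretable Reinforcement Learning	Proposes a manifold-based principal prototype analysis method for interpretable reinforcement learning.	reinforcement learning large language model
4	Stepwise Credit Assignment for GRPO on Flow-Matching Models	Proposes Stepwise-Flow-GRPO, which assigns an appropriate reward to each step of a flow model's generation process.	reinforcement learning flow matching
5	Mixture-Model Preference Learning for Many-Objective Bayesian Optimization	Proposes a mixture-model preference learning method for modeling heterogeneous preferences in many-objective Bayesian optimization.	preference learning
6	Corruption-robust Offline Multi-agent Reinforcement Learning From Human Feedback	Proposes a corruption-robust method for offline multi-agent reinforcement learning.	reinforcement learning
7	ERPO: Token-Level Entropy-Regulated Policy Optimization for Large Reasoning Models	Proposes ERPO, which improves the reasoning ability of large models through token-level entropy-regulated policy optimization.	reinforcement learning large language model
8	Koopman-based surrogate modeling for reinforcement-learning-control of Rayleigh-Benard convection	Proposes a Koopman-operator-based surrogate model to accelerate reinforcement learning control of Rayleigh-Bénard convection.	reinforcement learning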

🔬 Pillar 9: Embodied Foundation Models (8 papers)

#	Title	One-Sentence Summary	Tags	🔗	⭐
9	ORACAL: A Robust and Explainable Multimodal Framework for Smart Contract Vulnerability Detection with Causal Graph Enrichment	ORACAL: a robust and explainable multimodal framework for smart contract vulnerability detection.	large language model multimodal
10	Multimodal Analytics of Cybersecurity Crisis Preparation Exercises: What Predicts Success?	Proposes a multimodal-analytics approach to evaluating cybersecurity crisis exercises and predicting team success.	multimodal
11	Graph Vector Field: A Unified Framework for Multimodal Health Risk Assessment from Heterogeneous Wearable and Environmental Data Streams	Proposes the Graph Vector Field (GVF) framework for multimodal health risk assessment.	multimodal
12	Efficient Inference of Large Vision Language Models	Survey: a framework of optimization techniques for efficient inference of large vision-language models.	multimodal
13	Rethinking Language Model Scaling under Transferable Hypersphere Optimization	Proposes the HyperP framework, which improves the scalability and training stability of large language models through transferable hypersphere optimization.	large language model	✅
14	See it to Place it: Evolving Macro Placements with Vision-Language Models	Proposes VeoPlace for the macro placement problem in chip layout.	foundation model
15	CirrusBench: Evaluating LLM-based Agents Beyond Correctness in Real-World Cloud Service Environments	CirrusBench: evaluates LLM agents in real-world cloud service environments, going beyond correctness.	large language model
16	ITQ3_S: High-Fidelity 3-bit LLM Inference via Interleaved Ternary Quantization with Rotation-Domain Smoothing	Proposes ITQ3_S for high-fidelity inference of large language models.	large language model

🔬 Pillar 1: Robot Control (5 papers)

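#	Title	One-Sentence Summary	Tags	🔗	⭐
17	LIBERO-Para: A Diagnostic Benchmark and Metrics for Paraphrase Robustness in VLA Models	LIBERO-Para: a diagnostic benchmark and evaluation metrics for the paraphrase robustness of VLA models.	manipulation vision-language-action VLA	✅
18	With a Little Help From My Friends: Collective Manipulation in Risk-Controlling Recommender Systems	Reveals a collective-manipulation vulnerability in risk-controlling recommender systems and proposes a user-level risk control mitigation strategy.	manipulation affordance
19	Gradient Manipulation in Distributed Stochastic Gradient Descent with Strategic Agents: Truthful Incentives with Convergence Guarantees	Proposes a distributed payment mechanism that guarantees truthfulness while achieving exact convergence of distributed stochastic gradient descent.	manipulation
20	Reducing Oracle Feedback with Vision-Language Embeddings for Preference-Based RL	ROVED: combines vision-language embeddings with selective oracle feedback to reduce the annotation cost of preference-based reinforcement learning.	manipulation reinforcement learning
21	InkDrop: Invisible Backdoor Attacks Against Dataset Condensation	Proposes InkDrop, which improves the stealthiness of backdoor attacks on dataset condensation.	manipulation	✅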

🔬 Pillar 8: Physics-based Animation (1 paper)

#	Title	One-Sentence Summary	Tags	🔗	⭐
22	BiFormer3D: Grid-Free Time-Domain Reconstruction of Head-Related Impulse Responses with a Spatially Encoded Transformer	BiFormer3D: time-domain reconstruction of head-related impulse responses with a spatially encoded Transformer.	PULSE
