cs.LG (2026-03-30)

📊 22 papers in total | 🔗 3 with code

🎯 Interest Area Navigation

Pillar 2: RL Algorithms & Architecture (8) · Pillar 9: Embodied Foundation Models (8 🔗1) · Pillar 1: Robot Control (5 🔗2) · Pillar 8: Physics-based Animation (1)

🔬 Pillar 2: RL Algorithms & Architecture (8 papers)

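| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 1 | Evolutionary Discovery of Reinforcement Learning Algorithms via Large Language Models | Uses large language models to discover reinforcement learning algorithms through evolution, with no hand-designed update rules. | reinforcement learning, PPO, SAC | |
| 2 | Critic-Free Deep Reinforcement Learning for Maritime Coverage Path Planning on Irregular Hexagonal Grids | Proposes a critic-free deep reinforcement learning method for coverage path planning in complex maritime areas. | reinforcement learning, deep reinforcement learning, DRL | |
| 3 | Principal Prototype Analysis on Manifold for Interpretable Reinforcement Learning | Proposes manifold-based principal prototype analysis for interpretable reinforcement learning. | reinforcement learning, large language model | |
| 4 | Stepwise Credit Assignment for GRPO on Flow-Matching Models | Proposes Stepwise-Flow-GRPO, which assigns an appropriate reward to each step of a flow model's generation process. | reinforcement learning, flow matching | |
| 5 | Mixture-Model Preference Learning for Many-Objective Bayesian Optimization | Proposes a mixture-model preference learning method for modeling heterogeneous preferences in many-objective Bayesian optimization. | preference learning | |
| 6 | Corruption-robust Offline Multi-agent Reinforcement Learning From Human Feedback | Proposes a corruption-robust method for offline multi-agent reinforcement learning. | reinforcement learning | |
| 7 | ERPO: Token-Level Entropy-Regulated Policy Optimization for Large Reasoning Models | Proposes ERPO, which improves the reasoning ability of large models through token-level entropy-regulated policy optimization. | reinforcement learning, large language model | |
| 8 | Koopman-based surrogate modeling for reinforcement-learning-control of Rayleigh-Benard convection | Proposes a Koopman-operator-based surrogate model to accelerate reinforcement learning control of Rayleigh-Bénard convection. | reinforcement learning | |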

🔬 Pillar 9: Embodied Foundation Models (8 papers)

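| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 9 | ORACAL: A Robust and Explainable Multimodal Framework for Smart Contract Vulnerability Detection with Causal Graph Enrichment | ORACAL: a robust and explainable multimodal framework for smart contract vulnerability detection. | large language model, multimodal | |
| 10 | Multimodal Analytics of Cybersecurity Crisis Preparation Exercises: What Predicts Success? | Proposes a multimodal-analytics-based method for evaluating cybersecurity crisis exercises and predicting team success rates. | multimodal | |
| 11 | Graph Vector Field: A Unified Framework for Multimodal Health Risk Assessment from Heterogeneous Wearable and Environmental Data Streams | Proposes the Graph Vector Field (GVF) framework for multimodal health risk assessment. | multimodal | |
| 12 | Efficient Inference of Large Vision Language Models | Survey: a technical optimization framework for efficient inference of large vision-language models. | multimodal | |
| 13 | Rethinking Language Model Scaling under Transferable Hypersphere Optimization | Proposes the HyperP framework, which improves large language model scalability and training stability through transferable hypersphere optimization. | large language model | |
| 14 | See it to Place it: Evolving Macro Placements with Vision-Language Models | Proposes VeoPlace to address macro placement in chip layout. | foundation model | |
| 15 | CirrusBench: Evaluating LLM-based Agents Beyond Correctness in Real-World Cloud Service Environments | CirrusBench: evaluates LLM agents beyond correctness in real-world cloud service environments. | large language model | |
| 16 | ITQ3_S: High-Fidelity 3-bit LLM Inference via Interleaved Ternary Quantization with Rotation-Domain Smoothing | Proposes ITQ3_S to address high-fidelity inference for large language models. | large language model | |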

🔬 Pillar 1: Robot Control (5 papers)

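| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 17 | LIBERO-Para: A Diagnostic Benchmark and Metrics for Paraphrase Robustness in VLA Models | LIBERO-Para: a diagnostic benchmark and evaluation metrics for paraphrase robustness in VLA models. | manipulation, vision-language-action, VLA | |
| 18 | With a Little Help From My Friends: Collective Manipulation in Risk-Controlling Recommender Systems | Reveals a collective manipulation vulnerability in risk-controlling recommender systems and proposes a user-level risk control mitigation strategy. | manipulation, affordance | |
| 19 | Gradient Manipulation in Distributed Stochastic Gradient Descent with Strategic Agents: Truthful Incentives with Convergence Guarantees | Proposes a distributed payment mechanism that guarantees truthfulness while achieving exact convergence of distributed stochastic gradient descent. | manipulation | |
| 20 | Reducing Oracle Feedback with Vision-Language Embeddings for Preference-Based RL | ROVED: combines vision-language embeddings with selective oracle feedback to reduce the annotation cost of preference-based reinforcement learning. | manipulation, reinforcement learning | |
| 21 | InkDrop: Invisible Backdoor Attacks Against Dataset Condensation | Proposes InkDrop, which improves the stealthiness of backdoor attacks on dataset condensation. | manipulation | |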

🔬 Pillar 8: Physics-based Animation (1 paper)

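| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 22 | BiFormer3D: Grid-Free Time-Domain Reconstruction of Head-Related Impulse Responses with a Spatially Encoded Transformer | BiFormer3D: time-domain reconstruction of head-related impulse responses using a spatially encoded Transformer. | PULSE | |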
