cs.LG(2026-04-21)
📊 24 papers in total | 🔗 1 with code
🎯 Interest Area Navigation
Pillar 2: RL & Architecture (14)
Pillar 9: Embodied Foundation Models (6)
Pillar 1: Robot Control (4 🔗1)
🔬 Pillar 2: RL & Architecture (14 papers)
🔬 Pillar 9: Embodied Foundation Models (6 papers)
| # | Title | Key Point | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
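| 15 | RDP LoRA: Geometry-Driven Identification for Parameter-Efficient Adaptation in Large Language Models | RDP LoRA: a geometry-driven parameter-efficient fine-tuning method for large language models | large language model | | |
| 16 | Calibrating Scientific Foundation Models with Inference-Time Stochastic Attention | Proposes a stochastic-attention-based calibration method for scientific foundation models that improves predictive uncertainty estimates | foundation model | | |
| 17 | Evaluating LLM-Generated Obfuscated XSS Payloads for Machine Learning-Based Detection | Uses large language models to generate obfuscated XSS attack payloads and evaluates their effectiveness against machine-learning-based detection | large language model | | |
| 18 | Learning Posterior Predictive Distributions for Node Classification from Synthetic Graph Priors | NodePFN: learns posterior predictive distributions for node classification from synthetic graph priors, enabling cross-graph generalization | large language model | | |
| 19 | FedProxy: Federated Fine-Tuning of LLMs via Proxy SLMs and Heterogeneity-Aware Fusion | FedProxy: federated fine-tuning of LLMs via proxy SLMs and heterogeneity-aware fusion | large language model | | |
| 20 | Decompose, Structure, and Repair: A Neuro-Symbolic Framework for Autoformalization via Operator Trees | Proposes DSR, a neuro-symbolic framework that structures autoformalization with operator trees and significantly improves theorem-proving performance | large language model | | |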
🔬 Pillar 1: Robot Control (4 papers)
| # | Title | Key Point | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
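| 21 | Accelerating trajectory optimization with Sobolev-trained diffusion policies | Accelerates trajectory optimization using Sobolev-trained diffusion policies | trajectory optimization, imitation learning, diffusion policy | | |
| 22 | Low-Rank Adaptation for Critic Learning in Off-Policy Reinforcement Learning | Proposes a LoRA-based structural sparsity regularization method that improves the stability and performance of critic learning in off-policy reinforcement learning | locomotion, reinforcement learning, SAC | | |
| 23 | FASTER: Value-Guided Sampling for Fast RL | FASTER: accelerates reinforcement learning through value-guided sampling, reducing the computational cost of diffusion policies | manipulation, reinforcement learning, VLA | ✅ | |
| 24 | HardNet++: Nonlinear Constraint Enforcement in Neural Networks | HardNet++: a general method for enforcing nonlinear constraints in neural networks | model predictive control | | |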