cs.LG (2026-04-29)
📊 15 papers in total | 🔗 2 with code
🎯 Interest Area Navigation
Pillar 2: RL & Architecture (6)
Pillar 9: Embodied Foundation Models (5 🔗2)
Pillar 1: Robot Control (2)
Pillar 4: Generative Motion (2)
🔬 Pillar 2: RL & Architecture (6 papers)
| # | Title | Key Point | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 1 | Cheeger-Hodge Contrastive Learning for Structurally Robust Graph Representation Learning | Proposes Cheeger-Hodge contrastive learning to address structural robustness in graph representation learning | representation learning, contrastive learning | | |
| 2 | PAINT: Partial-Solution Adaptive Interpolated Training for Self-Distilled Reasoners | PAINT: partial-solution adaptive interpolated training for self-distilled reasoners | reinforcement learning, distillation, large language model | | |
| 3 | Lyapunov-Guided Self-Alignment: Test-Time Adaptation for Offline Safe Reinforcement Learning | Proposes SAS, a Lyapunov-guided self-alignment method for test-time adaptation in offline safe reinforcement learning | reinforcement learning, offline reinforcement learning | | |
| 4 | Electricity price forecasting across Norway's five bidding zones in the post-crisis era | Proposes a LightGBM benchmark model for electricity price forecasting in response to structural changes in the Norwegian electricity market | MAE, multimodal | | |
| 5 | Addressing Performance Saturation for LLM RL via Precise Entropy Curve Control | Entrocraft: addresses performance saturation in LLM reinforcement learning through precise entropy curve control | reinforcement learning, large language model | | |
| 6 | DORA: A Scalable Asynchronous Reinforcement Learning System for Language Model Training | DORA: a scalable asynchronous reinforcement learning system for accelerating language model training | reinforcement learning | | |
🔬 Pillar 9: Embodied Foundation Models (5 papers)
| # | Title | Key Point | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 7 | Do Larger Models Really Win in Drug Discovery? A Benchmark Assessment of Model Scaling in AI-Driven Molecular Property and Activity Prediction | Evaluates the effect of model scale on molecular property prediction: small models remain competitive in drug discovery | large language model, foundation model | | |
| 8 | SplitFT: An Adaptive Federated Split Learning System For LLMs Fine-Tuning | SplitFT: an adaptive federated split learning system for LLM fine-tuning | large language model | | |
| 9 | CoQuant: Joint Weight-Activation Subspace Projection for Mixed-Precision LLMs | CoQuant: a joint weight-activation subspace projection quantization method for mixed-precision LLMs | large language model | ✅ | |
| 10 | Efficient, VRAM-Constrained xLM Inference on Clients | Proposes pipelined sharding for efficient, VRAM-constrained xLM inference on clients | large language model | ✅ | |
| 11 | Hierarchical Long-Term Semantic Memory for LinkedIn's Hiring Agent | Proposes HLTM, a hierarchical long-term semantic memory framework that improves personalization in LinkedIn's hiring assistant | large language model | | |
🔬 Pillar 1: Robot Control (2 papers)
| # | Title | Key Point | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
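| 12 | Uncertainty-Aware Predictive Safety Filters for Probabilistic Neural Network Dynamics | Proposes UPSi, an uncertainty-aware predictive safety filter built on probabilistic neural network dynamics models | model predictive control, reinforcement learning, deep reinforcement learning | | |
| 13 | Learning Over-Relaxation Policies for ADMM with Convergence Guarantees | Proposes an online-learning-based over-relaxation policy for ADMM that speeds up convergence while preserving convergence guarantees, with applications such as model predictive control | MPC, model predictive control | | |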
🔬 Pillar 4: Generative Motion (2 papers)
| # | Title | Key Point | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 14 | AlphaJet: Automated Conceptual Aircraft Synthesis via Disentangled Generative Priors and Topology-Preserving Evolutionary Search | AlphaJet: automated conceptual aircraft synthesis via disentangled generative priors and topology-preserving evolutionary search | penetration | | |
| 15 | DiffAnon: Diffusion-based Prosody Control for Voice Anonymization | DiffAnon: a diffusion-based voice anonymization method with a controllable degree of prosody preservation | classifier-free guidance | | |