cs.LG (2026-04-29)

📊 15 papers in total | 🔗 2 with code

🎯 Interest Area Navigation

Pillar 2: RL Algorithms & Architecture (6) · Pillar 9: Embodied Foundation Models (5 🔗2) · Pillar 1: Robot Control (2) · Pillar 4: Generative Motion (2)

🔬 Pillar 2: RL Algorithms & Architecture (6 papers)

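# | Title | One-line summary | Tags | 🔗
1 | Cheeger--Hodge Contrastive Learning for Structurally Robust Graph Representation Learning | Proposes Cheeger-Hodge contrastive learning to address structural robustness in graph representation learning | representation learning, contrastive learning
2 | PAINT: Partial-Solution Adaptive Interpolated Training for Self-Distilled Reasoners | PAINT: partial-solution adaptive interpolated training for self-distilled reasoners | reinforcement learning, distillation, large language model
3 | Lyapunov-Guided Self-Alignment: Test-Time Adaptation for Offline Safe Reinforcement Learning | Proposes SAS, a Lyapunov-guided self-alignment method for test-time adaptation in offline safe reinforcement learning | reinforcement learning, offline reinforcement learning
4 | Electricity price forecasting across Norway's five bidding zones in the post-crisis era | Proposes a LightGBM baseline model for electricity price forecasting in response to structural changes in the Norwegian electricity market | MAE, multimodal
5 | Addressing Performance Saturation for LLM RL via Precise Entropy Curve Control | Entrocraft: addresses performance saturation in LLM reinforcement learning through precise entropy curve control | reinforcement learning, large language model
6 | DORA: A Scalable Asynchronous Reinforcement Learning System for Language Model Training | DORA: a scalable asynchronous reinforcement learning system for accelerating language model training | reinforcement learning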

🔬 Pillar 9: Embodied Foundation Models (5 papers)

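# | Title | One-line summary | Tags | 🔗
7 | Do Larger Models Really Win in Drug Discovery? A Benchmark Assessment of Model Scaling in AI-Driven Molecular Property and Activity Prediction | Evaluates model-scaling effects in molecular property prediction: small models remain competitive in drug discovery | large language model, foundation model
8 | SplitFT: An Adaptive Federated Split Learning System For LLMs Fine-Tuning | SplitFT: an adaptive federated split learning system for LLM fine-tuning | large language model
9 | CoQuant: Joint Weight-Activation Subspace Projection for Mixed-Precision LLMs | CoQuant: a joint weight-activation subspace projection quantization method for mixed-precision LLMs | large language model
10 | Efficient, VRAM-Constrained xLM Inference on Clients | Proposes pipelined sharding for efficient xLM inference on VRAM-constrained clients | large language model
11 | Hierarchical Long-Term Semantic Memory for LinkedIn's Hiring Agent | Proposes HLTM, a hierarchical long-term semantic memory framework that improves the personalization of LinkedIn's hiring assistant | large language model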

🔬 Pillar 1: Robot Control (2 papers)

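# | Title | One-line summary | Tags | 🔗
12 | Uncertainty-Aware Predictive Safety Filters for Probabilistic Neural Network Dynamics | Proposes UPSi, an uncertainty-aware predictive safety filter built on probabilistic neural network dynamics models | model predictive control, reinforcement learning, deep reinforcement learning
13 | Learning Over-Relaxation Policies for ADMM with Convergence Guarantees | Proposes an online-learning-based over-relaxation policy for ADMM that speeds up convergence while guaranteeing it, with applications such as model predictive control | MPC, model predictive control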

🔬 Pillar 4: Generative Motion (2 papers)

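# | Title | One-line summary | Tags | 🔗
14 | AlphaJet: Automated Conceptual Aircraft Synthesis via Disentangled Generative Priors and Topology-Preserving Evolutionary Search | AlphaJet: automated conceptual aircraft synthesis via disentangled generative priors and topology-preserving evolutionary search | penetration
15 | DiffAnon: Diffusion-based Prosody Control for Voice Anonymization | DiffAnon: a diffusion-based voice anonymization method with a controllable degree of prosody preservation | classifier-free guidance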
