cs.LG (2026-03-19)

📊 32 papers in total | 🔗 4 with code

🎯 Interest Area Navigation

Pillar 2: RL Algorithms & Architecture (15 🔗3) · Pillar 9: Embodied Foundation Models (12 🔗1) · Pillar 1: Robot Control (3) · Pillar 3: Perception & Semantics (1) · Pillar 8: Physics-based Animation (1)

🔬 Pillar 2: RL Algorithms & Architecture (15 papers)

# | Title | One-line Summary | Tags | 🔗
1 | AcceRL: A Distributed Asynchronous Reinforcement Learning and World Model Framework for Vision-Language-Action Models | AcceRL: a distributed asynchronous reinforcement learning and world-model framework for VLA models | reinforcement learning, world model, vision-language-action
2 | RE-SAC: Disentangling aleatoric and epistemic risks in bus fleet control: A stable and robust ensemble DRL approach | RE-SAC: disentangles aleatoric and epistemic uncertainty for stable, robust bus fleet control | reinforcement learning, deep reinforcement learning, DRL
3 | STEP: Scientific Time-Series Encoder Pretraining via Cross-Domain Distillation | STEP: pretrains a scientific time-series encoder via cross-domain distillation to improve representation learning | representation learning, distillation, foundation model
4 | HISR: Hindsight Information Modulated Segmental Process Rewards For Multi-turn Agentic Reinforcement Learning | Proposes HISR, hindsight-information-modulated segmental process rewards that improve multi-turn agentic RL | reinforcement learning, large language model
5 | Discounted Beta--Bernoulli Reward Estimation for Sample-Efficient Reinforcement Learning with Verifiable Rewards | Proposes discounted Beta-Bernoulli reward estimation to improve sample efficiency in RL with verifiable rewards | reinforcement learning, large language model
6 | Adaptive Regime-Aware Stock Price Prediction Using Autoencoder-Gated Dual Node Transformers with Reinforcement Learning Control | An adaptive, regime-aware stock-price prediction framework combining autoencoder-gated dual-node Transformers with RL control | reinforcement learning, SAC
7 | CausalRM: Causal-Theoretic Reward Modeling for RLHF from Observational User Feedbacks | CausalRM: causal-theoretic reward modeling for RLHF from observational user feedback | reinforcement learning, RLHF
8 | Enhancing Pretrained Model-based Continual Representation Learning via Guided Random Projection | Proposes SCL-MGSM, which enhances pretrained-model-based continual representation learning via guided random projection | representation learning
9 | Context Bootstrapped Reinforcement Learning | Proposes context-bootstrapped reinforcement learning (CBRL) to improve exploration efficiency on complex reasoning tasks | reinforcement learning
10 | Are complicated loss functions necessary for teaching LLMs to reason? | Proposes RGRA, a simplified REINFORCE method that improves LLM mathematical reasoning without complex constraints | PPO, large language model
11 | Balancing the Reasoning Load: Difficulty-Differentiated Policy Optimization with Length Redistribution for Efficient and Robust Reinforcement Learning | Proposes difficulty-differentiated policy optimization (DDPO) to address over-thinking and under-thinking in LLM reasoning | reinforcement learning
12 | iSatCR: Graph-Empowered Joint Onboard Computing and Routing for LEO Data Delivery | iSatCR: a graph-neural-network-empowered joint onboard computing and routing method for LEO satellite data delivery | reinforcement learning, deep reinforcement learning
13 | AcceRL: A Distributed Asynchronous Reinforcement Learning and World Model Framework for Vision-Language-Action Models | AcceRL: an efficient distributed asynchronous RL framework for vision-language-action models | reinforcement learning, world model, vision-language-action
14 | Optimizing Resource-Constrained Non-Pharmaceutical Interventions for Multi-Cluster Outbreak Control Using Hierarchical Reinforcement Learning | Proposes a hierarchical RL framework that optimizes non-pharmaceutical interventions for multi-cluster outbreak control under resource constraints | reinforcement learning
15 | Balancing the Reasoning Load: Difficulty-Differentiated Policy Optimization with Length Redistribution for Efficient and Robust Reinforcement Learning | Proposes difficulty-differentiated policy optimization (DDPO) to address over-thinking and under-thinking in LLM reasoning | reinforcement learning

🔬 Pillar 9: Embodied Foundation Models (12 papers)

# | Title | One-line Summary | Tags | 🔗
16 | From Inference Efficiency to Embodied Efficiency: Revisiting Efficiency Metrics for Vision-Language-Action Models | Revisits efficiency metrics for VLA models, shifting focus to system-level efficiency for embodied intelligence | vision-language-action, VLA
17 | On Optimizing Multimodal Jailbreaks for Spoken Language Models | Proposes JAMA, a multimodal jailbreak attack on spoken language models that jointly optimizes text and audio | multimodal
18 | Online Learning and Equilibrium Computation with Ranking Feedback | Online learning and equilibrium computation with ranking feedback, with sublinear-regret algorithms under variational utilities | large language model
19 | AgentDS Technical Report: Benchmarking the Future of Human-AI Collaboration in Domain-Specific Data Science | AgentDS: benchmarking and exploring the future of human-AI collaboration in domain-specific data science | large language model
20 | Book your room in the Turing Hotel! A symmetric and distributed Turing Test with multiple AIs and humans | Proposes TuringHotel, a symmetric, distributed Turing test for evaluating multiple AIs and humans | large language model
21 | BeamAgent: LLM-Aided MIMO Beamforming with Decoupled Intent Parsing and Alternating Optimization for Joint Site Selection and Precoding | BeamAgent: LLM-aided MIMO beamforming with decoupled intent parsing and alternating optimization | large language model
22 | SpecForge: A Flexible and Efficient Open-Source Training Framework for Speculative Decoding | SpecForge: a flexible and efficient open-source training framework for speculative decoding | large language model
23 | GeoLAN: Geometric Learning of Latent Explanatory Directions in Large Language Models | GeoLAN: improves latent explanatory directions in large language models via geometric learning | large language model
24 | ICLAD: In-Context Learning for Unified Tabular Anomaly Detection Across Supervision Regimes | ICLAD: an in-context-learning framework for unified tabular anomaly detection across supervision regimes | foundation model
25 | Any-Subgroup Equivariant Networks via Symmetry Breaking | Proposes Any-Subgroup Equivariant Networks, achieving equivariance to multiple subgroups via symmetry breaking | foundation model
26 | Automated Membership Inference Attacks: Discovering MIA Signal Computations using LLM Agents | AutoMIA: uses LLM agents to automatically design membership inference attacks, improving attack effectiveness | large language model
27 | Exploring the Agentic Frontier of Verilog Code Generation | The first evaluation of agentic LLM systems for Verilog code generation, revealing the key roles of tool design and structured prompting | large language model

🔬 Pillar 1: Robot Control (3 papers)

# | Title | One-line Summary | Tags | 🔗
28 | Cyber-Resilient Digital Twins: Discriminating Attacks for Safe Critical Infrastructure Control | Proposes i-SDT, combining digital twins with adaptive control to strengthen the cyber resilience of critical infrastructure | manipulation, MPC, model predictive control
29 | Robustness, Cost, and Attack-Surface Concentration in Phishing Detection | Proposes a cost-sensitive adversarial attack framework that exposes vulnerabilities and attack-surface concentration in phishing detection | manipulation
30 | Anatomical Heterogeneity in Transformer Language Models | Reveals inter-layer heterogeneity in Transformer language models and proposes a training method with heterogeneous budget allocation | manipulation

🔬 Pillar 3: Perception & Semantics (1 paper)

# | Title | One-line Summary | Tags | 🔗
31 | From ex(p) to poly: Gaussian Splatting with Polynomial Kernels | Proposes Gaussian splatting with polynomial kernels, compatible with existing datasets and more computationally efficient | 3DGS, gaussian splatting, splatting

🔬 Pillar 8: Physics-based Animation (1 paper)

# | Title | One-line Summary | Tags | 🔗
32 | Beyond Passive Aggregation: Active Auditing and Topology-Aware Defense in Decentralized Federated Learning | Proposes an active-auditing and topology-aware defense framework that hardens decentralized federated learning against backdoor attacks | spatiotemporal
