cs.LG (2026-04-09)

📊 34 papers | 🔗 2 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (16 🔗2) · Pillar 2: RL & Architecture (14) · Pillar 8: Physics-based Animation (2) · Pillar 1: Robot Control (1) · Pillar 6: Video Extraction (1)

🔬 Pillar 9: Embodied Foundation Models (16 papers)

# | Title | One-line Takeaway | Tags
1 | Tree-of-Evidence: Efficient "System 2" Search for Faithful Multimodal Grounding | Proposes the Tree-of-Evidence algorithm to improve the decision interpretability and faithfulness of multimodal LLMs. | multimodal
2 | Preference Redirection via Attention Concentration: An Attack on Computer Use Agents | Proposes PRAC, an attack that redirects the preferences of computer-use agents via attention concentration. | foundation model, multimodal
3 | SOLAR: Communication-Efficient Model Adaptation via Subspace-Oriented Latent Adapter Reparametrization | SOLAR achieves communication-efficient model adaptation through subspace-oriented latent adapter reparametrization. | foundation model
4 | Meta-learning In-Context Enables Training-Free Cross Subject Brain Decoding | Proposes a meta-learned in-context method that enables training-free cross-subject brain decoding. | foundation model
5 | What Drives Representation Steering? A Mechanistic Case Study on Steering Refusal | A mechanistic case study that uncovers how representation steering works, focused on refusal behavior. | large language model
6 | Zero-shot Multivariate Time Series Forecasting Using Tabular Prior Fitted Networks | Proposes a zero-shot multivariate time-series forecasting framework built on tabular prior-fitted networks. | foundation model
7 | ADAPTive Input Training for Many-to-One Pre-Training on Time-Series Classification | ADAPT: many-to-one pre-training for time-series classification that handles heterogeneous inputs. | foundation model
8 | Dead Weights, Live Signals: Feedforward Graphs of Frozen Language Models | Proposes a feedforward-graph architecture over frozen language models, enabling knowledge fusion and performance gains. | large language model
9 | Alloc-MoE: Budget-Aware Expert Activation Allocation for Efficient Mixture-of-Experts Inference | Proposes Alloc-MoE to address the inference latency caused by sparse expert activation. | large language model
10 | Automating aggregation strategy selection in federated learning | Proposes a framework that automates aggregation-strategy selection in federated learning, improving generalization on non-IID data. | large language model
11 | Rethinking Residual Errors in Compensation-based LLM Quantization | Revisits quantization residual errors to improve compensation-based LLM quantization. | large language model
12 | QoS-QoE Translation with Large Language Model | Builds a QoS-QoE translation dataset and uses LLMs for bidirectional translation, improving multimedia quality prediction and optimization. | large language model
13 | PRAGMA: Revolut Foundation Model | PRAGMA: Revolut's foundation model for financial event sequences. | foundation model
14 | HiFloat4 Format for Language Model Pre-training on Ascend NPUs | Studies the HiFloat4 format for LLM pre-training on Huawei Ascend NPUs and optimizes training stability. | large language model, foundation model
15 | Adaptive Simulation Experiment for LLM Policy Optimization | Proposes a contrast-based adaptive simulation-experiment framework for LLM policy optimization. | large language model
16 | Every Response Counts: Quantifying Uncertainty of LLM-based Multi-Agent Systems through Tensor Decomposition | Proposes MATU, which quantifies the uncertainty of LLM-based multi-agent systems via tensor decomposition. | large language model
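
Alloc-MoE's budget-aware allocation policy (entry 9) cannot be reconstructed from a one-line summary, but the top-k expert gating it allocates a budget over is the standard Mixture-of-Experts routing step. Below is a minimal numpy sketch of generic top-k gating, not the paper's method; all function names are hypothetical:

```python
import numpy as np

def topk_gate(logits, k):
    """Select the top-k experts per token and renormalise their gate weights."""
    idx = np.argsort(logits)[::-1][:k]           # indices of the k largest logits
    w = np.exp(logits[idx] - logits[idx].max())  # numerically stable softmax over selected experts
    return idx, w / w.sum()

def moe_forward(x, experts, gate_w, k=2):
    """Route one token through a budget of k experts out of len(experts)."""
    logits = gate_w @ x                          # one gating score per expert
    idx, w = topk_gate(logits, k)
    return sum(wi * experts[i](x) for wi, i in zip(w, idx))

rng = np.random.default_rng(0)
d, n_experts = 4, 8
# Toy linear "experts"; the default-argument trick pins each expert's weight matrix.
experts = [lambda x, W=rng.standard_normal((d, d)): W @ x for _ in range(n_experts)]
gate_w = rng.standard_normal((n_experts, d))
y = moe_forward(rng.standard_normal(d), experts, gate_w, k=2)
```

Only k of the n_experts expert networks run per token, which is exactly the sparse-activation compute profile that budget-aware schemes like Alloc-MoE tune.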

🔬 Pillar 2: RL & Architecture (14 papers)

# | Title | One-line Takeaway | Tags
17 | Multimodal Latent Reasoning via Predictive Embeddings | Proposes Pearl, which performs multimodal latent-space reasoning via predictive-embedding alignment, without explicit tool calls. | JEPA, depth estimation, multimodal
18 | Value-Guidance MeanFlow for Offline Multi-Agent Reinforcement Learning | Proposes VGM$^2$P, using value-guided MeanFlow to improve policy-learning efficiency in offline multi-agent RL. | reinforcement learning, policy learning, behavior cloning
19 | CausalVAE as a Plug-in for World Models: Towards Reliable Counterfactual Dynamics | Proposes a plug-in CausalVAE module that makes world models' counterfactual dynamics predictions more reliable. | world models
20 | Reinforcement Learning with LLM-Guided Action Spaces for Synthesizable Lead Optimization | MolReAct: RL-based lead optimization with LLM guidance and reaction-template constraints. | reinforcement learning, large language model
21 | Less Approximates More: Harmonizing Performance and Confidence Faithfulness via Hybrid Post-Training for High-Stakes Tasks | Proposes HyTuning, a hybrid post-training framework that improves the confidence faithfulness of large models on high-stakes tasks. | reinforcement learning, distillation, large language model
22 | TTVS: Boosting Self-Exploring Reinforcement Learning via Test-time Variational Synthesis | Proposes TTVS, which boosts self-exploring RL via test-time variational synthesis, addressing the scarcity of supervised data in specialized domains. | reinforcement learning
23 | QaRL: Rollout-Aligned Quantization-Aware RL for Fast and Stable Training under Training--Inference Mismatch | QaRL: rollout-aligned quantization-aware RL that speeds up LLM training and improves its stability. | reinforcement learning, large language model
24 | Structured Distillation of Web Agent Capabilities Enables Generalization | Proposes an Agent-as-Annotators framework: structured distillation improves web agents' generalization in complex environments. | distillation
25 | MIPT-SSM: Scaling Language Models with $O(1)$ Inference Cache via Phase Transitions | Proposes MIPT-SSM to improve language-model inference efficiency. | SSM
26 | An Imperfect Verifier is Good Enough: Learning with Noisy Rewards | Shows that RL with noisy rewards is robust in LLM training. | reinforcement learning, large language model
27 | Alleviating Community Fear in Disasters via Multi-Agent Actor-Critic Reinforcement Learning | Proposes a multi-agent actor-critic RL method for alleviating community fear during disasters. | reinforcement learning
28 | Wireless Communication Enhanced Value Decomposition for Multi-Agent Reinforcement Learning | Proposes CLOVER, which uses wireless-communication graphs to enhance value decomposition in multi-agent RL. | reinforcement learning
29 | StructRL: Recovering Dynamic Programming Structure from Learning Dynamics in Distributional Reinforcement Learning | Proposes StructRL to recover dynamic-programming structure from the learning dynamics of distributional RL. | reinforcement learning
30 | From Selection to Scheduling: Federated Geometry-Aware Correction Makes Exemplar Replay Work Better under Continual Dynamic Heterogeneity | Proposes FEAT, a federated geometry-aware correction method that makes exemplar replay work better under continual dynamic heterogeneity. | distillation, geometric consistency
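
CLOVER's wireless-communication graph (entry 28) is a paper-specific ingredient, but the value decomposition it enhances is classically the additive (VDN-style) factorisation Q_tot(s, a) = Σ_i Q_i(s, a_i), whose key property is that each agent can act greedily on its own utility. A minimal sketch under that generic assumption (names hypothetical, not from the paper):

```python
import numpy as np

def q_tot(per_agent_q, actions):
    """VDN-style factorisation: the joint value is the sum of per-agent values."""
    return sum(q[a] for q, a in zip(per_agent_q, actions))

def greedy_joint_action(per_agent_q):
    """Additivity lets each agent maximise its own Q independently,
    avoiding a search over the exponential joint action space."""
    return [int(np.argmax(q)) for q in per_agent_q]

qs = [np.array([0.1, 0.9]), np.array([0.5, 0.2])]   # two agents, two actions each
acts = greedy_joint_action(qs)
```

Communication-aware variants such as CLOVER replace the plain sum with a weighted or graph-structured mixing of the per-agent values; the decentralised-greedy property above is what any such mixing must preserve.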

🔬 Pillar 8: Physics-based Animation (2 papers)

# | Title | One-line Takeaway | Tags
31 | Kuramoto Oscillatory Phase Encoding: Neuro-inspired Synchronization for Improved Learning Efficiency | Proposes Kuramoto oscillatory phase encoding (KoPE), a neuro-inspired synchronization mechanism that improves Vision Transformer learning efficiency. | spatiotemporal
32 | Bias-Constrained Diffusion Schedules for PDE Emulations: Reconstruction Error Minimization and Efficient Unrolled Training | Proposes bias-constrained diffusion schedules that improve PDE-emulation accuracy and training efficiency. | spatiotemporal
🔬 Pillar 1: Robot Control (1 paper)

# | Title | One-line Takeaway | Tags
34 | PriPG-RL: Privileged Planner-Guided Reinforcement Learning for Partially Observable Systems with Anytime-Feasible MPC | Proposes Privileged Planner-Guided RL for partially observable systems, combined with anytime-feasible MPC. | quadruped, MPC, model predictive control
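
PriPG-RL's privileged planner and its anytime-feasibility guarantee are paper-specific, but the MPC loop it guides is standard receding-horizon control: plan over a short horizon against a model, apply only the first action, then re-plan. A minimal sampling-based (random-shooting) sketch of that loop on a toy 1-D system (all names hypothetical):

```python
import numpy as np

def shooting_mpc(x0, horizon, n_samples, dynamics, cost, rng):
    """Sample-based MPC: sample action sequences, roll each out through the
    model, and return the FIRST action of the lowest-cost sequence."""
    best_cost, best_a0 = np.inf, 0.0
    for _ in range(n_samples):
        actions = rng.uniform(-1.0, 1.0, horizon)
        x, total = x0, 0.0
        for a in actions:
            x = dynamics(x, a)
            total += cost(x, a)
        if total < best_cost:
            best_cost, best_a0 = total, actions[0]
    return best_a0

dynamics = lambda x, a: x + 0.1 * a      # 1-D integrator toy system
cost = lambda x, a: x**2 + 0.01 * a**2   # drive the state to the origin
rng = np.random.default_rng(0)
x = 1.0
for _ in range(50):                      # receding-horizon loop: re-plan every step
    x = dynamics(x, shooting_mpc(x, 10, 64, dynamics, cost, rng))
```

Re-planning at every step is what makes the scheme closed-loop; an "anytime-feasible" variant would additionally guarantee a feasible fallback plan whenever the optimizer is cut short.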

🔬 Pillar 6: Video Extraction (1 paper)

# | Title | One-line Takeaway | Tags
35 | EgoEverything: A Benchmark for Human Behavior Inspired Long Context Egocentric Video Understanding in AR Environment | EgoEverything: a human-behavior-inspired benchmark for long-context egocentric video understanding in AR environments. | egocentric
