cs.LG (2026-01-08)

📊 21 papers in total | 🔗 3 with code

🎯 Interest Area Navigation

Pillar 2: RL Algorithms & Architecture (11 🔗2)
Pillar 9: Embodied Foundation Models (6 🔗1)
Pillar 1: Robot Control (2)
Pillar 3: Spatial Perception & Semantics (1)
Pillar 5: Interaction & Reaction (1)

🔬 Pillar 2: RL Algorithms & Architecture (11 papers)

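| # | Title | One-line summary | Tags | 🔗 |
|---|-------|------------------|------|----|
| 1 | MPM-LLM4DSE: Reaching the Pareto Frontier in HLS with Multimodal Learning and LLM-Driven Exploration | MPM-LLM4DSE: Pareto-frontier optimization for HLS using multimodal learning and LLM-driven exploration. | predictive model, large language model, multimodal | |
| 2 | Precision over Diversity: High-Precision Reward Generalizes to Robust Instruction Following | High-precision reward beats diversity: improves the robustness and generalization of instruction following. | reinforcement learning, instruction following | |
| 3 | Nightmare Dreamer: Dreaming About Unsafe States And Planning Ahead | Proposes Nightmare Dreamer, which performs safe reinforcement learning by predicting unsafe states. | reinforcement learning, world model, dreamer | |
| 4 | On the Hidden Objective Biases of Group-based Reinforcement Learning | Reveals hidden objective biases in group-based reinforcement learning, offering guidance for future algorithm design. | reinforcement learning, large language model | |
| 5 | FedKDX: Federated Learning with Negative Knowledge Distillation for Enhanced Healthcare AI Systems | FedKDX: a federated learning framework based on negative knowledge distillation that improves healthcare AI system performance. | contrastive learning, distillation | |
| 6 | TSSR: Two-Stage Swap-Reward-Driven Reinforcement Learning for Character-Level SMILES Generation | Proposes TSSR, a two-stage swap-reward-driven reinforcement learning method for character-level SMILES generation. | reinforcement learning, PPO | |
| 7 | Safe Continual Reinforcement Learning Methods for Nonstationary Environments. Towards a Survey of the State of the Art | Surveys research progress and challenges in safe continual reinforcement learning methods for nonstationary environments. | reinforcement learning | |
| 8 | DeepWeightFlow: Re-Basined Flow Matching for Generating Neural Network Weights | DeepWeightFlow: a method for generating neural network weights based on re-basined flow matching. | flow matching | |
| 9 | AgentOCR: Reimagining Agent History via Optical Self-Compression | AgentOCR: restructures agent history through optical self-compression to improve efficiency. | reinforcement learning, large language model | |
| 10 | Improving Semi-Supervised Contrastive Learning via Entropy-Weighted Confidence Integration of Anchor-Positive Pairs | Proposes a semi-supervised contrastive learning method based on entropy-weighted confidence integration, improving classification accuracy when labels are scarce. | contrastive learning | |
| 11 | Not All Steps are Informative: On the Linearity of LLMs' RLVR Training | Reveals the linearity of LLMs' RLVR training and proposes weight/logits extrapolation to accelerate training. | reinforcement learning, large language model | |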

🔬 Pillar 9: Embodied Foundation Models (6 papers)

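| # | Title | One-line summary | Tags | 🔗 |
|---|-------|------------------|------|----|
| 12 | GPU-Accelerated INT8 Quantization for KV Cache Compression in Large Language Models | Proposes a GPU-accelerated INT8 quantization method for compressing the KV cache in large language models. | large language model | |
| 13 | IGenBench: Benchmarking the Reliability of Text-to-Infographic Generation | IGenBench: builds a benchmark for evaluating the reliability of text-to-infographic generation. | large language model, multimodal | |
| 14 | Milestones over Outcome: Unlocking Geometric Reasoning with Sub-Goal Verifiable Reward | Proposes the SGVR framework, which improves MLLM geometric reasoning through sub-goal verifiable rewards. | large language model, multimodal | |
| 15 | A Vision for Multisensory Intelligence: Sensing, Synergy, and Science | Proposes multisensory intelligence as a research direction, aiming to improve AI's ability to perceive, understand, and interact with the world. | multimodal | |
| 16 | Learnable Multipliers: Freeing the Scale of Language Model Matrix Layers | Proposes learnable multipliers that free the scale of language model matrix layers, improving model performance. | large language model | |
| 17 | Do LLMs Benefit from User and Item Embeddings in Recommendation Tasks? | Proposes a lightweight projection module that integrates user and item embeddings into LLMs to improve recommendation performance. | large language model | |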

🔬 Pillar 1: Robot Control (2 papers)

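| # | Title | One-line summary | Tags | 🔗 |
|---|-------|------------------|------|----|
| 18 | Robust Reasoning as a Symmetry-Protected Topological Phase | Proposes the Holonomic Network, which achieves reasoning robust to semantic noise through a symmetry-protected topological phase. | manipulation, large language model | |
| 19 | On the Definition and Detection of Cherry-Picking in Counterfactual Explanations | Defines and studies cherry-picking in counterfactual explanations, revealing the limitations of detecting it. | manipulation | |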

🔬 Pillar 3: Spatial Perception & Semantics (1 paper)

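| # | Title | One-line summary | Tags | 🔗 |
|---|-------|------------------|------|----|
| 20 | Intraday spatiotemporal PV power prediction at national scale using satellite-based solar forecast models | Proposes satellite-based solar forecast models for intraday spatiotemporal PV power prediction at national scale. | optical flow, spatiotemporal | |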

🔬 Pillar 5: Interaction & Reaction (1 paper)

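| # | Title | One-line summary | Tags | 🔗 |
|---|-------|------------------|------|----|
| 21 | Density Matrix RNN (DM-RNN): A Quantum Information Theoretic Framework for Modeling Musical Context and Polyphony | Proposes the Density Matrix RNN (DM-RNN), which uses quantum information theory to model musical context and polyphony. | CHOIS | |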
