cs.LG(2026-01-27)

📊 24 papers in total | 🔗 3 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (12 🔗2) · Pillar 2: RL & Architecture (11 🔗1) · Pillar 1: Robot Control (1)

🔬 Pillar 9: Embodied Foundation Models (12 papers)

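| # | Title | One-line summary | Tags |
|---|---|---|---|
| 1 | Out-of-Distribution Generalization via Invariant Trajectories for Multimodal Large Language Model Editing | Proposes ODEdit, which improves the generalization of knowledge editing in multimodal large language models through invariant trajectory learning. | large language model, multimodal |
| 2 | ProToken: Token-Level Attribution for Federated Large Language Models | ProToken: token-level provenance for federated large language models, addressing the problem of tracing contributors. | large language model |
| 3 | Explicit Multi-head Attention for Inter-head Interaction in Large Language Models | Proposes MEA, an explicit multi-head attention mechanism that strengthens interaction between heads in large language models. | large language model |
| 4 | Post-LayerNorm Is Back: Stable, ExpressivE, and Deep | Keel: a Post-LN Transformer built on Highway connections that enables stable training of deep LLMs. | large language model |
| 5 | Grasynda: Graph-based Synthetic Time Series Generation | Proposes Grasynda to address the shortage of data augmentation for time series. | foundation model |
| 6 | Selective Steering: Norm-Preserving Control Through Discriminative Layer Selection | Proposes Selective Steering, which achieves norm-preserving adversarial control in LLMs through discriminative layer selection. | large language model |
| 7 | Metric $k$-clustering using only Weak Comparison Oracles | Proposes metric k-clustering algorithms based on weak comparison oracles, for settings without exact distance information. | large language model |
| 8 | Whitespaces Don't Lie: Feature-Driven and Embedding-Based Approaches for Detecting Machine-Generated Code | Uses code features and embeddings to detect machine-generated code, improving code provenance. | large language model |
| 9 | LLM-Assisted Logic Rule Learning: Scaling Human Expertise for Time Series Anomaly Detection | Proposes an LLM-assisted logic rule learning framework that addresses the difficulty of scaling expert knowledge in supply-chain time series anomaly detection. | large language model |
| 10 | Foresight Learning for SEC Risk Prediction | Proposes Foresight Learning for predicting risk probabilities from SEC filings without manual annotation. | large language model |
| 11 | Native LLM and MLLM Inference at Scale on Apple Silicon | vllm-mlx: efficient native LLM and MLLM inference on Apple Silicon. | multimodal |
| 12 | OWLEYE: Zero-Shot Learner for Cross-Domain Graph Data Anomaly Detection | OWLEYE: a zero-shot learning framework for cross-domain graph anomaly detection. | foundation model |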

🔬 Pillar 2: RL & Architecture (11 papers)

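| # | Title | One-line summary | Tags |
|---|---|---|---|
| 13 | CHEHAB RL: Learning to Optimize Fully Homomorphic Encryption Computations | Proposes CHEHAB RL, which uses deep reinforcement learning to optimize fully homomorphic encryption computations. | reinforcement learning, deep reinforcement learning, OMOMO |
| 14 | Self-Distillation Enables Continual Learning | Proposes self-distillation fine-tuning (SDFT), enabling continual learning from demonstrations while mitigating catastrophic forgetting. | reinforcement learning, policy learning, distillation |
| 15 | From Observations to Events: Event-Aware World Model for Reinforcement Learning | Proposes the event-aware world model EAWM, improving MBRL generalization across structurally similar scenes. | reinforcement learning, policy learning, world model |
| 16 | The Geometric Mechanics of Contrastive Representation Learning: Alignment Potentials, Entropic Dispersion, and Cross-Modal Divergence | Analyzes contrastive representation learning through geometric mechanics, revealing the intrinsic connections among alignment potentials, entropic dispersion, and cross-modal divergence. | representation learning, contrastive learning, multimodal |
| 17 | On the Expressiveness of State Space Models via Temporal Logics | Analyzes the expressiveness of state space models (SSMs) through temporal logics. | SSM, state space model, large language model |
| 18 | Group Distributionally Robust Optimization-Driven Reinforcement Learning for LLM Reasoning | Proposes a Group DRO-based reinforcement learning framework that improves LLM performance on complex reasoning tasks. | reinforcement learning, large language model |
| 19 | Improving Policy Exploitation in Online Reinforcement Learning with Instant Retrospect Action | IRA: improves policy exploitation in online reinforcement learning through Instant Retrospect Action. | reinforcement learning, representation learning |
| 20 | Privacy-Preserving Model Transcription with Differentially Private Synthetic Distillation | Proposes differentially private synthetic distillation for data-free, privacy-preserving model transcription. | distillation |
| 21 | Tracking Drift: Variation-Aware Entropy Scheduling for Non-Stationary Reinforcement Learning | Proposes an adaptive entropy scheduling method to handle environment drift in non-stationary reinforcement learning. | reinforcement learning |
| 22 | R^3: Replay, Reflection, and Ranking Rewards for LLM Reinforcement Learning | R^3: improves LLM reinforcement learning on complex reasoning tasks through replay, reflection, and ranking rewards. | reinforcement learning |
| 23 | Double Fairness Policy Learning: Integrating Action Fairness and Outcome Fairness in Decision-making | Proposes a double fairness policy learning framework that addresses both action fairness and outcome fairness in decision-making. | policy learning |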

🔬 Pillar 1: Robot Control (1 paper)

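| # | Title | One-line summary | Tags |
|---|---|---|---|
| 24 | Component-Aware Pruning Framework for Neural Network Controllers via Gradient-Based Importance Estimation | Proposes a component-aware pruning framework to address the complexity of neural network controllers. | MPC |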
