| # | Title | Summary | Keywords | Status |
| --- | --- | --- | --- | --- |
| 13 | CHEHAB RL: Learning to Optimize Fully Homomorphic Encryption Computations | Proposes CHEHAB RL, which uses deep reinforcement learning to optimize fully homomorphic encryption computations. | reinforcement learning, deep reinforcement learning, OMOMO | |
| 14 | Self-Distillation Enables Continual Learning | Proposes Self-Distillation Fine-Tuning (SDFT), enabling continual learning from demonstrations while mitigating catastrophic forgetting. | reinforcement learning, policy learning, distillation | |
| 15 | From Observations to Events: Event-Aware World Model for Reinforcement Learning | Proposes the Event-Aware World Model (EAWM), improving the generalization of MBRL across structurally similar scenarios. | reinforcement learning, policy learning, world model | ✅ |
| 16 | The Geometric Mechanics of Contrastive Representation Learning: Alignment Potentials, Entropic Dispersion, and Cross-Modal Divergence | Analyzes contrastive representation learning through geometric mechanics, revealing the intrinsic connections among alignment potentials, entropic dispersion, and cross-modal divergence. | representation learning, contrastive learning, multimodal | |
| 17 | On the Expressiveness of State Space Models via Temporal Logics | Analyzes the expressiveness of state space models (SSMs) through temporal logics. | SSM, state space model, large language model | |
| 18 | Group Distributionally Robust Optimization-Driven Reinforcement Learning for LLM Reasoning | Proposes a Group DRO-based reinforcement learning framework that improves LLM performance on complex reasoning tasks. | reinforcement learning, large language model | |
| 19 | Improving Policy Exploitation in Online Reinforcement Learning with Instant Retrospect Action | IRA: improves policy exploitation in online reinforcement learning via instant retrospect actions. | reinforcement learning, representation learning | |
| 20 | Privacy-Preserving Model Transcription with Differentially Private Synthetic Distillation | Proposes differentially private synthetic distillation, enabling data-free privacy-preserving model transcription. | distillation | |
| 21 | Tracking Drift: Variation-Aware Entropy Scheduling for Non-Stationary Reinforcement Learning | Proposes an adaptive entropy scheduling method to address environment drift in non-stationary reinforcement learning. | reinforcement learning | |
| 22 | R^3: Replay, Reflection, and Ranking Rewards for LLM Reinforcement Learning | R^3: improves LLM reinforcement learning performance on complex reasoning tasks via replay, reflection, and ranking rewards. | reinforcement learning | |
| 23 | Double Fairness Policy Learning: Integrating Action Fairness and Outcome Fairness in Decision-making | Proposes a double fairness policy learning framework that integrates action fairness and outcome fairness in decision-making. | policy learning | |