cs.LG (2026-01-15)

📊 27 papers in total | 🔗 1 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (14 🔗1) · Pillar 2: RL & Architecture (9) · Pillar 1: Robot Control (2) · Pillar 3: Perception & Semantics (1) · Pillar 8: Physics-based Animation (1)

🔬 Pillar 9: Embodied Foundation Models (14 papers)

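#	Title	One-Sentence Summary	Tags	🔗	⭐
1	Multilingual-To-Multimodal (M2M): Unlocking New Languages with Monolingual Text	Proposes METAL, which uses monolingual text to unlock zero-shot multilingual-to-multimodal transfer.	multimodal zero-shot transfer	✅
2	ProbFM: Probabilistic Time Series Foundation Model with Uncertainty Decomposition	ProbFM: a probabilistic time series foundation model based on uncertainty decomposition, for financial forecasting.	foundation model
3	LeMoF: Level-guided Multimodal Fusion for Heterogeneous Clinical Data	Proposes LeMoF, which improves prediction accuracy on heterogeneous clinical data through level-guided multimodal fusion.	multimodal
4	PID-Guided Partial Alignment for Multimodal Decentralized Federated Learning	PARSE: a PID-guided partial-alignment framework for multimodal decentralized federated learning.	multimodal
5	Unlabeled Data Can Provably Enhance In-Context Learning of Transformers	Proposes an augmented in-context learning framework that uses unlabeled data to improve Transformer performance.	large language model chain-of-thought
6	Single-Stage Huffman Encoder for ML Compression	Proposes a single-stage Huffman encoder to address the bandwidth bottleneck in LLM compression.	large language model
7	PACEvolve: Enabling Long-Horizon Progress-Aware Consistent Evolution	PACEvolve: a long-horizon, progress-aware, and consistent evolutionary search framework.	large language model
8	LangLasso: Interactive Cluster Descriptions through LLM Explanation	Proposes LangLasso to address the accessibility of cluster explanations.	large language model
9	Queueing-Aware Optimization of Reasoning Tokens for Accuracy-Latency Trade-offs in LLM Servers	Proposes a queueing-aware method for optimizing reasoning tokens in LLM servers to trade off accuracy against latency.	large language model
10	In-Context Source and Channel Coding	Proposes an in-context decoding framework that makes LLM-driven arithmetic coding more robust for text transmission at low signal-to-noise ratios.	large language model
11	LOOKAT: Lookup-Optimized Key-Attention for Memory-Efficient Transformers	Proposes LOOKAT, a lookup-optimized key-attention mechanism for memory-efficient compression of Transformers.	large language model
12	Understanding and Preserving Safety in Fine-Tuned LLMs	Proposes SPF, a safety-preserving fine-tuning method that addresses the conflict between safety and utility in LLM fine-tuning.	large language model
13	FaTRQ: Tiered Residual Quantization for LLM Vector Search in Far-Memory-Aware ANNS Systems	Proposes FaTRQ to address storage and latency problems in ANNS systems.	multimodal
14	An Exploratory Study to Repurpose LLMs to a Unified Architecture for Time Series Classification	An exploratory study on repurposing LLMs as a unified architecture for time series classification.	large language model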

🔬 Pillar 2: RL & Architecture (9 papers)

#	Title	One-Sentence Summary	Tags	🔗	⭐
15	Projected Microbatch Accumulation yields reference-free proximal policy updates for reinforcement learning	Proposes PROMA, a proximal policy update method that needs no reference policy, for fine-tuning large language models.	reinforcement learning policy learning PPO
16	CS-GBA: A Critical Sample-based Gradient-guided Backdoor Attack for Offline Reinforcement Learning	Proposes CS-GBA to address stealthy backdoor attacks on safety-constrained algorithms in offline reinforcement learning.	reinforcement learning offline reinforcement learning CQL
17	Sparse-RL: Breaking the Memory Wall in LLM Reinforcement Learning via Stable Sparse Rollouts	Sparse-RL: breaking the memory wall in LLM reinforcement learning through stable sparse rollouts.	reinforcement learning large language model
18	Reinforcement Learning to Discover a NorthEast Monsoon Index for Monthly Rainfall Prediction in Thailand	Uses reinforcement learning to discover a northeast monsoon index that improves monthly rainfall prediction accuracy in Thailand.	reinforcement learning spatiotemporal
19	Combinatorial Optimization Augmented Machine Learning	Surveys combinatorial optimization augmented machine learning (COAML), bridging predictive models and combinatorial decision-making.	reinforcement learning imitation learning predictive model
20	DeFlow: Decoupling Manifold Modeling and Value Maximization for Offline Policy Extraction	DeFlow: decouples manifold modeling from value maximization for offline policy extraction.	offline RL flow matching distillation
21	SuS: Strategy-aware Surprise for Intrinsic Exploration	Proposes strategy-aware surprise to address the exploration problem in reinforcement learning.	reinforcement learning large language model
22	PRL: Process Reward Learning Improves LLMs' Reasoning Ability and Broadens the Reasoning Boundary	PRL: process reward learning improves LLMs' reasoning ability and broadens the reasoning boundary.	reinforcement learning large language model
23	CAFEDistill: Learning Personalized and Dynamic Models through Federated Early-Exit Network Distillation	Proposes CAFEDistill, which learns personalized, dynamic models through federated early-exit network distillation.	distillation

🔬 Pillar 1: Robot Control (2 papers)

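#	Title	One-Sentence Summary	Tags	🔗	⭐
24	Reinforcement Learning with Multi-Step Lookahead Information Via Adaptive Batching	Proposes an adaptive batching strategy for reinforcement learning with multi-step lookahead information.	model predictive control reinforcement learning
25	Sim2Real Deep Transfer for Per-Device CFO Calibration	Proposes a Sim2Real transfer learning framework for CFO calibration of heterogeneous SDR devices.	sim2real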

🔬 Pillar 3: Perception & Semantics (1 paper)

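#	Title	One-Sentence Summary	Tags	🔗	⭐
26	DInf-Grid: A Neural Differential Equation Solver with Differentiable Feature Grids	Proposes DInf-Grid, a neural differential equation solver based on differentiable feature grids that significantly speeds up solving.	implicit representation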

🔬 Pillar 8: Physics-based Animation (1 paper)

#	Title	One-Sentence Summary	Tags	🔗	⭐
27	SPIKE: Sparse Koopman Regularization for Physics-Informed Neural Networks	SPIKE: physics-informed neural networks with sparse Koopman regularization for improved generalization.	spatiotemporal
