cs.LG(2026-01-15)

📊 27 papers in total | 🔗 1 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (14 🔗1) · Pillar 2: RL Algorithms & Architecture (9) · Pillar 1: Robot Control (2) · Pillar 3: Spatial Perception & Semantics (1) · Pillar 8: Physics-based Animation (1)

🔬 Pillar 9: Embodied Foundation Models (14 papers)

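| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 1 | Multilingual-To-Multimodal (M2M): Unlocking New Languages with Monolingual Text | Proposes METAL, which uses monolingual text to unlock zero-shot multilingual-to-multimodal transfer. | multimodal, zero-shot transfer | |
| 2 | ProbFM: Probabilistic Time Series Foundation Model with Uncertainty Decomposition | ProbFM: a probabilistic time series foundation model based on uncertainty decomposition, for financial forecasting. | foundation model | |
| 3 | LeMoF: Level-guided Multimodal Fusion for Heterogeneous Clinical Data | Proposes LeMoF, which improves prediction accuracy on heterogeneous clinical data through level-guided multimodal fusion. | multimodal | |
| 4 | PID-Guided Partial Alignment for Multimodal Decentralized Federated Learning | PARSE: a PID-guided partial-alignment framework for multimodal decentralized federated learning. | multimodal | |
| 5 | Unlabeled Data Can Provably Enhance In-Context Learning of Transformers | Proposes an augmented in-context learning framework that uses unlabeled data to improve Transformer performance. | large language model, chain-of-thought | |
| 6 | Single-Stage Huffman Encoder for ML Compression | Proposes a single-stage Huffman encoder to address the bandwidth bottleneck in LLM compression. | large language model | |
| 7 | PACEvolve: Enabling Long-Horizon Progress-Aware Consistent Evolution | PACEvolve: a long-horizon, progress-aware, consistent evolutionary search framework. | large language model | |
| 8 | LangLasso: Interactive Cluster Descriptions through LLM Explanation | Proposes LangLasso to address the accessibility of cluster explanations. | large language model | |
| 9 | Queueing-Aware Optimization of Reasoning Tokens for Accuracy-Latency Trade-offs in LLM Servers | Proposes queueing-aware optimization of reasoning tokens for LLM servers to trade off accuracy against latency. | large language model | |
| 10 | In-Context Source and Channel Coding | Proposes an in-context decoding framework that makes LLM-driven arithmetic coding more robust for text transmission at low signal-to-noise ratios. | large language model | |
| 11 | LOOKAT: Lookup-Optimized Key-Attention for Memory-Efficient Transformers | Proposes LOOKAT, which achieves memory-efficient compression for Transformers through a lookup-optimized key-attention mechanism. | large language model | |
| 12 | Understanding and Preserving Safety in Fine-Tuned LLMs | Proposes SPF, a safety-preserving fine-tuning method that addresses the conflict between safety and utility in LLM fine-tuning. | large language model | |
| 13 | FaTRQ: Tiered Residual Quantization for LLM Vector Search in Far-Memory-Aware ANNS Systems | Proposes FaTRQ to address storage and latency problems in ANNS systems. | multimodal | |
| 14 | An Exploratory Study to Repurpose LLMs to a Unified Architecture for Time Series Classification | An exploratory study on repurposing LLMs as a unified architecture for time series classification. | large language model | |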

🔬 Pillar 2: RL Algorithms & Architecture (9 papers)

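| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 15 | Projected Microbatch Accumulation yields reference-free proximal policy updates for reinforcement learning | Proposes PROMA, a proximal policy update method that needs no reference policy, for fine-tuning large language models. | reinforcement learning, policy learning, PPO | |
| 16 | CS-GBA: A Critical Sample-based Gradient-guided Backdoor Attack for Offline Reinforcement Learning | Proposes CS-GBA, addressing stealthy backdoor attacks on safety-constrained algorithms in offline reinforcement learning. | reinforcement learning, offline reinforcement learning, CQL | |
| 17 | Sparse-RL: Breaking the Memory Wall in LLM Reinforcement Learning via Stable Sparse Rollouts | Sparse-RL: breaks the memory wall in LLM reinforcement learning through stable sparse rollouts. | reinforcement learning, large language model | |
| 18 | Reinforcement Learning to Discover a NorthEast Monsoon Index for Monthly Rainfall Prediction in Thailand | Uses reinforcement learning to discover a northeast monsoon index that improves the accuracy of monthly rainfall prediction in Thailand. | reinforcement learning, spatiotemporal | |
| 19 | Combinatorial Optimization Augmented Machine Learning | Surveys combinatorial optimization augmented machine learning (COAML), which bridges predictive models and combinatorial decision-making. | reinforcement learning, imitation learning, predictive model | |
| 20 | DeFlow: Decoupling Manifold Modeling and Value Maximization for Offline Policy Extraction | DeFlow: decouples manifold modeling from value maximization for offline policy extraction. | offline RL, flow matching, distillation | |
| 21 | SuS: Strategy-aware Surprise for Intrinsic Exploration | Proposes strategy-aware surprise to address the exploration problem in reinforcement learning. | reinforcement learning, large language model | |
| 22 | PRL: Process Reward Learning Improves LLMs' Reasoning Ability and Broadens the Reasoning Boundary | PRL: process reward learning improves the reasoning ability of LLMs and broadens their reasoning boundary. | reinforcement learning, large language model | |
| 23 | CAFEDistill: Learning Personalized and Dynamic Models through Federated Early-Exit Network Distillation | Proposes CAFEDistill, which learns personalized, dynamic models through federated early-exit network distillation. | distillation | |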

🔬 Pillar 1: Robot Control (2 papers)

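| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 24 | Reinforcement Learning with Multi-Step Lookahead Information Via Adaptive Batching | Proposes an adaptive batching policy for reinforcement learning with multi-step lookahead information. | model predictive control, reinforcement learning | |
| 25 | Sim2Real Deep Transfer for Per-Device CFO Calibration | Proposes a Sim2Real transfer learning framework for CFO calibration of heterogeneous SDR devices. | sim2real | |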

🔬 Pillar 3: Spatial Perception & Semantics (1 paper)

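| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 26 | DInf-Grid: A Neural Differential Equation Solver with Differentiable Feature Grids | Proposes DInf-Grid, a neural differential equation solver based on differentiable feature grids that significantly speeds up solving. | implicit representation | |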

🔬 Pillar 8: Physics-based Animation (1 paper)

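| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 27 | SPIKE: Sparse Koopman Regularization for Physics-Informed Neural Networks | SPIKE: physics-informed neural networks with sparse Koopman regularization, improving generalization. | spatiotemporal | |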
