cs.LG (2026-01-14)

📊 17 papers in total | 🔗 1 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (7 🔗1) · Pillar 2: RL & Architecture (6) · Pillar 1: Robot Control (2) · Pillar 7: Motion Retargeting (1) · Pillar 8: Physics-based Animation (1)

🔬 Pillar 9: Embodied Foundation Models (7 papers)

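| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 1 | Exploring Fine-Tuning for Tabular Foundation Models | Studies how fine-tuning strategies affect performance, calibration, and fairness in tabular foundation models. | foundation model | |
| 2 | From Prompt to Protocol: Fast Charging Batteries with Large Language Models | Uses large language models to rapidly optimize battery charging protocols and improve battery state of health. | large language model | |
| 3 | $D^2Prune$: Sparsifying Large Language Models via Dual Taylor Expansion and Attention Distribution Awareness | Proposes $D^2Prune$, an LLM sparsification method based on dual Taylor expansion and attention distribution awareness. | large language model | |
| 4 | BalDRO: A Distributionally Robust Optimization based Framework for Large Language Model Unlearning | Proposes the BalDRO framework, which achieves balanced LLM unlearning through distributionally robust optimization. | large language model | |
| 5 | Disentangling Task Conflicts in Multi-Task LoRA via Orthogonal Gradient Projection | Proposes Ortho-LoRA, which mitigates task conflicts in multi-task LoRA via orthogonal gradient projection. | large language model | |
| 6 | SimMerge: Learning to Select Merge Operators from Similarity Signals | Proposes SimMerge to optimize the merging of large language models. | large language model | |
| 7 | Hidden States as Early Signals: Step-level Trace Evaluation and Pruning for Efficient Test-Time Scaling | Proposes the STEP framework, which uses hidden states to evaluate and prune reasoning traces, speeding up LLM test-time inference and improving accuracy. | large language model | |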

🔬 Pillar 2: RL & Architecture (6 papers)

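| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 8 | Enhancing Spatial Reasoning in Large Language Models for Metal-Organic Frameworks Structure Prediction | MOF-LLM: enhances the spatial reasoning of large language models for metal-organic framework structure prediction. | reinforcement learning, large language model | |
| 9 | Late Breaking Results: Quamba-SE: Soft-edge Quantizer for Activations in State Space Models | Quamba-SE: a soft-edge quantizer for activation quantization in state space models. | Mamba, SSM, state space model | |
| 10 | Distribution-Aligned Sequence Distillation for Superior Long-CoT Reasoning | Proposes Distribution-Aligned Sequence Distillation (DASD) to improve lightweight models on long chain-of-thought reasoning tasks. | teacher-student distillation | |
| 11 | SRT: Accelerating Reinforcement Learning via Speculative Rollout with Tree-Structured Cache | SRT: accelerates reinforcement learning via speculative rollouts with a tree-structured cache, improving language model training efficiency. | reinforcement learning, PPO | |
| 12 | Draw it like Euclid: Teaching transformer models to generate CAD profiles using ruler and compass construction steps | Proposes a CAD profile generation method based on geometric construction steps, improving Transformer model performance. | reinforcement learning, chain-of-thought | |
| 13 | Multi-Teacher Ensemble Distillation: A Mathematical Framework for Probability-Domain Knowledge Aggregation | Proposes a mathematical framework for multi-teacher ensemble distillation, aimed at probability-domain knowledge aggregation. | distillation | |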

🔬 Pillar 1: Robot Control (2 papers)

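| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 14 | Reward Learning through Ranking Mean Squared Error | Proposes the R4 method, which learns reward functions via a ranking mean squared error and improves the efficiency of robot reinforcement learning. | locomotion, reinforcement learning, reward design | |
| 15 | Explainable Autoencoder-Based Anomaly Detection in IEC 61850 GOOSE Networks | Proposes an explainable autoencoder-based anomaly detection framework for IEC 61850 GOOSE networks. | manipulation | |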

🔬 Pillar 7: Motion Retargeting (1 paper)

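| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 16 | Terminally constrained flow-based generative models from an optimal control perspective | Proposes TOCFlow, which uses optimal control to solve the terminally constrained sampling problem for flow models. | geometric consistency | |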

🔬 Pillar 8: Physics-based Animation (1 paper)

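| # | Title | One-line summary | Tags | 🔗 |
|---|---|---|---|---|
| 17 | Discrete Solution Operator Learning for Geometry-Dependent PDEs | Proposes Discrete Solution Operator Learning (DiSOL) for solving geometry-dependent partial differential equations. | spatiotemporal | |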
