cs.LG(2025-12-12)

📊 24 papers in total | 🔗 2 with code

🎯 Interest Area Navigation

Pillar 2: RL & Architecture (11) · Pillar 9: Embodied Foundation Models (8 🔗2) · Pillar 1: Robot Control (3) · Pillar 4: Generative Motion (1) · Pillar 8: Physics-based Animation (1)

🔬 Pillar 2: RL & Architecture (11 papers)

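| # | Title | One-Sentence Summary | Tags | 🔗 |
|---|---|---|---|---|
| 1 | Brain-Semantoks: Learning Semantic Tokens of Brain Dynamics with a Self-Distilled Foundation Model | Brain-Semantoks: learns semantic tokens of brain dynamics with a self-distilled foundation model. | distillation, foundation model | |
| 2 | Learning to Extract Context for Context-Aware LLM Inference | Proposes a reinforcement-learning-based context extraction framework that improves LLM reliability on safety tasks. | reinforcement learning, large language model, foundation model | |
| 3 | Fully Inductive Node Representation Learning via Graph View Transformation | Proposes Graph View Transformation (GVT) for fully inductive node representation learning across datasets. | representation learning, foundation model | |
| 4 | Mitigating the Safety Alignment Tax with Null-Space Constrained Policy Optimization | Proposes Null-Space constrained Policy Optimization (NSPO) to mitigate capability forgetting in LLM safety alignment. | reinforcement learning, large language model, instruction following | |
| 5 | Multi-Objective Reinforcement Learning for Large-Scale Mixed Traffic Control | Proposes a multi-objective reinforcement learning framework for large-scale mixed traffic control that improves fairness and safety. | reinforcement learning, penetration | |
| 6 | Symmetry-Aware Steering of Equivariant Diffusion Policies: Benefits and Limits | Proposes a symmetry-aware policy steering framework that improves the sample efficiency and stability of equivariant diffusion policies on symmetric tasks. | reinforcement learning, diffusion policy | |
| 7 | Data Valuation for LLM Fine-Tuning: Efficient Shapley Value Approximation via Language Model Arithmetic | Proposes an efficient Shapley value approximation based on language model arithmetic for data valuation in LLM fine-tuning. | DPO, direct preference optimization, large language model | |
| 8 | DAPO: Design Structure-Aware Pass Ordering in High-Level Synthesis with Graph Contrastive and Reinforcement Learning | DAPO: design-structure-aware pass ordering in high-level synthesis, based on graph contrastive learning and reinforcement learning. | reinforcement learning, contrastive learning | |
| 9 | GraphPerf-RT: A Graph-Driven Performance Model for Hardware-Aware Scheduling of OpenMP Codes | Proposes GraphPerf-RT, a graph-driven performance model for hardware-aware scheduling of OpenMP codes. | reinforcement learning, world model, model-based RL | |
| 10 | Softmax as Linear Attention in the Large-Prompt Regime: a Measure-based Perspective | Proposes a unified measure-based framework for analyzing softmax attention in the large-prompt regime. | linear attention | |
| 11 | ReactorFold: Generative discovery of nuclear reactor cores via emergent physical reasoning | ReactorFold: generates nuclear reactor core designs via emergent physical reasoning. | DPO, direct preference optimization | |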

🔬 Pillar 9: Embodied Foundation Models (8 papers)

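| # | Title | One-Sentence Summary | Tags | 🔗 |
|---|---|---|---|---|
| 12 | Benchmarking the Generality of Vision-Language-Action Models | MultiNet v1.0: a unified benchmark for evaluating the cross-domain generalization of vision-language-action models. | generalist agent, vision-language-action, VLA | |
| 13 | Atomic Action Slicing: Planner-Aligned Options for Generalist VLA Agents | Proposes Atomic Action Slicing (AAS) to improve the generalization of generalist VLA agents on complex tasks. | vision-language-action, VLA | |
| 14 | CHIME: Chiplet-based Heterogeneous Near-Memory Acceleration for Edge Multimodal LLM Inference | CHIME: chiplet-based heterogeneous near-memory acceleration for edge multimodal LLM inference. | large language model, multimodal | |
| 15 | The Instability of Safety: How Random Seeds and Temperature Expose Inconsistent LLM Refusal Behavior | Reveals the instability of LLM safety evaluation under varying random seeds and temperature. | large language model | |
| 16 | EnviroLLM: Resource Tracking and Optimization for Local AI | EnviroLLM: an open-source toolkit for resource tracking and optimization of local AI. | large language model | |
| 17 | AdaGradSelect: An adaptive gradient-guided layer selection method for efficient fine-tuning of SLMs | AdaGradSelect: an adaptive gradient-guided layer selection method for efficient fine-tuning of small language models. | large language model | |
| 18 | Spectral entropy prior-guided deep feature fusion architecture for magnetic core loss | Proposes the SEPI-TFPNet hybrid model, improving the accuracy and generalization of magnetic core loss prediction. | multimodal | |
| 19 | Insight Miner: A Time Series Analysis Dataset for Cross-Domain Alignment with Natural Language | Proposes Insight Miner, a time series analysis framework based on a large multimodal model, and builds TS-Insights, the first general-domain dataset aligning time series with language. | multimodal | |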

🔬 Pillar 1: Robot Control (3 papers)

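| # | Title | One-Sentence Summary | Tags | 🔗 |
|---|---|---|---|---|
| 20 | Goal Reaching with Eikonal-Constrained Hierarchical Quasimetric Reinforcement Learning | Proposes Eik-HiQRL, combining the Eikonal equation with hierarchical reinforcement learning to solve goal-directed tasks under complex dynamics. | manipulation, reinforcement learning, reward design | |
| 21 | High-Dimensional Surrogate Modeling for Closed-Loop Learning of Neural-Network-Parameterized Model Predictive Control | Proposes Bayesian neural networks as surrogate models for optimizing high-dimensional controller parameters. | model predictive control | |
| 22 | Neural Chameleons: Language Models Can Learn to Hide Their Thoughts from Unseen Activation Monitors | Proposes Neural Chameleons: language models can learn to hide their true intent from unseen activation monitors. | manipulation | |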

🔬 Pillar 4: Generative Motion (1 paper)

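| # | Title | One-Sentence Summary | Tags | 🔗 |
|---|---|---|---|---|
| 23 | NeuralOGCM: Differentiable Ocean Modeling with Learnable Physics | NeuralOGCM: an ocean model that incorporates learnable physics for efficient, high-accuracy simulation. | physically plausible | |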

🔬 Pillar 8: Physics-based Animation (1 paper)

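| # | Title | One-Sentence Summary | Tags | 🔗 |
|---|---|---|---|---|
| 24 | Stable spectral neural operator for learning stiff PDE systems from limited data | Proposes the Stable Spectral Neural Operator (SSNO) for learning stiff PDE systems from limited data. | spatiotemporal | |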
