| 1 |
Brain-Semantoks: Learning Semantic Tokens of Brain Dynamics with a Self-Distilled Foundation Model |
Brain-Semantoks learns semantic tokens of brain dynamics with a self-distilled foundation model.
distillation, foundation model
|
|
| 2 |
Learning to Extract Context for Context-Aware LLM Inference |
Proposes a reinforcement-learning-based context extraction framework that improves LLM reliability on safety tasks.
reinforcement learning, large language model, foundation model
|
|
| 3 |
Fully Inductive Node Representation Learning via Graph View Transformation |
Proposes Graph View Transformation (GVT) to enable fully inductive node representation learning across datasets.
representation learning, foundation model
|
|
| 4 |
Mitigating the Safety Alignment Tax with Null-Space Constrained Policy Optimization |
Proposes Null-Space Constrained Policy Optimization (NSPO) to mitigate capability forgetting in LLM safety alignment.
reinforcement learning, large language model, instruction following
|
|
| 5 |
Multi-Objective Reinforcement Learning for Large-Scale Mixed Traffic Control |
Proposes a multi-objective reinforcement learning framework for large-scale mixed traffic control that improves fairness and safety.
reinforcement learning, penetration
|
|
| 6 |
Symmetry-Aware Steering of Equivariant Diffusion Policies: Benefits and Limits |
Proposes a symmetry-aware policy steering framework that improves the sample efficiency and stability of equivariant diffusion policies on symmetric tasks.
reinforcement learning, diffusion policy
|
|
| 7 |
Data Valuation for LLM Fine-Tuning: Efficient Shapley Value Approximation via Language Model Arithmetic |
Proposes an efficient Shapley value approximation based on language model arithmetic for data valuation in LLM fine-tuning.
DPO, direct preference optimization, large language model
|
|
| 8 |
DAPO: Design Structure-Aware Pass Ordering in High-Level Synthesis with Graph Contrastive and Reinforcement Learning |
DAPO: design structure-aware pass ordering in high-level synthesis via graph contrastive learning and reinforcement learning.
reinforcement learning, contrastive learning
|
|
| 9 |
GraphPerf-RT: A Graph-Driven Performance Model for Hardware-Aware Scheduling of OpenMP Codes |
Proposes GraphPerf-RT, a graph-driven performance model for hardware-aware scheduling of OpenMP codes.
reinforcement learning, world model, model-based RL
|
|
| 10 |
Softmax as Linear Attention in the Large-Prompt Regime: a Measure-based Perspective |
Proposes a unified measure-based framework for analyzing softmax attention in the large-prompt regime.
linear attention |
|
|
| 11 |
ReactorFold: Generative discovery of nuclear reactor cores via emergent physical reasoning |
ReactorFold: generative design of nuclear reactor cores via emergent physical reasoning.
DPO, direct preference optimization
|
|