cs.LG(2025-10-06)

📊 29 papers in total | 🔗 4 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (16 🔗1) · Pillar 2: RL & Architecture (11 🔗3) · Pillar 8: Physics-based Animation (1) · Pillar 3: Perception & Semantics (1)

🔬 Pillar 9: Embodied Foundation Models (16 papers)

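# | Title | One-sentence summary | Tags | 🔗
1 | Revealing Interconnections between Diseases: from Statistical Methods to Large Language Models | Systematically evaluates multiple methods for revealing disease interconnections and finds that LLMs have limited potential for discovering new disease associations | large language model
2 | A Clinical-grade Universal Foundation Model for Intraoperative Pathology | CRISP: a clinical-grade universal foundation model for intraoperative pathology | foundation model
3 | LaDiR: Latent Diffusion Enhances LLMs for Text Reasoning | LaDiR: uses a latent diffusion model to enhance the text reasoning ability of LLMs | large language model, chain-of-thought
4 | Activation Quantization of Vision Encoders Needs Prefixing Registers | Proposes RegCache, a training-free approach that uses prefix registers to improve activation quantization of vision encoders | multimodal
5 | KVLinC : KV Cache Quantization with Hadamard Rotation and Linear Correction | KVLinC: achieves extremely low-bit KV cache quantization via Hadamard rotation and linear correction, improving LLM inference efficiency | large language model
6 | Physics-informed Attention-enhanced Fourier Neural Operator for Solar Magnetic Field Extrapolations | Proposes a physics-informed, attention-enhanced Fourier neural operator (PIANO) for solar magnetic field extrapolation | multimodal
7 | DP-Adam-AC: Privacy-preserving Fine-Tuning of Localizable Language Models Using Adam Optimization with Adaptive Clipping | Proposes the DP-Adam-AC algorithm for privacy-preserving fine-tuning of localizable language models | large language model
8 | Decoding Partial Differential Equations: Cross-Modal Adaptation of Decoder-only Models to PDEs | Proposes the Parallel Flipping and Sequence Doubling methods to improve the cross-modal adaptation of decoder-only models to solving partial differential equations | large language model
9 | Stratum: System-Hardware Co-Design with Tiered Monolithic 3D-Stackable DRAM for Efficient MoE Serving | Stratum: system-hardware co-design with monolithic 3D-stackable DRAM for efficient MoE serving | large language model
10 | CMT-Benchmark: A Benchmark for Condensed Matter Theory Built by Expert Researchers | Proposes CMT-Benchmark, an expert-built condensed matter theory benchmark for evaluating the physical reasoning ability of LLMs | large language model
11 | Test-Time Scaling in Diffusion LLMs via Hidden Semi-Autoregressive Experts | Proposes HEX, which ensembles hidden semi-autoregressive experts to improve the inference-time performance of diffusion LLMs | large language model
12 | Inoculation Prompting: Instructing LLMs to misbehave at train-time improves test-time alignment | Inoculation Prompting: eliciting undesirable behavior from LLMs at training time improves test-time alignment | large language model
13 | Less is More: Recursive Reasoning with Tiny Networks | Proposes TRM: recursive reasoning with a tiny network that outperforms large language models on hard problems | large language model
14 | Distribution Preference Optimization: A Fine-grained Perspective for LLM Unlearning | Proposes DiPO, an LLM unlearning method based on distribution preference optimization that improves forget quality and model utility | large language model
15 | ViTs: Teaching Machines to See Time Series Anomalies Like Human Experts | ViTs: proposes a time series anomaly detection framework based on vision-language models that addresses zero-shot generalization and variable-length sequence handling | large language model
16 | Agentic Context Engineering: Evolving Contexts for Self-Improving Language Models | Proposes the ACE framework, which evolves contexts to improve LLM performance on agent and domain-specific reasoning tasks | large language model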

🔬 Pillar 2: RL & Architecture (11 papers)

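# | Title | One-sentence summary | Tags | 🔗
17 | Adversarial Reinforcement Learning for Large Language Model Agent Safety | Proposes ARLAS, which uses adversarial reinforcement learning to improve the safety of LLM agents and defend against prompt injection attacks | reinforcement learning, large language model
18 | Margin Adaptive DPO: Leveraging Reward Model for Granular Control in Preference Optimization | Proposes Margin Adaptive DPO, which leverages a reward model for fine-grained control in preference optimization | DPO, direct preference optimization, large language model
19 | Adjusting the Output of Decision Transformer with Action Gradient | Proposes an action-gradient-based method for optimizing Decision Transformer, improving offline reinforcement learning performance | reinforcement learning, offline RL, decision transformer
20 | Boomerang Distillation Enables Zero-Shot Model Size Interpolation | Boomerang distillation enables zero-shot model size interpolation for efficiently building model families | distillation, large language model
21 | MCCE: A Framework for Multi-LLM Collaborative Co-Evolution | Proposes the MCCE framework, which uses multi-LLM collaborative co-evolution to solve multi-objective discrete optimization problems | reinforcement learning, distillation, large language model
22 | Partial Information Decomposition via Normalizing Flows in Latent Gaussian Distributions | Proposes an efficient normalizing-flow-based method for partial information decomposition in a Gaussian latent space | predictive model, multimodal
23 | Draft, Verify, and Improve: Toward Training-Aware Speculative Decoding | Proposes the DVI framework to address the latency bottleneck of autoregressive decoding | distillation, large language model
24 | Curiosity-Driven Development of Action and Language in Robots Through Self-Exploration | Proposes a curiosity-driven framework for robot action and language learning that achieves autonomous exploration and compositional generalization | reinforcement learning, large language model
25 | Reinforce-Ada: An Adaptive Sampling Framework under Non-linear RL Objectives | Proposes the Reinforce-Ada adaptive sampling framework to address signal loss in LLM reasoning under non-linear RL objectives | reinforcement learning, large language model
26 | Alignment Tipping Process: How Self-Evolution Pushes LLM Agents Off the Rails | Reveals the alignment tipping phenomenon in the self-evolution of LLM agents and its threat to long-term reliability | reinforcement learning, large language model
27 | Learning on the Job: Test-Time Curricula for Targeted Reinforcement Learning | Proposes test-time curriculum reinforcement learning (TTC-RL) to address continual learning of models on specific tasks | reinforcement learning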

🔬 Pillar 8: Physics-based Animation (1 paper)

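# | Title | One-sentence summary | Tags | 🔗
28 | Physics-Informed Neural Networks with Fourier Features and Attention-Driven Decoding | Proposes a spectral PINN based on Fourier features and attention-driven decoding, improving the accuracy of PDE solutions | spatiotemporal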

🔬 Pillar 3: Perception & Semantics (1 paper)

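# | Title | One-sentence summary | Tags | 🔗
29 | HybridFlow: Quantification of Aleatoric and Epistemic Uncertainty with a Single Hybrid Model | HybridFlow: proposes a hybrid model that quantifies uncertainty in a unified way and improves regression task performance | depth estimation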
