cs.LG (2025-07-07)

📊 29 papers in total | 🔗 2 with code

🎯 Navigation by interest area

Pillar 2: RL & Architecture (13 🔗1) · Pillar 9: Embodied Foundation Models (12 🔗1) · Pillar 8: Physics-based Animation (2) · Pillar 3: Perception & Semantics (1) · Pillar 1: Robot Control (1)

🔬 Pillar 2: RL & Architecture (13 papers)

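| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 1 | Causal Foundation Models: Disentangling Physics from Instrument Properties | Proposes causal foundation models that disentangle physical phenomena from instrument properties, improving time-series generalization. | representation learning, contrastive learning, foundation model | |
| 2 | Beyond Training-time Poisoning: Component-level and Post-training Backdoors in Deep Reinforcement Learning | Exposes supply-chain vulnerabilities in deep reinforcement learning and proposes component-level and post-training backdoor attacks. | reinforcement learning, deep reinforcement learning, DRL | |
| 3 | Representation learning with a transformer by contrastive learning for money laundering detection | Proposes a representation learning method based on a Transformer and contrastive learning for anti-money-laundering detection. | representation learning, contrastive learning | |
| 4 | 2048: Reinforcement Learning in a Delayed Reward Environment | Proposes Horizon-DQN to address reinforcement learning under delayed rewards in the game 2048, significantly improving performance. | reinforcement learning, PPO, curriculum learning | |
| 5 | Accelerated Online Reinforcement Learning using Auxiliary Start State Distributions | Proposes accelerating online reinforcement learning with auxiliary start state distributions, improving sample efficiency. | reinforcement learning, affordance | |
| 6 | Critiques of World Models | Proposes a general-purpose world model architecture based on hierarchical, multi-level, and hybrid representations, aimed at physical, agentic, and nested AGI systems. | world model | |
| 7 | Identify, Isolate, and Purge: Mitigating Hallucinations in LVLMs via Self-Evolving Distillation | Proposes the SEED framework, which mitigates hallucinations in large vision-language models through self-evolving distillation. | distillation | |
| 8 | Information-Guided Diffusion Sampling for Dataset Distillation | Proposes information-guided diffusion sampling for dataset distillation. | distillation | |
| 9 | wd1: Weighted Policy Optimization for Reasoning in Diffusion Language Models | Proposes wd1 to improve the reasoning ability of diffusion language models. | reinforcement learning, large language model | |
| 10 | Replacing thinking with tool usage enables reasoning in small language models | Replaces thinking with tool usage so that small language models can reason. | reinforcement learning, large language model | |
| 11 | When do World Models Successfully Learn Dynamical Systems? | Proposes a World Models-based framework for learning dynamical systems that effectively simulates physical systems. | world model | |
| 12 | Going Beyond Heuristics by Imposing Policy Improvement as a Constraint | Proposes the HEPO algorithm, which incorporates heuristic information effectively by imposing policy improvement as a constraint, easing manual reward function design. | reinforcement learning, reward design | |
| 13 | Interpretable Reward Modeling with Active Concept Bottlenecks | Proposes an interpretable reward modeling framework based on active concept bottlenecks, improving reward model transparency and sample efficiency. | preference learning, RLHF | |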

🔬 Pillar 9: Embodied Foundation Models (12 papers)

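| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 14 | Multimodal LLM Integrated Semantic Communications for 6G Immersive Experiences | Proposes the MLLM-SC framework, which uses multimodal large language models to improve semantic communication performance for 6G immersive experiences. | large language model, foundation model, multimodal | |
| 15 | Reinforcement Fine-Tuning Naturally Mitigates Forgetting in Continual Post-Training | Proposes a reinforcement-learning-based fine-tuning approach that mitigates catastrophic forgetting in continual post-training. | large language model, foundation model, multimodal | |
| 16 | Classification of autoimmune diseases from Peripheral blood TCR repertoires by multimodal multi-instance learning | EAMil: a multimodal multi-instance learning method for classifying autoimmune diseases from TCR sequences. | multimodal | |
| 17 | NTSFormer: A Self-Teaching Graph Transformer for Multimodal Isolated Cold-Start Node Classification | Proposes NTSFormer, which addresses multimodal isolated cold-start node classification with a self-teaching graph Transformer. | multimodal | |
| 18 | Performance Evaluation of General Purpose Large Language Models for Basic Linear Algebra Subprograms Code Generation | Evaluates the performance of general-purpose large language models at generating BLAS code. | large language model | |
| 19 | Accuracy and Consumption analysis from a compressed model by CompactifAI from Multiverse Computing | CompactifAI compresses the Llama 3.1 8B model, reducing energy consumption while preserving accuracy. | large language model | |
| 20 | Beyond Scaling Curves: Internal Dynamics of Neural Networks Through the NTK Lens | Analyzes the internal dynamics of neural networks through the NTK lens, revealing the limits of performance scaling laws. | large language model | |
| 21 | The Case for Instance-Optimized LLMs in OLAP Databases | IOLM-DB: proposes instance-optimized LLMs for OLAP databases to improve query efficiency. | large language model | |
| 22 | Fine-tuning on simulated data outperforms prompting for agent tone of voice | Fine-tuning on simulated data markedly improves the natural conversational style of an agent's tone of voice, outperforming prompting. | instruction following | |
| 23 | Spatial and Semantic Embedding Integration for Stereo Sound Event Localization and Detection in Regular Videos | Proposes a stereo sound event localization and detection method for regular videos based on fusing spatial and semantic embeddings. | multimodal | |
| 24 | ABench-Physics: Benchmarking Physical Reasoning in LLMs via High-Difficulty and Dynamic Physics Problems | ABench-Physics: evaluates the physical reasoning ability of LLMs with high-difficulty, dynamic physics problems. | large language model | |
| 25 | any4: Learned 4-bit Numeric Representation for LLMs | Proposes any4, a learned 4-bit numeric representation for LLMs that requires no preprocessing. | large language model | |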

🔬 Pillar 8: Physics-based Animation (2 papers)

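| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 26 | VaxPulse: Monitoring of Online Public Concerns to Enhance Post-licensure Vaccine Surveillance | VaxPulse: enhances post-licensure vaccine surveillance by monitoring online public concerns. | PULSE | |
| 27 | A generalized Wasserstein-2 distance approach for efficient reconstruction of random field models using stochastic neural networks | Proposes a generalized Wasserstein-2 distance approach for efficiently reconstructing random field models with stochastic neural networks. | spatiotemporal | |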

🔬 Pillar 3: Perception & Semantics (1 paper)

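| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 28 | Photon Splatting: A Physics-Guided Neural Surrogate for Real-Time Wireless Channel Prediction | Proposes Photon Splatting, a physics-guided neural surrogate model for real-time wireless channel prediction in complex environments. | splatting, PULSE | |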

🔬 Pillar 1: Robot Control (1 paper)

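| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 29 | Music Boomerang: Reusing Diffusion Models for Data Augmentation and Audio Manipulation | Uses diffusion models and Boomerang sampling for audio data augmentation and instrument replacement. | manipulation | |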
