| # | Title | Summary | Keywords | Note |
|---|-------|---------|----------|------|
| 1 | Causal Foundation Models: Disentangling Physics from Instrument Properties | Proposes a causal foundation model that disentangles physical phenomena from instrument properties, improving time-series generalization. | representation learning, contrastive learning, foundation model | |
| 2 | Beyond Training-time Poisoning: Component-level and Post-training Backdoors in Deep Reinforcement Learning | Reveals supply-chain vulnerabilities in deep reinforcement learning and proposes component-level and post-training backdoor attacks. | reinforcement learning, deep reinforcement learning, DRL | |
| 3 | Representation learning with a transformer by contrastive learning for money laundering detection | Proposes a Transformer-based representation learning method trained with contrastive learning for money laundering detection. | representation learning, contrastive learning | |
| 4 | 2048: Reinforcement Learning in a Delayed Reward Environment | Proposes Horizon-DQN to address reinforcement learning under delayed rewards in the game 2048, substantially improving performance. | reinforcement learning, PPO, curriculum learning | |
| 5 | Accelerated Online Reinforcement Learning using Auxiliary Start State Distributions | Proposes accelerating online reinforcement learning with auxiliary start state distributions, improving sample efficiency. | reinforcement learning, affordance | |
| 6 | Critiques of World Models | Proposes a general world-model architecture built on hierarchical, multi-level, and mixed representations, aimed at physical, agentic, and nested AGI systems. | world model | |
| 7 | Identify, Isolate, and Purge: Mitigating Hallucinations in LVLMs via Self-Evolving Distillation | Proposes the SEED framework, which mitigates hallucinations in large vision-language models via self-evolving distillation. | distillation | |
| 8 | Information-Guided Diffusion Sampling for Dataset Distillation | Proposes information-guided diffusion sampling for dataset distillation. | distillation | |
| 9 | wd1: Weighted Policy Optimization for Reasoning in Diffusion Language Models | Proposes wd1 to improve the reasoning ability of diffusion language models. | reinforcement learning, large language model | |
| 10 | Replacing thinking with tool usage enables reasoning in small language models | Replaces thinking with tool usage, enabling reasoning in small language models. | reinforcement learning, large language model | |
| 11 | When do World Models Successfully Learn Dynamical Systems? | Proposes a world-model-based framework for learning dynamical systems that effectively simulates physical systems. | world model | |
| 12 | Going Beyond Heuristics by Imposing Policy Improvement as a Constraint | Proposes the HEPO algorithm, which incorporates heuristic information by imposing policy improvement as a constraint, reducing the burden of hand-designed reward functions. | reinforcement learning, reward design | ✅ |
| 13 | Interpretable Reward Modeling with Active Concept Bottlenecks | Proposes an interpretable reward-modeling framework based on active concept bottlenecks, improving reward-model transparency and sample efficiency. | preference learning, RLHF | |