cs.LG (2025-10-28)

📊 30 papers in total | 🔗 3 with code

🎯 Interest Area Navigation

Pillar 2: RL Algorithms & Architecture (RL & Architecture) (14 🔗1) · Pillar 9: Embodied Foundation Models (13 🔗2) · Pillar 8: Physics-based Animation (2) · Pillar 4: Generative Motion (1)

🔬 Pillar 2: RL Algorithms & Architecture (RL & Architecture) (14 papers)

# | Title | One-line summary | Tags | 🔗
1 | LRT-Diffusion: Calibrated Risk-Aware Guidance for Diffusion Policies | Calibrated risk-aware guidance for diffusion policies in offline reinforcement learning | reinforcement learning, offline RL, offline reinforcement learning
2 | Greedy Sampling Is Provably Efficient for RLHF | Proposes a greedy sampling algorithm for RLHF with general preference models and proves its efficiency | reinforcement learning, RLHF, large language model
3 | HiMAE: Hierarchical Masked Autoencoders Discover Resolution-Specific Structure in Wearable Time Series | Hierarchical masked autoencoders that discover resolution-specific structure in wearable time series | representation learning, masked autoencoder, foundation model
4 | SpatialTraceGen: High-Fidelity Traces for Efficient VLM Spatial Reasoning Distillation | High-fidelity trace generation for efficient distillation of VLM spatial reasoning | reinforcement learning, offline reinforcement learning, distillation
5 | Dual-Mind World Models: A General Framework for Learning in Dynamic Wireless Networks | Proposes a dual-mind world model to address data inefficiency and poor generalization in dynamic wireless networks | reinforcement learning, world model, model-based RL
6 | Non-myopic Matching and Rebalancing in Large-Scale On-Demand Ride-Pooling Systems Using Simulation-Informed Reinforcement Learning | Proposes a non-myopic matching and rebalancing algorithm based on simulation-informed reinforcement learning to improve the efficiency of large-scale on-demand ride-pooling systems | reinforcement learning, spatiotemporal
7 | PaTaRM: Bridging Pairwise and Pointwise Signals via Preference-Aware Task-Adaptive Reward Modeling | Bridges pairwise and pointwise signals via preference-aware task-adaptive reward modeling to improve RLHF performance | reinforcement learning, RLHF, large language model
8 | Enhancing Hierarchical Reinforcement Learning through Change Point Detection in Time Series | Proposes a Transformer-based change-point detection module to improve the scalability of hierarchical reinforcement learning on long-horizon tasks | reinforcement learning
9 | Eigenfunction Extraction for Ordered Representation Learning | Proposes an eigenfunction extraction framework for ordered representation learning, improving the efficiency and accuracy of feature selection | representation learning
10 | Perception Learning: A Formal Separation of Sensory Representation Learning from Decision Learning | Proposes perception learning to formally separate sensory representation learning from decision learning | representation learning
11 | Causal-Aware Generative Adversarial Networks with Reinforcement Learning | Proposes CA-GAN, which uses causal graphs and reinforcement learning to generate high-quality, privacy-preserving tabular data | reinforcement learning
12 | Predicting Barge Tow Size on Inland Waterways Using Vessel Trajectory Derived Features: Proof of Concept | Proposes a machine learning method based on AIS data to predict barge tow size on inland waterways, improving waterway situational awareness | MAE, spatiotemporal
13 | Sample-efficient and Scalable Exploration in Continuous-Time RL | Proposes the COMBRL algorithm to address sample efficiency and scalability in continuous-time reinforcement learning | reinforcement learning, model-based RL
14 | A Novel XAI-Enhanced Quantum Adversarial Networks for Velocity Dispersion Modeling in MaNGA Galaxies | Proposes an XAI-enhanced quantum adversarial network for modeling velocity dispersion in MaNGA galaxies | predictive model, MAE

🔬 Pillar 9: Embodied Foundation Models (13 papers)

# | Title | One-line summary | Tags | 🔗
15 | What do vision-language models see in the context? Investigating multimodal in-context learning | A systematic study of in-context learning in vision-language models, revealing the limits of their multimodal fusion | large language model, multimodal, instruction following
16 | Can Aha Moments Be Fake? Identifying True and Decorative Thinking Steps in Chain-of-Thought | Proposes a True Thinking Score to distinguish genuine reasoning steps from decorative ones in chain-of-thought | large language model, chain-of-thought
17 | The Kinetics of Reasoning: How Chain-of-Thought Shapes Learning in Transformers? | Studies how CoT shapes Transformer learning, showing it accelerates generalization but is constrained by task complexity | chain-of-thought
18 | Pearl: A Foundation Model for Placing Every Atom in the Right Location | An atomically accurate foundation model for protein-ligand co-folding | foundation model
19 | A Pragmatic Way to Measure Chain-of-Thought Monitorability | Proposes a pragmatic method for measuring chain-of-thought (CoT) monitorability in support of AI safety | chain-of-thought
20 | ChessQA: Evaluating Large Language Models for Chess Understanding | Introduces ChessQA, a comprehensive benchmark for evaluating large language models' chess understanding | large language model
21 | Secure Retrieval-Augmented Generation against Poisoning Attacks | Proposes the RAGuard framework to harden retrieval-augmented generation against data-poisoning attacks | large language model
22 | Bayesian Neural Networks vs. Mixture Density Networks: Theoretical and Empirical Insights for Uncertainty-Aware Nonlinear Modeling | Compares Bayesian neural networks and mixture density networks for uncertainty-aware nonlinear modeling | multimodal
23 | Sequences of Logits Reveal the Low Rank Structure of Language Models | Reveals the low-rank structure of language models, exploiting logit sequences for efficient generation | large language model
24 | MISA: Memory-Efficient LLMs Optimization with Module-wise Importance Sampling | Proposes MISA to address memory efficiency in large language model optimization | large language model
25 | FLoRA: Fused forward-backward adapters for parameter efficient fine-tuning and reducing inference-time latencies of LLMs | Fused forward-backward adapters that improve LLM fine-tuning efficiency and reduce inference latency | large language model
26 | SALS: Sparse Attention in Latent Space for KV cache Compression | Proposes the SALS framework, compressing the KV cache via sparse attention in latent space to accelerate long-context LLM inference | large language model
27 | FALQON: Accelerating LoRA Fine-tuning with Low-Bit Floating-Point Arithmetic | Accelerates LoRA fine-tuning with low-bit floating-point arithmetic, improving training efficiency | large language model

🔬 Pillar 8: Physics-based Animation (2 papers)

# | Title | One-line summary | Tags | 🔗
28 | Causal Convolutional Neural Networks as Finite Impulse Response Filters | Treats causal convolutional neural networks as finite impulse response filters for dynamical-system modeling | PULSE, multimodal
29 | Learning from History: A Retrieval-Augmented Framework for Spatiotemporal Prediction | Proposes RAP, a retrieval-augmented prediction framework that addresses long-horizon error accumulation in spatiotemporal prediction | spatiotemporal

🔬 Pillar 4: Generative Motion (1 paper)

# | Title | One-line summary | Tags | 🔗
30 | Unlocking Out-of-Distribution Generalization in Dynamics through Physics-Guided Augmentation | Proposes SPARK, improving out-of-distribution generalization in dynamical-system modeling via physics-guided augmentation | physically plausible
