cs.LG (2025-10-28)

📊 30 papers in total | 🔗 3 with code

🎯 Interest Area Navigation

Pillar 2: RL Algorithms & Architecture (RL & Architecture) (14 🔗1) · Pillar 9: Embodied Foundation Models (13 🔗2) · Pillar 8: Physics-based Animation (2) · Pillar 4: Generative Motion (1)

🔬 Pillar 2: RL Algorithms & Architecture (RL & Architecture) (14 papers)

# | Title | One-line summary | Tags | 🔗
1 | LRT-Diffusion: Calibrated Risk-Aware Guidance for Diffusion Policies | Calibrated risk-aware guidance for diffusion policies in offline reinforcement learning | reinforcement learning, offline RL, offline reinforcement learning
2 | Greedy Sampling Is Provably Efficient for RLHF | Proposes a greedy sampling algorithm for RLHF with general preference models and proves its efficiency | reinforcement learning, RLHF, large language model
3 | HiMAE: Hierarchical Masked Autoencoders Discover Resolution-Specific Structure in Wearable Time Series | Hierarchical masked autoencoders that discover resolution-specific structure in wearable time series | representation learning, masked autoencoder, foundation model
4 | SpatialTraceGen: High-Fidelity Traces for Efficient VLM Spatial Reasoning Distillation | High-fidelity trace generation for efficient distillation of VLM spatial reasoning | reinforcement learning, offline reinforcement learning, distillation
5 | Dual-Mind World Models: A General Framework for Learning in Dynamic Wireless Networks | Proposes a dual-mind world model to address data inefficiency and poor generalization in dynamic wireless networks | reinforcement learning, world model, model-based RL
6 | Non-myopic Matching and Rebalancing in Large-Scale On-Demand Ride-Pooling Systems Using Simulation-Informed Reinforcement Learning | Proposes a non-myopic matching and rebalancing algorithm based on simulation-informed reinforcement learning to improve the efficiency of large-scale on-demand ride-pooling systems | reinforcement learning, spatiotemporal
7 | PaTaRM: Bridging Pairwise and Pointwise Signals via Preference-Aware Task-Adaptive Reward Modeling | Bridges pairwise and pointwise signals via preference-aware task-adaptive reward modeling to improve RLHF performance | reinforcement learning, RLHF, large language model
8 | Enhancing Hierarchical Reinforcement Learning through Change Point Detection in Time Series | Proposes a Transformer-based change-point detection module to improve the scalability of hierarchical reinforcement learning on long-horizon tasks | reinforcement learning
9 | Eigenfunction Extraction for Ordered Representation Learning | Proposes an eigenfunction extraction framework for ordered representation learning, improving the efficiency and accuracy of feature selection | representation learning
10 | Perception Learning: A Formal Separation of Sensory Representation Learning from Decision Learning | Proposes perception learning to formally separate sensory representation learning from decision learning | representation learning
11 | Causal-Aware Generative Adversarial Networks with Reinforcement Learning | Proposes CA-GAN, which uses causal graphs and reinforcement learning to generate high-quality, privacy-preserving tabular data | reinforcement learning
12 | Predicting Barge Tow Size on Inland Waterways Using Vessel Trajectory Derived Features: Proof of Concept | Proposes a machine learning method based on AIS data to predict barge tow size on inland waterways, improving waterway situational awareness | MAE, spatiotemporal
13 | Sample-efficient and Scalable Exploration in Continuous-Time RL | Proposes the COMBRL algorithm to address sample efficiency and scalability in continuous-time reinforcement learning | reinforcement learning, model-based RL
14 | A Novel XAI-Enhanced Quantum Adversarial Networks for Velocity Dispersion Modeling in MaNGA Galaxies | Proposes an XAI-enhanced quantum adversarial network for modeling velocity dispersion in MaNGA galaxies | predictive model, MAE

🔬 Pillar 9: Embodied Foundation Models (13 papers)

# | Title | One-line summary | Tags | 🔗
15 | What do vision-language models see in the context? Investigating multimodal in-context learning | A systematic study of in-context learning in vision-language models, revealing the limits of their multimodal fusion | large language model, multimodal, instruction following
16 | Can Aha Moments Be Fake? Identifying True and Decorative Thinking Steps in Chain-of-Thought | Proposes a True Thinking Score to distinguish genuine reasoning steps from decorative ones in chain-of-thought | large language model, chain-of-thought
17 | The Kinetics of Reasoning: How Chain-of-Thought Shapes Learning in Transformers? | Studies how CoT shapes Transformer learning, showing it accelerates generalization but is constrained by task complexity | chain-of-thought
18 | Pearl: A Foundation Model for Placing Every Atom in the Right Location | An atomically accurate foundation model for protein-ligand co-folding | foundation model
19 | A Pragmatic Way to Measure Chain-of-Thought Monitorability | Proposes a pragmatic method for measuring chain-of-thought (CoT) monitorability in support of AI safety | chain-of-thought
20 | ChessQA: Evaluating Large Language Models for Chess Understanding | Introduces ChessQA, a comprehensive benchmark for evaluating large language models' chess understanding | large language model
21 | Secure Retrieval-Augmented Generation against Poisoning Attacks | Proposes the RAGuard framework to harden retrieval-augmented generation against data-poisoning attacks | large language model
22 | Bayesian Neural Networks vs. Mixture Density Networks: Theoretical and Empirical Insights for Uncertainty-Aware Nonlinear Modeling | Compares Bayesian neural networks and mixture density networks for uncertainty-aware nonlinear modeling | multimodal
23 | Sequences of Logits Reveal the Low Rank Structure of Language Models | Reveals the low-rank structure of language models, exploiting logit sequences for efficient generation | large language model
24 | MISA: Memory-Efficient LLMs Optimization with Module-wise Importance Sampling | Proposes MISA to address memory efficiency in large language model optimization | large language model
25 | FLoRA: Fused forward-backward adapters for parameter efficient fine-tuning and reducing inference-time latencies of LLMs | Fused forward-backward adapters that improve LLM fine-tuning efficiency and reduce inference latency | large language model
26 | SALS: Sparse Attention in Latent Space for KV cache Compression | Proposes the SALS framework, compressing the KV cache via sparse attention in latent space to accelerate long-context LLM inference | large language model
27 | FALQON: Accelerating LoRA Fine-tuning with Low-Bit Floating-Point Arithmetic | Accelerates LoRA fine-tuning with low-bit floating-point arithmetic, improving training efficiency | large language model

🔬 Pillar 8: Physics-based Animation (2 papers)

# | Title | One-line summary | Tags | 🔗
28 | Causal Convolutional Neural Networks as Finite Impulse Response Filters | Treats causal convolutional neural networks as finite impulse response filters for dynamical-system modeling | PULSE, multimodal
29 | Learning from History: A Retrieval-Augmented Framework for Spatiotemporal Prediction | Proposes RAP, a retrieval-augmented prediction framework that addresses long-horizon error accumulation in spatiotemporal prediction | spatiotemporal

🔬 Pillar 4: Generative Motion (1 paper)

# | Title | One-line summary | Tags | 🔗
30 | Unlocking Out-of-Distribution Generalization in Dynamics through Physics-Guided Augmentation | Proposes SPARK, improving out-of-distribution generalization in dynamical-system modeling via physics-guided augmentation | physically plausible
