cs.LG (2025-09-19)

📊 32 papers in total | 🔗 2 with code

🎯 Interest Area Navigation

Pillar 2: RL Algorithms & Architecture (17 🔗1) · Pillar 9: Embodied Foundation Models (12 🔗1) · Pillar 1: Robot Control (3)

🔬 Pillar 2: RL Algorithms & Architecture (17 papers)

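# | Title | One-sentence summary | Tags | 🔗
1 | Foundation Models as World Models: A Foundational Study in Text-Based GridWorlds | Proposes Foundation Model-based world models and agents to improve reinforcement learning efficiency in text-based grid worlds. | reinforcement learning, world model, large language model
2 | Estimating Clinical Lab Test Result Trajectories from PPG using Physiological Foundation Model and Patient-Aware State Space Model -- a UNIPHY+ Approach | UNIPHY+Lab: uses PPG and a physiological foundation model to predict continuous lab-test trajectories for ICU patients. | Mamba, state space model, MAE
3 | Polynomial Contrastive Learning for Privacy-Preserving Representation Learning on Graphs | Proposes Poly-GRACE, enabling homomorphic-encryption-friendly self-supervised representation learning for graph neural networks. | representation learning, contrastive learning, OMOMO
4 | Optimizing Product Deduplication in E-Commerce with Multimodal Embeddings | Proposes a multimodal-embedding-based method for e-commerce product deduplication, improving large-scale product identification accuracy. | masked autoencoder, multimodal
5 | MTS-DMAE: Dual-Masked Autoencoder for Unsupervised Multivariate Time Series Representation Learning | Proposes the dual-masked autoencoder DMAE for unsupervised multivariate time series representation learning. | representation learning, masked autoencoder
6 | Test-Time Learning and Inference-Time Deliberation for Efficiency-First Offline Reinforcement Learning in Care Coordination and Population Health Management | Proposes the TTL+ITD framework for efficient, auditable offline reinforcement learning in care coordination. | reinforcement learning, offline reinforcement learning
7 | Mental Accounts for Actions: EWA-Inspired Attention in Decision Transformers | EWA-VQ-ODT: introduces an experience-weighted attraction mechanism into online Decision Transformers to improve sample efficiency. | reinforcement learning, decision transformer, reward shaping
8 | Automated Cyber Defense with Generalizable Graph-based Reinforcement Learning Agents | Proposes graph-neural-network-based reinforcement learning agents for automated cyber defense with generalization ability. | reinforcement learning, deep reinforcement learning
9 | Fully Decentralized Cooperative Multi-Agent Reinforcement Learning is A Context Modeling Problem | Proposes the Dynamics-Aware Context (DAC) method to address non-stationarity and over-generalization in fully decentralized cooperative multi-agent reinforcement learning. | reinforcement learning, policy learning
10 | DiffusionNFT: Online Diffusion Reinforcement with Forward Process | Proposes DiffusionNFT, which performs online diffusion reinforcement learning via the forward process, improving generation quality and efficiency. | reinforcement learning, flow matching, classifier-free guidance
11 | Uncertainty-Based Smooth Policy Regularisation for Reinforcement Learning with Few Demonstrations | SPReD: uncertainty-based smooth policy regularization that improves reinforcement learning from few demonstrations. | reinforcement learning
12 | RLinf: Flexible and Efficient Large-scale Reinforcement Learning via Macro-to-Micro Flow Transformation | RLinf: flexible and efficient large-scale reinforcement learning via macro-to-micro flow transformation. | reinforcement learning
13 | RMT-KD: Random Matrix Theoretic Causal Knowledge Distillation | RMT-KD: proposes a random-matrix-theoretic causal knowledge distillation method for compressing deep learning models. | distillation
14 | Nonconvex Regularization for Feature Selection in Reinforcement Learning | Proposes a nonconvex-regularization-based feature selection algorithm for reinforcement learning, improving performance in high-noise environments. | reinforcement learning
15 | Inverse Optimization Latent Variable Models for Learning Costs Applied to Route Problems | Proposes inverse optimization latent variable models for learning the distribution of cost functions in route planning problems. | reinforcement learning, inverse reinforcement learning
16 | HyP-ASO: A Hybrid Policy-based Adaptive Search Optimization Framework for Large-Scale Integer Linear Programs | HyP-ASO: a hybrid policy-based adaptive search optimization framework for solving large-scale integer linear programs. | reinforcement learning, deep reinforcement learning
17 | Learning to Optimize Capacity Planning in Semiconductor Manufacturing | Proposes a deep reinforcement learning model based on heterogeneous graph neural networks to optimize capacity planning in semiconductor manufacturing. | reinforcement learning, deep reinforcement learning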

🔬 Pillar 9: Embodied Foundation Models (12 papers)

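# | Title | One-sentence summary | Tags | 🔗
18 | Uncertainty Quantification of Large Language Models using Approximate Bayesian Computation | Proposes an uncertainty quantification method for large language models based on approximate Bayesian computation, improving the reliability of clinical diagnosis. | large language model
19 | Efficient Long-Tail Learning in Latent Space by sampling Synthetic Data | Proposes an efficient long-tail learning method based on sampling synthetic data in latent space. | foundation model
20 | MatchFixAgent: Language-Agnostic Autonomous Repository-Level Code Translation Validation and Repair | Proposes MatchFixAgent for language-agnostic, repository-level code translation validation and repair. | large language model
21 | Randomized Smoothing Meets Vision-Language Models | Proposes a randomized-smoothing-based robustness verification method for vision-language models to defend against adversarial attacks. | VLA
22 | SABER: Uncovering Vulnerabilities in Safety Alignment via Cross-Layer Residual Connection | SABER: reveals vulnerabilities in safety-aligned large language models via cross-layer residual connections. | large language model
23 | The Alignment Bottleneck | Proposes a capacity-coupled alignment performance interval to address large language model alignment. | large language model
24 | On Optimal Steering to Achieve Exact Fairness | Proposes a KL-divergence-based method for optimally steering feature distributions, achieving exact fairness while improving model utility. | large language model
25 | EigenTrack: Spectral Activation Feature Tracking for Hallucination and Out-of-Distribution Detection in LLMs and VLMs | EigenTrack: tracks spectral activation features for hallucination and OOD detection in LLMs and VLMs. | large language model
26 | KITE: Kernelized and Information Theoretic Exemplars for In-Context Learning | KITE: kernel- and information-theory-based exemplar selection for in-context learning, improving few-shot classification performance. | large language model
27 | Information Geometry of Variational Bayes | Reveals the connection between information geometry and variational Bayes, using natural gradients to optimize large language models. | large language model
28 | Spectral Logit Sculpting: Adaptive Low-Rank Logit Transformation for Controlled Text Generation | Proposes Spectral Logit Sculpting (SLS), which controls text generation through adaptive low-rank logit transformation, improving LLM reliability. | large language model
29 | Small LLMs with Expert Blocks Are Good Enough for Hyperparamter Tuning | Proposes an Expert Block framework with which small LLMs suffice for efficient hyperparameter tuning. | large language model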

🔬 Pillar 1: Robot Control (3 papers)

# | Title | One-sentence summary | Tags | 🔗
30 | Quantum Reinforcement Learning with Dynamic-Circuit Qubit Reuse and Grover-Based Trajectory Optimization | Proposes a quantum reinforcement learning framework based on dynamic-circuit qubit reuse and Grover search. | trajectory optimization, reinforcement learning
31 | CoUn: Empowering Machine Unlearning via Contrastive Learning | CoUn: strengthens machine unlearning through contrastive learning. | manipulation, contrastive learning
32 | UniTac2Pose: A Unified Approach Learned in Simulation for Category-level Visuotactile In-hand Pose Estimation | UniTac2Pose: a unified framework learned in simulation for category-level visuotactile in-hand pose estimation. | sim-to-real, feature matching
