cs.LG (2025-09-19)

📊 32 papers in total | 🔗 2 with code

🎯 Interest Area Navigation

Pillar 2: RL Algorithms & Architecture (17 🔗1) Pillar 9: Embodied Foundation Models (12 🔗1) Pillar 1: Robot Control (3)

🔬 Pillar 2: RL Algorithms & Architecture (17 papers)

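#	Title	One-sentence summary	Tags	🔗	⭐
1	Foundation Models as World Models: A Foundational Study in Text-Based GridWorlds	Proposes Foundation Model-based world models and agents to improve reinforcement learning efficiency in text-based grid worlds.	reinforcement learning world model large language model
2	Estimating Clinical Lab Test Result Trajectories from PPG using Physiological Foundation Model and Patient-Aware State Space Model -- a UNIPHY+ Approach	UNIPHY+Lab: uses PPG and a physiological foundation model to predict continuous lab-test trajectories for ICU patients.	Mamba state space model MAE
3	Polynomial Contrastive Learning for Privacy-Preserving Representation Learning on Graphs	Proposes Poly-GRACE, enabling homomorphic-encryption-friendly self-supervised representation learning for graph neural networks.	representation learning contrastive learning OMOMO
4	Optimizing Product Deduplication in E-Commerce with Multimodal Embeddings	Proposes a multimodal-embedding-based product deduplication method for e-commerce that improves large-scale product identification accuracy.	masked autoencoder multimodal
5	MTS-DMAE: Dual-Masked Autoencoder for Unsupervised Multivariate Time Series Representation Learning	Proposes the dual-masked autoencoder DMAE for unsupervised multivariate time series representation learning.	representation learning masked autoencoder
6	Test-Time Learning and Inference-Time Deliberation for Efficiency-First Offline Reinforcement Learning in Care Coordination and Population Health Management	Proposes the TTL+ITD framework for efficient, auditable offline reinforcement learning in care coordination.	reinforcement learning offline reinforcement learning
7	Mental Accounts for Actions: EWA-Inspired Attention in Decision Transformers	EWA-VQ-ODT: introduces an experience-weighted attraction mechanism into the online decision transformer to improve sample efficiency.	reinforcement learning decision transformer reward shaping
8	Automated Cyber Defense with Generalizable Graph-based Reinforcement Learning Agents	Proposes graph neural network-based reinforcement learning agents for automated cyber defense with generalization capability.	reinforcement learning deep reinforcement learning
9	Fully Decentralized Cooperative Multi-Agent Reinforcement Learning is A Context Modeling Problem	Proposes the Dynamics-Aware Context (DAC) method to address non-stationarity and over-generalization in fully decentralized cooperative multi-agent reinforcement learning.	reinforcement learning policy learning
10	DiffusionNFT: Online Diffusion Reinforcement with Forward Process	Proposes DiffusionNFT, which performs online diffusion reinforcement learning via the forward process, improving generation quality and efficiency.	reinforcement learning flow matching classifier-free guidance
11	Uncertainty-Based Smooth Policy Regularisation for Reinforcement Learning with Few Demonstrations	SPReD: uncertainty-based smooth policy regularisation that improves reinforcement learning from few demonstrations.	reinforcement learning	✅
12	RLinf: Flexible and Efficient Large-scale Reinforcement Learning via Macro-to-Micro Flow Transformation	RLinf: flexible and efficient large-scale reinforcement learning via macro-to-micro flow transformation.	reinforcement learning
13	RMT-KD: Random Matrix Theoretic Causal Knowledge Distillation	RMT-KD: proposes a random matrix theory-based causal knowledge distillation method for deep learning model compression.	distillation
14	Nonconvex Regularization for Feature Selection in Reinforcement Learning	Proposes a nonconvex-regularization-based feature selection algorithm for reinforcement learning that improves performance in high-noise environments.	reinforcement learning
15	Inverse Optimization Latent Variable Models for Learning Costs Applied to Route Problems	Proposes an inverse optimization latent variable model for learning the distribution of cost functions in route planning problems.	reinforcement learning inverse reinforcement learning
16	HyP-ASO: A Hybrid Policy-based Adaptive Search Optimization Framework for Large-Scale Integer Linear Programs	HyP-ASO: a hybrid policy-based adaptive search optimization framework for solving large-scale integer linear programs.	reinforcement learning deep reinforcement learning
17	Learning to Optimize Capacity Planning in Semiconductor Manufacturing	Proposes a deep reinforcement learning model based on heterogeneous graph neural networks to optimize capacity planning in semiconductor manufacturing.	reinforcement learning deep reinforcement learning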

🔬 Pillar 9: Embodied Foundation Models (12 papers)

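#	Title	One-sentence summary	Tags	🔗	⭐
18	Uncertainty Quantification of Large Language Models using Approximate Bayesian Computation	Proposes an Approximate Bayesian Computation-based uncertainty quantification method for large language models, improving the reliability of clinical diagnosis.	large language model
19	Efficient Long-Tail Learning in Latent Space by sampling Synthetic Data	Proposes an efficient long-tail learning method based on sampling synthetic data in latent space.	foundation model
20	MatchFixAgent: Language-Agnostic Autonomous Repository-Level Code Translation Validation and Repair	Proposes MatchFixAgent for language-agnostic, repository-level code translation validation and repair.	large language model
21	Randomized Smoothing Meets Vision-Language Models	Proposes a randomized-smoothing-based robustness verification method for vision-language models to defend against adversarial attacks.	VLA
22	SABER: Uncovering Vulnerabilities in Safety Alignment via Cross-Layer Residual Connection	SABER: uncovers vulnerabilities in safety-aligned large language models via cross-layer residual connections.	large language model	✅
23	The Alignment Bottleneck	Proposes a capacity-coupled alignment performance interval to address large language model alignment.	large language model
24	On Optimal Steering to Achieve Exact Fairness	Proposes a KL-divergence-based optimal steering method for feature distributions that achieves exact fairness while improving model utility.	large language model
25	EigenTrack: Spectral Activation Feature Tracking for Hallucination and Out-of-Distribution Detection in LLMs and VLMs	EigenTrack: tracks spectral activation features to detect hallucination and out-of-distribution inputs in LLMs and VLMs.	large language model
26	KITE: Kernelized and Information Theoretic Exemplars for In-Context Learning	KITE: kernel- and information-theory-based exemplar selection for in-context learning, improving few-shot classification performance.	large language model
27	Information Geometry of Variational Bayes	Reveals the connection between information geometry and variational Bayes, and uses natural gradients to optimize large language models.	large language model
28	Spectral Logit Sculpting: Adaptive Low-Rank Logit Transformation for Controlled Text Generation	Proposes Spectral Logit Sculpting (SLS), which controls text generation through adaptive low-rank logit transformation, improving LLM reliability.	large language model
29	Small LLMs with Expert Blocks Are Good Enough for Hyperparamter Tuning	Proposes an Expert Block framework with which small LLMs can perform efficient hyperparameter tuning.	large language model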

🔬 Pillar 1: Robot Control (3 papers)

#	Title	One-sentence summary	Tags	🔗	⭐
30	Quantum Reinforcement Learning with Dynamic-Circuit Qubit Reuse and Grover-Based Trajectory Optimization	Proposes a quantum reinforcement learning framework based on dynamic-circuit qubit reuse and Grover search.	trajectory optimization reinforcement learning
31	CoUn: Empowering Machine Unlearning via Contrastive Learning	CoUn: strengthens machine unlearning through contrastive learning.	manipulation contrastive learning
32	UniTac2Pose: A Unified Approach Learned in Simulation for Category-level Visuotactile In-hand Pose Estimation	UniTac2Pose: a unified framework learned in simulation for category-level visuotactile in-hand pose estimation.	sim-to-real feature matching
