cs.LG (2025-03-12)

📊 34 papers total | 🔗 5 with code

🎯 Interest Area Navigation

Pillar 2: RL Algorithms & Architecture (17 🔗3) | Pillar 9: Embodied Foundation Models (13 🔗2) | Pillar 1: Robot Control (2) | Pillar 5: Interaction & Reaction (1) | Pillar 8: Physics-based Animation (1)

🔬 Pillar 2: RL Algorithms & Architecture (17 papers)

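#	Title	One-line summary	Tags	🔗	⭐
1	A Deep Reinforcement Learning Approach to Automated Stock Trading, using xLSTM Networks	Proposes a deep reinforcement learning approach to stock trading based on xLSTM networks, improving the modeling of long-term dependencies.	reinforcement learning deep reinforcement learning DRL
2	A Survey of Direct Preference Optimization	A survey of DPO: direct preference optimization methods that improve the efficiency and stability of LLM alignment.	reinforcement learning RLHF DPO	✅
3	Towards Robust Multimodal Representation: A Unified Approach with Adaptive Experts and Alignment	Proposes MoSARe, a model for robust representation when medical multimodal data is missing.	contrastive learning multimodal	✅
4	Language-Enhanced Representation Learning for Single-Cell Transcriptomics	Proposes scMMGPT for language-enhanced representation learning in single-cell transcriptomics.	representation learning large language model multimodal
5	The Pitfalls of Imitation Learning when Actions are Continuous	Reveals the limitations of imitation learning in continuous action spaces and explores strategies for improvement.	offline RL imitation learning behavior cloning
6	Temporal Difference Flows	Proposes TD-Flow, which uses a Bellman equation over probability paths together with flow matching to learn geometric horizon models that stay accurate over long horizons.	flow matching world model predictive model
7	Strategyproof Reinforcement Learning from Human Feedback	Proposes the Pessimistic Median of MLEs algorithm to address policy bias caused by strategic feedback in RLHF.	reinforcement learning RLHF
8	Rule-Guided Reinforcement Learning Policy Evaluation and Improvement	LEGIBLE: a rule-guided method for evaluating and improving reinforcement learning policies.	reinforcement learning deep reinforcement learning
9	Implicit Contrastive Representation Learning with Guided Stop-gradient	Proposes a guided stop-gradient method that improves the stability and performance of self-supervised contrastive learning.	representation learning contrastive learning	✅
10	Distributionally Robust Multi-Agent Reinforcement Learning for Dynamic Chute Mapping	Proposes the DRMARL framework for chute mapping in dynamic Amazon warehouse sorting that is robust to uncertain induction rates.	reinforcement learning
11	Privacy-Preserved Automated Scoring using Federated Learning for Educational Research	Proposes a privacy-preserving automated scoring framework based on federated learning for educational assessment research.	MAE large language model
12	ConjointNet: Enhancing Conjoint Analysis for Preference Prediction with Representation Learning	ConjointNet: uses representation learning to enhance conjoint analysis and improve preference prediction accuracy.	representation learning
13	Towards Causal Model-Based Policy Optimization	Proposes C-MBPO, which uses a causal model to improve the generalization and robustness of model-based reinforcement learning.	reinforcement learning policy learning predictive model
14	Optimisation of the Accelerator Control by Reinforcement Learning: A Simulation-Based Approach	Proposes a reinforcement-learning-based framework for optimizing accelerator control, improving beamline performance.	reinforcement learning
15	Reinforcement Learning is all You Need	Trains a 3B language model with pure reinforcement learning to improve its reasoning ability.	reinforcement learning
16	Evaluating Reinforcement Learning Safety and Trustworthiness in Cyber-Physical Systems	Proposes the SAFE-RL framework for evaluating and improving the safety and trustworthiness of reinforcement learning in cyber-physical systems.	reinforcement learning
17	Representation Retrieval Learning for Heterogeneous Data Integration	Proposes R2, a representation retrieval learning framework that addresses degraded predictive performance in heterogeneous data integration.	predictive model representation learning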

🔬 Pillar 9: Embodied Foundation Models (13 papers)

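#	Title	One-line summary	Tags	🔗	⭐
18	SciHorizon: Benchmarking AI-for-Science Readiness from Scientific Data to Large Language Models	SciHorizon: builds an AI4Science evaluation framework spanning scientific data to large language models.	large language model multimodal
19	Foundation Models for Spatio-Temporal Data Science: A Tutorial and Survey	Surveys foundation models in spatio-temporal data science, with a focus on improving generalization and adaptability on spatio-temporal data tasks.	large language model foundation model
20	Revisiting semi-supervised learning in the era of foundation models	For vision foundation models, proposes a semi-supervised self-training method based on ensembled pseudo-labels.	foundation model
21	Large Language Models for Multi-Facility Location Mechanism Design	Proposes LLMMech, which uses large language models to solve multi-facility location mechanism design problems.	large language model
22	Towards Graph Foundation Models: A Transferability Perspective	Builds a transferability analysis framework for graph foundation models to promote cross-domain generalization on graph data.	foundation model
23	PharMolixFM: All-Atom Foundation Models for Molecular Modeling and Generation	PharMolixFM: all-atom foundation models for molecular modeling and generation.	foundation model	✅
24	LLM-PS: Empowering Large Language Models for Time Series Forecasting with Temporal Patterns and Semantics	LLM-PS: enhances large language models for time series forecasting with temporal patterns and semantics.	large language model
25	TabNSA: Native Sparse Attention for Efficient Tabular Data Learning	Proposes TabNSA, which uses native sparse attention to learn tabular data efficiently.	large language model
26	Adaptive political surveys and GPT-4: Tackling the cold start problem with simulated user interactions	Uses GPT-4 to simulate user interactions, tackling the cold start problem in adaptive political surveys.	large language model
27	Priority-Aware Preemptive Scheduling for Mixed-Priority Workloads in MoE Inference	QLLM: priority-aware preemptive scheduling for MoE models, optimizing inference under mixed-priority workloads.	large language model
28	Týr-the-Pruner: Structural Pruning LLMs via Global Sparsity Distribution Optimization	Týr-the-Pruner: structural pruning of LLMs via global sparsity distribution optimization.	large language model	✅
29	Why LLMs Cannot Think and How to Fix It	Argues that architectural constraints prevent LLMs from "thinking" and proposes a fix.	large language model
30	GRU: Mitigating the Trade-off between Unlearning and Retention for LLMs	Proposes the Gradient Rectified Unlearning (GRU) framework to mitigate the trade-off between unlearning and retention in LLMs.	large language model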

🔬 Pillar 1: Robot Control (2 papers)

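#	Title	One-line summary	Tags	🔗	⭐
31	Probing Latent Subspaces in LLM for AI Security: Identifying and Manipulating Adversarial States	Probes LLM latent spaces for AI security: identifying and manipulating adversarial states.	manipulation large language model
32	I Predict Therefore I Am: Is Next Token Prediction Enough to Learn Human-Interpretable Concepts from Data?	Proposes a theoretical framework based on latent-variable generative models that reveals the mechanism by which LLMs learn human-interpretable concepts.	manipulation large language model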

🔬 Pillar 5: Interaction & Reaction (1 paper)

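#	Title	One-line summary	Tags	🔗	⭐
33	Exploiting Unstructured Sparsity in Fully Homomorphic Encrypted DNNs	For fully homomorphic encrypted DNNs, proposes a scheme that accelerates matrix multiplication by exploiting unstructured sparsity.	OMOMO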

🔬 Pillar 8: Physics-based Animation (1 paper)

#	Title	One-line summary	Tags	🔗	⭐
34	Time-EAPCR: A Deep Learning-Based Novel Approach for Anomaly Detection Applied to the Environmental Field	Proposes Time-EAPCR, a deep learning model for anomaly detection in the environmental field.	spatiotemporal
