cs.LG (2025-03-12)

📊 34 papers in total | 🔗 5 with code

🎯 Interest Area Navigation

Pillar 2: RL & Architecture (17 🔗3) · Pillar 9: Embodied Foundation Models (13 🔗2) · Pillar 1: Robot Control (2) · Pillar 5: Interaction & Reaction (1) · Pillar 8: Physics-based Animation (1)

🔬 Pillar 2: RL & Architecture (17 papers)

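# | Title | One-line summary | Tags | 🔗
1 | A Deep Reinforcement Learning Approach to Automated Stock Trading, using xLSTM Networks | Proposes a deep reinforcement learning method for stock trading based on xLSTM networks, improving the modeling of long-term dependencies. | reinforcement learning, deep reinforcement learning, DRL
2 | A Survey of Direct Preference Optimization | Surveys direct preference optimization (DPO) methods, which improve the efficiency and stability of LLM alignment. | reinforcement learning, RLHF, DPO
3 | Towards Robust Multimodal Representation: A Unified Approach with Adaptive Experts and Alignment | Proposes the MoSARe model for robust representation learning when medical multimodal data is missing. | contrastive learning, multimodal
4 | Language-Enhanced Representation Learning for Single-Cell Transcriptomics | Proposes scMMGPT for language-enhanced representation learning in single-cell transcriptomics. | representation learning, large language model, multimodal
5 | The Pitfalls of Imitation Learning when Actions are Continuous | Reveals the limitations of imitation learning in continuous action spaces and explores strategies for improvement. | offline RL, imitation learning, behavior cloning
6 | Temporal Difference Flows | Proposes TD-Flow, which learns accurate long-horizon geometric horizon models using a Bellman equation over probability paths together with flow matching. | flow matching, world model, predictive model
7 | Strategyproof Reinforcement Learning from Human Feedback | Proposes the Pessimistic Median of MLEs algorithm to address the policy bias caused by strategic feedback in RLHF. | reinforcement learning, RLHF
8 | Rule-Guided Reinforcement Learning Policy Evaluation and Improvement | LEGIBLE: a rule-guided method for evaluating and improving reinforcement learning policies. | reinforcement learning, deep reinforcement learning
9 | Implicit Contrastive Representation Learning with Guided Stop-gradient | Proposes a guided stop-gradient method that improves the stability and performance of self-supervised contrastive learning. | representation learning, contrastive learning
10 | Distributionally Robust Multi-Agent Reinforcement Learning for Dynamic Chute Mapping | Proposes the DRMARL framework for chute mapping that is robust to uncertain induction rates in dynamic sortation at Amazon warehouses. | reinforcement learning
11 | Privacy-Preserved Automated Scoring using Federated Learning for Educational Research | Proposes a privacy-preserving automated scoring framework based on federated learning for educational assessment research. | MAE, large language model
12 | ConjointNet: Enhancing Conjoint Analysis for Preference Prediction with Representation Learning | ConjointNet: uses representation learning to enhance conjoint analysis and improve preference prediction accuracy. | representation learning
13 | Towards Causal Model-Based Policy Optimization | Proposes C-MBPO, which uses causal models to improve the generalization and robustness of model-based reinforcement learning. | reinforcement learning, policy learning, predictive model
14 | Optimisation of the Accelerator Control by Reinforcement Learning: A Simulation-Based Approach | Proposes a reinforcement learning framework for optimizing accelerator control, improving beamline performance. | reinforcement learning
15 | Reinforcement Learning is all You Need | Trains a 3B language model with pure reinforcement learning to improve its reasoning ability. | reinforcement learning
16 | Evaluating Reinforcement Learning Safety and Trustworthiness in Cyber-Physical Systems | Proposes the SAFE-RL framework for evaluating and improving the safety and trustworthiness of reinforcement learning in cyber-physical systems. | reinforcement learning
17 | Representation Retrieval Learning for Heterogeneous Data Integration | Proposes R2, a representation retrieval learning framework that addresses degraded predictive performance in heterogeneous data integration. | predictive model, representation learning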

🔬 Pillar 9: Embodied Foundation Models (13 papers)

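# | Title | One-line summary | Tags | 🔗
18 | SciHorizon: Benchmarking AI-for-Science Readiness from Scientific Data to Large Language Models | SciHorizon: builds an AI4Science evaluation framework spanning scientific data to large language models. | large language model, multimodal
19 | Foundation Models for Spatio-Temporal Data Science: A Tutorial and Survey | Surveys foundation models for spatio-temporal data science, with a focus on improving generalization and adaptability across spatio-temporal tasks. | large language model, foundation model
20 | Revisiting semi-supervised learning in the era of foundation models | Proposes a semi-supervised self-training method for vision foundation models based on ensembled pseudo-labels. | foundation model
21 | Large Language Models for Multi-Facility Location Mechanism Design | Proposes LLMMech, which uses large language models to solve multi-facility location mechanism design problems. | large language model
22 | Towards Graph Foundation Models: A Transferability Perspective | Builds a transferability analysis framework for graph foundation models to promote cross-domain generalization on graph data. | foundation model
23 | PharMolixFM: All-Atom Foundation Models for Molecular Modeling and Generation | PharMolixFM: all-atom foundation models for molecular modeling and generation. | foundation model
24 | LLM-PS: Empowering Large Language Models for Time Series Forecasting with Temporal Patterns and Semantics | LLM-PS: enhances large language models for time series forecasting with temporal patterns and semantics. | large language model
25 | TabNSA: Native Sparse Attention for Efficient Tabular Data Learning | Proposes TabNSA, which uses native sparse attention to learn from tabular data efficiently. | large language model
26 | Adaptive political surveys and GPT-4: Tackling the cold start problem with simulated user interactions | Uses GPT-4 to simulate user interactions, addressing the cold start problem in adaptive political surveys. | large language model
27 | Priority-Aware Preemptive Scheduling for Mixed-Priority Workloads in MoE Inference | QLLM: priority-aware preemptive scheduling for MoE models that optimizes inference under mixed workloads. | large language model
28 | Týr-the-Pruner: Structural Pruning LLMs via Global Sparsity Distribution Optimization | Týr-the-Pruner: structural pruning of LLMs via global sparsity distribution optimization. | large language model
29 | Why LLMs Cannot Think and How to Fix It | Argues that architectural constraints prevent LLMs from "thinking" and proposes a fix. | large language model
30 | GRU: Mitigating the Trade-off between Unlearning and Retention for LLMs | Proposes the Gradient Rectified Unlearning (GRU) framework to mitigate the trade-off between unlearning and retention in LLMs. | large language model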

🔬 Pillar 1: Robot Control (2 papers)

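# | Title | One-line summary | Tags | 🔗
31 | Probing Latent Subspaces in LLM for AI Security: Identifying and Manipulating Adversarial States | Probes the latent space of LLMs for AI security, identifying and manipulating adversarial states. | manipulation, large language model
32 | I Predict Therefore I Am: Is Next Token Prediction Enough to Learn Human-Interpretable Concepts from Data? | Proposes a theoretical framework based on latent-variable generative models that reveals the mechanism by which LLMs learn human-interpretable concepts. | manipulation, large language model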

🔬 Pillar 5: Interaction & Reaction (1 paper)

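# | Title | One-line summary | Tags | 🔗
33 | Exploiting Unstructured Sparsity in Fully Homomorphic Encrypted DNNs | Proposes a scheme for fully homomorphic encrypted DNNs that exploits unstructured sparsity to accelerate matrix multiplication. | OMOMO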

🔬 Pillar 8: Physics-based Animation (1 paper)

# | Title | One-line summary | Tags | 🔗
34 | Time-EAPCR: A Deep Learning-Based Novel Approach for Anomaly Detection Applied to the Environmental Field | Proposes Time-EAPCR, a deep learning model for anomaly detection in the environmental field. | spatiotemporal
