cs.LG (2025-05-30)

📊 50 papers in total | 🔗 10 with code

🎯 Interest-area navigation

Pillar 9: Embodied Foundation Models (30, 🔗 5) · Pillar 2: RL Algorithms & Architecture (15, 🔗 5) · Pillar 1: Robot Control (2) · Pillar 4: Generative Motion (2) · Pillar 5: Interaction & Reaction (1)

🔬 Pillar 9: Embodied Foundation Models (30 papers)

| # | Title | One-line takeaway | Tags |
|---|---|---|---|
| 1 | AFLoRA: Adaptive Federated Fine-Tuning of Large Language Models with Resource-Aware Low-Rank Adaption | Adaptive federated fine-tuning of LLMs under heterogeneous resource constraints | large language model, foundation model |
| 2 | Proxy-FDA: Proxy-based Feature Distribution Alignment for Fine-tuning Vision Foundation Models without Forgetting | Proxy-based feature distribution alignment that prevents concept forgetting when fine-tuning vision foundation models | foundation model |
| 3 | Beyond Atomic Geometry Representations in Materials Science: A Human-in-the-Loop Multimodal Framework | MCS-Set, a multimodal materials-science framework fusing atomic structures, 2D projections, and text annotations to improve property prediction and crystal generation | multimodal |
| 4 | Intercept Cancer: Cancer Pre-Screening with Large Scale Healthcare Foundation Models | CATCH-FM: cancer pre-screening with large-scale healthcare foundation models | foundation model |
| 5 | Applying Large Language Models to Issue Classification: Revisiting with Extended Data and New Models | Revisits LLM-based issue classification with extended data and evaluations of newer models | large language model |
| 6 | The Road to Generalizable Neuro-Symbolic Learning Should be Paved with Foundation Models | Leverages pretrained foundation models to make neuro-symbolic learning generalize on complex reasoning tasks | foundation model |
| 7 | PhySense: Principle-Based Physics Reasoning Benchmarking for Large Language Models | PhySense: a benchmark for principle-based physics reasoning in LLMs | large language model |
| 8 | HELM: Hyperbolic Large Language Models via Mixture-of-Curvature Experts | HELM: hyperbolic-space LLMs with mixture-of-curvature experts for better geometric modeling of text | large language model |
| 9 | Learning Safety Constraints for Large Language Models | Safety Polytope (SaP): learning and enforcing LLM safety constraints in representation space | large language model |
| 10 | Equivalent Linear Mappings of Large Language Models | Equivalent linear mappings for analyzing the inference mechanics of LLMs | large language model |
| 11 | Generalisation Bounds of Zero-Shot Economic Forecasting using Time Series Foundation Models | Zero-shot economic forecasting with time-series foundation models, with no bespoke training required | foundation model |
| 12 | Learn from the Past: Fast Sparse Indexing for Large Language Model Decoding | LFPS: exploits historical attention patterns to speed up sparse-index retrieval in long-context LLM decoding | large language model |
| 13 | When GPT Spills the Tea: Comprehensive Assessment of Knowledge File Leakage in GPTs | Exposes knowledge-file leakage in GPTs via a comprehensive assessment framework that uncovers multiple leakage channels | large language model |
| 14 | Breakpoint: Scalable evaluation of system-level reasoning in LLM code agents | Breakpoint: scalable evaluation of system-level reasoning in LLM code agents via adversarial code corruption | large language model |
| 15 | Privacy Amplification in Differentially Private Zeroth-Order Optimization with Hidden States | Convergence guarantees for differentially private zeroth-order optimization with hidden states, plus improved algorithm design | large language model |
| 16 | Aligning Language Models with Observational Data: Opportunities and Risks from a Causal Perspective | DeconfoundLM: causal deconfounding for better language-model alignment on observational data | large language model |
| 17 | Chameleon: A Flexible Data-mixing Framework for Language Model Pretraining and Finetuning | Chameleon: a flexible data-mixing framework for language-model pretraining and fine-tuning | large language model |
| 18 | SUMO: Subspace-Aware Moment-Orthogonalization for Accelerating Memory-Efficient LLM Training | SUMO: subspace-aware moment orthogonalization to accelerate memory-efficient LLM training | large language model |
| 19 | PDE-Transformer: Efficient and Versatile Transformers for Physics Simulations | PDE-Transformer: efficient, versatile surrogate modeling for physics simulations | foundation model |
| 20 | Beyond Linear Steering: Unified Multi-Attribute Control for Language Models | K-Steering: a nonlinear method for unified multi-attribute control of language models | large language model |
| 21 | Can Slow-thinking LLMs Reason Over Time? Empirical Studies in Time Series Forecasting | TimeReasoner: probes the reasoning ability of slow-thinking LLMs in time-series forecasting | multimodal |
| 22 | Object Centric Concept Bottlenecks | Object-Centric Concept Bottlenecks: better performance and interpretability for concept-bottleneck models on complex vision tasks | foundation model |
| 23 | Every Rollout Counts: Optimal Resource Allocation for Efficient Test-Time Scaling | DORA: optimal rollout allocation for more efficient and accurate test-time LLM reasoning | large language model |
| 24 | ReCalKV: Low-Rank KV Cache Compression via Head Reordering and Offline Calibration | ReCalKV: low-rank KV-cache compression via head reordering and offline calibration | large language model |
| 25 | SwiftEval: Developing a Language-Specific Benchmark for LLM-generated Code Evaluation | SwiftEval: a high-quality benchmark for evaluating LLM-generated Swift code | large language model |
| 26 | LittleBit: Ultra Low-Bit Quantization via Latent Factorization | LittleBit: ultra-low-bit quantization via latent factorization for aggressive LLM compression | large language model |
| 27 | Fine-Tune an SLM or Prompt an LLM? The Case of Generating Low-Code Workflows | For low-code workflow generation, shows that a fine-tuned small language model beats a prompted LLM on output quality | large language model |
| 28 | SALE: Low-bit Estimation for Efficient Sparse Attention in Long-context LLM Prefilling | SALE: sparse attention via low-bit estimation to accelerate the prefilling stage of long-context LLMs | large language model |
| 29 | Invariant Link Selector for Spatial-Temporal Out-of-Distribution Problem | An invariant link selector for spatial-temporal out-of-distribution generalization on temporal graphs | foundation model |
| 30 | Don't Just Follow MLLM Plans: Robust and Efficient Planning for Open-world Agents | REPOA: a framework for robust and efficient planning in open-world agents | large language model |
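
Several entries above build on low-rank adaptation — AFLoRA (#1) federates LoRA-style fine-tuning, and ReCalKV (#24) applies low-rank structure to the KV cache. As generic background only (this is a minimal sketch of vanilla LoRA, not code from any listed paper), the idea is to keep the pretrained weight frozen and learn a small low-rank correction `B @ A`, scaled by `alpha / r`:

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, r = 16, 16, 4             # layer dims and adapter rank (r << d)
W = rng.normal(size=(d_out, d_in))     # frozen pretrained weight
A = rng.normal(size=(r, d_in)) * 0.01  # trainable down-projection
B = np.zeros((d_out, r))               # trainable up-projection, zero-initialized
alpha = 8.0                            # LoRA scaling hyperparameter

def adapted_forward(x):
    """Forward pass through the adapted layer: (W + (alpha/r) * B @ A) @ x."""
    return (W + (alpha / r) * (B @ A)) @ x

x = rng.normal(size=(d_in,))
# Zero-initializing B makes the adapter an exact no-op before training,
# so fine-tuning starts from the frozen model's behavior.
assert np.allclose(adapted_forward(x), W @ x)
```

Only `A` and `B` (2 × r × d parameters instead of d²) would be updated during fine-tuning, which is what makes federating or quantizing such adapters cheap.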

🔬 Pillar 2: RL Algorithms & Architecture (15 papers)

| # | Title | One-line takeaway | Tags |
|---|---|---|---|
| 31 | Causal-aware Large Language Models: Enhancing Decision-Making Through Learning, Adapting and Acting | Causal-aware LLMs that improve decision-making through learning, adapting, and acting | reinforcement learning, large language model |
| 32 | Mastering Massive Multi-Task Reinforcement Learning via Mixture-of-Expert Decision Transformer | M3DT: a mixture-of-experts decision Transformer for massive multi-task RL | reinforcement learning, decision transformer |
| 33 | MDPO: Multi-Granularity Direct Preference Optimization for Mathematical Reasoning | MDPO: multi-granularity direct preference optimization for stronger LLM mathematical reasoning | DPO, direct preference optimization, large language model |
| 34 | ROAD: Responsibility-Oriented Reward Design for Reinforcement Learning in Autonomous Driving | ROAD: responsibility-oriented reward design that improves RL decision-making for autonomous driving | reinforcement learning, reward design |
| 35 | AReaL: A Large-Scale Asynchronous Reinforcement Learning System for Language Reasoning | AReaL: a large-scale asynchronous RL system for language reasoning with markedly higher training efficiency | reinforcement learning, PPO, large language model |
| 36 | Adversarial Preference Learning for Robust LLM Alignment | Adversarial preference learning (APL) to harden LLM alignment against adversarial attacks | reinforcement learning, preference learning, RLHF |
| 37 | Harnessing Negative Signals: Reinforcement Distillation from Teacher Data for LLM Reasoning | REDI: reinforcement distillation that exploits negative samples to strengthen LLM reasoning | DPO, distillation |
| 38 | QiMeng-CodeV-R1: Reasoning-Enhanced Verilog Generation | CodeV-R1: a reasoning-enhanced Verilog generation framework advancing hardware-design automation | reinforcement learning, distillation, large language model |
| 39 | Hyperbolic Dataset Distillation | HDD: dataset distillation in hyperbolic space that captures hierarchical data relationships and stabilizes training | distillation |
| 40 | Sorrel: A simple and flexible framework for multi-agent reinforcement learning | Sorrel: a simple, flexible multi-agent RL framework with easy environment creation and testing | reinforcement learning |
| 41 | On Designing Diffusion Autoencoders for Efficient Generation and Representation Learning | DMZ: an efficient framework combining the generative and representation-learning strengths of diffusion autoencoders | representation learning |
| 42 | RAST: Reasoning Activation in LLMs via Small-model Transfer | RAST: efficiently activating reasoning in LLMs via small-model transfer | reinforcement learning, large language model |
| 43 | REASONING GYM: Reasoning Environments for Reinforcement Learning with Verifiable Rewards | Reasoning Gym: a library of verifiable-reward reasoning environments for RL | reinforcement learning |
| 44 | On Symmetric Losses for Robust Policy Optimization with Noisy Preferences | SymPO: symmetric losses for robust policy optimization under noisy preferences | reinforcement learning, RLHF, direct preference optimization |
| 45 | Logits-Based Finetuning | Logits-based fine-tuning to overcome limitations of standard SFT | distillation, large language model |
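
MDPO (#33), REDI (#37), and SymPO (#44) all start from or modify direct preference optimization. As a generic reference point — a sketch of the vanilla sequence-level DPO objective, not any of the papers' own variants — the loss on one preference pair penalizes the policy when its log-probability margin between chosen and rejected responses (relative to a frozen reference model) is small:

```python
import math

def dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """Vanilla DPO loss for a single preference pair.

    logp_w / logp_l are the policy's log-probs of the chosen (w) and
    rejected (l) responses; ref_logp_* are the frozen reference model's.
    beta controls how strongly deviation from the reference is penalized.
    """
    margin = beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l))
    return -math.log(1.0 / (1.0 + math.exp(-margin)))  # -log sigmoid(margin)

# When the policy exactly matches the reference, the margin is 0 and
# the loss is log 2; favoring the chosen response drives it below that.
assert abs(dpo_loss(-5.0, -7.0, -5.0, -7.0) - math.log(2.0)) < 1e-12
assert dpo_loss(-4.0, -7.0, -5.0, -7.0) < math.log(2.0)
```

Multi-granularity (MDPO) and symmetric-loss (SymPO) variants change how this margin is computed or how the outer loss treats noisy labels, but the pairwise log-ratio structure is the shared starting point.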

🔬 Pillar 1: Robot Control (2 papers)

| # | Title | One-line takeaway | Tags |
|---|---|---|---|
| 46 | Adapting Offline Reinforcement Learning with Online Delays | DT-CORL: Transformer-based belief policies to bridge the observation-delay gap in offline RL | sim-to-real, reinforcement learning, offline RL |
| 47 | Cascading Adversarial Bias from Injection to Distillation in Language Models | Shows how adversarial bias injected into a teacher cascades through distillation into student language models | manipulation, distillation |

🔬 Pillar 4: Generative Motion (2 papers)

| # | Title | One-line takeaway | Tags |
|---|---|---|---|
| 48 | Accelerated Sampling from Masked Diffusion Models via Entropy Bounded Unmasking | EB-Sampler: entropy-bounded unmasking that accelerates sampling from masked diffusion models | MDM |
| 49 | Learning to Optimally Dispatch Power: Performance on a Nation-Wide Real-World Dataset | A nation-wide real-world Uruguayan grid dataset exposing the challenges of ML-based optimal power dispatch | penetration |

🔬 Pillar 5: Interaction & Reaction (1 paper)

| # | Title | One-line takeaway | Tags |
|---|---|---|---|
| 50 | Cartan Networks: Group theoretical Hyperbolic Deep Learning | Cartan Networks: a group-theoretical approach to hyperbolic deep learning for efficient embedding of hierarchical data | OMOMO |
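
HELM (#8), Hyperbolic Dataset Distillation (#39), and Cartan Networks (#50) all work in hyperbolic space. As generic background — not code from any of these papers — the Poincaré-ball distance that such embedding methods commonly build on can be computed directly:

```python
import math

def poincare_distance(u, v):
    """Geodesic distance between two points inside the unit Poincaré ball."""
    norm_u2 = sum(x * x for x in u)          # squared norm of u (< 1)
    norm_v2 = sum(x * x for x in v)          # squared norm of v (< 1)
    diff2 = sum((a - b) ** 2 for a, b in zip(u, v))
    arg = 1.0 + 2.0 * diff2 / ((1.0 - norm_u2) * (1.0 - norm_v2))
    return math.acosh(arg)

# Distances grow rapidly toward the ball's boundary, which is what lets
# hyperbolic embeddings fit tree-like hierarchies with low distortion.
assert poincare_distance([0.0, 0.0], [0.0, 0.0]) == 0.0
assert poincare_distance([0.0, 0.0], [0.9, 0.0]) > poincare_distance([0.0, 0.0], [0.5, 0.0])
```

The near-boundary blow-up gives exponentially growing "room" at equal Euclidean radius, the property these hierarchical-embedding papers exploit.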
