cs.LG (2025-05-30)

📊 50 papers in total | 🔗 10 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (30 🔗5) · Pillar 2: RL & Architecture (15 🔗5) · Pillar 1: Robot Control (2) · Pillar 4: Generative Motion (2) · Pillar 5: Interaction & Reaction (1)

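🔬 Pillar 9: Embodied Foundation Models (30 papers)

| # | Title | One-line summary | Tags |
|---|-------|------------------|------|
| 1 | AFLoRA: Adaptive Federated Fine-Tuning of Large Language Models with Resource-Aware Low-Rank Adaption | Proposes AFLoRA for adaptive fine-tuning of large language models in heterogeneous environments | large language model, foundation model |
| 2 | Proxy-FDA: Proxy-based Feature Distribution Alignment for Fine-tuning Vision Foundation Models without Forgetting | Proposes Proxy-FDA to address forgetting when fine-tuning vision foundation models | foundation model |
| 3 | Beyond Atomic Geometry Representations in Materials Science: A Human-in-the-Loop Multimodal Framework | Proposes the MCS-Set framework to address multimodal learning on materials science datasets | multimodal |
| 4 | Intercept Cancer: Cancer Pre-Screening with Large Scale Healthcare Foundation Models | Proposes CATCH-FM to address inefficient cancer screening | foundation model |
| 5 | Applying Large Language Models to Issue Classification: Revisiting with Extended Data and New Models | Proposes an automated issue classification method based on large language models | large language model |
| 6 | The Road to Generalizable Neuro-Symbolic Learning Should be Paved with Foundation Models | Proposes neuro-symbolic prompting to address the challenges of complex reasoning tasks | foundation model |
| 7 | PhySense: Principle-Based Physics Reasoning Benchmarking for Large Language Models | Proposes PhySense to address the physics reasoning shortcomings of large language models | large language model |
| 8 | HELM: Hyperbolic Large Language Models via Mixture-of-Curvature Experts | Proposes HELM to address the geometric limitations of existing language models | large language model |
| 9 | Learning Safety Constraints for Large Language Models | Proposes a safety polytope method to strengthen the safety of large language models | large language model |
| 10 | Equivalent Linear Mappings of Large Language Models | Proposes equivalent linear mappings to analyze the inference mechanism of large language models | large language model |
| 11 | Generalisation Bounds of Zero-Shot Economic Forecasting using Time Series Foundation Models | Proposes using time series foundation models for zero-shot economic forecasting to address data scarcity | foundation model |
| 12 | Learn from the Past: Fast Sparse Indexing for Large Language Model Decoding | Proposes LFPS to address sparse indexing in long-context decoding | large language model |
| 13 | When GPT Spills the Tea: Comprehensive Assessment of Knowledge File Leakage in GPTs | Proposes a method for comprehensively assessing the risk of knowledge file leakage in GPTs | large language model |
| 14 | Breakpoint: Scalable evaluation of system-level reasoning in LLM code agents | Proposes Breakpoint to address the evaluation of system-level reasoning in LLM code agents | large language model |
| 15 | Privacy Amplification in Differentially Private Zeroth-Order Optimization with Hidden States | Proposes a privacy amplification method for differentially private zeroth-order optimization with hidden states | large language model |
| 16 | Aligning Language Models with Observational Data: Opportunities and Risks from a Causal Perspective | Proposes DeconfoundLM to address the challenges of fine-tuning language models on observational data | large language model |
| 17 | Chameleon: A Flexible Data-mixing Framework for Language Model Pretraining and Finetuning | Proposes the Chameleon framework to mix data efficiently and improve language model performance | large language model |
| 18 | SUMO: Subspace-Aware Moment-Orthogonalization for Accelerating Memory-Efficient LLM Training | Proposes SUMO to accelerate memory-efficient training of large language models | large language model |
| 19 | PDE-Transformer: Efficient and Versatile Transformers for Physics Simulations | Proposes PDE-Transformer to improve the efficiency of physics simulation modeling | foundation model |
| 20 | Beyond Linear Steering: Unified Multi-Attribute Control for Language Models | Proposes K-Steering to address control over multiple behavioral attributes | large language model |
| 21 | Can Slow-thinking LLMs Reason Over Time? Empirical Studies in Time Series Forecasting | Proposes TimeReasoner to address insufficient reasoning in time series forecasting | multimodal |
| 22 | Object Centric Concept Bottlenecks | Proposes an object-centric concept bottleneck framework to improve model interpretability and performance | foundation model |
| 23 | Every Rollout Counts: Optimal Resource Allocation for Efficient Test-Time Scaling | Proposes DORA to optimize test-time resource allocation | large language model |
| 24 | ReCalKV: Low-Rank KV Cache Compression via Head Reordering and Offline Calibration | Proposes ReCalKV to address KV cache compression in long-context inference | large language model |
| 25 | SwiftEval: Developing a Language-Specific Benchmark for LLM-generated Code Evaluation | Proposes SwiftEval to address gaps in the evaluation of Swift code | large language model |
| 26 | LittleBit: Ultra Low-Bit Quantization via Latent Factorization | Proposes LittleBit to address ultra-low-bit quantization of large language models | large language model |
| 27 | Fine-Tune an SLM or Prompt an LLM? The Case of Generating Low-Code Workflows | Compares SLM fine-tuning with LLM prompting for generating low-code workflows | large language model |
| 28 | SALE: Low-bit Estimation for Efficient Sparse Attention in Long-context LLM Prefilling | Proposes SALE to address sparse attention in the prefilling stage of long-context LLMs | large language model |
| 29 | Invariant Link Selector for Spatial-Temporal Out-of-Distribution Problem | Proposes an invariant link selector to address the spatial-temporal out-of-distribution problem | foundation model |
| 30 | Don't Just Follow MLLM Plans: Robust and Efficient Planning for Open-world Agents | Proposes the REPOA framework to address planning efficiency and robustness for open-world agents | large language model |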

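🔬 Pillar 2: RL & Architecture (15 papers)

| # | Title | One-line summary | Tags |
|---|-------|------------------|------|
| 31 | Causal-aware Large Language Models: Enhancing Decision-Making Through Learning, Adapting and Acting | Proposes causal-aware large language models to enhance decision-making | reinforcement learning, large language model |
| 32 | Mastering Massive Multi-Task Reinforcement Learning via Mixture-of-Expert Decision Transformer | Proposes the M3DT framework to address massive multi-task reinforcement learning | reinforcement learning, decision transformer |
| 33 | MDPO: Multi-Granularity Direct Preference Optimization for Mathematical Reasoning | Proposes MDPO to address erroneous outputs in long-chain mathematical reasoning | DPO, direct preference optimization, large language model |
| 34 | ROAD: Responsibility-Oriented Reward Design for Reinforcement Learning in Autonomous Driving | Proposes responsibility-oriented reward design to address the reward function problem in autonomous driving | reinforcement learning, reward design |
| 35 | AReaL: A Large-Scale Asynchronous Reinforcement Learning System for Language Reasoning | Proposes AReaL to address asynchronous reinforcement learning for large-scale language reasoning | reinforcement learning, PPO, large language model |
| 36 | Adversarial Preference Learning for Robust LLM Alignment | Proposes adversarial preference learning to address the robustness of large language models | reinforcement learning, preference learning, RLHF |
| 37 | Harnessing Negative Signals: Reinforcement Distillation from Teacher Data for LLM Reasoning | Proposes a negative-signal distillation method to improve the reasoning performance of large language models | DPO, distillation |
| 38 | QiMeng-CodeV-R1: Reasoning-Enhanced Verilog Generation | Proposes the CodeV-R1 framework to address verification challenges in automatic HDL generation | reinforcement learning, distillation, large language model |
| 39 | Hyperbolic Dataset Distillation | Proposes a hyperbolic dataset distillation method to address the challenges of large-scale datasets | distillation |
| 40 | Sorrel: A simple and flexible framework for multi-agent reinforcement learning | Proposes the Sorrel framework to simplify building and testing multi-agent reinforcement learning environments | reinforcement learning |
| 41 | On Designing Diffusion Autoencoders for Efficient Generation and Representation Learning | Proposes diffusion autoencoders to improve the efficiency of generation and representation learning | representation learning |
| 42 | RAST: Reasoning Activation in LLMs via Small-model Transfer | Proposes RAST to efficiently improve the reasoning ability of large language models | reinforcement learning, large language model |
| 43 | REASONING GYM: Reasoning Environments for Reinforcement Learning with Verifiable Rewards | Proposes Reasoning Gym to address verifiable rewards in reinforcement learning | reinforcement learning |
| 44 | On Symmetric Losses for Robust Policy Optimization with Noisy Preferences | Proposes a robust policy optimization method for noisy preferences | reinforcement learning, RLHF, direct preference optimization |
| 45 | Logits-Based Finetuning | Proposes a logits-based fine-tuning method to address the limitations of conventional SFT | distillation, large language model |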

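🔬 Pillar 1: Robot Control (2 papers)

| # | Title | One-line summary | Tags |
|---|-------|------------------|------|
| 46 | Adapting Offline Reinforcement Learning with Online Delays | Proposes DT-CORL to address delays in offline reinforcement learning | sim-to-real, reinforcement learning, offline RL |
| 47 | Cascading Adversarial Bias from Injection to Distillation in Language Models | Proposes an adversarial bias propagation mechanism to strengthen the safety of language models | manipulation, distillation |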

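🔬 Pillar 4: Generative Motion (2 papers)

| # | Title | One-line summary | Tags |
|---|-------|------------------|------|
| 48 | Accelerated Sampling from Masked Diffusion Models via Entropy Bounded Unmasking | Proposes EB-Sampler to accelerate sampling from masked diffusion models | MDM |
| 49 | Learning to Optimally Dispatch Power: Performance on a Nation-Wide Real-World Dataset | Proposes an optimal reactive power dispatch method based on real-world data to address power system optimization | penetration |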

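🔬 Pillar 5: Interaction & Reaction (1 paper)

| # | Title | One-line summary | Tags |
|---|-------|------------------|------|
| 50 | Cartan Networks: Group theoretical Hyperbolic Deep Learning | Proposes Cartan networks to improve the performance of hyperbolic deep learning | OMOMO |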
