cs.LG (2025-05-21)

📊 49 papers in total | 🔗 9 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (27 🔗6) · Pillar 2: RL Algorithms & Architecture (19 🔗3) · Pillar 1: Robot Control (2) · Pillar 8: Physics-based Animation (1)

🔬 Pillar 9: Embodied Foundation Models (27 papers)

| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 1 | CoT Information: Improved Sample Complexity under Chain-of-Thought Supervision | Introduces a CoT information measure that yields improved sample complexity under chain-of-thought supervision. | large language model, chain-of-thought | |
| 2 | Multi-modal Integration Analysis of Alzheimer's Disease Using Large Language Models and Knowledge Graphs | Proposes a multimodal integration framework based on LLMs and knowledge graphs for Alzheimer's disease research. | large language model, multimodal | |
| 3 | Graph Foundation Models: A Comprehensive Survey | A survey of graph foundation models: unified framework, scope of generalization, and future directions. | foundation model, multimodal | |
| 4 | Large Language models for Time Series Analysis: Techniques, Applications, and Challenges | Survey exploring the techniques, applications, and challenges of large language models for time series analysis. | large language model, foundation model | |
| 5 | Learning to Rank Chain-of-Thought: Using a Small Model | Proposes EORM, a lightweight post-hoc verifier that improves the reliability of LLM mathematical reasoning. | large language model, chain-of-thought | |
| 6 | Multimodal Biomarkers for Schizophrenia: Towards Individual Symptom Severity Estimation | Proposes a multimodal fusion framework for estimating individual symptom severity in schizophrenia. | multimodal | |
| 7 | Large Language Models as Computable Approximations to Solomonoff Induction | Treats large language models as computable approximations to Solomonoff induction and proposes a new few-shot example selection method. | large language model | |
| 8 | Boost Post-Training Quantization via Null Space Optimization for Large Language Models | Proposes Q2N, which boosts post-training quantization of large language models via null-space optimization. | large language model | |
| 9 | Robust Multimodal Learning via Entropy-Gated Contrastive Fusion | Proposes Adaptive Entropy-Gated Contrastive Fusion (AECF) to improve the robustness and calibration of multimodal systems under missing inputs. | multimodal | |
| 10 | Physical models realizing the transformer architecture of large language models | Proposes physical models of the Transformer architecture based on open quantum systems, filling a gap in its theoretical understanding. | large language model | |
| 11 | GenFT: A Generative Parameter-Efficient Fine-Tuning Method for Pretrained Foundation Models | GenFT: a generative parameter-efficient fine-tuning method for pretrained foundation models. | foundation model | |
| 12 | SIMCOPILOT: Evaluating Large Language Models for Copilot-Style Code Generation | SIMCOPILOT: a benchmark for evaluating the copilot-style code generation ability of large language models. | large language model | |
| 13 | MoTime: A Dataset Suite for Multimodal Time Series Forecasting | MoTime: a multimodal time series forecasting dataset suite that supports structured evaluation of modality utility. | multimodal | |
| 14 | Harnessing On-Device Large Language Model: Empirical Results and Implications for AI PC | Proposes an evaluation methodology for on-device large language models on AI PCs and analyzes deployment optimization strategies. | large language model | |
| 15 | Human-centered Interactive Learning via MLLMs for Text-to-Image Person Re-identification | Proposes ICL, an MLLM-based human-centered interactive learning framework that improves text-to-image person re-identification. | large language model, multimodal | |
| 16 | Beyond Classification: Evaluating Diffusion Denoised Smoothing for Security-Utility Trade off | Evaluates diffusion denoised smoothing on the security-utility trade-off, beyond classification tasks. | foundation model | |
| 17 | Not All Models Suit Expert Offloading: On Local Routing Consistency of Mixture-of-Expert Models | Proposes metrics for the local routing consistency of MoE models to optimize expert offloading and improve inference efficiency. | large language model | |
| 18 | Evaluating Adversarial Robustness of Concept Representations in Sparse Autoencoders | Evaluates the adversarial robustness of concept representations in sparse autoencoders, revealing their fragility. | large language model | |
| 19 | Is (Selective) Round-To-Nearest Quantization All You Need? | Revisits round-to-nearest (RTN) quantization as an efficient and competitive LLM quantization scheme (see the sketch after this table). | large language model | |
| 20 | The Effects of Data Augmentation on Confidence Estimation for LLMs | Studies how data augmentation affects confidence estimation for large language models, improving model reliability. | large language model | |
| 21 | SSR: Speculative Parallel Scaling Reasoning in Test-time | Proposes SSR, a test-time speculative parallel scaling reasoning framework that improves the efficiency of LLM mathematical reasoning. | large language model | |
| 22 | FlexQuant: A Flexible and Efficient Dynamic Precision Switching Framework for LLM Quantization | FlexQuant: a flexible and efficient dynamic precision switching framework for LLM quantization. | large language model | |
| 23 | Time Tracker: Mixture-of-Experts-Enhanced Foundation Time Series Forecasting Model with Decoupled Training Pipelines | Time Tracker: a mixture-of-experts-enhanced foundation time series forecasting model with decoupled training pipelines for more accurate multivariate forecasting. | foundation model | |
| 24 | BanditSpec: Adaptive Speculative Decoding via Bandit Algorithms | Proposes BanditSpec, adaptive speculative decoding via bandit algorithms, to accelerate LLM inference. | large language model | |
| 25 | Cost-aware LLM-based Online Dataset Annotation | Proposes CaMVo, a cost-aware LLM-based online dataset annotation framework that substantially reduces annotation cost. | large language model | |
| 26 | Why and When Deep is Better than Shallow: An Implementation-Agnostic State-Transition View of Depth Supremacy | Proposes an implementation-agnostic state-transition view of deep models to explain depth supremacy. | chain-of-thought | |
| 27 | PiFlow: Principle-aware Scientific Discovery with Multi-Agent Collaboration | PiFlow: a principle-aware scientific discovery framework based on multi-agent collaboration. | large language model | |
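
Round-to-nearest (RTN) quantization, revisited in entry 19, is simple enough to sketch. The snippet below is a minimal per-channel RTN weight quantizer in PyTorch; it is a generic illustration of the basic technique, not the paper's selective variant, and the function name, bit width, and shapes are illustrative assumptions.

```python
import torch

def rtn_quantize(weight: torch.Tensor, n_bits: int = 4):
    """Per-output-channel round-to-nearest (RTN) weight quantization.

    weight: (out_features, in_features) float tensor.
    Returns the dequantized weights, integer codes, and per-channel scales.
    """
    qmax = 2 ** (n_bits - 1) - 1                      # e.g. 7 for signed INT4
    # One scale per output channel so the largest weight maps to qmax.
    scale = weight.abs().amax(dim=1, keepdim=True).clamp(min=1e-8) / qmax
    q = torch.clamp(torch.round(weight / scale), -qmax - 1, qmax)  # round to nearest
    return q * scale, q.to(torch.int8), scale

# Toy usage: quantize a random linear layer's weight and measure the error.
w = torch.randn(8, 16)
w_hat, codes, scale = rtn_quantize(w, n_bits=4)
print("mean abs error:", (w - w_hat).abs().mean().item())
```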

🔬 Pillar 2: RL Algorithms & Architecture (19 papers)

| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 28 | LLM-Explorer: A Plug-in Reinforcement Learning Policy Exploration Enhancement Driven by Large Language Models | LLM-Explorer: a plug-in method that uses large language models to enhance policy exploration in reinforcement learning. | reinforcement learning, policy learning, TD3 | |
| 29 | Multiple Weaks Win Single Strong: Large Language Models Ensemble Weak Reinforcement Learning Agents into a Supreme One | LLM-Ens: uses large language models to ensemble weak reinforcement learning agents into a stronger one. | reinforcement learning, large language model | |
| 30 | FR-Mamba: Time-Series Physical Field Reconstruction Based on State Space Model | FR-Mamba: time-series physical field reconstruction based on a state space model. | Mamba, SSM, state space model | |
| 31 | A Unified Theoretical Analysis of Private and Robust Offline Alignment: from RLHF to DPO | Proposes a unified theoretical framework for RLHF and DPO that analyzes the privacy-robustness trade-off in offline alignment. | reinforcement learning, RLHF, DPO | |
| 32 | World Models as Reference Trajectories for Rapid Motor Adaptation | Proposes Reflexive World Models, which use a world model as a reference trajectory for rapid motor adaptation. | reinforcement learning, policy learning, world model | |
| 33 | Bridging the Domain Gap in Equation Distillation with Reinforcement Feedback | Proposes equation distillation with reinforcement feedback to bridge the domain gap and improve Data2Eqn performance. | reinforcement learning, distillation, foundation model | |
| 34 | ReGUIDE: Data Efficient GUI Grounding via Spatial Reasoning and Search | ReGUIDE: data-efficient GUI element grounding via spatial reasoning and search. | reinforcement learning, large language model, multimodal | |
| 35 | AM-PPO: (Advantage) Alpha-Modulation with Proximal Policy Optimization | Proposes AM-PPO, which modulates the advantage function to improve PPO on continuous control tasks. | reinforcement learning, PPO | |
| 36 | Trajectory Bellman Residual Minimization: A Simple Value-Based Method for LLM Reasoning | Proposes Trajectory Bellman Residual Minimization (TBRM), a simple and efficient value-based method for LLM reasoning. | reinforcement learning, PPO, large language model | |
| 37 | RLBenchNet: The Right Network for the Right Reinforcement Learning Task | RLBenchNet: selecting the best-suited neural network architecture for each reinforcement learning task. | reinforcement learning, Mamba | |
| 38 | Guided Policy Optimization under Partial Observability | Proposes Guided Policy Optimization (GPO) to address the challenges of reinforcement learning under partial observability. | reinforcement learning, imitation learning, privileged information | |
| 39 | Toward Theoretical Insights into Diffusion Trajectory Distillation via Operator Merging | Analyzes diffusion trajectory distillation theoretically via operator merging to optimize one-step generation quality. | distillation | |
| 40 | On the creation of narrow AI: hierarchy and nonlocality of neural network skills | Studies the creation of narrow AI through the hierarchy and nonlocality of neural network skills. | distillation, foundation model | |
| 41 | Reward Is Enough: LLMs Are In-Context Reinforcement Learners | Proposes ICRL: LLMs self-improve at inference time by doing reinforcement learning through in-context learning. | reinforcement learning, large language model | |
| 42 | Graph-Conditional Flow Matching for Relational Data Generation | Proposes graph-conditional flow matching for generating synthetic relational data with complex relationships. | flow matching | |
| 43 | A Framework for Non-Linear Attention via Modern Hopfield Networks | Proposes a framework for non-linear attention based on modern Hopfield networks. | linear attention | |
| 44 | The Unreasonable Effectiveness of Entropy Minimization in LLM Reasoning | Entropy minimization substantially improves LLM reasoning without any labeled data (see the sketch after this table). | reinforcement learning, large language model | |
| 45 | Khan-GCL: Kolmogorov-Arnold Network Based Graph Contrastive Learning with Hard Negatives | Khan-GCL: graph contrastive learning with Kolmogorov-Arnold networks and hard negatives. | contrastive learning | |
| 46 | RL Tango: Reinforcing Generator and Verifier Together for Language Reasoning | Proposes RL Tango, which jointly trains a generator and a verifier with reinforcement learning to improve LLM language reasoning. | reinforcement learning, large language model | |
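
Entropy minimization, highlighted in entry 44, uses only the model's own predictive uncertainty as a training signal. The snippet below is a minimal sketch of a token-level entropy loss over generated positions in PyTorch; it illustrates the general objective under our own assumptions rather than the paper's exact method, and all names and shapes are hypothetical.

```python
import torch
import torch.nn.functional as F

def entropy_loss(logits: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
    """Mean per-token entropy of the model's output distribution.

    logits: (batch, seq_len, vocab) raw scores at generated positions.
    mask:   (batch, seq_len), 1 for generated (non-prompt, non-pad) tokens.
    Minimizing this sharpens the model's own predictions without labels.
    """
    logp = F.log_softmax(logits, dim=-1)
    p = logp.exp()
    token_entropy = -(p * logp).sum(dim=-1)            # (batch, seq_len)
    return (token_entropy * mask).sum() / mask.sum().clamp(min=1)

# Toy usage with random logits standing in for a language model's outputs.
logits = torch.randn(2, 5, 100, requires_grad=True)
mask = torch.ones(2, 5)
loss = entropy_loss(logits, mask)
loss.backward()   # gradients would feed the usual optimizer step
print("entropy loss:", loss.item())
```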

🔬 Pillar 1: Robot Control (2 papers)

| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 47 | Filtering Learning Histories Enhances In-Context Reinforcement Learning | Proposes learning history filtering (LHF), a data preprocessing step that improves Transformer performance in in-context reinforcement learning. | manipulation, reinforcement learning | |
| 48 | Covert Attacks on Machine Learning Training in Passively Secure MPC | Reveals the risk of covert attacks on machine learning training in passively secure MPC and highlights the need for actively secure protocols. | MPC | |

🔬 Pillar 8: Physics-based Animation (1 paper)

| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 49 | Self-Boost via Optimal Retraining: An Analysis via Approximate Message Passing | Proposes an optimal retraining scheme, analyzed via approximate message passing, to improve binary classification performance. | AMP | |
