cs.LG (2025-05-21)

📊 49 papers in total | 🔗 9 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (27 🔗6) · Pillar 2: RL Algorithms & Architecture (19 🔗3) · Pillar 1: Robot Control (2) · Pillar 8: Physics-based Animation (1)

🔬 Pillar 9: Embodied Foundation Models (27 papers)

| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 1 | CoT Information: Improved Sample Complexity under Chain-of-Thought Supervision | Introduces a CoT information measure that yields improved sample complexity under chain-of-thought supervision. | large language model, chain-of-thought | |
| 2 | Multi-modal Integration Analysis of Alzheimer's Disease Using Large Language Models and Knowledge Graphs | Proposes a multimodal integration framework based on LLMs and knowledge graphs for Alzheimer's disease research. | large language model, multimodal | |
| 3 | Graph Foundation Models: A Comprehensive Survey | A survey of graph foundation models: unified framework, scope of generalization, and future directions. | foundation model, multimodal | |
| 4 | Large Language models for Time Series Analysis: Techniques, Applications, and Challenges | Survey exploring the techniques, applications, and challenges of large language models for time series analysis. | large language model, foundation model | |
| 5 | Learning to Rank Chain-of-Thought: Using a Small Model | Proposes EORM, a lightweight post-hoc verifier that improves the reliability of LLM mathematical reasoning. | large language model, chain-of-thought | |
| 6 | Multimodal Biomarkers for Schizophrenia: Towards Individual Symptom Severity Estimation | Proposes a multimodal fusion framework for estimating individual symptom severity in schizophrenia. | multimodal | |
| 7 | Large Language Models as Computable Approximations to Solomonoff Induction | Treats large language models as computable approximations to Solomonoff induction and proposes a new few-shot example selection method. | large language model | |
| 8 | Boost Post-Training Quantization via Null Space Optimization for Large Language Models | Proposes Q2N, which boosts post-training quantization of large language models via null-space optimization. | large language model | |
| 9 | Robust Multimodal Learning via Entropy-Gated Contrastive Fusion | Proposes Adaptive Entropy-Gated Contrastive Fusion (AECF) to improve the robustness and calibration of multimodal systems under missing inputs. | multimodal | |
| 10 | Physical models realizing the transformer architecture of large language models | Proposes physical models of the Transformer architecture based on open quantum systems, filling a gap in its theoretical understanding. | large language model | |
| 11 | GenFT: A Generative Parameter-Efficient Fine-Tuning Method for Pretrained Foundation Models | GenFT: a generative parameter-efficient fine-tuning method for pretrained foundation models. | foundation model | |
| 12 | SIMCOPILOT: Evaluating Large Language Models for Copilot-Style Code Generation | SIMCOPILOT: a benchmark for evaluating the copilot-style code generation ability of large language models. | large language model | |
| 13 | MoTime: A Dataset Suite for Multimodal Time Series Forecasting | MoTime: a multimodal time series forecasting dataset suite that supports structured evaluation of modality utility. | multimodal | |
| 14 | Harnessing On-Device Large Language Model: Empirical Results and Implications for AI PC | Proposes an evaluation methodology for on-device large language models on AI PCs and analyzes deployment optimization strategies. | large language model | |
| 15 | Human-centered Interactive Learning via MLLMs for Text-to-Image Person Re-identification | Proposes ICL, an MLLM-based human-centered interactive learning framework that improves text-to-image person re-identification. | large language model, multimodal | |
| 16 | Beyond Classification: Evaluating Diffusion Denoised Smoothing for Security-Utility Trade off | Evaluates diffusion denoised smoothing on the security-utility trade-off, beyond classification tasks. | foundation model | |
| 17 | Not All Models Suit Expert Offloading: On Local Routing Consistency of Mixture-of-Expert Models | Proposes metrics for the local routing consistency of MoE models to optimize expert offloading and improve inference efficiency. | large language model | |
| 18 | Evaluating Adversarial Robustness of Concept Representations in Sparse Autoencoders | Evaluates the adversarial robustness of concept representations in sparse autoencoders, revealing their fragility. | large language model | |
| 19 | Is (Selective) Round-To-Nearest Quantization All You Need? | Revisits round-to-nearest (RTN) quantization as an efficient and competitive LLM quantization scheme (see the sketch after this table). | large language model | |
| 20 | The Effects of Data Augmentation on Confidence Estimation for LLMs | Studies how data augmentation affects confidence estimation for large language models, improving model reliability. | large language model | |
| 21 | SSR: Speculative Parallel Scaling Reasoning in Test-time | Proposes SSR, a test-time speculative parallel scaling reasoning framework that improves the efficiency of LLM mathematical reasoning. | large language model | |
| 22 | FlexQuant: A Flexible and Efficient Dynamic Precision Switching Framework for LLM Quantization | FlexQuant: a flexible and efficient dynamic precision switching framework for LLM quantization. | large language model | |
| 23 | Time Tracker: Mixture-of-Experts-Enhanced Foundation Time Series Forecasting Model with Decoupled Training Pipelines | Time Tracker: a mixture-of-experts-enhanced foundation time series forecasting model with decoupled training pipelines for more accurate multivariate forecasting. | foundation model | |
| 24 | BanditSpec: Adaptive Speculative Decoding via Bandit Algorithms | Proposes BanditSpec, adaptive speculative decoding via bandit algorithms, to accelerate LLM inference. | large language model | |
| 25 | Cost-aware LLM-based Online Dataset Annotation | Proposes CaMVo, a cost-aware LLM-based online dataset annotation framework that substantially reduces annotation cost. | large language model | |
| 26 | Why and When Deep is Better than Shallow: An Implementation-Agnostic State-Transition View of Depth Supremacy | Proposes an implementation-agnostic state-transition view of deep models to explain depth supremacy. | chain-of-thought | |
| 27 | PiFlow: Principle-aware Scientific Discovery with Multi-Agent Collaboration | PiFlow: a principle-aware scientific discovery framework based on multi-agent collaboration. | large language model | |
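
Round-to-nearest (RTN) quantization, revisited in entry 19, is simple enough to sketch. The snippet below is a minimal per-channel RTN weight quantizer in PyTorch; it is a generic illustration of the basic technique, not the paper's selective variant, and the function name, bit width, and shapes are illustrative assumptions.

```python
import torch

def rtn_quantize(weight: torch.Tensor, n_bits: int = 4):
    """Per-output-channel round-to-nearest (RTN) weight quantization.

    weight: (out_features, in_features) float tensor.
    Returns the dequantized weights, integer codes, and per-channel scales.
    """
    qmax = 2 ** (n_bits - 1) - 1                      # e.g. 7 for signed INT4
    # One scale per output channel so the largest weight maps to qmax.
    scale = weight.abs().amax(dim=1, keepdim=True).clamp(min=1e-8) / qmax
    q = torch.clamp(torch.round(weight / scale), -qmax - 1, qmax)  # round to nearest
    return q * scale, q.to(torch.int8), scale

# Toy usage: quantize a random linear layer's weight and measure the error.
w = torch.randn(8, 16)
w_hat, codes, scale = rtn_quantize(w, n_bits=4)
print("mean abs error:", (w - w_hat).abs().mean().item())
```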

🔬 Pillar 2: RL Algorithms & Architecture (19 papers)

| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 28 | LLM-Explorer: A Plug-in Reinforcement Learning Policy Exploration Enhancement Driven by Large Language Models | LLM-Explorer: a plug-in method that uses large language models to enhance policy exploration in reinforcement learning. | reinforcement learning, policy learning, TD3 | |
| 29 | Multiple Weaks Win Single Strong: Large Language Models Ensemble Weak Reinforcement Learning Agents into a Supreme One | LLM-Ens: uses large language models to ensemble weak reinforcement learning agents into a stronger one. | reinforcement learning, large language model | |
| 30 | FR-Mamba: Time-Series Physical Field Reconstruction Based on State Space Model | FR-Mamba: time-series physical field reconstruction based on a state space model. | Mamba, SSM, state space model | |
| 31 | A Unified Theoretical Analysis of Private and Robust Offline Alignment: from RLHF to DPO | Proposes a unified theoretical framework for RLHF and DPO that analyzes the privacy-robustness trade-off in offline alignment. | reinforcement learning, RLHF, DPO | |
| 32 | World Models as Reference Trajectories for Rapid Motor Adaptation | Proposes Reflexive World Models, which use a world model as a reference trajectory for rapid motor adaptation. | reinforcement learning, policy learning, world model | |
| 33 | Bridging the Domain Gap in Equation Distillation with Reinforcement Feedback | Proposes equation distillation with reinforcement feedback to bridge the domain gap and improve Data2Eqn performance. | reinforcement learning, distillation, foundation model | |
| 34 | ReGUIDE: Data Efficient GUI Grounding via Spatial Reasoning and Search | ReGUIDE: data-efficient GUI element grounding via spatial reasoning and search. | reinforcement learning, large language model, multimodal | |
| 35 | AM-PPO: (Advantage) Alpha-Modulation with Proximal Policy Optimization | Proposes AM-PPO, which modulates the advantage function to improve PPO on continuous control tasks. | reinforcement learning, PPO | |
| 36 | Trajectory Bellman Residual Minimization: A Simple Value-Based Method for LLM Reasoning | Proposes Trajectory Bellman Residual Minimization (TBRM), a simple and efficient value-based method for LLM reasoning. | reinforcement learning, PPO, large language model | |
| 37 | RLBenchNet: The Right Network for the Right Reinforcement Learning Task | RLBenchNet: selecting the best-suited neural network architecture for each reinforcement learning task. | reinforcement learning, Mamba | |
| 38 | Guided Policy Optimization under Partial Observability | Proposes Guided Policy Optimization (GPO) to address the challenges of reinforcement learning under partial observability. | reinforcement learning, imitation learning, privileged information | |
| 39 | Toward Theoretical Insights into Diffusion Trajectory Distillation via Operator Merging | Analyzes diffusion trajectory distillation theoretically via operator merging to optimize one-step generation quality. | distillation | |
| 40 | On the creation of narrow AI: hierarchy and nonlocality of neural network skills | Studies the creation of narrow AI through the hierarchy and nonlocality of neural network skills. | distillation, foundation model | |
| 41 | Reward Is Enough: LLMs Are In-Context Reinforcement Learners | Proposes ICRL: LLMs self-improve at inference time by doing reinforcement learning through in-context learning. | reinforcement learning, large language model | |
| 42 | Graph-Conditional Flow Matching for Relational Data Generation | Proposes graph-conditional flow matching for generating synthetic relational data with complex relationships. | flow matching | |
| 43 | A Framework for Non-Linear Attention via Modern Hopfield Networks | Proposes a framework for non-linear attention based on modern Hopfield networks. | linear attention | |
| 44 | The Unreasonable Effectiveness of Entropy Minimization in LLM Reasoning | Entropy minimization substantially improves LLM reasoning without any labeled data (see the sketch after this table). | reinforcement learning, large language model | |
| 45 | Khan-GCL: Kolmogorov-Arnold Network Based Graph Contrastive Learning with Hard Negatives | Khan-GCL: graph contrastive learning with Kolmogorov-Arnold networks and hard negatives. | contrastive learning | |
| 46 | RL Tango: Reinforcing Generator and Verifier Together for Language Reasoning | Proposes RL Tango, which jointly trains a generator and a verifier with reinforcement learning to improve LLM language reasoning. | reinforcement learning, large language model | |
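
Entropy minimization, highlighted in entry 44, uses only the model's own predictive uncertainty as a training signal. The snippet below is a minimal sketch of a token-level entropy loss over generated positions in PyTorch; it illustrates the general objective under our own assumptions rather than the paper's exact method, and all names and shapes are hypothetical.

```python
import torch
import torch.nn.functional as F

def entropy_loss(logits: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
    """Mean per-token entropy of the model's output distribution.

    logits: (batch, seq_len, vocab) raw scores at generated positions.
    mask:   (batch, seq_len), 1 for generated (non-prompt, non-pad) tokens.
    Minimizing this sharpens the model's own predictions without labels.
    """
    logp = F.log_softmax(logits, dim=-1)
    p = logp.exp()
    token_entropy = -(p * logp).sum(dim=-1)            # (batch, seq_len)
    return (token_entropy * mask).sum() / mask.sum().clamp(min=1)

# Toy usage with random logits standing in for a language model's outputs.
logits = torch.randn(2, 5, 100, requires_grad=True)
mask = torch.ones(2, 5)
loss = entropy_loss(logits, mask)
loss.backward()   # gradients would feed the usual optimizer step
print("entropy loss:", loss.item())
```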

🔬 Pillar 1: Robot Control (2 papers)

| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 47 | Filtering Learning Histories Enhances In-Context Reinforcement Learning | Proposes learning history filtering (LHF), a data preprocessing step that improves Transformer performance in in-context reinforcement learning. | manipulation, reinforcement learning | |
| 48 | Covert Attacks on Machine Learning Training in Passively Secure MPC | Reveals the risk of covert attacks on machine learning training in passively secure MPC and highlights the need for actively secure protocols. | MPC | |

🔬 Pillar 8: Physics-based Animation (1 paper)

| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 49 | Self-Boost via Optimal Retraining: An Analysis via Approximate Message Passing | Proposes an optimal retraining scheme, analyzed via approximate message passing, to improve binary classification performance. | AMP | |
