| 31 | Structured Agent Distillation for Large Language Model | Proposes a structured agent distillation method that compresses LLM agents while preserving consistency between reasoning and actions | imitation learning, distillation, large language model |
| 32 |
Modality-Balancing Preference Optimization of Large Multimodal Models by Adversarial Negative Mining |
提出MBPO,通过对抗负样本挖掘和模态平衡优化解决大模型中的模态不平衡问题 |
preference learning large language model multimodal |
|
|
| 33 | InfiFPO: Implicit Model Fusion via Preference Optimization in Large Language Models | InfiFPO: implicit model fusion for large language models via preference optimization | DPO, direct preference optimization, large language model |
| 34 | FlowQ: Energy-Guided Flow Policies for Offline Reinforcement Learning | FlowQ: an offline reinforcement learning algorithm built on energy-guided flow policies | reinforcement learning, offline reinforcement learning, flow matching |
| 35 | Time to Embed: Unlocking Foundation Models for Time Series with Channel Descriptions | CHARM: a foundation embedding model for time series that incorporates channel descriptions to achieve strong representation learning | representation learning, foundation model |
| 36 | Energy-Efficient Deep Reinforcement Learning with Spiking Transformers | Proposes a reinforcement learning algorithm based on spiking Transformers, enabling energy-efficient complex decision-making | reinforcement learning, deep reinforcement learning |
| 37 | AAPO: Enhancing the Reasoning Capabilities of LLMs with Advantage Momentum | Proposes the AAPO algorithm, which uses advantage momentum to strengthen LLMs' mathematical reasoning | reinforcement learning, PPO, large language model |
| 38 | Imitation Learning via Focused Satisficing | Proposes an imitation learning method based on focused satisficing that improves on the quality of demonstration trajectories | reinforcement learning, deep reinforcement learning, imitation learning |
| 39 | The Evolution of Alpha in Finance Harnessing Human Insight and LLM Agents | Proposes a framework for evolving alpha strategies with LLM-based financial agents, making investment decisions more intelligent | representation learning, large language model, multimodal |
| 40 | Interpretable Reinforcement Learning for Load Balancing using Kolmogorov-Arnold Networks | Proposes an interpretable reinforcement learning approach to load balancing based on Kolmogorov-Arnold Networks | reinforcement learning, PPO |
| 41 | Preference Learning with Lie Detectors can Induce Honesty or Evasion | Preference learning with lie detectors can induce either honesty or evasive behavior | preference learning, DPO |
| 42 | Sample and Computationally Efficient Continuous-Time Reinforcement Learning with General Function Approximation | Proposes an efficient continuous-time reinforcement learning algorithm that addresses both sample and computational efficiency | reinforcement learning |
| 43 | Text embedding models can be great data engineers | ADEPT: uses text embeddings to automatically build data-engineering pipelines, improving predictive model performance | predictive model, TAMP |
| 44 | TinyV: Reducing False Negatives in Verification Improves RL for LLM Reasoning | TinyV: improves reinforcement learning for LLM reasoning by reducing false negatives in verification | reinforcement learning, large language model | ✅ |
| 45 | Performance Optimization of Energy-Harvesting Underlay Cognitive Radio Networks Using Reinforcement Learning | Proposes a deep Q-network scheme for optimizing energy harvesting in cognitive radio networks, raising secondary users' data rates | reinforcement learning |
| 46 | KIPPO: Koopman-Inspired Proximal Policy Optimization | Proposes KIPPO, which uses Koopman theory to improve PPO's performance and stability on complex control tasks | reinforcement learning, policy learning, PPO |
| 47 | Bellman operator convergence enhancements in reinforcement learning algorithms | Improves the convergence and performance of reinforcement learning algorithms by refining the Bellman operator | reinforcement learning |
| 48 | Personalised Insulin Adjustment with Reinforcement Learning: An In-Silico Validation for People with Diabetes on Intensive Insulin Treatment | Proposes ABBA, a reinforcement-learning-based personalized insulin adjustment scheme that optimizes blood glucose control for people with diabetes | reinforcement learning |
| 49 | FlowTSE: Target Speaker Extraction with Flow Matching | FlowTSE: a flow-matching approach to target speaker extraction that simplifies the pipeline and improves performance | flow matching |
| 50 | Self Distillation via Iterative Constructive Perturbations | Proposes a self-distillation framework based on iterative constructive perturbations, improving the generalization of deep neural networks | distillation |
| 51 | From Reasoning to Code: GRPO Optimization for Underrepresented Languages | Proposes a GRPO-based optimization method that improves LLM code generation for low-resource languages | reinforcement learning, large language model |
| 52 | Riemannian Flow Matching for Brain Connectivity Matrices via Pullback Geometry | Proposes DiffeoCFM, a generative model that performs Riemannian flow matching for brain connectivity matrices via pullback geometry | flow matching | ✅ |
| 53 | When to retrain a machine learning model | Proposes an uncertainty-based model retraining approach to counter performance degradation under data drift | reinforcement learning, offline reinforcement learning |