cs.LG (2025-05-20)

📊 60 papers in total | 🔗 14 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (30 🔗10) · Pillar 2: RL Algorithms & Architecture (23 🔗2) · Pillar 1: Robot Control (3 🔗2) · Pillar 8: Physics-based Animation (2) · Pillar 7: Motion Retargeting (1) · Pillar 5: Interaction & Reaction (1)

🔬 Pillar 9: Embodied Foundation Models (30 papers)

# | Title | One-Sentence Takeaway | Tags | 🔗
1 | Output Scaling: YingLong-Delayed Chain of Thought in a Large Pretrained Time Series Forecasting Model | Proposes YingLong, which achieves a pronounced output-scaling effect in time series forecasting via non-causal (delayed) chain-of-thought. | foundation model, chain-of-thought
2 | Towards Non-Euclidean Foundation Models: Advancing AI Beyond Euclidean Frameworks | Explores foundation models in non-Euclidean spaces to improve AI's ability to model complex relational structure. | large language model, foundation model
3 | KERL: Knowledge-Enhanced Personalized Recipe Recommendation using Large Language Models | KERL: a personalized recipe recommendation system that combines large language models with knowledge graphs. | large language model
4 | Foundations of Unknown-aware Machine Learning | Proposes an unknown-aware machine learning framework to improve the reliability and safety of AI models in open-world settings. | large language model, foundation model, multimodal
5 | Quartet: Native FP4 Training Can Be Optimal for Large Language Models | Quartet: native FP4 training as an optimal regime for large language models. | large language model
6 | This Time is Different: An Observability Perspective on Time Series Foundation Models | Introduces Toto, a foundation model pre-trained on observability time series, with leading results on large-scale benchmarks. | foundation model
7 | LEANCODE: Understanding Models Better for Code Simplification of Pre-trained Large Language Models | LeanCode: uses attention scores to simplify code and speed up pre-trained large language models. | large language model
8 | Table Foundation Models: on knowledge pre-training for tabular learning | Proposes TARTE, a table foundation model that improves tabular learning through knowledge-enhanced vector representations. | foundation model
9 | LLMSynthor: Macro-Aligned Micro-Records Synthesis with Large Language Models | LLMSynthor: uses large language models to synthesize macro-aligned micro-records for social-science simulation. | large language model
10 | MAS-KCL: Knowledge component graph structure learning with large language model-based agentic workflow | Proposes MAS-KCL, which learns knowledge component graph structure with an LLM-driven multi-agent workflow. | large language model
11 | Fusing Cross-Domain Knowledge from Multimodal Data to Solve Problems in the Physical World | Proposes a framework for fusing cross-domain multimodal data to solve complex problems in the physical world. | multimodal
12 | Adversarially Pretrained Transformers May Be Universally Robust In-Context Learners | Presents adversarially pretrained Transformers as universally robust in-context learners that improve downstream robustness. | foundation model
13 | Polar Sparsity: High Throughput Batched LLM Inferencing with Scalable Contextual Sparsity | Polar Sparsity: high-throughput batched LLM inference via scalable contextual sparsity. | large language model
14 | The Role of Visualization in LLM-Assisted Knowledge Graph Systems: Effects on User Trust, Exploration, and Workflows | LinkQ: studies how visualization affects user trust, exploration, and workflows in LLM-assisted knowledge graph systems. | large language model
15 | FisherSFT: Data-Efficient Supervised Fine-Tuning of Language Models Using Information Gain | FisherSFT: data-efficient supervised fine-tuning of language models driven by information gain. | large language model
16 | Enhancing Learned Knowledge in LoRA Adapters Through Efficient Contrastive Decoding on Ascend NPUs | Proposes CoLD, a contrastive decoding framework that improves inference with LoRA-adapted models on Ascend NPUs. | large language model
17 | Spiking Neural Networks with Temporal Attention-Guided Adaptive Fusion for imbalanced Multi-modal Learning | Proposes a spiking neural network with temporal-attention-guided adaptive fusion for imbalanced multimodal learning. | multimodal
18 | LLINBO: Trustworthy LLM-in-the-Loop Bayesian Optimization | Proposes LLINBO, combining LLMs with statistical models to better balance exploration and exploitation in black-box optimization. | large language model
19 | ServerlessLoRA: Minimizing Latency and Cost in Serverless Inference for LoRA-Based LLMs | ServerlessLoRA: serverless inference for LoRA-based LLMs that reduces latency and cost. | large language model
20 | Interpretable Neural System Dynamics: Combining Deep Learning with System Dynamics Modeling to Support Critical Applications | Proposes a neural system dynamics framework that combines deep learning with system dynamics modeling for interpretability and causal reliability. | multimodal
21 | Byte Pair Encoding for Efficient Time Series Forecasting | Proposes a byte pair encoding tokenization for time series that markedly improves forecasting accuracy and efficiency (a generic BPE sketch follows this table). | foundation model
22 | Low-Cost FlashAttention with Fused Exponential and Multiplication Hardware Operators | Proposes low-cost hardware operators for FlashAttention that fuse the exponential and multiplication operations. | large language model
23 | Scaling Law for Quantization-Aware Training | Proposes a unified scaling law for quantization-aware training, identifying the sources of W4A4 quantization error and guiding model optimization. | large language model
24 | Safety Subspaces are Not Linearly Distinct: A Fine-Tuning Case Study | A fine-tuning case study on LLM safety showing that safety subspaces are not linearly distinct. | large language model
25 | Acoustic and Machine Learning Methods for Speech-Based Suicide Risk Assessment: A Systematic Review | A systematic review of acoustic features and machine learning methods for speech-based suicide risk assessment. | multimodal
26 | Quaff: Quantized Parameter-Efficient Fine-Tuning under Outlier Spatial Stability Hypothesis | Quaff: a quantized parameter-efficient fine-tuning framework built on the outlier spatial stability hypothesis. | large language model
27 | When LLMs meet open-world graph learning: a new perspective for unlabeled data uncertainty | Proposes the Open-world Graph Assistant (OGA) framework to handle unlabeled-data uncertainty in open-world graph learning. | large language model
28 | Causes and Consequences of Representational Similarity in Machine Learning Models | Investigates how dataset and task overlap shape representational similarity across machine learning models. | large language model
29 | The Energy Cost of Reasoning: Analyzing Energy Usage in LLMs with Test-time Compute | Examines test-time compute (TTC) as a way to improve the energy efficiency of LLM reasoning, especially on complex reasoning tasks. | large language model
30 | FlowBERT: Prompt-tuned BERT for variable flow field prediction | FlowBERT: prompt-tuned BERT for variable flow field prediction, improving generalization and computational efficiency. | large language model
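
Several entries above tokenize continuous signals before handing them to a sequence model; entry 21 does this with byte pair encoding. As a point of reference only, here is a minimal, generic sketch of byte pair encoding applied to a uniformly quantized 1-D series. The bin count, merge count, and all function names are illustrative assumptions; this is not the tokenizer described in the cited paper.

```python
import numpy as np
from collections import Counter

def quantize(series, n_bins=16):
    """Discretize a 1-D series into integer symbols via uniform binning (assumed scheme)."""
    edges = np.linspace(series.min(), series.max(), n_bins + 1)[1:-1]
    return [int(s) for s in np.digitize(series, edges)]

def bpe_merges(tokens, n_merges=20):
    """Greedily merge the most frequent adjacent symbol pair, BPE-style."""
    tokens = list(tokens)
    next_id = max(tokens) + 1
    merges = {}
    for _ in range(n_merges):
        pairs = Counter(zip(tokens, tokens[1:]))
        if not pairs:
            break
        (a, b), _ = pairs.most_common(1)[0]
        merges[(a, b)] = next_id
        merged, i = [], 0
        while i < len(tokens):
            if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == (a, b):
                merged.append(next_id)   # replace the frequent pair with a new symbol
                i += 2
            else:
                merged.append(tokens[i])
                i += 1
        tokens, next_id = merged, next_id + 1
    return tokens, merges

# Usage: compress a noisy sine wave into a shorter symbol sequence.
t = np.linspace(0, 4 * np.pi, 512)
series = np.sin(t) + 0.05 * np.random.randn(512)
tokens, merges = bpe_merges(quantize(series), n_merges=20)
print(f"{len(series)} samples -> {len(tokens)} tokens after {len(merges)} merges")
```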

🔬 Pillar 2: RL Algorithms & Architecture (23 papers)

# | Title | One-Sentence Takeaway | Tags | 🔗
31 | Structured Agent Distillation for Large Language Model | Proposes structured agent distillation to compress LLM agents while preserving reasoning-action consistency. | imitation learning, distillation, large language model
32 | Modality-Balancing Preference Optimization of Large Multimodal Models by Adversarial Negative Mining | Proposes MBPO, tackling modality imbalance in large multimodal models via adversarial negative mining and modality-balancing preference optimization. | preference learning, large language model, multimodal
33 | InfiFPO: Implicit Model Fusion via Preference Optimization in Large Language Models | InfiFPO: implicit model fusion in large language models via preference optimization. | DPO, direct preference optimization, large language model
34 | FlowQ: Energy-Guided Flow Policies for Offline Reinforcement Learning | FlowQ: an offline reinforcement learning algorithm with energy-guided flow policies (a generic flow matching sketch follows this table). | reinforcement learning, offline reinforcement learning, flow matching
35 | Time to Embed: Unlocking Foundation Models for Time Series with Channel Descriptions | CHARM: a foundation embedding model for time series that incorporates channel descriptions for strong representation learning. | representation learning, foundation model
36 | Energy-Efficient Deep Reinforcement Learning with Spiking Transformers | Proposes a reinforcement learning algorithm built on spiking Transformers for energy-efficient complex decision making. | reinforcement learning, deep reinforcement learning
37 | AAPO: Enhancing the Reasoning Capabilities of LLMs with Advantage Momentum | Proposes AAPO, using advantage momentum to strengthen LLM mathematical reasoning. | reinforcement learning, PPO, large language model
38 | Imitation Learning via Focused Satisficing | Proposes an imitation learning method based on focused satisficing that improves on the quality of the demonstrated trajectories. | reinforcement learning, deep reinforcement learning, imitation learning
39 | The Evolution of Alpha in Finance Harnessing Human Insight and LLM Agents | Proposes a framework for evolving alpha strategies with LLM-based financial agents to make investment decisions more intelligent. | representation learning, large language model, multimodal
40 | Interpretable Reinforcement Learning for Load Balancing using Kolmogorov-Arnold Networks | Proposes an interpretable reinforcement learning approach to load balancing based on Kolmogorov-Arnold Networks. | reinforcement learning, PPO
41 | Preference Learning with Lie Detectors can Induce Honesty or Evasion | Shows that preference learning with lie detectors can induce either honesty or evasion. | preference learning, DPO
42 | Sample and Computationally Efficient Continuous-Time Reinforcement Learning with General Function Approximation | Proposes a continuous-time reinforcement learning algorithm with general function approximation that is both sample- and computation-efficient. | reinforcement learning
43 | Text embedding models can be great data engineers | ADEPT: uses text embeddings to automatically build data engineering pipelines that improve predictive model performance. | predictive model, TAMP
44 | TinyV: Reducing False Negatives in Verification Improves RL for LLM Reasoning | TinyV: improves RL for LLM reasoning by reducing false negatives in verification. | reinforcement learning, large language model
45 | Performance Optimization of Energy-Harvesting Underlay Cognitive Radio Networks Using Reinforcement Learning | Proposes a deep Q-network scheme for energy-harvesting underlay cognitive radio networks that raises secondary-user data rates. | reinforcement learning
46 | KIPPO: Koopman-Inspired Proximal Policy Optimization | Proposes KIPPO, using Koopman theory to improve PPO's performance and stability on complex control tasks. | reinforcement learning, policy learning, PPO
47 | Bellman operator convergence enhancements in reinforcement learning algorithms | Improves the convergence and performance of reinforcement learning algorithms through enhanced Bellman operators. | reinforcement learning
48 | Personalised Insulin Adjustment with Reinforcement Learning: An In-Silico Validation for People with Diabetes on Intensive Insulin Treatment | Proposes ABBA, a reinforcement learning scheme for personalized insulin adjustment that optimizes glucose control for people with diabetes. | reinforcement learning
49 | FlowTSE: Target Speaker Extraction with Flow Matching | FlowTSE: a flow matching approach to target speaker extraction that simplifies the pipeline and improves performance. | flow matching
50 | Self Distillation via Iterative Constructive Perturbations | Proposes a self-distillation framework based on iterative constructive perturbations that improves the generalization of deep neural networks. | distillation
51 | From Reasoning to Code: GRPO Optimization for Underrepresented Languages | Applies GRPO optimization to improve LLM code generation for underrepresented languages. | reinforcement learning, large language model
52 | Riemannian Flow Matching for Brain Connectivity Matrices via Pullback Geometry | Proposes DiffeoCFM, a Riemannian flow matching generative model for brain connectivity matrices via pullback geometry. | flow matching
53 | When to retrain a machine learning model | Proposes an uncertainty-based approach for deciding when to retrain a model as performance degrades under data drift. | reinforcement learning, offline reinforcement learning
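
Entries 34, 49, and 52 all build on flow matching. Below is a minimal sketch of the standard conditional flow matching objective (Gaussian source, linear interpolation path) in PyTorch; the network size, toy data, and hyperparameters are arbitrary assumptions, and none of this reproduces FlowQ, FlowTSE, or DiffeoCFM.

```python
import torch
import torch.nn as nn

class VelocityNet(nn.Module):
    """Small MLP predicting the velocity field v_theta(x_t, t)."""
    def __init__(self, dim, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim + 1, hidden), nn.SiLU(),
            nn.Linear(hidden, hidden), nn.SiLU(),
            nn.Linear(hidden, dim),
        )

    def forward(self, x, t):
        return self.net(torch.cat([x, t], dim=-1))

def flow_matching_loss(model, x1):
    """Conditional flow matching: draw x0 ~ N(0, I), set x_t = (1 - t) x0 + t x1,
    and regress the model onto the target velocity x1 - x0."""
    x0 = torch.randn_like(x1)
    t = torch.rand(x1.shape[0], 1)
    xt = (1 - t) * x0 + t * x1
    return ((model(xt, t) - (x1 - x0)) ** 2).mean()

# Usage: one gradient step on toy 2-D data standing in for real samples.
model = VelocityNet(dim=2)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x1 = 0.5 * torch.randn(256, 2) + 2.0
loss = flow_matching_loss(model, x1)
loss.backward()
opt.step()
print(loss.item())
```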

🔬 Pillar 1: Robot Control (3 papers)

# | Title | One-Sentence Takeaway | Tags | 🔗
54 | RLVR-World: Training World Models with Reinforcement Learning | Proposes RLVR-World, optimizing world models with reinforcement learning to boost the task-specific utility of generative models. | manipulation, reinforcement learning, world model
55 | Flattening Hierarchies with Policy Bootstrapping | Proposes a policy-bootstrapping approach that flattens hierarchical reinforcement learning for long-horizon goal-conditioned tasks. | locomotion, manipulation, reinforcement learning
56 | Lessons from Defending Gemini Against Indirect Prompt Injections | Evaluates and strengthens the adversarial robustness of Gemini against indirect prompt injection attacks. | manipulation

🔬 Pillar 8: Physics-based Animation (2 papers)

# | Title | One-Sentence Takeaway | Tags | 🔗
57 | Physics-Guided Learning of Meteorological Dynamics for Weather Downscaling and Forecasting | Proposes PhyDL-NWP, a physics-guided deep learning framework for weather downscaling and forecasting. | spatiotemporal
58 | A PID-Controlled Tensor Wheel Decomposition Model for Dynamic Link Prediction | Proposes PTWD, a PID-controlled tensor wheel decomposition model for link prediction in dynamic networks (a generic discrete PID update is sketched after this table). | spatiotemporal
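
Entry 58 wraps a PID controller around its decomposition model. For readers unfamiliar with the term, below is a minimal, generic discrete-time PID update; the gains, time step, and toy plant are arbitrary assumptions, and the snippet illustrates only the control law, not the PTWD tensor model.

```python
class PID:
    """Discrete-time PID controller:
    u_k = Kp * e_k + Ki * sum_j(e_j) * dt + Kd * (e_k - e_{k-1}) / dt."""
    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, error):
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

# Usage: drive a scalar state toward a setpoint of 1.0 (toy plant, assumed dynamics).
pid, state = PID(kp=2.0, ki=0.5, kd=0.1, dt=0.1), 0.0
for _ in range(50):
    state += pid.update(1.0 - state) * 0.1
print(round(state, 3))
```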

🔬 Pillar 7: Motion Retargeting (1 paper)

# | Title | One-Sentence Takeaway | Tags | 🔗
59 | Textual Steering Vectors Can Improve Visual Understanding in Multimodal Large Language Models | Uses textual steering vectors to improve visual understanding in multimodal large language models. | spatial relationship, large language model, multimodal

🔬 Pillar 5: Interaction & Reaction (1 paper)

# | Title | One-Sentence Takeaway | Tags | 🔗
60 | Securing Transfer-Learned Networks with Reverse Homomorphic Encryption | Proposes reverse homomorphic encryption to protect transfer-learned networks against training-data reconstruction attacks. | OMOMO
