cs.LG (2025-02-05)

📊 47 papers in total | 🔗 10 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (21 🔗5) · Pillar 2: RL & Architecture (17 🔗4) · Pillar 1: Robot Control (4 🔗1) · Pillar 5: Interaction & Reaction (2) · Pillar 3: Perception & Semantics (1) · Pillar 4: Generative Motion (1) · Pillar 8: Physics-based Animation (1)

🔬 Pillar 9: Embodied Foundation Models (21 papers)

# | Title | One-line Summary | Tags
1 | RiemannGFM: Learning a Graph Foundation Model from Riemannian Geometry | Proposes RiemannGFM, which learns a graph foundation model via Riemannian geometry for cross-domain graph transfer. | large language model, foundation model
2 | Mol-LLM: Multimodal Generalist Molecular LLM with Improved Graph Utilization | Mol-LLM: a multimodal generalist molecular LLM with improved graph utilization. | large language model, multimodal
3 | Schema-Guided Scene-Graph Reasoning based on Multi-Agent Large Language Model System | Proposes SG^2, a schema-guided scene-graph reasoning framework built on a multi-agent LLM system, improving reasoning in complex environments. | large language model
4 | Code Simulation as a Proxy for High-order Tasks in Large Language Models | Uses code simulation as a proxy for evaluating the high-order task capabilities of LLMs. | large language model
5 | Do Large Language Model Benchmarks Test Reliability? | Proposes platinum benchmarks to address label errors in LLM reliability evaluation. | large language model
6 | Benchmarking Time Series Forecasting Models: From Statistical Techniques to Foundation Models in Real-World Applications | Evaluates time-series forecasting models, from statistical techniques to foundation models, in a real-world food-service application. | foundation model
7 | Transformers and Their Roles as Time Series Foundation Models | Analyzes the capabilities of Transformers as time-series foundation models, revealing their role in autoregressive modeling. | foundation model
8 | DiffListener: Discrete Diffusion Model for Listener Generation | DiffListener: a non-autoregressive listener head-motion generation method based on a discrete diffusion model. | multimodal
9 | Bilevel ZOFO: Efficient LLM Fine-Tuning and Meta-Training | Proposes Bilevel-ZOFO to improve LLM fine-tuning efficiency. | large language model
10 | QuantSpec: Self-Speculative Decoding with Hierarchical Quantized KV Cache | QuantSpec: accelerates long-context LLM inference via self-speculative decoding with a hierarchical quantized KV cache. | large language model
11 | Adapt-Pruner: Adaptive Structural Pruning for Efficient Small Language Model Training | Proposes Adapt-Pruner to improve the training efficiency of small language models. | large language model
12 | Harmony in Divergence: Towards Fast, Accurate, and Memory-efficient Zeroth-order LLM Fine-tuning | Proposes a divergence-driven zeroth-order optimization algorithm that speeds up and improves zeroth-order LLM fine-tuning. | large language model
13 | General Time-series Model for Universal Knowledge Representation of Multivariate Time-Series data | Proposes GTM, a general time-series model for universal knowledge representation of multivariate time-series data. | foundation model
14 | CARROT: A Cost Aware Rate Optimal Router | Proposes CARROT, a cost-aware rate-optimal LLM routing method. | large language model
15 | PICBench: Benchmarking LLMs for Photonic Integrated Circuits Design | PICBench: an LLM benchmark for photonic integrated circuit design. | large language model
16 | Disproving Program Equivalence with LLMs | Proposes ProbeGen, which uses LLMs with execution feedback to disprove program equivalence, improving code understanding and synthesis. | large language model
17 | SPARC: Subspace-Aware Prompt Adaptation for Robust Continual Learning in LLMs | SPARC: subspace-aware prompt adaptation for robust continual learning in LLMs. | large language model
18 | LoCA: Location-Aware Cosine Adaptation for Parameter-Efficient Fine-Tuning | Proposes LoCA to address the limitations of low-rank adaptation methods. | large language model
19 | Data Wrangling Task Automation Using Code-Generating Language Models | Proposes a data-wrangling automation system based on code-generating language models to improve data quality. | large language model
20 | Scaling Laws for Upcycling Mixture-of-Experts Language Models | Studies scaling laws for upcycling MoE language models, guiding efficient training that surpasses training from scratch. | large language model
21 | Leveraging the true depth of LLMs | Proposes a layer-parallel LLM acceleration method that boosts inference throughput without retraining. | large language model
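Entries 9 and 12 both build on zeroth-order (forward-pass-only) LLM fine-tuning. As background, a minimal sketch of the classic two-point zeroth-order gradient estimator such methods start from; the function names and toy quadratic are illustrative, not taken from either paper:

```python
import numpy as np

def zo_gradient(loss_fn, theta, eps=1e-3, seed=0):
    """Two-point zeroth-order (SPSA-style) gradient estimate.

    Perturbs all parameters along one random direction z and uses
    (L(theta + eps*z) - L(theta - eps*z)) / (2*eps) as a directional
    derivative, so only two forward passes are needed per step.
    """
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(theta.shape)
    g = (loss_fn(theta + eps * z) - loss_fn(theta - eps * z)) / (2 * eps)
    return g * z  # unbiased estimate of grad L in expectation over z

# Toy usage: minimize a quadratic without ever computing an analytic gradient.
loss = lambda w: float(np.sum((w - 3.0) ** 2))
w = np.zeros(4)
for step in range(500):
    w -= 0.05 * zo_gradient(loss, w, seed=step)
```

The appeal for LLMs is that this estimator needs no backward pass, so activation memory for backpropagation is never allocated; the trade-off is gradient noise that grows with parameter dimension, which is what the papers above try to mitigate.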

🔬 Pillar 2: RL & Architecture (17 papers)

# | Title | One-line Summary | Tags
22 | CTR-Driven Advertising Image Generation with Multimodal Large Language Models | Proposes a CTR-optimized advertising image generation method using multimodal LLMs, improving e-commerce ad performance. | reinforcement learning, large language model, multimodal
23 | Reveal the Mystery of DPO: The Connection between DPO and RL Algorithms | A unified framework revealing the connection between DPO and RL algorithms, with insight into the inner workings of RLHF methods. | reinforcement learning, PPO, RLHF
24 | Interactive Symbolic Regression through Offline Reinforcement Learning: A Co-Design Framework | Proposes Sym-Q, an interactive symbolic regression framework based on offline RL, tackling the expression-search problem. | reinforcement learning, offline reinforcement learning, IMoS
25 | Double Distillation Network for Multi-Agent Reinforcement Learning | Proposes a Double Distillation Network (DDN) to improve cooperative policies in multi-agent RL. | reinforcement learning, distillation
26 | Contrastive Learning for Cold Start Recommendation with Adaptive Feature Fusion | Proposes a cold-start recommendation model with contrastive learning and adaptive feature fusion, addressing sparse interaction data. | contrastive learning, multimodal
27 | RLOMM: An Efficient and Robust Online Map Matching Framework with Reinforcement Learning | Proposes RLOMM, using RL for efficient and robust online map matching. | reinforcement learning, representation learning, contrastive learning
28 | Teaching Language Models to Critique via Reinforcement Learning | Proposes the CTRL framework, training code-critique models via RL to improve LLM code generation. | reinforcement learning, large language model
29 | TopoCL: Topological Contrastive Learning for Time Series | TopoCL: a topological contrastive learning method for time series, improving general-purpose representations. | representation learning, contrastive learning
30 | MobiCLR: Mobility Time Series Contrastive Learning for Urban Region Representations | MobiCLR: a contrastive-learning model for urban region representations, mining urban mobility time series. | representation learning, contrastive learning
31 | Towards Large-Scale In-Context Reinforcement Learning by Meta-Training in Randomized Worlds | Proposes AnyMDP and decoupled policy distillation for meta-training large-scale in-context RL. | reinforcement learning, distillation
32 | Task-Aware Virtual Training: Enhancing Generalization in Meta-Reinforcement Learning for Out-of-Distribution Tasks | Proposes Task-Aware Virtual Training (TAVT) to improve meta-RL generalization on out-of-distribution tasks. | reinforcement learning, representation learning
33 | Elucidating the Preconditioning in Consistency Distillation | Proposes Analytic-Precond, analytically optimized preconditioning that accelerates consistency distillation training. | distillation
34 | A Unified Knowledge-Distillation and Semi-Supervised Learning Framework to Improve Industrial Ads Delivery Systems | Proposes UKDSL, combining knowledge distillation and semi-supervised learning to improve industrial ad delivery systems. | distillation
35 | Calibrated Unsupervised Anomaly Detection in Multivariate Time-series using Reinforcement Learning | Proposes an RL-based calibrated unsupervised anomaly detection method for multivariate time series. | reinforcement learning
36 | Optimistic ε-Greedy Exploration for Cooperative Multi-Agent Reinforcement Learning | Proposes an optimistic ε-greedy exploration algorithm to address suboptimal policies in cooperative multi-agent RL. | reinforcement learning
37 | Wolfpack Adversarial Attack for Robust Multi-Agent Reinforcement Learning | Proposes the Wolfpack adversarial attack and the WALL framework to improve the robustness of multi-agent RL. | reinforcement learning
38 | DeepCell: Self-Supervised Multiview Fusion for Circuit Representation Learning | DeepCell: a self-supervised multiview fusion framework for circuit representation learning. | representation learning
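Entry 23 studies how DPO relates to RL algorithms. For reference, a minimal sketch of the standard per-pair DPO loss, whose implicit reward beta·(policy log-prob − reference log-prob) is the quantity that links it back to RLHF; the function name and inputs here are illustrative:

```python
import math

def dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """Standard DPO loss on one preference pair.

    logp_w / logp_l: policy log-probs of the chosen and rejected responses;
    ref_logp_*: the same quantities under the frozen reference model.
    The loss is the negative log Bradley-Terry likelihood of the preference
    under the implicit reward beta * (policy log-prob - reference log-prob).
    """
    margin = beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l))
    return -math.log(1.0 / (1.0 + math.exp(-margin)))  # -log sigmoid(margin)

# At initialization (policy == reference) the margin is 0 and the loss is log 2.
print(dpo_loss(-5.0, -7.0, -5.0, -7.0))  # ≈ 0.6931
```

The loss shrinks as the policy raises the chosen response's log-prob relative to the rejected one, which is why DPO can be read as RLHF with a closed-form reward model.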

🔬 Pillar 1: Robot Control (4 papers)

# | Title | One-line Summary | Tags
39 | TD-M(PC)$^2$: Improving Temporal Difference MPC Through Policy Constraint | TD-M(PC)$^2$: improves temporal-difference model predictive control through a policy constraint, boosting data efficiency. | humanoid, MPC, reinforcement learning
40 | Scaling laws in wearable human activity recognition | Establishes the first scaling laws for wearable human activity recognition, guiding model design and data selection. | locomotion, multimodal
41 | Analyze Feature Flow to Enhance Interpretation and Steering in Language Models | Proposes cross-layer feature-flow analysis to improve the interpretability and steerability of language models. | manipulation, large language model
42 | Clone-Robust Weights in Metric Spaces: Handling Redundancy Bias for Benchmark Aggregation | Proposes clone-robust weights to handle redundancy bias in benchmark aggregation over metric spaces. | manipulation

🔬 Pillar 5: Interaction & Reaction (2 papers)

# | Title | One-line Summary | Tags
43 | HACK: Homomorphic Acceleration via Compression of the Key-Value Cache for Disaggregated LLM Inference | HACK: homomorphic acceleration via KV-cache compression for disaggregated LLM inference. | OMOMO, large language model
44 | Functional 3D Scene Synthesis through Human-Scene Optimization | Proposes functional 3D scene synthesis through human-scene optimization, improving scene usability. | human-object interaction

🔬 Pillar 3: Perception & Semantics (1 paper)

# | Title | One-line Summary | Tags
45 | No Location Left Behind: Measuring and Improving the Fairness of Implicit Representations for Earth Data | Addresses the fairness of implicit representations for Earth data, proposing an improvement based on spherical-harmonic wavelet encoding. | implicit representation

🔬 Pillar 4: Generative Motion (1 paper)

# | Title | One-line Summary | Tags
46 | A Match Made in Heaven? AI-driven Matching of Vulnerabilities and Security Unit Tests | VuTeCo: an AI-driven framework for matching vulnerabilities with security unit tests, facilitating security test-case generation. | penetration

🔬 Pillar 8: Physics-based Animation (1 paper)

# | Title | One-line Summary | Tags
47 | A Bayesian perspective on single-shot laser characterization | Proposes a Bayesian framework for single-shot measurement of spatio-temporal couplings in ultra-intense laser pulses. | PULSE
