cs.LG (2025-02-05)

📊 47 papers in total | 🔗 10 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (21 🔗5) · Pillar 2: RL & Architecture (17 🔗4) · Pillar 1: Robot Control (4 🔗1) · Pillar 5: Interaction & Reaction (2) · Pillar 3: Perception & Semantics (1) · Pillar 4: Generative Motion (1) · Pillar 8: Physics-based Animation (1)

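🔬 Pillar 9: Embodied Foundation Models (21 papers)

#	Title	One-sentence summary	Tags	🔗
1	RiemannGFM: Learning a Graph Foundation Model from Riemannian Geometry	Proposes RiemannGFM, which learns a graph foundation model from Riemannian geometry to enable cross-domain transfer on graph data.	large language model, foundation model
2	Mol-LLM: Multimodal Generalist Molecular LLM with Improved Graph Utilization	Mol-LLM: a multimodal generalist molecular LLM with improved graph utilization.	large language model, multimodal
3	Schema-Guided Scene-Graph Reasoning based on Multi-Agent Large Language Model System	Proposes SG^2, a schema-guided scene-graph reasoning framework built on a multi-agent LLM system, improving reasoning in complex environments.	large language model
4	Code Simulation as a Proxy for High-order Tasks in Large Language Models	Uses code simulation as a proxy for evaluating LLM capability on high-order tasks.	large language model
5	Do Large Language Model Benchmarks Test Reliability?	Proposes platinum benchmarks to address label errors in LLM reliability evaluation.	large language model	✅
6	Benchmarking Time Series Forecasting Models: From Statistical Techniques to Foundation Models in Real-World Applications	Benchmarks time series forecasting models, from statistical methods to foundation models, in a real-world restaurant-industry application.	foundation model
7	Transformers and Their Roles as Time Series Foundation Models	Analyzes the capabilities of Transformers as time series foundation models, revealing their role in autoregressive modeling.	foundation model
8	DiffListener: Discrete Diffusion Model for Listener Generation	DiffListener: a discrete-diffusion-based, non-autoregressive method for generating listener head poses.	multimodal	✅
9	Bilevel ZOFO: Efficient LLM Fine-Tuning and Meta-Training	Proposes Bilevel-ZOFO to improve the efficiency of LLM fine-tuning.	large language model
10	QuantSpec: Self-Speculative Decoding with Hierarchical Quantized KV Cache	QuantSpec: self-speculative decoding with a hierarchical quantized KV cache to accelerate long-context LLM inference.	large language model
11	Adapt-Pruner: Adaptive Structural Pruning for Efficient Small Language Model Training	Proposes Adapt-Pruner to improve the training efficiency of small language models.	large language model	✅
12	Harmony in Divergence: Towards Fast, Accurate, and Memory-efficient Zeroth-order LLM Fine-tuning	Proposes the Divergence-driven Zeroth-Order optimization algorithm to speed up and improve zeroth-order LLM fine-tuning.	large language model	✅
13	General Time-series Model for Universal Knowledge Representation of Multivariate Time-Series data	Proposes GTM, a general time-series model for universal knowledge representation of multivariate time-series data.	foundation model
14	CARROT: A Cost Aware Rate Optimal Router	Proposes CARROT, a cost-aware, rate-optimal LLM routing method.	large language model
15	PICBench: Benchmarking LLMs for Photonic Integrated Circuits Design	PICBench: a benchmark framework for LLMs in photonic integrated circuit design.	large language model	✅
16	Disproving Program Equivalence with LLMs	Proposes ProbeGen, which uses LLMs and execution feedback to check code equivalence, improving code understanding and synthesis.	large language model
17	SPARC: Subspace-Aware Prompt Adaptation for Robust Continual Learning in LLMs	SPARC: subspace-based prompt tuning that improves LLM robustness in continual learning.	large language model
18	LoCA: Location-Aware Cosine Adaptation for Parameter-Efficient Fine-Tuning	Proposes LoCA to address the limitations of low-rank adaptation methods.	large language model
19	Data Wrangling Task Automation Using Code-Generating Language Models	Proposes a data-wrangling automation system based on code-generating language models to improve data quality.	large language model
20	Scaling Laws for Upcycling Mixture-of-Experts Language Models	Studies scaling laws for upcycling MoE language models, to guide efficient training and outperform training from scratch.	large language model
21	Leveraging the true depth of LLMs	Proposes a layer-parallel acceleration method for LLMs that requires no retraining and significantly raises inference throughput.	large language model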

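🔬 Pillar 2: RL & Architecture (17 papers)

#	Title	One-sentence summary	Tags	🔗
22	CTR-Driven Advertising Image Generation with Multimodal Large Language Models	Proposes a CTR-optimized advertising image generation method based on multimodal large language models, improving e-commerce ad performance.	reinforcement learning, large language model, multimodal	✅
23	Reveal the Mystery of DPO: The Connection between DPO and RL Algorithms	A unified framework reveals the connection between DPO and RL algorithms, giving insight into how RLHF algorithms relate to one another.	reinforcement learning, PPO, RLHF
24	Interactive Symbolic Regression through Offline Reinforcement Learning: A Co-Design Framework	Proposes Sym-Q, an interactive symbolic regression framework based on offline reinforcement learning, to tackle the expression search problem.	reinforcement learning, offline reinforcement learning, IMoS	✅
25	Double Distillation Network for Multi-Agent Reinforcement Learning	Proposes the Double Distillation Network (DDN) to improve cooperative policies in multi-agent reinforcement learning.	reinforcement learning, distillation
26	Contrastive Learning for Cold Start Recommendation with Adaptive Feature Fusion	Proposes a cold-start recommendation model incorporating contrastive learning to address sparse interaction data.	contrastive learning, multimodal
27	RLOMM: An Efficient and Robust Online Map Matching Framework with Reinforcement Learning	Proposes RLOMM, which uses reinforcement learning for efficient and robust online map matching.	reinforcement learning, representation learning, contrastive learning
28	Teaching Language Models to Critique via Reinforcement Learning	Proposes the CTRL framework, which trains critic models for code generation via reinforcement learning, improving LLM code generation.	reinforcement learning, large language model
29	TopoCL: Topological Contrastive Learning for Time Series	TopoCL: a topological contrastive learning method for time series that improves general-purpose representations.	representation learning, contrastive learning
30	MobiCLR: Mobility Time Series Contrastive Learning for Urban Region Representations	MobiCLR: a contrastive-learning model of urban region representations that mines urban mobility time series.	representation learning, contrastive learning
31	Towards Large-Scale In-Context Reinforcement Learning by Meta-Training in Randomized Worlds	Proposes AnyMDP and decoupled policy distillation to enable meta-training for large-scale in-context reinforcement learning.	reinforcement learning, distillation
32	Task-Aware Virtual Training: Enhancing Generalization in Meta-Reinforcement Learning for Out-of-Distribution Tasks	Proposes Task-Aware Virtual Training (TAVT) to improve meta-reinforcement learning generalization on out-of-distribution tasks.	reinforcement learning, representation learning	✅
33	Elucidating the Preconditioning in Consistency Distillation	Proposes Analytic-Precond, which accelerates consistency distillation training through analytically optimized preconditioning.	distillation
34	A Unified Knowledge-Distillation and Semi-Supervised Learning Framework to Improve Industrial Ads Delivery Systems	Proposes the UKDSL framework, combining knowledge distillation and semi-supervised learning to improve industrial ads delivery systems.	distillation
35	Calibrated Unsupervised Anomaly Detection in Multivariate Time-series using Reinforcement Learning	Proposes a calibrated, reinforcement-learning-based unsupervised anomaly detection method for multivariate time series.	reinforcement learning
36	Optimistic ε-Greedy Exploration for Cooperative Multi-Agent Reinforcement Learning	Proposes an optimistic ε-greedy exploration algorithm to address suboptimal policies in cooperative multi-agent reinforcement learning.	reinforcement learning
37	Wolfpack Adversarial Attack for Robust Multi-Agent Reinforcement Learning	Proposes the Wolfpack adversarial attack and the WALL framework to improve the robustness of multi-agent reinforcement learning.	reinforcement learning	✅
38	DeepCell: Self-Supervised Multiview Fusion for Circuit Representation Learning	DeepCell: a self-supervised multiview fusion framework for circuit representation learning.	representation learning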

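🔬 Pillar 1: Robot Control (4 papers)

#	Title	One-sentence summary	Tags	🔗
39	TD-M(PC)$^2$: Improving Temporal Difference MPC Through Policy Constraint	TD-M(PC)$^2$: improves temporal difference model predictive control through a policy constraint, raising data efficiency.	humanoid, MPC, reinforcement learning	✅
40	Scaling laws in wearable human activity recognition	Establishes the first scaling laws for wearable human activity recognition, guiding model design and data selection.	locomotion, multimodal
41	Analyze Feature Flow to Enhance Interpretation and Steering in Language Models	Proposes a cross-layer feature-flow analysis method that improves the interpretability and steerability of language models.	manipulation, large language model
42	Clone-Robust Weights in Metric Spaces: Handling Redundancy Bias for Benchmark Aggregation	Proposes clone-robust weights to handle redundancy bias in benchmark aggregation over metric spaces.	manipulation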

🔬 Pillar 5: Interaction & Reaction (2 papers)

#	Title	One-sentence summary	Tags	🔗
43	HACK: Homomorphic Acceleration via Compression of the Key-Value Cache for Disaggregated LLM Inference	HACK: homomorphic acceleration of disaggregated LLM inference via key-value cache compression.	OMOMO, large language model
44	Functional 3D Scene Synthesis through Human-Scene Optimization	Proposes a functional 3D scene synthesis method based on human-scene optimization, improving scene usability.	human-object interaction

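🔬 Pillar 3: Perception & Semantics (1 paper)

#	Title	One-sentence summary	Tags	🔗
45	No Location Left Behind: Measuring and Improving the Fairness of Implicit Representations for Earth Data	Addresses fairness issues in implicit representations of Earth data with an improved scheme based on spherical harmonic wavelet encoding.	implicit representation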

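🔬 Pillar 4: Generative Motion (1 paper)

#	Title	One-sentence summary	Tags	🔗
46	A Match Made in Heaven? AI-driven Matching of Vulnerabilities and Security Unit Tests	VuTeCo: an AI-driven framework for matching vulnerabilities with security unit tests, supporting security test-case generation.	penetration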

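🔬 Pillar 8: Physics-based Animation (1 paper)

#	Title	One-sentence summary	Tags	🔗
47	A Bayesian perspective on single-shot laser characterization	Proposes a Bayesian-framework method for single-shot measurement of spatio-temporal couplings in ultra-intense laser pulses.	PULSE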
