cs.LG (2025-02-06)

📊 49 papers in total | 🔗 9 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (24 🔗7) | Pillar 2: RL & Architecture (20 🔗2) | Pillar 1: Robot Control (3) | Pillar 8: Physics-based Animation (2)

🔬 Pillar 9: Embodied Foundation Models (24 papers)

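#	Title	One-line summary	Tags	🔗
1	MRAMG-Bench: A Comprehensive Benchmark for Advancing Multimodal Retrieval-Augmented Multimodal Generation	Introduces MRAMG-Bench, a comprehensive benchmark for evaluating multimodal retrieval-augmented multimodal generation.	large language model multimodal	✅
2	WaferLLM: Large Language Model Inference at Wafer Scale	WaferLLM: a wafer-scale LLM inference system that fully exploits the compute of wafer-scale accelerators.	large language model	✅
3	Efficient Randomized Experiments Using Foundation Models	Uses pretrained models to improve the efficiency of randomized experiments while preserving statistical validity.	foundation model
4	Multimodal Data-Driven Classification of Mental Disorders: A Comprehensive Approach to Diagnosing Depression, Anxiety, and Schizophrenia	Proposes a deep learning method based on multimodal data fusion to assist the diagnosis of mental disorders.	multimodal
5	HEP-JEPA: A foundation model for collider physics using joint embedding predictive architecture	Proposes HEP-JEPA, a model based on the joint embedding predictive architecture, for high-energy collider physics tasks.	foundation model
6	FAS: Fast ANN-SNN Conversion for Spiking Large Language Models	Proposes FAS, a fast ANN-SNN conversion method for efficiently building spiking large language models.	large language model	✅
7	TorchResist: Open-Source Differentiable Resist Simulator	TorchResist: an open-source differentiable photoresist simulator that supports lithography optimization.	embodied AI large language model
8	Iterate to Accelerate: A Unified Framework for Iterative Reasoning and Feedback Convergence	Proposes a unified iterative reasoning framework based on Bregman divergence to accelerate feedback convergence.	large language model chain-of-thought
9	Speeding up Speculative Decoding via Sequential Approximate Verification	Proposes SPRINTER, which accelerates speculative decoding via sequential approximate verification and reduces LLM inference latency.	large language model
10	Exploring Model Invariance with Discrete Search for Ultra-Low-Bit Quantization	Proposes the InvarExplore framework, which explores model invariance through discrete search to achieve ultra-low-bit quantization.	large language model
11	An Empirical Analysis of Machine Learning Model and Dataset Documentation, Supply Chain, and Licensing Challenges on Hugging Face	Analyzes documentation, supply chain, and licensing challenges for models and datasets on Hugging Face.	large language model
12	FocalCodec: Low-Bitrate Speech Coding via Focal Modulation Networks	FocalCodec: low-bitrate speech coding based on focal modulation networks.	large language model	✅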
13	Speak Easy: Eliciting Harmful Jailbreaks from LLMs with Simple Interactions	Speak Easy: elicits harmful jailbreaks from LLMs through simple interactions.	large language model
14	Algorithmic causal structure emerging through compression	Proposes a compression-based framework for learning algorithmic causal structure, addressing the non-identifiability of causal models.	large language model
15	Short-length Adversarial Training Helps LLMs Defend Long-length Jailbreak Attacks: Theoretical and Empirical Evidence	Short-length adversarial training improves LLM defense against long-length jailbreak attacks.	large language model	✅
16	Multi-agent Architecture Search via Agentic Supernet	Proposes MaAS, a multi-agent architecture search based on an agentic supernet, for query-aware, resource-efficient allocation.	large language model
17	KVTuner: Sensitivity-Aware Layer-Wise Mixed-Precision KV Cache Quantization for Efficient and Nearly Lossless LLM Inference	KVTuner: a sensitivity-aware, layer-wise mixed-precision KV cache quantization method for efficient and nearly lossless LLM inference.	large language model	✅
18	Understanding and Mitigating the Bias Inheritance in LLM-based Data Augmentation on Downstream Tasks	Studies and mitigates bias inheritance in LLM-based data augmentation on downstream tasks.	large language model
19	CMoE: Converting Mixture-of-Experts from Dense to Accelerate LLM Inference	Proposes the CMoE framework to accelerate LLM inference.	large language model	✅
20	Mediator: Memory-efficient LLM Merging with Less Parameter Conflicts and Uncertainty Based Routing	Mediator: efficient LLM merging via parameter-conflict awareness and uncertainty-based routing.	large language model
21	Unravelling Causal Genetic Biomarkers of Alzheimer's Disease via Neuron to Gene-token Backtracking in Neural Architecture: A Groundbreaking Reverse-Gene-Finder Approach	Proposes Reverse-Gene-Finder, which uses neuron backtracking to find causal genetic biomarkers of Alzheimer's disease.	foundation model
22	InfiniteHBD: Building Datacenter-Scale High-Bandwidth Domain for LLM with Optical Circuit Switching Transceivers	InfiniteHBD: builds a datacenter-scale high-bandwidth domain for LLMs based on optical circuit switching transceivers.	large language model
23	Rank Also Matters: Hierarchical Configuration for Mixture of Adapter Experts in LLM Fine-Tuning	Proposes HILO, a hierarchically configured mixture-of-adapter-experts fine-tuning method that improves LLM fine-tuning efficiency.	large language model
24	Adaptive Prototype Knowledge Transfer for Federated Learning with Mixed Modalities and Heterogeneous Tasks	Proposes AproMFL to address heterogeneous tasks and non-uniform labels in mixed-modality federated learning.	multimodal

🔬 Pillar 2: RL & Architecture (20 papers)

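#	Title	One-line summary	Tags	🔗
25	Transforming Multimodal Models into Action Models for Radiotherapy	Proposes an action model based on few-shot reinforcement learning that applies multimodal models to radiotherapy planning.	reinforcement learning foundation model multimodal
26	CAST: Cross Attention based multimodal fusion of Structure and Text for materials property prediction	Proposes CAST, a cross-attention-based structure-text multimodal fusion model for materials property prediction.	predictive model MAE multimodal
27	Behavioral Entropy-Guided Dataset Generation for Offline Reinforcement Learning	Proposes a behavioral-entropy-based dataset generation method for offline reinforcement learning that improves performance on complex continuous control tasks.	reinforcement learning offline RL offline reinforcement learning
28	Illuminating Spaces: Deep Reinforcement Learning and Laser-Wall Partitioning for Architectural Layout Generation	Proposes an architectural layout generation method based on deep reinforcement learning and laser-wall partitioning.	reinforcement learning deep reinforcement learning
29	Training Language Models to Reason Efficiently	Proposes a reinforcement-learning-based method for optimizing reasoning efficiency, reducing LLM inference cost.	reinforcement learning large language model chain-of-thought
30	PILAF: Optimal Human Preference Sampling for Reward Modeling	Proposes PILAF, which improves reward model alignment by optimizing human preference sampling.	reinforcement learning preference learning RLHF
31	Towards Cost-Effective Reward Guided Text Generation	Proposes a new reward model to improve text generation efficiency.	reinforcement learning offline RL offline reinforcement learning
32	Fairness Aware Reinforcement Learning via Proximal Policy Optimization	Proposes Fair-PPO, a fair reinforcement learning method, to address fairness in multi-agent systems.	reinforcement learning PPO
33	Revisiting Intermediate-Layer Matching in Knowledge Distillation: Layer-Selection Strategy Doesn't Matter (Much)	Studies the insensitivity of intermediate-layer matching in knowledge distillation: the layer-selection strategy has little effect.	distillation
34	Provable Sample-Efficient Transfer Learning Conditional Diffusion Models via Representation Learning	Proposes a representation-learning-based theoretical framework for transfer learning of conditional diffusion models, improving sample efficiency.	representation learning
35	Consistency of augmentation graph and network approximability in contrastive learning	Analyzes the consistency of the augmentation graph and network approximability in contrastive learning.	contrastive learning
36	Orthogonal Representation Learning for Estimating Causal Quantities	Proposes orthogonal representation learning to improve the efficiency of estimating causal quantities.	representation learning
37	Autotelic Reinforcement Learning: Exploring Intrinsic Motivations for Skill Acquisition in Open-Ended Environments	Proposes autotelic reinforcement learning, exploring skill acquisition driven by intrinsic motivation in open-ended environments.	reinforcement learning
38	Deep Meta Coordination Graphs for Multi-agent Reinforcement Learning	Proposes deep meta coordination graphs to address cooperative policy learning in multi-agent reinforcement learning.	reinforcement learning	✅
39	CleanSurvival: Automated data preprocessing for time-to-event models using reinforcement learning	CleanSurvival: uses reinforcement learning to automate data preprocessing for survival analysis.	reinforcement learning	✅
40	Beyond Interpolation: Extrapolative Reasoning with Reinforcement Learning and Graph Neural Networks	Proposes a framework based on reinforcement learning and graph neural networks for extrapolative reasoning in logic puzzles.	reinforcement learning
41	Online Location Planning for AI-Defined Vehicles: Optimizing Joint Tasks of Order Serving and Spatio-Temporal Heterogeneous Model Fine-Tuning	Proposes a MARL-based online location planning framework that optimizes the joint tasks of order serving and spatio-temporal heterogeneous model fine-tuning for AI-defined vehicles.	reinforcement learning foundation model
42	Should Code Models Learn Pedagogically? A Preliminary Evaluation of Curriculum Learning for Real-World Software Engineering Tasks	Explores the effectiveness of curriculum learning on real-world software engineering tasks: a preliminary evaluation with CodeT5.	curriculum learning
43	Self-Improving Skill Learning for Robust Skill-based Meta-Reinforcement Learning	Proposes Self-Improving Skill Learning (SISL) to address the instability of skill-based meta-reinforcement learning under noisy offline data.	reinforcement learning
44	Learning Reward Machines from Partially Observed Policies	Proposes a prefix-tree-policy-based method for learning reward machines to address inverse reinforcement learning.	reinforcement learning inverse reinforcement learning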

🔬 Pillar 1: Robot Control (3 papers)

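#	Title	One-line summary	Tags
45	How Vulnerable Is My Learned Policy? Universal Adversarial Perturbation Attacks On Modern Behavior Cloning Policies	Reveals the vulnerability of imitation learning policies: universal adversarial perturbation attacks on behavior cloning policies.	manipulation behavior cloning diffusion policy
46	Making Sense of Touch: Unsupervised Shapelet Learning in Bag-of-words Sense	Proposes NN-STNE to address time-series clustering.	manipulation
47	Detecting Backdoor Attacks via Similarity in Semantic Communication Systems	Proposes a semantic-similarity-based backdoor attack detection method to protect semantic communication systems.	manipulation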

🔬 Pillar 8: Physics-based Animation (2 papers)

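#	Title	One-line summary	Tags	🔗	⭐
48	MedGNN: Towards Multi-resolution Spatiotemporal Graph Learning for Medical Time Series Classification	MedGNN: a multi-resolution spatiotemporal graph learning framework for medical time series classification.	spatiotemporal
49	Network-Wide Traffic Flow Estimation Across Multiple Cities with Global Open Multi-Source Data: A Large-Scale Case Study in Europe and North America	Proposes a deep learning framework based on global open multi-source data for network-wide traffic flow estimation across multiple cities.	spatiotemporal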
