cs.LG (2025-02-06)

📊 49 papers in total | 🔗 9 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (24 🔗7) · Pillar 2: RL & Architecture (20 🔗2) · Pillar 1: Robot Control (3) · Pillar 8: Physics-based Animation (2)

🔬 Pillar 9: Embodied Foundation Models (24 papers)

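| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 1 | MRAMG-Bench: A Comprehensive Benchmark for Advancing Multimodal Retrieval-Augmented Multimodal Generation | Proposes MRAMG-Bench, a comprehensive benchmark for evaluating multimodal retrieval-augmented multimodal generation. | large language model, multimodal | |
| 2 | WaferLLM: Large Language Model Inference at Wafer Scale | WaferLLM: a wafer-scale LLM inference system that fully exploits the compute of wafer-scale accelerators. | large language model | |
| 3 | Efficient Randomized Experiments Using Foundation Models | Uses pretrained models to improve the efficiency of randomized experiments while preserving statistical validity. | foundation model | |
| 4 | Multimodal Data-Driven Classification of Mental Disorders: A Comprehensive Approach to Diagnosing Depression, Anxiety, and Schizophrenia | Proposes a deep learning method based on multimodal data fusion for assisted diagnosis of mental disorders. | multimodal | |
| 5 | HEP-JEPA: A foundation model for collider physics using joint embedding predictive architecture | Proposes HEP-JEPA, a model based on the joint embedding predictive architecture, for high-energy collider physics tasks. | foundation model | |
| 6 | FAS: Fast ANN-SNN Conversion for Spiking Large Language Models | Proposes FAS, a fast ANN-SNN conversion method for efficiently building spiking large language models. | large language model | |
| 7 | TorchResist: Open-Source Differentiable Resist Simulator | TorchResist: an open-source differentiable photoresist simulator to support lithography optimization. | embodied AI, large language model | |
| 8 | Iterate to Accelerate: A Unified Framework for Iterative Reasoning and Feedback Convergence | Proposes a unified iterative reasoning framework based on Bregman divergence that accelerates feedback convergence. | large language model, chain-of-thought | |
| 9 | Speeding up Speculative Decoding via Sequential Approximate Verification | Proposes SPRINTER, which speeds up speculative decoding through sequential approximate verification and lowers LLM inference latency. | large language model | |
| 10 | Exploring Model Invariance with Discrete Search for Ultra-Low-Bit Quantization | Proposes the InvarExplore framework, which explores model invariance via discrete search to achieve ultra-low-bit quantization. | large language model | |
| 11 | An Empirical Analysis of Machine Learning Model and Dataset Documentation, Supply Chain, and Licensing Challenges on Hugging Face | Analyzes documentation, supply chain, and licensing challenges for models and datasets on Hugging Face. | large language model | |
| 12 | FocalCodec: Low-Bitrate Speech Coding via Focal Modulation Networks | FocalCodec: low-bitrate speech coding based on focal modulation networks. | large language model | |
| 13 | Speak Easy: Eliciting Harmful Jailbreaks from LLMs with Simple Interactions | Speak Easy: elicits harmful jailbreaks from LLMs through simple interactions. | large language model | |
| 14 | Algorithmic causal structure emerging through compression | Proposes a compression-based framework for learning algorithmic causal structure, addressing the non-identifiability of causal models. | large language model | |
| 15 | Short-length Adversarial Training Helps LLMs Defend Long-length Jailbreak Attacks: Theoretical and Empirical Evidence | Short-length adversarial training improves LLM defense against long-length jailbreak attacks. | large language model | |
| 16 | Multi-agent Architecture Search via Agentic Supernet | Proposes MaAS, multi-agent architecture search based on an agentic supernet, for query-aware, resource-efficient allocation. | large language model | |
| 17 | KVTuner: Sensitivity-Aware Layer-Wise Mixed-Precision KV Cache Quantization for Efficient and Nearly Lossless LLM Inference | KVTuner: a sensitivity-aware, layer-wise mixed-precision KV cache quantization method for efficient and nearly lossless LLM inference. | large language model | |
| 18 | Understanding and Mitigating the Bias Inheritance in LLM-based Data Augmentation on Downstream Tasks | Studies and mitigates bias inheritance in LLM-based data augmentation on downstream tasks. | large language model | |
| 19 | CMoE: Converting Mixture-of-Experts from Dense to Accelerate LLM Inference | Proposes the CMoE framework to accelerate LLM inference. | large language model | |
| 20 | Mediator: Memory-efficient LLM Merging with Less Parameter Conflicts and Uncertainty Based Routing | Mediator: efficient LLM merging through parameter-conflict awareness and uncertainty-based routing. | large language model | |
| 21 | Unravelling Causal Genetic Biomarkers of Alzheimer's Disease via Neuron to Gene-token Backtracking in Neural Architecture: A Groundbreaking Reverse-Gene-Finder Approach | Proposes Reverse-Gene-Finder, which backtracks from neurons to find causal genetic biomarkers of Alzheimer's disease. | foundation model | |
| 22 | InfiniteHBD: Building Datacenter-Scale High-Bandwidth Domain for LLM with Optical Circuit Switching Transceivers | InfiniteHBD: builds a datacenter-scale high-bandwidth domain for LLMs using optical circuit switching transceivers. | large language model | |
| 23 | Rank Also Matters: Hierarchical Configuration for Mixture of Adapter Experts in LLM Fine-Tuning | Proposes HILO, a hierarchically configured mixture-of-adapter-experts fine-tuning method that improves LLM fine-tuning efficiency. | large language model | |
| 24 | Adaptive Prototype Knowledge Transfer for Federated Learning with Mixed Modalities and Heterogeneous Tasks | Proposes AproMFL, addressing heterogeneous tasks and non-uniform labels in mixed-modality federated learning. | multimodal | |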

🔬 Pillar 2: RL & Architecture (20 papers)

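| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 25 | Transforming Multimodal Models into Action Models for Radiotherapy | Proposes an action model based on few-shot reinforcement learning that applies multimodal models to radiotherapy planning. | reinforcement learning, foundation model, multimodal | |
| 26 | CAST: Cross Attention based multimodal fusion of Structure and Text for materials property prediction | Proposes CAST, a cross-attention-based structure-text multimodal fusion model for materials property prediction. | predictive model, MAE, multimodal | |
| 27 | Behavioral Entropy-Guided Dataset Generation for Offline Reinforcement Learning | Proposes a behavioral-entropy-based dataset generation method for offline reinforcement learning that improves performance on complex continuous control tasks. | reinforcement learning, offline RL, offline reinforcement learning | |
| 28 | Illuminating Spaces: Deep Reinforcement Learning and Laser-Wall Partitioning for Architectural Layout Generation | Proposes an architectural layout generation method based on deep reinforcement learning and laser-wall partitioning. | reinforcement learning, deep reinforcement learning | |
| 29 | Training Language Models to Reason Efficiently | Proposes a reinforcement-learning-based method for optimizing reasoning efficiency, reducing LLM inference cost. | reinforcement learning, large language model, chain-of-thought | |
| 30 | PILAF: Optimal Human Preference Sampling for Reward Modeling | Proposes PILAF, which improves reward-model alignment by optimizing human preference sampling. | reinforcement learning, preference learning, RLHF | |
| 31 | Towards Cost-Effective Reward Guided Text Generation | Proposes a new reward model to make text generation more efficient. | reinforcement learning, offline RL, offline reinforcement learning | |
| 32 | Fairness Aware Reinforcement Learning via Proximal Policy Optimization | Proposes Fair-PPO, a fairness-aware reinforcement learning method for fairness problems in multi-agent systems. | reinforcement learning, PPO | |
| 33 | Revisiting Intermediate-Layer Matching in Knowledge Distillation: Layer-Selection Strategy Doesn't Matter (Much) | Studies the insensitivity of intermediate-layer matching in knowledge distillation: the layer-selection strategy matters little. | distillation | |
| 34 | Provable Sample-Efficient Transfer Learning Conditional Diffusion Models via Representation Learning | Proposes a representation-learning-based theoretical framework for transfer learning of conditional diffusion models that improves sample efficiency. | representation learning | |
| 35 | Consistency of augmentation graph and network approximability in contrastive learning | Analyzes the consistency of the augmentation graph and network approximability in contrastive learning. | contrastive learning | |
| 36 | Orthogonal Representation Learning for Estimating Causal Quantities | Proposes orthogonal representation learning to improve the efficiency of estimating causal quantities. | representation learning | |
| 37 | Autotelic Reinforcement Learning: Exploring Intrinsic Motivations for Skill Acquisition in Open-Ended Environments | Proposes autotelic reinforcement learning, exploring intrinsic-motivation-based skill acquisition in open-ended environments. | reinforcement learning | |
| 38 | Deep Meta Coordination Graphs for Multi-agent Reinforcement Learning | Proposes deep meta coordination graphs for cooperative policy learning in multi-agent reinforcement learning. | reinforcement learning | |
| 39 | CleanSurvival: Automated data preprocessing for time-to-event models using reinforcement learning | CleanSurvival: uses reinforcement learning to automate data preprocessing for survival analysis. | reinforcement learning | |
| 40 | Beyond Interpolation: Extrapolative Reasoning with Reinforcement Learning and Graph Neural Networks | Proposes a framework based on reinforcement learning and graph neural networks for extrapolative reasoning in logic puzzles. | reinforcement learning | |
| 41 | Online Location Planning for AI-Defined Vehicles: Optimizing Joint Tasks of Order Serving and Spatio-Temporal Heterogeneous Model Fine-Tuning | Proposes a MARL-based online location planning framework that jointly optimizes order serving and spatio-temporal heterogeneous model fine-tuning for AI-defined vehicles. | reinforcement learning, foundation model | |
| 42 | Should Code Models Learn Pedagogically? A Preliminary Evaluation of Curriculum Learning for Real-World Software Engineering Tasks | Explores the effectiveness of curriculum learning on real-world software engineering tasks: a preliminary evaluation with CodeT5. | curriculum learning | |
| 43 | Self-Improving Skill Learning for Robust Skill-based Meta-Reinforcement Learning | Proposes Self-Improving Skill Learning (SISL), addressing the instability of skill-based meta-reinforcement learning under noisy offline data. | reinforcement learning | |
| 44 | Learning Reward Machines from Partially Observed Policies | Proposes a reward machine learning method based on prefix tree policies for inverse reinforcement learning. | reinforcement learning, inverse reinforcement learning | |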

🔬 Pillar 1: Robot Control (3 papers)

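| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 45 | How Vulnerable Is My Learned Policy? Universal Adversarial Perturbation Attacks On Modern Behavior Cloning Policies | Reveals the vulnerability of imitation learning policies: universal adversarial perturbation attacks on behavior cloning policies. | manipulation, behavior cloning, diffusion policy | |
| 46 | Making Sense of Touch: Unsupervised Shapelet Learning in Bag-of-words Sense | Proposes NN-STNE for clustering time-series data. | manipulation | |
| 47 | Detecting Backdoor Attacks via Similarity in Semantic Communication Systems | Proposes a semantic-similarity-based backdoor attack detection method to protect semantic communication systems. | manipulation | |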

🔬 Pillar 8: Physics-based Animation (2 papers)

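| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 48 | MedGNN: Towards Multi-resolution Spatiotemporal Graph Learning for Medical Time Series Classification | MedGNN: a multi-resolution spatiotemporal graph learning framework for medical time series classification. | spatiotemporal | |
| 49 | Network-Wide Traffic Flow Estimation Across Multiple Cities with Global Open Multi-Source Data: A Large-Scale Case Study in Europe and North America | Proposes a deep learning framework based on global open multi-source data for network-wide traffic flow estimation across multiple cities. | spatiotemporal | |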
