| 24 | TuneComp: Joint Fine-tuning and Compression for Large Foundation Models | Proposes TuneComp, joint fine-tuning and compression of large foundation models, improving performance while shrinking model size. | distillation, foundation model | |
| 25 | A Cross Modal Knowledge Distillation & Data Augmentation Recipe for Improving Transcriptomics Representations through Morphological Features | Proposes a cross-modal knowledge distillation and data augmentation recipe that leverages morphological features to improve transcriptomics representations. | distillation, foundation model, multimodal | |
| 26 | Foundation Model Hidden Representations for Heart Rate Estimation from Auscultation | Uses representations from a pretrained acoustic foundation model for heart rate estimation from auscultation, matching or even surpassing conventional methods. | MAE, foundation model | |
| 27 | TabReason: A Reinforcement Learning-Enhanced Reasoning LLM for Explainable Tabular Data Prediction | Proposes TabReason, a reinforcement-learning-enhanced reasoning LLM for explainable tabular data prediction. | reinforcement learning, predictive model, large language model | |
| 28 | Deep Reinforcement Learning Agents are not even close to Human Intelligence | HackAtari shows that deep reinforcement learning agents fail to generalize even on simplified task variants. | reinforcement learning, deep reinforcement learning | |
| 29 | Topology-Aware and Highly Generalizable Deep Reinforcement Learning for Efficient Retrieval in Multi-Deep Storage Systems | Proposes topology-aware deep reinforcement learning for efficient retrieval in multi-deep storage systems. | reinforcement learning, deep reinforcement learning | |
| 30 | Hierarchical Reinforcement Learning with Uncertainty-Guided Diffusional Subgoals | Proposes hierarchical reinforcement learning with uncertainty-guided diffusional subgoals, improving sample efficiency and performance. | reinforcement learning, diffusion policy | |
| 31 | Simple yet Effective Graph Distillation via Clustering | Proposes ClustGDD, efficient graph dataset distillation via clustering that accelerates GNN training. | representation learning, distillation | |
| 32 | Semi-supervised Clustering Through Representation Learning of Large-scale EHR Data | Proposes SCORE, a semi-supervised clustering framework that handles large-scale EHR data through representation learning, improving patient subtyping and prediction. | predictive model, representation learning | |
| 33 | Accelerating RL for LLM Reasoning with Optimal Advantage Regression | Proposes the A*-PO algorithm, accelerating reinforcement learning training for LLM reasoning via optimal advantage regression. | reinforcement learning, PPO, large language model | ✅ |
| 34 | A Framework for Adversarial Analysis of Decision Support Systems Prior to Deployment | Proposes an adversarial analysis framework for decision support systems, used to assess and defend against security risks in deep reinforcement learning agents. | reinforcement learning, deep reinforcement learning, DRL | |
| 35 | Universal Value-Function Uncertainties | Proposes Universal Value-Function Uncertainties (UVU), an efficient method for quantifying value uncertainty in reinforcement learning. | reinforcement learning, offline RL, distillation | |
| 36 | HAD: Hybrid Architecture Distillation Outperforms Teacher in Genomic Sequence Modeling | Proposes Hybrid Architecture Distillation (HAD), boosting small-model performance in genomic sequence modeling beyond that of the large teacher model. | distillation | |
| 37 | A reinforcement learning agent for maintenance of deteriorating systems with increasingly imperfect repairs | Proposes a reinforcement-learning-based maintenance policy for deteriorating systems subject to increasingly imperfect repairs. | reinforcement learning | |
| 38 | Apprenticeship learning with prior beliefs using inverse optimization | Performs apprenticeship learning with prior beliefs via inverse optimization, addressing the ill-posedness of inverse reinforcement learning. | reinforcement learning, inverse reinforcement learning | |
| 39 | Sparsified State-Space Models are Efficient Highway Networks | Proposes the Simba method to improve the efficiency of state-space models. | Mamba, SSM | ✅ |