| 1 |
BrainDINO: A Brain MRI Foundation Model for Generalizable Clinical Representation Learning |
BrainDINO: a brain MRI foundation model for generalizable clinical representation learning |
representation learning foundation model |
|
|
| 2 |
Mind the Gap: Structure-Aware Consistency in Preference Learning |
Proposes structure-aware DPO (SA-DPO) to address the inconsistency of standard surrogate loss functions in LLM preference learning. |
preference learning DPO direct preference optimization |
|
|
| 3 |
Kernelized Advantage Estimation: From Nonparametric Statistics to LLM Reasoning |
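Proposes a kernelized advantage estimation method that improves policy learning efficiency for LLM reasoning in resource-constrained settings |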
reinforcement learning policy learning large language model |
|
|
| 4 |
Detecting is Easy, Adapting is Hard: Local Expert Growth for Visual Model-Based Reinforcement Learning under Distribution Shift |
Proposes JEPA-Indexed Local Expert Growth to address the difficulty of adapting visual MBRL under distribution shift |
reinforcement learning JEPA |
|
|
| 5 |
Exploration Hacking: Can LLMs Learn to Resist RL Training? |
Finds that LLMs may resist reinforcement learning training by manipulating their exploration behavior |
reinforcement learning large language model |
|
|
| 6 |
A Unified Framework of Hyperbolic Graph Representation Learning Methods |
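Proposes a unified framework for hyperbolic graph representation learning that facilitates method comparison and reproduction. |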
representation learning |
|
|
| 7 |
CastFlow: Learning Role-Specialized Agentic Workflows for Time Series Forecasting |
Proposes CastFlow: a role-specialized agentic workflow for time series forecasting |
reinforcement learning large language model |
|
|
| 8 |
FiLMMeD: Feature-wise Linear Modulation for Cross-Problem Multi-Depot Vehicle Routing |
Proposes FiLMMeD, which uses feature-wise linear modulation to solve cross-problem multi-depot vehicle routing. |
reinforcement learning curriculum learning |
✅ |
|
| 9 |
Exponential families from a single KL identity |
Proposes a new KL divergence identity to simplify derivations for exponential family distributions |
reinforcement learning RLHF |
|
|
| 10 |
Wasserstein Distributionally Robust Regret Optimization for Reinforcement Learning from Human Feedback |
Proposes Wasserstein distributionally robust regret optimization to address reward over-optimization in RLHF |
reinforcement learning PPO RLHF |
|
|
| 11 |
Caracal: Causal Architecture via Spectral Mixing |
Proposes Caracal to address the attention computation bottleneck in long-sequence modeling |
Mamba SSM large language model |
|
|
| 12 |
SPLICE: Latent Diffusion over JEPA Embeddings for Conformal Time-Series Inpainting |
SPLICE: a latent diffusion model over JEPA embeddings for confidence-aware time-series inpainting |
flow matching JEPA |
|
|
| 13 |
Fair Dataset Distillation via Cross-Group Barycenter Alignment |
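Proposes a fair dataset distillation method based on cross-group barycenter alignment to address performance disparities across subgroups. |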
distillation |
|
|