| 1 |
A Vision-Language Foundation Model for Zero-shot Clinical Collaboration and Automated Concept Discovery in Dermatology |
DermFM-Zero: a vision-language foundation model for zero-shot clinical collaboration in dermatology |
contrastive learning foundation model multimodal |
|
|
| 2 |
MetaphorStar: Image Metaphor Understanding and Reasoning with End-to-End Visual Reinforcement Learning |
Proposes MetaphorStar, which uses end-to-end visual reinforcement learning to tackle image metaphor understanding and reasoning. |
reinforcement learning large language model multimodal |
|
|
| 3 |
3DXTalker: Unifying Identity, Lip Sync, Emotion, and Spatial Dynamics in Expressive 3D Talking Avatars |
3DXTalker: expressive 3D talking avatar generation that unifies identity, lip sync, emotion, and spatial dynamics. |
flow matching motion generation |
|
|
| 4 |
HII-DPO: Eliminate Hallucination via Accurate Hallucination-Inducing Counterfactual Images |
Proposes HII-DPO, which eliminates hallucination in vision-language models via adversarial images. |
DPO multimodal |
|
|
| 5 |
Beyond VLM-Based Rewards: Diffusion-Native Latent Reward Modeling |
Proposes DiNa-LRM, a diffusion-native latent reward model that improves the efficiency of preference optimization for diffusion models. |
flow matching preference learning multimodal |
|
|
| 6 |
FastFlow: Accelerating The Generative Flow Matching Models with Bandit Inference |
FastFlow: accelerating generative flow matching models with bandit inference. |
flow matching distillation |
✅ |
|
| 7 |
LaSSM: Efficient Semantic-Spatial Query Decoding via Local Aggregation and State Space Models for 3D Instance Segmentation |
LaSSM: 3D instance segmentation based on local aggregation and state space models. |
SSM state space model |
✅ |
|
| 8 |
Spectral-Spatial Contrastive Learning Framework for Regression on Hyperspectral Data |
Proposes a spectral-spatial contrastive learning framework for regression on hyperspectral data, improving model performance. |
representation learning contrastive learning |
|
|
| 9 |
Self-Supervised Image Super-Resolution Quality Assessment based on Content-Free Multi-Model Oriented Representation Learning |
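Proposes a self-supervised quality assessment method for image super-resolution, based on content-free, multi-model oriented representation learning. |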
representation learning contrastive learning |
|
|
| 10 |
Dual-End Consistency Model |
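Proposes the Dual-End Consistency Model (DE-CM), which addresses unstable training and inflexible sampling in consistency models to enable efficient image generation. |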
flow matching distillation |
|
|