| 1 |
Mosaic3D: Foundation Dataset and Model for Open-Vocabulary 3D Segmentation |
Proposes the Mosaic3D dataset and model for open-vocabulary 3D scene segmentation |
contrastive learning scene understanding open-vocabulary |
|
|
| 2 |
LAYOUTDREAMER: Physics-guided Layout for Text-to-3D Compositional Scene Generation |
LayoutDreamer: proposes a physics-guided layout method for text-to-3D compositional scene generation |
dreamer 3D gaussian splatting 3DGS |
|
|
| 3 |
3D Foundation Model for Generalizable Disease Detection in Head Computed Tomography |
Proposes FM-CT, a 3D foundation model for disease detection in head CT images |
distillation foundation model |
|
|
| 4 |
Particle Trajectory Representation Learning with Masked Point Modeling |
Proposes PoLAr-MAE, which uses masked point modeling for self-supervised particle trajectory representation learning on LArTPC images |
representation learning masked autoencoder MAE |
|
|
| 5 |
AAD-DCE: An Aggregated Multimodal Attention Mechanism for Early and Late Dynamic Contrast Enhanced Prostate MRI Synthesis |
Proposes AAD-DCE, which uses a multimodal attention mechanism to synthesize early and late dynamic contrast-enhanced prostate MRI images |
MAE multimodal |
✅ |
|
| 6 |
Mind the Gap: Evaluating Patch Embeddings from General-Purpose and Histopathology Foundation Models for Cell Segmentation and Classification |
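Compares general-purpose and histopathology pre-trained models, evaluating the performance gap of patch embeddings in cell segmentation and classification |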
representation learning foundation model |
|
|
| 7 |
MaintaAvatar: A Maintainable Avatar Based on Neural Radiance Fields by Continual Learning |
Proposes MaintaAvatar, which maintains a NeRF avatar through continual learning to address catastrophic forgetting under appearance and pose changes |
distillation NeRF neural radiance field |
|
|
| 8 |
MotionLab: Unified Human Motion Generation and Editing via the Motion-Condition-Motion Paradigm |
MotionLab: unifies human motion generation and editing through the Motion-Condition-Motion paradigm |
curriculum learning motion generation |
✅ |
|
| 9 |
IPO: Iterative Preference Optimization for Text-to-Video Generation |
Proposes Iterative Preference Optimization (IPO) to improve the video quality of text-to-video generation models |
direct preference optimization large language model foundation model |
|
|
| 10 |
One Diffusion Step to Real-World Super-Resolution via Flow Trajectory Distillation |
Proposes FluxSR, which achieves one-step real-world image super-resolution via flow trajectory distillation |
flow matching distillation |
✅ |
|
| 11 |
DAMA: Data- and Model-aware Alignment of Multi-modal LLMs |
DAMA: a data- and model-aware alignment method for multimodal LLMs that improves model trustworthiness and effectiveness |
DPO direct preference optimization large language model |
|
|
| 12 |
Controllable Video Generation with Provable Disentanglement |
Proposes CoVoGAN, which achieves controllable video generation through provable disentanglement |
latent dynamics spatiotemporal |
|
|
| 13 |
UNIP: Rethinking Pre-trained Attention Patterns for Infrared Semantic Segmentation |
UNIP: rethinks pre-trained attention patterns for infrared semantic segmentation |
MAE distillation |
✅ |
|