| 12 |
SegMo: Segment-aligned Text to 3D Human Motion Generation |
Proposes the SegMo framework, which aligns text and motion segments to achieve finer-grained text-driven 3D human motion generation. |
contrastive learning, motion generation, human motion |
|
|
| 13 |
Multimodal Skeleton-Based Action Representation Learning via Decomposition and Composition |
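Proposes a decomposition-and-composition framework for multimodal skeleton-based action representation learning, improving efficiency and performance. |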
representation learning, multimodal |
|
|
| 14 |
Surgical Scene Segmentation using a Spike-Driven Video Transformer with Real-Time Potential |
Proposes SpikeSurgSeg, a spike-driven video Transformer for surgical scene segmentation with real-time potential. |
representation learning, scene understanding, spatiotemporal |
|
|
| 15 |
TICON: A Slide-Level Tile Contextualizer for Histopathology Representation Learning |
TICON: a slide-level tile contextualization method for histopathology representation learning. |
representation learning, foundation model |
|
|
| 16 |
Learning from Next-Frame Prediction: Autoregressive Video Modeling Encodes Effective Representations |
Proposes NExT-Vid, an autoregressive video modeling framework based on next-frame prediction that improves visual representation learning. |
flow matching, representation learning, visual pre-training |
|
|
| 17 |
A Graph-Augmented knowledge Distillation based Dual-Stream Vision Transformer with Region-Aware Attention for Gastrointestinal Disease Classification with Explainable AI |
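Proposes a dual-stream Vision Transformer based on graph-augmented knowledge distillation for explainable gastrointestinal disease classification. |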
teacher-student distillation |
|
|
| 18 |
Self-supervised Multiplex Consensus Mamba for General Image Fusion |
Proposes the SMC-Mamba framework for general image fusion, improving performance across multiple fusion tasks. |
Mamba, contrastive learning |
|
|
| 19 |
PUFM++: Point Cloud Upsampling via Enhanced Flow Matching |
PUFM++: point cloud upsampling via enhanced flow matching, improving geometric fidelity and robustness. |
flow matching |
✅ |
|
| 20 |
XGrid-Mapping: Explicit Implicit Hybrid Grid Submaps for Efficient Incremental Neural LiDAR Mapping |
Proposes XGrid-Mapping, which uses explicit-implicit hybrid grid submaps for efficient incremental neural LiDAR mapping. |
distillation, implicit representation |
|
|