| 1 |
Physics-based Scene Layout Generation from Human Motion |
Proposes a physics-based scene layout generation method that yields realistic human-scene interaction animation. |
reinforcement learning affordance physically plausible |
|
|
| 2 |
Cross-spectral Gated-RGB Stereo Depth Estimation |
Proposes a cross-spectral gated-RGB stereo depth estimation method that improves long-range depth accuracy. |
MAE depth estimation stereo depth |
|
|
| 3 |
AMFD: Distillation via Adaptive Multimodal Fusion for Multispectral Pedestrian Detection |
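Proposes the AMFD framework, which improves the efficiency of multispectral pedestrian detection through adaptive multimodal fusion distillation. |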
distillation multimodal |
✅ |
|
| 4 |
3DSS-Mamba: 3D-Spectral-Spatial Mamba for Hyperspectral Image Classification |
Proposes 3DSS-Mamba for hyperspectral image classification, improving the efficiency of long-range dependency modeling. |
Mamba state space model HSI |
|
|
| 5 |
A Survey of Deep Learning-based Radiology Report Generation Using Multimodal Data |
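Surveys deep learning-based radiology report generation methods that use multimodal data, focusing on data fusion and model interpretability. |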
contrastive learning multimodal |
|
|
| 6 |
A Multimodal Learning-based Approach for Autonomous Landing of UAV |
Proposes a multimodal learning-based method for autonomous UAV landing that improves accuracy and environmental adaptability. |
reinforcement learning multimodal |
|
|
| 7 |
Active Object Detection with Knowledge Aggregation and Distillation from Large Models |
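Proposes an active object detection method based on knowledge aggregation and distillation, improving detection accuracy in interaction scenes. |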
distillation affordance Ego4D |
|
|
| 8 |
RemoCap: Disentangled Representation Learning for Motion Capture |
RemoCap: proposes a disentangled representation learning method to address 3D human motion capture under complex occlusion. |
representation learning penetration |
✅ |
|
| 9 |
CLRKDNet: Speeding up Lane Detection with Knowledge Distillation |
CLRKDNet: uses knowledge distillation to speed up lane detection, improving real-time performance for autonomous driving. |
teacher-student distillation |
|
|
| 10 |
BIMM: Brain Inspired Masked Modeling for Video Representation Learning |
Proposes BIMM, a brain-inspired masked modeling framework for video representation learning. |
representation learning |
|
|
| 11 |
C3L: Content Correlated Vision-Language Instruction Tuning Data Generation via Contrastive Learning |
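Proposes C3L, which uses contrastive learning to generate content-correlated vision-language instruction tuning data and improve LVLM performance. |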
contrastive learning |
|
|