| 1 |
MLLM-SUL: Multimodal Large Language Model for Semantic Scene Understanding and Localization in Traffic Scenarios |
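Proposes the MLLM-SUL framework, which uses a multimodal large language model to address semantic scene understanding and risk localization in traffic scenarios. |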
scene understanding large language model multimodal |
✅ |
|
| 2 |
DAS3R: Dynamics-Aware Gaussian Splatting for Static Scene Reconstruction |
DAS3R: proposes a dynamics-aware Gaussian splatting method for static scene reconstruction |
gaussian splatting splatting scene reconstruction |
✅ |
|
| 3 |
Dust to Tower: Coarse-to-Fine Photo-Realistic Scene Reconstruction from Sparse Uncalibrated Images |
Proposes Dust to Tower to address scene reconstruction from sparse uncalibrated images |
3D gaussian splatting 3DGS gaussian splatting |
|
|
| 4 |
Learning Radiance Fields from a Single Snapshot Compressive Image |
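Proposes SCINeRF and SCISplat, which learn radiance fields from a single snapshot compressive image, achieving high-quality 3D reconstruction and fast rendering. |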
3D gaussian splatting 3DGS gaussian splatting |
|
|
| 5 |
Towards Open-Vocabulary Remote Sensing Image Semantic Segmentation |
Proposes the GSNet framework and the LandDiscover50K dataset for open-vocabulary semantic segmentation of remote sensing images |
open-vocabulary open vocabulary |
✅ |
|
| 6 |
Sharpening Neural Implicit Functions with Frequency Consolidation Priors |
Proposes frequency consolidation priors to improve the performance of neural implicit functions |
implicit representation |
✅ |
|
| 7 |
Generalized Uncertainty-Based Evidential Fusion with Hybrid Multi-Head Attention for Weak-Supervised Temporal Action Localization |
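Proposes generalized uncertainty-based evidential fusion with a hybrid multi-head attention mechanism to resolve action-background confusion in weakly supervised temporal action localization. |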
optical flow |
✅ |
|