| 1 |
TranSplat: Generalizable 3D Gaussian Splatting from Sparse Multi-View Images with Transformers |
TranSplat: generalizable 3D Gaussian splatting from sparse multi-view images using Transformers |
depth estimation monocular depth 3D gaussian splatting |
✅ |
|
| 2 |
Making Large Language Models Better Planners with Reasoning-Decision Alignment |
Proposes RDA-Driver, which improves large language model performance in autonomous driving planning through reasoning-decision alignment. |
scene understanding large language model multimodal |
|
|
| 3 |
OpenNav: Efficient Open Vocabulary 3D Object Detection for Smart Wheelchair Navigation |
OpenNav: efficient open-vocabulary 3D object detection for smart wheelchair navigation |
open-vocabulary open vocabulary |
✅ |
|
| 4 |
Splatt3R: Zero-shot Gaussian Splatting from Uncalibrated Image Pairs |
Splatt3R: a zero-shot Gaussian splatting method for uncalibrated image pairs |
gaussian splatting splatting |
|
|
| 5 |
InSpaceType: Dataset and Benchmark for Reconsidering Cross-Space Type Performance in Indoor Monocular Depth |
Proposes the InSpaceType dataset and benchmark for evaluating how well indoor monocular depth estimation generalizes across space types. |
depth estimation monocular depth |
✅ |
|
| 6 |
3D-VirtFusion: Synthetic 3D Data Augmentation through Generative Diffusion Models and Controllable Editing |
3D-VirtFusion: synthetic 3D data augmentation using generative diffusion models and controllable editing |
scene understanding foundation model |
|
|