| 1 |
$Δ$-NeRF: Incremental Refinement of Neural Radiance Fields through Residual Control and Knowledge Transfer |
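Proposes $Δ$-NeRF, which incrementally refines neural radiance fields through residual control and knowledge transfer, suited to sequential data such as satellite imagery. |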
NeRF neural radiance novel view synthesis |
|
|
| 2 |
DeLightMono: Enhancing Self-Supervised Monocular Depth Estimation in Endoscopy by Decoupling Uneven Illumination |
DeLightMono: enhances self-supervised monocular depth estimation in endoscopy by decoupling uneven illumination |
depth estimation monocular depth navigation |
|
|
| 3 |
Redefining Radar Segmentation: Simultaneous Static-Moving Segmentation and Ego-Motion Estimation using Radar Point Clouds |
Proposes a method for simultaneous static-moving segmentation and ego-motion estimation from radar point clouds |
point cloud ego-motion |
|
|
| 4 |
3D-Aware Multi-Task Learning with Cross-View Correlations for Dense Scene Understanding |
Proposes 3D-aware multi-task learning based on cross-view correlations for dense scene understanding |
depth estimation scene understanding geometric consistency |
|
|
| 5 |
Material-informed Gaussian Splatting for 3D World Reconstruction in a Digital Twin |
Proposes a material-informed 3D Gaussian splatting method for 3D world reconstruction in digital twins |
3D gaussian splatting gaussian splatting point cloud |
|
|
| 6 |
VGGT4D: Mining Motion Cues in Visual Geometry Transformers for 4D Scene Reconstruction |
VGGT4D: mines motion cues in visual geometry Transformers for 4D scene reconstruction |
scene reconstruction pose estimation VGGT |
|
|
| 7 |
ACIT: Attention-Guided Cross-Modal Interaction Transformer for Pedestrian Crossing Intention Prediction |
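Proposes ACIT, which uses attention mechanisms and a cross-modal interaction Transformer to improve the accuracy of pedestrian crossing intention prediction. |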
optical flow interaction transformer |
|
|
| 8 |
MODEST: Multi-Optics Depth-of-Field Stereo Dataset |
MODEST: a multi-optics depth-of-field stereo dataset that bridges the gap between real optics and synthetic data |
depth estimation stereo depth novel view synthesis |
|
|
| 9 |
Conceptual Evaluation of Deep Visual Stereo Odometry for the MARWIN Radiation Monitoring Robot in Accelerator Tunnels |
Explores the application of deep visual stereo odometry to a radiation monitoring robot in accelerator tunnels |
optical flow ego-motion localization |
|
|
| 10 |
FLaTEC: Frequency-Disentangled Latent Triplanes for Efficient Compression of LiDAR Point Clouds |
FLaTEC: proposes frequency-disentangled latent triplanes for efficient compression of LiDAR point clouds. |
point cloud |
|
|
| 11 |
ReDirector: Creating Any-Length Video Retakes with Rotary Camera Encoding |
ReDirector: generates video retakes of any length using rotary camera encoding |
localization geometric consistency |
|
|
| 12 |
VGGTFace: Topologically Consistent Facial Geometry Reconstruction in the Wild |
VGGTFace: uses a 3D foundation model for topologically consistent facial geometry reconstruction |
point cloud VGGT |
✅ |
|
| 13 |
AMB3R: Accurate Feed-forward Metric-scale 3D Reconstruction with Backend |
AMB3R: uses a compact voxel backend for accurate metric-scale 3D reconstruction |
visual odometry SLAM |
|
|
| 14 |
STAvatar: Soft Binding and Temporal Density Control for Monocular 3D Head Avatars Reconstruction |
STAvatar: proposes soft binding and temporal density control for monocular 3D head avatar reconstruction |
3D gaussian splatting gaussian splatting |
|
|
| 15 |
Estimating Fog Parameters from a Sequence of Stereo Images |
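Proposes a method for dynamically estimating fog parameters from a sequence of stereo images, applicable to visual SLAM and odometry systems. |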
SLAM |
✅ |
|
| 16 |
Mistake Attribution: Fine-Grained Mistake Understanding in Egocentric Videos |
Proposes the Mistake Attribution (MATT) task for fine-grained understanding of human mistakes in egocentric videos. |
localization |
|
|
| 17 |
Zoo3D: Zero-Shot 3D Object Detection at Scene Level |
Zoo3D: proposes a scene-level zero-shot 3D object detection framework that achieves SOTA performance without training. |
point cloud |
✅ |
|
| 18 |
Explainable Visual Anomaly Detection via Concept Bottleneck Models |
Proposes CONVAD, an explainable visual anomaly detection method based on concept bottleneck models |
localization |
|
|
| 19 |
Tell Model Where to Look: Mitigating Hallucinations in MLLMs by Vision-Guided Attention |
Proposes a vision-guided attention (VGA) mechanism to mitigate hallucinations in multimodal large language models |
localization |
|
|
| 20 |
Foundry: Distilling 3D Foundation Models for the Edge |
Foundry: distills 3D foundation models for edge devices, achieving efficient compression while preserving generality |
point cloud |
|
|
| 21 |
Multi-Context Fusion Transformer for Pedestrian Crossing Intention Prediction in Urban Environments |
Proposes a Multi-Context Fusion Transformer (MFT) for pedestrian crossing intention prediction in urban environments. |
localization |
✅ |
|