| 1 |
Moment-Based 3D Gaussian Splatting: Resolving Volumetric Occlusion with Order-Independent Transmittance |
Proposes moment-based 3D Gaussian splatting, which resolves volumetric occlusion through order-independent transmittance |
3D gaussian splatting 3DGS gaussian splatting |
|
|
| 2 |
Prior-Enhanced Gaussian Splatting for Dynamic Scene Reconstruction from Casual Video |
Proposes a prior-enhanced Gaussian splatting method for reconstructing dynamic scenes from casual video |
gaussian splatting scene reconstruction |
|
|
| 3 |
Lightweight 3D Gaussian Splatting Compression via Video Codec |
Proposes a lightweight 3D Gaussian splatting compression method based on video codecs, suited to lightweight devices. |
3D gaussian splatting gaussian splatting |
✅ |
|
| 4 |
MultiEgo: A Multi-View Egocentric Video Dataset for 4D Scene Reconstruction |
Proposes MultiEgo, a multi-view egocentric video dataset for 4D scene reconstruction |
scene reconstruction social interaction |
|
|
| 5 |
Super-Resolved Canopy Height Mapping from Sentinel-2 Time Series Using LiDAR HD Reference Data across Metropolitan France |
Proposes THREASURE-Net, which uses Sentinel-2 time series and LiDAR data for high-resolution forest canopy height mapping. |
height map |
✅ |
|
| 6 |
On Geometric Understanding and Learned Data Priors in VGGT |
Analyzes VGGT's geometric understanding, revealing its implicit geometry learning and its reliance on data priors |
VGGT |
|
|
| 7 |
Multi-task Learning with Extended Temporal Shift Module for Temporal Action Localization |
Proposes a multi-task learning method with an extended temporal shift module for temporal action localization |
localization |
|
|
| 8 |
Exploring Spatial-Temporal Representation via Star Graph for mmWave Radar-based Human Activity Recognition |
Proposes a star-graph-based discrete dynamic graph neural network for mmWave radar human activity recognition |
point cloud |
|
|
| 9 |
Particulate: Feed-Forward 3D Object Articulation |
Particulate: proposes a feed-forward method for estimating 3D object articulation, with no per-object optimization. |
point cloud |
|
|
| 10 |
Structure From Tracking: Distilling Structure-Preserving Motion for Video Generation |
Proposes SAM2VideoX, which distills structure-preserving motion priors to improve video generation quality. |
optical flow |
|
|
| 11 |
Depth-Copy-Paste: Multimodal and Depth-Aware Compositing for Robust Face Detection |
Proposes Depth-Copy-Paste, which improves face detection robustness through multimodal, depth-aware compositing. |
Depth Anything |
|
|
| 12 |
FactorPortrait: Controllable Portrait Animation via Disentangled Expression, Pose, and Viewpoint |
FactorPortrait: controllable portrait animation via disentangled expression, pose, and viewpoint |
novel view synthesis |
|
|
| 13 |
3DTeethSAM: Taming SAM2 for 3D Teeth Segmentation |
3DTeethSAM: uses SAM2 for 3D teeth segmentation in support of digital dentistry |
localization |
|
|
| 14 |
Reconstruction as a Bridge for Event-Based Visual Question Answering |
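Proposes a reconstruction-based visual question answering framework for event cameras, addressing the compatibility gap between event data and multimodal large language models. |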
scene understanding |
|
|
| 15 |
DOS: Distilling Observable Softmaps of Zipfian Prototypes for Self-Supervised Point Representation |
DOS: distills observable soft labels through Zipfian prototypes for self-supervised point cloud representation learning |
point cloud |
|
|
| 16 |
Collaborative Reconstruction and Repair for Multi-class Industrial Anomaly Detection |
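Proposes CRR, a collaborative reconstruction and repair network, to address the identity-mapping problem in multi-class industrial anomaly detection. |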
localization |
|
|
| 17 |
Assisted Refinement Network Based on Channel Information Interaction for Camouflaged and Salient Object Detection |
Proposes an assisted refinement network based on channel information interaction for camouflaged and salient object detection. |
localization |
✅ |
|
| 18 |
Surveillance Video-Based Traffic Accident Detection Using Transformer Architecture |
Proposes a Transformer-based model for detecting accidents in traffic video and builds a large-scale balanced dataset. |
optical flow |
|
|
| 19 |
UFVideo: Towards Unified Fine-Grained Video Cooperative Understanding with Large Language Models |
Proposes UFVideo, which achieves unified multi-granularity cooperative video understanding and outperforms existing Video LLMs. |
localization |
|
|
| 20 |
SmokeBench: Evaluating Multimodal Large Language Models for Wildfire Smoke Detection |
SmokeBench: evaluates multimodal large language models on wildfire smoke detection |
localization |
|
|