cs.CV (2025-03-07)

📊 26 papers in total | 🔗 10 with code

🎯 Interest Area Navigation

Pillar 3: Perception & Semantics (11 🔗3) | Pillar 2: RL & Architecture (6 🔗3) | Pillar 9: Embodied Foundation Models (6 🔗3) | Pillar 5: Interaction & Reaction (1) | Pillar 6: Video Extraction (1 🔗1) | Pillar 8: Physics-based Animation (1)

🔬 Pillar 3: Perception & Semantics (11 papers)

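# | Title | One-sentence summary | Tags | 🔗
1 | SplatPose: Geometry-Aware 6-DoF Pose Estimation from Single RGB Image via 3D Gaussian Splatting | SplatPose: geometry-aware 6-DoF pose estimation from a single RGB image using 3D Gaussian splatting. | 3D gaussian splatting, 3DGS, gaussian splatting |
2 | Taming Video Diffusion Prior with Scene-Grounding Guidance for 3D Gaussian Splatting from Sparse Inputs | Proposes a scene-grounded video diffusion prior method to address 3D Gaussian splatting reconstruction from sparse inputs. | 3D gaussian splatting, 3DGS, gaussian splatting |
3 | Bayesian Fields: Task-driven Open-Set Semantic Gaussian Splatting | Proposes Bayesian Fields for task-driven open-set semantic Gaussian splatting. | gaussian splatting, splatting, semantic mapping |
4 | CoMoGaussian: Continuous Motion-Aware Gaussian Splatting from Motion-Blurred Images | CoMoGaussian: proposes continuous motion-aware Gaussian splatting to address 3D reconstruction from motion-blurred images. | 3D gaussian splatting, 3DGS, gaussian splatting |
5 | GaussianCAD: Robust Self-Supervised CAD Reconstruction from Three Orthographic Views Using 3D Gaussian Splatting | GaussianCAD: robust self-supervised CAD reconstruction from three orthographic views using 3D Gaussian splatting. | 3D gaussian splatting, gaussian splatting, splatting |
6 | D2GV: Deformable 2D Gaussian Splatting for Video Representation in 400FPS | Proposes D2GV, a video representation based on deformable 2D Gaussian splatting, achieving efficient, high-quality rendering at 400 FPS. | gaussian splatting, splatting, TAMP |
7 | MGSR: 2D/3D Mutual-boosted Gaussian Splatting for High-fidelity Surface Reconstruction under Various Light Conditions | Proposes MGSR, which achieves high-fidelity surface reconstruction under various lighting conditions through mutual boosting of 2D and 3D Gaussian splatting. | 3D gaussian splatting, gaussian splatting, splatting |
8 | EvolvingGS: High-Fidelity Streamable Volumetric Video via Evolving 3D Gaussian Representation | EvolvingGS: high-fidelity streamable volumetric video via an evolving 3D Gaussian representation. | 3D gaussian splatting, 3DGS, gaussian splatting |
9 | HexPlane Representation for 3D Semantic Scene Understanding | Proposes a HexPlane representation for 3D semantic scene understanding, improving segmentation accuracy. | scene understanding |
10 | TomatoScanner: phenotyping tomato fruit based on only RGB image | Proposes TomatoScanner to address the measurement of tomato fruit phenotypes. | depth estimation |
11 | Stereo Any Video: Temporally Consistent Stereo Matching | Proposes the Stereo Any Video framework, achieving temporally consistent video stereo matching without auxiliary information. | optical flow |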

🔬 Pillar 2: RL & Architecture (6 papers)

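# | Title | One-sentence summary | Tags | 🔗
12 | GoalFlow: Goal-Driven Flow Matching for Multimodal Trajectories Generation in End-to-End Autonomous Driving | GoalFlow: a goal-driven flow matching method for multimodal trajectory generation in end-to-end autonomous driving. | diffusion policy, flow matching, multimodal |
13 | Robust Multimodal Learning for Ophthalmic Disease Grading via Disentangled Representation | Proposes the EDRL framework, achieving robust multimodal ophthalmic disease grading through disentangled representations. | representation learning, distillation, multimodal |
14 | Unified Reward Model for Multimodal Understanding and Generation | Proposes UnifiedReward, a unified reward model for preference alignment in multimodal understanding and generation tasks. | DPO, direct preference optimization, multimodal |
15 | FMT: A Multimodal Pneumonia Detection Model Based on Stacking MOE Framework | Proposes the FMT model, achieving more robust multimodal pneumonia detection through a stacked MOE framework. | representation learning, multimodal |
16 | Novel Object 6D Pose Estimation with a Single Reference View | Proposes SinRef-6D, a 6D pose estimation method for novel objects based on a single reference view and state space models. | SSM, state space model, 6D pose estimation |
17 | MagicInfinite: Generating Infinite Talking Videos with Your Words and Voice | MagicInfinite: proposes a diffusion Transformer framework for generating realistic talking videos of unlimited length. | curriculum learning, distillation |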

🔬 Pillar 9: Embodied Foundation Models (6 papers)

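# | Title | One-sentence summary | Tags | 🔗
18 | CASP: Compression of Large Multimodal Models Based on Attention Sparsity | CASP: a compression technique for large multimodal models based on attention sparsity. | large language model, multimodal |
19 | CMMCoT: Enhancing Complex Multi-Image Comprehension via Multi-Modal Chain-of-Thought and Memory Augmentation | Proposes the CMMCoT framework, which improves complex multi-image comprehension through multimodal chain-of-thought and memory augmentation. | multimodal, chain-of-thought |
20 | New multimodal similarity measure for image registration via modeling local functional dependence with linear combination of learned basis functions | Proposes a multimodal similarity measure for image registration based on a linear combination of learned basis functions, improving medical image registration accuracy. | multimodal |
21 | Gaussian Random Fields as an Abstract Representation of Patient Metadata for Multimodal Medical Image Segmentation | Proposes a patient-metadata fusion method based on Gaussian random fields, improving multimodal image segmentation of diabetic foot ulcers. | multimodal |
22 | Pi-GPS: Enhancing Geometry Problem Solving by Unleashing the Power of Diagrammatic Information | Pi-GPS: enhances geometry problem solving by exploiting diagrammatic information. | multimodal |
23 | Escaping Plato's Cave: Towards the Alignment of 3D and Text Latent Spaces | Proposes a method for aligning 3D and text latent spaces, improving cross-modal retrieval performance through subspace projection. | foundation model |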

🔬 Pillar 5: Interaction & Reaction (1 paper)

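# | Title | One-sentence summary | Tags | 🔗
24 | Encrypted Vector Similarity Computations Using Partially Homomorphic Encryption: Applications and Performance Analysis | Uses partially homomorphic encryption to compute similarity between encrypted vectors, with applications such as face recognition. | OMOMO, large language model |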

🔬 Pillar 6: Video Extraction (1 paper)

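# | Title | One-sentence summary | Tags | 🔗
25 | EDM: Efficient Deep Feature Matching | Proposes EDM, an efficient deep feature matching network that balances accuracy and efficiency. | feature matching |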

🔬 Pillar 8: Physics-based Animation (1 paper)

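# | Title | One-sentence summary | Tags | 🔗
26 | Decadal analysis of sea surface temperature patterns, climatology, and anomalies in temperate coastal waters with Landsat-8 TIRS observations | Uses Landsat-8 TIRS data to analyze spatiotemporal patterns and anomalies of sea surface temperature along the South Australian coast. | spatiotemporal |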
