cs.CV (2024-05-20)

📊 24 papers in total | 🔗 11 with code

🎯 Interest Area Navigation

Pillar 3: Spatial Perception & Semantics (9 🔗4) · Pillar 9: Embodied Foundation Models (7 🔗2) · Pillar 2: RL Algorithms & Architecture (5 🔗2) · Pillar 8: Physics-based Animation (2 🔗2) · Pillar 6: Video Extraction & Matching (1 🔗1)

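🔬 Pillar 3: Spatial Perception & Semantics (9 papers)

# | Title | One-line summary | Tags
1 | CoR-GS: Sparse-View 3D Gaussian Splatting via Co-Regularization | CoR-GS: improves sparse-view 3D Gaussian splatting through co-regularization | 3D gaussian splatting, 3DGS, gaussian splatting
2 | AtomGS: Atomizing Gaussian Splatting for High-Fidelity Radiance Field | AtomGS: high-fidelity radiance field reconstruction via atomized Gaussian splatting | 3D gaussian splatting, 3DGS, gaussian splatting
3 | MVSGaussian: Fast Generalizable Gaussian Splatting Reconstruction from Multi-View Stereo | MVSGaussian: a fast, generalizable Gaussian splatting method based on multi-view stereo | gaussian splatting, splatting, NeRF
4 | Depth Prompting for Sensor-Agnostic Depth Estimation | Proposes a depth prompting module to address the poor generalization of depth estimation caused by sensor heterogeneity | depth estimation, monocular depth, foundation model
5 | MTVQA: Benchmarking Multilingual Text-Centric Visual Question Answering | Proposes MTVQA, a multilingual text-centric visual question answering benchmark, to advance multilingual scene understanding | scene understanding, large language model, multimodal
6 | MirrorGaussian: Reflecting 3D Gaussians for Reconstructing Mirror Reflections | Proposes MirrorGaussian, which reconstructs mirror-reflection scenes by reflecting 3D Gaussians and supports real-time rendering | 3D gaussian splatting, gaussian splatting, splatting
7 | Gaussian Head & Shoulders: High Fidelity Neural Upper Body Avatars with Anchor Gaussian Guided Texture Warping | Proposes a high-fidelity neural upper-body avatar method based on anchor-Gaussian-guided texture warping | 3D gaussian splatting, gaussian splatting, splatting
8 | Depth Reconstruction with Neural Signed Distance Fields in Structured Light Systems | Proposes a structured-light 3D reconstruction method based on neural signed distance fields, improving geometric accuracy | depth estimation, implicit representation
9 | NPLMV-PS: Neural Point-Light Multi-View Photometric Stereo | Proposes NPLMV-PS, a point-light multi-view photometric stereo method that uses neural rendering to improve 3D reconstruction accuracy | NeRF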

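🔬 Pillar 9: Embodied Foundation Models (7 papers)

# | Title | One-line summary | Tags
10 | Imp: Highly Capable Large Multimodal Models for Mobile Devices | Imp: high-performance, lightweight large multimodal models for mobile devices | large language model, multimodal
11 | Comparing ImageNet Pre-training with Digital Pathology Foundation Models for Whole Slide Image-Based Survival Analysis | Uses pathology-pretrained models to improve the performance of MIL networks for WSI survival analysis | foundation model
12 | Adapting Large Multimodal Models to Distribution Shifts: The Role of In-Context Learning | Proposes InvariantSelectPR to improve the domain adaptation ability of large models under distribution shift | multimodal
13 | Data Augmentation for Text-based Person Retrieval Using Large Language Models | Proposes LLM-DA, a data augmentation method based on large language models, to improve text-based person retrieval | large language model
14 | Mammo-CLIP: A Vision Language Foundation Model to Enhance Data Efficiency and Robustness in Mammography | Proposes Mammo-CLIP to improve data efficiency and robustness in mammography image diagnosis | foundation model
15 | Generative AI Empowered LiDAR Point Cloud Generation with Multimodal Transformer | Proposes a generative AI method based on a multimodal Transformer that generates LiDAR point clouds from image and radar data, improving environment perception for 6G wireless communication systems | multimodal
16 | MM-Retinal: Knowledge-Enhanced Foundational Pretraining with Fundus Image-Text Expertise | Proposes KeepFIT, a knowledge-enhanced fundus image-text pretraining model, to improve the generalization of ophthalmic image analysis | foundation model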

🔬 Pillar 2: RL Algorithms & Architecture (5 papers)

# | Title | One-line summary | Tags
17 | GarmentDreamer: 3DGS Guided Garment Synthesis with Diverse Geometry and Texture Details | Proposes GarmentDreamer to address consistency problems in 3D garment generation | dreamer, distillation, 3D gaussian splatting
18 | Mamba-in-Mamba: Centralized Mamba-Cross-Scan in Tokenized Mamba Model for Hyperspectral Image Classification | Proposes the Mamba-in-Mamba model for hyperspectral image classification, improving feature aggregation and efficiency | Mamba, SSM, state space model
19 | GeoMask3D: Geometrically Informed Mask Selection for Self-Supervised Point Cloud Learning in 3D | GeoMask3D: geometry-informed mask selection that improves self-supervised learning on 3D point clouds | MAE, teacher-student, distillation
20 | Versatile Teacher: A Class-aware Teacher-student Framework for Cross-domain Adaptation | Proposes Versatile Teacher to address class differences in cross-domain object detection and improve pseudo-label quality | teacher-student
21 | Distill-then-prune: An Efficient Compression Framework for Real-time Stereo Matching Network on Edge Devices | Proposes a combined distillation and pruning method to improve stereo matching accuracy on edge devices | distillation, scene flow

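🔬 Pillar 8: Physics-based Animation (2 papers)

# | Title | One-line summary | Tags
22 | CSTA: CNN-based Spatiotemporal Attention for Video Summarization | Proposes CSTA, a video summarization method based on CNN spatiotemporal attention, improving keyframe extraction performance | spatiotemporal
23 | AutoSoccerPose: Automated 3D posture Analysis of Soccer Shot Movements | Proposes AutoSoccerPose, a semi-automated pipeline for 3D posture analysis of soccer shot movements | spatiotemporal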

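🔬 Pillar 6: Video Extraction & Matching (1 paper)

# | Title | One-line summary | Tags
24 | Learning Spatial Similarity Distribution for Few-shot Object Counting | Proposes a few-shot object counting network based on learning spatial similarity distributions | feature matching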
