cs.CV (2024-07-04)
📊 16 papers in total | 🔗 4 with code
🎯 Interest Area Navigation
Pillar 3: Spatial Perception & Semantics (4 🔗1)
Pillar 9: Embodied Foundation Models (4 🔗2)
Pillar 2: RL Algorithms & Architecture (4 🔗1)
Pillar 1: Robot Control (2)
Pillar 5: Interaction & Reaction (1)
Pillar 7: Motion Retargeting (1)
🔬 Pillar 3: Spatial Perception & Semantics (4 papers)
| # | Title | One-Line Summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 1 | CRiM-GS: Continuous Rigid Motion-Aware Gaussian Splatting from Motion-Blurred Images | CRiM-GS: proposes a continuous rigid motion-aware Gaussian splatting method for 3D reconstruction from motion-blurred images. | 3D gaussian splatting, 3DGS, gaussian splatting | | |
| 2 | PFGS: High Fidelity Point Cloud Rendering via Feature Splatting | Proposes PFGS, which achieves high-fidelity point cloud rendering via feature splatting. | 3D gaussian splatting, gaussian splatting, splatting | | |
| 3 | SpikeGS: Reconstruct 3D scene via fast-moving bio-inspired sensors | SpikeGS: reconstructs 3D scenes using bio-inspired high-speed sensors. | 3D gaussian splatting, 3DGS, gaussian splatting | | |
| 4 | Towards Cross-View-Consistent Self-Supervised Surround Depth Estimation | Proposes a cross-view-consistent self-supervised surround depth estimation method that improves depth prediction accuracy in overlapping regions. | depth estimation | ✅ | |
🔬 Pillar 9: Embodied Foundation Models (4 papers)
| # | Title | One-Line Summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 5 | Slice-100K: A Multimodal Dataset for Extrusion-based 3D Printing | Slice-100K: a multimodal dataset for extrusion-based 3D printing. | foundation model, multimodal | ✅ | |
| 6 | ADAPT: Multimodal Learning for Detecting Physiological Changes under Missing Modalities | ADAPT: a multimodal learning framework for detecting physiological changes under missing modalities. | multimodal | | |
| 7 | Robust Adaptation of Foundation Models with Black-Box Visual Prompting | Proposes BlackVIP, which robustly adapts foundation models via black-box visual prompting. | foundation model | | |
| 8 | SSP-IR: Semantic and Structure Priors for Diffusion-based Realistic Image Restoration | Proposes SSP-IR, which uses semantic and structure priors to improve the realism and accuracy of diffusion-based image restoration. | large language model, multimodal | ✅ | |
🔬 Pillar 2: RL Algorithms & Architecture (4 papers)
| # | Title | One-Line Summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 9 | QueryMamba: A Mamba-Based Encoder-Decoder Architecture with a Statistical Verb-Noun Interaction Module for Video Action Forecasting @ Ego4D Long-Term Action Anticipation Challenge 2024 | Proposes QueryMamba, which incorporates a statistical verb-noun interaction module for video action forecasting. | Mamba, Ego4D | | |
| 10 | Relative Difficulty Distillation for Semantic Segmentation | Proposes Relative Difficulty Distillation (RDD) to improve knowledge distillation for semantic segmentation. | teacher-student distillation | | |
| 11 | Do Generalised Classifiers really work on Human Drawn Sketches? | Proposes a new method to improve classification of human-drawn sketches. | representation learning, foundation model | | |
| 12 | Vision Mamba for Classification of Breast Ultrasound Images | Proposes a Mamba-based vision model that improves breast ultrasound image classification. | Mamba | ✅ | |
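🔬 Pillar 1: Robot Control (2 papers)
| # | Title | One-Line Summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 13 | M^3: Manipulation Mask Manufacturer for Arbitrary-Scale Super-Resolution Mask | Proposes the M^3 framework for generating arbitrary-scale super-resolution image manipulation masks, addressing the shortage of image manipulation localization datasets. | manipulation | | |
| 14 | Advances in Diffusion Models for Image Data Augmentation: A Review of Methods, Models, Evaluation Metrics and Future Research Directions | Review: diffusion models for image data augmentation, covering applications, methods, and future directions. | manipulation | | |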
🔬 Pillar 5: Interaction & Reaction (1 paper)
| # | Title | One-Line Summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 15 | Orthogonal Constrained Minimization with Tensor $\ell_{2,p}$ Regularization for HSI Denoising and Destriping | Proposes a multi-scale low-rank tensor regularization method for hyperspectral image (HSI) denoising and destriping. | HSI | | |
🔬 Pillar 7: Motion Retargeting (1 paper)
| # | Title | One-Line Summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 16 | Fast Learning of Signed Distance Functions from Noisy Point Clouds via Noise to Noise Mapping | Proposes a fast SDF learning method based on noise-to-noise mapping for reconstruction from noisy point clouds. | geometric consistency | | |