cs.CV (2025-05-13)

📊 22 papers in total | 🔗 5 with code

🎯 Interest Area Navigation

Pillar 3: Perception & Semantics (7 🔗1) · Pillar 2: RL & Architecture (6) · Pillar 9: Embodied Foundation Models (6 🔗2) · Pillar 1: Robot Control (2 🔗1) · Pillar 8: Physics-based Animation (1 🔗1)

🔬 Pillar 3: Perception & Semantics (7 papers)

# | Title | Key point | Tags | 🔗
1 | DLO-Splatting: Tracking Deformable Linear Objects Using 3D Gaussian Splatting | Proposes DLO-Splatting for tracking deformable linear objects | 3D gaussian splatting, gaussian splatting, splatting |
2 | ADC-GS: Anchor-Driven Deformable and Compressed Gaussian Splatting for Dynamic Scene Reconstruction | Proposes ADC-GS to reduce redundancy in dynamic scene reconstruction | gaussian splatting, splatting, scene reconstruction |
3 | A Survey of 3D Reconstruction with Event Cameras | Surveys the applications and challenges of event cameras in 3D reconstruction | 3D gaussian splatting, 3DGS, gaussian splatting |
4 | Monocular Depth Guided Occlusion-Aware Disparity Refinement via Semi-supervised Learning in Laparoscopic Images | Proposes a depth-guided, occlusion-aware disparity refinement network for disparity estimation in laparoscopic images | monocular depth, optical flow |
5 | Boosting Zero-shot Stereo Matching using Large-scale Mixed Images Sources in the Real World | Proposes BooSTer for zero-shot stereo matching in the real world | depth estimation, monocular depth, foundation model |
6 | EventDiff: A Unified and Efficient Diffusion Model Framework for Event-based Video Frame Interpolation | Proposes EventDiff for event-based video frame interpolation | optical flow |
7 | SpNeRF: Memory Efficient Sparse Volumetric Neural Rendering Accelerator for Edge Devices | Proposes SpNeRF to improve the memory efficiency of sparse volumetric neural rendering on edge devices | NeRF |

🔬 Pillar 2: RL & Architecture (6 papers)

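# | Title | Key point | Tags | 🔗
8 | Reinforcement Learning meets Masked Video Modeling : Trajectory-Guided Adaptive Token Selection | Proposes trajectory-aware adaptive token selection to address the masking strategy problem in video modeling | reinforcement learning, PPO, masked autoencoder |
9 | DFA-CON: A Contrastive Learning Approach for Detecting Copyright Infringement in DeepFake Art | Proposes DFA-CON for detecting copyright infringement in deepfake art | contrastive learning, foundation model |
10 | Adaptive Security Policy Management in Cloud Environments Using Reinforcement Learning | Proposes a reinforcement-learning-based dynamic security policy management framework to address cloud security challenges | reinforcement learning, deep reinforcement learning |
11 | OpenThinkIMG: Learning to Think with Images via Visual Tool Reinforcement Learning | Proposes OpenThinkIMG to address the challenges of visual tool reinforcement learning | reinforcement learning |
12 | Leveraging Multi-Modal Information to Enhance Dataset Distillation | Proposes a multimodal dataset distillation framework to improve dataset performance | distillation |
13 | MoKD: Multi-Task Optimization for Knowledge Distillation | Proposes MoKD to address gradient conflict and gradient dominance in knowledge distillation | distillation |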

🔬 Pillar 9: Embodied Foundation Models (6 papers)

# | Title | Key point | Tags | 🔗
14 | An integrated language-vision foundation model for conversational diagnostics and triaging in primary eye care | Proposes Meta-EyeFM to address multi-task integration in primary eye care diagnosis | large language model, foundation model |
15 | Generative AI for Autonomous Driving: Frontiers and Opportunities | Surveys the applications and challenges of generative AI in autonomous driving | embodied AI, large language model, multimodal |
16 | Multimodal Fusion of Glucose Monitoring and Food Imagery for Caloric Content Prediction | Proposes a multimodal deep learning framework to improve calorie estimation accuracy | multimodal |
17 | Prioritizing Image-Related Tokens Enhances Vision-Language Pre-Training | Proposes PRIOR to address noise in vision-language models | large language model |
18 | Advancing Food Nutrition Estimation via Visual-Ingredient Feature Fusion | Proposes a visual-ingredient feature fusion method to improve food nutrition estimation | multimodal |
19 | Ultra Lowrate Image Compression with Semantic Residual Coding and Compression-aware Diffusion | Proposes ResULIC to address the low efficiency of existing image compression methods | multimodal |

🔬 Pillar 1: Robot Control (2 papers)

# | Title | Key point | Tags | 🔗
20 | TT-DF: A Large-Scale Diffusion-Based Dataset and Benchmark for Human Body Forgery Detection | Proposes the TT-DF dataset to address gaps in human body forgery detection | manipulation, optical flow, spatiotemporal |
21 | Removing Watermarks with Partial Regeneration using Semantic Information | Proposes SemanticRegen to address the fragility of watermark protection | manipulation |

🔬 Pillar 8: Physics-based Animation (1 paper)

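# | Title | Key point | Tags | 🔗
22 | TiMo: Spatiotemporal Foundation Model for Satellite Image Time Series | Proposes TiMo to capture multi-scale spatiotemporal relationships in satellite image time series analysis | spatiotemporal, foundation model |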
