cs.CV(2025-09-07)

📊 10 papers in total | 🔗 2 with code

🎯 Interest Area Navigation

Pillar 3: Perception & Semantics (4) · Pillar 2: RL & Architecture (4 🔗1) · Pillar 9: Embodied Foundation Models (2 🔗1)

🔬 Pillar 3: Perception & Semantics (4 papers)

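| # | Title | One-sentence summary | Tags | 🔗 |
|---|-------|----------------------|------|----|
| 1 | MEGS$^{2}$: Memory-Efficient Gaussian Splatting via Spherical Gaussians and Unified Pruning | MEGS$^{2}$: memory-efficient Gaussian splatting via spherical Gaussians and unified pruning. | 3D gaussian splatting, 3DGS, gaussian splatting | |
| 2 | Light-Weight Cross-Modal Enhancement Method with Benchmark Construction for UAV-based Open-Vocabulary Object Detection | Proposes a lightweight cross-modal enhancement method and a benchmark dataset for UAV-based open-vocabulary object detection. | open-vocabulary, open vocabulary | |
| 3 | Motion Aware ViT-based Framework for Monocular 6-DoF Spacecraft Pose Estimation | Proposes a motion-aware ViT-based framework for monocular 6-DoF spacecraft pose estimation. | optical flow | |
| 4 | S-LAM3D: Segmentation-Guided Monocular 3D Object Detection via Feature Space Fusion | S-LAM3D: segmentation-guided monocular 3D object detection via feature space fusion. | depth estimation | |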

🔬 Pillar 2: RL & Architecture (4 papers)

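| # | Title | One-sentence summary | Tags | 🔗 |
|---|-------|----------------------|------|----|
| 5 | MedSeqFT: Sequential Fine-tuning Foundation Models for 3D Medical Image Segmentation | MedSeqFT: proposes a sequential fine-tuning framework that improves the performance of medical image segmentation foundation models on incremental tasks. | distillation, foundation model | |
| 6 | A Fine-Grained Attention and Geometric Correspondence Model for Musculoskeletal Risk Classification in Athletes Using Multimodal Visual and Skeletal Features | ViSK-GAT: fuses visual and skeletal features for accurate musculoskeletal risk classification in athletes. | MAE, multimodal | |
| 7 | Coefficients-Preserving Sampling for Reinforcement Learning with Flow Matching | Proposes Coefficients-Preserving Sampling (CPS) to address noise artifacts in RL optimization of flow matching models. | reinforcement learning, flow matching | |
| 8 | UNO: Unifying One-stage Video Scene Graph Generation via Object-Centric Visual Representation Learning | UNO: proposes a unified one-stage video scene graph generation framework that handles box-level and pixel-level tasks together through object-centric visual representation learning. | representation learning | |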

🔬 Pillar 9: Embodied Foundation Models (2 papers)

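| # | Title | One-sentence summary | Tags | 🔗 |
|---|-------|----------------------|------|----|
| 9 | Compression Beyond Pixels: Semantic Compression with Multimodal Foundation Models | Proposes a semantic compression method based on multimodal foundation models that goes beyond pixel-level reconstruction. | foundation model, multimodal | |
| 10 | BTCChat: Advancing Remote Sensing Bi-temporal Change Captioning with Multimodal Large Language Model | BTCChat: uses a multimodal large language model to improve bi-temporal change captioning for remote sensing. | large language model, multimodal | |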
