cs.CV (2024-08-16)

📊 6 papers in total | 🔗 1 with code

🎯 Interest Area Navigation

Pillar 2: RL Algorithms and Architecture (3 🔗1) · Pillar 9: Embodied Foundation Models (2) · Pillar 3: Spatial Perception and Semantics (1)

🔬 Pillar 2: RL Algorithms and Architecture (3 papers)

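| # | Title | One-sentence summary | Tags | 🔗 |
|---|-------|----------------------|------|----|
| 1 | Multi Teacher Privileged Knowledge Distillation for Multimodal Expression Recognition | Proposes MT-PKDOT, a multi-teacher privileged knowledge distillation method that improves multimodal emotion recognition when modalities are missing. | distillation, privileged information, multimodal | |
| 2 | RGBT Tracking via All-layer Multimodal Interactions with Progressive Fusion Mamba | Proposes AINet, which achieves robust RGBT tracking through all-layer multimodal interactions and a progressive fusion Mamba. | Mamba, multimodal | |
| 3 | PCP-MAE: Learning to Predict Centers for Point Masked Autoencoders | PCP-MAE: learns semantic representations for point cloud masked autoencoders by predicting center points. | masked autoencoder, MAE | |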

🔬 Pillar 9: Embodied Foundation Models (2 papers)

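| # | Title | One-sentence summary | Tags | 🔗 |
|---|-------|----------------------|------|----|
| 4 | xGen-MM (BLIP-3): A Family of Open Large Multimodal Models | Releases BLIP-3: xGen-MM, a family of open large multimodal models. | multimodal | |
| 5 | Retrieval-augmented Few-shot Medical Image Segmentation with Foundation Models | Proposes a retrieval-augmented few-shot medical image segmentation method based on DINOv2 and SAM2 that requires no fine-tuning. | foundation model | |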

🔬 Pillar 3: Spatial Perception and Semantics (1 paper)

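| # | Title | One-sentence summary | Tags | 🔗 |
|---|-------|----------------------|------|----|
| 6 | VF-NeRF: Learning Neural Vector Fields for Indoor Scene Reconstruction | VF-NeRF: proposes a neural vector field method for indoor scene reconstruction that effectively handles weakly textured planes. | NeRF, neural radiance field, implicit representation | |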
