cs.CV (2024-08-24)

📊 8 papers in total | 🔗 1 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (4 🔗1) · Pillar 2: RL & Architecture (3) · Pillar 3: Perception & Semantics (1)

🔬 Pillar 9: Embodied Foundation Models (4 papers)

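| # | Title | One-sentence summary | Tags | 🔗 |
|---|-------|----------------------|------|----|
| 1 | Preliminary Investigations of a Multi-Faceted Robust and Synergistic Approach in Semiconductor Electron Micrograph Analysis: Integrating Vision Transformers with Large Language and Multimodal Models | Proposes a semiconductor electron micrograph analysis method that combines Vision Transformers, LLMs, and multimodal models | large language model, multimodal | |
| 2 | Can Visual Foundation Models Achieve Long-term Point Tracking? | Evaluates the geometric awareness of visual foundation models in long-term point tracking | foundation model | |
| 3 | Probing the Robustness of Vision-Language Pretrained Models: A Multimodal Adversarial Attack Approach | Proposes JMTFA, revealing the vulnerability of vision-language pretrained models to multimodal adversarial attacks | multimodal | |
| 4 | Segment Any Mesh | Proposes Segment Any Mesh, a zero-shot mesh part segmentation method that improves generality and performance | multimodal | |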

🔬 Pillar 2: RL & Architecture (3 papers)

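| # | Title | One-sentence summary | Tags | 🔗 |
|---|-------|----------------------|------|----|
| 5 | Hierarchical Network Fusion for Multi-Modal Electron Micrograph Representation Learning with Foundational Large Language Models | Proposes the Hierarchical Network Fusion (HNF) framework for multimodal electron micrograph representation learning, improving nanomaterial classification accuracy | representation learning, large language model | |
| 6 | PointDGMamba: Domain Generalization of Point Cloud Classification via Generalized State Space Model | Proposes PointDGMamba to address domain generalization in point cloud classification | Mamba, SSM, state space model | |
| 7 | Explainable Concept Generation through Vision-Language Preference Learning for Understanding Neural Networks' Internal Representations | Proposes an explainable concept generation method based on vision-language preference learning for understanding neural networks' internal representations | reinforcement learning, preference learning | |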

🔬 Pillar 3: Perception & Semantics (1 paper)

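| # | Title | One-sentence summary | Tags | 🔗 |
|---|-------|----------------------|------|----|
| 8 | G3DST: Generalizing 3D Style Transfer with Neural Radiance Fields across Scenes and Styles | Proposes G3DST, which uses NeRF to achieve generalizable 3D style transfer across scenes and styles | NeRF, neural radiance field | |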
