cs.CV (2024-08-24)
📊 8 papers in total | 🔗 1 with code
🎯 Interest Area Navigation
Pillar 9: Embodied Foundation Models (4 🔗1)
Pillar 2: RL & Architecture (3)
Pillar 3: Perception & Semantics (1)
🔬 Pillar 9: Embodied Foundation Models (4 papers)
| # | Title | Key Point | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 1 | Preliminary Investigations of a Multi-Faceted Robust and Synergistic Approach in Semiconductor Electron Micrograph Analysis: Integrating Vision Transformers with Large Language and Multimodal Models | Proposes a semiconductor electron micrograph analysis method that combines Vision Transformers, LLMs, and multimodal models | large language model, multimodal | | |
| 2 | Can Visual Foundation Models Achieve Long-term Point Tracking? | Evaluates the geometric awareness of visual foundation models in long-term point tracking | foundation model | | |
| 3 | Probing the Robustness of Vision-Language Pretrained Models: A Multimodal Adversarial Attack Approach | Proposes JMTFA, revealing the vulnerability of vision-language pretrained models to multimodal adversarial attacks | multimodal | | |
| 4 | Segment Any Mesh | Proposes Segment Any Mesh, a zero-shot mesh part segmentation method that improves generality and performance | multimodal | ✅ | |
🔬 Pillar 2: RL & Architecture (3 papers)
| # | Title | Key Point | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 5 | Hierarchical Network Fusion for Multi-Modal Electron Micrograph Representation Learning with Foundational Large Language Models | Proposes the Hierarchical Network Fusion (HNF) framework for multimodal electron micrograph representation learning, improving nanomaterial classification accuracy | representation learning, large language model | | |
| 6 | PointDGMamba: Domain Generalization of Point Cloud Classification via Generalized State Space Model | Proposes PointDGMamba to address domain generalization in point cloud classification | Mamba, SSM, state space model | | |
| 7 | Explainable Concept Generation through Vision-Language Preference Learning for Understanding Neural Networks' Internal Representations | Proposes an explainable concept generation method based on vision-language preference learning for understanding the internal representations of neural networks | reinforcement learning, preference learning | | |
🔬 Pillar 3: Perception & Semantics (1 paper)
| # | Title | Key Point | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 8 | G3DST: Generalizing 3D Style Transfer with Neural Radiance Fields across Scenes and Styles | Proposes G3DST, which uses NeRF for generalizable 3D style transfer across scenes and styles | NeRF, neural radiance field | | |