cs.CV (2025-04-19)

📊 13 papers in total | 🔗 4 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (9 🔗4) | Pillar 2: RL & Architecture (3) | Pillar 1: Robot Control (1)

🔬 Pillar 9: Embodied Foundation Models (9 papers)

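| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 1 | Manipulating Multimodal Agents via Cross-Modal Prompt Injection | Proposes the CrossInject framework, which manipulates multimodal agents through cross-modal prompt injection attacks. | large language model, multimodal | |
| 2 | A Multimodal Recaptioning Framework to Account for Perceptual Diversity Across Languages in Vision-Language Modeling | Proposes a multimodal recaptioning framework to address cross-lingual perceptual differences in vision-language models. | multimodal | |
| 3 | Towards Explainable Fake Image Detection with Multi-Modal Large Language Models | Proposes an explainable detection framework for AI-generated images based on multimodal large language models. | large language model | |
| 4 | Enhancing Multimodal In-Context Learning for Image Classification through Coreset Optimization | Proposes the KeCO framework, based on coreset optimization, to improve multimodal in-context learning for image classification. | multimodal | |
| 5 | Locate 3D: Real-World Object Localization via Self-Supervised Learning in 3D | Locate 3D: real-world object localization via self-supervised learning in 3D. | foundation model, language conditioned | |
| 6 | Adversarial Attack for RGB-Event based Visual Object Tracking | Proposes a cross-modal adversarial attack algorithm that targets RGB-Event visual object tracking and degrades tracker performance. | multimodal | |
| 7 | Visual Consensus Prompting for Co-Salient Object Detection | Proposes Visual Consensus Prompting (VCP) to address limited efficiency and insufficient interaction in co-salient object detection. | foundation model | |
| 8 | Exploring Modality Guidance to Enhance VFM-based Feature Fusion for UDA in 3D Semantic Segmentation | Proposes a modality-guided VFM feature fusion method to improve unsupervised domain adaptation (UDA) for 3D semantic segmentation. | foundation model | |
| 9 | Real-IAD D3: A Real-World 2D/Pseudo-3D/3D Dataset for Industrial Anomaly Detection | Real-IAD D3: a real-world multimodal dataset for industrial anomaly detection. | multimodal | |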

🔬 Pillar 2: RL & Architecture (3 papers)

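| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 10 | HFBRI-MAE: Handcrafted Feature Based Rotation-Invariant Masked Autoencoder for 3D Point Cloud Analysis | HFBRI-MAE: a rotation-invariant masked autoencoder based on handcrafted features, aimed at more robust 3D point cloud analysis. | masked autoencoder, MAE | |
| 11 | PVLM: Parsing-Aware Vision Language Model with Dynamic Contrastive Learning for Zero-Shot Deepfake Attribution | Proposes PVLM, which uses parsing information and dynamic contrastive learning for zero-shot deepfake attribution. | representation learning, contrastive learning | |
| 12 | Efficient Spiking Point Mamba for Point Cloud Analysis | Proposes Spiking Point Mamba (SPM), a Mamba-based spiking neural network (SNN) for efficient point cloud analysis. | Mamba | |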

🔬 Pillar 1: Robot Control (1 paper)

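| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 13 | Hydra: An Agentic Reasoning Approach for Enhancing Adversarial Robustness and Mitigating Hallucinations in Vision-Language Models | Hydra: an agentic reasoning approach that enhances the adversarial robustness of vision-language models and mitigates hallucinations. | manipulation, chain-of-thought | |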
