cs.CV (2024-05-02)

📊 11 papers in total | 🔗 2 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (5 🔗1) · Pillar 2: RL & Architecture (4 🔗1) · Pillar 4: Generative Motion (1) · Pillar 6: Video Extraction (1)

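🔬 Pillar 9: Embodied Foundation Models (5 papers)

#	Title	One-line summary	Tags	🔗	⭐
1	MiniGPT-3D: Efficiently Aligning 3D Point Clouds with Large Language Models using 2D Priors	Proposes MiniGPT-3D to efficiently align 3D point clouds with large language models	large language model	✅
2	Language-Enhanced Latent Representations for Out-of-Distribution Detection in Autonomous Driving	Proposes language-enhanced latent representations for out-of-distribution (OOD) detection in autonomous driving	foundation model multimodal
3	Transformer-Aided Semantic Communications	Proposes a Transformer-based semantic communication framework that improves image transmission quality in bandwidth-constrained settings	large language model
4	MANTIS: Interleaved Multi-Image Instruction Tuning	MANTIS: improves the multi-image understanding of large multimodal models through interleaved multi-image instruction tuning	multimodal
5	Teaching Human Behavior Improves Content Understanding Abilities Of LLMs	Uses human behavior feedback to improve the content understanding abilities of large language models	large language model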

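🔬 Pillar 2: RL & Architecture (4 papers)

#	Title	One-line summary	Tags	🔗	⭐
6	SSUMamba: Spatial-Spectral Selective State Space Model for Hyperspectral Image Denoising	Proposes SSUMamba for hyperspectral image denoising, balancing long-range dependency modeling with computational efficiency	Mamba SSM state space model	✅
7	SOAR: Advancements in Small Body Object Detection for Aerial Imagery Using State Space Models and Programmable Gradients	Proposes SOAR, a method based on state space models and programmable gradients, to improve small-object detection in aerial imagery	Mamba SSM state space model
8	ATOM: Attention Mixer for Efficient Dataset Distillation	Proposes ATOM, an attention mixing module for efficient dataset distillation	distillation feature matching
9	Goal-conditioned reinforcement learning for ultrasound navigation guidance	Proposes a contrastive-learning-based goal-conditioned reinforcement learning method for ultrasound navigation guidance	reinforcement learning contrastive learning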

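🔬 Pillar 4: Generative Motion (1 paper)

#	Title	One-line summary	Tags	🔗	⭐
10	SATO: Stable Text-to-Motion Framework	Proposes the SATO framework to address unstable motion output caused by semantically similar text inputs in text-to-motion generation	text-to-motion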

🔬 Pillar 6: Video Extraction (1 paper)

#	Title	One-line summary	Tags	🔗	⭐
11	Sparse multi-view hand-object reconstruction for unseen environments	Proposes the SVHO model for 3D hand-object reconstruction of unseen objects from sparse multi-view input	hand-object reconstruction
