cs.CV (2024-09-14)

📊 11 papers in total | 🔗 4 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (3 🔗1) · Pillar 1: Robot Control (2) · Pillar 8: Physics-based Animation (2 🔗1) · Pillar 3: Perception & Semantics (2 🔗1) · Pillar 2: RL & Architecture (1 🔗1) · Pillar 4: Generative Motion (1)

🔬 Pillar 9: Embodied Foundation Models (3 papers)

#	Title	One-Sentence Summary	Tags	🔗	⭐
1	Keypoint-Integrated Instruction-Following Data Generation for Enhanced Human Pose and Action Understanding in Multimodal Models	Proposes a keypoint-integrated instruction-following data generation method to improve multimodal models' understanding of human pose and action	multimodal instruction following	✅
2	On the Generalizability of Foundation Models for Crop Type Mapping	Evaluates the generalizability and geographic bias of remote sensing foundation models for crop type mapping	foundation model
3	AI-Driven Virtual Teacher for Enhanced Educational Efficiency: Leveraging Large Pretrain Models for Autonomous Error Analysis and Correction	Proposes VATE, which uses large language models for autonomous error analysis and correction to improve teaching efficiency	large language model

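🔬 Pillar 1: Robot Control (2 papers)

#	Title	One-Sentence Summary	Tags	🔗	⭐
4	ManiDext: Hand-Object Manipulation Synthesis via Continuous Correspondence Embeddings and Residual-Guided Diffusion	ManiDext: hand-object manipulation synthesis based on continuous correspondence embeddings and residual-guided diffusion	manipulation dexterous manipulation bi-manual
5	ChildPlay-Hand: A Dataset of Hand Manipulations in the Wild	Proposes the ChildPlay-Hand dataset for studying hand manipulation interactions of children and adults in real-world scenes	manipulation HOI egocentric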

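🔬 Pillar 8: Physics-based Animation (2 papers)

#	Title	One-Sentence Summary	Tags	🔗	⭐
6	Multimodal Power Outage Prediction for Rapid Disaster Response and Resource Allocation	Proposes the VST-GNN model for multimodal power outage prediction, supporting rapid post-disaster response and resource allocation	spatiotemporal multimodal
7	MHAD: Multimodal Home Activity Dataset with Multi-Angle Videos and Synchronized Physiological Signals	Proposes MHAD, a multimodal home activity dataset, to improve the performance of video-based physiological signal analysis in home environments	PULSE multimodal	✅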

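🔬 Pillar 3: Perception & Semantics (2 papers)

#	Title	One-Sentence Summary	Tags	🔗	⭐
8	Associate Everything Detected: Facilitating Tracking-by-Detection to the Unknown	Proposes the AED framework, which associates all detections to handle multi-object tracking of known and unknown categories in a unified way	open-vocabulary open vocabulary	✅
9	Real-Time Stochastic Terrain Mapping and Processing for Autonomous Safe Landing	Proposes a real-time stochastic terrain mapping algorithm based on Gaussian process regression for autonomous safe landing	elevation map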

🔬 Pillar 2: RL & Architecture (1 paper)

#	Title	One-Sentence Summary	Tags	🔗	⭐
10	Evaluating Pre-trained Convolutional Neural Networks and Foundation Models as Feature Extractors for Content-based Medical Image Retrieval	Evaluates pre-trained CNNs and foundation models as feature extractors for medical image retrieval	contrastive learning foundation model	✅

🔬 Pillar 4: Generative Motion (1 paper)

#	Title	One-Sentence Summary	Tags	🔗	⭐
11	LawDNet: Enhanced Audio-Driven Lip Synthesis via Local Affine Warping Deformation	Proposes LawDNet to address audio-driven lip synthesis	motion synthesis
