cs.CV (2025-04-16)

📊 7 papers in total | 🔗 1 with code

🎯 Interest Area Navigation

Pillar 9: Embodied Foundation Models (3) · Pillar 1: Robot Control (2 🔗1) · Pillar 3: Perception & Semantics (1) · Pillar 2: RL & Architecture (1)

🔬 Pillar 9: Embodied Foundation Models (3 papers)

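#	Title	One-Sentence Summary	Tags	🔗	⭐
1	WorldMem: Long-term Consistent World Simulation with Memory	WorldMem: uses a memory mechanism to achieve long-term consistent world simulation	TAMP
2	The Devil is in the Prompts: Retrieval-Augmented Prompt Optimization for Text-to-Video Generation	Proposes the RAPO framework, which improves text-to-video generation quality through retrieval-augmented prompt optimization	large language model
3	Interpreting the linear structure of vision-language model embedding spaces	Uses sparse autoencoders to analyze the linear structure of vision-language model embedding spaces	multimodal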

🔬 Pillar 1: Robot Control (2 papers)

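#	Title	One-Sentence Summary	Tags	🔗	⭐
4	Event Quality Score (EQS): Assessing the Realism of Simulated Event Camera Streams via Distances in Latent Space	Proposes the Event Quality Score (EQS) for assessing how realistic simulated event camera data is compared with real data	sim-to-real	✅
5	DG-MVP: 3D Domain Generalization via Multiple Views of Point Clouds for Classification	DG-MVP: 3D domain generalization for classification via multi-view projections of point clouds	sim-to-real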

🔬 Pillar 3: Perception & Semantics (1 paper)

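#	Title	One-Sentence Summary	Tags	🔗	⭐
6	How Do I Do That? Synthesizing 3D Hand Motion and Contacts for Everyday Interactions	Proposes a 3D hand motion synthesis method based on interaction trajectory prediction for everyday interaction scenarios	affordance VQ-VAE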

🔬 Pillar 2: RL & Architecture (1 paper)

#	Title	One-Sentence Summary	Tags	🔗	⭐
7	AdaVid: Adaptive Video-Language Pretraining	AdaVid: adaptive video-language pretraining that improves video encoding efficiency on edge devices	representation learning Ego4D
