cs.CV (2025-09-02)
📊 5 papers in total | 🔗 2 with code
🎯 Interest Area Navigation
🔬 Pillar 9: Embodied Foundation Models (4 papers)
| # | Title | One-Sentence Summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 1 | STROKEVISION-BENCH: A Multimodal Video And 2D Pose Benchmark For Tracking Stroke Recovery | StrokeVision-Bench: a multimodal video and 2D pose benchmark dataset for tracking stroke recovery | multimodal | | |
| 2 | Toward a robust lesion detection model in breast DCE-MRI: adapting foundation models to high-risk women | Proposes a breast DCE-MRI lesion detection model for high-risk women that combines MST and KAN | foundation model | | |
| 3 | DIET-CP: Lightweight and Data Efficient Self Supervised Continued Pretraining | DIET-CP: a lightweight, data-efficient self-supervised continued pretraining method | foundation model | | |
| 4 | Understanding Space Is Rocket Science -- Only Top Reasoning Models Can Solve Spatial Understanding Tasks | Introduces the RocketScience benchmark to address the challenges of spatial understanding tasks | chain-of-thought | ✅ | |
🔬 Pillar 8: Physics-based Animation (1 paper)
| # | Title | One-Sentence Summary | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 5 | PixFoundation 2.0: Do Video Multi-Modal LLMs Use Motion in Visual Grounding? | PixFoundation 2.0: investigates whether video multimodal LLMs use motion information in visual grounding | spatiotemporal, large language model, visual grounding | ✅ | |