cs.CV (2025-12-02)
📊 11 papers in total | 🔗 2 with code
🎯 Interest Area Navigation
Pillar 9: Embodied Foundation Models (5)
Pillar 3: Perception & Semantics (4 🔗1)
Pillar 2: RL & Architecture (2 🔗1)
🔬 Pillar 9: Embodied Foundation Models (5 papers)
| # | Title | Key Point | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
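| 1 | Contextual Image Attack: How Visual Context Exposes Multimodal Safety Vulnerabilities | Proposes a contextual image attack method to expose multimodal safety vulnerabilities. | large language model, multimodal | | |
| 2 | See, Think, Learn: A Self-Taught Multimodal Reasoner | Proposes the See-Think-Learn framework, which improves the multimodal reasoning ability of vision-language models through self-training. | multimodal, chain-of-thought | | |
| 3 | WorldMM: Dynamic Multimodal Memory Agent for Long Video Reasoning | Proposes WorldMM, a dynamic multimodal memory agent for long video reasoning. | large language model, multimodal | | |
| 4 | Polar Perspectives: Evaluating 2-D LiDAR Projections for Robust Place Recognition with Visual Foundation Models | Uses visual foundation models to study how the choice of LiDAR projection affects robust place recognition. | foundation model | | |
| 5 | LLM-Guided Material Inference for 3D Point Clouds | Proposes an LLM-guided material inference method that infers material composition from 3D point clouds. | large language model | | |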
🔬 Pillar 3: Perception & Semantics (4 papers)
| # | Title | Key Point | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 6 | Flux4D: Flow-based Unsupervised 4D Reconstruction | Flux4D: flow-based unsupervised 4D reconstruction of large-scale dynamic scenes. | 3D gaussian splatting, 3DGS, gaussian splatting | | |
| 7 | Content-Aware Texturing for Gaussian Splatting | Proposes content-aware texturing for Gaussian splatting, improving rendering quality while reducing the parameter count. | gaussian splatting, splatting | | |
| 8 | SurfFill: Completion of LiDAR Point Clouds via Gaussian Surfel Splatting | SurfFill: completes LiDAR point clouds using Gaussian surfel splatting. | splatting | | |
| 9 | BEVDilation: LiDAR-Centric Multi-Modal Fusion for 3D Object Detection | BEVDilation: a LiDAR-centric multi-modal fusion method for 3D object detection. | depth estimation | ✅ | |
🔬 Pillar 2: RL & Architecture (2 papers)
| # | Title | Key Point | Tags | 🔗 | ⭐ |
|---|---|---|---|---|---|
| 10 | U4D: Uncertainty-Aware 4D World Modeling from LiDAR Sequences | U4D: an uncertainty-aware method for 4D world modeling from LiDAR sequences, aimed at autonomous driving. | world model, embodied AI | | |
| 11 | ReVSeg: Incentivizing the Reasoning Chain for Video Segmentation with Reinforcement Learning | ReVSeg: uses reinforcement learning to incentivize the reasoning chain for video segmentation. | reinforcement learning | ✅ | |