cs.CV(2025-10-22)

📊 9 papers in total | 🔗 5 with code

🎯 Interest Area Navigation

Pillar 2: RL Algorithms & Architecture (4 🔗3) · Pillar 9: Embodied Foundation Models (3) · Pillar 3: Perception & Semantics (1 🔗1) · Pillar 7: Motion Retargeting (1 🔗1)

🔬 Pillar 2: RL Algorithms & Architecture (4 papers)

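| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 1 | MoE-GS: Mixture of Experts for Dynamic Gaussian Splatting | Proposes MoE-GS, which uses a mixture-of-experts model to improve the rendering quality and efficiency of dynamic Gaussian splatting. | distillation, 3D gaussian splatting, gaussian splatting | |
| 2 | X-Ego: Acquiring Team-Level Tactical Situational Awareness via Cross-Egocentric Contrastive Video Representation Learning | Proposes X-Ego, a method based on cross-view contrastive learning for acquiring team-level tactical situational awareness. | representation learning, contrastive learning, egocentric | |
| 3 | Rethinking Driving World Model as Synthetic Data Generator for Perception Tasks | Proposes the Dream4Drive framework to improve synthetic data generation for autonomous-driving perception tasks. | world model, multimodal | |
| 4 | From Forecasting to Planning: Policy World Model for Collaborative State-Action Prediction | Proposes a Policy World Model that combines world modeling with trajectory planning to improve autonomous-driving decision-making. | world model | |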

🔬 Pillar 9: Embodied Foundation Models (3 papers)

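| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 5 | PruneHal: Reducing Hallucinations in Multi-modal Large Language Models through Adaptive KV Cache Pruning | Proposes PruneHal, which reduces hallucinations in multimodal large language models through adaptive KV cache pruning. | large language model | |
| 6 | A Flow Model with Low-Rank Transformers for Incomplete Multimodal Survival Analysis | Proposes a flow model based on low-rank Transformers for incomplete multimodal survival analysis. | multimodal | |
| 7 | Structured and Abstractive Reasoning on Multi-modal Relational Knowledge Images | Proposes the STAR-64K dataset and a two-stage training framework to improve the structured and abstractive reasoning of multimodal large language models. | large language model, chain-of-thought | |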

🔬 Pillar 3: Perception & Semantics (1 paper)

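| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 8 | Extreme Views: 3DGS Filter for Novel View Synthesis from Out-of-Distribution Camera Poses | Proposes a gradient-based 3DGS filtering method that addresses artifacts in novel view synthesis under extreme viewpoints. | 3D gaussian splatting, 3DGS, gaussian splatting | |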

🔬 Pillar 7: Motion Retargeting (1 paper)

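| # | Title | One-sentence summary | Tags | 🔗 |
|---|---|---|---|---|
| 9 | FootFormer: Estimating Stability from Visual Input | FootFormer: a cross-modal method for estimating human stability from visual input. | human motion | |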
