cs.CV (2025-10-22)

📊 9 papers in total | 🔗 5 with code

🎯 Interest Area Navigation

Pillar 2: RL & Architecture (4 🔗3) · Pillar 9: Embodied Foundation Models (3) · Pillar 3: Perception & Semantics (1 🔗1) · Pillar 7: Motion Retargeting (1 🔗1)

🔬 Pillar 2: RL & Architecture (4 papers)

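#	Title	One-line summary	Tags	🔗	⭐
1	MoE-GS: Mixture of Experts for Dynamic Gaussian Splatting	Proposes MoE-GS, which uses a mixture-of-experts model to improve the rendering quality and efficiency of dynamic Gaussian splatting.	distillation 3D gaussian splatting gaussian splatting
2	X-Ego: Acquiring Team-Level Tactical Situational Awareness via Cross-Egocentric Contrastive Video Representation Learning	Proposes X-Ego, a method based on cross-view contrastive learning, for acquiring team-level tactical situational awareness.	representation learning contrastive learning egocentric	✅
3	Rethinking Driving World Model as Synthetic Data Generator for Perception Tasks	Proposes the Dream4Drive framework to improve synthetic data generation for autonomous driving perception tasks.	world model multimodal	✅
4	From Forecasting to Planning: Policy World Model for Collaborative State-Action Prediction	Proposes a Policy World Model that combines world modeling with trajectory planning to improve autonomous driving decision-making.	world model	✅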

🔬 Pillar 9: Embodied Foundation Models (3 papers)

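#	Title	One-line summary	Tags	🔗	⭐
5	PruneHal: Reducing Hallucinations in Multi-modal Large Language Models through Adaptive KV Cache Pruning	Proposes PruneHal, which reduces hallucinations in multimodal large language models through adaptive KV cache pruning.	large language model
6	A Flow Model with Low-Rank Transformers for Incomplete Multimodal Survival Analysis	Proposes a flow model based on low-rank Transformers for incomplete multimodal survival analysis.	multimodal
7	Structured and Abstractive Reasoning on Multi-modal Relational Knowledge Images	Proposes the STAR-64K dataset and a two-stage training framework to improve the structured and abstractive reasoning of multimodal large language models.	large language model chain-of-thought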

🔬 Pillar 3: Perception & Semantics (1 paper)

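#	Title	One-line summary	Tags	🔗	⭐
8	Extreme Views: 3DGS Filter for Novel View Synthesis from Out-of-Distribution Camera Poses	Proposes a gradient-based 3DGS filtering method that addresses novel view synthesis artifacts under extreme viewpoints.	3D gaussian splatting 3DGS gaussian splatting	✅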

🔬 Pillar 7: Motion Retargeting (1 paper)

#	Title	One-line summary	Tags	🔗	⭐
9	FootFormer: Estimating Stability from Visual Input	FootFormer: a cross-modal method for estimating human stability from visual input.	human motion	✅
