cs.CV(2024-09-29)

📊 12 papers in total | 🔗 5 with code

🎯 Navigation by interest area

Pillar 9: Embodied Foundation Models (4 🔗3) · Pillar 3: Perception & Semantics (3) · Pillar 2: RL & Architecture (3 🔗2) · Pillar 4: Generative Motion (1) · Pillar 1: Robot Control (1)

🔬 Pillar 9: Embodied Foundation Models (4 papers)

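# | Title | One-sentence summary | Tags | 🔗
1 | MedViLaM: A multimodal large language model with advanced generalizability and explainability for medical data understanding and generation | MedViLaM: a multimodal large language model for medical data understanding and generation, with generalizability and explainability. | large language model, multimodal |
2 | T2Vs Meet VLMs: A Scalable Multimodal Dataset for Visual Harmfulness Recognition | Proposes VHD11K, a large-scale multimodal dataset for improving visual harmfulness recognition. | multimodal |
3 | One Token to Seg Them All: Language Instructed Reasoning Segmentation in Videos | VideoLISA: language-instructed reasoning segmentation in videos, with temporally consistent object tracking. | large language model, foundation model, multimodal |
4 | Pear: Pruning and Sharing Adapters in Visual Parameter-Efficient Fine-Tuning | Proposes the Pear framework, which prunes and shares adapters for efficient fine-tuning of pretrained vision models. | foundation model |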

🔬 Pillar 3: Perception & Semantics (3 papers)

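# | Title | One-sentence summary | Tags | 🔗
5 | Grounding 3D Scene Affordance From Egocentric Interactions | Proposes the Ego-SAG framework, which localizes interactable regions in 3D scenes from egocentric interaction videos. | affordance, egocentric |
6 | RNG: Relightable Neural Gaussians | Proposes RNG, a relightable neural rendering method based on 3D Gaussians, suited to objects with complex shapes. | 3D gaussian splatting, 3DGS, gaussian splatting |
7 | Robust Incremental Structure-from-Motion with Hybrid Features | Proposes a robust incremental SfM system based on hybrid features, improving reconstruction in weakly textured scenes and under-constrained conditions. | scene reconstruction |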

🔬 Pillar 2: RL & Architecture (3 papers)

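# | Title | One-sentence summary | Tags | 🔗
8 | See Detail Say Clear: Towards Brain CT Report Generation via Pathological Clue-driven Representation Learning | Proposes PCRL, a pathological clue-driven representation learning model, to improve the quality of brain CT report generation. | representation learning, large language model |
9 | fCOP: Focal Length Estimation from Category-level Object Priors | Proposes fCOP, which uses category-level object priors for monocular focal length estimation. | representation learning, depth estimation, monocular depth |
10 | Hybrid Mamba for Few-Shot Segmentation | Proposes a hybrid Mamba network (HMNet) to address insufficient use of support information in few-shot segmentation. | Mamba |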

🔬 Pillar 4: Generative Motion (1 paper)

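# | Title | One-sentence summary | Tags | 🔗
11 | Text-driven Human Motion Generation with Motion Masked Diffusion Model | Proposes the Motion Masked Diffusion Model (MMDM), which strengthens learning of spatio-temporal relations in text-driven human motion generation. | motion generation, multimodal |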

🔬 Pillar 1: Robot Control (1 paper)

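# | Title | One-sentence summary | Tags | 🔗
12 | Focus On What Matters: Separated Models For Visual-Based RL Generalization | Proposes SMG, which improves visual RL generalization through separated models and a consistency loss. | manipulation, reinforcement learning, representation learning |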
