| 1 |
3D-Consistent Human Avatars with Sparse Inputs via Gaussian Splatting and Contrastive Learning |
CHASE: 3D-consistent human avatars from sparse inputs, using Gaussian splatting and contrastive learning |
contrastive learning 3D gaussian splatting 3DGS |
|
|
| 2 |
ExpoMamba: Exploiting Frequency SSM Blocks for Efficient and Effective Image Enhancement |
ExpoMamba: uses frequency SSM blocks for efficient image enhancement, addressing low-light and mixed-exposure problems |
Mamba SSM foundation model |
|
|
| 3 |
R2GenCSR: Retrieving Context Samples for Large Language Model based X-ray Medical Report Generation |
R2GenCSR: proposes a context-retrieval-based framework for X-ray medical report generation that improves LLM generation quality. |
Mamba large language model |
✅ |
|
| 4 |
MambaLoc: Efficient Camera Localisation via State Space Model |
MambaLoc: proposes an efficient camera localisation method based on state space models, addressing high training cost and data sparsity. |
Mamba SSM state space model |
|
|
| 5 |
OccMamba: Semantic Occupancy Prediction with State Space Models |
Proposes OccMamba, the first Mamba-based network for semantic occupancy prediction, improving efficiency and accuracy. |
Mamba state space model |
✅ |
|
| 6 |
$R^2$-Mesh: Reinforcement Learning Powered Mesh Reconstruction via Geometry and Appearance Refinement |
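Proposes a reinforcement-learning-based mesh reconstruction method that improves NeRF reconstruction quality through geometry and appearance refinement |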
reinforcement learning NeRF neural radiance field |
|
|
| 7 |
Factorized-Dreamer: Training A High-Quality Video Generator with Limited and Low-Quality Data |
Factorized-Dreamer: trains a high-quality video generator from limited, low-quality data |
dreamer optical flow spatiotemporal |
✅ |
|
| 8 |
P3P: Pseudo-3D Pre-training for Scaling 3D Voxel-based Masked Autoencoders |
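Proposes the P3P framework, which uses pseudo-3D pre-training to scale voxel-based masked autoencoders, improving performance on 3D perception tasks. |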
masked autoencoder MAE depth estimation |
✅ |
|
| 9 |
Multi-Scale Representation Learning for Image Restoration with State-Space Model |
Proposes MS-Mamba, a multi-scale image restoration network based on state-space models, for efficient, high-quality image reconstruction. |
Mamba SSM representation learning |
|
|
| 10 |
CLIP-DPO: Vision-Language Models as a Source of Preference for Fixing Hallucinations in LVLMs |
CLIP-DPO: preference optimization driven by vision-language models to reduce hallucinations in LVLMs |
DPO |
|
|
| 11 |
C${^2}$RL: Content and Context Representation Learning for Gloss-free Sign Language Translation and Retrieval |
Proposes C${^2}$RL for gloss-free sign language translation and retrieval, improving representation learning. |
representation learning |
|
|