PosterOmni: Generalized Artistic Poster Creation via Task Distillation and Unified Reward Feedback

Authors: Sixiang Chen, Jianyu Lai, Jialin Gao, Hengyu Shi, Zhongying Liu, Tian Ye, Junfeng Luo, Xiaoming Wei, Lei Zhu

Category: cs.CV

Published: 2026-02-12

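💡 One-Sentence Summary

PosterOmni: generalized artistic poster creation through task distillation and unified reward feedback

🎯 Matched area: Pillar 2: RL Algorithms and Architecture (RL & Architecture)

Keywords: image-to-poster generation, task distillation, unified reward feedback, local editing, global creation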

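📋 Key Points

Existing image-to-poster methods struggle to preserve local entities while also honoring global design concepts, which limits output quality.
PosterOmni integrates local editing and global creation into a single framework through data distillation and unified reward feedback.
Experiments show that PosterOmni improves reference adherence, composition quality, and aesthetic harmony, outperforming all open-source baselines and several proprietary systems.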

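📝 Abstract (translated from the Chinese)

Image-to-poster generation is a high-demand task that requires not only local adjustments but also a deep understanding of high-level design. A model must generate text, layout, style, and visual elements while maintaining semantic fidelity and aesthetic coherence. The process spans two regimes: local editing, where ID-driven generation, rescaling, filling, and extending must preserve concrete visual entities; and global creation, where layout- and style-driven tasks depend on understanding abstract design concepts. These intertwined demands make image-to-poster a multi-dimensional process that combines entity-preserving editing with concept-driven creation under image-prompt control. To address these challenges, the authors propose PosterOmni, a generalized artistic poster creation framework that unlocks the potential of a base editing model for multi-task image-to-poster generation. PosterOmni integrates local editing and global creation in one system through an efficient data-distillation-reward pipeline: (i) constructing multi-scenario image-to-poster datasets covering six task types, spanning entity-based and concept-based creation; (ii) distilling knowledge between local and global experts for supervised fine-tuning; and (iii) applying unified PosterOmni Reward Feedback to jointly align visual entity preservation and aesthetic preference across all tasks. The authors also establish PosterOmni-Bench, a unified benchmark for evaluating both local editing and global creation. Extensive experiments show that PosterOmni significantly improves reference adherence, global composition quality, and aesthetic harmony, outperforming all open-source baselines and even surpassing several proprietary systems.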

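🔬 Method Details

Problem definition: Image-to-poster generation must handle local editing (such as ID preservation, rescaling, and filling) and global creation (such as layout and style design) at the same time. Existing methods struggle to integrate the two capabilities, so the posters they generate fall short in visual consistency and aesthetic quality. They typically focus on a single task and lack generality.

Core idea: PosterOmni integrates local editing and global creation into one framework through data distillation and unified reward feedback. Knowledge distillation transfers what the local and global experts know into the PosterOmni model, so that a single model can handle both entity-preserving and concept-driven generation. Unified reward feedback then aligns visual entity preservation with aesthetic preference, so that generated posters are both faithful to the reference and aesthetically pleasing.

Technical framework: PosterOmni consists of three main modules: 1) construction of multi-scenario image-to-poster datasets covering six task types; 2) knowledge distillation between local and global experts, used for supervised fine-tuning of the PosterOmni model; 3) unified PosterOmni Reward Feedback, used to jointly optimize visual entity preservation and aesthetic preference. The pipeline runs in that order: build the datasets, distill, then optimize with reward feedback.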
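The first two stages can be pictured as a routing step: each request is sent to the expert for its regime, and the expert's output becomes a supervised fine-tuning target. The sketch below is only an illustration under stated assumptions. The six task identifiers are inferred from the tasks the abstract lists (four local-editing, two global-creation); `TASK_REGIME`, `build_sft_samples`, and the stub experts are hypothetical names, not the authors' code or data format.

```python
from enum import Enum


class Regime(Enum):
    LOCAL_EDITING = "local_editing"      # must preserve concrete visual entities
    GLOBAL_CREATION = "global_creation"  # relies on abstract design concepts


# Assumption: these six names are read off the abstract's task list;
# the paper may name or group its six task types differently.
TASK_REGIME = {
    "id_driven": Regime.LOCAL_EDITING,
    "rescaling": Regime.LOCAL_EDITING,
    "filling": Regime.LOCAL_EDITING,
    "extending": Regime.LOCAL_EDITING,
    "layout_driven": Regime.GLOBAL_CREATION,
    "style_driven": Regime.GLOBAL_CREATION,
}


def build_sft_samples(requests, experts):
    """Route each (task, prompt) pair to its regime's expert.

    `experts` maps a Regime to a callable that returns a target output.
    Returns a list of dicts usable as supervised fine-tuning samples.
    """
    samples = []
    for task, prompt in requests:
        regime = TASK_REGIME[task]        # KeyError on an unknown task name
        target = experts[regime](prompt)  # the expert acts as the teacher
        samples.append({"task": task, "prompt": prompt, "target": target})
    return samples


# Hypothetical stand-ins for the real expert models.
stub_experts = {
    Regime.LOCAL_EDITING: lambda prompt: f"local:{prompt}",
    Regime.GLOBAL_CREATION: lambda prompt: f"global:{prompt}",
}

demo = build_sft_samples(
    [("filling", "fill the masked region"), ("style_driven", "match this style")],
    stub_experts,
)
```

In a real pipeline the experts would be image models and the targets would be images; strings are used here only to keep the sketch runnable without dependencies.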

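Key innovation: PosterOmni's main contribution is a unified framework that handles local editing and global creation together. Through data distillation, the model learns from both local and global experts and can therefore cover the full range of image-to-poster tasks. Unified reward feedback keeps the generated posters on target both visually and aesthetically. In addition, PosterOmni-Bench provides a unified benchmark for evaluating image-to-poster generation models.

Key design: PosterOmni uses a unified reward function that combines two aspects: visual entity preservation and aesthetic preference. Its exact components are not given here; the reward might include an image similarity metric (such as LPIPS) and an aesthetic scorer (such as a CLIP-based aesthetic predictor). Dataset construction is equally important: it must cover diverse scenarios and task types for the model to generalize. The network architecture and parameter settings may not be detailed in the paper and are unknown here.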
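One simple way to combine the two aspects into a single scalar is a normalized weighted sum. The following is a minimal sketch under that assumption, not the paper's reward definition: `RewardWeights`, `unified_reward`, and the equal default weights are hypothetical. Both inputs are assumed to be already normalized to [0, 1] with higher meaning better, so a distance such as LPIPS would first have to be converted into a similarity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RewardWeights:
    # Hypothetical defaults; the summary does not state how terms are weighted.
    entity: float = 0.5
    aesthetic: float = 0.5


def unified_reward(entity_similarity: float,
                   aesthetic_score: float,
                   weights: RewardWeights = RewardWeights()) -> float:
    """Weighted average of an entity-preservation score and an aesthetic score.

    Both scores must lie in [0, 1]; the result also lies in [0, 1].
    """
    for name, value in (("entity_similarity", entity_similarity),
                        ("aesthetic_score", aesthetic_score)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    if weights.entity < 0 or weights.aesthetic < 0:
        raise ValueError("weights must be non-negative")
    total = weights.entity + weights.aesthetic
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    # Dividing by the weight total keeps the reward on the same [0, 1] scale.
    return (weights.entity * entity_similarity
            + weights.aesthetic * aesthetic_score) / total
```

Raising the entity weight would favor local-editing fidelity, while raising the aesthetic weight would favor global design quality; whether the paper uses fixed weights, task-dependent weights, or a learned reward model is not stated in this summary.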

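📊 Experimental Highlights

The experiments show that PosterOmni significantly outperforms all open-source baselines in reference adherence, global composition quality, and aesthetic harmony, and even surpasses several proprietary systems. Specific performance figures are not available here, but the paper emphasizes clear gains across multiple metrics as evidence of the method's effectiveness.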

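🎯 Application Scenarios

PosterOmni has broad application prospects, including advertising design, social media content creation, and movie poster generation. It can help designers produce high-quality posters quickly and work more efficiently. It can also be used for personalized poster generation, tailoring poster content to a user's preferences and needs, which gives it considerable commercial value.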

📄 Abstract (Original)

Image-to-poster generation is a high-demand task requiring not only local adjustments but also high-level design understanding. Models must generate text, layout, style, and visual elements while preserving semantic fidelity and aesthetic coherence. The process spans two regimes: local editing, where ID-driven generation, rescaling, filling, and extending must preserve concrete visual entities; and global creation, where layout- and style-driven tasks rely on understanding abstract design concepts. These intertwined demands make image-to-poster a multi-dimensional process coupling entity-preserving editing with concept-driven creation under image-prompt control. To address these challenges, we propose PosterOmni, a generalized artistic poster creation framework that unlocks the potential of a base edit model for multi-task image-to-poster generation. PosterOmni integrates the two regimes, namely local editing and global creation, within a single system through an efficient data-distillation-reward pipeline: (i) constructing multi-scenario image-to-poster datasets covering six task types across entity-based and concept-based creation; (ii) distilling knowledge between local and global experts for supervised fine-tuning; and (iii) applying unified PosterOmni Reward Feedback to jointly align visual entity-preserving and aesthetic preference across all tasks. Additionally, we establish PosterOmni-Bench, a unified benchmark for evaluating both local editing and global creation. Extensive experiments show that PosterOmni significantly enhances reference adherence, global composition quality, and aesthetic harmony, outperforming all open-source baselines and even surpassing several proprietary systems.
