More Control for Free! Image Synthesis with Semantic Diffusion Guidance [article]

Xihui Liu, Dong Huk Park, Samaneh Azadi, Gong Zhang, Arman Chopikyan, Yuxiao Hu, Humphrey Shi, Anna Rohrbach, Trevor Darrell
<span title="2022-04-14">2022</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
Controllable image synthesis models allow creation of diverse images based on text instructions or guidance from a reference image. Recently, denoising diffusion probabilistic models have been shown to generate more realistic imagery than prior methods, and have been successfully demonstrated in unconditional and class-conditional settings. We investigate fine-grained, continuous control of this model class, and introduce a novel unified framework for semantic diffusion guidance, which allows
more &raquo; ... ther language or image guidance, or both. Guidance is injected into a pretrained unconditional diffusion model using the gradient of image-text or image matching scores. We explore CLIP-based language guidance as well as both content and style-based image guidance in a unified framework. Our text-guided synthesis approach can be applied to datasets without associated text annotations. We conduct experiments on FFHQ and LSUN datasets, and show results on fine-grained text-guided image synthesis, synthesis of images related to a style or content reference image, and examples with both textual and image guidance.
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2112.05744v3">arXiv:2112.05744v3</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/vlb23oi6ybh6hizkulhaqphuo4">fatcat:vlb23oi6ybh6hizkulhaqphuo4</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20211223160736/https://arxiv.org/pdf/2112.05744v2.pdf" title="fulltext PDF download [not primary version]" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <span style="color: #f43e3e;">&#10033;</span> <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/e3/57/e357f175516db266d066724139be05749926e530.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2112.05744v3" title="arxiv.org access"> <button class="ui compact blue labeled icon button serp-button"> <i class="file alternate outline icon"></i> arxiv.org </button> </a>