Compositional Visual Generation with Composable Diffusion Models
[article] arXiv pre-print, 2023
Large text-guided diffusion models, such as DALL-E 2, are able to generate stunning photorealistic images given natural language descriptions. While such models are highly flexible, they struggle to understand the composition of certain concepts, such as confusing the attributes of different objects or relations between objects. In this paper, we propose an alternative structured approach for compositional generation using diffusion models. An image is generated by composing a set of diffusion models, with each of them modeling a certain component of the image.
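For concreteness, below is a minimal sketch of the kind of score composition the abstract describes, assuming a classifier-free-guidance-style conjunction in which an unconditional noise prediction is shifted by weighted per-concept deviations. The function eps_model, its signature, and the guidance weights are illustrative placeholders, not the paper's actual interface.

    from typing import Optional, Sequence

    import torch


    def eps_model(x: torch.Tensor, t: torch.Tensor, cond: Optional[str]) -> torch.Tensor:
        """Hypothetical stand-in for a trained denoiser epsilon(x, t | cond).

        A real implementation would be a conditional U-Net; a seeded random
        output here just keeps the sketch self-contained and runnable.
        """
        torch.manual_seed(hash(cond) % (2 ** 31))
        return torch.randn_like(x)


    def composed_eps(
        x: torch.Tensor,
        t: torch.Tensor,
        concepts: Sequence[str],
        weights: Sequence[float],
    ) -> torch.Tensor:
        """Conjunction-style composition of per-concept noise predictions:

            eps_hat = eps(x, t) + sum_i w_i * (eps(x, t | c_i) - eps(x, t))

        i.e. each concept contributes a weighted deviation from the
        unconditional prediction, and the deviations are summed.
        """
        eps_uncond = eps_model(x, t, None)
        eps_hat = eps_uncond.clone()
        for concept, w in zip(concepts, weights):
            eps_hat += w * (eps_model(x, t, concept) - eps_uncond)
        return eps_hat


    # Example: one composed noise estimate for an image built from two concepts.
    x = torch.randn(1, 3, 64, 64)  # current noisy image at this diffusion step
    t = torch.tensor([500])        # diffusion timestep
    eps_hat = composed_eps(x, t, ["a red cube", "a blue sphere"], [7.5, 7.5])
    print(eps_hat.shape)           # torch.Size([1, 3, 64, 64])

The composed prediction would then be plugged into an ordinary reverse-diffusion sampling step in place of the single-model prediction; only the noise estimate changes, not the sampler.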
arXiv:2206.01714v6
fatcat:dxo2pvf6hracvfptuk4ezdh7qu