Semantic Layout Manipulation with High-Resolution Sparse Attention
[article]
2022
arXiv
pre-print
To adapt this paradigm for the layout manipulation task, we propose a high-resolution sparse attention module that effectively transfers visual details to new layouts at a resolution up to 512x512. ...
We tackle the problem of semantic image layout manipulation, which aims to manipulate an input image by editing its semantic label map. ...
High-resolution Sparse Attention Image correspondence is often sparse [53], [54], meaning that for an image pair A, B, a pixel A often matches only with a sparse set of pixels from B. ...
arXiv:2012.07288v4
fatcat:3wx7o3z7azgmrigu4o6fygndo4
Controlling Style and Semantics in Weakly-Supervised Image Generation
[article]
2020
arXiv
pre-print
In order to condition our model on textual descriptions, we introduce a semantic attention module whose computational cost is independent of the image resolution. ...
We exploit sparse semantic maps to control object shapes and classes, as well as textual descriptions or attributes to control both local and global style. ...
high-resolution settings. ...
arXiv:1912.03161v2
fatcat:gcl26aptnzhd3mmhnhu5tfrfoa
Fashion Editing with Adversarial Parsing Learning
[article]
2019
arXiv
pre-print
Extensive experiments on high-resolution fashion image datasets demonstrate that the proposed method significantly outperforms the state-of-the-art methods on image manipulation. ...
textures with semantic guidance from the human parsing map. ...
Introduction Fashion image manipulation aims to generate high-resolution realistic fashion images with user-provided sketches and color strokes. It has huge potential value in various applications. ...
arXiv:1906.00884v2
fatcat:3i2kxaal7rh65dh7kw6absgila
Fashion Editing With Adversarial Parsing Learning
2020
2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
Extensive experiments on high-resolution fashion image datasets demonstrate that the proposed FE-GAN significantly outperforms the state-of-the-art methods on fashion image manipulation. ...
textures with semantic guidance from the human parsing map. ...
multi-scale attention normalization layers, which can generate high-resolution realistic edited fashion images. ...
doi:10.1109/cvpr42600.2020.00814
dblp:conf/cvpr/DongLZZSXW020
fatcat:kre4h64vc5dohczeiboc3unlyi
Person-in-Context Synthesis with Compositional Structural Space
[article]
2020
arXiv
pre-print
The context is specified by the bounding box object layout which lacks shape information, while pose of the person(s) by keypoints which are sparsely annotated. ...
Despite significant progress, controlled generation of complex images with interacting people remains difficult. ...
As can be seen we can generate complex images with multiple objects at high resolution and with realistic details. ...
arXiv:2008.12679v1
fatcat:nwksbmdsc5g73b45h7xsdwdthm
Adversarial Text-to-Image Synthesis: A Review
[article]
2021
arXiv
pre-print
However, the field still faces several challenges that require further research efforts such as enabling the generation of high-resolution images with multiple objects, and developing suitable and reliable ...
It is a flexible and intuitive way for conditional image generation with significant progress in the last years regarding visual realism, diversity, and semantic alignment. ...
The generator of PPAN applies a pyramid framework [44, 45] to combine low-resolution, semantically strong features with high-resolution, seman- StackGAN [33] and StackGAN++ [40] architectures. ...
arXiv:2101.09983v1
fatcat:as5i4mk4kndrzpcshlewkbgge4
Learning to Predict Layout-to-image Conditional Convolutions for Semantic Image Synthesis
[article]
2020
arXiv
pre-print
Semantic image synthesis aims at generating photorealistic images from semantic layouts. ...
Previous approaches with conditional generative adversarial networks (GAN) show state-of-the-art performance on this task, which either feed the semantic label maps as inputs to the generator, or use them ...
Conclusion We propose a novel approach (CC-FPSE) for image synthesis from a given semantic layout via better using the semantic layout information to generate images with high-quality details and well ...
arXiv:1910.06809v3
fatcat:cttcvpisnbeufe3dt6myawin3i
A Unified Framework for Biphasic Facial Age Translation with Noisy-Semantic Guided Generative Adversarial Networks
[article]
2021
arXiv
pre-print
ProjectionNet introduces the low-level structural semantic information with noise map and produces soft latent maps. ...
Structurally, we project the class-aware noisy semantic layouts to soft latent maps for the following injection operation on the individual facial parts. ...
SLMNet [25] proposes a high-resolution sparse attention module that effectively transfers visual details to 'new' layouts at a resolution up to 512 × 512. ...
arXiv:2109.07373v1
fatcat:fo424mxpxrbh3dmimvsx5grjwq
InteriorNet: Mega-scale Multi-sensor Photo-realistic Indoor Scenes Dataset
[article]
2018
arXiv
pre-print
Our dataset leverages the availability of millions of professional interior designs and millions of production-level furniture and object assets -- all coming with fine geometric details and high-resolution ...
We render high-resolution and high frame-rate video sequences following realistic trajectories while supporting various camera types as well as providing inertial measurements. ...
Acknowledgements We would like to thank Kujiale.com for providing their database of production furniture models and layouts, as well as access to their GPU/CPU clusters. ...
arXiv:1809.00716v1
fatcat:xcav5mdu25gezjdzulqx252d6y
Video2StyleGAN: Encoding Video in Latent Space for Manipulation
[article]
2022
arXiv
pre-print
Based on the vision transformer, our network reuses the high-resolution portion of the latent vector to enforce temporal consistency. ...
To this end, we propose a novel network to encode face videos into the latent space of StyleGAN for semantic face video manipulation. ...
The lower-resolution layers control the high-level layout/shape of the face, while the higher-resolution layers control the face details. The space of these vectors is denoted as W+ ∈ ℝ^{18×512}. ...
arXiv:2206.13078v1
fatcat:bfbvwxtdf5ftnlq2in4malzov4
Depth-SIMS: Semi-Parametric Image and Depth Synthesis
[article]
2022
arXiv
pre-print
the RGB canvases into high quality RGB images and the sparse depth maps into pixel-wise dense depth maps. ...
In this paper we present a compositing image synthesis method that generates RGB canvases with well aligned segmentation maps and sparse depth maps, coupled with an in-painting network that transforms ...
We then sample locations from the validity mask M_valid^depth, and produce a sparse depth map D_sparse aligned with depth information taken from the depth map D_guide of the guiding semantic layout S ...
arXiv:2203.03405v2
fatcat:d72gw6g2h5f45fikoyfxdw5cwu
Semantic-Preserving Word Clouds by Seam Carving
2011
Computer graphics forum (Print)
With seam carving, we can pack the word cloud compactly and effectively, while preserving its overall semantic structure. ...
Word clouds are proliferating on the Internet and have received much attention in visual analytics. ...
The created word layout is a preliminary and sparse layout for showing the overall semantic relations among all the extracted keywords. ...
doi:10.1111/j.1467-8659.2011.01923.x
fatcat:llkpskliurdydcss5jjxwrw4fi
STEEX: Steering Counterfactual Explanations with Semantics
[article]
2022
arXiv
pre-print
Leveraging recent semantic-to-image models, we propose a new generative counterfactual explanation framework that produces plausible and sparse modifications which preserve the overall scene structure. ...
In this work, we address the problem of producing counterfactual explanations for high-quality images and complex scenes. ...
CelebAMask-HQ contains 30,000 high-quality face portraits with semantic segmentation annotation maps including 19 semantic classes (e.g., skin, mouth, nose, etc.). ...
arXiv:2111.09094v3
fatcat:ecl5powswjdb3fwlqzma6476yu
Spatially Multi-conditional Image Generation
[article]
2022
arXiv
pre-print
Our choice of spatial conditioning, such as by semantics and depth, is driven by the promise it holds for better control of the image generation process. ...
The entire dataset consists of over 4 Million images from around 500 different buildings with high resolution RGB images, segmentation masks and other labels. ...
This model first synthesizes the high-resolution pixels using lightweight and highly parallelizable operators. ASAP performs most computationally expensive image analysis at a very coarse resolution. ...
arXiv:2203.13812v2
fatcat:6xjovlkaerf2vb5dtcki7epet4
Deep Image Synthesis from Intuitive User Input: A Review and Perspectives
[article]
2021
arXiv
pre-print
In many applications of computer graphics, art and design, it is desirable for a user to provide intuitive non-image input, such as text, sketch, stroke, graph or layout, and have a computer system automatically ...
The CelebA-HQ dataset [49] consists of 30,000 high resolution images from the CelebA dataset. ...
To improve synthesis quality, they proposed a Cascaded Refinement Network (CRN), which progressively generates images from low resolution to high resolution (up to 2 megapixels at 1024x2048 pixel resolution ...
arXiv:2107.04240v2
fatcat:ticrsi27nzhozmw7dp7wwja2ni