211 Hits in 2.0 sec

E2Style: Improve the Efficiency and Effectiveness of StyleGAN Inversion [article]

Tianyi Wei and Dongdong Chen and Wenbo Zhou and Jing Liao and Weiming Zhang and Lu Yuan and Gang Hua and Nenghai Yu
2022 arXiv   pre-print
The goal of StyleGAN inversion is to find the exact latent code of the given image in the latent space of StyleGAN. This problem has a high demand for quality and efficiency.  ...  In this paper, we present a new feed-forward network "E2Style" for StyleGAN inversion, with significant improvement in terms of efficiency and effectiveness.  ...  a new application of StyleGAN inversion: secure deep hiding.  ... 
arXiv:2104.07661v2 fatcat:fp3ocdqppfgdvnilkfppghrtva
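The snippet above describes the inversion objective: find the latent code whose generated image matches a given target. As a toy illustration only (not E2Style's feed-forward encoder), optimization-based inversion can be sketched with a hypothetical linear stand-in generator:

```python
import numpy as np

# Toy optimization-based GAN inversion: given a fixed "generator" G and a
# target image x, search for the latent code w minimizing ||G(w) - x||^2.
# G is a random linear map purely for demonstration; a real StyleGAN
# generator is a deep nonlinear network.
rng = np.random.default_rng(0)
G = rng.standard_normal((64, 16))   # hypothetical generator: 16-d latent -> 64-d "image"
w_true = rng.standard_normal(16)
x = G @ w_true                      # target "image"

w = np.zeros(16)                    # initial latent guess
lr = 0.002
for _ in range(2000):
    grad = 2 * G.T @ (G @ w - x)    # gradient of the squared reconstruction loss
    w -= lr * grad

recon_error = np.linalg.norm(G @ w - x)
print(f"reconstruction error: {recon_error:.6f}")
```

Per-image optimization like this is slow; encoder-based methods such as E2Style instead train a feed-forward network to predict the latent code in a single pass.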

Overparameterization Improves StyleGAN Inversion [article]

Yohan Poirier-Ginter, Alexandre Lessard, Ryan Smith, Jean-François Lalonde
2022 arXiv   pre-print
In this work, we address this directly and dramatically overparameterize the latent space, before training, with simple changes to the original StyleGAN architecture.  ...  We show that this allows us to obtain near-perfect image reconstruction without the need for encoders nor for altering the latent space after training.  ...  Interpolation quality for selected examples. We compare inversion targeting w+ for the baseline (first row) to inversion targeting W for our method (second row).  ... 
arXiv:2205.06304v1 fatcat:lrcvul44pvehrk37mgmo6ng2gu

AE-StyleGAN: Improved Training of Style-Based Auto-Encoders [article]

Ligong Han, Sri Harsha Musunuri, Martin Renqiang Min, Ruijiang Gao, Yu Tian, Dimitris Metaxas
2021 arXiv   pre-print
We show that our proposed model consistently outperforms baselines in terms of image inversion and generation quality. Supplementary, code, and pretrained models are available on the project website.  ...  In this paper, we focus on style-based generators asking a scientific question: Does forcing such a generator to reconstruct real data lead to a more disentangled latent space and make the inversion process  ...  In the following text, we use AE-StyleGAN with adaptive β for all experiments if not specified. Baselines. We compare our proposed models with ALAE as a competitive baseline.  ... 
arXiv:2110.08718v1 fatcat:dseo36a3xjcg7ab6qo5j75skbq

StyleGAN-XL: Scaling StyleGAN to Large Diverse Datasets [article]

Axel Sauer, Katja Schwarz, Andreas Geiger
2022 arXiv   pre-print
StyleGAN was designed for controllability; hence, prior works suspect its restrictive design to be unsuitable for diverse datasets.  ...  StyleGAN in particular sets new standards for generative modeling regarding image quality and controllability.  ...  We would like to thank Kashyap Chitta, Michael Niemeyer, and Božidar Antić for proofreading. Lastly, we would like to thank Vanessa Sauer for her general support.  ... 
arXiv:2202.00273v2 fatcat:ecxudyms7bf77d7jwwxro362tm

Image GANs meet Differentiable Rendering for Inverse Graphics and Interpretable 3D Neural Rendering [article]

Yuxuan Zhang, Wenzheng Chen, Huan Ling, Jun Gao, Yinan Zhang, Antonio Torralba, Sanja Fidler
2021 arXiv   pre-print
Key to our approach is to exploit GANs as a multi-view data generator to train an inverse graphics network using an off-the-shelf differentiable renderer, and the trained inverse graphics network as a  ...  We further showcase the disentangled GAN as a controllable 3D "neural renderer", complementing traditional graphics renderers.  ...  DISENTANGLING STYLEGAN WITH THE INVERSE GRAPHICS MODEL The inverse graphics model allows us to infer a 3D mesh and texture from a given image.  ... 
arXiv:2010.09125v2 fatcat:bxhd2qnncrgwfdsabvk542wsxa

DatasetGAN: Efficient Labeled Data Factory with Minimal Human Effort [article]

Yuxuan Zhang, Huan Ling, Jun Gao, Kangxue Yin, Jean-Francois Lafleche, Adela Barriuso, Antonio Torralba, Sanja Fidler
2021 arXiv   pre-print
To showcase the power of our approach, we generated datasets for 7 image segmentation tasks which include pixel-level labels for 34 human face parts, and 32 car parts.  ...  We show how the GAN latent code can be decoded to produce a semantic segmentation of the image.  ...  Recently, [55] used StyleGAN [27] as a multi-view image dataset generator for training an inverse graphics network to predict 3D shapes.  ... 
arXiv:2104.06490v2 fatcat:rtw46jinvbegxmdlf53rsvpapi

Toward Spatially Unbiased Generative Models [article]

Jooyoung Choi, Jungbeom Lee, Yonghyun Jeong, Sungroh Yoon
2021 arXiv   pre-print
By learning the spatially unbiased generator, we facilitate the robust use of generators in multiple tasks, such as GAN inversion, multi-scale generation, generation of arbitrary sizes and aspect ratios  ...  Conclusion We introduced a simple method for learning spatially unbiased generative models.  ...  Figures 5–6 (GAN inversion): Baseline StyleGAN fails to reconstruct unaligned face images, whereas ours reconstructs shifted faces. (a) Standard position.  ... 
arXiv:2108.01285v1 fatcat:ws2jy2m3vzaitbswbh2y7d4qjy

LARGE: Latent-Based Regression through GAN Semantics [article]

Yotam Nitzan, Rinon Gal, Ofir Brenner, Daniel Cohen-Or
2021 arXiv   pre-print
We propose a novel method for solving regression tasks using few-shot or weak supervision.  ...  For modern generative frameworks, this semantic encoding manifests as smooth, linear directions which affect image attributes in a disentangled manner.  ...  In their work, Xu et al. train an encoder for GAN Inversion into the latent space of a pretrained StyleGAN and demonstrate that the visual features learned by this encoder can be used to train a variety  ... 
arXiv:2107.11186v1 fatcat:vsvubx2ijnfurkpzhisliuhaci

Identity-Guided Face Generation with Multi-modal Contour Conditions [article]

Qingyan Bai, Weihao Xia, Fei Yin, Yujiu Yang
2021 arXiv   pre-print
The encoder output is iteratively fed into a pre-trained StyleGAN generator until a satisfying result is obtained.  ...  This task is especially suited to scenarios such as tracking known criminals or creating intelligent content for entertainment.  ...  GAN inversion maps an image to a latent code in the latent space of a pretrained GAN model, from which the image can be faithfully reconstructed afterwards.  ... 
arXiv:2110.04854v1 fatcat:zwog2bewzffnrhropumpiffggq

Diagonal Attention and Style-based GAN for Content-Style Disentanglement in Image Generation and Translation [article]

Gihyun Kwon, Jong Chul Ye
2021 arXiv   pre-print
Although StyleGAN can generate content feature vectors from random noises, the resulting spatial content control is primarily intended for minor spatial variations, and the disentanglement of global content  ...  from styles in a hierarchical manner.  ...  While using a simple network architecture, this is a more efficient method as it enables much more powerful control of global content compared to the baseline StyleGAN model (see Fig. 2 (a)).  ... 
arXiv:2103.16146v2 fatcat:42z4wkveavb7rgreucswa744z4

Image-Based CLIP-Guided Essence Transfer [article]

Hila Chefer, Sagie Benaim, Roni Paiss, Lior Wolf
2022 arXiv   pre-print
Our blending operator combines the powerful StyleGAN generator and the semantic encoder of CLIP in a novel way that is simultaneously additive in both latent spaces, resulting in a mechanism that guarantees  ...  The first is based on optimization, while the second fine-tunes an existing inversion encoder to perform essence extraction.  ...  Most methods for StyleGAN inversion focus on the W, W+ latent spaces. The W space is more editable, yet suffers from degraded expressiveness [28] , therefore W+ has been adopted for GAN inversion.  ... 
arXiv:2110.12427v3 fatcat:z6fawah6knfmhlnvelrc2y2y34
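The W vs. W+ distinction mentioned in the snippet above can be sketched numerically. The figures below (18 style-modulated layers, 512-dimensional codes) follow the common StyleGAN2 convention for 1024×1024 outputs and are illustrative assumptions, not taken from the paper:

```python
import numpy as np

n_layers, dim = 18, 512
w = np.random.default_rng(1).standard_normal(dim)

# W space: a single 512-d code shared by every layer of the generator.
w_space = np.tile(w, (n_layers, 1))   # shape (18, 512), all rows identical

# W+ space: an independent code per layer, so an inverted image can use
# different styles at different resolutions (more expressive, but, as the
# snippet notes, generally less editable than W).
w_plus = w_space.copy()
w_plus[0] += 0.1                      # perturb only the coarsest layer's code

print(w_space.shape, w_plus.shape)
```

The extra degrees of freedom in W+ (18×512 vs. 512) are what make it better at faithful reconstruction and worse at transferring edits learned as single directions in W.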

Encoding in Style: a StyleGAN Encoder for Image-to-Image Translation [article]

Elad Richardson, Yuval Alaluf, Or Patashnik, Yotam Nitzan, Yaniv Azar, Stav Shapiro, Daniel Cohen-Or
2021 arXiv   pre-print
Our pSp framework is based on a novel encoder network that directly generates a series of style vectors which are fed into a pretrained StyleGAN generator, forming the extended W+ latent space.  ...  Finally, we demonstrate the potential of our framework on a variety of facial image-to-image translation tasks, even when compared to state-of-the-art solutions designed specifically for a single task,  ...  StyleGAN Inversion We start by evaluating the usage of the pSp framework for StyleGAN Inversion, that is, finding the latent code of real images in the latent domain.  ... 
arXiv:2008.00951v2 fatcat:k6ibmw3sqzempodxbdkab43zaq

High-Fidelity GAN Inversion for Image Attribute Editing [article]

Tengfei Wang, Yong Zhang, Yanbo Fan, Jue Wang, Qifeng Chen
2022 arXiv   pre-print
To improve image fidelity without compromising editability, we propose a distortion consultation approach that employs a distortion map as a reference for high-fidelity reconstruction.  ...  Extensive experiments in the face and car domains show a clear improvement in both inversion and editing quality.  ...  Prior work fine-tuned StyleGAN parameters for each image after predicting an initial latent code, which takes a few minutes per image.  ... 
arXiv:2109.06590v3 fatcat:qppzp4xrzng5pl6pxail22f5wq

TediGAN: Text-Guided Diverse Face Image Generation and Manipulation

Weihao Xia, Yujiu Yang, Jing-Hao Xue, Baoyuan Wu
2021 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)  
The inversion module maps real images to the latent space of a well-trained StyleGAN.  ...  The proposed method consists of three components: StyleGAN inversion module, visual-linguistic similarity learning, and instance-level optimization.  ...  StyleGAN Inversion Module The inversion module aims at training an image encoder that can map a real face image to the latent space of a fixed StyleGAN model pretrained on the FFHQ dataset [16] .  ... 
doi:10.1109/cvpr46437.2021.00229 fatcat:3nh6zuuemndqbp5ldbl4wgiz2e

StyleGAN-Human: A Data-Centric Odyssey of Human Generation [article]

Jianglin Fu, Shikai Li, Yuming Jiang, Kwan-Yee Lin, Chen Qian, Chen Change Loy, Wayne Wu, Ziwei Liu
2022 arXiv   pre-print
Equipped with this large dataset, we rigorously investigate three essential factors in data engineering for StyleGAN-based human generation, namely data size, data distribution, and data alignment.  ...  vanilla StyleGAN. 2) A balanced training set helps improve the generation quality with rare face poses compared to the long-tailed counterpart, whereas simply balancing the clothing texture distribution  ...  We thank Hao Zhu, Zhaoyang Liu and Zhuoqian Yang for their feedback and discussions.  ... 
arXiv:2204.11823v1 fatcat:vsb3zbar7bgrfaefdwurevijrq
Showing results 1 — 15 out of 211 results