APB2FaceV2: Real-Time Audio-Guided Multi-Face Reenactment [article]

Jiangning Zhang, Xianfang Zeng, Chao Xu, Jun Chen, Yong Liu, Yunliang Jiang
2020 arXiv   pre-print
To solve the above challenge, we propose a novel Real-time Audio-guided Multi-face reenactment approach named APB2FaceV2, which can reenact different target faces among multiple persons with corresponding  ...  Audio-guided face reenactment aims to generate a photorealistic face that has matched facial expression with the input audio.  ...  End-to-end and efficient network • Adaptive Convolution • Audio-guided multi-face reenactment  ...  Fig. 2. Experimental results among multiple persons on  ...
arXiv:2010.13017v1 fatcat:cgglya3urrh7nmbpjqmai37afa

APB2Face: Audio-guided face reenactment with auxiliary pose and blink signals [article]

Jiangning Zhang, Liang Liu, Zhucun Xue, Yong Liu
2020 arXiv   pre-print
Audio-guided face reenactment aims at generating photorealistic faces using audio information while maintaining the same facial movement as when speaking to a real person.  ...  GeometryPredictor uses extra head pose and blink state signals as well as audio to predict the latent landmark geometry information, while FaceReenactor inputs the face landmark image to reenact the photorealistic  ...  Face Reenactment via Audio. Some recent works reenact face by predicting parameters of the predefined face model [9, 10, 11]. Tian et al.  ...
arXiv:2004.14569v1 fatcat:uqogtp4f35avpbqpmo3cqgxxgi

Supplementary Evidence: Towards Higher Levels of Assurance in Remote Identity Proofing [article]

Jongkil Jeong, Syed Wajid Ali Shah, Ashish Nanda, Robin Doss
2022 figshare.com  
following topics: Quality Requirements for Identity Evidence; Strength of Methods Employed for Evidence Validation; Popular approaches for generating Replacement Deepfakes; Popular approaches for generating Reenactment  ...  Reenactment (Pose) Pose-Guided [15] Leverages Couple-Agent Pose-Guided Generative Adversarial Network (CAPG-GAN) for face rotation to synthesize arbitrary view of images.  ...  Reenactment (Pose) Multi-View Face Image Synthesis [16] Leverages 3D aided duet generative adversarial networks (AD-GAN) to rotate input face image to any angle Same as [15] Reenactment (Gaze) User-Specific  ...
doi:10.6084/m9.figshare.19119680.v2 fatcat:ijki7jkshzbrfhk7ufsfuh2ri4

The Creation and Detection of Deepfakes: A Survey [article]

Yisroel Mirsky, Wenke Lee
2020 arXiv   pre-print
Generative deep learning algorithms have progressed to a point where it is difficult to tell the difference between what is real and what is fake.  ...  Finally, a new trend is real-time deepfakes. Works such as [74, 121] have achieved real-time deepfakes at 30fps.  ...  Using Multi-Modal Sources. In [172] the authors propose X2Face which can reenact x_t with x_s or some other modality such as audio or a pose vector.  ...
arXiv:2004.11138v3 fatcat:xqabyslmdfhyznm7msqp3wznnq

Audio-Visual Person-of-Interest DeepFake Detection [article]

Davide Cozzolino, Matthias Nießner, Luisa Verdoliva
2022 arXiv   pre-print
In addition, our method can detect both single-modality (audio-only, video-only) and multi-modality (audio-video) attacks, and is robust to low-quality or corrupted videos by building only on high-level  ...  We leverage a contrastive learning paradigm to learn the moving-face and audio segments embeddings that are most discriminative for each identity.  ...  To generate cloned fake audio a transfer learning-based real-time voice cloning tool (SV2TTS [22]) is used.  ...
arXiv:2204.03083v1 fatcat:76hgejh5jvgjnejf2wdubhr7dm

StyleHEAT: One-Shot High-Resolution Editable Talking Face Generation via Pre-trained StyleGAN [article]

Fei Yin, Yong Zhang, Xiaodong Cun, Mingdeng Cao, Yanbo Fan, Xuan Wang, Qingyan Bai, Baoyuan Wu, Jue Wang, Yujiu Yang
2022 arXiv   pre-print
One-shot talking face generation aims at synthesizing a high-quality talking face video from an arbitrary portrait image, driven by a video or an audio segment.  ...  Our framework elevates the resolution of the synthesized talking face to 1024*1024 for the first time, even though the training dataset has a lower resolution.  ...  Top row: a real face is driven by a real face. Bottom row: a synthetic face is driven by a real face. Real faces are from HDTF [72]. Synthetic faces are sampled from StyleGAN.  ... 
arXiv:2203.04036v2 fatcat:uyo7v5gvefgbnlxzxa5gp7bafy

Face2Face: Real-Time Face Capture and Reenactment of RGB Videos

Justus Thies, Michael Zollhöfer, Marc Stamminger, Christian Theobalt, Matthias Nießner
2016 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)  
Proposed online reenactment setup: a monocular target video sequence (e.g., from Youtube) is reenacted based on the expressions of a source actor who is recorded live with a commodity webcam.  ...  We thank Angela Dai for the video voice over and Daniel Ritchie for video reenactment.  ...  Acknowledgements We would like to thank Chen Cao and Kun Zhou for the blendshape models and comparison data, as well as Volker Blanz, Thomas Vetter, and Oleg Alexander for the provided face data.  ... 
doi:10.1109/cvpr.2016.262 dblp:conf/cvpr/ThiesZSTN16 fatcat:sjn6w57wlfgpnkeuhciotnmtym

Face2Face: Real-time Face Capture and Reenactment of RGB Videos [article]

Justus Thies, Michael Zollhöfer, Marc Stamminger, Christian Theobalt, Matthias Nießner
2020 arXiv   pre-print
We demonstrate our method in a live setup, where Youtube videos are reenacted in real time.  ...  We present Face2Face, a novel approach for real-time facial reenactment of a monocular target video sequence (e.g., Youtube video).  ...  This is a preprint of the accepted version of the following CVPR2016 article: "Face2Face: Real-time Face Capture and Reenactment of RGB Videos".  ... 
arXiv:2007.14808v1 fatcat:crdeml5vjnhabhfnwybiwlhlai

Talking Faces: Audio-to-Video Face Generation [chapter]

Yuxin Wang, Linsen Song, Wayne Wu, Chen Qian, Ran He, Chen Change Loy
2022 Advances in Computer Vision and Pattern Recognition  
Talking face generation aims at synthesizing coherent and realistic face sequences given an input speech.  ...  Despite great research efforts in talking face generation, the problem remains challenging due to the need for fine-grained control of face components and the generalization to arbitrary sentences.  ...  [58] proposed the first real-time face reenactment system by transferring the expression coefficients of a source actor to a target actor while preserving person-specificness. Gecer et al.  ...
doi:10.1007/978-3-030-87664-7_8 fatcat:5qh2bxrthrbthgjwjzlmm3je4i

FaR-GAN for One-Shot Face Reenactment [article]

Hanxiang Hao, Sriram Baireddy, Amy R. Reibman, Edward J. Delp
2020 arXiv   pre-print
This face reenactment process is challenging due to the complex geometry and movement of human faces.  ...  In this paper, we present a one-shot face reenactment model, FaR-GAN, that takes only one face image of any given source identity and a target expression as input, and then produces a face image of the  ...  [27] propose a real-time face reenactment approach based on the 3D morphable face model (3DMM) [1] of the source and target faces.  ... 
arXiv:2005.06402v1 fatcat:adicx22cabhqzizningfm26tnm

Head2Head++: Deep Facial Attributes Re-Targeting [article]

Michail Christos Doukas, Mohammad Rami Koujan, Viktoriia Sharmanska, Anastasios Roussos
2020 arXiv   pre-print
Most importantly, our system performs end-to-end reenactment in nearly real-time speed (18 fps).  ...  We leverage the 3D geometry of faces and Generative Adversarial Networks (GANs) to design a novel deep learning architecture for the task of facial and head reenactment.  ...  During test time, our Head2Head++ pipeline performs head reenactment from web-camera captures in nearly real-time speeds (18 fps).  ... 
arXiv:2006.10199v1 fatcat:ylapnwbkzjes5gejb4zmnou6ym

VR content creation and exploration with deep learning: A survey

Miao Wang, Xu-Quan Lyu, Yi-Jun Li, Fang-Lue Zhang
2020 Computational Visual Media  
Virtual reality (VR) offers an artificial, computer generated simulation of a real life environment.  ...  to generate the reenacted face.  ...  It first converts input audio to a time-varying sparse mouth shape based on RNN and learns the mapping from raw audio features to mouth shape.  ... 
doi:10.1007/s41095-020-0162-z fatcat:lgogzx26bvhn5f7uyefjkz7zny

A comprehensive survey on semantic facial attribute editing using generative adversarial networks [article]

Ahmad Nickabadi, Maryam Saeedi Fard, Nastaran Moradzadeh Farid, Najmeh Mohammadbagheri
2022 arXiv   pre-print
Based on their architectures, the state-of-the-art models are categorized and studied as encoder-decoder, image-to-image, and photo-guided models.  ...  Among different domains, face photos have received a great deal of attention and a large number of face generation and manipulation models have been proposed.  ...  CrossID-GAN [65] is a landmark-guided model to reenact a target image with a driving video to generate a moving video of the target face.  ... 
arXiv:2205.10587v1 fatcat:thpe4crcgndifb5mhtuveww4ji

Towards Realistic Visual Dubbing with Heterogeneous Sources [article]

Tianyi Xie, Liucheng Liao, Cheng Bi, Benlai Tang, Xiang Yin, Jianfei Yang, Mingjie Wang, Jiali Yao, Yang Zhang, Zejun Ma
2022 arXiv   pre-print
In practice, it may be intractable to collect the perfect homologous data in some cases, for example, audio-corrupted or picture-blurry videos.  ...  Albeit moderate improvements in current approaches, they commonly require high-quality homologous data sources of videos and audios, thus causing the failure to leverage heterogeneous data sufficiently  ...  Besides, [31] makes real-time talking head generation possible in the few-shot setting.  ... 
arXiv:2201.06260v1 fatcat:nzenrmsfinbqrnynka5aw7cmce

FACIAL: Synthesizing Dynamic Talking Face with Implicit Attribute Learning [article]

Chenxu Zhang, Yifan Zhao, Yifei Huang, Ming Zeng, Saifeng Ni, Madhukar Budagavi, Xiaohu Guo
2021 arXiv   pre-print
In this paper, we propose a talking face generation method that takes an audio signal as input and a short target video clip as reference, and synthesizes a photo-realistic video of the target face with  ...  To model such complicated relationships among different face attributes with input audio, we propose a FACe Implicit Attribute Learning Generative Adversarial Network (FACIAL-GAN), which integrates the  ...  We further compare our method with the audio-driven facial reenactment methods [28, 29], which first generate the lip area that is in sync with the input audio, and compose it to an original video.  ...
arXiv:2108.07938v1 fatcat:2nodbkvg3rh3fifbfnc2byowjy