2,792 Hits in 17.5 sec

Image Talk: a real time synthetic talking head using one single image with Chinese text-to-speech capability

Woei-Luen Perng, Yungkang Wu, Ming Ouhyoung
Proceedings Pacific Graphics '98. Sixth Pacific Conference on Computer Graphics and Applications (Cat. No.98EX208)  
Image Talk uses a single image to automatically create talking sequences in real time. The image can be acquired from a photograph, video clip, or hand-drawn characters.  ...  This interactive system accepts Chinese text and talks back in Mandarin Chinese, generating facial expressions in real time.  ...  Combining with a Chinese Text-to-Speech System. Mandarin Chinese Syllables: Chinese is a tonal language.  ... 
doi:10.1109/pccga.1998.732094 dblp:conf/pg/PerngWO98 fatcat:3k4n44rkbvat5mg2t3d5eo72cu

AnyoneNet: Synchronized Speech and Talking Head Generation for Arbitrary Person [article]

Xinsheng Wang, Qicong Xie, Jihua Zhu, Lei Xie, Odette Scharenborg
2021 arXiv   pre-print
In this paper, we present an automatic method to generate synchronized speech and talking-head videos on the basis of text and a single face image of an arbitrary person as input.  ...  Specifically, the proposed method decomposes the generation of synchronized speech and talking head videos into two stages, i.e., a text-to-speech (TTS) stage and a speech-driven talking head generation  ...  ACKNOWLEDGMENT The authors thank Dong Wang et al., who built the CN-Celeb database and provided us with the real speaker identities in this database.  ... 
arXiv:2108.04325v2 fatcat:64vpm5cz7za27ov6ltjl6dutui

Visemes of Chinese Shaanxi Xi'an Dialect Talking Head

2019 Acta Polytechnica Hungarica  
In this study, the objective is to identify articulation features and a dynamic system for visual representation of speech sounds for a Shaanxi Xi'an dialect talking head.  ...  For definition of each uttered viseme the visual information obtained is classified and then used to create the dynamic viseme system of the tongue for a talking head using the Shaanxi Xi'an dialect of  ...  A synthetic talking head using computer animation to illustrate the facial motions of the lips, the jaw, and the tongue was utilized in training in speech perception and production by Beskow et al.  ... 
doi:10.12700/aph.16.5.2019.5.10 fatcat:c2njdrlk5rft5kuspmlrz7zpoq

Head and facial gestures synthesis using PAD model for an expressive talking avatar

Jia Jia, Zhiyong Wu, Shen Zhang, Helen M. Meng, Lianhong Cai
2013 Multimedia tools and applications  
A PAD-driven talking avatar in a text-to-visual-speech system is implemented by generating expressive head motions at the prosodic word level based on the (P, A) descriptors of lexical appraisal, and facial  ...  This paper proposes to synthesize expressive head and facial gestures on a talking avatar using the three-dimensional pleasure-displeasure, arousal-nonarousal and dominance-submissiveness (PAD) descriptors  ...  In this version, we extend our approach to be combined with both head and facial gestures, carry out detailed analysis, and present more performance results.  ... 
doi:10.1007/s11042-013-1604-8 fatcat:7fkruq5jvjfgzoxaoeykudxvcm

Facial Expression Synthesis Based on Emotion Dimensions for Affective Talking Avatar [chapter]

Shen Zhang, Zhiyong Wu, Helen M. Meng, Lianhong Cai
2010 Smart Innovation, Systems and Technologies  
The synthetic emotional facial expression is combined with the talking avatar speech animation in a text-to-audio-visual speech system.  ...  The PAD dimensions are used to capture the high-level emotional state of the talking avatar with a specific facial expression.  ...  facial expression on a single talking avatar.  ... 
doi:10.1007/978-3-642-12604-8_6 fatcat:4bheuwohvfbkrdzzflek4pcjie

Neural Voice Puppetry: Audio-driven Facial Reenactment [article]

Justus Thies, Mohamed Elgharib, Ayush Tewari, Christian Theobalt, Matthias Nießner
2020 arXiv   pre-print
Neural Voice Puppetry has a variety of use-cases, including audio-driven video avatars, video dubbing, and text-driven video synthesis of a talking head.  ...  standard text-to-speech approaches.  ...  To this end, we build on the recent advances in text-to-speech synthesis literature [16, 24] , which is able to provide a synthetic audio stream from a text that can be generated by a digital agent.  ... 
arXiv:1912.05566v2 fatcat:sazmvejurrbu7kadsdjgejc2am

Image-to-Image Translation: Methods and Applications [article]

Yingxue Pang, Jianxin Lin, Tao Qin, Zhibo Chen
2021 arXiv   pre-print
Image-to-image translation (I2I) aims to transfer images from a source domain to a target domain while preserving the content representations.  ...  Additionally, we will elaborate on the effect of I2I on the research and industry community and point out remaining challenges in related fields.  ...  In contrast to the one-shot setting in [164] , [165] that uses a single image from the source domain and a set of images from the target domain, Lin et al. propose TuiGAN [80] to achieve one-shot  ... 
arXiv:2101.08629v2 fatcat:i6pywjwnvnhp3i7cmgza2slnle

Images in Language, Media, and Mind

Roy E. Fox
1995 College composition and communication  
Images can be sensory experiences that exist only for us, with or without the actual stimuli present, or images can refer to actual pictures, from the simplest scrawl on a piece of paper to the ceiling  ...  This book is dedicated to her.  ...  In this book, image (and imaging) refers to any form of mental, pictorial representation, however generic or fleeting.  ...  A single image dramatizes the event.  ... 
doi:10.2307/358338 fatcat:duwc6wzyxjfsblg5ib44hdub5e

Ultra‐high‐speed imaging of bubbles interacting with cells and tissue

Michel Versluis, Philippe Marmottant, Sascha Hilgenfeldt, Claus‐Dieter Ohl, Chien T. Chin, Annemieke van Wamel, Nico de Jong, Detlef Lohse
2006 Journal of the Acoustical Society of America  
CAM algorithm: by slightly compressing or relaxing the body through freehand operation, the strain images are obtained in real time and superimposed on B-mode images with a translucent color scale.  ...  equivalent to that of a single image [Lenz et al., J.  ...  This talk will reminisce a bit about Fred as well as present some results from an ambient noise experiment conducted in 1992 on the continental shelf using the Vertical DIFAR Array co-deployed with MPL's  ... 
doi:10.1121/1.4788217 fatcat:5drbrqyk65cqzff2cmj5ne6n5e

Image splicing detection with local illumination estimation

Yu Fan, Philippe Carre, Christine Fernandez-Maloigne
2015 2015 IEEE International Conference on Image Processing (ICIP)  
doi:10.1109/icip.2015.7351341 dblp:conf/icip/FanCF15 fatcat:7ja5gjnp5rafvedc2nman7xcru

Advances and Challenges in Deep Lip Reading [article]

Marzieh Oghbaie, Arian Sabaghi, Kooshan Hashemifard, Mohammad Akbari
2021 arXiv   pre-print
This paper provides a comprehensive survey of the state-of-the-art deep learning based VSR research with a focus on data challenges, task-specific complications, and the corresponding solutions.  ...  Finally, we introduce some typical VSR application concerns and impediments to real-world scenarios as well as future research directions.  ...  In this technique, which is employed to generate photo-realistic talking head sequences of unspoken words, the text of the utterance is fed to a Text-to-Speech module and then the CMU Pronouncing Dictionary  ... 
arXiv:2110.07879v1 fatcat:eimcuzdz5va3vdlgw2g7y25tki

Generation and Detection of Media Clones

Isao ECHIZEN, Noboru BABAGUCHI, Junichi YAMAGISHI, Naoko NITTA, Yuta NAKASHIMA, Kazuaki NAKAMURA, Kazuhiro KONO, Fuming FANG, Seiko MYOJIN, Zhenzhong KUANG, Huy H. NGUYEN, Ngoc-Dung T. TIEU
2021 IEICE transactions on information and systems  
that are generated using high-quality learning data and are very close to the real thing are causing serious social problems.  ...  from fake information and 2) realization of a protection shield against media clones' attacks by recognizing them. Keywords: media clone, information security, image and speech processing, social media.  ...  Speech-Driven Face Generation [41]: A classic approach for talking head or face generation relies on 3D reconstruction of the target face, modifying it in accordance with the target speech or target face  ... 
doi:10.1587/transinf.2020mui0002 fatcat:2tom7k5wrbhbbiha6g2txlgaqi

Full Issue PDF

2014 SMPTE Motion Imaging Journal  
track of synthetic images, implementing a distributed workflow to connect a chain of post production services, or integrating multichannel storytelling and nonlinear distribution services, the real challenge  ...  Overall content management is extremely complex, as it is not centralized. Depending on the context, a single file can be renamed up to ten times in the process.  ... 
doi:10.5594/j18476 fatcat:yq2uu5zu35b4bjmy3vjotkijma

Looking to listen at the cocktail party

Ariel Ephrat, Inbar Mosseri, Oran Lang, Tali Dekel, Kevin Wilson, Avinatan Hassidim, William T. Freeman, Michael Rubinstein
2018 ACM Transactions on Graphics  
The visual features are used to "focus" the audio on desired speakers in a scene and to improve the speech separation quality.  ...  We present a joint audio-visual model for isolating a single speech signal from a mixture of sounds such as other speakers and background noise.  ...  We also thank Arkady Ziefman for his help with figure design and video editing, and Rachel Soh for helping us procure permissions for video content in our results.  ... 
doi:10.1145/3197517.3201357 fatcat:naturkvlifd7dfvc6vbn5r4fbm

Themenheft zu Heft 5

Unknown, Mediarep.Org, Jörg Schirra
2021 Image  
We also thank Sheelagh Carpendale, Pauline Jepp, Petra Neumann, and Martin Schwarz for fruitful discussions on the topic.  ...  Acknowledgments We would like to thank Alberta Ingenuity (AI) for funding this research.  ...  one time can be portrayed in a single diagram".  ... 
doi:10.25969/mediarep/16743 fatcat:uh3wpotdmjgpjciazcj4cmbjie
Showing results 1 — 15 out of 2,792 results