A deep learning approach for generalized speech animation
2017
ACM Transactions on Graphics
We introduce a simple and effective deep learning approach to automatically generate natural-looking speech animation that synchronizes to input speech. ...
A machine learning approach is used to learn a regression function mapping phoneme labels to speech animation. ...
Scott Jones at Lucasfilm and Hao Li at USC generously provided facial rigs. Thanks to the diverse members of Disney Research Pittsburgh who recorded foreign language speech examples. ...
doi:10.1145/3072959.3073699
fatcat:w42eaqtt4rbudmowb63veaofzq
End-to-end Learning for 3D Facial Animation from Raw Waveforms of Speech
[article]
2017
arXiv
pre-print
We present a deep learning framework for real-time speech-driven 3D facial animation from just raw waveforms. ...
In particular, our deep model is able to learn the latent representations of time-varying contextual information and affective states within the speech. ...
DEEP END-TO-END LEARNING FOR 3D FACE SYNTHESIS FROM SPEECH
arXiv:1710.00920v2
fatcat:bis4z3hys5dxhg2tf3hx24b7eq
Speech-Driven 3D Facial Animation with Implicit Emotional Awareness: A Deep Learning Approach
2017
2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)
We introduce a long short-term memory recurrent neural network (LSTM-RNN) approach for real-time facial animation, which automatically estimates head rotation and facial action unit activations of a speaker ...
Experiments on an evaluation dataset of different speakers across a wide range of affective states demonstrate promising results of our approach in real-time speech-driven facial animation. ...
Conclusion and Future Work This paper presents a deep recurrent learning approach for speech-driven 3D facial animation. ...
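A minimal numerical sketch of this kind of recurrent regression: an LSTM runs over acoustic feature frames and a linear read-out produces per-frame action-unit activations and head-rotation angles. All shapes, names, and the output layout here are illustrative assumptions, not the paper's architecture.

```python
import numpy as np

def lstm_step(x, h, c, W, U, b):
    """One LSTM step; gate pre-activations are stacked as [i, f, g, o]."""
    n = h.size
    z = W @ x + U @ h + b
    sig = lambda v: 1.0 / (1.0 + np.exp(-v))
    i, f, g, o = z[:n], z[n:2*n], z[2*n:3*n], z[3*n:]
    c_new = sig(f) * c + sig(i) * np.tanh(g)   # forget old cell, write new
    h_new = sig(o) * np.tanh(c_new)            # expose gated cell state
    return h_new, c_new

def animate_from_speech(frames, W, U, b, W_out, b_out):
    """Run the LSTM over acoustic frames and regress, per frame, a vector
    of AU activations plus head rotation (hypothetical output layout)."""
    n = U.shape[1]
    h, c = np.zeros(n), np.zeros(n)
    outputs = []
    for x in frames:
        h, c = lstm_step(x, h, c, W, U, b)
        outputs.append(W_out @ h + b_out)
    return np.stack(outputs)
```

The recurrence is what lets each output frame depend on the speech context before it, rather than on the current acoustic frame alone.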
doi:10.1109/cvprw.2017.287
dblp:conf/cvpr/PhamCP17
fatcat:2pgkt24qjfg7vl2iciqujrexte
Facial Modelling and Animation: An Overview of The State-of-The Art
2021
Iraqi Journal for Electrical And Electronic Engineering
... Moving Picture Experts Group-4 (MPEG-4) facial animation, physics-based muscle modeling, performance-driven facial animation, visual speech animation. ...
This paper reviewed the approaches used in facial modeling and animation and described their strengths and weaknesses. ...
For this approach it is difficult to generate a single AU without affecting the other AUs. With the recent rise of deep learning, CNNs have been widely used to extract AU features. Zhao et al ...
doi:10.37917/ijeee.18.1.4
fatcat:yububcsiznam3ozsazq5kn6pmi
A Translation System That Converts English Text to American Sign Language Enhanced with Deep Learning Modules
2019
Volume 8, Issue 10, August 2019, Regular Issue
In doing so, we are able to achieve both the accuracy of a rule-based approach and the scale of a deep learning one. ...
(NLP) and Deep Learning. ...
The enhancement module to this phase is the paraphrase generator which uses deep learning. ...
doi:10.35940/ijitee.l3781.1081219
fatcat:xu7fhimjyneynczptah2t6qmiy
Deep learning approaches for neural decoding: from CNNs to LSTMs and spikes to fMRI
[article]
2020
arXiv
pre-print
The success of deep networks in other domains has led to a new wave of applications in neuroscience. In this article, we review deep learning approaches to neural decoding. ...
Deep learning has been shown to be a useful tool for improving the accuracy and flexibility of neural decoding across a wide range of tasks, and we point out areas for future scientific development. ...
Acknowledgements We would like to thank Ella Batty and Charles Frye for very helpful comments on this manuscript. ...
arXiv:2005.09687v1
fatcat:grboww5ptvah5npbl3xeehbady
VisemeNet: Audio-Driven Animator-Centric Speech Animation
[article]
2018
arXiv
pre-print
We present a novel deep-learning based approach to producing animator-centric speech motion curves that drive a JALI or standard FACS-based production face-rig, directly from input audio. ...
We evaluate our results by: cross-validation to ground-truth data; animator critique and edits; visual comparison to recent deep-learning lip-synchronization solutions; and showing our approach to be resilient ...
We thank Pif Edwards and anonymous reviewers for their valuable feedback. ...
arXiv:1805.09488v1
fatcat:ahgmxkuawjg7hguwc37rs5lezy
Multimodal Speech Driven Facial Shape Animation Using Deep Neural Networks
2018
2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)
In this paper we present a deep learning multimodal approach for speech driven generation of face animations. ...
Training a speaker independent model, capable of generating different emotions of the speaker, is crucial for realistic animations. ...
ACKNOWLEDGMENT We thank NVIDIA for donating a Titan XP through the GPU Grant Program. ...
doi:10.23919/apsipa.2018.8659713
dblp:conf/apsipa/AsadiabadiSE18
fatcat:hsbfi7jaxjcn7loyxdwelhs434
Emotion Dependent Facial Animation from Affective Speech
[article]
2019
arXiv
pre-print
In this paper, we present a two-stage deep learning approach for affective speech driven facial shape animation. In the first stage, we classify affective speech into seven emotion categories. ...
The proposed emotion dependent facial shape model performs better in terms of the Mean Squared Error (MSE) loss and in generating the landmark animations, as compared to training a universal model regardless ...
In our earlier work, [22] , we proposed a deep multi-modal framework, combining the work of [19] using phoneme sequence with spectral speech features to generate facial animations. ...
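The two-stage pipeline this record describes can be sketched as a simple dispatch: a classifier picks one of the seven emotion categories, then the matching emotion-dependent shape model generates the animation. The function and parameter names below are illustrative assumptions, not the paper's API.

```python
def animate_affective_speech(features, classify_emotion, shape_models):
    """Stage 1: classify affective speech into an emotion category.
    Stage 2: apply that category's dedicated facial shape model."""
    emotion = classify_emotion(features)    # one of seven labels, e.g. "happy"
    return shape_models[emotion](features)  # emotion-dependent landmark output
```

Training a separate regression model per emotion and dispatching on the classifier's label is what the snippet contrasts with a single universal model in the MSE comparison.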
arXiv:1908.03904v1
fatcat:olupfm2egreqdge66ez5jye3ay
Joint Learning of Speech-Driven Facial Motion with Bidirectional Long-Short Term Memory
[chapter]
2017
Lecture Notes in Computer Science
The face conveys a blend of verbal and nonverbal information playing an important role in daily interaction. ...
These relationships are ignored when facial movements across the face are separately generated. ...
Speech-Driven Models with Deep Learning Deep learning structures are very powerful at learning complex temporal relationships between modalities; hence, they are a perfect framework for speech-driven models ...
doi:10.1007/978-3-319-67401-8_49
fatcat:6j52e3odbfekbmr6piot2lhy54
Live Speech Portraits: Real-Time Photorealistic Talking-Head Animation
[article]
2021
arXiv
pre-print
The first stage is a deep neural network that extracts deep audio features along with a manifold projection to project the features to the target person's speech space. ...
To the best of our knowledge, ours is the first live system that generates personalized photorealistic talking-head animation driven only by audio signals at over 30 fps. ...
ACKNOWLEDGMENTS We would like to thank Shuaizhen Jing for the help with the TensorRT implementation. We are grateful to Qingqing Tian for the facial capture. ...
arXiv:2109.10595v2
fatcat:s35nqajynjeefcx67k42rpr7r4
Audio-to-Visual Speech Conversion Using Deep Neural Networks
2016
Interspeech 2016
We present a sliding window deep neural network that learns a mapping from a window of acoustic features to a window of visual features from a large audio-visual speech dataset. ...
Overlapping visual predictions are averaged to generate continuous, smoothly varying speech animation. ...
Sliding-Window Deep Neural Network The goal of this work is to learn a model h(x) := y that can predict a realistic facial pose for any audio speech given audio features x that encode the acoustic speech ...
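The overlap-averaging step described in this record can be sketched as follows; the window length, hop size, and feature dimension are illustrative assumptions, not the paper's settings.

```python
import numpy as np

def overlap_average(window_preds, hop):
    """Blend overlapping per-window visual predictions into one smooth
    frame sequence by averaging every prediction covering each frame.

    window_preds: array of shape (n_windows, window_len, feature_dim),
    produced by sliding the network over the audio with stride `hop`.
    """
    n_win, win_len, dim = window_preds.shape
    n_frames = hop * (n_win - 1) + win_len
    acc = np.zeros((n_frames, dim))
    count = np.zeros((n_frames, 1))
    for i, w in enumerate(window_preds):
        start = i * hop
        acc[start:start + win_len] += w     # accumulate this window's frames
        count[start:start + win_len] += 1   # how many windows cover each frame
    return acc / count
```

Averaging the overlapping windows is what turns independent per-window outputs into the continuous, smoothly varying animation the abstract mentions.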
doi:10.21437/interspeech.2016-483
dblp:conf/interspeech/TaylorKMM16
fatcat:y7nb5la3kngtrlq2vkpp7lqkay
Investigating the use of recurrent motion modelling for speech gesture generation
2018
Proceedings of the 18th International Conference on Intelligent Virtual Agents - IVA '18
Machine learning approaches have yielded only marginal success, indicating a high complexity of the speech-to-motion learning task. ...
In this work, we explore the use of transfer learning using previous motion modelling research to improve learning outcomes for gesture generation from speech. ...
As an alternative approach, research has explored methods to automatically generate animation for virtual humans from speech. ...
doi:10.1145/3267851.3267898
dblp:conf/iva/FerstlM18
fatcat:qhbirpnbuzac3mytoqhosfz5pe
Expressive talking avatar synthesis and animation
2015
Multimedia tools and applications
Specific applications may include a virtual storyteller for children, a virtual guide or presenter for personal or commercial websites, a representative of the user in computer games, and a funny puppet for ...
The talking avatar, an animated speaking virtual character with vivid human-like appearance and real or synthetic speech, has gradually shown its potential in applications involving human-computer intelligent ...
Taking advantage of their rich non-linear learning ability, Wu et al. [13] develop a DNN approach for a real-time speech-driven talking avatar. ...
doi:10.1007/s11042-015-2460-5
fatcat:otot2mqdcjbpzoucbbsbbj7nse
DECAR: Deep Clustering for learning general-purpose Audio Representations
[article]
2022
arXiv
pre-print
In this paper, we introduce DECAR (DEep Clustering for learning general-purpose Audio Representations), a self-supervised pre-training approach for learning general-purpose audio representations. ...
, including speech, music, animal sounds, and acoustic scenes. ...
Common general-purpose audio representation learning approaches include [10, 11, 12, 13, 14, 15, 16] . ...
arXiv:2110.08895v3
fatcat:6yszijdh75bdrmjf2lntsyqlrq
Showing results 1 — 15 out of 74,907 results