
Auxiliary Multimodal LSTM for Audio-visual Speech Recognition and Lipreading [article]

Chunlin Tian, Weijun Ji
2017 arXiv   pre-print
I would also like to thank my laboratory for providing computational resources.  ...  Acknowledgments I would like to acknowledge my friends Yi Sun, Wenqiang Yang and Peng Xie for helpful support. I would like to thank the developers of Torch [6] and Tensorflow [1].  ...  of DNN can be exploited for varieties of tasks, especially the CNN for image feature extraction.  ...
arXiv:1701.04224v2 fatcat:qezsua73qrewtnweanyojrzv7y

Learning from Videos with Deep Convolutional LSTM Networks [article]

Logan Courtney, Ramavarapu Sreenivas
2019 arXiv   pre-print
We describe our experiments involving convolutional LSTMs for lipreading that demonstrate the model is capable of selectively choosing which spatiotemporal scales are most relevant for a particular dataset  ...  spatiotemporal features existent within the problem.  ...  Spatiotemporal Features Sensitivity Analysis The internal cell states and the hidden states are reset to 0 before processing a new sequence.  ...
arXiv:1904.04817v1 fatcat:ck5bx6dexvhw5ir7xr4ys5auo4
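The convolutional LSTM described in this abstract replaces the LSTM's matrix multiplications with convolutions, so gates and states are feature maps rather than vectors, and (as the snippet notes) states are reset to 0 before each new sequence. A minimal numpy sketch, assuming a single 3×3 kernel computing all four gates; the class and helper names are illustrative, not from the paper:

```python
import numpy as np

def conv2d_same(x, k):
    # x: (C_in, H, W); k: (C_out, C_in, 3, 3); zero-padded "same" convolution
    C_out, C_in, kh, kw = k.shape
    H, W = x.shape[1:]
    xp = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros((C_out, H, W))
    for o in range(C_out):
        for c in range(C_in):
            for i in range(kh):
                for j in range(kw):
                    out[o] += k[o, c, i, j] * xp[c, i:i + H, j:j + W]
    return out

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class ConvLSTMCell:
    """ConvLSTM cell: all four gates are 3x3 convolutions over [x; h]."""
    def __init__(self, c_in, c_hid, rng):
        # one stacked kernel for the i, f, o, g gates
        self.W = rng.normal(0.0, 0.1, (4 * c_hid, c_in + c_hid, 3, 3))
        self.b = np.zeros((4 * c_hid, 1, 1))

    def step(self, x, h, c):
        z = conv2d_same(np.concatenate([x, h], axis=0), self.W) + self.b
        i, f, o, g = np.split(z, 4, axis=0)
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)   # cell state update
        h = sigmoid(o) * np.tanh(c)                    # hidden feature maps
        return h, c

rng = np.random.default_rng(0)
cell = ConvLSTMCell(c_in=1, c_hid=4, rng=rng)
h = np.zeros((4, 8, 8)); c = np.zeros((4, 8, 8))  # states reset to 0 per sequence
for t in range(5):                                # a 5-frame "video" of 8x8 frames
    h, c = cell.step(rng.normal(size=(1, 8, 8)), h, c)
print(h.shape)  # (4, 8, 8)
```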

What accounts for individual differences in susceptibility to the McGurk effect?

Violet A Brown, Maryam Hedayati, Annie Zanger, Sasha Mayn, Lucia Ray, Naseem Dillman-Hasso, Julia F Strand
2018 PLoS ONE  
These results provide support for the claim that a small amount of the variability in susceptibility to the McGurk effect is attributable to lipreading skill.  ...  ., a more fine-grained analysis of lipreading ability), but not to scores on tasks measuring attentional control, processing speed, working memory capacity, or auditory perceptual gradiency.  ...  Acknowledgments Authors' note: We are grateful to Eun Jong Kong and Jan Edwards for providing stimuli for the Visual Analogue Scale task, to Hunter Brown for feedback on an early draft of the paper, and  ... 
doi:10.1371/journal.pone.0207160 pmid:30418995 pmcid:PMC6231656 fatcat:tfcnvc76xvc6biwize7mfunrmq

About Face: Seeing the Talker Improves Spoken Word Recognition but Increases Listening Effort

Violet A. Brown, Julia F. Strand
2019 Journal of Cognition  
Acknowledgements The authors would like to thank Jonathan Peelle for helpful feedback on an earlier draft of this paper and the undergraduate research assistants at Carleton College for helpful conversations  ...  One possible explanation for this finding is that the effort required to process AV speech slows response times to the secondary task, and this slowing allows participants to have more time to accurately  ...  For all mixed effects models reported in this paper, we attempted to utilize the maximal random effects structure justified by the design (Barr, Levy, Scheepers, & Tily, 2013) .  ... 
doi:10.5334/joc.89 pmid:31807726 pmcid:PMC6873894 fatcat:agvmhhy3n5fo7opc42mlkv6y2m

Page 2712 of Psychological Abstracts Vol. 71, Issue 10 [page]

1984 Psychological Abstracts  
An analysis of probable environmental input and of the features' utility in separating already-counted from to-be-counted objects is proposed to account for the relative probabilities that Ss knew that  ...  It is concluded that movement of stimulus features need not account for the extensive recency advantage in remembering lipread  ...

Comparison of Spatiotemporal Networks for Learning Video Related Tasks [article]

Logan Courtney, Ramavarapu Sreenivas
2020 arXiv   pre-print
Many methods for learning from video sequences involve temporally processing 2D CNN features from the individual frames or directly utilizing 3D convolutions within high-performing 2D CNN architectures  ...  spatiotemporal features.  ...  This issue becomes more apparent after viewing recent techniques for lipreading.  ... 
arXiv:2009.07338v1 fatcat:bgc4ixqc6faybfascyj236dae4

Page 1440 of Psychological Abstracts Vol. 42, Issue 9 [page]

1968 Psychological Abstracts  
—The performance of 110 aphasic Ss on 25 subtests of a battery of tests for spoken language was subjected to a dimensional or factorial analysis. 2 factors were found which accounted for 95.9% of the  ...  Slow Learning Child: The Australian Journal on the Education of Backward Children, 1967, 14(2), 117-122.

Tactual display of consonant voicing as a supplement to lipreading

Hanfeng Yuan, Charlotte M. Reed, Nathaniel I. Durlach
2005 Journal of the Acoustical Society of America  
This research is concerned with the development of tactual displays to supplement the information available through lipreading.  ...  A special thank to Andy Brughera for the funny times working together, and to Lorraine Delhorne for her help in speech segmentation.  ...  Chapter 3 describes the method of lipreading, and motivation for pursuing study of the feature of voicing as a supplement to lipreading.  ... 
doi:10.1121/1.1945787 pmid:16158656 fatcat:vd4vbwiyzrb37imlpfa4ptwc6m

Point-Light Facial Displays Enhance Comprehension of Speech in Noise

Lawrence D. Rosenblum, Jennifer A. Johnson, Helena M. Saldaña
1996 Journal of Speech, Language and Hearing Research  
These results have implications for uncovering salient visual speech information as well as the development of telecommunication systems for listeners who are hearing-impaired.  ...  There is little known about which characteristics of the face are useful for enhancing the degraded signal.  ...  For example, additional points on relatively slow-moving articulators might help provide better references against which movement of the more animated points are seen.  ... 
doi:10.1044/jshr.3906.1159 fatcat:64ux7zmnbjad7acee23doqmdu4

A comparison of bound and unbound audio–visual information processing in the human cerebral cortex

Ingrid R Olson, J.Christopher Gatenby, John C Gore
2002 Cognitive Brain Research  
A region-of-interest analysis of the STS and parietal areas found no difference between audio-visual conditions.  ...  However, this analysis found that synchronized audio-visual stimuli led to a higher signal change in the claustrum region.  ...  Frost, and Ainer Mencel for help in stimulus preparation.  ... 
doi:10.1016/s0926-6410(02)00067-8 pmid:12063136 fatcat:gszmefbg6vhnbm3tfzfroiwqoa

A Chinese Lip-Reading System Based on Convolutional Block Attention Module

Yuanyao Lu, Qi Xiao, Haiyang Jiang, Jude Hemanth
2021 Mathematical Problems in Engineering  
We also add the time attention mechanism to the GRU neural network, which helps to extract the features among consecutive lip motion images.  ...  Considering the effects of the moments before and after on the current moment in the lip-reading process, we assign more weights to the key frames, which makes the features more representative.  ...  Experiment Evaluation and Analysis (1) Experiment Evaluation.  ... 
doi:10.1155/2021/6250879 fatcat:qbbvesio6vgdxnu4chuikhalsa
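The time-attention mechanism this abstract describes — assigning more weight to key frames among the per-frame GRU outputs — amounts to scoring each frame, softmax-normalizing the scores, and taking a weighted sum. A minimal numpy sketch; the function name, scoring form, and dimensions are illustrative, not taken from the paper:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def temporal_attention(H, w):
    """H: (T, d) per-frame GRU outputs; w: (d,) learned scoring vector.
    Returns attention weights over frames and the weighted summary vector."""
    scores = H @ w              # one relevance score per frame
    alpha = softmax(scores)     # weights sum to 1; key frames get more mass
    context = alpha @ H         # (d,) weighted combination of frame features
    return alpha, context

rng = np.random.default_rng(1)
H = rng.normal(size=(10, 16))   # 10 lip-motion frames, 16-dim features each
alpha, context = temporal_attention(H, rng.normal(size=16))
print(alpha.sum().round(6), context.shape)  # 1.0 (16,)
```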

Local spatiotemporal descriptors for visual recognition of spoken phrases

Guoying Zhao, Matti Pietikäinen, Abdenour Hadid
2007 Proceedings of the international workshop on Human-centered multimedia - HCM '07  
Spatiotemporal local binary patterns extracted from these regions are used for describing phrase sequences.  ...  Positions of the eyes determined by a robust face and eye detector are used for localizing the mouth regions in face images.  ...  method which uses a non-linear scale-space analysis to form features directly from the pixel intensity.  ... 
doi:10.1145/1290128.1290138 dblp:conf/mm/ZhaoPH07 fatcat:uooot3dn35a7faffo2e34betii
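The spatiotemporal local binary patterns used here extend ordinary LBP by computing codes on planes of the (time, height, width) mouth-region volume and histogramming them. A simplified numpy sketch in the spirit of LBP-TOP, using only one central XY, XT, and YT plane rather than all planes; names and sizes are illustrative, not the paper's exact descriptor:

```python
import numpy as np

def lbp_codes(img):
    """Basic 8-neighbour LBP: each interior pixel becomes an 8-bit code whose
    bits record whether each neighbour is >= the centre pixel."""
    c = img[1:-1, 1:-1]
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = np.zeros(c.shape, dtype=np.uint8)
    for bit, (di, dj) in enumerate(offsets):
        nb = img[1 + di:img.shape[0] - 1 + di, 1 + dj:img.shape[1] - 1 + dj]
        code |= ((nb >= c).astype(np.uint8) << bit)
    return code

def lbp_top_histogram(vol):
    """LBP-TOP-style descriptor: LBP histograms from central XY, XT and YT
    planes of a (T, H, W) mouth-region volume, concatenated."""
    T, H, W = vol.shape
    planes = [vol[T // 2], vol[:, H // 2, :], vol[:, :, W // 2]]
    hists = [np.bincount(lbp_codes(p).ravel(), minlength=256) for p in planes]
    return np.concatenate(hists)   # 3 * 256 bins

rng = np.random.default_rng(2)
vol = rng.integers(0, 256, size=(9, 12, 12))   # 9-frame 12x12 mouth region
print(lbp_top_histogram(vol).shape)  # (768,)
```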

Cued Speech and the Reception of Spoken Language

Gaye H. Nicholls, Daniel Ling (McGill)
1982 Journal of Speech, Language and Hearing Research  
from adding an acoustic signal to lipreading result from the subjects' utilization of time-intensity cues alone.  ...  An analysis of variance comparing the results for syllables, and high and low predictability sentences was carried out.  ...  Speech envelope cues as an acoustic aid to lipreading for profoundly deaf children. J. Acoust. Soc. Amer., 51, Hannah, E.P. Speechreading: some linguistic factors.  ...
doi:10.1044/jshr.2502.262 fatcat:gfvrpitbn5falbb3ieyudst3ny

Visual Speech Recognition with Lightweight Psychologically Motivated Gabor Features

Xuejie Zhang, Yan Xu, Andrew K. Abel, Leslie S. Smith, Roger Watt, Amir Hussain, Chengxiang Gao
2020 Entropy  
One key difference between using these Gabor-based features and using other features such as traditional DCT, or the current fashion for CNN features is that these are human-centric features that can be  ...  features (produced using Gabor-based image patches), can successfully be used for speech recognition with LSTM-based machine learning.  ...  Acknowledgments: The authors would like to thank Erick Purwanto for his contributions, and Cynthia Marquez for her vital assistance.  ... 
doi:10.3390/e22121367 pmid:33279914 fatcat:sqmegoznnjdqdgndddfh4vqelu
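The psychologically motivated Gabor features mentioned above come down to correlating image patches with Gaussian-windowed sinusoids, which respond strongly to stripes at a matched orientation and wavelength. A minimal numpy sketch; parameter values and function names are illustrative, not the paper's actual feature pipeline:

```python
import numpy as np

def gabor_kernel(size=15, wavelength=6.0, theta=0.0, sigma=3.0):
    """Real part of a 2D Gabor filter: a Gaussian-windowed cosine grating."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)    # rotate coordinates by theta
    yr = -x * np.sin(theta) + y * np.cos(theta)
    env = np.exp(-(xr**2 + yr**2) / (2 * sigma**2))
    return env * np.cos(2 * np.pi * xr / wavelength)

def gabor_response(patch, kernel):
    """Scalar response of one image patch to one Gabor filter."""
    return float((patch * kernel).sum())

k = gabor_kernel()
# vertical stripes matching the filter's wavelength vs. a featureless patch
stripes = np.cos(2 * np.pi * (np.arange(15) - 7) / 6.0)[None, :] * np.ones((15, 1))
flat = np.ones((15, 15))
print(gabor_response(stripes, k) > gabor_response(flat, k))  # True
```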

Fast Feature Extraction Approach for Multi-Dimension Feature Space Problems

A. Sagheer, N. Tsuruta, R. Taniguchi, D. Arita, S. Maeda
2006 18th International Conference on Pattern Recognition (ICPR'06)  
Recently, we proposed a fast feature extraction approach, denoted FSOM, that utilizes the Self-Organizing Map (SOM). FSOM [1] overcomes the slowness of the traditional SOM search algorithm.  ...  Again, we show here how FSOM drastically reduces the feature extraction time of traditional SOM while preserving the same SOM qualities.  ...  Figure 1 shows samples of the utilized images.  ...
doi:10.1109/icpr.2006.545 dblp:conf/icpr/SagheerTTAM06 fatcat:7c34tqmdzncxphab3eazriloxy
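The slowness that FSOM addresses is the exhaustive best-matching-unit (BMU) search of a traditional SOM: every lookup compares the input against every node in the map. A minimal numpy sketch of that baseline search (map size, dimensions, and function name are illustrative):

```python
import numpy as np

def bmu_exhaustive(som, x):
    """Exhaustive best-matching-unit search: compares the input against every
    node, the O(nodes) per-lookup cost that FSOM aims to avoid."""
    d = np.linalg.norm(som - x, axis=-1)       # distance to every map node
    return np.unravel_index(np.argmin(d), d.shape)

rng = np.random.default_rng(3)
som = rng.normal(size=(16, 16, 32))            # 16x16 map of 32-dim codebooks
x = som[5, 9] + 0.01 * rng.normal(size=32)     # input near node (5, 9)
i, j = bmu_exhaustive(som, x)
print(i, j)  # 5 9
```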
Showing results 1 — 15 out of 248 results