
Towards Using Context-Dependent Symbols in CTC Without State-Tying Decision Trees [article]

Jan Chorowski, Adrian Lancucki, Bartosz Kostka, Michal Zapotoczny
2019 arXiv   pre-print
The latter typically requires reducing the CD symbol inventory with state-tying decision trees, which have to be transferred from classical GMM-HMM systems.  ...  Deep neural acoustic models benefit from context-dependent (CD) modeling of output symbols. We consider direct training of CTC networks with CD outputs, and identify two issues.  ...  by generating them with an auxiliary Context-Dependent Embedding (CDE) neural network, analogous to a state-tying decision tree.  ... 
arXiv:1901.04379v2 fatcat:gh7ry67moba5pdii3tewy4gwre

Towards Using Context-Dependent Symbols in CTC Without State-Tying Decision Trees

Jan Chorowski, Adrian Łańcucki, Bartosz Kostka, Michał Zapotoczny
2019 Interspeech 2019  
The latter typically requires reducing the CD symbol inventory with state-tying decision trees, which have to be transferred from classical GMM-HMM systems.  ...  Deep neural acoustic models benefit from context-dependent (CD) modeling of output symbols. We consider direct training of CTC networks with CD outputs, and identify two issues.  ...  by generating them with an auxiliary Context-Dependent Embedding (CDE) neural network, analogous to a state-tying decision tree.  ... 
doi:10.21437/interspeech.2019-2720 dblp:conf/interspeech/ChorowskiLKZ19 fatcat:aeo74hwxczcj7cndgkcfifnqu4

CAT: A CTC-CRF based ASR Toolkit Bridging the Hybrid and the End-to-end Approaches towards Data Efficiency and Low Latency [article]

Keyu An, Hongyu Xiang, Zhijian Ou
2020 arXiv   pre-print
In this paper, we present a new open source toolkit for speech recognition, named CAT (CTC-CRF based ASR Toolkit).  ...  Experiments show CAT obtains state-of-the-art results, which are comparable to the fine-tuned hybrid models in Kaldi but with a much simpler training pipeline.  ...  The hybrid approach usually consists of a DNN-HMM based acoustic model (AM), a state-tying decision tree for context-dependent phone modeling, a pronunciation lexicon and a language model (LM), which  ... 
arXiv:2005.13326v2 fatcat:ofhytwbf35ai5da4l7wwzzmutm

CAT: A CTC-CRF Based ASR Toolkit Bridging the Hybrid and the End-to-End Approaches Towards Data Efficiency and Low Latency

Keyu An, Hongyu Xiang, Zhijian Ou
2020 Interspeech 2020  
In this paper, we present a new open source toolkit for speech recognition, named CAT (CTC-CRF based ASR Toolkit).  ...  Experiments show CAT obtains state-of-the-art results, which are comparable to the fine-tuned hybrid models in Kaldi but with a much simpler training pipeline.  ...  The hybrid approach usually consists of a DNN-HMM based acoustic model (AM), a state-tying decision tree for context-dependent phone modeling, a pronunciation lexicon and a language model (LM), which  ... 
doi:10.21437/interspeech.2020-2732 dblp:conf/interspeech/AnXO20 fatcat:5psfvsnn2nefdeb5zbgo2xo3iq

CAT: CRF-based ASR Toolkit [article]

Keyu An, Hongyu Xiang, Zhijian Ou
2019 arXiv   pre-print
A key feature of CAT is discriminative training in the framework of conditional random field (CRF), particularly with connectionist temporal classification (CTC) inspired state topology.  ...  Towards flexibility, we show that i-vector based speaker-adapted recognition and latency control mechanism can be explored easily and effectively in CAT.  ...  , forced alignments, or building state-tying decision trees." 2 https://github.com/thu-spmi/cat simplified pipeline and being data-efficient in the sense that cheaply available language models (LMs) can  ... 
arXiv:1911.08747v1 fatcat:meychp57xjd7bhu3ell4dkd2pa

Multi-Spectral Widefield Microscopy of the Beating Heart Through Post-Acquisition Synchronization and Unmixing

Christian Jaques, Linda Bapst-Wicht, Daniel F. Schorderet, Michael Liebling
2019 2019 IEEE 16th International Symposium on Biomedical Imaging (ISBI 2019)  
However, context-dependent states modeling creates difficulties for multilingual and cross-lingual ASR because of the large increase in context-dependent labels arising from the phone set mismatch.  ...  In the other part of the thesis, we conducted more theoretical analysis of techniques found to be useful in sequential multilingual training.  ...  For a typical ASR task, training the GMM/HMM system involves creating the set of context-dependent states using decision tree based state tying and learning the HMM parameters using the training data.  ... 
doi:10.1109/isbi.2019.8759472 dblp:conf/isbi/JaquesBSL19 fatcat:flypznnglbfrzm3ayf6tsfof34

Towards Consistent Hybrid HMM Acoustic Modeling [article]

Tina Raissi, Eugen Beck, Ralf Schlüter, Hermann Ney
2021 arXiv   pre-print
In this work, we propose a flat-start factored hybrid model trained by modeling the full set of triphone states explicitly without relying on clustering methods.  ...  The same complex pipeline is often utilized in order to generate an alignment for use in frame-wise cross-entropy training.  ...  training of a hybrid model without state-tying.  ... 
arXiv:2104.02387v3 fatcat:k7pnbxi7efgv7djvty5cbhqh6u

Statistical models of morphology predict eye-tracking measures during visual word recognition

Minna Lehtonen, Matti Varjokallio, Henna Kivikari, Annika Hultén, Sami Virpioja, Tero Hakala, Mikko Kurimo, Krista Lagus, Riitta Salmelin
2019 Memory & Cognition  
Thus, we assume that the good performance of such models in global measures such as gaze durations or reaction times in lexical decision largely stems from postlexical reanalysis or decision processes.  ...  More specifically, we studied the predictive power of such models at early vs. late stages of word recognition by using eye-tracking during two tasks.  ...  Lexical decision times were studied in an eye-tracking context in [82] .  ... 
doi:10.3758/s13421-019-00931-7 pmid:31102191 pmcid:PMC6800854 fatcat:lbrhtvfbo5dlhorny4g6iihdei

RWTH OCR: A Large Vocabulary Optical Character Recognition System for Arabic Scripts [chapter]

Philippe Dreuw, David Rybach, Georg Heigold, Hermann Ney
2012 Guide to OCR for Arabic Scripts  
Unlike cursive writing based on the Latin alphabet, the standard Arabic style has substantially different shapes depending on the glyph context. Standard Arabic Unicode character encodings do typically  ...  Evaluation on state-of-the-art systems. Ideally, we directly improve over the best discriminative system, e.g. conventional (i.e., without margin) MMI/MPE for handwriting recognition.  ...  The toolkit supports context dependent modeling of subunits (glyphs for OCR, phones for ASR) using decision trees for HMM state model tying.  ... 
doi:10.1007/978-1-4471-4072-6_9 fatcat:haxdzanqlffztfxvjdbz6x737i

Environmental justice in Cuba

Karen Bell
2011 Critical Social Policy  
environmental decision-making process.  ...  Environmental justice' refers to the human right to a healthy and safe environment, a fair share of natural resources, access to environmental information and participation in environmental decision-making  ...  For us, it does not make sense to chain yourselves to trees and things like that . . .  ... 
doi:10.1177/0261018310396032 fatcat:wnlie2pjqfdf5c7fzmrkdfbqdm

End-to-End Acoustic Modeling using Convolutional Neural Networks for HMM-based Automatic Speech Recognition

Dimitri Palaz, Mathew Magimai-Doss, Ronan Collobert
2019 Speech Communication  
Motivated from these studies, we propose an end-to-end acoustic modeling approach using convolution neural networks (CNNs), where the CNN takes as input raw speech signal and estimates the HMM states class  ...  In hidden Markov model (HMM) based automatic speech recognition (ASR) system, modeling the statistical relationship between the acoustic speech signal and the HMM states that represent linguistically motivated  ...  Michael Liebling for his critical inputs regarding the analysis of the first convolution layer in Section 6.1.3.  ... 
doi:10.1016/j.specom.2019.01.004 fatcat:ch64ijeyzbcrvhje2d4glbwaxe

Unsupervised Spoken Term Discovery on Untranscribed Speech [article]

Man-Ling Sung
2020 arXiv   pre-print
(Part of the abstract) In this thesis, we investigate the use of unsupervised spoken term discovery in tackling this problem.  ...  Unsupervised spoken term discovery aims to discover topic-related terminologies in a speech without knowing the phonetic properties of the language and content.  ...  Illustrated in Figure 5.6, the input to the model covers 3 left-context word clusters (or phrases) and 3 right-context word clusters, without any overlapping with the target word cluster w(n).  ... 
arXiv:2011.14060v1 fatcat:vqxrmzjq35codkddbza6fdd4a4

Social Policies and Institutional Reform in Post-COVID Cuba: A Necessary Agenda [chapter]

Bert Hoffmann
2021 Social Policies and Institutional Reform in Post-COVID Cuba  
Introduction Contemporary Cuba faces severe difficulties on multiple fronts - the Covid pandemic; the persistence of unilateral US sanctions; the foundering of chavismo in Venezuela; the slow exit of the  ...  In the event Fidel Castro stood down as president in 2008, and died in 2016.  ...  conducted as part of two research projects funded by the National Science Centre, Poland, number 2017/25/N/HS3/00315 and number 2019/32/T/HS3/00379, and during a research stay at the Ibero-American Institute in  ... 
doi:10.3224/84742546.01 fatcat:5tnpim67sfcptjd7ad34glz3vq

Continuities in Cuban Revolutionary Politics

Richard R. Fagen
1972 Monthly review  
(Style is used here in its dictionary sense of a "characteristic mode of presentation, construction, or execution.... ") If so, how does an appreciation of this style contribute to an understanding of  ...  In the pages that follow I shall not indulge in detailed description or analysis of what the Cuban revolutionary government has tried to do or how it has gone about trying to do it.  ...  -engineered decisions of the Organization of American States, Cuba has continued to express solidarity with "the people" of the hemisphere.  ... 
doi:10.14452/mr-023-11-1972-04_2 fatcat:rssdwasz3zgr3avwhrj3syo6wm

Deep Spoken Keyword Spotting: An Overview

Ivan Lopez-Espejo, Zheng-Hua Tan, John Hansen, Jesper Jensen
2021 IEEE Access  
TY KWS acoustic modeling.  ...  In this context, filterbank parameters are tuned towards optimizing word posterior generation.  ...  especially useful for personalized, open-vocabulary KWS, by  ... 
doi:10.1109/access.2021.3139508 fatcat:i4pfpfxcpretlkbefp7owtxcti