38,598 Hits in 3.0 sec

Performance Evaluation of Automatic Speaker Recognition Techniques for Forensic Applications [chapter]

Francesco Beritelli, Andrea Spadaccini
2012 · New Trends and Developments in Biometrics (InTech)
…first estimates the characteristics of background noise and then selects the algorithm that performs best in that context [20]. … More broadly, based on the results of the research described in this chapter, it is clear that both the SNR estimation and VAD algorithm selection problems could benefit from an adaptive approach… … In automatic or semi-automatic speaker recognition, background noise is one of the main causes of alteration of the acoustic indexes used in the biometric recognition phase [3, 4]. …
doi:10.5772/52000 · fatcat:g5iotu6n2neavf3uw3scmvanpq

UT-Scope: Towards LVCSR under Lombard effect induced by varying types and levels of noisy background

Hynek Boril, John H. L. Hansen
2011 · 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (IEEE)
The impact of noisy background variations on speech parameters is studied together with the effects on automatic speech recognition. … Adverse environments impact the performance of automatic speech recognition systems in two ways: directly, by introducing acoustic mismatch between the speech signal and acoustic models, and indirectly… … The goal is to analyze speech production variations as a function of type and level of noise, and their impact on automatic speech recognition, with a particular focus on the evaluation of the efficiency of… …
doi:10.1109/icassp.2011.5947347 · dblp:conf/icassp/BorilH11 · fatcat:voex3n73srdnbe5lhbxjbfju5y

Automatic Segmentation of Spontaneous Speech / Segmentação automática da fala espontânea

Brigitte Bigi, Christine Meunier
2018 · Revista de Estudos da Linguagem (Faculdade de Letras da UFMG)
This paper aims to compare read speech and spontaneous speech in order to evaluate the impact of speech style on a speech segmentation task. … They occur in a maximum of 3% of the Inter-Pausal Units of a read speech corpus and from 20% up to 36% of the Inter-Pausal Units in the spontaneous speech corpora. … Most of these aspects can have an impact on speech segmentation for both the grapheme-to-phoneme and the alignment tasks. …
doi:10.17851/2237-2083.26.4.1489-1530 · fatcat:hb5q5z5d4fgt3obeavdoemmaia

Does noise affect learning? A short review on noise effects on cognitive performance in children

Maria Klatte, Kirstin Bergström, Thomas Lachmann
2013 · Frontiers in Psychology (Frontiers Media SA)
Experimental studies addressing the impact of acute exposure showed negative effects on speech perception and listening comprehension. … The impact of chronic exposure to noise was examined in quasi-experimental studies. … The impact of attention capture depends on characteristics of the sound, and on the attentional abilities of the participants. …
doi:10.3389/fpsyg.2013.00578 · pmid:24009598 · pmcid:PMC3757288 · fatcat:5a23lqxalbazvefmaz7amhu4dq

UT-Scope: Speech under Lombard Effect and Cognitive Stress

Ayako Ikeno, Vaishnevi Varadarajan, Sanjay Patil, John H.L. Hansen
2007 · 2007 IEEE Aerospace Conference (IEEE)
A clear demarcation between the effect of noise and the Lombard effect is also given by testing with noisy Lombard speech. … This paper presents the UT-Scope database and an automatic and perceptual evaluation of Lombard speech in in-set speaker recognition. … An in-set speaker ID system is one that identifies whether the speech input belongs to one of the group of speakers defined in the system. …
doi:10.1109/aero.2007.352975 · fatcat:c3ayjopnkjgtroinviirxp4xsa

The impact of the Lombard effect on audio and visual speech recognition systems

Ricard Marxer, Jon Barker, Najwa Alghamdi, Steve Maddock
2018 · Speech Communication (Elsevier BV)
Surprisingly, the Lombard benefit was observed to a small degree even when training on mismatched non-Lombard visual speech, i.e., the increased clarity of the Lombard speech outweighed the impact of the… … Acknowledgements: This research has been funded by the UK Engineering and Physical Sciences Research Council (EPSRC project AV-COGHEAR, EP/M026981/1) and by the Saudi Ministry of Education, King Saud University… …
doi:10.1016/j.specom.2018.04.006 · fatcat:ndebaursqjad3ng4cndsa27uqm

Auditory distraction in open-plan study environments: Effects of background speech and reverberation time on a collaboration task

Ella Braat-Eggen, Marijke Keus v.d. Poll, Maarten Hornikx, Armin Kohlrausch
2019 · Applied Acoustics (Elsevier BV)
Therefore, the aim of this study was to analyze the influence of irrelevant background speech on student collaboration. … On the other hand, the data show an increased perceived disturbance for a longer reverberation time, which we interpret as an increased difficulty of interpersonal communication in the collaboration task… … Acknowledgements: The authors would like to thank Marc Kalee for his contribution to the digital model of the open-plan study environment. …
doi:10.1016/j.apacoust.2019.04.038 · fatcat:jmsiqmnrvnhgjahbt4g2vnbfny

Breaking the Data Barrier: Towards Robust Speech Translation via Adversarial Stability Training [article]

Qiao Cheng, Meiyuan Fang, Yaqian Han, Jin Huang, Yitao Duan
2019 · arXiv (pre-print)
A standard machine translation system is usually trained on a parallel corpus composed of clean text and will perform poorly on text with recognition noise, a gap well known in the speech translation community… … In a pipeline speech translation system, the automatic speech recognition (ASR) system transmits its recognition errors to the downstream machine translation (MT) system. … We report its BLEU score on tst2010 for the speech translation task in Table 2. It is clear that training with generic artificial noise brings only a minor improvement. …
arXiv:1909.11430v3 · fatcat:mzxgb57tung5vixko56btcqgy4

Breaking the Data Barrier: Towards Robust Speech Translation via Adversarial Stability Training

Qiao Cheng, Meiyuan Fan, Yaqian Han, Jin Huang, Yitao Duan
2019 · Zenodo
A standard machine translation system is usually trained on a parallel corpus composed of clean text and will perform poorly on text with recognition noise, a gap well known in the speech translation community… … In a pipeline speech translation system, the automatic speech recognition (ASR) system transmits its recognition errors to the downstream machine translation (MT) system. … We report its BLEU score on tst2010 for the speech translation task in Table 2. It is clear that training with generic artificial noise brings only a minor improvement. …
doi:10.5281/zenodo.3524968 · fatcat:5a6hwed6mvfanjdkppararpyte

Effect of Microphone Position Measurement Error on RIR and its Impact on Speech Intelligibility and Quality

Aditya Raikar, Karan Nathwani, Ashish Panda, Sunil Kumar Kopparapu
2020 · Interspeech 2020 (ISCA)
In the literature, the adverse impact of ambient noise on RIR measurement is mostly explored. … In this paper, we investigate the error in RIR introduced by error in the measurement of the microphone position. We also study the impact of this on the quality and intelligibility of speech. …
doi:10.21437/interspeech.2020-1578 · dblp:conf/interspeech/RaikarNPK20 · fatcat:iizjis2rg5dq7mvo5sg4tcvl6e

Robustness in Speech, Speaker, and Language Recognition: "You've Got to Know Your Limitations"

John H.L. Hansen, Hynek Bořil
2016 · Interspeech 2016 (ISCA)
This study focuses on exploring several challenging domains in formulating effective solutions for realistic speech data, and in particular the notion of using naturalistic data to better reflect the potential… … In the field of speech, speaker, and language recognition, significant gains have been and are being made with new machine learning strategies along with the availability of new and emerging speech corpora. … …not reproduce the effects of noise on speakers. …
doi:10.21437/interspeech.2016-1395 · dblp:conf/interspeech/HansenB16 · fatcat:df755tcbafbihkavm46y2gsidi

Toward the ultimate synthesis/recognition system

S. Furui
1995 · Proceedings of the National Academy of Sciences of the United States of America
The problems for speech recognition include robust recognition against speech variations, adaptation… … The problems for speech synthesis include natural and intelligible voice production, prosody control based on meaning, capability of controlling synthesized voice quality and choosing individual speaking… … Discovery of good methods for representing the dynamics of speech associated with various time lengths is expected to have a substantial impact on the course of speech research. …
doi:10.1073/pnas.92.22.10040 · pmid:7479723 · pmcid:PMC40732 · fatcat:mhteq63onzhcfmnuxkgf2rwtt4

Human Listening and Live Captioning: Multi-Task Training for Speech Enhancement [article]

Sefik Emre Eskimez, Xiaofei Wang, Min Tang, Hemin Yang, Zirun Zhu, Zhuo Chen, Huaming Wang, Takuya Yoshioka
2021 · arXiv (pre-print)
However, most monaural speech enhancement (SE) models introduce processing artifacts and thus degrade the performance of downstream tasks, including automatic speech recognition (ASR). … With the surge of online meetings, it has become more critical than ever to provide high-quality speech audio and live captioning under various noise conditions. … Although these models substantially remove background noise, most of them significantly degrade the performance of downstream tasks such as automatic speech recognition (ASR), as modern commercial… …
arXiv:2106.02896v1 · fatcat:qqsnkym4cvcwtf7b223m6ulxna

Learning with noisy supervision for Spoken Language Understanding

Christian Raymond, Giuseppe Riccardi
2008 · Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (IEEE)
We show that our noise-robust algorithm can improve accuracy by up to 6% (absolute), depending on the noise level and the labeling cost. … Active learning has been successfully applied to automatic speech recognition and utterance classification. … We can see that noise impacts the learner's performance strongly: for comparison, the accuracy upper bound obtained with the clean version of the corpora using the same learner is 95% for ATIS and 92%… …
doi:10.1109/icassp.2008.4518778 · dblp:conf/icassp/RaymondR08 · fatcat:wfijxxw6kzgznmcmk6xyvuykny

Speaker Recognition for Surveillance Application

Eva Kiktova, Jozef Juhar
2015 · Journal of Electrical and Electronics Engineering (Editura Universităţii din Oradea)
In this paper, a comparison and evaluation of speech-based parametrizations and noise elimination techniques is presented for noisy acoustic data. … Our solution is based on the text-independent approach, using Mel-Frequency Cepstral Coefficients and fundamental frequency to extract identity from a voice signal. … Acknowledgements: The presented research was supported by the Ministry of Education, Science, Research and Sport of the Slovak Republic under research project VEGA 1/0075/15 (50%) and Research and Development… …
doaj:b7711165836e406a80217fe8a7fb95e3 · fatcat:j5vlufkeq5ahxl6d3dxxzqivga
Showing results 1 to 15 of 38,598