A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2020; you can also visit <a rel="external noopener" href="https://arxiv.org/pdf/2012.03411v1.pdf">the original URL</a>. The file type is <code>application/pdf</code>.
MLS: A Large-Scale Multilingual Dataset for Speech Research
[article]
<span title="2020-12-07">2020</span>
<i >
arXiv
</i>
<span class="release-stage" >pre-print</span>
This paper introduces the Multilingual LibriSpeech (MLS) dataset, a large multilingual corpus suitable for speech research. The dataset is derived from read audiobooks from LibriVox and consists of 8 languages, including about 44.5K hours of English and a total of about 6K hours for other languages. Additionally, we provide Language Models (LM) and baseline Automatic Speech Recognition (ASR) models for all the languages in our dataset. We believe such a large transcribed dataset will open new
<span class="external-identifiers">
<a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2012.03411v1">arXiv:2012.03411v1</a>
<a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/krcmqjo2jzatfh6ahrlykqeooi">fatcat:krcmqjo2jzatfh6ahrlykqeooi</a>
</span>
avenues in ASR and Text-To-Speech (TTS) research. The dataset will be made freely available for anyone at http://www.openslr.org.
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20201209055949/https://arxiv.org/pdf/2012.03411v1.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext">
<button class="ui simple right pointing dropdown compact black labeled icon button serp-button">
<i class="icon ia-icon"></i>
Web Archive
[PDF]
<div class="menu fulltext-thumbnail">
<img src="https://blobs.fatcat.wiki/thumbnail/pdf/8f/c6/8fc66ae6f5339e1add726ddca798a5490d508d7d.180px.jpg" alt="fulltext thumbnail" loading="lazy">
</div>
</button>
</a>
<a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2012.03411v1" title="arxiv.org access">
<button class="ui compact blue labeled icon button serp-button">
<i class="file alternate outline icon"></i>
arxiv.org
</button>
</a>