Punctuation Prediction in Spontaneous Conversations: Can We Mitigate ASR Errors with Retrofitted Word Embeddings? [article]

Łukasz Augustyniak, Piotr Szymański, Mikołaj Morzy, Piotr Żelasko, Adrian Szymczak, Jan Mizgajski, Yishay Carmiel, Najim Dehak
2020 arXiv pre-print
These errors usually take the form of homonyms. We show how retrofitting of the word embeddings on the domain-specific data can mitigate ASR errors.  ...  We record the absolute improvement in punctuation prediction accuracy between 6.2% (for question marks) to 9% (for periods) when compared with the state-of-the-art model.  ...  During punctuation prediction we cannot correct the ASR errors, but the retrofitted representation of words allows us to improve the accuracy of punctuation prediction models.  ... 
arXiv:2004.05985v1 fatcat:thz4mmrviba6hacm3guwlflnxq

Punctuation Prediction in Spontaneous Conversations: Can We Mitigate ASR Errors with Retrofitted Word Embeddings?

Łukasz Augustyniak, Piotr Szymański, Mikołaj Morzy, Piotr Żelasko, Adrian Szymczak, Jan Mizgajski, Yishay Carmiel, Najim Dehak
2020 Interspeech 2020  
We show how retrofitting of the word embeddings on the domain-specific data can mitigate ASR errors.  ...  We record the absolute improvement in punctuation prediction accuracy between 6.2% (for question marks) to 9% (for periods) when compared with the state-of-the-art model.  ...  During punctuation prediction we cannot correct the ASR errors, but the retrofitted representation of words allows us to improve the accuracy of punctuation prediction models.  ... 
doi:10.21437/interspeech.2020-1250 dblp:conf/interspeech/AugustyniakSMZS20 fatcat:zss4hqhdfffqhn3gfr335bxl2u

Joint prediction of truecasing and punctuation for conversational speech in low-resource scenarios [article]

Raghavendra Pappagari, Piotr Żelasko, Agnieszka Mikołajczyk, Piotr Pęzik, Najim Dehak
2021 arXiv pre-print
Further, we show that by training the model in the written text domain and then transfer learning to conversations, we can achieve reasonable performance with less data.  ...  We propose to use a multi-task system that can exploit the relations between casing and punctuation to improve their prediction performance.  ...  The distribution mismatch between text and conversational domains can be mitigated by retrofitting word embeddings to the target domain [17] when GloVe [18] embeddings are used in the model.  ... 
arXiv:2109.06103v1 fatcat:3cve2tmfojht3mzoqprzf5ehxq

Capitalization and punctuation restoration: a survey

Vasile Păiş, Dan Tufiş
2021 Artificial Intelligence Review  
This survey offers an overview of both historical and state-of-the-art techniques for restoring punctuation and correcting word casing.  ...  Additionally, short text messages and micro-blogging platforms offer unreliable and often wrong punctuation and casing.  ...  Punctuation Prediction in Spontaneous Conversations: Can We Mitigate ASR Errors with Retrofitted Word Embeddings? arXiv:2004.05985 [cs.CL].  ... 
doi:10.1007/s10462-021-10051-x fatcat:j4blakzh5rew3iljtytpcnnc4q