Speech sound detection employing deep learning

Cezary Polak, Jakub Mańkowski, Wiktor Uciński, Patryk Schramka, Mikołaj Mysiakowski, Adam Kurowski
2021 Position and Communication Papers of the 16th Conference on Computer Science and Intelligence Systems   unpublished
Speech is the primary means of communication between people, both in everyday conversation and as a speech signal that is transmitted and recorded in numerous ways. The latter is especially important during the global SARS-CoV-2 pandemic, when it is often not possible to meet and talk with people in person. Streaming, VoIP calls, and live podcasts are just some of the many applications that have seen a significant increase in usage due to the necessity of social distancing. In our paper, we provide a method to design, develop, and test a deep learning-based algorithm capable of performing voice activity detection better than benchmark solutions such as the WebRTC VAD algorithm, an industry standard based mainly on a classic approach to speech signal processing.
doi:10.15439/2021f146 fatcat:6slx7n5ytzgvrlpzlq3t7lvj5u