A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2022.
The file type is application/pdf.
Filler Word Detection and Classification: A Dataset and Benchmark
[article]
2022
arXiv
pre-print
Filler words such as 'uh' or 'um' are sounds or words people use to signal they are pausing to think. Finding and removing filler words from recordings is a common and tedious task in media editing. Automatically detecting and classifying filler words could greatly aid in this task, but few studies have been published on this problem to date. A key reason is the absence of a dataset with annotated filler words for model training and evaluation. In this work, we present a novel speech dataset,
arXiv:2203.15135v2
fatcat:lqd6r3iprraa7pxiqcaa2kiciq