Multimedia Content Indexing and Retrieval Using Speech and Speaker Recognition

Mahesh Viswanathan, Homayoon S. M. Beigi, Alain Tritschler, Fereydoun Maali
2000 Open research Areas in Information Retrieval  
Several new systems for multimedia information access have been reported in recent years. The system presented here shares many of their aspects but differs in one significant way: it extends multimedia access to include speaker-based information. We have already prototyped such a system and reported it elsewhere; its main features include SVAPI-based speaker recognition combined with speech recognition for joint text- and speaker-based retrieval from audio and ...
A vital component of such a system is speaker identification, whose performance degrades for utterances shorter than eight seconds to such an extent that those segments have to be dismissed with a catch-all, neutral label. Here, we use a speaker clustering technique based on the Bayesian Information Criterion (BIC) to analyze the same audio data. The results of this classifier are combined with those from our SVAPI-based speaker classifier using a decision integration scheme to produce new labels for many of these short speaker segments. We discuss the details of this combined analysis and its results. We additionally report on an on-the-fly speaker enrollment scheme using this BIC-based speaker clustering technique.
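The abstract does not spell out the clustering test or the decision integration scheme, so the following is only a minimal sketch under stated assumptions. It assumes each segment is a matrix of acoustic feature frames modelled by a single full-covariance Gaussian, and it uses the commonly cited delta-BIC comparison to decide whether two segments are better modelled together or apart. The function names, the `penalty_weight` parameter, the `"unknown"` neutral label, and the majority-vote relabelling rule are all hypothetical illustrations, not the authors' method.

```python
from collections import Counter

import numpy as np


def delta_bic(x, y, penalty_weight=1.0):
    """Delta-BIC for modelling two segments separately versus jointly.

    x, y: arrays of shape (n_frames, n_dims). A positive value favours
    two separate Gaussians (suggesting different speakers); a negative
    value favours merging the segments.
    """
    z = np.vstack([x, y])
    n_x, n_y, n = len(x), len(y), len(z)
    d = z.shape[1]

    def logdet_cov(a):
        # Maximum-likelihood covariance, lightly regularised (assumption)
        cov = np.atleast_2d(np.cov(a, rowvar=False, bias=True))
        cov = cov + 1e-6 * np.eye(d)
        return np.linalg.slogdet(cov)[1]

    # Log-likelihood gain from using two models instead of one
    gain = 0.5 * (n * logdet_cov(z) - n_x * logdet_cov(x) - n_y * logdet_cov(y))
    # Penalty for the extra mean vector and covariance matrix
    n_params = d + d * (d + 1) / 2
    penalty = 0.5 * penalty_weight * n_params * np.log(n)
    return gain - penalty


def relabel_short_segments(id_labels, cluster_ids, neutral="unknown"):
    """Hypothetical integration rule: a neutral-labelled segment inherits
    the most common non-neutral label found in its cluster, if any."""
    votes = {}
    for label, cid in zip(id_labels, cluster_ids):
        if label != neutral:
            votes.setdefault(cid, Counter())[label] += 1
    relabelled = []
    for label, cid in zip(id_labels, cluster_ids):
        if label == neutral and cid in votes:
            relabelled.append(votes[cid].most_common(1)[0][0])
        else:
            relabelled.append(label)
    return relabelled
```

In this sketch, short segments that the identifier could not label keep the neutral label unless clustering places them alongside at least one confidently identified segment. A real system would likely also weigh classifier scores rather than rely on a simple vote.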
dblp:conf/riao/ViswanathanBTM00 fatcat:nxhaucpwnrepbam5i4imf6zumm