Clustering Speech Utterances by Speaker Using Eigenvoice-Motivated Vector Space Models

Wei-Ho Tsai, Shih-Sian Cheng, Yi-Hsiang Chao, Hsin-Min Wang
Proceedings. (ICASSP '05). IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005.  
This study investigates the problem of automatically grouping unknown speech utterances by speaker. The proposed method uses the vector space model, originally developed in document-retrieval research, to characterize each utterance as a tf-idf-based vector of acoustic terms, thereby deriving a reliable measure of similarity between utterances. To define the acoustic terms that are most representative of voice characteristics, the eigenvoice approach is applied to the utterances to be clustered, which creates a set of eigenvector-based terms. To further improve speaker-clustering performance, the proposed method incorporates a blind relevance feedback mechanism for refining the inter-utterance similarity measure.
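The pipeline outlined in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say how utterances are represented, how term weights are computed, or how the feedback step works. Here each utterance is assumed to arrive as a fixed-length supervector, term weights are taken as absolute projections onto principal directions, the idf uses a common smoothed form, and the feedback step is a Rocchio-style stand-in. All function names and parameters (`eigen_terms`, `tfidf`, `blind_feedback`, `top_r`, `beta`) are hypothetical.

```python
import numpy as np

def eigen_terms(supervectors, n_terms):
    """Project centered utterance supervectors onto the top principal
    directions of the utterance set (assumed stand-in for eigenvoices).
    Returns non-negative term weights, |projection| (an assumption)."""
    x = np.asarray(supervectors, dtype=float)
    xc = x - x.mean(axis=0, keepdims=True)
    # Rows of vt are the principal directions, strongest first.
    _, _, vt = np.linalg.svd(xc, full_matrices=False)
    return np.abs(xc @ vt[:n_terms].T)

def tfidf(tf, threshold=None):
    """Weight each term by a smoothed inverse document frequency.
    A term counts as present in an utterance when its weight exceeds
    `threshold` (assumed: the global mean weight)."""
    tf = np.asarray(tf, dtype=float)
    n = tf.shape[0]
    if threshold is None:
        threshold = tf.mean()
    df = (tf > threshold).sum(axis=0)
    idf = np.log((n + 1) / (df + 1)) + 1.0
    return tf * idf

def cosine_similarity(v):
    """Pairwise cosine similarity between the rows of v."""
    norms = np.linalg.norm(v, axis=1, keepdims=True)
    unit = v / np.maximum(norms, 1e-12)
    return unit @ unit.T

def blind_feedback(v, top_r=2, beta=0.5):
    """Rocchio-style pseudo-relevance feedback (assumed form): pull each
    utterance vector toward the mean of its top_r nearest neighbours,
    then recompute the similarity matrix."""
    sim = cosine_similarity(v)
    np.fill_diagonal(sim, -np.inf)  # exclude self-matches
    nearest = np.argsort(-sim, axis=1)[:, :top_r]
    refined = v + beta * v[nearest].mean(axis=1)
    return cosine_similarity(refined)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    utterances = rng.normal(size=(6, 10))  # synthetic placeholder data
    vectors = tfidf(eigen_terms(utterances, n_terms=3))
    print(blind_feedback(vectors).round(2))
```

The resulting similarity matrix could then be passed to any clustering routine that accepts pairwise similarities; the abstract does not state which one the authors use.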
doi:10.1109/icassp.2005.1415216 dblp:conf/icassp/TsaiCCW05 fatcat:qwvanpyvijguxl5nbst3a3bdju