ARET: Aggregated Residual Extended Time-Delay Neural Networks for Speaker Verification
2020
Interspeech 2020
The time-delay neural network (TDNN) is widely used in speaker verification to extract long-term temporal features of speakers. Although common TDNN approaches capture time-sequential information well, they lack the delicate transformations needed for deep representation. To solve this problem, we propose two TDNN architectures: RET, which integrates shortcut connections into conventional time-delay blocks, and ARET, which adopts a split-transform-merge strategy to extract a more discriminative representation.
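The two ideas named in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not give layer widths, normalization, context offsets, or the number of branches, so every name and size below (`tdnn_layer`, `ret_block`, `aret_block`, `dim`, `low`, the context `(-2, 0, 2)`, four branches) is a hypothetical choice. The split-transform-merge block is assumed to follow the usual reduce, transform, expand pattern with summed branches, and the code uses plain Python lists so it runs without any third-party library.

```python
import random

def tdnn_layer(frames, weights, context):
    """Time-delay layer: each output frame is a linear map of the input
    frames found at the given context offsets (edges are zero-padded)."""
    T, dim = len(frames), len(frames[0])
    zero = [0.0] * dim
    out = []
    for t in range(T):
        spliced = []  # context frames concatenated into one long vector
        for off in context:
            spliced += frames[t + off] if 0 <= t + off < T else zero
        out.append([sum(w * x for w, x in zip(row, spliced)) for row in weights])
    return out

def relu(frames):
    return [[max(0.0, x) for x in f] for f in frames]

def add(a, b):
    return [[x + y for x, y in zip(fa, fb)] for fa, fb in zip(a, b)]

def init(rows, cols, rng):
    """Small random weight matrix (illustrative initialization only)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def ret_block(frames, w, context):
    """Assumed residual form: y = relu(x + F(x)), F being a time-delay layer."""
    return relu(add(frames, tdnn_layer(frames, w, context)))

def aret_block(frames, branches, context):
    """Assumed split-transform-merge form: every branch reduces the width,
    applies a time-delay transform, and expands back; the branch outputs
    are summed and then added to the shortcut."""
    merged = [[0.0] * len(frames[0]) for _ in frames]
    for w_in, w_mid, w_out in branches:
        h = relu(tdnn_layer(frames, w_in, (0,)))   # split: per-frame reduction
        h = relu(tdnn_layer(h, w_mid, context))    # transform: time-delay layer
        h = tdnn_layer(h, w_out, (0,))             # expand to the input width
        merged = add(merged, h)                    # merge: sum over branches
    return relu(add(frames, merged))

# Usage with hypothetical sizes: 10 frames of 8-dimensional features.
rng = random.Random(0)
dim, low, T, context = 8, 2, 10, (-2, 0, 2)
x = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(T)]
w = init(dim, dim * len(context), rng)
y = ret_block(x, w, context)
branches = [
    (init(low, dim, rng), init(low, low * len(context), rng), init(dim, low, rng))
    for _ in range(4)
]
z = aret_block(x, branches, context)
print(len(y), len(y[0]), len(z), len(z[0]))
```

Both blocks keep the number of frames and the feature width unchanged, which is what lets the shortcut be added element-wise. A practical model would replace the list arithmetic with a tensor library and add normalization, neither of which is specified in this record.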
doi:10.21437/interspeech.2020-1626
dblp:conf/interspeech/ZhangWLWLZJX20
fatcat:by5pzsk46rhbpnuzifvb5sunku