5 Hits in 6.1 sec

Prosodic-Enhanced Siamese Convolutional Neural Networks for Cross-Device Text-Independent Speaker Verification [article]

Sobhan Soleymani, Ali Dabouei, Seyed Mehdi Iranmanesh, Hadi Kazemi, Jeremy Dawson, Nasser M. Nasrabadi
2018, arXiv, pre-print
In this paper a novel cross-device text-independent speaker verification architecture is proposed.  ...  To compensate for these inherited shortcomings in spectro-temporal features, we propose to enhance the proposed Siamese convolutional neural network architecture by deploying a multilayer perceptron network  ...  ACKNOWLEDGEMENT This work is based upon a work supported by the Center for Identification Technology Research and the National Science Foundation under Grant #1650474.  ... 
arXiv:1808.01026v1 (https://arxiv.org/abs/1808.01026v1), fatcat:4wy6mdgoxfamvb2n3gr7peafp4 (https://fatcat.wiki/release/4wy6mdgoxfamvb2n3gr7peafp4)
Fulltext PDF (Web Archive): https://web.archive.org/web/20200823062554/https://arxiv.org/pdf/1808.01026v1.pdf

SpeakerNet for Cross-lingual Text-Independent Speaker Verification

2020
In this paper, we addressed text independent speaker verification using Siamese convolutional network. Siamese networks are twin networks with shared weights.  ...  Experiments made on cross language audios of multi-lingual speakers confirm the capability of our architecture to handle gender, age and language independent speaker verification.  ...  Conclusion In this paper, we have proposed a model based on special Siamese like network for text-independent speaker verification which uses speech independent feature learning.  ... 
doi:10.24425/aoa.2020.134073 (https://doi.org/10.24425/aoa.2020.134073), fatcat:fbx32dgi7jbr3hsyzzgoycbmlu (https://fatcat.wiki/release/fbx32dgi7jbr3hsyzzgoycbmlu)
Fulltext PDF (Web Archive): https://web.archive.org/web/20210103095524/http://journals.pan.pl/Content/117167/PDF/aoa.2020.134073.pdf

Deep Learning: Methods and Applications

Li Deng
2014, Foundations and Trends® in Signal Processing (Now Publishers)
A specialized neural network architecture, a generalization of "Siamese" network, is used.  ...  I-vectors or (speaker) identity vectors are commonly used for speaker verification and speaker recognition applications, as they encapsulate relevant information about a speaker's identity in a low-dimensional  ... 
doi:10.1561/2000000039 (https://doi.org/10.1561/2000000039), fatcat:vucffxhse5gfhgvt5zphgshjy4 (https://fatcat.wiki/release/vucffxhse5gfhgvt5zphgshjy4)
Fulltext PDF (Web Archive): https://web.archive.org/web/20150425062228/http://research.microsoft.com:80/pubs/219984/DeepLearningBook_RefsByLastFirstNames.pdf

Ego4D: Around the World in 3,000 Hours of Egocentric Video [article]

Kristen Grauman, Andrew Westbury, Eugene Byrne, Zachary Chavis, Antonino Furnari, Rohit Girdhar, Jackson Hamburger, Hao Jiang, Miao Liu, Xingyu Liu, Miguel Martin, Tushar Nagarajan (+73 others)
2022, arXiv, pre-print
Acknowledgements We gratefully acknowledge the following colleagues for valuable discussions and support of our project: Aaron Adcock, Andrew Allen, Behrouz Behmardi, Serge Belongie, Mark Broyles, Xiao  ...  It applies a convolutional neural network on the 2D map to predict the Intersection over Union of each moment candidate and the ground-truth moment. Please see [229] for more details.  ...  b_i), F(v)) (9) The Siamese network projects each proposal / visual-crop feature to a 1024-D feature vector using a convolutional projection module P, $p_b = P(F(b_i))$; $p_v = P(F(v))$ (10) and predicts  ... 
arXiv:2110.07058v3 (https://arxiv.org/abs/2110.07058v3), fatcat:lgh27km63nhcdcpkvbr2qarsru (https://fatcat.wiki/release/lgh27km63nhcdcpkvbr2qarsru)
Fulltext PDF (Web Archive, not primary version): https://web.archive.org/web/20211016001207/https://arxiv.org/pdf/2110.07058v1.pdf

Anti-Spoofing Countermeasures for Automatic Speaker Verification System [article]

Yuanjun Zhao
2019
Generally, deep neural networks (DNNs) are used for extracting discriminative embeddings for each speaker [112, 114] and as part of an end-to-end system for speaker verification [49].  ...  . . . . . . 97 xii List of Acronyms ASI Automatic Speaker Identification ASV Automatic Speaker Verification CNN Convolutional Neural Network CS Compressed Sensing DA Data Augmentation DET  ...  The total number of trainable parameters in the network are increased compared to the multi-label architecture in chapter 6 since more fully connected layers are applied for different tasks.  ... 
doi:10.26182/5ee030f1192eb (https://doi.org/10.26182/5ee030f1192eb), fatcat:vcszrox7fjhstg6fxvpact76de (https://fatcat.wiki/release/vcszrox7fjhstg6fxvpact76de)
Fulltext PDF (Web Archive): https://web.archive.org/web/20200611011928/https://api.research-repository.uwa.edu.au/files/79861203/THESIS_DOCTOR_OF_PHILOSOPHY_ZHAO_Yuanjun_2020.pdf