Improving One-Shot Learning through Fusing Side Information [article]

Yao-Hung Hubert Tsai, Ruslan Salakhutdinov
2018 arXiv   pre-print
We introduce two statistical approaches for fusing side information into data representation learning to improve one-shot learning.  ...  We empirically show that our learning architecture improves over traditional softmax regression networks as well as state-of-the-art attentional regression networks on one-shot recognition tasks.  ...  Supplementary for Improving One-Shot Learning through Fusing Side Information Yao-Hung Hubert Tsai † Ruslan Salakhutdinov † † School of Computer Science, Machine Learning Department, Carnegie Mellon University  ... 
arXiv:1710.08347v2 fatcat:cm5chbpui5fc7kfxl5ffcjnncm

MDFM: Multi-Decision Fusing Model for Few-Shot Learning [article]

Shuai Shao, Lei Xing, Rui Xu, Weifeng Liu, Yan-Jiang Wang, Bao-Di Liu
2021 arXiv   pre-print
In recent years, researchers pay growing attention to the few-shot learning (FSL) task to address the data-scarce problem. A standard FSL framework is composed of two components: i) Pre-train.  ...  We evaluate the proposed method on five benchmark datasets and achieve significant improvements of 3.4%-7.3% compared with state-of-the-arts.  ...  From this figure, we can see that the performance of fusing-view improves significantly compared with the single-view, especially on the 1-shot case.  ... 
arXiv:2112.00690v2 fatcat:4pwnwzdgrndpnevgyuijwzbdhq

Deep Double-Side Learning Ensemble Model for Few-Shot Parkinson Speech Recognition [article]

Yongming Li, Lang Zhou, Lingyun Qin, Yuwei Zeng, Yuchuan Liu, Yan Lei, Pin Wang, Fan Li
2020 arXiv   pre-print
Finally, the bagging ensemble learning mode is adopted to fuse the deep feature learning algorithm and the deep samples learning algorithm together, thereby constructing a deep double-side learning ensemble  ...  extraction, it suffers from few-shot learning problem.  ...  learning mode (BEM), a deep double-side learning ensemble model (DDSLEM) is constructed by combining EGSAE with DSL, which is helpful to improving the accuracy of few-shot PD speech recognition.  ... 
arXiv:2006.11593v1 fatcat:qn33ijpmincjtelkvwqlg2jjvu

Ad-Net: Audio-Visual Convolutional Neural Network for Advertisement Detection In Videos [article]

Shervin Minaee, Imed Bouazizi, Prakash Kolan, Hossein Najafzadeh
2018 arXiv   pre-print
We propose a two-stream audio-visual convolutional neural network, that one branch analyzes the visual information and the other one analyzes the audio information, and then the audio and visual embedding  ...  This network is trained on a dataset of more than 50k regular video and commercial shots, and achieved much better performance compared to the models based on hand-crafted features.  ...  One main difference of our framework with previous deep learning approaches toward video processing, is that we rely on both the visual and audio information of the video.  ... 
arXiv:1806.08612v1 fatcat:dsdbrrhqrzauddyfo3jv3fkkwe

Task-wise attention guided part complementary learning for few-shot image classification

Gong Cheng, Ruimin Li, Chunbo Lang, Junwei Han
2021 Science China Information Sciences  
Sun et al. [31] presented a new few-shot learning approach named meta-transfer learning (MTL) which can learn to adapt a deep neural network to few-shot learning tasks.  ...  In addition to optimizing the network classifier, how to capture specific but discriminative information under different task requirements is equally vital for the few-shot learning scenarios.  ...  dataset is improved by 2.44% (1-shot) and 0.96% (5-shot), and that on CUB dataset is improved by 1.69% and 0.6%, respectively.  ... 
doi:10.1007/s11432-020-3156-7 fatcat:hl6evdxcqraflnkpbvjtzrqoju

Zero-Shot Visual Recognition via Semantic Attention-based Compare Network

Fudong Nian, Yikun Sheng, Junfeng Wang, Teng Li
2020 IEEE Access  
Meanwhile, to build the knowledge bridge for images from two disjoint label spaces, the side information (attributes are the most popular semantic representation form of the side information) of both seen  ...  Early works [5] [6] [7] [8] of zero-shot visual recognition utilize the attributes as side information and infer the class of the test image via a two-stage approach.  ... 
doi:10.1109/access.2020.2971174 fatcat:4ek3j2f4jva4bnn3cl5yt2szwq

Multi-task deep visual-semantic embedding for video thumbnail selection

Wu Liu, Tao Mei, Yongdong Zhang, Cherry Che, Jiebo Luo
2015 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)  
In particular, we train the embedding model by exploring the large-scale and freely accessible click-through video and image data, as well as employing a multi-task learning strategy to holistically exploit  ...  In this paper, we have developed a multi-task deep visual-semantic embedding model, which can automatically select query-dependent video thumbnails according to both visual and side information.  ...  This side information is important for thumbnail selection and often overlooked in previous research.  ... 
doi:10.1109/cvpr.2015.7298994 dblp:conf/cvpr/LiuMZCL15 fatcat:74q724lavzgvrannwyvkiuhekq

Complementary Attributes: A New Clue to Zero-Shot Learning [article]

Xiaofeng Xu, Ivor W. Tsang, Chuancai Liu
2019 arXiv   pre-print
Zero-shot learning (ZSL) aims to recognize unseen objects using disjoint seen objects via sharing attributes.  ...  Extensive experiments on five ZSL benchmark datasets and the large-scale ImageNet dataset demonstrate that the proposed complementary attributes and rank aggregation can significantly and robustly improve  ...  So as future work, we plan to fuse multi-sources assistant information to increase the discriminative power of current zero-shot learning models.  ... 
arXiv:1804.06505v2 fatcat:gmot6ih2ofbcpc4e2y6xt3cy2q

Deep Edge Computing for Videos

Jun-Hwa Kim, Namho Kim, Chee Sun Won
2021 IEEE Access  
Therefore, we use only one of them and share it for learning both spatial and temporal information.  ...  The first stream of 2D CNN is for learning the spatial information via a single frame chosen from the video shot.  ... 
doi:10.1109/access.2021.3109904 fatcat:ctfwuywyfjh2rg74vxqkp2zf44

FFESSD: An Accurate and Efficient Single-Shot Detector for Target Detection

Wenxu Shi, Shengli Bao, Dailun Tan
2019 Applied Sciences  
The Single Shot MultiBox Detector (SSD) is one of the fastest algorithms in the current target detection field.  ...  On extended experiment, the performance of FFESSD in fuzzy target detection was better than the conventional SSD.  ...  Through the deconvolution layer, large context information is propagated to a feature  ... 
doi:10.3390/app9204276 fatcat:u2rlgz7fn5hqnaykavufoi42om

ProtoTransformer: A Meta-Learning Approach to Providing Student Feedback [article]

Mike Wu, Noah Goodman, Chris Piech, Chelsea Finn
2021 arXiv   pre-print
Because data for meta-training is limited, we propose a number of amendments to the typical few-shot learning framework, including task augmentation to create synthetic tasks, and additional side information  ...  On a suite of few-shot natural language processing tasks, we match or outperform state-of-the-art performance.  ...  A question remains on how to incorporate side information vectors g φ (z) into the embedding function f θ so that the model can fuse information from the support set and the side information together.  ... 
arXiv:2107.14035v2 fatcat:4x2ojsdqija4zn4n4yvqlajlby

Learning to Rank Intents in Voice Assistants [article]

Raviteja Anantha, Srinivas Chappidi, William Dawoodi
2020 arXiv   pre-print
Finally, we evaluate the robustness of our algorithm on the intent ranking task and show our algorithm improves the robustness by 33.3%.  ...  Furthermore we present a Multisource Denoising Autoencoder based pretraining that is capable of learning fused representations of data from multiple sources.  ...  One approach is to use a convolutional deep structured semantic model (CDSSM), which performs zero-shot learning by jointly learning the representations for user intents and associated utterances [12]  ... 
arXiv:2005.00119v2 fatcat:2hoj32tfmbc5fm7ehdsn3y7r6u

Scaling Human-Object Interaction Recognition in the Video through Zero-Shot Learning

Vali Ollah Maraghi, Karim Faez, Miguel Cazorla
2021 Computational Intelligence and Neuroscience  
We propose an approach for scaling human-object interaction recognition in video data through the zero-shot learning technique to solve this problem.  ...  The lateral information comes from word embedding techniques.  ...  We focus on this problem and try to solve it through the zero-shot learning approach. Zero-Shot Learning. Zero-shot learning is an exciting approach in different areas [42] [43] [44] [45].  ... 
doi:10.1155/2021/9922697 fatcat:b6a73bphufcbzjssfyjocs4m4i

Rectifying the Shortcut Learning of Background for Few-Shot Learning [article]

Xu Luo, Longhui Wei, Liangjian Wen, Jinrong Yang, Lingxi Xie, Zenglin Xu, Qi Tian
2022 arXiv   pre-print
The category gap between training and evaluation has been characterised as one of the main obstacles to the success of Few-Shot Learning (FSL).  ...  Extensive experiments carried on inductive FSL tasks demonstrate the effectiveness of our approaches.  ...  Acknowledgments and Disclosure of Funding Special thanks to Qi Yong, who gives indispensable support on the spirit of this paper. We also thank Junran Peng for his help and fruitful discussions.  ... 
arXiv:2107.07746v3 fatcat:eklgmvkvizc35eoj5yyfekvia4

From Generalized zero-shot learning to long-tail with class descriptors [article]

Dvir Samuel, Yuval Atzmon, Gal Chechik
2020 arXiv   pre-print
It learns to (1) correct the bias towards head classes on a sample-by-sample basis; and (2) fuse information from class-descriptions to improve the tail-class accuracy.  ...  Often, classes can be accompanied by side information like textual descriptions, but it is not fully clear how to use them for learning with unbalanced long-tail data.  ...  Acknowledgments DS was funded by a grant from the Israeli innovation authority, through the AVATAR consortium and by a grant from the Israel Science Foundation (ISF 737/2018).  ... 
arXiv:2004.02235v4 fatcat:pqlgr4ok4ncpvfabjawwdwvx3i