
Domain-aware Visual Bias Eliminating for Generalized Zero-Shot Learning [article]

Shaobo Min, Hantao Yao, Hongtao Xie, Chaoqun Wang, Zheng-Jun Zha, Yongdong Zhang
2020 arXiv   pre-print
In this paper, we propose a novel Domain-aware Visual Bias Eliminating (DVBE) network that constructs two complementary visual representations, i.e., semantic-free and semantic-aligned, to treat seen and  ...  Recent methods focus on learning a unified semantic-aligned visual representation to transfer knowledge between two domains, while ignoring the effect of semantic-free visual representation in alleviating  ...  Domain-aware Visual Bias Eliminating Generalized zero-shot learning aims to recognize images from seen and unseen domains.  ... 
arXiv:2003.13261v2 fatcat:w2oyekgre5h2jmwiqvhlngjive

Semantic-Aware Knowledge Preservation for Zero-Shot Sketch-Based Image Retrieval [article]

Qing Liu, Lingxi Xie, Huiyu Wang, Alan Yuille
2019 arXiv   pre-print
Recently, research interests arise in solving this problem under the more realistic and challenging setting of zero-shot learning.  ...  In this paper, we investigate this problem from the viewpoint of domain adaptation which we show is critical in improving feature embedding in the zero-shot scenario.  ...  We thank Chenxi Liu for helping design Figure 1 and proofreading. We thank Chenglin Yang for discussions on knowledge distillation.  ... 
arXiv:1904.03208v3 fatcat:ovs6hdklwnhanogtm7wdvsi5ay

Attribute-Modulated Generative Meta Learning for Zero-Shot Classification [article]

Yun Li, Zhe Liu, Lina Yao, Xiaojun Chang
2021 arXiv   pre-print
To this end, we propose an Attribute-Modulated generAtive meta-model for Zero-shot learning (AMAZ).  ...  The promising strategies for ZSL are to synthesize visual features of unseen classes conditioned on semantic side information and to incorporate meta-learning to eliminate the model's inherent bias towards  ...  Zero-shot Learning A common strategy views zero-shot learning as an embedding problem of visual or semantic features. For example, Ye et al.  ... 
arXiv:2104.10857v3 fatcat:midcen2vkjgofnrxe4wypjnazm

Unravelling Small Sample Size Problems in the Deep Learning World [article]

Rohit Keshari, Soumyadeep Ghosh, Saheb Chhabra, Mayank Vatsa, Richa Singh
2020 arXiv   pre-print
It has been observed that deep learning models do not generalize well on S^3 problems and specialized solutions are required.  ...  For problems with large training databases, deep learning models have achieved superlative performances.  ...  [29] have proposed Domain-aware Visual Bias Eliminating (DVBE) network by constructing two complementary visual representations; semantic-free and semantic-aligned.  ... 
arXiv:2008.03522v1 fatcat:nigmkyma6rahvfhylcml3xmmxq

The World of an Octopus: How Reporting Bias Influences a Language Model's Perception of Color [article]

Cory Paik, Stéphane Aroca-Ouellette, Alessandro Roncone, Katharina Kann
2021 arXiv   pre-print
We then demonstrate that multimodal models can leverage their visual training to mitigate these effects, providing a promising avenue for future research.  ...  To accomplish this, we 1) generate the Color Dataset (CoDa), a dataset of human-perceived color distributions for 521 common objects; 2) use CoDa to analyze and compare the color distribution found in  ...  Acknowledgments We would like to thank the members of CU Boulder's NALA Group for their feedback on this work.  ... 
arXiv:2110.08182v1 fatcat:mkvte5jbqvfspdmdcjwaeuklke

Few-shot Font Generation with Weakly Supervised Localized Representations [article]

Song Park, Sanghyuk Chun, Junbum Cha, Bado Lee, Hyunjung Shim
2021 arXiv   pre-print
Automatic few-shot font generation aims to solve a well-defined, real-world problem because manual font designs are expensive and sensitive to the expertise of designers.  ...  However, learning component-wise styles solely from a few reference glyphs is infeasible when a target script has a large number of components, for example, over 200 for Chinese.  ...  The component-conditional module learns a set of channel-wise biases where each bias value is in charge of each component.  ... 
arXiv:2112.11895v1 fatcat:tqt6vpymmjdoxghfxnp25oymde

Exploiting a Joint Embedding Space for Generalized Zero-Shot Semantic Segmentation [article]

Donghyeon Baek, Youngmin Oh, Bumsub Ham
2021 arXiv   pre-print
We address the problem of generalized zero-shot semantic segmentation (GZS3) predicting pixel-wise semantic labels for seen and unseen classes.  ...  To this end, we leverage visual and semantic encoders to learn a joint embedding space, where the semantic encoder transforms semantic features to semantic prototypes that act as centers for visual features  ...  Related work Zero-shot image classification. Many zero-shot learning (ZSL) [11, 29, 42] methods have been proposed for image classification.  ... 
arXiv:2108.06536v1 fatcat:japzmaebdfhrrbgooky6p3k4xy

DASZL: Dynamic Action Signatures for Zero-shot Learning [article]

Tae Soo Kim, Jonathan D. Jones, Michael Peven, Zihao Xiao, Jin Bai, Yi Zhang, Weichao Qiu, Alan Yuille, Gregory D. Hager
2020 arXiv   pre-print
We also extend this method to form a unique framework for zero-shot joint segmentation and classification of activities in video and demonstrate the first results in zero-shot decoding of complex action  ...  by deep-learned components.  ...  We gratefully acknowledge the support of NVIDIA Corporation with the donation of the Titan V GPUs used for this research.  ... 
arXiv:1912.03613v3 fatcat:adjifazqhvc4lpyqgveu3imfkq

Small Sample Learning in Big Data Era [article]

Jun Shu, Zongben Xu, Deyu Meng
2018 arXiv   pre-print
The purpose is mainly to simulate human learning behaviors like recognition, generation, imagination, synthesis and analysis.  ...  The second category is called "experience learning", which usually co-exists with the large sample learning manner of conventional machine learning.  ...  The inter-modal label transfer is generalized to zero-shot recognition.  ... 
arXiv:1808.04572v3 fatcat:lqqzzrmgfnfb3izctvdzgopuny

Zero-Shot Action Recognition in Videos: A Survey [article]

Valter Estevam, Helio Pedrini, David Menotti
2020 arXiv   pre-print
specifically zero-shot action recognition in videos.  ...  Zero-Shot Action Recognition has attracted attention in the last years and many approaches have been proposed for recognition of objects, events and actions in images and videos.  ...  Another important issue is the Few-Shot Learning (FSL) or generalized zero-shot learning.  ... 
arXiv:1909.06423v2 fatcat:w5eh7wjdmnaktnbsqczsdmhane


Guangda Li, Meng Wang, Yan-Tao Zheng, Haojie Li, Zheng-Jun Zha, Tat-Seng Chua
2011 Proceedings of the 1st ACM International Conference on Multimedia Retrieval - ICMR '11  
There are two steps to accomplish the location of tags at shot level. The first is to estimate the distribution of tags within the video, which is based on a multiple instance learning framework.  ...  at shot level.  ...  mainly designed for narrow domain such as news video, but not for general domain, such as web-based videos.  ... 
doi:10.1145/1991996.1992033 dblp:conf/mir/LiWZLZC11 fatcat:mqstsnzutfdy5h65fv5eskhjgy

Toward Open-World Electroencephalogram Decoding Via Deep Learning: A Comprehensive Survey [article]

Xun Chen, Chang Li, Aiping Liu, Martin J. McKeown, Ruobing Qian, Z. Jane Wang
2021 arXiv   pre-print
In recent years, deep learning (DL) has emerged as a potential solution for such problems due to its superior capacity in feature extraction.  ...  Combining DL with domain-specific knowledge may allow for development of robust approaches to decode brain activity even with small-sample data.  ...  Her research interests include statistical signal processing and machine learning, with applications in digital media and biomedical data analytics.  ... 
arXiv:2112.06654v2 fatcat:roxf5k7ypfcvtdzz3pbho3kdri

Few-Shot Induction of Generalized Logical Concepts via Human Guidance

Mayukh Das, Nandini Ramanan, Janardhan Rao Doppa, Sriraam Natarajan
2020 Frontiers in Robotics and AI  
First, we define a distance measure between candidate concept representations that improves the efficiency of search for target concept and generalization.  ...  We consider the problem of learning generalized first-order representations of concepts from a small number of examples. We augment an inductive logic programming learner with 2 novel contributions.  ...  ACKNOWLEDGMENTS The authors acknowledge the support of members of STARLING lab for the discussions.  ... 
doi:10.3389/frobt.2020.00122 pmid:33501288 pmcid:PMC7805948 fatcat:khw63epsjbbdlhmq35necqfjaq

LaTr: Layout-Aware Transformer for Scene-Text VQA [article]

Ali Furkan Biten, Ron Litman, Yusheng Xie, Srikar Appalaraju, R. Manmatha
2021 arXiv   pre-print
We propose a novel multimodal architecture for Scene Text Visual Question Answering (STVQA), named Layout-Aware Transformer (LaTr).  ...  In addition, by leveraging a vision transformer, we eliminate the need for an external object detector. LaTr outperforms state-of-the-art STVQA methods on multiple datasets.  ...  Zero-shot Language Models on TextVQA To quantify the importance of language understanding in STVQA, we devise a novel zero-shot setting where we use the T5 language model pre-trained on C4 and only fine-tuned  ... 
arXiv:2112.12494v2 fatcat:chdp2ozx5vfmromsdxksjwf63e

Scene Classification for Sports Video Summarization Using Transfer Learning

Muhammad Rafiq, Ghazala Rafiq, Rockson Agyeman, Seong-Il Jin, Gyu Sang Choi
2020 Sensors  
We evaluate our performance results on cricket videos and compare various deep-learning models, i.e., Inception V3, Visual Geometry Group (VGGNet16, VGGNet19), Residual Network (ResNet50), and AlexNet  ...  Due to the growing demand for video summarization in marketing, advertising agencies, awareness videos, documentaries, and other interest groups, researchers are continuously proposing automation frameworks  ...  Previously, several researchers proposed generalized schemes for shot classification, shot-boundary detection, and scene classification; however, generalized proposals are less helpful for a specific domain  ... 
doi:10.3390/s20061702 pmid:32197502 pmcid:PMC7146586 fatcat:4nswdsrbhzdljaap4odnx5tkhu