A NOVEL FRAMEWORK FOR UNDERSTANDING ATYPICAL IMAGES A Novel Framework for Understanding Atypical Images

Babak Saleh, Babak Saleh
OF THE DISSERTATION In the past few years, there has been a tremendous amount of progress in the field of computer vision. As of now, we have reliable object detectors and classifiers that can recognize thousands of object categories. However, the ultimate goal of computer vision is to build systems that can understand and reason about images, far beyond scene categorization and object detection. In this thesis, algorithms have been proposed to empower computers with the human-level ability of
more » ... etecting and reasoning about images that are understudied in the mainstream computer vision community. In chapter 1, we open the conversation about abnormality detection, by discussing how humans form visual concepts (e.g. an object category) and perceive meaningful deviations from these learned concepts as signals for abnormality. However, there is not a comprehensive study about what factors lead humans in this decision-making process. In chapter 2 we collect the first dataset of abnormal images from the web. Conduct several human subject experiments, and perform a thorough set of analysis to discover hidden factors in human judgment about abnormality. These analyses lead us to propose a taxonomy of comprehensive reasons of abnormality in images. Inspired by human reasoning, we address the problem of detecting abnormal objects and reasoning about their abnormality in terms of visual attributes, such as irregular shape, texture ii or color (chapter 3). Although our computational models are learned without seeing any abnormal objects at training time, but still are capable of detecting and reasoning about abnormal images at the test time. In chapter 4 we develop probabilistic frameworks to model typical images and find atypical images as a meaningful deviation from this model. In chapter 5, we use the typicality scores of images and objects to improve the generalization capacity of the state-of-the-art Convolutional Neural Networks (CNN) for the task of object classification. We train these CNN models by minimizing a weighted loss function that incorporates in the typicality scores of samples. Our experiments show that this training strategy results in more generalized classifiers, which can be applied even to the extent of abnormal images. In chapter 6 of this thesis, we study two problems that extend our framework for abnormality detection to special cases. We develop algorithms for detecting and localizing attributes in images. In addition to the application of localized attributes for the problem of abnormality detection, we show that fine-grained object categorization benefit from such rich information as well. We also propose algorithms to learn visual classifiers directly from the textual description of an object category. This zero-shot learning strategy extends the abnormality detection framework to object categories that are not present at the time of training. We close this thesis by discussing the main contributions and some future work. iii Acknowledgements This dissertation would not have been possible without the support of many people. First of all, Professor Ahmed Elgammal who gave me the opportunity of working on interesting problems, taught me conducting scientific research, enriched my learning experiences with a perfect balance of freedom and supervision, and helped me through the challenges of my graduate studies.