GOGGLES: Automatic Image Labeling with Affinity Coding [article]

Nilaksh Das, Sanya Chaba, Renzhi Wu, Sakshi Gandhi, Duen Horng Chau, Xu Chu
2020 arXiv   pre-print
We propose affinity coding, a new domain-agnostic paradigm for automated training data labeling.  ...  We compare GOGGLES with existing data programming systems on 5 image labeling tasks from diverse domains.  ...  We also use the probabilistic labels generated by Snorkel, Snuba and GOGGLES to train downstream discriminative models following the similar approach taken in [19, 29] .  ... 
arXiv:1903.04552v2 fatcat:jsd7vl6xureh5ner3snid7kb4y

Icon scanning: Towards next generation QR codes

I. Friedman, L. Zelnik-Manor
2012 2012 IEEE Conference on Computer Vision and Pattern Recognition  
Such a solution exists today for QR codes, which can be thought of as icons with a binary pattern.  ...  In addition, our system should further deal with the challenges introduced by taking pictures of a screen.  ...  First, for each original icon in our training set we generate K blurred versions using the gaussian kernels obtained in the previous stage.  ... 
doi:10.1109/cvpr.2012.6247793 dblp:conf/cvpr/FriedmanZ12 fatcat:lvumzvj3vvcprjstomz3afej6q

Training Deep Neural Networks to Detect Repeatable 2D Features Using Large Amounts of 3D World Capture Data [article]

Alexander Mai, Joseph Menke, Allen Yang
2019 arXiv   pre-print
We further present an algorithm for automatically generating labels of repeatable 2D features, and present a fast, easy to use test algorithm for evaluating a detector in a 3D environment.  ...  To this end, we generate labeled 2D images from a photo-realistic 3D dataset. These images are used for training a neural network based feature detector.  ...  Proposed Technique Training Set Generation To generate a large amount of 3D data for training our network, we utilize the Gibson simulator, which renders photo realistic viewpoints of scenes captured  ... 
arXiv:1912.04384v1 fatcat:dkvekwqhwneq7j77xu45mnbf4q

Intelligent Splicing Method of Virtual Reality Lingnan Cultural Heritage Panorama Based on Automatic Machine Learning

Yao Fu, Tingting Guo, Xingfang Zhao, Sang-Bing Tsai
2021 Mobile Information Systems  
We use automatic machine learning models to train the visual feature set and use the bagging method to generate different training subsets.  ...  With the increasing expansion of virtual reality application fields and the complexity of application content, the demand for real-time rendering of realistic graphics has increased sharply.  ...  Use automatic machine learning models to train the visual feature sets and use the bagging method to generate different training subsets.  ... 
doi:10.1155/2021/8693436 fatcat:ylb6r6qewnbgpjvelj5qroooqq

Image search—from thousands to billions in 20 years

Lei Zhang, Yong Rui
2013 ACM Transactions on Multimedia Computing, Communications, and Applications (TOMCCAP)  
Starting with a retrospective review of three stages of image search in the history, the article highlights major breakthroughs around the year 2000 in image search features, indexing methods, and commercial  ...  ACKNOWLEDGMENTS The authors gratefully acknowledge Wei-Ying Ma for his visionary long-term support and encouragement, and Xin-Jing Wang, Changhu Wang, Xirong Li, and Zhiwei Li for their years of collaboration with  ...  The progress is particularly promising due to the help of large-scale training data.  ... 
doi:10.1145/2490823 fatcat:cor23f3c7nb7fimy4ixp32bdk4

Smooth object retrieval using a bag of boundaries

Relja Arandjelovic, Andrew Zisserman
2011 2011 International Conference on Computer Vision  
We introduce a new dataset of 6K images containing sculptures by Moore and Rodin, and annotated with ground truth for the occurrence of twenty 3D sculptures.  ...  There have been several large scale demonstrations [11, 21, 22] with Google Goggles as a commercial application.  ...  We use the method and code from [4] which generates a hierarchy of regions based on the output of the gPb contour detector [15] This provides a partition of the image into a set of closed regions for  ... 
doi:10.1109/iccv.2011.6126265 dblp:conf/iccv/ArandjelovicZ11 fatcat:zpc6twt5fndcbk6cgqtcxvhuwu

Deep learning hashing for mobile visual search

Wu Liu, Huadong Ma, Heng Qi, Dong Zhao, Zhineng Chen
2017 EURASIP Journal on Image and Video Processing  
Firstly, we present a comprehensive survey of the existed deep learning based hashing methods, which showcases their remarkable power of automatic learning highly robust and compact binary code representation  ...  The proliferation of mobile devices is producing a new wave of mobile visual search applications that enable users to sense their surroundings with smart phones.  ...  generates less effective hash codes.  ... 
doi:10.1186/s13640-017-0167-4 fatcat:vcdhjjbe6jai7hyigxstihcega

Supervising the Transfer of Reasoning Patterns in VQA [article]

Corentin Kervadec, Christian Wolf, Grigory Antipov, Moez Baccouche, Madiha Nadri
2021 arXiv   pre-print
Methods for Visual Question Answering (VQA) are notorious for leveraging dataset biases rather than performing reasoning, hindering generalization.  ...  This provides evidence that deep neural networks can learn to reason when training conditions are favorable enough.  ...  GQA is a dataset with question-answer pairs automatically generated from real images, and is particularly well suited for evaluating a large variety of reasoning skills.  ... 
arXiv:2106.05597v1 fatcat:naevbagtvvgy3ppxyyllanudia

IE-Vnet: Deep Learning-Based Segmentation of the Inner Ear's Total Fluid Space

Seyed-Ahmad Ahmadi, Johann Frei, Gerome Vivar, Marianne Dieterich, Valerie Kirsch
2022 Frontiers in Neurology  
Code and pre-trained models are available free and open-source under https://github.com/pydsgz/IEVNet.  ...  Its output works seamlessly with a previously published open-source pipeline for automatic ELS segmentation.  ...  The dataset D1 was split into 90% training data (N = 161 subjects, 322 inner ears) and 10% validation data (N = 18 subjects, 36 inner ears).  ... 
doi:10.3389/fneur.2022.663200 pmid:35645963 pmcid:PMC9130477 fatcat:xp26uysrwfau7okpiswcsrnf64

On-the-fly learning for visual search of large-scale image and video datasets

Ken Chatfield, Relja Arandjelović, Omkar Parkhi, Andrew Zisserman
2015 International Journal of Multimedia Information Retrieval  
The paradigm we explore is constructing visual models for such semantic entities on-the-fly, i.e. at run time, by using an image search engine to source visual training data for the text query.  ...  We describe three classes of queries, each with its associated visual search method: object instances (using a bag of visual words approach for matching); object categories (using a discriminative classifier  ...  Along with the fixed pool of pre-computed negative training data, these are used to train a linear SVM w, φ(I ) by fitting w to the available training data by minimizing an objective function balancing  ... 
doi:10.1007/s13735-015-0077-0 pmid:26191469 pmcid:PMC4498639 fatcat:prpxk47u4bdzxkhe5nfrssshpa

Comparative Study of Trust Modeling for Automatic Landmark Tagging

Ivan Ivanov, Peter Vajda, Pavel Korshunov, Touradj Ebrahimi
2013 IEEE Transactions on Information Forensics and Security  
We compare this socially-driven approach with other user trust models via experiments and subjective testing on an image database of various famous landmarks.  ...  He also worked as a radio access network conceptual planning expert in Vip mobile, Serbia, focusing on the implementation of second- and third-generation radio access technologies.  ...  In a real-life scenario, an image with unknown landmark will be automatically tagged with either one geotag or none, depending on the level of similarity with the known (trained) landmarks.  ... 
doi:10.1109/tifs.2013.2242889 fatcat:7a5lrf4kzbeqnjgmgizaeulgaq

Unbiased look at dataset bias

Antonio Torralba, Alexei A. Efros
2011 CVPR 2011  
They have been the chief reason for the considerable progress in the field, not just as source of large amounts of training data, but also as means of measuring and comparing performance of competing algorithms  ...  We present a comparison study using a set of popular datasets, evaluated based on a number of criteria including: relative data bias, cross-dataset generalization, effects of closed-world assumption, and  ...  This work is part of a larger effort, joint with David Forsyth and Jay Yagnik, on understanding the benefits and pitfalls of using large data in vision.  ... 
doi:10.1109/cvpr.2011.5995347 dblp:conf/cvpr/TorralbaE11 fatcat:bm2of7hygjcgfck7vt6qzbsdz4

Visual Object Recognition

Kristen Grauman, Bastian Leibe
2011 Synthesis Lectures on Artificial Intelligence and Machine Learning  
We introduce primary representations and learning approaches, with an emphasis on recent advances in the field.  ...  connected constellations; pyramid match kernels; detection via sliding windows; Hough voting; Generalized distance transform; the Implicit Shape Model; the Deformable Part-based Model vii Contents  ...  Both the part appearances and their location distributions are learned automatically from training data.  ... 
doi:10.2200/s00332ed1v01y201103aim011 fatcat:fhz7aokkfjav7fuauuorfstq4y

Spatially-Constrained Similarity Measure for Large-Scale Object Retrieval

Xiaohui Shen, Zhe Lin, Jonathan Brandt, Ying Wu
2014 IEEE Transactions on Pattern Analysis and Machine Intelligence  
Furthermore, based on the retrieval and localization results of SCSM, we introduce a novel and robust re-ranking method with the k-nearest neighbors of the query for automatically refining the initial  ...  One fundamental problem in object retrieval with the bag-of-words model is its lack of spatial information.  ...  seconds without code optimization.  ... 
doi:10.1109/tpami.2013.237 pmid:26353283 fatcat:gup7cqh4lfd4jgtvo76mqh6ewi

Self-reported empathy and neural activity during action imitation and observation in schizophrenia

William P. Horan, Marco Iacoboni, Katy A. Cross, Alex Korb, Junghee Lee, Poorang Nori, Javier Quintana, Jonathan K. Wynn, Michael F. Green
2014 NeuroImage: Clinical  
These findings suggest that patients show a disjunction between automatic neural responses to low level social cues and higher level, integrative social cognitive processes involved in self-perceived empathy  ...  This study investigated neural activity during imitation and observation of finger movements and facial expressions in schizophrenia, and their correlates with self-reported empathy.  ...  The authors wish to thank Amanda Bender, Michelle Dolinsky, Crystal Gibson, Cory Tripp, and Katherine Weiner for their assistance in data collection.  ... 
doi:10.1016/j.nicl.2014.06.006 pmid:25009771 pmcid:PMC4087183 fatcat:3qwue6sojbdrtio7j4ra6lnyvq