Visual-Inertial-Semantic Scene Representation for 3-D Object Detection [article]

Jingming Dong, Xiaohan Fei, Stefano Soatto
2017 arXiv   pre-print
filter, and a likelihood function, which can be approximated by a discriminatively-trained convolutional neural network.  ...  A minimal sufficient representation, the posterior of semantic (identity) and syntactic (pose) attributes of objects in space, can be decomposed into a geometric term, which can be maintained by a localization-and-mapping  ...  Acknowledgments Research sponsored by ARO W911NF-15-1-0564/66731-CS, ONR N00014-17-1-2072, AFOSR FA9550-15-1-0229.  ... 
arXiv:1606.03968v2 fatcat:vg37xf55hvdylcvtxaam2xn4yy

Visual-Inertial-Semantic Scene Representation for 3D Object Detection

Jingming Dong, Xiaohan Fei, Stefano Soatto
2017 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)  
filter, and a likelihood function, which can be approximated by a discriminatively-trained convolutional neural network.  ...  A minimal sufficient representation, the posterior of semantic (identity) and syntactic (pose) attributes of objects in space, can be decomposed into a geometric term, which can be maintained by a localization-and-mapping  ...  Acknowledgments Research sponsored by ARO W911NF-15-1-0564/66731-CS, ONR N00014-17-1-2072, AFOSR FA9550-15-1-0229.  ... 
doi:10.1109/cvpr.2017.380 dblp:conf/cvpr/DongFS17 fatcat:vn5o3krz7jh6phjnqxtm2jur3i

Deep learning prototype domains for person re-identification

Arne Schumann, Shaogang Gong, Tobias Schuchert
2017 2017 IEEE International Conference on Image Processing (ICIP)  
Typically, this is achieved by learning either optimal features or matching metrics which are adapted to specific pairs of camera views dictated by the pairwise labelled training datasets.  ...  We learn a separate re-id model for each of the discovered prototype-domains and during model deployment, use the person probe image to select automatically the model of the closest prototype-domain.  ...  This configuration of multiple layers with small filter sizes was shown to perform well for image classification in the VGG nets [26].  ... 
doi:10.1109/icip.2017.8296585 dblp:conf/icip/SchumannGS17 fatcat:cqn5izs6gzd3vnzi2rfgxe6qqe

Deep Learning Prototype Domains for Person Re-Identification [article]

Arne Schumann, Shaogang Gong, Tobias Schuchert
2017 arXiv   pre-print
Typically, this is achieved by learning either optimal features or matching metrics which are adapted to specific pairs of camera views dictated by the pairwise labelled training datasets.  ...  We learn a separate re-id model for each of the discovered prototype-domains and during model deployment, use the person probe image to select automatically the model of the closest prototype domain.  ...  This configuration of multiple layers with small filter sizes was shown to perform well for image classification in the VGG nets [26].  ... 
arXiv:1610.05047v2 fatcat:rau2u2wsdfbbzpv6zfyx5szm4m

Investigating Nuisance Factors in Face Recognition with DCNN Representation

Claudio Ferrari, Giuseppe Lisanti, Stefano Berretti, Alberto Del Bimbo
2017 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)  
These problems include variations in illumination, pose, expression and occlusion, to mention some.  ...  Considering this, it can be assumed that the performance of a DCNN is influenced by the characteristics of the raw image data that are fed to the network.  ...  Government is authorized to reproduce and distribute reprints for Governmental purpose notwithstanding any copyright annotation thereon.  ... 
doi:10.1109/cvprw.2017.86 dblp:conf/cvpr/FerrariLBB17 fatcat:5y6gvq5vkjegnjvroyua5q6ize

Interpretable Deep Learning-Based Forensic Iris Segmentation and Recognition [article]

Andrey Kuehlkamp, Aidan Boyd, Adam Czajka, Kevin Bowyer, Patrick Flynn, Dennis Chute, Eric Benjamin
2021 arXiv   pre-print
by eye decomposition processes, such as furrows or irregular specular highlights present on the drying and wrinkling cornea.  ...  To our knowledge, this is the largest corpus of data used in postmortem iris recognition research to date. The source code of the proposed method is offered with the paper.  ...  the only minor occlusions occur to the iris area.  ... 
arXiv:2112.00849v2 fatcat:35uvyz2gzncwnbdnw6ch7uzriy

Brightness Transformation and CNN-MRF Model for Road Network Extraction using RSI

Sadaf Jahan, Dr. Abhishek Bhatt
2020 SMART MOVES JOURNAL IJOSCIENCE  
The very high spatial resolution images (VHR) taken by space and space probes are the main source of an accurate extraction of the route.  ...  The proposed method includes noise removal and enhancement using brightness transformation function then segmentation of road and non-road pixels using CNN and edges are joined using CNN model also.  ...  TABLE I. PROPOSED CNN MODEL CONFIGURATION (Layer | Filters | Kernel Size | Stride | Output size): Conv | 96 | 11*11 | 4 | 11*11*96; Pooling | N/A | 3*3 | 2 | N/A; Conv | 256 | 5*5 | 1 | 5*5*256; Pooling | N/A | 3*  ... 
doi:10.24113/ijoscience.v6i2.267 fatcat:xib33uopyjbgte52hm5bgfr2ni

Compositional Convolutional Neural Networks: A Robust and Interpretable Model for Object Recognition under Occlusion [article]

Adam Kortylewski and Qing Liu and Angtian Wang and Yihong Sun and Alan Yuille
2020 arXiv   pre-print
We overcome these limitations by unifying DCNNs with part-based models into Compositional Convolutional Neural Networks (CompositionalNets) - an interpretable deep architecture with innate robustness to  ...  Computer vision systems in real-world applications need to be robust to partial occlusion while also being explainable.  ...  Alain and Bengio [1] probe mid-layer filters by training linear classifiers on the intermediate activations.  ... 
arXiv:2006.15538v1 fatcat:x26py4g7mjgbbkjdn3qi3bjvza

A Non-linear Differential CNN-Rendering Module for 3D Data Enhancement [article]

Yonatan Svirsky, Andrei Sharf
2019 arXiv   pre-print
Thus, through their optimization process, cells learn to focus on important parts of the data, bypassing occlusions, clutter and noise.  ...  Since sensor cells originally lie on a grid, this equals to a highly non-linear rendering of the scene into a 2D image. Our module performs especially well in presence of clutter and occlusions.  ...  yield partial representations of the 3D objects with missing parts due to self occlusions and complex configurations.  ... 
arXiv:1904.04850v1 fatcat:4neyi6oh7jhzzjvfmeboxotucm

Visualizing Deep Convolutional Neural Networks Using Natural Pre-images

Aravindh Mahendran, Andrea Vedaldi
2016 International Journal of Computer Vision  
Image representations, from SIFT and bag of visual words to Convolutional Neural Networks (CNNs) are a crucial component of almost all computer vision systems.  ...  In particular, we show that this method can invert representations such as HOG more accurately than recent alternatives while being applicable to CNNs too.  ...  interpretable filters).  ... 
doi:10.1007/s11263-016-0911-8 fatcat:a3lf7nhzbjdx7hvw4lsadaiog4

Remote Sensor Design for Visual Recognition With Convolutional Neural Networks

Lucas Jaffe, Michael Zelinski, Wesam Sakla
2019 IEEE Transactions on Geoscience and Remote Sensing  
In particular, remote sensing systems are usually constructed to optimize sensing cost-quality tradeoffs with respect to human image interpretability.  ...  Our results are compared to standard image quality measurements based on human visual perception, and we conclude not only that machine and human interpretability differ significantly but also that computer  ...  Probe images are indicated on the left of each row, separated from the gallery by a gray bar. (From left to right) Gallery images are ranked by least to greatest distance from the probe.  ... 
doi:10.1109/tgrs.2019.2925813 fatcat:4cr35vkmzrdqrp7yygpkp737ti

Past, Present, and Future of Face Recognition: A Review

Insaf Adjabi, Abdeldjalil Ouahabi, Amir Benzaoui, Abdelmalik Taleb-Ahmed
2020 Electronics  
The advantage of 3D data lies in its invariance to pose and lighting conditions, which has enhanced recognition systems efficiency. 3D data, however, is somewhat sensitive to changes in facial expressions  ...  Besides, we pay particular attention to deep learning approach as it presents the actuality in this field.  ...  They have finished their work by presenting a semantic bootstrapping that predicts which network is more consistent with noisy labels. To tackle class-imbalanced learning using deep CNN, Hayat et al.  ... 
doi:10.3390/electronics9081188 fatcat:ufopgfvmw5dg3nwhgle2tgvoj4

Predicting the Future with Transformational States [article]

Andrew Jaegle, Oleh Rybkin, Konstantinos G. Derpanis, Kostas Daniilidis
2018 arXiv   pre-print
We propose a model that predicts future images by learning to represent the present state and its transformation given only a sequence of images.  ...  We describe how this model can be integrated into an encoder-decoder convolutional neural network (CNN) architecture that uses weighted residual connections to integrate representations of the past with  ...  K.G.D. is supported by a Canadian NSERC Discovery grant.  ... 
arXiv:1803.09760v1 fatcat:4bclbywvhbdv5jkwwypqq74iga

Gait Recognition and Understanding Based on Hierarchical Temporal Memory Using 3D Gait Semantic Folding

Jian Luo, Tardi Tjahjadi
2020 Sensors  
., they are easily influenced by multi-views, occlusion, clothes, and object carrying conditions.  ...  Second, by using gait semantic folding, the estimated body parameters are encoded using a sparse 2D matrix to construct the structural gait semantic image.  ...  Before evaluating the gait recognition rate, one group of gait sequences with conditional configurations (self-occlusions and static occlusion from one record time) are selected for Figure 13 shows  ... 
doi:10.3390/s20061646 pmid:32188067 pmcid:PMC7146167 fatcat:asvte7dstvb7jhf772tyhnj3de

Smart Fashion: A Review of AI Applications in the Fashion Apparel Industry [article]

Seyed Omid Mohammadi, Ahmad Kalhor
2021 arXiv   pre-print
For each task, a time chart is provided to analyze the progress through the years.  ...  Furthermore, we provide a list of 86 public fashion datasets accompanied by a list of suggested applications and additional information for each.  ...  Yang [108] Tree-based model, GBDT, CNN, MLP, 50.66% Hit@10 Item pair, Attribute-based, Interpretable 37 J.  ... 
arXiv:2111.00905v2 fatcat:6n6d62lntjfu5pxmjzgi4mpv6i