Introduction to the 1st Place Winning Model of OpenImages Relationship Detection Challenge [article]

Ji Zhang, Kevin Shih, Andrew Tao, Bryan Catanzaro, Ahmed Elgammal
2018 arXiv pre-print
This article describes the model we built that achieved 1st place in the OpenImage Visual Relationship Detection Challenge on Kaggle.  ...  We show in ablation study that each factor can improve the performance to a non-trivial extent, and the model reaches optimal when all of them are combined.  ...  One major failure case of our model is on the predicate "hold", where the model usually needs to focus on the small area of the intersection of a human hand and the object, which our model is currently  ... 
arXiv:1811.00662v2 fatcat:plvufcrqova7plywgi4m7uacuu

Graphical Contrastive Losses for Scene Graph Parsing [article]

Ji Zhang, Kevin J. Shih, Ahmed Elgammal, Andrew Tao, Bryan Catanzaro
2019 arXiv pre-print
Our model outperforms the winning method of the OpenImages Relationship Detection Challenge by 4.7% (16.5% relative) on the test set.  ...  Most scene graph parsers use a two-stage pipeline to detect visual relationships: the first stage detects entities, and the second predicts the predicate for each entity pair using a softmax distribution  ...  Our best model achieves 0.328 on the Private set of the OpenImages Relationship Detection Challenge, outperforming the winning model by a significant 4.7% (16.5% relative) margin.  ... 
arXiv:1903.02728v5 fatcat:s4cfzxjwszcgtmpiiydgbb7ete

Scene graph parsing and its application in cross-modal reasoning tasks

Ji Zhang
2020
This task is commonly seen as an extension to the object detection task where objects are detected individually, while the former requires recognizing relationships between object pairs.  ...  In thesis we start with an inherent issue lying in scene graph parsing: the unbearable quadratic complexity of relationship detection.  ...  Our model outperforms the winning method of the OpenImages Relationship Detection Challenge by 4.7% (16.5% relative) on the test set.  ... 
doi:10.7282/t3-ka2q-b984 fatcat:eqsq3xw5vffabh7yq57wqdby3e

Implications of mountain shading on calculating energy for snowmelt using unstructured triangular meshes

Christopher B. Marsh, John W. Pomeroy, Raymond J. Spiteri
2012 Hydrological Processes  
detection The calculation of the z 0 values for each triangle gives an ordering to the triangles, allowing for obscuring triangles to be detected.  ...  Requests for permission to copy or to make other use of material in this thesis in whole or part should be addressed to: Head of the Department of Geography and Planning 117 Science Place University  ... 
doi:10.1002/hyp.9329 fatcat:mgmmy5h6vfch5fkhtd55ax2zg4

Learning from Multimodal Web Data

John Miles Hessel
2020
The ultimate aim of this line of work is to build models capable of drawing connections between different modes of data, e.g., images+text.  ...  To this end, we present algorithms that discover grounded image-text relationships from noisy, long documents, e.g., Wikipedia articles and the images they contain.  ...  The top, middle, and bottom rows are sampled from the 99th, 50th, and 1st percentiles of model scores respectively.  ... 
doi:10.7298/fzce-qv86 fatcat:limoc6b6xjgm5b2dbzh3f72tuq