78 Hits in 1.5 sec

Joint Person Objectness and Repulsion for Person Search [article]

Hantao Yao, Changsheng Xu
2020 arXiv   pre-print
Once localizing the candidate person proposals, person search can measure the Hantao Yao is with National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing  ... 
arXiv:2006.00155v1 fatcat:qn6fwij6y5d5jdnwhozkr7xqai

One-Shot Fine-Grained Instance Retrieval [article]

Hantao Yao, Shiliang Zhang, Yongdong Zhang, Jintao Li, Qi Tian
2017 arXiv   pre-print
MM'17, October 23-27 2017, Mountain View, CA USA  ... 
arXiv:1707.00811v1 fatcat:zfnpunlofrbspkgpz755xe5p2a

Attribute-Induced Bias Eliminating for Transductive Zero-Shot Learning [article]

Hantao Yao, Shaobo Min, Yongdong Zhang, Changsheng Xu
2020 arXiv   pre-print
Transductive Zero-shot learning (ZSL) targets to recognize the unseen categories by aligning the visual and semantic information in a joint embedding space. There exist four kinds of domain biases in Transductive ZSL, i.e., visual bias and semantic bias between two domains and two visual-semantic biases in respective seen and unseen domains, but existing work only focuses on the part of them, which leads to severe semantic ambiguity during the knowledge transfer. To solve the above problem, we propose a novel Attribute-Induced Bias Eliminating (AIBE) module for Transductive ZSL. Specifically, for the visual bias between two domains, the Mean-Teacher module is first leveraged to bridge the visual representation discrepancy between two domains with unsupervised learning and unlabelled images. Then, an attentional graph attribute embedding is proposed to reduce the semantic bias between seen and unseen categories, which utilizes the graph operation to capture the semantic relationship between categories. Besides, to reduce the semantic-visual bias in the seen domain, we align the visual center of each category, instead of the individual visual data point, with the corresponding semantic attributes, which further preserves the semantic relationship in the embedding space. Finally, for the semantic-visual bias in the unseen domain, an unseen semantic alignment constraint is designed to align visual and semantic space in an unsupervised manner. The evaluations on several benchmarks demonstrate the effectiveness of the proposed method, e.g., obtaining the 82.8%/75.5%, 97.1%/82.5%, and 73.2%/52.1% for Conventional/Generalized ZSL settings for CUB, AwA2, and SUN datasets, respectively.
arXiv:2006.00412v1 fatcat:vpz5hp3otfbdbeyt3aptuj5tla

Dual Cluster Contrastive learning for Person Re-Identification [article]

Hantao Yao, Changsheng Xu
2021 arXiv   pre-print
Dual Cluster Contrastive learning for Person Re-Identification Hantao  ...  Yao, Changsheng Xu National Laboratory of Pattern Recognition, Institute of Automation, CAS  ... 
arXiv:2112.04662v2 fatcat:jdgaunbkhjdt3hflltjftd5n2a

Measuring optical vortices by means of dual shearing-type Sagnac interferometers [article]

Hantao Wang, Huajun Zhang, Mingyuan Ren, Wenkai Yao, Yu Zhang
2021 arXiv   pre-print
Measuring the positions of optical vortices is an essential part in the researches of speckles and adaptive optics. The measurement accuracy is restricted by the performance of optical devices and the properties of optical vortices, such as density and size. In order to achieve high accuracy and wide range of application, the dual shearing-type Sagnac interferometers is proposed using two shearing plates to adjust the precision of optical vortices measurement. The shearing displacements are able to balance the measuring precision and the value of the intensity ratio point to provide optimum measurement performance. This method is useful for the observation of optical vortices with different sizes and densities, especially for the high density condition.
arXiv:2106.11667v1 fatcat:5ubsukdsknhfnjwyhysehoxpuq

GLAD

Longhui Wei, Shiliang Zhang, Hantao Yao, Wen Gao, Qi Tian
2017 Proceedings of the 2017 ACM on Multimedia Conference - MM '17  
The huge variance of human pose and the misalignment of detected human images significantly increase the difficulty of person Re-Identification (Re-ID). Moreover, efficient Re-ID systems are required to cope with the massive visual data being produced by video surveillance systems. Targeting to solve these problems, this work proposes a Global-Local-Alignment Descriptor (GLAD) and an efficient indexing and retrieval framework, respectively. GLAD explicitly leverages the local and global cues in human body to generate a discriminative and robust representation. It consists of part extraction and descriptor learning modules, where several part regions are first detected and then deep neural networks are designed for representation learning on both the local and global regions. A hierarchical indexing and retrieval framework is designed to eliminate the huge redundancy in the gallery set, and accelerate the online Re-ID procedure. Extensive experimental results show GLAD achieves competitive accuracy compared to the state-of-the-art methods. Our retrieval framework significantly accelerates the online Re-ID procedure without loss of accuracy. Therefore, this work has potential to work better on person Re-ID tasks in real scenarios.
doi:10.1145/3123266.3123279 dblp:conf/mm/WeiZY0T17 fatcat:z63h634unngtpebaxuxmvy3y4y

One-Shot Fine-Grained Instance Retrieval

Hantao Yao, Shiliang Zhang, Yongdong Zhang, Jintao Li, Qi Tian
2017 Proceedings of the 2017 ACM on Multimedia Conference - MM '17  
Fine-Grained Visual Categorization (FGVC) has achieved significant progress recently. However, the number of fine-grained species could be huge and dynamically increasing in real scenarios, making it difficult to recognize unseen objects under the current FGVC framework. This raises an open issue to perform large-scale fine-grained identification without a complete training set. Aiming to conquer this issue, we propose a retrieval task named One-Shot Fine-Grained Instance Retrieval (OSFGIR). "One-Shot" denotes the ability of identifying unseen objects through a fine-grained retrieval task assisted with an incomplete auxiliary training set. This paper first presents the detailed description to OSFGIR task and our collected OSFGIR-378K dataset. Next, we propose the Convolutional and Normalization Networks (CN-Nets) learned on the auxiliary dataset to generate a concise and discriminative representation. Finally, we present a coarse-to-fine retrieval framework consisting of three components, i.e., coarse retrieval, fine-grained retrieval, and query expansion, respectively. The framework progressively retrieves images with similar semantics, and performs fine-grained identification. Experiments show our OSFGIR framework achieves significantly better accuracy and efficiency than existing FGVC and image retrieval methods, thus could be a better solution for large-scale fine-grained object identification.
doi:10.1145/3123266.3123278 dblp:conf/mm/YaoZZLT17 fatcat:s6stj7xujzganjypus6l2phive

Predicted phase diagram of boron-carbon-nitrogen

Hantao Zhang, Sanxi Yao, Michael Widom
2016 Physical review B  
Noting the structural relationships between phases of carbon and boron carbide with phases of boron nitride and boron subnitride, we investigate their mutual solubilities using a combination of first principles total energies supplemented with statistical mechanics to address finite temperatures. Owing to large energy costs of substitution, we find the mutual solubilities of the ultra hard materials diamond and cubic boron nitride are negligible, and the same for the quasi-two dimensional materials graphite and hexagonal boron nitride. In contrast, we find a continuous range of solubility connecting boron carbide to boron subnitride at elevated temperatures. The electron precise compound B_13CN consisting of B_12 icosahedra with NBC chains is found to be stable at all temperatures up to melting. It exhibits an order-disorder transition in the orientation of NBC chains at approximately T=500K.
doi:10.1103/physrevb.93.144107 fatcat:2ylg6izeorgg5n2ztgidow5i5a

Deep Representation Learning with Part Loss for Person Re-Identification [article]

Hantao Yao, Shiliang Zhang, Yongdong Zhang, Jintao Li, Qi Tian
2017 arXiv   pre-print
Learning discriminative representations for unseen person images is critical for person Re-Identification (ReID). Most of current approaches learn deep representations in classification tasks, which essentially minimize the empirical classification risk on the training set. As shown in our experiments, such representations commonly focus on several body parts discriminative to the training set, rather than the entire human body. Inspired by the structural risk minimization principle in SVM, we revise the traditional deep representation learning procedure to minimize both the empirical classification risk and the representation learning risk. The representation learning risk is evaluated by the proposed part loss, which automatically generates several parts for an image, and computes the person classification loss on each part separately. Compared with traditional global classification loss, simultaneously considering multiple part loss enforces the deep network to focus on the entire human body and learn discriminative representations for different parts. Experimental results on three datasets, i.e., Market1501, CUHK03, VIPeR, show that our representation outperforms the existing deep representations.
arXiv:1707.00798v2 fatcat:t4nei7ouxvehffdsdxjkoheifi

Oceanic non-Kolmogorov optical turbulence and spherical wave propagation [article]

Jinren Yao, Hantao Wang, Huajun Zhang, Jiandong Cai, Mingyuan Ren, Yu Zhang, Olga Korotkova
2020 arXiv   pre-print
Light propagation in turbulent media is conventionally studied with the help of the spatio-temporal power spectra of the refractive index fluctuations. In particular, for natural water turbulence several models for the spatial power spectra have been developed based on the classic, Kolmogorov postulates. However, as currently widely accepted, non-Kolmogorov turbulent regime is also common in the stratified flow fields, as suggested by recent developments in atmospheric optics. Until now all the models developed for the non-Kolmogorov optical turbulence were pertinent to atmospheric research and, hence, involved only one advected scalar, e.g., temperature. We generalize the oceanic spatial power spectrum, based on two advected scalars, temperature and salinity concentration, to the non-Kolmogorov turbulence regime, with the help of the so-called "Upper-Bound Limitation" and by adopting the concept of spectral correlation of two advected scalars. The proposed power spectrum can handle general non-Kolmogorov, anisotropic turbulence but reduces to Kolmogorov, isotropic case if the power law exponents of temperature and salinity are set to 11/3 and anisotropy coefficient is set to unity. To show the application of the new spectrum, we derive the expression for the second-order mutual coherence function of a spherical wave and examine its coherence radius (in both scalar and vector forms) to characterize the turbulent disturbance. Our numerical calculations show that the statistics of the spherical wave vary substantially with temperature and salinity non-Kolmogorov power law exponents and temperature-salinity spectral correlation coefficient. The introduced spectrum is envisioned to become of significance for theoretical analysis and experimental measurements of non-classic natural water double-diffusion turbulent regimes.
arXiv:2009.02447v1 fatcat:rcqrypepnrahrfi6kceh4invxm

Domain-aware Visual Bias Eliminating for Generalized Zero-Shot Learning [article]

Shaobo Min, Hantao Yao, Hongtao Xie, Chaoqun Wang, Zheng-Jun Zha, Yongdong Zhang
2020 arXiv   pre-print
Recent methods focus on learning a unified semantic-aligned visual representation to transfer knowledge between two domains, while ignoring the effect of semantic-free visual representation in alleviating the biased recognition problem. In this paper, we propose a novel Domain-aware Visual Bias Eliminating (DVBE) network that constructs two complementary visual representations, i.e., semantic-free and semantic-aligned, to treat seen and unseen domains separately. Specifically, we explore cross-attentive second-order visual statistics to compact the semantic-free representation, and design an adaptive margin Softmax to maximize inter-class divergences. Thus, the semantic-free representation becomes discriminative enough to not only predict seen class accurately but also filter out unseen images, i.e., domain detection, based on the predicted class entropy. For unseen images, we automatically search an optimal semantic-visual alignment architecture, rather than manual designs, to predict unseen classes. With accurate domain detection, the biased recognition problem towards the seen domain is significantly reduced. Experiments on five benchmarks for classification and segmentation show that DVBE outperforms existing methods by averaged 5.7% improvement.
arXiv:2003.13261v2 fatcat:w2oyekgre5h2jmwiqvhlngjive

Deep Representation Learning with Part Loss for Person Re-Identification

Hantao Yao, Shiliang Zhang, Richang Hong, Yongdong Zhang, Changsheng Xu, Qi Tian
2019 IEEE Transactions on Image Processing  
Learning discriminative representations for unseen person images is critical for person Re-Identification (ReID). Most of current approaches learn deep representations in classification tasks, which essentially minimize the empirical classification risk on the training set. As shown in our experiments, such representations easily get overfitted on a discriminative human body part among the training set. To gain the discriminative power on unseen person images, we propose a deep representation learning procedure named Part Loss Networks (PL-Net), to minimize both the empirical classification risk and the representation learning risk. The representation learning risk is evaluated by the proposed part loss, which automatically detects human body parts, and computes the person classification loss on each part separately. Compared with traditional global classification loss, simultaneously considering part loss enforces the deep network to learn representations for different parts and gain the discriminative power on unseen persons. Experimental results on three person ReID datasets, i.e., Market1501, CUHK03, VIPeR, show that our representation outperforms existing deep representations.
doi:10.1109/tip.2019.2891888 fatcat:sixozjalnnh6lmcfturmo4jese

Multi-Objective Matrix Normalization for Fine-grained Visual Recognition

Shaobo Min, Hantao Yao, Hongtao Xie, Zheng-Jun Zha, Yongdong Zhang
2020 IEEE Transactions on Image Processing  
Yao et al. [41] design two complementary and elaborate part-level and object-level visual descriptions to capture robust and discriminative visual descriptions.  ...  Yao is with the National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences. Beijing, China. (e-mail: hantao.yao@nlpr.ia.ac.cn) H. Xie and Y.  ... 
doi:10.1109/tip.2020.2977457 pmid:32149637 fatcat:w5syyp7emnhxfp63yydnar3v5q

Phase discontinuities induced scintillation enhancement: coherent vortex beams propagating through weak oceanic turbulence [article]

Hantao Wang, Huajun Zhang, Mingyuan Ren, Jinren Yao, Yu Zhang
2021 arXiv   pre-print
Under the impact of an infinitely extended edge phase dislocation, optical vortices (screw phase dislocations) induce scintillation enhancement. The scintillation index of a beam consisting of two Gaussian vortex beams with ±1 topological charges through weak oceanic turbulence is researched via derivation and phase screen simulation. Different combinations of two types of phase discontinuities can be obtained by changing the overlapping degree and the phase difference of two coherent Gaussian vortex beams. The scintillation indexes for them verify that the formation condition of the phenomenon is the coexistence of two types of phase discontinuities. And the enhanced scintillation index can be several orders of magnitude larger than that of a plane wave under weak perturbation (Rytov variance). This phenomenon could be useful for both optical vortex detection and perturbation measurement.
arXiv:2102.03184v3 fatcat:dcrcdg3umfeh5dwtq5plmakrem

Rapid, accurate, multifunctional and self-assisted vision assessment and screening with interactive desktop autostereoscopy

Xiaoke Li, Jing Zhong, Yiyao Wang, Hantao Zhang, Jinrong Li, Kunyang Li, Li Gu, Min Zheng, Jin Yuan, Hang Fan, Dongyan Deng, Yao Wang (+1 others)
2021 Annals of Translational Medicine  
This study aimed to develop an interactive vision screening tool based on desktop autostereoscopy and evaluate its feasibility for testing visual acuity, colour vision, stereo vision and binocular balance clinically. An interactive desktop autostereoscopy vision test was developed making it remarkably convenient for individuals to undergo multiple visual function assessments in a single test. With this rapid screening process, an individual's visual acuity, colour vision, stereo vision and binocular balance can be assessed within several minutes. A total of 155 healthy subjects were enrolled to compare the clinical repeatability, accuracy, inter-visit variability, likeability and efficiency between the autostereoscopy and traditional method. In the repeatability test, the visual acuity measured with autostereoscopy was 0.045±0.018 and 0.035±0.018 (P=0.702) for the first and second tests, respectively. The mean logarithm of the Minimum Angle of Resolution (logMAR) visual acuities measured with the Early Treatment Diabetic Retinopathy Study (ETDRS) chart and autostereoscopy test were 0.04±0.02 and 0.05±0.02, respectively, which were not significantly different (P=0.849). The correlation between these two kinds of tests was statistically significant (Spearman correlation coefficient =0.829, P<0.001). The results for colour vision, stereo vision, and binocular vision are presented, and the effectiveness of the autostereoscopic method is supported with qualitative data comparing its results with those of the traditional methods. In the likeability test, the ETDRS chart and autostereoscopy test had scores of 2.21±0.53 and 3.04±0.07, while the traditional and autostereoscopy tests for colour vision, stereo vision, and binocular vision had scores of 2.02±0.59 and 3.36±0.93, respectively (P<0.001). Regarding visual fatigue, the mean scores were 0.69±0.04 and 0.42±0.04 (P<0.001) with the ETDRS chart and autostereoscopy test, respectively. Regarding work efficiency, the average testing time per person was 59.65±0.66 and 48.92±0.86 s (P<0.001) with the ETDRS chart and autostereoscopy test, respectively. The autostereoscopy test was conclusively shown to be valid, efficient and repeatable for the measurement of visual acuity, colour vision, stereo vision, and binocular vision, and the process was subjectively well-liked and comfortable.
doi:10.21037/atm-20-3555 pmid:33553316 pmcid:PMC7859821 fatcat:54prxowpyfcehe7mvussonfqz4
Showing results 1 to 15 out of 78 results