Deep Keyframe Detection in Human Action Videos [article]

Xiang Yan, Syed Zulqarnain Gilani, Hanlin Qin, Mingtao Feng, Liang Zhang, Ajmal Mian
2018 arXiv   pre-print
Detecting representative frames in videos based on human actions is quite challenging because of the combined factors of human pose in action and the background. This paper addresses this problem and formulates the key frame detection as one of finding the video frames that maximally contribute to differentiating the underlying action category from all other categories. To this end, we introduce a deep two-stream ConvNet for key frame detection in videos that learns to directly
... the location of key frames. Our key idea is to automatically generate labeled data for the CNN learning using a supervised linear discriminant method. While the training data is generated taking many different human action videos into account, the trained CNN can predict the importance of frames from a single video. We specify a new ConvNet framework, consisting of a summarizer and discriminator. The summarizer is a two-stream ConvNet aimed at, first, capturing the appearance and motion features of video frames, and then encoding the obtained appearance and motion features for video representation. The discriminator is a fitting function aimed at distinguishing between the key frames and others in the video. We conduct experiments on a challenging human action dataset UCF101 and show that our method can detect key frames with high accuracy.
arXiv:1804.10021v1 fatcat:jciq5xtdfbdbhd3q44ixqpzt2y

Learning a Convolutional Autoencoder for Nighttime Image Dehazing

Mengyao Feng, Teng Yu, Mingtao Jing, Guowei Yang
2020 Information  
Currently, haze removal of images captured at night for foggy scenes relies on the traditional, prior-based methods, but these methods are frequently ineffective at dealing with night hazy images. In addition, the light sources at night are complicated and there is a problem of inconsistent brightness. This makes the estimation of the transmission map complicated in the night scene. Based on the above analysis, we propose an autoencoder method to solve the problem of overestimation or
... ion of transmission captured by the traditional, prior-based methods. For nighttime hazy images, we first remove the color effect of the haze image with an edge-preserving maximum reflectance prior (MRP) method. Then, the hazy image without color influence is input into the autoencoder network with skip connections to obtain the transmission map. Moreover, instead of using the local maximum method, we estimate the ambient illumination through guided image filtering. In order to highlight the effectiveness of our experiments, a large number of comparison experiments were conducted between our method and the state-of-the-art methods. The results show that our method can effectively suppress the halo effect and reduce the glow effect. In the experimental part, we calculate that the average Peak Signal to Noise Ratio (PSNR) is 21.0968 and the average Structural Similarity (SSIM) is 0.6802.
doi:10.3390/info11090424 fatcat:lzxhmdv7bnaqxbnr7l2s7swqtu
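
The abstract above reports an average PSNR without restating how the metric is computed. As a minimal sketch of the standard definition, PSNR = 10 log10(MAX^2 / MSE), the function below works on flat lists of pixel values. The name `psnr`, the flat-list representation, and the default peak value of 255 are illustrative assumptions and are not taken from the paper.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists.

    Hypothetical helper for illustration; the paper's evaluation code is not shown.
    """
    if len(reference) != len(restored) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error between the reference and the restored image.
    mse = sum((r - d) ** 2 for r, d in zip(reference, restored)) / len(reference)
    if mse == 0:
        # Identical images: PSNR is unbounded.
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)
```

A higher value means the restored image is closer to the reference; the scale is logarithmic, so a fixed gain in dB corresponds to a fixed ratio of mean squared errors.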

CGGAN: A Context Guided Generative Adversarial Network For Single Image Dehazing [article]

Zhaorun Zhou, Zhenghao Shi, Mingtao Guo, Yaning Feng, Minghua Zhao
2020 arXiv   pre-print
Image haze removal is highly desired for the application of computer vision. This paper proposes a novel Context Guided Generative Adversarial Network (CGGAN) for single image dehazing. A novel encoder-decoder is employed as the generator. It consists of a feature-extraction-net, a context-extraction-net, and a fusion-net in sequence. The feature-extraction-net acts as an encoder, and is used for extracting haze features. The context-extraction-net is a multi-scale parallel
... mid decoder, and is used for extracting the deep features of the encoder and generating a coarse dehazed image. The fusion-net is a decoder, and is used for obtaining the final haze-free image. To obtain better results, multi-scale information obtained during the decoding process of the context extraction decoder is used for guiding the fusion decoder. By introducing an extra coarse decoder to the original encoder-decoder, the CGGAN can make better use of the deep feature information extracted by the encoder. To ensure our CGGAN works effectively for different haze scenarios, different loss functions are employed for the two decoders. Experimental results show the advantage and effectiveness of the proposed CGGAN; evident improvements over existing state-of-the-art methods are obtained.
arXiv:2005.13884v1 fatcat:eogreqdminabzpb5qdtl3rgf3y

Point Attention Network for Semantic Segmentation of 3D Point Clouds [article]

Mingtao Feng, Liang Zhang, Xuefei Lin, Syed Zulqarnain Gilani and Ajmal Mian
2019 arXiv   pre-print
Convolutional Neural Networks (CNNs) have performed extremely well on data represented by regularly arranged grids such as images. However, directly leveraging the classic convolution kernels or parameter sharing mechanisms on sparse 3D point clouds is inefficient due to their irregular and unordered nature. We propose a point attention network that learns rich local shape features and their contextual correlations for 3D point cloud semantic segmentation. Since the geometric distribution of
... neighboring points is invariant to the point ordering, we propose a Local Attention-Edge Convolution (LAE-Conv) to construct a local graph based on the neighborhood points searched in multi-directions. We assign attention coefficients to each edge and then aggregate the point features as a weighted sum of its neighbors. The learned LAE-Conv layer features are then given to a point-wise spatial attention module to generate an interdependency matrix of all points regardless of their distances, which captures long-range spatial contextual features contributing to more precise semantic information. The proposed point attention network consists of an encoder and decoder which, together with the LAE-Conv layers and the point-wise spatial attention modules, make it an end-to-end trainable network for predicting dense labels for 3D point cloud segmentation. Experiments on challenging benchmarks of 3D point clouds show that our algorithm can perform on par with or better than the existing state-of-the-art methods.
arXiv:1909.12663v1 fatcat:vpk2ae4m7zfqppy2c43o2fxd5i
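
The abstract above describes assigning an attention coefficient to each edge and aggregating point features as a weighted sum over neighbors. The sketch below illustrates only that general idea. It assumes the coefficients come from a softmax over per-edge scores, which the abstract does not state, and the function name `aggregate_neighbors` and its inputs are hypothetical rather than the authors' LAE-Conv implementation.

```python
import math


def aggregate_neighbors(neighbor_features, edge_scores):
    """Attention-weighted sum of neighbor feature vectors (illustrative only).

    neighbor_features: list of equal-length feature vectors, one per neighbor.
    edge_scores: one raw score per edge; softmax normalization is an assumption.
    Returns (aggregated_feature, attention_weights).
    """
    if len(neighbor_features) != len(edge_scores) or not edge_scores:
        raise ValueError("need one score per neighbor and at least one neighbor")
    # Numerically stable softmax over the edge scores.
    peak = max(edge_scores)
    exps = [math.exp(s - peak) for s in edge_scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted sum of neighbor features, one output value per feature dimension.
    dim = len(neighbor_features[0])
    aggregated = [
        sum(w * feat[d] for w, feat in zip(weights, neighbor_features))
        for d in range(dim)
    ]
    return aggregated, weights
```

With equal scores the result reduces to the plain mean of the neighbor features; unequal scores shift the aggregate toward the neighbors with larger coefficients.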

Relation Graph Network for 3D Object Detection in Point Clouds [article]

Mingtao Feng, Syed Zulqarnain Gilani, Yaonan Wang, Liang Zhang and Ajmal Mian
2019 arXiv   pre-print
Convolutional Neural Networks (CNNs) have emerged as a powerful strategy for most object detection tasks on 2D images. However, their power has not been fully realised for detecting 3D objects in point clouds directly without converting them to regular grids. Existing state-of-the-art 3D object detection methods aim to recognize 3D objects individually without exploiting their relationships during learning or inference. In this paper, we first propose a strategy that associates the predictions of
... rection vectors and pseudo geometric centers together leading to a win-win solution for 3D bounding box candidates regression. Secondly, we propose point attention pooling to extract uniform appearance features for each 3D object proposal, benefiting from the learned direction features, semantic features and spatial coordinates of the object points. Finally, the appearance features are used together with the position features to build 3D object-object relationship graphs for all proposals to model their co-existence. We explore the effect of relation graphs on proposals' appearance features enhancement under supervised and unsupervised settings. The proposed relation graph network consists of a 3D object proposal generation module and a 3D relation module, making it an end-to-end trainable network for detecting 3D objects in point clouds. Experiments on challenging benchmarks (SunRGB-D and ScanNet datasets) of 3D point clouds show that our algorithm can perform better than the existing state-of-the-art methods.
arXiv:1912.00202v1 fatcat:eookfz7syjempkfvfe2fbhcenu

Self-Supervised Learning to Detect Key Frames in Videos

Xiang Yan, Syed Zulqarnain Gilani, Mingtao Feng, Liang Zhang, Hanlin Qin, Ajmal Mian
2020 Sensors  
Detecting key frames in videos is a common problem in many applications such as video classification, action recognition and video summarization. These tasks can be performed more efficiently using only a handful of key frames rather than the full video. Existing key frame detection approaches are mostly designed for supervised learning and require manual labelling of key frames in a large corpus of training data to train the models. Labelling requires human annotators from different
... to annotate key frames in videos which is not only expensive and time consuming but also prone to subjective errors and inconsistencies between the labelers. To overcome these problems, we propose an automatic self-supervised method for detecting key frames in a video. Our method comprises a two-stream ConvNet and a novel automatic annotation architecture able to reliably annotate key frames in a video for self-supervised learning of the ConvNet. The proposed ConvNet learns deep appearance and motion features to detect frames that are unique. The trained network is then able to detect key frames in test videos. Extensive experiments on UCF101 human action and video summarization VSUMM datasets demonstrate the effectiveness of our proposed method.
doi:10.3390/s20236941 pmid:33291759 fatcat:pqiovyqo2baa5cyq37os7hxjmy

Learning from Pixel-Level Noisy Label : A New Perspective for Light Field Saliency Detection [article]

Mingtao Feng, Kendong Liu, Liang Zhang, Hongshan Yu, Yaonan Wang, Ajmal Mian
2022 arXiv   pre-print
Saliency detection with light field images is becoming attractive given the abundant cues available; however, this comes at the expense of large-scale pixel-level annotated data, which is expensive to generate. In this paper, we propose to learn light field saliency from pixel-level noisy labels obtained from unsupervised hand-crafted feature-based saliency methods. Given this goal, a natural question is: can we efficiently incorporate the relationships among light field cues while identifying
... lean labels in a unified framework? We address this question by formulating the learning as a joint optimization of intra light field features fusion stream and inter scenes correlation stream to generate the predictions. Specifically, we first introduce a pixel forgetting guided fusion module to mutually enhance the light field features and exploit pixel consistency across iterations to identify noisy pixels. Next, we introduce a cross scene noise penalty loss for better reflecting latent structures of training data and enabling the learning to be invariant to noise. Extensive experiments on multiple benchmark datasets demonstrate the superiority of our framework showing that it learns saliency prediction comparable to state-of-the-art fully supervised light field saliency methods. Our code is available at https://github.com/OLobbCode/NoiseLF.
arXiv:2204.13456v1 fatcat:np3aj7mzingu3gt46zwym5emha

3D Face Reconstruction from Light Field Images: A Model-Free Approach [chapter]

Mingtao Feng, Syed Zulqarnain Gilani, Yaonan Wang, Ajmal Mian
2018 Lecture Notes in Computer Science  
Reconstructing 3D facial geometry from a single RGB image has recently instigated wide research interest. However, it is still an ill-posed problem and most methods rely on prior models hence undermining the accuracy of the recovered 3D faces. In this paper, we exploit the Epipolar Plane Images (EPI) obtained from light field cameras and learn CNN models that recover horizontal and vertical 3D facial curves from the respective horizontal and vertical EPIs. Our 3D face reconstruction network
... eLFnet) comprises a densely connected architecture to learn accurate 3D facial curves from low resolution EPIs. To train the proposed FaceLFnets from scratch, we synthesize photo-realistic light field images from 3D facial scans. The curve by curve 3D face estimation approach allows the networks to learn from only 14K images of 80 identities, which still comprises over 11 Million EPIs/curves. The estimated facial curves are merged into a single point cloud to which a surface is fitted to get the final 3D face. Our method is model-free, requires only a few training samples to learn FaceLFnet and can reconstruct 3D faces with high accuracy from single light field images under varying poses, expressions and lighting conditions. Comparisons on the BU-3DFE and BU-4DFE datasets show that our method reduces reconstruction errors by over 20% compared to recent state of the art.
doi:10.1007/978-3-030-01249-6_31 fatcat:5qajm25cjnautmbrqw2lrbeph4

Chaotic Brillouin optical correlation-domain analysis

Jianzhong Zhang, Mingtao Zhang, Mingjiang Zhang, Yi Liu, Changkun Feng, Yahui Wang, Yuncai Wang
2018 Optics Letters  
We propose and experimentally demonstrate a chaotic Brillouin optical correlation-domain analysis (BOCDA) system for distributed fiber sensing. The utilization of the chaotic laser with low coherent state ensures high spatial resolution. The experimental results demonstrate a 3.92-cm spatial resolution over a 906-m measurement range. The uncertainty in the measurement of the local Brillouin frequency shift is 1.2 MHz. The measurement signal-to-noise ratio is given, which is in agreement with the theoretical value.
doi:10.1364/ol.43.001722 pmid:29652349 fatcat:xfa4invzlvcafkiqt7vu4lnqvu

Rescheduling Plan Optimization of Underground Mine Haulage Equipment Based on Random Breakdown Simulation

Ning Li, Shuzhao Feng, Tao Lei, Haiwang Ye, Qizhou Wang, Liguan Wang, Mingtao Jia
2022 Sustainability  
Due to production space and operating environment requirements, mine production equipment often breaks down, seriously affecting the mine's production schedule. To ensure the smooth completion of the haulage operation plan under abnormal conditions, a model of the haulage equipment rescheduling plan based on the random simulation of equipment breakdowns is established in this paper. The model aims to accomplish both the maximum completion rate of the original mining plan and the minimum
... ion of the ore grade during the rescheduling period. This model is optimized by improving the wolf colony algorithm and changing the location update formula of the individuals in the wolf colony. Then, the optimal model solution can be used to optimize the rescheduling of the haulage plan by considering equipment breakdowns. The application of the proposed method in an underground mine revealed that the completion rate of the mine's daily mining plan reached 83.40% without increasing the amount of equipment, while the ore quality remained stable. Moreover, the improved optimization algorithm converged quickly and was characterized by high robustness.
doi:10.3390/su14063448 fatcat:xrcoyyg6kjglvbopxjncnv7t2m

Alterations of oral microbiota distinguish children with autism spectrum disorders from healthy controls

Yanan Qiao, Mingtao Wu, Yanhuizhi Feng, Zhichong Zhou, Lei Chen, Fengshan Chen
2018 Scientific Reports  
Altered gut microbiota is associated with autism spectrum disorders (ASD), a group of complex, fast growing but difficult-to-diagnose neurodevelopmental disorders worldwide. However, the role of the oral microbiota in ASD remains unexplored. Via high-throughput sequencing of 111 oral samples in 32 children with ASD and 27 healthy controls, we demonstrated that the salivary and dental microbiota of ASD patients were highly distinct from those of healthy individuals. Lower bacterial diversity was
more » ... observed in ASD children compared to controls, especially in dental samples. Also, principal coordinate analysis revealed divergences between ASD patients and controls. Moreover, pathogens such as Haemophilus in saliva and Streptococcus in plaques showed significantly higher abundance in ASD patients, whereas commensals such as Prevotella, Selenomonas, Actinomyces, Porphyromonas, and Fusobacterium were reduced. Specifically, an overt depletion of Prevotellaceae co-occurrence network in ASD patients was obtained in dental plaques. The distinguishable bacteria were also correlated with clinical indices, reflecting disease severity and the oral health status (i.e. dental caries). Finally, diagnostic models based on key microbes were constructed, with 96.3% accuracy in saliva. Taken together, this study characterized the habitat-specific profile of the oral microbiota in ASD patients, which might help develop novel strategies for the diagnosis of ASD. Autism spectrum disorders (ASD) are complex neurodevelopmental disorders unfolded in the first couple years of life, and characterized by impairments in language and social interactions, often with restricted interests and repetitive behaviors 1 . ASD affect 1-in-68 children in America, with the prevalence increasing dramatically over the past decades 2 . Currently, the fifth edition of Diagnostic and Statistical Manual of Mental Disorders (DSM-5) is considered the "gold standard" for ASD diagnosis. However, DSM-5 encompasses descriptive criteria merely based on empirical data, with no laboratory or other diagnostic tests 3 . As a result, diagnosis according to the relatively time-consuming DSM-5 guideline is not sufficiently scientific, and needs further validation 4 . Thus, developing precise and reliable diagnosis tools for ASD remains a challenge. Genetic and environmental factors both play roles in the pathogenesis of ASD 5 . 
Available twin studies showed that environmental factors are more important than genetic predisposition 6 . Among such factors, microbial dysbiosis is of increasing interest, with accumulating reports in animal models and human epidemiologic studies linking disruptive alterations in the gut microbiota to ASD symptomology 7-12 . Although the mechanisms by which gut microorganisms might impact the central nervous system (CNS) are not fully understood, the human microbiota and related metabolites have been reported to affect a variety of complex behaviors, including social, emotional, and anxiety-like behaviors, also contributing to brain development and modulating cognition through imbalances in the microbiota-gut-CNS axis 2,13,14 . Consequently, microbial dysbiosis might play a causal role in the development of mental disorders, in a pathway mediated through the host's metabolism.
doi:10.1038/s41598-018-19982-y pmid:29371629 pmcid:PMC5785483 fatcat:ezfunsts6bdknoe45fpmck3v4i

Minimum Potential Energy of Point Cloud for Robust Global Registration [article]

Zijie Wu, Yaonan Wang, Qing Zhu, Jianxu Mao, Haotian Wu, Mingtao Feng, Ajmal Mian
2020 arXiv   pre-print
In this paper, we propose a novel minimum gravitational potential energy (MPE)-based algorithm for global point set registration. Feature descriptor extraction algorithms have emerged as the standard approach to align point sets in the past few decades. However, alignment can be difficult when the point set suffers from raw point data problems such as noise (Gaussian and uniform). Unlike most existing point set registration methods, which usually extract
... he descriptors to find correspondences between point sets, our proposed MPE alignment method is able to handle large scale raw data offset without depending on traditional descriptors extraction, whether for the local or global registration methods. We decompose the solution into a global optimal convex approximation and the fast descent process to a local minimum. For the approximation step, the proposed minimum potential energy (MPE) approach consists of two main steps. First, according to the construction of the force traction operator, we could simply compute the position of the potential energy minimum; second, with respect to the finding of the MPE point, we propose a new theory that employs the two flags to observe the status of the registration procedure. The method of fast descent process to the minimum that we employed is the iterative closest point algorithm; it can achieve the global minimum. We demonstrate the performance of the proposed algorithm on synthetic data as well as on real data. The proposed method outperforms the other global methods in terms of efficiency, accuracy, and noise resistance.
arXiv:2006.06460v2 fatcat:l37nglx7yvbjfjkd5soyw5xcga

Genetically determined height was associated with lung cancer risk in East Asian population

Lu Wang, Mingtao Huang, Hui Ding, Guangfu Jin, Liang Chen, Feng Chen, Hongbing Shen
2018 Cancer Medicine  
doi:10.1002/cam4.1557 pmid:29790669 pmcid:PMC6051217 fatcat:p2so3h4rmjen3o7xbr3e2ddfie

A Systematic Collection of Medical Image Datasets for Deep Learning [article]

Johann Li, Guangming Zhu, Cong Hua, Mingtao Feng, Basheer Bennamoun, Ping Li, Xiaoyuan Lu, Juan Song, Peiyi Shen, Xu Xu, Lin Mei, Liang Zhang (+2 others)
2021 arXiv   pre-print
The astounding success made by artificial intelligence (AI) in healthcare and other fields proves that AI can achieve human-like performance. However, success always comes with challenges. Deep learning algorithms are data-dependent and require large datasets for training. The lack of data in the medical imaging field creates a bottleneck for the application of deep learning to medical image analysis. Medical image acquisition, annotation, and analysis are costly, and their usage is constrained
... by ethical restrictions. They also require many resources, such as human expertise and funding. That makes it difficult for non-medical researchers to have access to useful and large medical data. Thus, as comprehensive as possible, this paper provides a collection of medical image datasets with their associated challenges for deep learning research. We have collected information on around three hundred datasets and challenges mainly reported between 2013 and 2020 and categorized them into four categories: head & neck, chest & abdomen, pathology & blood, and "others". Our paper has three purposes: 1) to provide a most up-to-date and complete list that can be used as a universal reference to easily find the datasets for clinical image analysis, 2) to guide researchers on the methodology to test and evaluate their methods' performance and robustness on relevant datasets, 3) to provide a "route" to relevant algorithms for the relevant medical topics, and challenge leaderboards.
arXiv:2106.12864v1 fatcat:bjzkgce2xvaexmb6cdznws7fye

Dispatch Optimization Model for Haulage Equipment between Stopes Based on Mine Short-Term Resource Planning

Ning Li, Shuzhao Feng, Haiwang Ye, Qizhou Wang, Mingtao Jia, Liguan Wang, Shugang Zhao, Dongfang Chen
2021 Metals  
The working environment of underground mines is complicated, making it difficult to construct an underground mine production plan. In response to the requirements for the preparation of a short-term production plan for underground mines, an optimization model for short-term resource planning was constructed, with the goal of maximizing the total revenue during the planning period. The artificial bee colony optimization algorithm is used to solve the model using MATLAB. According to the basic
... uirements of underground mine ore haulage and ore hoisting, a haulage equipment inter-stopes dispatch plan model was constructed, with the primary goal of minimizing the haulage equipment wait time. A non-dominated sorting genetic algorithm is used to solve the optimization model. An underground mine is examined using the two models, and the optimization results are compared and verified with the scheme obtained by using traditional optimization algorithms. Results show that based on the improved optimization algorithm, the use of short-term production planning schemes to guide mine production operations can increase the haulage equipment utilization rate, thereby increasing mine production revenue.
doi:10.3390/met11111848 fatcat:4mpot7omdngptirvvl6vuq7ljq