Multi-focus Image Fusion using Fully Convolutional Two-stream Network for Visual Sensors

2018 KSII Transactions on Internet and Information Systems  
We propose a deep learning method for multi-focus image fusion. Unlike most existing pixel-level fusion methods, which operate in either the spatial domain or the transform domain, our method directly learns an end-to-end fully convolutional two-stream network. The framework maps a pair of differently focused images to a clean version through a chain of convolutional layers, a fusion layer, and deconvolutional layers. Our deep fusion model is efficient and robust, yet demonstrates state-of-the-art fusion quality. We explore different parameter settings to achieve trade-offs between performance and speed. Moreover, experimental results on our training dataset show that our network achieves good performance under both subjective visual perception and objective assessment metrics.

Multi-focus image fusion is employed to fuse substantial information from multiple images of the same scene into a clear composite image. It can overcome the diversities and limitations of a single sensor in spatial resolution, geometry, and spectrum, thus enhancing the reliability of image processing tasks such as feature extraction, edge detection, object recognition, and image segmentation. Currently, multi-focus image fusion has a wide range of applications in transportation, medical imaging, military operations, and machine vision [1]. In each application, the key task of image fusion is to find the accurate information in the source images; that is, a fused image containing all relevant objects in focus can be obtained by composing the clear regions or pixels. However, it is difficult to determine which regions or pixels are in focus [2]. To address this problem, many researchers have applied information theory to fusion.

Generally, an image fusion process operates at one of three levels: pixel, feature, or decision [3][4]. Pixel-level fusion deals directly with the pixels of the source images; it is the lowest level of image fusion and mainly concentrates on visual enhancement. It preserves the original information of the scene most easily, and its advantages are low complexity and high accuracy [5][6]. Feature-level fusion operates on features extracted from the source images for analysis and processing, and can support decision-level fusion. At the feature level, image features include size, edges, corners, and textures [7][8]. Feature-level fusion does not require strict registration of the source images.
Moreover, only image features are processed, which is convenient for information compression and data transmission. Decision-level fusion is the highest level of image fusion, aiming to make the best decision under credibility criteria. It can be defined as the process of fusing information from several individual data sources after each source has been preprocessed, feature-extracted, and classified [9]. In summary, pixel-level image fusion preserves more detailed information than feature-level and decision-level fusion [10].

Pixel-level image fusion is categorized into two domains: the spatial domain and the frequency domain [11]. Spatial-domain methods combine relevant information directly from regions or pixels using focused-region properties, for example focused-pixel detection [46], point spread functions (PSFs) [38], and guided filtering (GF) [12]. Frequency-domain methods transform the source images into the frequency domain, combine the frequency coefficients according to fusion rules, and apply the inverse transform to obtain a clear image; examples include the non-subsampled contourlet transform (NSCT) [47], the non-subsampled shearlet transform (NSST) [48], and the discrete cosine transform (DCT) [49].

In recent years, image fusion approaches using machine learning (ML) algorithms have been proposed for the classification of focused image regions. Artificial neural network (ANN) and support vector machine (SVM) based fusion methods have been explored with visibility, spatial frequency, and edge features [13-14]. Another efficient variant of the ANN, the probabilistic neural network (PNN), has also been developed for image fusion [15]. C. M. Sheela Rani and V. Vijayakumar et al. proposed an efficient block-based feature-level contourlet transform with neural network (BFCN) model for image fusion [16]. All of the above approaches focus on feature-level or decision-level fusion.
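As an illustration of the spatial-domain, block-level fusion surveyed above, the following sketch selects, for each block, the source block with the higher spatial frequency (a common focus measure, and one of the features used by the ANN/SVM methods cited here). The block size and the choice of measure are illustrative assumptions, not the method of any cited work:

```python
import numpy as np

def spatial_frequency(block):
    # Spatial frequency = sqrt(RF^2 + CF^2), a standard focus measure
    rf = np.sqrt(np.mean(np.diff(block, axis=1) ** 2))  # row frequency
    cf = np.sqrt(np.mean(np.diff(block, axis=0) ** 2))  # column frequency
    return np.sqrt(rf ** 2 + cf ** 2)

def block_fusion(img_a, img_b, block=8):
    # Block-wise selection: keep whichever source block is more "in focus".
    h, w = img_a.shape
    fused = np.empty_like(img_a)
    for i in range(0, h, block):
        for j in range(0, w, block):
            a = img_a[i:i + block, j:j + block]
            b = img_b[i:i + block, j:j + block]
            fused[i:i + block, j:j + block] = (
                a if spatial_frequency(a) >= spatial_frequency(b) else b
            )
    return fused

# Demo: each synthetic source is sharp in one half; fusion recovers both halves.
checker = (np.indices((16, 16)).sum(axis=0) % 2).astype(float)  # high-frequency target
img_a = np.zeros_like(checker); img_a[:, :8] = checker[:, :8]   # left half in focus
img_b = np.zeros_like(checker); img_b[:, 8:] = checker[:, 8:]   # right half in focus
fused = block_fusion(img_a, img_b, block=8)                     # recovers the checker
```

Hard block-wise selection like this is simple and fast, but it produces blocking artifacts at focus boundaries; this limitation is one motivation for learned, end-to-end approaches.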
Among state-of-the-art methods, the Convolutional Neural Network (CNN) has achieved record-breaking performance in computer vision and image processing tasks such as detection, recognition, and tracking.
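The architecture outlined in the abstract, two input streams passed through convolutional layers, merged by a fusion layer, and decoded by deconvolutional layers, can be sketched as follows in PyTorch. Layer counts, channel widths, and weight sharing between the two streams are assumptions for illustration, not the paper's exact configuration:

```python
import torch
import torch.nn as nn

class TwoStreamFusionNet(nn.Module):
    """Illustrative fully convolutional two-stream fusion network (sketch)."""

    def __init__(self, ch=32):
        super().__init__()
        # Encoder applied to each source image (one stream per input;
        # weight sharing between streams is an assumption here).
        self.encoder = nn.Sequential(
            nn.Conv2d(1, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True),
        )
        # Fusion layer: merge the two feature streams channel-wise.
        self.fuse = nn.Conv2d(2 * ch, ch, 1)
        # Deconvolutional decoder maps fused features back to a clean image.
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(ch, 1, 3, padding=1),
        )

    def forward(self, x1, x2):
        f1, f2 = self.encoder(x1), self.encoder(x2)
        return self.decoder(self.fuse(torch.cat([f1, f2], dim=1)))

net = TwoStreamFusionNet()
a = torch.randn(1, 1, 64, 64)  # two differently focused views of one scene
b = torch.randn(1, 1, 64, 64)
out = net(a, b)  # fused output, same spatial size as the inputs: (1, 1, 64, 64)
```

Because every layer is convolutional, the network is fully convolutional and accepts source image pairs of arbitrary spatial size at inference time.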
doi:10.3837/tiis.2018.05.019