Discrimination of Transgenic Maize Kernel Using NIR Hyperspectral Imaging and Multivariate Data Analysis
There are possible environmental risks related to gene flow from genetically engineered organisms. It is important to find accurate, fast, and inexpensive methods to detect and monitor the presence of genetically modified (GM) organisms in crops and derived crop products. In the present study, GM maize kernels containing both cry1Ab/cry2Aj-G10evo proteins and their non-GM parents were examined by using hyperspectral imaging in the near-infrared (NIR) range (874.41-1733.91 nm) combined with chemometric data analysis. The hypercube data were analyzed by applying principal component analysis (PCA) for exploratory purposes, and support vector machine (SVM) and partial least squares discriminant analysis (PLS-DA) to build discriminant models that classify the GM maize kernels against their non-GM counterparts. The results indicate that clear differences between GM and non-GM maize kernels can be easily visualized with the nondestructive determination method developed in this study, and that excellent classification can be achieved, with calculation and prediction accuracy of almost 100%. This study also demonstrates that SVM and PLS-DA models can obtain good performance with 54 wavelengths selected by the competitive adaptive reweighted sampling (CARS) method, which makes classification faster for online applications. Finally, GM maize kernels were visually identified on prediction maps obtained by predicting the class of each pixel in individual hyperspectral images. It was concluded that hyperspectral imaging together with chemometric data analysis is a promising technique to identify GM maize kernels, since it overcomes some disadvantages of the traditional analytical methods, such as complex and tedious sampling.
It has been argued that the use of GM techniques could result in unpredictable adverse effects on food and environmental safety. These unintended effects include the uncontrollable escape of exogenous genes into neighboring wild plants via pollen, the formation of toxins associated with GM food, and modification of the biodiversity of the host plant through changes in the expression of existing genes. The introduction of genetically modified organisms (GMOs) into agro-food markets should be accompanied by regulatory monitoring and verification of the presence and amount of GM varieties to guarantee consumer safety. Consequently, there is a need for GMO detection methods that are accurate, fast, and inexpensive.
Currently, several analytical methods have been proposed for the determination, characterization, and authentication of GMOs in crops and derived crop products, such as polymerase chain reaction (PCR)/restriction enzyme assay, enzyme-linked immunosorbent assays, lateral flow strip, and microarray. As a whole, the DNA- and protein-based methods for the identification of GMOs are versatile, sensitive, and accurate. However, they also have disadvantages: they are destructive, laborious, expensive, time-consuming, and require highly skilled operators; thus, they are unsuitable for online process control. As non-destructive, synchronous, and coherent detection tools, spectroscopic techniques are environmentally friendly, fast, and easy to operate without complex sample pretreatments. Combining near-infrared (NIR) spectroscopy with chemometrics to identify GMOs in the agro-food market is feasible. NIR is the region of the electromagnetic spectrum between 750 nm and 2500 nm, and NIR spectroscopy is often used to gather information on the relative proportions of C-H, N-H, and O-H bonds in organic molecules. This technology can detect transgenic lines because genotypic changes cause phenotypic changes, which ultimately alter the organic molecular bonds. Liu et al. (2014) distinguished GM rice seeds from their counterparts by using visible/near-infrared (VIS-NIR) spectroscopy combined with a chemometric tool, reaching classification accuracy of up to 100% with the least squares-support vector machine (LS-SVM) model. García-Molina et al. (2016) applied NIR spectroscopy to discriminate GM wheat grain and flour from non-GM wheat lines. Guo et al.
(2014) also demonstrated that clear differences between GM and non-GM tomatoes could be identified by using VIS-NIR together with discriminant partial least squares regression, with excellent classification accuracy of up to 100%. However, conventional NIR spectroscopy, although widely used for the identification of transgenic foods, lacks spatial information. In contrast, NIR hyperspectral imaging combines traditional optical imaging with spectroscopy: it captures images over broad contiguous wavelengths in the NIR region, and it has received much attention in cereal science. These images form a three-dimensional structure (x, y, λ) of multivariate data for processing and analysis, where x and y are the spatial dimensions (the number of rows and columns in pixels), and λ represents the number of wavelengths [19, 20]. NIR hyperspectral imaging is a powerful spectroscopic tool for seed classification, quality discrimination, and object detection because it provides visual information about the samples. Its applications in cereal science are numerous, including disease and pest diagnoses [18, 22], kernel density classification [16, 17], seed moisture determination, and rice cultivar identification. Currently, limited research has used this technique to distinguish GM from non-GM samples. Prior to this study, no research had mapped the spatial heterogeneity between GMOs and non-GM controls based on their different spectral signatures.
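The (x, y, λ) structure described above can be illustrated with a short sketch. It assumes a hypercube stored as a NumPy array of shape (rows, cols, bands), uses random numbers in place of real image data, and computes PCA through a singular value decomposition; the function names `unfold` and `pca_scores` are hypothetical and are not taken from the study.

```python
import numpy as np

def unfold(hypercube):
    """Reshape a (rows, cols, bands) hypercube into a (rows*cols, bands) pixel matrix."""
    rows, cols, bands = hypercube.shape
    return hypercube.reshape(rows * cols, bands)

def pca_scores(pixels, n_components=2):
    """Project mean-centred spectra onto the leading principal components via SVD."""
    centred = pixels - pixels.mean(axis=0)
    # Rows of vt are the principal axes (loadings) in wavelength space.
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    return centred @ vt[:n_components].T

# Synthetic stand-in for a real image (assumption): 4 x 5 pixels, 8 bands.
rng = np.random.default_rng(0)
cube = rng.random((4, 5, 8))

pixels = unfold(cube)            # one spectrum per row, shape (20, 8)
scores = pca_scores(pixels, 2)   # shape (20, 2)

# Fold the first score vector back onto the spatial grid to get a score image.
score_image = scores[:, 0].reshape(cube.shape[:2])
```

Each row of `pixels` is the spectrum of one pixel, so any multivariate method that expects a samples-by-variables matrix can be applied to it, and the result can be folded back onto the spatial grid for display.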
This study had four goals: (1) to examine the feasibility of using NIR hyperspectral imaging to identify GM maize kernels produced by Agrobacterium tumefaciens-mediated transformation and to detect spatial heterogeneity in spectral variability; (2) to identify the important wavelengths that capture the differences between GM and non-GM maize kernels; (3) to build an optimal discrimination model based on these important wavelengths, in order to simplify the prediction model and speed up its operation; and (4) to visualize the number and locations of GM maize kernels by developing image processing algorithms.
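Goals (3) and (4) amount to classifying every pixel from a reduced set of bands and displaying the predicted labels as an image. The sketch below shows that idea under loud assumptions: a nearest-centroid classifier stands in for the SVM and PLS-DA models, three arbitrary band indices stand in for CARS-selected wavelengths, and all data are synthetic. The names `fit_centroids` and `predict_map` are hypothetical.

```python
import numpy as np

def fit_centroids(spectra, labels):
    """Return the sorted class labels and the mean spectrum of each class."""
    classes = np.unique(labels)
    centroids = np.stack([spectra[labels == c].mean(axis=0) for c in classes])
    return classes, centroids

def predict_map(hypercube, classes, centroids, band_idx):
    """Assign every pixel to the class with the nearest centroid (stand-in classifier)."""
    rows, cols, _ = hypercube.shape
    pixels = hypercube[:, :, band_idx].reshape(rows * cols, -1)
    # Squared Euclidean distance from each pixel to each class centroid.
    dists = ((pixels[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return classes[dists.argmin(axis=1)].reshape(rows, cols)

rng = np.random.default_rng(1)
bands = 10
band_idx = np.array([1, 4, 7])  # hypothetical selected bands, not from CARS

# Synthetic training spectra (assumption): class 0 near 0.2, class 1 near 0.8.
train = np.vstack([
    0.2 + 0.02 * rng.standard_normal((30, bands)),
    0.8 + 0.02 * rng.standard_normal((30, bands)),
])
labels = np.repeat([0, 1], 30)
classes, centroids = fit_centroids(train[:, band_idx], labels)

# Synthetic image: left half resembles class 0, right half resembles class 1.
cube = np.full((6, 8, bands), 0.2)
cube[:, 4:, :] = 0.8
cube += 0.02 * rng.standard_normal(cube.shape)

label_map = predict_map(cube, classes, centroids, band_idx)
```

Here `label_map` has one predicted class per pixel, so it can be rendered directly as a prediction map. Restricting the classifier to `band_idx` is what keeps the per-pixel computation small.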