Optimize Neural Network Algorithm of Missing Value Imputation for Clustering Chocolate Product Types Following "STEAMS" Methodology

Mason Chen, Charles Chen
"STEAMS" (Science, Technology, Engineering, Artificial Intelligence, Math, Statistics) approach was conducted to handle the missing value imputation of clustering Chocolate Science patterns. Hierarchical clustering and Dendrogram were utilized to cluster the commercial chocolate products into different product groups which can indicate the nutrition compositions and product health. To further handle the missing value imputation, Neural Network algorithm was utilized to predict the missed Cocoa%
more » ... t the missed Cocoa% based on the other available Nutrition components. The Hyperbolic Tangent activation function was used to create the hidden layer with three nodes. Neural networks are very flexible models and tend to over-fit data. Definitive Screening Design (DSD) was conducted to optimize the Neural setting in order to minimize the over-fit concern. Both the Goodness Fit of Training set and Validation set can reach 99% R-Square. The Profiler Sensitivity analysis has shown that the Chocolate Type and Vitamin C are the most sensitive factors to predict the missed Cocoa%. The results also indicated that the "Fruit" Chocolate shall be added as the 4th Chocolate Type. The Neural Black-Box algorithm can reveal the hidden Chocolate Science and Product. This paper has demonstrated the power of using the Engineering DOE and Neural Network (AI) algorithm through "STEAMS".
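The imputation step described above can be illustrated with a minimal sketch, assuming a single hidden layer of three hyperbolic tangent nodes and a linear output trained by plain stochastic gradient descent. The rows, the feature names (sugar and fat fractions), the learning rate, and the epoch count are hypothetical values chosen for illustration; they are not the paper's data or settings, and the sketch does not reproduce the DSD optimization or the Profiler analysis.

```python
import math
import random

def init_params(n_inputs, n_hidden=3, seed=0):
    """Small random weights for one hidden layer and a linear output."""
    rng = random.Random(seed)
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)] for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    return w1, b1, w2, 0.0

def forward(params, x):
    """Return the tanh hidden activations and the linear output."""
    w1, b1, w2, b2 = params
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w1, b1)]
    return hidden, sum(w * h for w, h in zip(w2, hidden)) + b2

def mse(params, xs, ys):
    """Mean squared error over a set of rows."""
    return sum((forward(params, x)[1] - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def train(params, xs, ys, epochs=2000, lr=0.05):
    """Stochastic gradient descent on squared error (updates lists in place)."""
    w1, b1, w2, b2 = params
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            hidden, pred = forward((w1, b1, w2, b2), x)
            err = pred - y
            for j, h in enumerate(hidden):
                # Derivative of tanh is 1 - tanh^2; use w2[j] before updating it.
                grad_h = err * w2[j] * (1.0 - h * h)
                w2[j] -= lr * err * h
                b1[j] -= lr * grad_h
                for i, xi in enumerate(x):
                    w1[j][i] -= lr * grad_h * xi
            b2 -= lr * err
    return w1, b1, w2, b2

# Hypothetical toy rows: (sugar fraction, fat fraction, cocoa fraction or None).
rows = [
    (0.55, 0.30, 0.30),
    (0.50, 0.32, 0.35),
    (0.30, 0.40, 0.60),
    (0.25, 0.42, 0.70),
    (0.15, 0.45, 0.85),
    (0.45, 0.33, None),  # Cocoa% missing: to be imputed
]

# Fit on complete rows only, then predict the missing value.
complete = [r for r in rows if r[2] is not None]
xs = [list(r[:2]) for r in complete]
ys = [r[2] for r in complete]

params = init_params(n_inputs=2)
loss_before = mse(params, xs, ys)
params = train(params, xs, ys)
loss_after = mse(params, xs, ys)

missing = [r for r in rows if r[2] is None]
imputed = forward(params, list(missing[0][:2]))[1]
print("training MSE before/after:", loss_before, loss_after)
print("imputed cocoa fraction:", imputed)
```

With so few rows a three-node network can easily over-fit, which is the concern the abstract raises; in practice a held-out validation set would be compared against the training fit, as the paper does with its R-square values.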
doi:10.29007/4jgz fatcat:col2gdvygzdlplp6qjo2mgx5t4