Elastic Cloud Logs Traces, Storing and Replaying for Deep Machine Learning

Tariq Daradkeh, Anjali Agarwal, Nishith Goel, Jim Kozlowski
2020 Procedia Computer Science  
Abstract
Storing and retrieving logs in a cloud computing environment is a critical task for deep machine learning models. The time needed to load large volumes of logs and pre-process them (prepare them for use in the learning phase) is a significant issue for machine learning modules, because it influences both the training and decision phases. Storing sampled logs in an optimal way, one that consumes minimal disk storage while allowing accurate and fast replay, gives a cloud management system the advantage of being able to start processing logs immediately. Using a practical dimensionality reduction method to reduce log sizes and recover them will help the cloud management system take the right scaling action and will reduce the time complexity of deep machine learning. Many methods for feature extraction and dimensionality reduction exist in the literature, such as Principal Component Analysis (PCA) and the wavelet transform. These two methods have a good reputation for processing data sets for different goals in machine learning, such as image recognition, signal processing, and prediction. In this work, PCA and the wavelet transform are analyzed and compared as methods for dimensionality reduction and compression of cloud logs (treated as a data set), with the aim of reducing log size and log recovery time so that logs can be replayed faster. This will reduce log processing complexity and log footprint, and increase decision accuracy by focusing on the main features of the logs in a cloud data center management system.
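The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the log matrix is synthetic, the Haar wavelet is an assumed choice (the abstract does not name a wavelet family), and the names `pca_compress`, `wavelet_compress`, the number of retained components, and the number of retained coefficients are all hypothetical. The sketch compresses the same matrix both ways, restores it, and reports the relative reconstruction error and the count of stored numbers.

```python
import numpy as np

def pca_compress(X, k):
    """Keep the top-k principal components of X (rows = time samples)."""
    mean = X.mean(axis=0)
    U, S, Vt = np.linalg.svd(X - mean, full_matrices=False)
    scores = U[:, :k] * S[:k]          # projected samples, shape (n, k)
    return scores, Vt[:k], mean

def pca_restore(scores, components, mean):
    return scores @ components + mean

def haar_forward(x):
    """Full orthonormal Haar transform; length must be a power of two."""
    out = np.asarray(x, dtype=float).copy()
    n = out.size
    while n > 1:
        even, odd = out[0:n:2].copy(), out[1:n:2].copy()
        half = n // 2
        out[:half] = (even + odd) / np.sqrt(2)    # approximation part
        out[half:n] = (even - odd) / np.sqrt(2)   # detail part
        n = half
    return out

def haar_inverse(c):
    out = np.asarray(c, dtype=float).copy()
    n = 1
    while n < out.size:
        approx, detail = out[:n].copy(), out[n:2 * n].copy()
        out[0:2 * n:2] = (approx + detail) / np.sqrt(2)
        out[1:2 * n:2] = (approx - detail) / np.sqrt(2)
        n *= 2
    return out

def wavelet_compress(x, keep):
    """Store only the `keep` largest-magnitude Haar coefficients."""
    coeffs = haar_forward(x)
    idx = np.argsort(np.abs(coeffs))[-keep:]
    return idx, coeffs[idx], coeffs.size

def wavelet_restore(idx, values, n):
    coeffs = np.zeros(n)
    coeffs[idx] = values
    return haar_inverse(coeffs)

def rel_error(original, restored):
    return np.linalg.norm(original - restored) / np.linalg.norm(original)

# Synthetic stand-in for sampled cloud logs (assumption, not real data):
# 256 time samples of 6 correlated metrics driven by one periodic load.
rng = np.random.default_rng(0)
t = np.arange(256)
load = 50 + 20 * np.sin(2 * np.pi * t / 64)
weights = np.array([1.0, 0.8, 0.5, 0.3, 0.9, 0.6])
logs = np.outer(load, weights) + rng.normal(0, 1.0, (256, 6))

# PCA: reduce across metrics (6 columns -> 1 component).
scores, components, mean = pca_compress(logs, k=1)
pca_logs = pca_restore(scores, components, mean)
pca_stored = scores.size + components.size + mean.size

# Wavelet: reduce along time, one column at a time (256 -> 32 coefficients).
keep = 32
wav_logs = np.empty_like(logs)
for j in range(logs.shape[1]):
    idx, values, n = wavelet_compress(logs[:, j], keep)
    wav_logs[:, j] = wavelet_restore(idx, values, n)
wav_stored = logs.shape[1] * 2 * keep   # values plus their indices

pca_err = rel_error(logs, pca_logs)
wav_err = rel_error(logs, wav_logs)
print(f"raw numbers stored: {logs.size}")
print(f"PCA:     stored={pca_stored}, relative error={pca_err:.4f}")
print(f"wavelet: stored={wav_stored}, relative error={wav_err:.4f}")
```

In this sketch PCA exploits correlation between metrics, while the wavelet transform exploits smoothness over time, so which one compresses better depends on the structure of the logs. A real evaluation would also time the restore step, since replay speed is one of the goals stated in the abstract.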
doi:10.1016/j.procs.2020.04.011 fatcat:ehfmytdyyraoji6ourijcla7ey