Malware classification based on double byte feature encoding

Lin Li, Ying Ding, Bo Li, Mengqing Qiao, Biao Ye
2021 Alexandria Engineering Journal  
Many researchers analyze malware with static and dynamic analysis techniques and combine them with deep learning algorithms, which has achieved good results in malware classification. However, many studies extract features from only one source: either the .asm file generated by a decompiler or the .bytes file in hexadecimal representation. This paper integrates the features of both files. It uses word frequency and two deep learning algorithms to extract 184 opcode features and 16 probability features from the ASM file and the section file of the Kaggle dataset, respectively. A double byte feature encoding method is then used to fuse the features of the two files. Finally, a convolutional neural network classifies the fused samples. The experimental results show an accuracy of 98.68% and a logarithmic loss of 0.022.
© 2021 THE AUTHORS. Published by Elsevier BV on behalf of Faculty of Engineering, Alexandria University. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
doi:10.1016/j.aej.2021.04.076 fatcat:ocrqbt4gpngj7cwg22exyq2x2i
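
The abstract mentions word-frequency opcode features extracted from the ASM file. As a rough illustration only, the sketch below counts opcode mnemonics in a disassembly listing and normalizes them into a frequency vector. The vocabulary `OPCODES`, the function name `opcode_frequencies`, and the assumed listing layout (comments introduced by `;`) are hypothetical; the abstract does not list the paper's 184 opcodes or describe its exact parsing or normalization, and this is not the authors' implementation.

```python
import re
from collections import Counter

# Hypothetical vocabulary for illustration; the paper uses 184 opcodes
# that are not enumerated in the abstract.
OPCODES = ["mov", "push", "pop", "call", "jmp", "add", "sub", "xor"]


def opcode_frequencies(asm_text, vocabulary=OPCODES):
    """Return the relative frequency of each vocabulary opcode in asm_text."""
    vocab = set(vocabulary)
    counts = Counter()
    for line in asm_text.splitlines():
        # Assumption: anything after ';' is a comment and is ignored.
        code = line.split(";", 1)[0].lower()
        # Whole alphabetic words only, so hex addresses such as
        # '00401add' do not produce a spurious 'add' token.
        for token in re.findall(r"\b[a-z]+\b", code):
            if token in vocab:
                counts[token] += 1
    total = sum(counts.values())
    # Keep a fixed order so every sample yields a vector of the same length.
    return [counts[op] / total if total else 0.0 for op in vocabulary]


sample = (
    ".text:00401000 55       push ebp ; add comment\n"
    ".text:00401001 8B EC    mov ebp, esp\n"
    ".text:00401003          mov eax, 1\n"
)
freqs = opcode_frequencies(sample)
```

A fixed-order vector of this kind is one common input format for a downstream classifier. How the paper fuses it with the 16 probability features through double byte encoding is not specified in the abstract and is not reproduced here.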