Learning Representations of Network Traffic Using Deep Neural Networks for Network Anomaly Detection: A Perspective towards Oil and Gas IT Infrastructures
Oil and Gas organizations are dependent on their IT infrastructure, which is a small part of their industrial automation infrastructure, to function effectively. The oil and gas (O&G) organizations industrial automation infrastructure landscape is complex. To perform focused and effective studies, Industrial systems infrastructure is divided into functional levels by The Instrumentation, Systems and Automation Society (ISA) Standard ANSI/ISA-95:2005. This research focuses on the ISA-95:2005
... l-4 IT infrastructure to address network anomaly detection problem for ensuring the security and reliability of Oil and Gas resource planning, process planning and operations management. Anomaly detectors try to recognize patterns of anomalous behaviors from network traffic and their performance is heavily dependent on extraction time and quality of network traffic features or representations used to train the detector. Creating efficient representations from large volumes of network traffic to develop anomaly detection models is a time and resource intensive task. In this study we propose, implement and evaluate use of Deep learning to learn effective Network data representations from raw network traffic to develop data driven anomaly detection systems. Proposed methodology provides an automated and cost effective replacement of feature extraction which is otherwise a time and resource intensive task for developing data driven anomaly detectors. The ISCX-2012 dataset is used to represent ISA-95 level-4 network traffic because the O&G network traffic at this level is not much different than normal internet traffic. We trained four representation learning models using popular deep neural network architectures to extract deep representations from ISCX 2012 traffic flows. A total of sixty anomaly detectors were trained by authors using twelve conventional Machine Learning algorithms to compare the performance of aforementioned deep representations with that of a human-engineered handcrafted network data representation. The comparisons were performed using well known model evaluation parameters. Results showed that deep representations are a promising feature in engineering replacement to develop anomaly detection models for IT infrastructure security. In our future research, we intend to investigate the effectiveness of deep representations, extracted using ISA-95:2005 Level 2-3 traffic comprising of SCADA systems, for anomaly detection in critical O&G systems.