A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2020; you can also visit <a rel="external noopener" href="https://arxiv.org/ftp/arxiv/papers/1706/1706.03206.pdf">the original URL</a>. The file type is <code>application/pdf</code>.
<span class="release-stage" >pre-print</span>
A considerable portion of the machine learning literature applied to intrusion detection uses outdated data sets based on a simulated network with a limited environment. Moreover, datasets usually contain flaws, and the way we handle them may affect measurements. Finally, the detection capacity of intrusion detection is highly influenced by the system configuration. We focus on a topic rarely investigated: the characterization of anomalies in a large network environment. Intrusion Detection Systems (IDS) are used to detect exploits or other attacks that raise alarms. Anomalous events usually receive less attention than attack alarms, so security administrators frequently overlook them. However, observing this activity contributes to understanding the characteristics of network traffic. On the one hand, abnormal behaviors may be legitimate, e.g., misinterpreted protocols or malfunctioning network equipment; on the other hand, an attacker may intentionally craft packets that introduce anomalies in order to evade monitoring systems. Anomalies found in operational network environments may indicate evasion attacks, application bugs, and a wide variety of other factors that strongly influence intrusion detection performance. This study explores the nature of anomalies found in the U-Tokyo network, using the Bro and Snort IDS cooperatively, among other resources. We analyze 6.5 TB of compressed binary tcpdump data representing 12 hours of network traffic. Our major contributions can be summarized as follows: 1) reporting the anomalies observed in real, up-to-date traffic from a large academic network environment, and documenting problems in research that may lead to wrong results due to misinterpretations of data or misconfigurations in software; 2) assessing the quality of data by analyzing the potential and the real problems in the capture process.
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/1706.03206v1">arXiv:1706.03206v1</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/2cjqlogednf3leibr2lewusmwi">fatcat:2cjqlogednf3leibr2lewusmwi</a> </span>