Retrofitting Applications with Provenance-Based Security Monitoring
Data provenance is a valuable tool for detecting and preventing cyber attack, providing insight into the nature of suspicious events. For example, an administrator can use provenance to identify the perpetrator of a data leak, track an attacker's actions following an intrusion, or even control the flow of outbound data within an organization. Unfortunately, providing relevant data provenance for complex, heterogenous software deployments is challenging, requiring both the tedious
... of many application components as well as a unified architecture for aggregating information between components. In this work, we present a composition of techniques for bringing affordable and holistic provenance capabilities to complex application workflows, with particular consideration for the exemplar domain of web services. We present DAP, a transparent architecture for capturing detailed data provenance for web service components. Our approach leverages a key insight that minimal knowledge of open protocols can be leveraged to extract precise and efficient provenance information by interposing on application components' communications, granting DAP compatibility with existing web services without requiring instrumentation or developer cooperation. We show how our system can be used in real time to monitor system intrusions or detect data exfiltration attacks while imposing less than 5.1 ms end-to-end overhead on web requests. Through the introduction of a garbage collection optimization, DAP is able to monitor system activity without suffering from excessive storage overhead. DAP thus serves not only as a provenance-aware web framework, but as a case study in the non-invasive deployment of provenance capabilities for complex applications workflows.