Lambda architecture for cost-effective batch and speed big data processing

Mariam Kiran, Peter Murphy, Inder Monga, Jon Dugan, Sartaj Singh Baveja
2015 2015 IEEE International Conference on Big Data (Big Data)  
Sensor and smart phone technologies present opportunities for data explosion, streaming and collecting from heterogeneous devices every second. Analyzing these large datasets can unlock multiple behaviors previously unknown, and help optimize approaches to city wide applications or societal use cases. However, collecting and handling of these massive datasets presents challenges in how to perform optimized online data analysis 'on-the-fly', as current approaches are often limited by capability,
more » ... expense and resources. This presents a need for developing new methods for data management particularly using public clouds to minimize cost, network resources and on-demand availability. This paper presents an implementation of the lambda architecture design pattern to construct a data-handling backend on Amazon EC2, providing high throughput, dense and intense data demand delivered as services, minimizing the cost of the network maintenance. This paper combines ideas from database management, cost models, query management and cloud computing to present a general architecture that could be applied in any given scenario where affordable online data processing of Big Datasets is needed. The results are presented with a case study of processing router sensor data on the current ESnet network data as a working example of the approach. The results showcase a reduction in cost and argue benefits for performing online analysis and anomaly detection for sensor data.
doi:10.1109/bigdata.2015.7364082 dblp:conf/bigdataconf/KiranMMDB15 fatcat:nbvuxooztfem3ioissj6pwmunu