Can we analyze big data inside a DBMS?

Carlos Ordonez
2013 Proceedings of the sixteenth international workshop on Data warehousing and OLAP - DOLAP '13  
Relational DBMSs remain the main data management technology, despite the big data analytics and no-SQL waves. On the other hand, for data analytics in a broad sense, there are plenty of non-DBMS tools including statistical languages, matrix packages, generic data mining programs and large-scale parallel systems, being the main technology for big data analytics. Such large-scale systems are mostly based on the Hadoop distributed file system and MapReduce. Thus it would seem a DBMS is not a good technology to analyze big data, going beyond SQL queries, acting just as a reliable and fast data repository. In this survey, we argue that is not the case, explaining important research that has enabled analytics on large databases inside a DBMS. However, we also argue DBMSs cannot compete with parallel systems like MapReduce to analyze web-scale text data. Therefore, each technology will keep influencing each other. We conclude with a proposal of long-term research issues, considering the "big data analytics" trend.
doi:10.1145/2513190.2513198 dblp:conf/dolap/Ordonez13 fatcat:ejuvzywvqrd5jig6bfcmixc66m