The Internet Archive has a preservation copy of this work in our general collections.
The file type is
Shark is a new data analysis system that marries query processing with complex analytics on large clusters. It leverages a novel distributed memory abstraction to provide a unified engine that can run SQL queries and sophisticated analytics functions (e.g., iterative machine learning) at scale, and efficiently recovers from failures mid-query. This allows Shark to run SQL queries up to 100x faster than Apache Hive, and machine learning programs up to 100x faster than Hadoop. Unlike previousarXiv:1211.6176v1 fatcat:cdpyu3sp3bd7rcdzaaci4juayi