Value Proposition for Big Data

Asmita Sharma
2019 International Journal for Research in Applied Science and Engineering Technology  
Value proposition analysis of the system is performed to specify the need for the project. Once the project is approved, Spark programs, also called Spark jobs, are written to extract data from the source databases and store it on Hadoop clusters. The data is then filtered according to the business needs using Spark jobs, and this filtered data is written back to the file system, again using Spark, in a format that can be used by SparkML. Spark speeds up these operations by running them in parallel across the cluster.
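This extract-and-filter stage can be sketched as a Spark job in Scala. The source database, connection details, table and column names (sales, store_id, item_id, sale_date, quantity), and HDFS paths below are illustrative assumptions, since the paper does not specify them:

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions.col

    object ExtractAndFilter {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("ExtractAndFilter")
          .getOrCreate()

        // Extract: pull the raw table from the source database in parallel.
        // URL, table, and partition bounds are hypothetical.
        val raw = spark.read
          .format("jdbc")
          .option("url", "jdbc:postgresql://dbhost:5432/sales")
          .option("dbtable", "sales")
          .option("user", "etl")
          .option("password", sys.env("DB_PASSWORD")) // read from environment
          .option("numPartitions", 8)                 // parallel JDBC reads
          .option("partitionColumn", "store_id")
          .option("lowerBound", 1)
          .option("upperBound", 1000)
          .load()

        // Land the raw extract on the Hadoop cluster.
        raw.write.mode("overwrite").parquet("hdfs:///data/raw/sales")

        // Filter according to (illustrative) business rules and keep only
        // the columns the downstream SparkML job needs.
        val filtered = spark.read.parquet("hdfs:///data/raw/sales")
          .filter(col("quantity") > 0)
          .select("store_id", "item_id", "sale_date", "quantity")

        // Write back to HDFS in a columnar format SparkML can consume.
        filtered.write.mode("overwrite").parquet("hdfs:///data/filtered/sales")

        spark.stop()
      }
    }

Parquet is chosen here because its columnar layout is read efficiently by downstream Spark ML jobs; the paper states only that the data is written in a format usable by SparkML.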
The data stored on the Hadoop Distributed File System is then used as training data by SparkML. Once the training is done, the learned machine learning model is applied to production data to forecast the quantity of an item required at a particular store on a given date.
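The training and forecasting stage can likewise be sketched with Spark ML. Because the paper does not name the algorithm, a RandomForestRegressor stands in as an illustrative choice, and the feature columns and HDFS paths carry over from the assumed layout above:

    import org.apache.spark.ml.Pipeline
    import org.apache.spark.ml.feature.VectorAssembler
    import org.apache.spark.ml.regression.RandomForestRegressor
    import org.apache.spark.sql.{DataFrame, SparkSession}
    import org.apache.spark.sql.functions.{col, dayofweek, month}

    object TrainAndForecast {
      // Derive simple numeric date features usable by Spark ML.
      def withDateFeatures(df: DataFrame): DataFrame = df
        .withColumn("dow", dayofweek(col("sale_date")).cast("double"))
        .withColumn("month", month(col("sale_date")).cast("double"))

      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("TrainAndForecast")
          .getOrCreate()

        // Load the filtered data from HDFS as the training set.
        val training = withDateFeatures(
          spark.read.parquet("hdfs:///data/filtered/sales"))

        // Assemble the feature vector expected by Spark ML estimators.
        val assembler = new VectorAssembler()
          .setInputCols(Array("store_id", "item_id", "dow", "month"))
          .setOutputCol("features")

        // Illustrative regressor; the paper does not name the algorithm.
        val rf = new RandomForestRegressor()
          .setLabelCol("quantity")
          .setFeaturesCol("features")

        val model = new Pipeline().setStages(Array(assembler, rf)).fit(training)

        // Apply the trained model to production data to forecast the
        // quantity required at each store on each date.
        val production = withDateFeatures(
          spark.read.parquet("hdfs:///data/production/sales"))

        model.transform(production)
          .select("store_id", "item_id", "sale_date", "prediction")
          .write.mode("overwrite").parquet("hdfs:///data/forecasts")

        spark.stop()
      }
    }

Any other Spark ML regressor could be substituted for the random forest without changing the surrounding pipeline, since the assembler, Pipeline, and transform steps are algorithm-agnostic.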
doi:10.22214/ijraset.2019.6123