FlashR: R-Programmed Parallel and Scalable Machine Learning using SSDs
[article]
2017
arXiv
pre-print
R is one of the most popular programming languages for statistics and machine learning, but the R framework is relatively slow and unable to scale to large datasets. The general approach for speeding up an implementation in R is to implement the algorithms in C or FORTRAN and provide an R wrapper. FlashR takes a different approach: it executes R code in parallel and scales the code beyond memory capacity by utilizing solid-state drives (SSDs) automatically. It provides a small number of
arXiv:1604.06414v4
fatcat:dsobnkm2tbbn5oe4h4orqzdzfa