ALOJA-ML: A Framework for Automating Characterization and Knowledge Discovery in Hadoop Deployments

Josep Lluís Berral, Nicolas Poggi, David Carrera, Aaron Call, Rob Reinauer, Daron Green
2015 Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining - KDD '15
ALOJA-ML provides such an automated system allowing knowledge discovery by modeling Hadoop executions from observed benchmarks across a broad set of configuration parameters.  ...  In addition to learning from the methodology presented in this work, the community can benefit in general from ALOJA data-sets, framework, and derived insights to improve the design and deployment of Big  ...  ALOJA-ML provides tools to automate both the knowledge discovery process and performance prediction of Hadoop benchmark data.  ... 
doi:10.1145/2783258.2788600 dblp:conf/kdd/BerralPCCRG15 fatcat:3y7pnkbwxvbzjodjfwhm4ckjla

ALOJA: A Framework for Benchmarking and Predictive Analytics in Hadoop Deployments

Josep Lluis Berral, Nicolas Poggi, David Carrera, Aaron Call, Rob Reinauer, Daron Green
2017 IEEE Transactions on Emerging Topics in Computing  
The predictive analytics extension, ALOJA-ML, provides an automated system allowing knowledge discovery by modeling environments from observed executions.  ...  ALOJA is part of a long-term collaboration between BSC and Microsoft to automate the characterization of cost-effectiveness on Big Data deployments, currently focusing on Hadoop.  ...  This work is partially supported by the Ministry of Economy of Spain under contracts TIN2012-34557 and 2014SGR1051.  ... 
doi:10.1109/tetc.2015.2496504 fatcat:7kpa5wvwfzfs3jtd6aqjfbq5du

ALOJA: A Benchmarking and Predictive Platform for Big Data Performance Analysis [chapter]

Nicolas Poggi, Josep Ll. Berral, David Carrera
2016 Lecture Notes in Computer Science  
The main goals of the ALOJA research project from BSC-MSR are to explore and automate the characterization of cost-effectiveness of Big Data deployments.  ...  This article describes the evolution of the project's focus and research lines from over a year of continuously benchmarking Hadoop under different configuration and deployment options, presents results  ...  Acknowledgements: This work is partially supported by the BSC-Microsoft Research Centre, the Spanish Ministry of Education (TIN2012-34557), the MINECO Severo Ochoa Research program (SEV-2011-0067) and the  ... 
doi:10.1007/978-3-319-49748-8_4 fatcat:lgzpi3vmabfbbb7vw7r6otrogq

Sequence-to-sequence models for workload interference prediction on batch processing datacenters

David Buchaca, Joan Marcual, Josep Lluís Berral, David Carrera
2020 Future Generation Computer Systems
Co-scheduling of jobs in data-centers is a challenging scenario, where jobs can compete for resources, leading to severe slowdowns or failed executions.  ...  In this work we propose a methodology for modeling co-scheduling of jobs on data-centers, based on their behavior towards resources and execution time, using sequence-to-sequence models based on recurrent  ...  In [18] ALOJA-ML is presented as a framework for characterization and knowledge discovery in Hadoop deployments.  ... 
doi:10.1016/j.future.2020.03.058 fatcat:vw33tgjwdjfahfxfq7crf5fpqe