Centaur: A Chiplet-based, Hybrid Sparse-Dense Accelerator for Personalized Recommendations [article]

Ranggi Hwang, Taehun Kim, Youngeun Kwon, Minsoo Rhu
2020 arXiv pre-print
Personalized recommendations are the backbone machine learning (ML) algorithm that powers several important application domains (e.g., ads, e-commerce, etc.) serviced from cloud datacenters. Sparse embedding layers are a crucial building block in designing recommendations, yet little attention has been paid to properly accelerating this important ML algorithm. This paper first provides a detailed workload characterization of personalized recommendations and identifies two significant performance limiters: memory-intensive embedding layers and compute-intensive multi-layer perceptron (MLP) layers. We then present Centaur, a chiplet-based hybrid sparse-dense accelerator that addresses both the memory throughput challenges of embedding layers and the compute limitations of MLP layers. We implement and demonstrate our proposal on an Intel HARPv2, a package-integrated CPU+FPGA device, which shows a 1.7-17.2x performance speedup and a 1.7-19.5x energy-efficiency improvement over conventional approaches.
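
To make the contrast between the two layer types concrete, the following is a minimal NumPy sketch, not the Centaur design itself. It assumes a sum-pooled embedding lookup and a ReLU MLP, and it estimates FLOPs per byte with a deliberately crude model in which every operand is touched exactly once. All function names, shapes, and the 4-byte element size are hypothetical choices made for illustration and do not come from the paper.

```python
import numpy as np

def embedding_gather_reduce(table, indices):
    """Sum-pool the rows of `table` selected by `indices`.

    table:   (num_rows, dim) embedding table
    indices: (batch, lookups_per_sample) integer row ids
    returns: (batch, dim) pooled embeddings
    """
    return table[indices].sum(axis=1)

def mlp_forward(x, weights):
    """Dense MLP: a matrix multiply followed by ReLU for each layer."""
    for w in weights:
        x = np.maximum(x @ w, 0.0)
    return x

def flops_per_byte_embedding(lookups, dim, bytes_per_elem=4):
    # Assumption: each gathered row is read once and accumulated once,
    # i.e. roughly one add per element read from memory.
    flops = lookups * dim
    bytes_read = lookups * dim * bytes_per_elem
    return flops / bytes_read

def flops_per_byte_mlp(batch, d_in, d_out, bytes_per_elem=4):
    # Assumption: a single GEMM where weights, inputs, and outputs
    # are each moved exactly once (perfect on-chip reuse).
    flops = 2 * batch * d_in * d_out
    bytes_moved = (d_in * d_out + batch * d_in + batch * d_out) * bytes_per_elem
    return flops / bytes_moved
```

Under these assumptions the embedding estimate is 0.25 FLOP/byte for any table size, while the MLP estimate grows with batch and layer width, which is one way to read the abstract's split into memory-intensive and compute-intensive layers.
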
arXiv:2005.05968v1 fatcat:ko6c4blrsnez7awjmm3ptwgg5a