XNORBIN: A 95 TOp/s/W hardware accelerator for binary convolutional neural networks

Andrawes Al Bahou, Geethan Karunaratne, Renzo Andri, Lukas Cavigelli, Luca Benini
2018 IEEE Symposium in Low-Power and High-Speed Chips (COOL CHIPS)
Deploying state-of-the-art CNNs requires power-hungry processors and off-chip memory, which precludes their implementation in low-power embedded systems. Recent research shows that CNNs can sustain extreme quantization, binarizing their weights and intermediate feature maps and thereby saving 8-32× in memory while collapsing energy-intensive sum-of-products into XNOR-and-popcount operations. We present XNORBIN, a flexible accelerator for binary CNNs with computation tightly coupled to memory for aggressive data reuse, supporting even non-trivial network topologies with large feature map volumes. Implemented in UMC 65nm technology, XNORBIN achieves an energy efficiency of 95 TOp/s/W and an area efficiency of 2.0 TOp/s/MGE at 0.8 V.
doi:10.1109/coolchips.2018.8373076 dblp:conf/coolchips/BahouKACB18
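As a side note, the XNOR-and-popcount reduction mentioned in the abstract can be sketched in a few lines of C. This is a minimal illustration, not the paper's actual datapath: it assumes 64 binary weights and activations packed one bit per lane (a set bit encoding +1, a cleared bit encoding -1), relies on the GCC/Clang __builtin_popcountll intrinsic, and the function name bin_dot64 is hypothetical.

#include <stdint.h>

/* Binarized dot product over 64 lanes. Each per-lane product is +1 when
 * the weight and activation bits match and -1 otherwise, so the sum of
 * products equals matches - mismatches = 2*popcount(~(w ^ a)) - 64. */
static inline int bin_dot64(uint64_t w, uint64_t a)
{
    uint64_t xnor = ~(w ^ a);                  /* 1 where signs agree   */
    int matches = __builtin_popcountll(xnor);  /* count agreements      */
    return 2 * matches - 64;                   /* matches - mismatches  */
}

The same pattern extends to full convolutions by accumulating bin_dot64 results over packed words of the receptive field, which is the operation a binary-CNN accelerator maps onto wide XNOR/popcount hardware.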