A Real Time Super Resolution Accelerator with Tilted Layer Fusion [article]

An-Jung Huang, Kai-Chieh Hsu, Tian-Sheuan Chang
2022 arXiv pre-print
Deep learning based super-resolution achieves high-quality results, but its heavy computational workload, large buffer, and high external memory bandwidth inhibit its usage in mobile devices. To solve these issues, this paper proposes a real-time hardware accelerator with the tilted layer fusion method, which reduces the external DRAM bandwidth by 92% and needs only 102KB of on-chip memory. The design, implemented in a 40nm CMOS process, achieves 1920x1080@60fps throughput with a 544.3K gate count when running at 600MHz; it has higher throughput and lower area cost than previous designs.
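To put the throughput figure in perspective, the following is a rough back-of-the-envelope sketch in Python. It assumes that 1920x1080@60fps refers to the output frames and that every output pixel must be produced within the 600MHz clock budget. The function names are illustrative only and do not come from the paper.

```python
# Rough per-pixel cycle budget implied by the abstract's figures.
# Assumption: 1920x1080@60fps describes the output frame rate and size.

WIDTH, HEIGHT, FPS = 1920, 1080, 60   # throughput quoted in the abstract
CLOCK_HZ = 600_000_000                # 600 MHz operating frequency


def output_pixel_rate(width: int, height: int, fps: int) -> int:
    """Number of output pixels that must be produced per second."""
    return width * height * fps


def cycles_per_output_pixel(clock_hz: int, pixel_rate: int) -> float:
    """Average number of clock cycles available for each output pixel."""
    return clock_hz / pixel_rate


pixel_rate = output_pixel_rate(WIDTH, HEIGHT, FPS)
budget = cycles_per_output_pixel(CLOCK_HZ, pixel_rate)

print(f"output pixels per second: {pixel_rate:,}")
print(f"clock cycles per output pixel: {budget:.2f}")
```

Under these assumptions the accelerator has fewer than five clock cycles on average per output pixel, which hints at why heavy parallelism and low external memory traffic matter for a design like this.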
arXiv:2205.03997v1 fatcat:2zkp5q44mrewhjpecej7vvjrhu