Colin Schmidt, John Charles Wright, Zhongkai Wang, Eric Chang, Albert J. Ou, Woo-Rham Bae, Sean Huang, Anita Flynn, Brian C. Richards, Krste Asanovic, Elad Alon, Borivoje Nikolic

Modern workloads, such as deep neural networks (DNNs), increasingly rely on dense arithmetic compute patterns that are ill-suited for general-purpose processors, leading to a rise in domain-specific compute accelerators [1]. Many of these workloads can benefit from varying precision during computation; for example, using different precisions across DNN layers, and between training and inference, has been shown to improve energy efficiency [2].

URL: https://ieeexplore.ieee.org/document/9365789