Paper Title

Quantization of Deep Neural Networks for Accumulator-constrained Processors

Paper Authors

Barry de Bruin, Zoran Zivkovic, Henk Corporaal

Paper Abstract

We introduce an Artificial Neural Network (ANN) quantization methodology for platforms without wide accumulation registers. This enables fixed-point model deployment on embedded compute platforms that are not specifically designed for large kernel computations (i.e., accumulator-constrained processors). We formulate the quantization problem as a function of accumulator size, and aim to maximize model accuracy by maximizing the bit widths of the input data and weights. To reduce the number of configurations to consider, only solutions that fully utilize the available accumulator bits are tested. We demonstrate that 16-bit accumulators are able to obtain a classification accuracy within 1\% of the floating-point baselines on the CIFAR-10 and ILSVRC2012 image classification benchmarks. Additionally, a near-optimal $2\times$ speedup is obtained on an ARM processor by exploiting 16-bit accumulators for image classification with the All-CNN-C and AlexNet networks.
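The search over configurations described in the abstract can be illustrated with a small sketch: given an accumulator of `acc_bits` bits and a layer that accumulates `k` products per output, enumerate the data/weight bit-width pairs that exactly fill the accumulator. The bound used below (a signed product fits in at most `bx + bw` bits, and summing `k` terms adds at most `ceil(log2(k))` carry bits) is a standard conservative worst-case estimate, and the function names are illustrative assumptions, not the paper's exact formulation.

```python
import math

def accumulator_bits(bx, bw, k):
    """Conservative worst-case accumulator width (signed, in bits) for a
    k-term dot product between signed bx-bit data and bw-bit weights.

    A signed product fits in bx + bw bits (this covers the
    -2**(bx-1) * -2**(bw-1) corner case), and accumulating k such
    terms adds at most ceil(log2(k)) carry bits.
    """
    return bx + bw + math.ceil(math.log2(k))

def feasible_widths(acc_bits, k):
    """Enumerate (bx, bw) pairs that exactly fill an acc_bits accumulator."""
    return [(bx, bw)
            for bx in range(2, acc_bits)
            for bw in range(2, acc_bits)
            if accumulator_bits(bx, bw, k) == acc_bits]

if __name__ == "__main__":
    # Example: a 3x3 convolution over 64 input channels accumulates
    # k = 3 * 3 * 64 = 576 products per output element.
    for bx, bw in feasible_widths(acc_bits=16, k=576):
        print(f"data: {bx} bits, weights: {bw} bits")
```

With `acc_bits=16` and `k=576`, this yields pairs such as 4-bit data with 2-bit weights; relaxing the equality to `<=` would list all safe configurations rather than only those that fully use the available accumulator bits.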
