Deep neural network with low-precision dynamic fixed-point in reconfigurable hardware design
11663464 · 2023-05-30
Inventors
- Jie Wu (San Diego, CA, US)
- Bike Xie (San Diego, CA, US)
- Hsiang-Tsun Li (Taichung, TW)
- Junjie Su (San Diego, CA, US)
- Chun-Chen Liu (San Diego, CA, US)
Abstract
A system for operating a floating-to-fixed arithmetic framework includes a floating-to-fixed arithmetic framework on arithmetic operating hardware, such as a central processing unit (CPU), for converting a floating pre-trained convolution neural network (CNN) model into a dynamic fixed-point CNN model. The dynamic fixed-point CNN model is capable of implementing a high-performance convolution neural network (CNN) on a resource-limited embedded system such as a mobile phone or a video camera.
Claims
1. An arithmetic framework system comprising:
a floating-to-fixed arithmetic framework on an arithmetic operating hardware, the floating-to-fixed arithmetic framework being configured to:
receive a floating pre-trained convolution neural network (CNN) model;
retrieve weights, a bias, and activations for each CNN layer of the floating pre-trained CNN model;
determine a symmetric dynamic range between an absolute value of a maximum value of absolute values of the weights and a negative absolute value of the maximum value of the absolute values of the weights, a maximum value of biases for CNN layers of the floating pre-trained CNN model, and a maximum value of the dynamic fixed-point format activations for each CNN layer of the floating pre-trained CNN model;
sum products of each dynamic fixed-point format weight and its corresponding dynamic fixed-point format activation for each CNN layer of the floating pre-trained CNN model for generating a first output of each CNN layer of a CNN model;
generate a second output of each CNN layer of the CNN model and express the second output to have an integer word length and a fractional word length the same as the dynamic fixed-point format activations;
add the dynamic fixed-point format bias to the second output of each CNN layer of the CNN model for generating a third output of each CNN layer of the CNN model;
truncate the third output of each CNN layer of the CNN model according to the dynamic fixed-point format activations for generating a dynamic fixed-point output of each CNN layer of the CNN model;
combine the dynamic fixed-point outputs of the CNN layers of the CNN model to generate a dynamic fixed-point CNN model; and
output the dynamic fixed-point CNN model; and
a memory configured to save the floating pre-trained convolution neural network (CNN) model, the CNN model, and the dynamic fixed-point CNN model.
2. The arithmetic framework system of claim 1, wherein the arithmetic operating hardware is a central processing unit (CPU) or a graphics processing unit (GPU).
3. The arithmetic framework system of claim 1, wherein the floating-to-fixed arithmetic framework is further configured to: input the dynamic fixed-point CNN model to the floating-to-fixed arithmetic framework.
4. A method for operating a floating-to-fixed arithmetic framework, the method comprising:
inputting a floating pre-trained convolution neural network (CNN) model to the floating-to-fixed arithmetic framework in an arithmetic operating hardware;
retrieving weights, a bias, and activations for each CNN layer of the floating pre-trained CNN model by the arithmetic operating hardware;
determining a symmetric dynamic range between an absolute value of a maximum value of absolute values of the weights and a negative absolute value of the maximum value of the absolute values of the weights, a maximum value of biases for CNN layers of the floating pre-trained CNN model, and a maximum value of the dynamic fixed-point format activations for each CNN layer of the floating pre-trained CNN model by the arithmetic operating hardware;
summing products of each dynamic fixed-point weight and its corresponding dynamic fixed-point format activation for each CNN layer of the floating pre-trained CNN model for generating a first output of each CNN layer of a CNN model by the arithmetic operating hardware;
truncating the first output of each CNN layer of the CNN model according to the dynamic fixed-point format activations for generating a second output of each CNN layer of the CNN model by the arithmetic operating hardware;
adding the dynamic fixed-point format bias to the second output of each CNN layer of the CNN model for generating a third output of each CNN layer of the CNN model by the arithmetic operating hardware;
generating a dynamic fixed-point output of each CNN layer of the CNN model and expressing the dynamic fixed-point output to have an integer word length and a fractional word length the same as the dynamic fixed-point format activations;
combining dynamic fixed-point outputs of the CNN layers of the CNN model to generate a dynamic fixed-point CNN model by the arithmetic operating hardware; and
the floating-to-fixed arithmetic framework outputting the dynamic fixed-point CNN model.
5. The method of claim 4, further comprising: inputting the dynamic fixed-point CNN model to the floating-to-fixed arithmetic framework.
6. A method for operating a floating-to-fixed arithmetic framework, the method comprising:
inputting a floating pre-trained convolution neural network (CNN) model to the floating-to-fixed arithmetic framework in an arithmetic operating hardware;
retrieving weights, a bias, and activations for each CNN layer of the floating pre-trained CNN model by the arithmetic operating hardware;
determining a symmetric dynamic range between an absolute value of a maximum value of absolute values of the weights and a negative absolute value of the maximum value of the absolute values of the weights, a maximum value of biases for CNN layers of the floating pre-trained CNN model, and a maximum value of the dynamic fixed-point format activations for each CNN layer of the floating pre-trained CNN model;
summing products of each dynamic fixed-point weight and its corresponding dynamic fixed-point format activation for each CNN layer of the floating pre-trained CNN model for generating a first output of each CNN layer of a CNN model by the arithmetic operating hardware;
truncating the first output of each CNN layer of the CNN model according to the dynamic fixed-point format activations for generating a second output of each CNN layer of the CNN model by the arithmetic operating hardware;
adding the dynamic fixed-point format bias to the second output of each CNN layer of the CNN model for generating a third output of each CNN layer of the CNN model by the arithmetic operating hardware;
generating a dynamic fixed-point output of each CNN layer of the CNN model and expressing the dynamic fixed-point output to have an integer word length and a fractional word length the same as the dynamic fixed-point format activations;
combining dynamic fixed-point outputs of the CNN layers of the CNN model to generate a dynamic fixed-point CNN model by the arithmetic operating hardware; and
the floating-to-fixed arithmetic framework outputting the dynamic fixed-point CNN model.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
DETAILED DESCRIPTION
(6) The present invention provides a floating-to-fixed arithmetic framework system that outputs a dynamic fixed-point CNN model.
(9) Step S202: inputting a floating pre-trained convolution neural network (CNN) model 106 to the floating-to-fixed arithmetic framework 104;
(10) Step S204: retrieving weights, a bias, and activations for each CNN layer of the floating pre-trained CNN model 106;
(11) Step S206: determining dynamic fixed-point formats of the weights, the bias, and the activations for each CNN layer of the floating pre-trained CNN model 106 to generate dynamic fixed-point format weights, a dynamic fixed-point format bias, and dynamic fixed-point format activations for each CNN layer of the floating pre-trained CNN model 106;
(12) Step S208: summing products of each dynamic fixed-point format weight and its corresponding dynamic fixed-point format activation for each CNN layer of the floating pre-trained CNN model 106 for generating a first output of each CNN layer of a CNN model;
(13) Step S210: truncating the first output of each CNN layer of the CNN model according to the dynamic fixed-point format activations for generating a second output of each CNN layer of the CNN model;
(14) Step S212: adding the dynamic fixed-point format bias to the second output of each CNN layer of the CNN model for generating a third output of each CNN layer of the CNN model;
(15) Step S214: truncating the third output of each CNN layer of the CNN model according to the dynamic fixed-point format activations for generating a dynamic fixed-point output of each CNN layer of the CNN model;
(16) Step S216: combining dynamic fixed-point outputs of the CNN layers of the CNN model to generate a dynamic fixed-point CNN model; and
(17) Step S218: the floating-to-fixed arithmetic framework outputting the dynamic fixed-point CNN model 110.
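Steps S206 through S214 can be sketched for a single flattened layer as follows. This is a minimal illustration, not the patent's implementation: helper names such as `scale`, `quantize`, and `convert_layer` are invented, and the bias is quantized directly to the activation format for simplicity, whereas the method above gives the bias its own format.

```python
import numpy as np

def scale(values, bits=8):
    """Symmetric dynamic-range scale factor: s = (2^(bits-1) - 1) / Max_v."""
    max_v = max(abs(float(np.min(values))), abs(float(np.max(values))))
    return (2 ** (bits - 1) - 1) / max_v

def quantize(values, s):
    """Map floating-point values to dynamic fixed-point integers."""
    return np.round(np.asarray(values) * s).astype(np.int64)

def convert_layer(w, x, b, bits=8):
    """Steps S206-S214 for one flattened CNN layer."""
    sw, sa = scale(w, bits), scale(x, bits)   # S206: fixed-point formats
    qw, qx = quantize(w, sw), quantize(x, sa)
    first = int(np.dot(qw, qx))               # S208: sum of products
    second = int(first / sw)                  # S210: truncate to activation format
    third = second + int(round(b * sa))       # S212: add bias (quantized here
                                              # directly to the activation format)
    limit = 2 ** (bits - 1) - 1
    out = max(-limit, min(limit, third))      # S214: truncate/clip to `bits`
    return out, sa                            # S216/S218 collect all layer outputs
```

Dividing the accumulated product by the weight scale `sw` re-expresses the sum in the activation format, so the clipped result can feed the next layer directly as an integer.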
(18) Σ_{i=1}^{N} W_i X_i + B  (1)
(19) where W is a weight, X is an activation, B is the bias, and N = k · k · in_c · out_c is the total number of weights, with k the kernel size, in_c the number of input channels, and out_c the number of output channels. This equation shows there are N arithmetic operations in each layer of the CNN model, making it the most computationally intensive part of the layer. As shown in
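As an illustration of the operation count implied by equation (1), consider a hypothetical 3×3 convolution layer; the layer sizes below are chosen arbitrarily for the example.

```python
import numpy as np

# Total number of weights per layer: N = k * k * in_c * out_c
k, in_c, out_c = 3, 64, 128        # hypothetical 3x3 layer, 64 -> 128 channels
N = k * k * in_c * out_c           # 73,728 weight terms in Eq. (1)

# Eq. (1) for one output value: sum_i W_i * X_i + B
rng = np.random.default_rng(0)
W = rng.standard_normal(N)         # weights
X = rng.standard_normal(N)         # activations
B = 0.5                            # bias
output = np.dot(W, X) + B          # N multiply-accumulate operations
```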
(20) The dynamic fixed-point format method is used to obtain the fixed-point formats for the weights, biases, and activations mentioned above, using two parameters to represent a dynamic fixed-point format, as shown in equation (2):
s = (2^(p-1) - 1) / Max_v  (2)
(21) In Eq. (2), p represents the quantization bit-width, and the symmetric dynamic range is [-Max_v, Max_v]. From the perspective of weights, Max_v equals max(|min(w)|, |max(w)|), where |max(w)| is the absolute value of the weight having the largest value and |min(w)| is the absolute value of the weight having the smallest value. In
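Applying equation (2) to a small set of example weights gives a concrete picture; the weight values and the function name below are illustrative, with 8-bit quantization assumed.

```python
import numpy as np

def scale_factor(w, p=8):
    """Eq. (2): s = (2^(p-1) - 1) / Max_v with Max_v = max(|min(w)|, |max(w)|)."""
    max_v = max(abs(float(np.min(w))), abs(float(np.max(w))))
    return (2 ** (p - 1) - 1) / max_v

w = np.array([-0.8, 0.3, 0.5])     # example weights
s = scale_factor(w)                # Max_v = 0.8, so s = 127 / 0.8 ≈ 158.75
q = np.round(w * s)                # quantized weights lie in [-127, 127]
```

The weight with the largest magnitude maps exactly to the integer limit -127, so the full dynamic range of the format is used.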
(22) First, for each layer, the scalar factor s mentioned in equation (2) is expressed as equation (3):

s = q · 2^n  (3)

(23) where q equals

q = s / 2^n, with n = ⌊log2(s)⌋  (4)

(24) n is the number of fractional bits, and q represents the residual value between s and 2^n.
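The decomposition of the scale factor s into a residual q and a power-of-two 2^n can be sketched as follows. Choosing n = floor(log2(s)), so that q lies in [1, 2), is an illustrative assumption; the text does not spell out the exact selection rule.

```python
import math

def decompose(s):
    """Split scale factor s into s = q * 2**n (illustrative choice: q in [1, 2))."""
    n = math.floor(math.log2(s))   # number of fractional bits
    q = s / (2 ** n)               # residual value between s and 2**n
    return n, q

n, q = decompose(158.75)           # e.g. an 8-bit scale factor from Eq. (2)
```

Keeping q as a floating scalar while shifting by 2^n lets hardware implement the scale as a cheap bit-shift plus one multiplication.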
(25) According to the given resolution of the dynamic fixed-point format, defined as M, the integer word length M1 equals M minus the fractional word length M2. Using the proposed floating scalar factor value, the proposed algorithm can achieve the approximate theoretical signal-to-quantization-noise ratio (SQNR).
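The word-length split in paragraph (25) can be sketched as follows, assuming M2 denotes the fractional word length and that the integer word length M1 is chosen just large enough to cover the layer's dynamic range; that selection rule is an illustrative assumption, not stated in the text.

```python
import math

def word_lengths(max_v, M=8):
    """Split a total resolution of M bits into integer (M1) and fractional (M2)
    word lengths, with M1 = M - M2 as in paragraph (25). The rule for M1 here
    (enough integer bits so that 2**M1 covers Max_v) is an assumption."""
    M1 = max(0, math.ceil(math.log2(max_v)))   # integer bits to cover Max_v
    M2 = M - M1                                # remaining bits are fractional
    return M1, M2
```

For example, a layer whose activations reach 3.2 needs 2 integer bits, leaving 6 fractional bits of an 8-bit word; a layer bounded by 0.9 can devote all 8 bits to the fraction.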
(28) The invention provides a system and method for operating a floating-to-fixed arithmetic framework. The system and method comprise a floating-to-fixed arithmetic framework on arithmetic operating hardware, such as a central processing unit (CPU), for converting a floating pre-trained convolution neural network (CNN) model into a dynamic fixed-point CNN model. The floating-to-fixed arithmetic framework receives a floating pre-trained CNN model and retrieves weights, a bias, and activations for each CNN layer of the floating pre-trained CNN model. The floating-to-fixed arithmetic framework then computes each channel of each layer of the floating pre-trained CNN model to generate a dynamic fixed-point CNN model. This floating-to-fixed conversion optimizes the CNN model to fit the target hardware constraints. The outputted dynamic fixed-point CNN model is capable of implementing a high-performance CNN on a resource-limited embedded system.
(29) Those skilled in the art will readily observe that numerous modifications and alterations of the device and method may be made while retaining the teachings of the invention. Accordingly, the above disclosure should be construed as limited only by the metes and bounds of the appended claims.