G06F2207/4824

Training program, training method, and information processing apparatus
11593620 · 2023-02-28

An information processing apparatus that performs deep learning using a neural network includes a memory and an arithmetic processing device that performs a process for layers of the neural network in a predetermined direction. The process for the layers includes: pre-determining a decimal point position of a fixed-point number of intermediate data obtained by an arithmetic operation of each of the layers; performing the arithmetic operation for each layer with the pre-determined decimal point position to obtain the intermediate data and acquiring first statistical information of a distribution of bits of the intermediate data; determining a decimal point position of the intermediate data based on the first statistical information; and performing the arithmetic operation for each layer again with the determined decimal point position when the difference between the determined decimal point position and the pre-determined decimal point position is greater than a threshold value.
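
The flow in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the 16-bit word width, the use of the most-significant-bit position as the "distribution of bits" statistic, the zero-overflow rule, and all function names are assumptions made for illustration.

```python
import numpy as np

WORD_BITS = 16  # assumed signed fixed-point word width (not stated in the abstract)


def quantize(x, frac_bits, word_bits=WORD_BITS):
    """Round to fixed point with `frac_bits` fractional bits, saturating on overflow."""
    scale = 2.0 ** frac_bits
    lo, hi = -(1 << (word_bits - 1)), (1 << (word_bits - 1)) - 1
    return np.clip(np.round(x * scale), lo, hi) / scale


def msb_histogram(x):
    """Assumed statistic: number of values per most-significant-bit position."""
    nonzero = np.abs(x[x != 0])
    exps = np.floor(np.log2(nonzero)).astype(int)
    positions, counts = np.unique(exps, return_counts=True)
    return dict(zip(positions.tolist(), counts.tolist()))


def frac_bits_from_stats(hist, max_overflow=0.0, word_bits=WORD_BITS):
    """Largest fractional-bit count for which at most `max_overflow` of values saturate."""
    total = sum(hist.values())
    if total == 0:
        return word_bits - 1
    # Try the smallest integer-part width first; the widest candidate never overflows.
    for int_bits in range(min(hist), max(hist) + 2):
        overflowing = sum(c for e, c in hist.items() if e >= int_bits)
        if overflowing / total <= max_overflow:
            return word_bits - 1 - int_bits


def run_layer(x, w, predicted_frac_bits, threshold=2):
    """Run one layer with a predicted position; redo it if the statistics disagree."""
    wide = x @ w  # a float result stands in for a wide hardware accumulator
    out = quantize(wide, predicted_frac_bits)
    determined = frac_bits_from_stats(msb_histogram(wide))
    rerun = abs(determined - predicted_frac_bits) > threshold
    if rerun:
        # Repeat the layer operation with the position derived from the statistics.
        out = quantize(x @ w, determined)
    return out, determined, rerun
```

For example, `run_layer(np.array([[1.0, 2.0]]), np.array([[100.0], [50.0]]), 12)` predicts 12 fractional bits, finds that the result 200 needs 8 integer bits, and reruns the layer with 7 fractional bits.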

Arithmetic processing apparatus, control method of arithmetic processing apparatus, and non-transitory computer-readable storage medium for storing program
11593071 · 2023-02-28

An arithmetic processing apparatus includes: a plurality of nodes (N nodes) capable of communicating with each other, each of the plurality of nodes including a memory and a processor, the memory being configured to store a value and an operation result, the processor being configured to execute first processing when N is a natural number of 2 or more, n is a natural number of 1 or more, and N≠2^n, wherein the first processing is configured to divide by 2 a value held by a first node, the first node being any of the plurality of nodes and a last node in an order of counting, obtain one or more node pairs by pairing the remaining nodes among the plurality of nodes except for the first node, and repeatedly calculate an average value of the values held by each node pair of the one or more node pairs.

Neural network computation method using adaptive data representation

A method for neural network computation using adaptive data representation, adapted for a processor to perform multiply-and-accumulate operations on a memory having a crossbar architecture, is provided. The memory comprises multiple input and output lines crossing each other, multiple cells respectively disposed at intersections of the input and output lines, and multiple sense amplifiers respectively connected to the output lines. In the method, an input cycle of the kth bits of the input data is adaptively divided into multiple sub-cycles, wherein the number of divided sub-cycles is determined according to the value of k. The kth bits of the input data are inputted to the input lines in the sub-cycles, and computation results of the output lines are sensed by the sense amplifiers. The computation results sensed in each sub-cycle are combined to obtain the output data corresponding to the kth bits of the input data.
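
A small Python simulation can make the sub-cycle idea concrete. The abstract does not say how the number of sub-cycles depends on k or why the cycle is divided, so the sketch assumes a sense amplifier that saturates at `sense_max` and a hypothetical rule that gives the more significant bits more sub-cycles. All names and parameters are illustrative.

```python
import numpy as np


def subcycles_for_bit(k, n_bits):
    """Hypothetical rule: the upper half of the bit positions gets 4 sub-cycles."""
    return 4 if k >= n_bits // 2 else 1


def crossbar_mac(inputs, weights, n_bits=4, sense_max=7, subcycles=subcycles_for_bit):
    """Bit-serial MAC on a simulated crossbar with saturating sense amplifiers."""
    inputs = np.asarray(inputs)
    weights = np.asarray(weights)  # shape: (input lines, output lines)
    out = np.zeros(weights.shape[1], dtype=np.int64)
    for k in range(n_bits):
        bits_k = (inputs >> k) & 1  # kth bit of every input value
        # Each sub-cycle drives only a subset of the input lines.
        groups = np.array_split(np.arange(len(inputs)), subcycles(k, n_bits))
        partial = np.zeros_like(out)
        for lines in groups:
            sensed = np.minimum(bits_k[lines] @ weights[lines], sense_max)
            partial += sensed  # combine the results sensed in each sub-cycle
        out += partial << k  # weight the combined result by the bit significance
    return out
```

With four inputs of 15 and a single output line whose cells all hold 3, the exact result is 180. One cycle per bit saturates every bit plane and yields 105, the hypothetical rule above yields 165, and four sub-cycles for every bit recover 180 at the cost of more sensing steps.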

Adaptive quantization and mixed precision in a network

A method of adaptive quantization for a convolutional neural network includes at least one of receiving an acceptable model accuracy, determining a float value multiply accumulate for the layer based on a float value weight and a float value input, quantizing the float value weight at multiple weight quantization precisions, quantizing the float value input at multiple input quantization precisions, determining a multiply accumulate at multiple multiply accumulate quantization precisions based on the weight quantization precisions and the input quantization precisions, determining multiple quantization errors based on differences between the float value multiply accumulate and the multiply accumulates at the multiple multiply accumulate quantization precisions, and selecting one of the multiple weight quantization precisions, one of the multiple input quantization precisions and one of the multiple multiply accumulate quantization precisions based on the acceptable model accuracy and the multiple quantization errors.
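
The selection loop can be sketched in Python as follows. This is a simplified illustration under stated assumptions: a per-layer error tolerance `max_error` stands in for the acceptable model accuracy, the separate multiply accumulate precision is omitted, the cost of a configuration is taken to be the product of the two bit widths, and the symmetric quantizer is a common choice rather than one named in the abstract.

```python
import numpy as np


def quantize(x, bits):
    """Symmetric uniform quantization onto signed `bits`-bit levels (kept as floats)."""
    qmax = 2 ** (bits - 1) - 1
    max_abs = float(np.max(np.abs(x)))
    if max_abs == 0.0:
        return x.copy()
    scale = max_abs / qmax
    return np.round(x / scale) * scale


def select_precisions(weight, inp, candidates=(4, 8, 16), max_error=1e-2):
    """Return the cheapest (weight_bits, input_bits, error) within the tolerance."""
    reference = float(weight @ inp)  # float value multiply accumulate
    pairs = sorted(
        ((wb, ib) for wb in candidates for ib in candidates),
        key=lambda p: (p[0] * p[1], p),  # assumed cost model: product of bit widths
    )
    for wb, ib in pairs:
        mac = float(quantize(weight, wb) @ quantize(inp, ib))
        error = abs(mac - reference)  # quantization error for this configuration
        if error <= max_error:
            return wb, ib, error
    return None  # no candidate meets the tolerance
```

A loose tolerance selects the lowest precisions, and tightening `max_error` pushes the selection toward wider bit widths.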

Neural network based on total hamming distance
11507814 · 2022-11-22

Disclosed herein are a system, a method, and a device for improving power efficiency of a neural network implemented in an AI chip. In a neural network, large amounts of computations for multiply and accumulate can result in frequent toggles or transitions in states of logic circuits in the AI chip. Such frequent toggles or transitions of states of logic circuits can cause a large overall power consumption. In one aspect, to minimize the number of toggles, a sequence or order of computations can be rearranged. In one approach, total hamming distances for weights or input strings in different arrangements or sequences can be identified, and an arrangement or a sequence of weights or input strings with a reduced or minimum total hamming distance can be identified. An arrangement or a sequence of weights that renders a reduced total hamming distance can be identified.
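
As an illustration of the reordering idea, the following Python sketch computes the total Hamming distance of a weight sequence and searches for the arrangement that minimizes it. The exhaustive search is an assumption chosen for clarity; it is practical only for short sequences, and the abstract does not say which search procedure is used.

```python
from itertools import permutations


def hamming(a, b):
    """Number of bit positions in which two integers differ."""
    return bin(a ^ b).count("1")


def total_hamming(seq):
    """Sum of Hamming distances between consecutive values in the sequence."""
    return sum(hamming(a, b) for a, b in zip(seq, seq[1:]))


def best_order(weights):
    """Exhaustively find an ordering with minimum total Hamming distance."""
    return min(permutations(weights), key=total_hamming)
```

For the weights `[0b0000, 0b1111, 0b0001, 0b1110]` the original order toggles 11 bits in total, while the best order toggles 5. Because a sum of products does not depend on the order of its terms, the result is unchanged as long as each input is reordered together with its weight.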

Apparatus and method with in-memory computing

A multiply-accumulator (MAC) circuit includes: a plurality of multipliers each comprising: a field-effect transistor configured to apply an intermediate voltage to a node; a pair of resistive devices having resistance values determined based on the intermediate voltage applied to one ends connected to the node and weight setting voltages applied to the other ends; and a capacitor configured to be charged and discharged with an electric charge by receiving a voltage generated in the node based on a combined resistance value of the pair of resistive devices and input voltages applied individually to the other ends of the pair of resistive devices in response to individual resistance values of the pair of resistive devices being determined; and an output line configured to output a voltage based on electric charges charged to and discharged from the plurality of multipliers.

Method and apparatus for generating fixed-point quantized neural network

A method of generating a fixed-point quantized neural network includes analyzing a statistical distribution for each channel of floating-point parameter values of feature maps and a kernel for each channel from data of a pre-trained floating-point neural network, determining a fixed-point expression of each of the parameters for each channel statistically covering a distribution range of the floating-point parameter values based on the statistical distribution for each channel, determining fractional lengths of a bias and a weight for each channel among the parameters of the fixed-point expression for each channel based on a result of performing a convolution operation, and generating a fixed-point quantized neural network in which the bias and the weight for each channel have the determined fractional lengths.
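
A minimal Python sketch of per-channel fractional lengths follows. It assumes an 8-bit signed format and uses the maximum absolute value of each channel as the statistic that covers the distribution range; the abstract names neither choice, and the step that derives bias and weight fractional lengths from the convolution result is not modeled.

```python
import numpy as np


def fractional_length(values, bit_width=8):
    """Fractional bits left after reserving a sign bit and the integer bits needed."""
    max_abs = float(np.max(np.abs(values)))
    if max_abs == 0.0:
        return bit_width - 1
    int_bits = int(np.floor(np.log2(max_abs))) + 1  # bits for the integer part
    return bit_width - 1 - int_bits


def per_channel_frac_lengths(weights, bit_width=8):
    """One fractional length per output channel (first axis of `weights`)."""
    return [fractional_length(channel, bit_width) for channel in weights]


def to_fixed(values, frac_len, bit_width=8):
    """Quantize to signed integers with `frac_len` fractional bits, saturating."""
    lo, hi = -(1 << (bit_width - 1)), (1 << (bit_width - 1)) - 1
    return np.clip(np.round(values * 2.0 ** frac_len), lo, hi).astype(np.int32)
```

A channel whose largest magnitude is 0.5 gets 7 fractional bits, while a channel reaching 3.0 gets 5, so each channel uses the available 8 bits as fully as its own range allows.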

Integrated circuit chip apparatus

Provided are an integrated circuit chip apparatus and a related product, the integrated circuit chip apparatus being used for executing a multiplication operation, a convolution operation or a training operation of a neural network. The present technical solution has the advantages of a small amount of calculation and low power consumption.

In-memory computing architecture and methods for performing MAC operations

In-memory computing architectures and methods of performing multiply-and-accumulate operations are provided. The method includes sequentially shifting bits of first input bytes into each row in an array of memory cells arranged in rows and columns. Each memory cell is activated based on the bit to produce a bit-line current from each activated memory cell in a column on a shared bit-line proportional to a product of the bit and a weight stored therein. Charges produced by a sum of the bit-line currents in a column are accumulated in first charge-storage banks coupled to a shared bit-line in each of the columns. Concurrently, charges from second input bytes accumulated in second charge-storage banks previously coupled to the columns are sequentially converted into output bytes. The charge-storage banks are exchanged after the first input bytes have been accumulated and the charges from the second input bytes converted. The method then repeats.
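
The alternation between the two charge-storage banks can be modeled with a short Python simulation. It is a behavioral sketch only: charges are plain integers, each bit is assumed to contribute with binary weighting, the concurrent accumulate and convert steps run one after the other, and the function names are hypothetical.

```python
def accumulate(input_bytes, weights):
    """Shift input bits in serially and sum the per-column 'charge'."""
    n_cols = len(weights[0])
    charge = [0] * n_cols
    for bit in range(8):
        for row, byte in enumerate(input_bytes):
            if (byte >> bit) & 1:  # the cell is activated only when the bit is 1
                for col in range(n_cols):
                    # Assumed binary weighting of each bit's contribution.
                    charge[col] += weights[row][col] << bit
    return charge


def run(batches, weights):
    """Alternate two banks: one accumulates while the other is converted."""
    banks = [None, None]
    acc = 0  # index of the bank currently coupled to the columns
    outputs = []
    for batch in batches:
        banks[acc] = accumulate(batch, weights)
        conv = 1 - acc
        if banks[conv] is not None:
            outputs.append(banks[conv])  # convert the previously filled bank
            banks[conv] = None
        acc = conv  # exchange the banks
    last = banks[1 - acc]
    if last is not None:
        outputs.append(last)  # flush the final bank
    return outputs
```

With weights `[[1, 2], [3, 4]]` and input batches `[[1, 1], [2, 0]]`, the outputs are `[[4, 6], [2, 4]]`, matching the ordinary dot products of each batch with the weight columns.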

Processing-in-memory (PIM) devices
11586500 · 2023-02-21

A method of performing a MAC arithmetic operation includes detecting error correction capability for first data when a command has a logic level combination for performing the MAC arithmetic operation; correcting an error, included in the first data, when the number of erroneous bits included in the first data is equal to or less than the error correction capability; and outputting, to a PIM controller, MAC calculation result data generated by performing the MAC arithmetic operation on the error-corrected first data.