Patent classifications
G06F7/57
Computing chip, hashrate board and data processing apparatus
This disclosure relates to a computing chip, a hashrate board, and a data processing apparatus. The computing chip includes a plurality of operation stages arranged in a pipeline configuration. Each operation stage includes: a first combinational logic circuit occupying a plurality of first cell points adjacent to each other, at least a portion of the first cell points being located in a first incomplete column; one or more second combinational logic circuits each occupying one or more second cell points, at least a portion of the second cell points being located in a second incomplete column; and a plurality of registers each occupying a plurality of third cell points, at least a portion of the third cell points being located in the first incomplete column or the second incomplete column. The first cell points, the second cell points, and third cell points occupy equal areas on the computing chip.
Computing chip, hashrate board and data processing apparatus
This disclosure relates to a computing chip, a hashrate board, and a data processing apparatus. The computing chip includes a plurality of operation stages arranged in a pipeline configuration. Each operation stage includes: a first combinational logic circuit occupying a plurality of first cell points adjacent to each other, at least a portion of the first cell points being located in a first incomplete column; one or more second combinational logic circuits each occupying one or more second cell points, at least a portion of the second cell points being located in a second incomplete column; and a plurality of registers each occupying a plurality of third cell points, at least a portion of the third cell points being located in the first incomplete column or the second incomplete column. The first cell points, the second cell points, and third cell points occupy equal areas on the computing chip.
Method and system for processing neural network
The present disclosure provides a neural network processing system that comprises a multi-core processing module composed of a plurality of core processing modules and for executing vector multiplication and addition operations in a neural network operation, an on-chip storage medium, an on-chip address index module, and an ALU module for executing a non-linear operation not completable by the multi-core processing module according to input data acquired from the multi-core processing module or the on-chip storage medium, wherein the plurality of core processing modules share an on-chip storage medium and an ALU module, or the plurality of core processing modules have an independent on-chip storage medium and an ALU module. The present disclosure improves an operating speed of the neural network processing system, such that performance of the neural network processing system is higher and more efficient.
Method and system for processing neural network
The present disclosure provides a neural network processing system that comprises a multi-core processing module composed of a plurality of core processing modules and for executing vector multiplication and addition operations in a neural network operation, an on-chip storage medium, an on-chip address index module, and an ALU module for executing a non-linear operation not completable by the multi-core processing module according to input data acquired from the multi-core processing module or the on-chip storage medium, wherein the plurality of core processing modules share an on-chip storage medium and an ALU module, or the plurality of core processing modules have an independent on-chip storage medium and an ALU module. The present disclosure improves an operating speed of the neural network processing system, such that performance of the neural network processing system is higher and more efficient.
METHOD AND APPARATUS FOR VECTOR SORTING USING VECTOR PERMUTATION LOGIC
A method for sorting of a vector in a processor is provided that includes performing, by the processor in response to a vector sort instruction, generating a control input vector for vector permutation logic comprised in the processor based on values in lanes of the vector and a sort order for the vector indicated by the vector sort instruction and storing the control input vector in a storage location.
METHOD AND APPARATUS FOR VECTOR SORTING USING VECTOR PERMUTATION LOGIC
A method for sorting of a vector in a processor is provided that includes performing, by the processor in response to a vector sort instruction, generating a control input vector for vector permutation logic comprised in the processor based on values in lanes of the vector and a sort order for the vector indicated by the vector sort instruction and storing the control input vector in a storage location.
Method and apparatus with neural network parameter quantization
Provided is a processor implemented method that includes performing training or an inference operation with a neural network by obtaining a parameter for the neural network in a floating-point format, applying a fractional length of a fixed-point format to the parameter in the floating-point format, performing an operation with an integer arithmetic logic unit (ALU) to determine whether to round off a fixed point based on a most significant bit among bit values to be discarded after a quantization process, and performing an operation of quantizing the parameter in the floating-point format to a parameter in the fixed-point format, based on a result of the operation with the ALU.
Method and apparatus with neural network parameter quantization
Provided is a processor implemented method that includes performing training or an inference operation with a neural network by obtaining a parameter for the neural network in a floating-point format, applying a fractional length of a fixed-point format to the parameter in the floating-point format, performing an operation with an integer arithmetic logic unit (ALU) to determine whether to round off a fixed point based on a most significant bit among bit values to be discarded after a quantization process, and performing an operation of quantizing the parameter in the floating-point format to a parameter in the fixed-point format, based on a result of the operation with the ALU.
MEMORY ARRAY STRUCTURE
The present invention disclosures a memory array structure, comprising an array composed of multiple memory devices arranged in rows and columns, each of the rows is set with a row leading-out wire, and each of the columns is set with a column leading-out wire, memory devices are correspondingly positioned at intersection points of each row leading-out wire and each column leading-out wire; wherein, the first terminal of each of the memory devices is individually connected to the row leading-out wire of the same row, and the second terminal of each of the memory devices is connected to a first terminal of a switch in the same column, the second terminal of the switch is connected to the column leading-out wire of the same column; wherein, each of the rows is set with one to multiple the switches, and the first terminal of each of the switches is connected to one to all of the second terminals of the memory devices in the same column. The advantage of the present invention is that the corresponding analog currents output of input signals of different specified rows according to multiply-accumulate operation requirements of each of the columns can be obtained simultaneously, thus multiply-accumulate operations of different input signals of different scales can be performed, which greatly improves operation speed and using efficiency of the array.
MEMORY ARRAY STRUCTURE
The present invention disclosures a memory array structure, comprising an array composed of multiple memory devices arranged in rows and columns, each of the rows is set with a row leading-out wire, and each of the columns is set with a column leading-out wire, memory devices are correspondingly positioned at intersection points of each row leading-out wire and each column leading-out wire; wherein, the first terminal of each of the memory devices is individually connected to the row leading-out wire of the same row, and the second terminal of each of the memory devices is connected to a first terminal of a switch in the same column, the second terminal of the switch is connected to the column leading-out wire of the same column; wherein, each of the rows is set with one to multiple the switches, and the first terminal of each of the switches is connected to one to all of the second terminals of the memory devices in the same column. The advantage of the present invention is that the corresponding analog currents output of input signals of different specified rows according to multiply-accumulate operation requirements of each of the columns can be obtained simultaneously, thus multiply-accumulate operations of different input signals of different scales can be performed, which greatly improves operation speed and using efficiency of the array.