G06F7/53

COMPUTING APPARATUS AND METHOD FOR NEURAL NETWORK OPERATION, INTEGRATED CIRCUIT, AND DEVICE
20220350569 · 2022-11-03 ·

The present disclosure relates to a computing apparatus, a method, an integrated circuit chip and an integrated circuit device for performing a neural network operation. The computing apparatus may be included in a combined processing apparatus. The combined processing apparatus may further include a general interconnection interface and other processing apparatus. The computing apparatus interacts with other processing apparatus to jointly complete calculation operations specified by users. The combined processing apparatus may further include a storage apparatus. The storage apparatus is respectively connected to the computing apparatus and other processing apparatus, and the storage apparatus is used for storing data of the computing apparatus and other processing apparatus. Solutions of the present disclosure may be widely applied to various floating-point data computations.

COMPUTING APPARATUS AND METHOD FOR NEURAL NETWORK OPERATION, INTEGRATED CIRCUIT, AND DEVICE
20220350569 · 2022-11-03 ·

The present disclosure relates to a computing apparatus, a method, an integrated circuit chip and an integrated circuit device for performing a neural network operation. The computing apparatus may be included in a combined processing apparatus. The combined processing apparatus may further include a general interconnection interface and other processing apparatus. The computing apparatus interacts with other processing apparatus to jointly complete calculation operations specified by users. The combined processing apparatus may further include a storage apparatus. The storage apparatus is respectively connected to the computing apparatus and other processing apparatus, and the storage apparatus is used for storing data of the computing apparatus and other processing apparatus. Solutions of the present disclosure may be widely applied to various floating-point data computations.

POLYNOMIAL MULTIPLICATION FOR SIDE-CHANNEL PROTECTION IN CRYPTOGRAPHY
20230091951 · 2023-03-23 · ·

Polynomial multiplication for side-channel protection in cryptography is described. An example of a apparatus includes one or more processors to process data; a memory to store data; and polynomial multiplier circuitry to multiply a first polynomial by a second polynomial, the first polynomial and the second polynomial each including a plurality of coefficients, the polynomial multiplier circuitry including a set of multiplier circuitry, wherein the polynomial multiplier circuitry is to select a first coefficient of the first polynomial for processing, and multiply the first coefficient of the first polynomial by all of the plurality of coefficients of the second polynomial in parallel using the set of multiplier circuits.

POLYNOMIAL MULTIPLICATION FOR SIDE-CHANNEL PROTECTION IN CRYPTOGRAPHY
20230091951 · 2023-03-23 · ·

Polynomial multiplication for side-channel protection in cryptography is described. An example of a apparatus includes one or more processors to process data; a memory to store data; and polynomial multiplier circuitry to multiply a first polynomial by a second polynomial, the first polynomial and the second polynomial each including a plurality of coefficients, the polynomial multiplier circuitry including a set of multiplier circuitry, wherein the polynomial multiplier circuitry is to select a first coefficient of the first polynomial for processing, and multiply the first coefficient of the first polynomial by all of the plurality of coefficients of the second polynomial in parallel using the set of multiplier circuits.

METHOD AND APPARATUS FOR IMPLIED BIT HANDLING IN FLOATING POINT MULTIPLICATION
20230085048 · 2023-03-16 ·

A method is provided that includes performing, by a processor in response to a floating point multiply instruction, multiplication of floating point numbers, wherein determination of values of implied bits of leading bit encoded mantissas of the floating point numbers is performed in parallel with multiplication of the encoded mantissas, and storing, by the processor, a result of the floating point multiply instruction in a storage location indicated by the floating point multiply instruction.

METHOD AND APPARATUS FOR IMPLIED BIT HANDLING IN FLOATING POINT MULTIPLICATION
20230085048 · 2023-03-16 ·

A method is provided that includes performing, by a processor in response to a floating point multiply instruction, multiplication of floating point numbers, wherein determination of values of implied bits of leading bit encoded mantissas of the floating point numbers is performed in parallel with multiplication of the encoded mantissas, and storing, by the processor, a result of the floating point multiply instruction in a storage location indicated by the floating point multiply instruction.

CONFIGURABLE LOGIC CELL
20230077881 · 2023-03-16 ·

Configurable circuits include an input selection region, a computation region, a switching region, and an output region. The input selection region includes a set of input multiplexers and selects and routes input signals. The computation region includes a set of lookup tables, each lookup table being coupled to selected signals from the input selection stage to generate a respective output signal. The switching region includes a set of output multiplexers, each output multiplexer being coupled to output signals from the set of lookup tables to provide circuit outputs responsive to respective output selection signals. The output region includes a domino logic stage, having a set of transistors, coupled to output signals from the set of lookup tables to provide circuit outputs that determine combinations of the signals output by the set of lookup tables.

Memory device and computing device using the same
11474785 · 2022-10-18 · ·

A memory device is provided. The memory device includes: a cell region including a first metal pad, a memory cell in the cell region configured to store weight data, a peripheral region including a second metal pad and vertically connected to the memory cell by the first metal pad and the second metal pad, a buffer memory in the peripheral region configured to read the weight data from the memory cell, an input/output pad in the peripheral region configured to receive input data; and a multiply-accumulate (MAC) operator in the peripheral region configured to receive the weight data from the buffer memory and receive the input data from the input/output pad to perform a convolution operation of the weight data and the input data, wherein the input data is provided to the MAC operator during a first period, and wherein the MAC operator performs the convolution operation of the weight data and the input data during a second period overlapping with the first period.

Memory device and computing device using the same
11474785 · 2022-10-18 · ·

A memory device is provided. The memory device includes: a cell region including a first metal pad, a memory cell in the cell region configured to store weight data, a peripheral region including a second metal pad and vertically connected to the memory cell by the first metal pad and the second metal pad, a buffer memory in the peripheral region configured to read the weight data from the memory cell, an input/output pad in the peripheral region configured to receive input data; and a multiply-accumulate (MAC) operator in the peripheral region configured to receive the weight data from the buffer memory and receive the input data from the input/output pad to perform a convolution operation of the weight data and the input data, wherein the input data is provided to the MAC operator during a first period, and wherein the MAC operator performs the convolution operation of the weight data and the input data during a second period overlapping with the first period.

Fast digital multiply-accumulate (MAC) by fast digital multiplication circuit

Certain aspects provide methods and apparatus for multiplication of digital signals. In accordance with certain aspects, a multiplication circuit may be used to multiply a portion of a first digital input signal with a portion of a second digital input signal via a first multiplier circuit to generate a first multiplication signal, and multiply another portion of the first digital input signal with another portion of the second digital input signal via a second multiplier circuit to generate a second multiplication signal. A third multiplier circuit and multiple adder circuits may be used to generate an output of the multiplication circuit based on the first and second multiplication signals.