G06F7/525

REDUCED RESULT MATRIX
20220035891 · 2022-02-03 ·

Matrix multiple operations may use a reduced result matrix to increase the speed and accuracy of the operation. In one example, each higher precision row/column is decomposed into multiple component rows/columns of the base type that can be combined as weighted sums to form the original higher precision row/column. In another example, the decomposition may be independent for each input matrix and decompose to any multiple of the base type. In another example, the base type for each input matrix could be different. In another example, after decomposition, a matrix operation is performed (e.g. matrix multiply, convolutional layer, or possibly other matrix operation) on decomposed base type input matrices to yield a result matrix that contains components of the higher precision results. The results may be combined together to obtain higher-precision results.

REDUCED RESULT MATRIX
20220035891 · 2022-02-03 ·

Matrix multiple operations may use a reduced result matrix to increase the speed and accuracy of the operation. In one example, each higher precision row/column is decomposed into multiple component rows/columns of the base type that can be combined as weighted sums to form the original higher precision row/column. In another example, the decomposition may be independent for each input matrix and decompose to any multiple of the base type. In another example, the base type for each input matrix could be different. In another example, after decomposition, a matrix operation is performed (e.g. matrix multiply, convolutional layer, or possibly other matrix operation) on decomposed base type input matrices to yield a result matrix that contains components of the higher precision results. The results may be combined together to obtain higher-precision results.

Protection of an iterative calculation

Cryptographic circuitry, in operation, performs a calculation on a first number and a second number. The performing of the calculation is protected by breaking the second number into a plurality of third numbers, a sum of values of the third numbers being equal to a value of the second number. The calculation is performed bit by bit for each rank of the third numbers. Functional circuitry, coupled to the cryptographic circuitry, uses a result of the calculation.

Protection of an iterative calculation

Cryptographic circuitry, in operation, performs a calculation on a first number and a second number. The performing of the calculation is protected by breaking the second number into a plurality of third numbers, a sum of values of the third numbers being equal to a value of the second number. The calculation is performed bit by bit for each rank of the third numbers. Functional circuitry, coupled to the cryptographic circuitry, uses a result of the calculation.

COMPUTE IN MEMORY ACCUMULATOR

A compute-in memory (CIM) device is configured to determine at least one input according to a type of an application and at least one weight according to a training result or a configuration of a user. The CIM device performs a bit-serial multiplication based on the input and the weight, from a most significant bit (MSB) of the input to a least significant bit (LSB) of the input to obtain a result according to a plurality of partial-products. A first partial-sum of a first bit of the input is left shifted one bit and then added with a second partial-product of a second bit of the input to obtain a second partial-sum of the second bit. The second bit is one bit after the first bit, and the result is output by the CIM device.

COMPUTE IN MEMORY ACCUMULATOR

A compute-in memory (CIM) device is configured to determine at least one input according to a type of an application and at least one weight according to a training result or a configuration of a user. The CIM device performs a bit-serial multiplication based on the input and the weight, from a most significant bit (MSB) of the input to a least significant bit (LSB) of the input to obtain a result according to a plurality of partial-products. A first partial-sum of a first bit of the input is left shifted one bit and then added with a second partial-product of a second bit of the input to obtain a second partial-sum of the second bit. The second bit is one bit after the first bit, and the result is output by the CIM device.

PROTECTION OF AN ITERATIVE CALCULATION
20200313846 · 2020-10-01 ·

Cryptographic circuitry, in operation, performs a calculation on a first number and a second number. The performing of the calculation is protected by breaking the second number into a plurality of third numbers, a sum of values of the third numbers being equal to a value of the second number. The calculation is performed bit by bit for each rank of the third numbers. Functional circuitry, coupled to the cryptographic circuitry, uses a result of the calculation.

PROTECTION OF AN ITERATIVE CALCULATION
20200313846 · 2020-10-01 ·

Cryptographic circuitry, in operation, performs a calculation on a first number and a second number. The performing of the calculation is protected by breaking the second number into a plurality of third numbers, a sum of values of the third numbers being equal to a value of the second number. The calculation is performed bit by bit for each rank of the third numbers. Functional circuitry, coupled to the cryptographic circuitry, uses a result of the calculation.

Counter-based multiplication using processing in memory
11934798 · 2024-03-19 · ·

The present disclosure is directed to systems and methods for a memory device such as, for example, a Processing-In-Memory Device that is configured to perform multiplication operations in memory using a popcount operation. A multiplication operation may include a summation of multipliers being multiplied with corresponding multiplicands. The inputs may be arranged in particular configurations within a memory array. Sense amplifiers may be used to perform the popcount by counting active bits along bit lines. One or more registers may accumulate results for performing the multiplication operations.

Apparatus and method for vector instructions for large integer arithmetic

An apparatus is described that includes a semiconductor chip having an instruction execution pipeline having one or more execution units with respective logic circuitry to: a) execute a first instruction that multiplies a first input operand and a second input operand and presents a lower portion of the result, where, the first and second input operands are respective elements of first and second input vectors; b) execute a second instruction that multiplies a first input operand and a second input operand and presents an upper portion of the result, where, the first and second input operands are respective elements of first and second input vectors; and, c) execute an add instruction where a carry term of the add instruction's adding is recorded in a mask register.