Patent classifications
G06F7/764
Method and apparatus for performing a vector permute with an index and an immediate
A processor for performing a vector permute comprises: a source vector register to store a plurality of source data elements; a destination vector register to store a plurality of destination data elements; a control vector register to store a plurality of control data elements, each control data element corresponding to one of the destination data elements and including an N bit value indicating whether a source data element is to be copied to the corresponding destination data element; vector permute logic to compare the N bit value of each control data element to an N bit portion of an immediate to determine whether to copy a source data element to the corresponding destination data element, wherein if the N bit values match, then the vector permute logic is to identify a source data element using an index value included in the control data element.
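The compare-then-index behaviour described in this abstract can be sketched in software. This is only an illustrative model, not the claimed hardware: the packing of each control element (an N-bit tag above an index field), the field widths, and the choice to leave a destination element unchanged on a mismatch are all assumptions, and the name `vector_permute` is hypothetical.

```python
def vector_permute(src, dst, ctrl, imm, n_bits=2, idx_bits=4):
    """Model of a permute gated by an immediate.

    Assumed layout of each control element: [ N-bit tag | idx_bits-bit index ].
    Assumed mismatch behaviour: the destination element keeps its old value.
    """
    tag_mask = (1 << n_bits) - 1
    idx_mask = (1 << idx_bits) - 1
    imm_tag = imm & tag_mask          # N-bit portion of the immediate
    out = list(dst)
    for i, c in enumerate(ctrl):
        tag = (c >> idx_bits) & tag_mask
        if tag == imm_tag:
            # Tags match: the index field selects the source element to copy.
            out[i] = src[c & idx_mask]
    return out


src = [10, 11, 12, 13]
dst = [0, 0, 0, 0]
# Control elements built as (tag << 4) | index.
ctrl = [(1 << 4) | 3, (2 << 4) | 0, (1 << 4) | 1, (0 << 4) | 2]
result = vector_permute(src, dst, ctrl, imm=1)
```

With `imm=1`, only the first and third control elements carry a matching tag, so only those destination positions receive source data.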
Multiplier protected against power analysis attacks
A multi-word multiplier circuit includes an interface and circuitry. The interface is configured to receive a first parameter X including one or more first words, and a second parameter Y including multiple second words. The second parameter is a blinded version of a non-blinded parameter Y′, blinded using a blinding parameter A_Y so that Y = Y′ + A_Y. The circuitry is configured to calculate a product Z = X·Y′ by summing multiple sub-products, each calculated by multiplying a first word of X by a second word of Y, and by subtracting from intermediate temporary sums of the sub-products respective third words of a partial product P = X·B_Y, where B_Y is a blinding word included in A_Y.
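The arithmetic behind this abstract can be checked with a small integer model. It is a sketch under stated assumptions, not the patented circuit: it assumes the blinding parameter consists of the blinding word repeated in each of the `k` word positions of the non-blinded operand, and the helper names (`to_words`, `blind`, `blinded_multiply`) are hypothetical. Python integers also give no protection against power analysis; the code only shows that the sums and subtractions recover the product of X with the non-blinded operand.

```python
def to_words(value, w, count):
    """Split a non-negative integer into `count` little-endian w-bit words."""
    mask = (1 << w) - 1
    return [(value >> (w * j)) & mask for j in range(count)]


def blind(y_plain, b_y, w, k):
    """Assumed blinding: add the word b_y in each of the k word positions."""
    a_y = sum(b_y << (w * j) for j in range(k))
    return y_plain + a_y


def blinded_multiply(x, y_blinded, b_y, w, nx, k):
    """Return x * y_plain while only ever touching the words of y_blinded."""
    x_words = to_words(x, w, nx)
    y_words = to_words(y_blinded, w, k + 1)   # one extra word for the carry
    p_words = to_words(x * b_y, w, nx + 1)    # partial product P = X * B_Y
    acc = 0
    for j, yj in enumerate(y_words):
        # Sub-products: one word of X times one word of blinded Y.
        for i, xi in enumerate(x_words):
            acc += (xi * yj) << (w * (i + j))
        # Remove the blinding contribution of this word position.
        if j < k:
            for t, pt in enumerate(p_words):
                acc -= pt << (w * (t + j))
    return acc


x, y_plain, b_y = 0xBEEF, 0x123456, 0xA7
y_blinded = blind(y_plain, b_y, w=8, k=3)
z = blinded_multiply(x, y_blinded, b_y, w=8, nx=2, k=3)
```

Here `z` equals `x * y_plain` even though the non-blinded operand never appears inside `blinded_multiply`.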
ARITHMETIC PROCESSING DEVICE, IMAGE PROCESSING DEVICE, AND IMAGING DEVICE
An arithmetic processing device of a pipeline configuration, in which a combination of a combination circuit and a flip-flop circuit group including a plurality of flip-flop circuits corresponding to the respective bits of output data of the combination circuit is connected in a plurality of stages, includes a mask processing section configured to control a mask of an operation clock signal to be supplied to each flip-flop circuit, wherein the mask processing section is configured to supply the operation clock signal to each flip-flop circuit corresponding to a bit of the input data for use in the arithmetic process in the combination circuit, and wherein the mask processing section is configured to mask the operation clock signal corresponding to a bit of the input data that is unused in the arithmetic process in the combination circuit.
Statistical object generator
The present invention provides methods and apparatus to generate a statistical object, the deterministic statistical representation of an original object, using a Deterministic Random Bit Generator (DRBG) (10). Multiple DRBG Statistical Object Generators (10) may be chained together to increase security by using independent security configurations (22) for each DRBG Statistical Object Generator (10).
Zero coefficient skipping convolution neural network engine
A convolution engine, such as a convolution neural network, operates efficiently with respect to sparse kernels by implementing zero skipping. An input tile is loaded and accumulated sums are calculated for the input tile for non-zero coefficients by shifting the tile according to a row and column index of the coefficient in the kernel. Each coefficient is applied individually to the tile and the result written to an accumulation buffer before moving to the next non-zero coefficient. A 3D or 4D convolution may be implemented in this manner with separate regions of the accumulation buffer storing accumulated sums for different indexes along one dimension. Images are completely processed and results for each image are stored in the accumulation buffer before moving to the next image.
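The coefficient-at-a-time loop order described above can be illustrated for a single 2D tile. This is a minimal sketch assuming a "valid" (no padding) cross-correlation on plain Python lists; the function name `conv2d_zero_skip` is hypothetical, and the 3D/4D case with separate accumulation-buffer regions is not modelled.

```python
def conv2d_zero_skip(tile, kernel):
    """2D 'valid' cross-correlation that visits only non-zero coefficients."""
    h, w = len(tile), len(tile[0])
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = h - kh + 1, w - kw + 1
    acc = [[0] * ow for _ in range(oh)]      # accumulation buffer
    for r in range(kh):
        for c in range(kw):
            coeff = kernel[r][c]
            if coeff == 0:
                continue                     # zero skipping: no work at all
            # Apply this one coefficient to the tile shifted by (r, c).
            for y in range(oh):
                for x in range(ow):
                    acc[y][x] += coeff * tile[y + r][x + c]
    return acc


tile = [[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]]
kernel = [[1, 0],
          [0, 2]]
out = conv2d_zero_skip(tile, kernel)
```

With two of the four coefficients equal to zero, the inner accumulation pass runs twice instead of four times.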
Weight processing for a neural network
Systems and methods for processing data for a neural network are described. The system comprises non-transitory memory configured to receive data bits defining a kernel of weights, the data bits being suitable for processing input data; and a data processing unit, configured to: receive bits defining a kernel of weights for the neural network, the kernel of weights comprising one or more non-zero-valued weights and one or more zero-valued weights; generate a set of mask bits, wherein a position of each bit in the set of mask bits corresponds to a position within the kernel of weights and the value of each bit indicates whether the weight in the corresponding position is a zero-valued weight or a non-zero-valued weight; and transmit the non-zero-valued weights and the set of mask bits for storage, wherein the non-zero-valued weights and the set of mask bits together represent the kernel of weights.
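The mask-plus-non-zero-weights representation can be sketched as a pair of helper functions. This is an illustration only: the kernel is assumed to be flattened into a list, a mask bit of 1 is assumed to mean "non-zero weight", and the names `compress_kernel` and `decompress_kernel` are hypothetical.

```python
def compress_kernel(weights):
    """Return (non-zero weights, mask bits) for a flattened kernel.

    Assumed polarity: mask bit 1 marks a non-zero weight, 0 marks a zero.
    """
    mask = [1 if w != 0 else 0 for w in weights]
    nonzero = [w for w in weights if w != 0]
    return nonzero, mask


def decompress_kernel(nonzero, mask):
    """Rebuild the flattened kernel from its compressed representation."""
    values = iter(nonzero)
    return [next(values) if bit else 0 for bit in mask]


kernel = [0, 3, 0, 0, -1, 0, 0, 0, 7]       # flattened 3x3 kernel
nonzero, mask = compress_kernel(kernel)
restored = decompress_kernel(nonzero, mask)
```

For this kernel, three weights and nine mask bits are stored instead of nine full-width weights.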
CONVERTING A BOOLEAN MASKED VALUE TO AN ARITHMETICALLY MASKED VALUE FOR CRYPTOGRAPHIC OPERATIONS
A first input share value, a second input share value, and a third input share value may be received. The first input share value may be converted to a summation or subtraction between an input value and a combination of the second input share value and the third input share value. A random number value may be generated and combined with the second input share value and the third input share value to generate a combined value. Furthermore, a first output share value may be generated based on a combination of the converted first input share value, the combined value, and additional random number values.
METHOD FOR SELECTING A VALUE AMONGST TWO VALUES RECORDED IN TWO DIFFERENT REGISTERS
A method includes performing a cryptographic operation using a processing device. The performing the cryptographic operation includes protecting the performing of the cryptographic operation against side channel attacks by selecting a value amongst two values based on a selection bit. Selecting the value includes concatenating the two values in a register, generating a concatenated word including the two values in two distinct portions of the concatenated word in the register. The concatenated word is rotated according to the value of the selection bit to position the selected value in a determined portion of the concatenated word in the register amongst said two portions. The unselected value in the concatenated word is suppressed. One or more processing operations are performed based on a result of the cryptographic operation.
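The concatenate, rotate, and suppress sequence can be modelled with integer operations. This sketch only shows the data flow: the word width, the choice of the low half as the "determined portion", and the mapping of the selection bit to a rotation amount are assumptions, the name `select_by_rotation` is hypothetical, and Python integers offer no actual side-channel guarantees.

```python
def select_by_rotation(a, b, sel, w=32):
    """Branch-free selection: returns a if sel == 1, b if sel == 0.

    a and b must fit in w bits. The low half of the 2w-bit word is
    assumed to be the portion from which the result is read.
    """
    full = (1 << (2 * w)) - 1
    word = ((a << w) | b) & full             # concatenate: [ a | b ]
    shift = sel * w                          # rotate by 0 or by w bits
    rotated = ((word >> shift) | (word << (2 * w - shift))) & full
    return rotated & ((1 << w) - 1)          # suppress the unselected half


picked_a = select_by_rotation(0xAAAA5555, 0x12345678, sel=1)
picked_b = select_by_rotation(0xAAAA5555, 0x12345678, sel=0)
```

Both calls execute the same sequence of operations; only the rotation amount depends on the selection bit.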
METHOD FOR CALCULATING A TRANSITION FROM A BOOLEAN MASKING TO AN ARITHMETIC MASKING
A method is provided for re-masking from a Boolean mask to an arithmetic mask with a modulus 2^m·p, in which m is an integer greater than or equal to zero, and p has at least one prime divisor unequal to 2, so that a carry is generated. The carry is masked or balanced to protect it against intrusion attacks.
SYSTEMS AND METHODS FOR HARDWARE ACCELERATION OF DATA MASKING
A field programmable gate array (FPGA) including a configurable interconnect fabric connecting a plurality of logic blocks, the configurable interconnect fabric and the logic blocks being configured to implement a data masking circuit configured to: receive input data including data values at a plurality of indices of the input data; select between a data value of the data values and an alternative value using a masking multiplexer to generate masked data, the masking multiplexer being controlled by a mask value of a plurality of mask values at indices corresponding to the indices of the input data; and output the masked data. In some examples, the configurable interconnect fabric and the logic blocks are further configured to implement a mask generation circuit configured to generate the mask values. In some examples, the mask values are received from external memory.
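The per-index masking multiplexer can be modelled in a few lines of software. This is a behavioural sketch rather than an FPGA design: the polarity (a truthy mask value passes the data value through, a falsy one selects the alternative) and the default alternative of 0 are assumptions, and the name `mask_data` is hypothetical.

```python
def mask_data(data, mask, alternative=0):
    """Software model of a masking multiplexer applied at every index.

    Assumed polarity: truthy mask value keeps the data value,
    falsy mask value substitutes `alternative`.
    """
    if len(data) != len(mask):
        raise ValueError("data and mask must have the same length")
    return [x if m else alternative for x, m in zip(data, mask)]


data = [4, 8, 15, 16, 23, 42]
mask = [1, 0, 1, 1, 0, 0]
masked = mask_data(data, mask, alternative=-1)
```

In hardware each index would have its own multiplexer operating in parallel; the list comprehension here evaluates them one after another.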