G06F9/30156

Methods and apparatuses for command shifter reduction
10825492 · 2020-11-03

Apparatuses and methods for reducing the number of command shifters are disclosed. An example apparatus includes an encoder circuit, a latency shifter circuit, and a decoder circuit. The encoder circuit may be configured to encode commands based on their command type. The latency shifter circuit, coupled to the encoder circuit, may be configured to apply a latency to the encoded commands. The decoder circuit, coupled to the latency shifter circuit, may be configured to decode the encoded commands and provide decoded commands to perform memory operations associated with the command types of the decoded commands.
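
As a rough sketch of the idea (the command names and 2-bit encodings here are invented for illustration, not taken from the patent), a single shared shift register can delay encoded commands of any type, replacing one shifter per command type:

```python
from collections import deque

# Hypothetical 2-bit encodings per command type (illustrative only).
ENCODE = {"NOP": 0b00, "READ": 0b01, "WRITE": 0b10}
DECODE = {code: name for name, code in ENCODE.items()}

class LatencyShifter:
    """One shared shift register delays encoded commands by a fixed
    number of clock cycles, instead of one shifter per command type."""
    def __init__(self, latency):
        self.stages = deque([ENCODE["NOP"]] * latency)

    def clock(self, encoded_cmd):
        # The oldest encoded command leaves the shifter as the new one enters.
        delayed = self.stages.popleft()
        self.stages.append(encoded_cmd)
        return delayed

shifter = LatencyShifter(latency=2)
out = [DECODE[shifter.clock(ENCODE[c])] for c in ["READ", "WRITE", "NOP", "NOP"]]
# The first two outputs are the NOPs that pre-filled the shifter; then the
# delayed READ and WRITE emerge in their original order.
```

Because the shifter carries only the compact encoded form, the width of the shared shift register is set by the encoding, not by the number of distinct command types.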

ARRAY BROADCAST AND REDUCTION SYSTEMS AND METHODS

The present disclosure is directed to systems and methods of performing one or more broadcast or reduction operations using direct memory access (DMA) control circuitry. The DMA control circuitry executes a modified instruction set architecture (ISA) that facilitates the broadcast distribution of data to a plurality of destination addresses in system memory circuitry. The broadcast instruction may include broadcast of a single data value to each destination address. The broadcast instruction may include broadcast of a data array to each destination address. The DMA control circuitry may also execute a reduction instruction that facilitates retrieving data from a plurality of source addresses in system memory and performing one or more operations using the retrieved data. Since the DMA control circuitry, rather than the processor circuitry, performs the broadcast and reduction operations, system speed and efficiency are beneficially enhanced.
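
A minimal functional sketch of the three instruction behaviors described above (function names and the flat-list memory model are assumptions for illustration):

```python
from functools import reduce
import operator

def dma_broadcast(memory, value, dests):
    """Broadcast a single data value to each destination address."""
    for addr in dests:
        memory[addr] = value

def dma_broadcast_array(memory, array, dests):
    """Broadcast a whole data array to each destination base address."""
    for base in dests:
        memory[base:base + len(array)] = array

def dma_reduce(memory, srcs, op=operator.add):
    """Gather data from the source addresses and combine it with `op`."""
    return reduce(op, (memory[a] for a in srcs))

mem = [0] * 16
dma_broadcast(mem, 7, dests=[1, 4, 9])
dma_broadcast_array(mem, [5, 6], dests=[12])
total = dma_reduce(mem, srcs=[1, 4, 9, 12, 13])
```

In the disclosed system these loops would run inside the DMA engine, leaving the processor free while the scattered writes and the gather-and-combine pass complete.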

ACCELERATOR SYSTEMS AND METHODS FOR MATRIX OPERATIONS

The present disclosure is directed to systems and methods for performing one or more operations on a two-dimensional tile register using an accelerator that includes a tiled matrix multiplication unit (TMU). The processor circuitry includes reservation station (RS) circuitry to communicatively couple the processor circuitry to the TMU. The RS circuitry coordinates the operations performed by the TMU. TMU dispatch queue (TDQ) circuitry in the TMU maintains the operations received from the RS circuitry in the order that the operations are received from the RS circuitry. Since the duration of each operation is not known prior to execution by the TMU, the RS circuitry maintains shadow dispatch queue (RS-TDQ) circuitry that mirrors the operations in the TDQ circuitry. Communication between the RS circuitry and the TMU provides the RS circuitry with notification of successfully executed operations and allows the RS circuitry to cancel operations where the operations are associated with branch mispredictions and/or non-retired speculatively executed instructions.
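
The TDQ / RS-TDQ pairing can be sketched as two mirrored in-order queues (class and method names are invented here; the hardware details are elided):

```python
from collections import deque

class ShadowedQueue:
    """Sketch of the TDQ / RS-TDQ pairing: the reservation station keeps a
    shadow copy of the accelerator's in-order dispatch queue so it can
    retire completed operations and flush mispredicted ones."""
    def __init__(self):
        self.tdq = deque()     # operations held by the TMU
        self.rs_tdq = deque()  # mirror held by the RS circuitry

    def dispatch(self, op):
        self.tdq.append(op)
        self.rs_tdq.append(op)

    def complete(self):
        # The TMU notifies the RS of a successfully executed operation;
        # both queues retire their oldest entry.
        self.tdq.popleft()
        return self.rs_tdq.popleft()

    def cancel_youngest(self, n):
        # Flush the n most recently dispatched operations, e.g. after a
        # branch misprediction or a non-retired speculative instruction.
        for _ in range(n):
            self.tdq.pop()
            self.rs_tdq.pop()

q = ShadowedQueue()
for op in ["tmul_a", "tmul_b", "tmul_c"]:
    q.dispatch(op)
done = q.complete()    # oldest operation retires first
q.cancel_youngest(1)   # "tmul_c" was speculative; drop it from both queues
```

Because the shadow queue always mirrors the accelerator's queue, the RS can reason about in-flight work even though each operation's duration is unknown until it executes.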

Variable register and immediate field encoding in an instruction set architecture
10776114 · 2020-09-15

A method and apparatus provide means for reducing instruction code size. An Instruction Set Architecture (ISA) encodes instructions with compact, usual, or extended bit lengths. Commonly used instructions are encoded with both compact and usual bit lengths, with the compact or usual form chosen based on power, performance, or code-size requirements. Instructions of the ISA can be used in both privileged and non-privileged operating modes of a microprocessor. The instruction encodings can be used interchangeably in software applications. Instructions from the ISA may be executed on any programmable device enabled for the ISA, including a single instruction set architecture processor or a multi-instruction set architecture processor.
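
The selection between forms can be sketched as a simple policy function (the 16/32-bit widths and the optimization labels are illustrative assumptions, not taken from the patent):

```python
def choose_encoding(has_compact_form, optimize_for):
    """Pick an encoding width for one instruction. Commonly used
    instructions have both a compact (16-bit) and a usual (32-bit) form;
    the choice is driven by code-size, power, or performance goals."""
    if has_compact_form and optimize_for in ("size", "power"):
        return 16
    return 32

# (mnemonic, has_compact_form) - a tiny illustrative program
program = [("add", True), ("div", False), ("load", True)]
size_optimized = sum(choose_encoding(c, "size") for _, c in program)
perf_optimized = sum(choose_encoding(c, "performance") for _, c in program)
```

Since both forms denote the same operation, a toolchain can flip this policy per module or per function without changing program semantics.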

Method and apparatus to process SHA-2 secure hashing algorithm

A processor includes an instruction decoder to receive a first instruction to process a secure hash algorithm 2 (SHA-2) hash algorithm, the first instruction having a first operand associated with a first storage location to store a SHA-2 state and a second operand associated with a second storage location to store a plurality of messages and round constants. The processor further includes an execution unit coupled to the instruction decoder to perform one or more iterations of the SHA-2 hash algorithm on the SHA-2 state specified by the first operand and the plurality of messages and round constants specified by the second operand, in response to the first instruction.
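
The per-round work the execution unit iterates is the standard SHA-256 compression step; a software sketch follows, with the message word and round constant pre-added to match the abstract's second operand (the initial-state usage example is illustrative):

```python
MASK32 = 0xFFFFFFFF

def rotr(x, n):
    """Rotate a 32-bit word right by n bits."""
    return ((x >> n) | (x << (32 - n))) & MASK32

def sha256_round(state, w_plus_k):
    """One SHA-256 round. `state` is the eight 32-bit working words (a..h);
    `w_plus_k` is the message word already added to its round constant."""
    a, b, c, d, e, f, g, h = state
    s1 = rotr(e, 6) ^ rotr(e, 11) ^ rotr(e, 25)
    ch = (e & f) ^ ((e ^ MASK32) & g)          # Ch(e, f, g)
    t1 = (h + s1 + ch + w_plus_k) & MASK32
    s0 = rotr(a, 2) ^ rotr(a, 13) ^ rotr(a, 22)
    maj = (a & b) ^ (a & c) ^ (b & c)          # Maj(a, b, c)
    t2 = (s0 + maj) & MASK32
    # New a = t1 + t2; new e = d + t1; the rest shift down one position.
    return ((t1 + t2) & MASK32, a, b, c, (d + t1) & MASK32, e, f, g)

# The standard SHA-256 initial hash values H0..H7 (FIPS 180-4).
STATE = (0x6A09E667, 0xBB67AE85, 0x3C6EF372, 0xA54FF53A,
         0x510E527F, 0x9B05688C, 0x1F83D9AB, 0x5BE0CD19)
new_state = sha256_round(STATE, 0x12345678)  # arbitrary W[t]+K[t] value
```

Packing the state into one operand and the pre-summed message/constant words into the other lets a single instruction drive several of these rounds per invocation.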

Data compression optimization based on client clusters

Data compression optimization based on client clusters is described. A system identifies a cluster of similar client devices in a group of client devices, by comparing data compression factors that correspond to each client device in the group of client devices. The system identifies a relationship between data compression factors corresponding to the cluster and data compression ratios corresponding to the cluster. The system identifies a client device, in the cluster, which corresponds to a data compression ratio that is inefficient relative to other compression ratios corresponding to other client devices in the cluster. The system outputs a data compression recommendation for the client device, based on data compression factors corresponding to the client device and the identified relationship between the data compression factors corresponding to the cluster and the data compression ratios corresponding to the cluster.
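
A toy version of the pipeline described above (the exact clustering method and the "inefficient" criterion are not specified in the abstract; grouping by equal factor tuples and flagging the lowest ratio, assuming ratio = original size / compressed size so higher is better, are assumptions here):

```python
def cluster_by_factors(clients):
    """Group clients whose compression factors (e.g. data type, block size)
    are similar. Here 'similar' means equal factor tuples; a real system
    would use a proper clustering algorithm over the factor space."""
    clusters = {}
    for client in clients:
        clusters.setdefault(client["factors"], []).append(client)
    return list(clusters.values())

def flag_inefficient(cluster):
    """Within one cluster, flag the client whose compression ratio is worst
    relative to its peers (lowest, under the higher-is-better convention)."""
    return min(cluster, key=lambda c: c["ratio"])

clients = [
    {"id": "a", "factors": ("text", 64), "ratio": 3.1},
    {"id": "b", "factors": ("text", 64), "ratio": 1.2},   # lagging its peers
    {"id": "c", "factors": ("video", 256), "ratio": 1.1},
]
clusters = cluster_by_factors(clients)
text_cluster = next(cl for cl in clusters if len(cl) > 1)
worst = flag_inefficient(text_cluster)
```

Note that client "c" is not flagged even though its ratio is lowest overall: comparisons only happen inside a cluster of similar clients, which is the point of clustering first.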

Encoding and Decoding Variable Length Instructions
20200183693 · 2020-06-11

Methods of encoding and decoding are described which use a variable number of instruction words to encode instructions from an instruction set, such that different instructions within the instruction set may be encoded using different numbers of instruction words. To encode an instruction, the bits within the instruction are reordered and formed into instruction words based upon their variance as determined using empirical or simulation data. The bits in the instruction words are compared to corresponding predicted values and some or all of the instruction words that match the predicted values are omitted from the encoded instruction.
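
The omit-if-predicted scheme round-trips cleanly; a sketch follows (the bit-variance reordering step is elided, and the use of a presence mask to record which words were kept is an assumption about how the decoder knows what was omitted):

```python
def encode_instruction(words, predicted):
    """Drop instruction words that match their predicted values; a presence
    mask records which words were actually kept."""
    mask, kept = 0, []
    for i, (w, p) in enumerate(zip(words, predicted)):
        if w != p:
            mask |= 1 << i
            kept.append(w)
    return mask, kept

def decode_instruction(mask, kept, predicted):
    """Rebuild the full instruction, substituting the predicted values for
    every omitted word."""
    it = iter(kept)
    return [next(it) if mask & (1 << i) else p for i, p in enumerate(predicted)]

predicted = [0xAAAA, 0x0000, 0xFFFF]   # per-position predictions
words = [0xAAAA, 0x1234, 0xFFFF]       # a 3-word instruction
mask, kept = encode_instruction(words, predicted)
restored = decode_instruction(mask, kept, predicted)
```

Here two of the three words match their predictions, so only one word (plus the mask) needs to be stored, which is where the variable-length savings come from.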

ENHANCED PROTECTION OF PROCESSORS FROM A BUFFER OVERFLOW ATTACK
20200183691 · 2020-06-11

A method for changing a processor instruction randomly, covertly, and uniquely, such that the reverse process can restore it faithfully to its original form. Because it is virtually impossible for a malicious user to know how the bits are changed, a buffer overflow attack cannot write code with the same instruction changes into the processor's memory in order to take control of the processor. Since the changes are reversed before each instruction is executed, reverting it to its original value, malicious code placed in memory is instead randomly altered when executed, producing chaotic, random behavior that does not allow control of the processor to be compromised. This eventually produces a processing error that causes the processor either to shut down the software process containing the code so that it can reload, or to reset.
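
The asymmetry can be illustrated with a stand-in transformation (the patent does not specify the bit change; a XOR with a secret per-boot key is used here purely as an example, and the instruction words are arbitrary):

```python
import secrets

class InstructionScrambler:
    """Sketch: XOR each instruction with a secret per-boot key before it is
    written to memory, and XOR again just before execution. XOR stands in
    for the patent's unspecified reversible transformation."""
    def __init__(self, key=None):
        self.key = key if key is not None else secrets.randbits(32)

    def scramble(self, instr):
        return instr ^ self.key

    def unscramble(self, instr):
        return instr ^ self.key  # XOR is its own inverse

s = InstructionScrambler(key=0x5A5A5A5A)       # fixed key for the example
legit = s.scramble(0x0001E093)                 # loader stores it scrambled
restored = s.unscramble(legit)                 # restored faithfully pre-execution
injected = 0x0001E093                          # attacker writes a raw instruction
garbled = s.unscramble(injected)               # descrambling mangles it instead
```

Legitimate code survives the round trip; injected code, never scrambled in the first place, is transformed into an unpredictable instruction at the descramble stage, which is what drives the chaotic behavior described above.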

NEURAL NETWORK CONTROL DEVICE AND METHOD

An embodiment of the present invention provides a neural network operator that performs a plurality of processes for each of a plurality of layers of a neural network, including: a memory that includes a data-storing space storing a plurality of data for performing the plurality of processes and a synapse code-storing space storing a plurality of descriptors with respect to the plurality of processes; a memory-transmitting processor that obtains the plurality of descriptors and transmits the plurality of data to the neural network operator based on the plurality of descriptors; an embedded instruction processor that obtains the plurality of descriptors from the memory-transmitting processor, transmits a first data set to the neural network operator based on a first descriptor corresponding to a first process among the plurality of processes, reads a second descriptor corresponding to a second process, which is the next operation after the first process, based on the first descriptor, and controls the memory-transmitting processor to transmit second data corresponding to the second descriptor to the neural network operator; and a synapse code generator that generates the plurality of descriptors. This makes it possible to operate the neural network operator at high speed without interference from other devices and to reduce the memory space needed to store the descriptors.
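
The descriptor chaining can be sketched as a linked list walked without host intervention (the descriptor field names and the flat-list memory model are illustrative assumptions):

```python
def run_descriptor_chain(memory, descriptors, start, operator_feed):
    """Walk a chain of descriptors, feeding each process's data to the
    neural network operator. Each descriptor names its data region and the
    next descriptor, so once the chain starts no other device needs to
    intervene between processes."""
    idx = start
    while idx is not None:
        d = descriptors[idx]
        operator_feed(memory[d["addr"]:d["addr"] + d["len"]])
        idx = d["next"]  # the descriptor itself points to its successor

memory = [10, 11, 12, 13, 14, 15]
descriptors = [
    {"addr": 0, "len": 2, "next": 1},    # first process: 2 words of data
    {"addr": 2, "len": 3, "next": None}, # second process: end of chain
]
fed = []
run_descriptor_chain(memory, descriptors, start=0, operator_feed=fed.append)
```

Because each descriptor carries the link to the next one, the embedded instruction processor can prefetch the second process's data while the first is still running, which is the high-speed, interference-free operation the abstract claims.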
