Patent classifications
G06F9/30069
Picoengine having a hash generator with remainder input S-box nonlinearizing
A processor includes a hash register and a hash generating circuit. The hash generating circuit includes a novel programmable nonlinearizing function circuit as well as a modulo-2 multiplier, a first modulo-2 summer, a modulo-2 divider, and a second modulo-2 summer. The nonlinearizing function circuit receives a hash value from the hash register and performs a programmable nonlinearizing function, thereby generating a modified version of the hash value. In one example, the nonlinearizing function circuit includes a plurality of separately enableable S-box circuits. The multiplier multiplies the input data by a programmable multiplier value, thereby generating a product value. The first summer sums a first portion of the product value with the modified hash value. The divider divides the resulting sum by a fixed divisor value, thereby generating a remainder value. The second summer sums the remainder value and the second portion of the input data, thereby generating a hash result.
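The data path described in this abstract can be sketched in software. The following Python is only an illustration under stated assumptions: the 32-bit hash width, the 4-bit S-box table, the choice of which bits form the "first portion" and "second portion", and the 17-bit divisor are all hypothetical, since the abstract does not specify them.

```python
HASH_BITS = 32  # assumption: width of the hash register

# Assumption: an arbitrary 4-bit permutation standing in for one S-box circuit.
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]


def clmul(a: int, b: int) -> int:
    """Modulo-2 (carry-less) multiplication: partial products are XORed."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


def polymod(value: int, divisor: int) -> int:
    """Remainder of modulo-2 (polynomial) division of value by divisor."""
    if divisor == 0:
        raise ValueError("divisor must be non-zero")
    dlen = divisor.bit_length()
    while value.bit_length() >= dlen:
        value ^= divisor << (value.bit_length() - dlen)
    return value


def sbox_stage(hash_value: int, enable_mask: int) -> int:
    """Apply the S-box to each nibble whose enable bit is set."""
    out = 0
    for nibble in range(HASH_BITS // 4):
        v = (hash_value >> (4 * nibble)) & 0xF
        if (enable_mask >> nibble) & 1:
            v = SBOX[v]
        out |= v << (4 * nibble)
    return out


def hash_step(hash_value: int, data: int, multiplier: int,
              enable_mask: int, divisor: int = 0x11021) -> int:
    """One pass through the sketched hash data path."""
    modified = sbox_stage(hash_value, enable_mask)   # nonlinearizing function
    product = clmul(data, multiplier)                # modulo-2 multiplier
    first_portion = product & 0xFFFFFFFF             # assumption: low 32 bits
    total = first_portion ^ modified                 # first modulo-2 summer
    remainder = polymod(total, divisor)              # modulo-2 divider
    second_portion = data & 0xFFFF                   # assumption: low 16 bits
    return remainder ^ second_portion                # second modulo-2 summer
```

Modulo-2 addition is a bitwise XOR, which is why both summers reduce to `^` in this sketch.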
Indirect instruction predication
A circuit arrangement and program product selectively predicate instructions in an instruction stream by determining a first register address from an instruction, determining a second register address based on a value stored at the first register address, and determining whether to predicate the instruction based at least in part on a value stored at the second register address. Predication logic may analyze the instruction to determine the first register address, analyze a register corresponding to the first register address to determine the second register address, and communicate a predication signal to an execution unit based at least in part on the value stored at the second register address.
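The two-level register lookup can be modeled in a few lines. This is a minimal sketch, not the patented logic: the `Instruction` fields, the `li` opcode, and the convention that a non-zero value means "suppress the instruction" are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Instruction:
    opcode: str    # hypothetical opcode; only "li" (load immediate) is modeled
    dest: int      # destination register index
    imm: int       # immediate value
    pred_reg: int  # first register address, encoded in the instruction


def predication_signal(instr: Instruction, regs: list) -> bool:
    """Return True if the instruction should be predicated (suppressed)."""
    first_addr = instr.pred_reg
    # The value held at the first address names the second register.
    second_addr = regs[first_addr] % len(regs)
    # Assumption: a non-zero value at the second address asserts predication.
    return regs[second_addr] != 0


def execute(instr: Instruction, regs: list) -> None:
    """Execute the instruction unless the predication signal is asserted."""
    if predication_signal(instr, regs):
        return  # treated as a no-op
    if instr.opcode == "li":
        regs[instr.dest] = instr.imm
```

Because the instruction only names the first register, software can change which register controls predication by rewriting that register's contents, without re-encoding the instruction.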
Slice-based intelligent packet data register file
A multi-processor includes a pool of processors and a common packet buffer memory. Bytes of packet data of a packet are stored in the packet buffer memory. Each of the processors has an intelligent packet data register file. One processor is tasked with processing the packet data, and its packet data register file caches a subset of the bytes. Some instructions when executed require that the packet data register file supply the processor execute stage with certain bytes of the packet data. The register file includes a set of slice portions, where each slice portion is responsible for different bytes of the overall packet data. Each slice portion independently handles stalling the processor and prefetching any bytes it is responsible for. The slice portions output their bytes in a shifted and masked fashion so that the overall register file output is properly presented to the execute stage.
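A rough software model of the slice idea follows. It assumes four slices, byte ownership by `offset % NUM_SLICES`, big-endian placement of bytes in the output word, and a "stall" that is simply reported when a slice has to fetch a missing byte; none of these details come from the abstract.

```python
NUM_SLICES = 4  # assumption: number of slice portions


class Slice:
    """One slice portion; it caches only the byte offsets it owns."""

    def __init__(self, index: int):
        self.index = index
        self.cache = {}  # byte offset -> byte value

    def owns(self, offset: int) -> bool:
        # Assumption: bytes are interleaved across slices by offset.
        return offset % NUM_SLICES == self.index

    def read(self, start: int, length: int, packet_buffer: bytes):
        """Return (stalled, partial_word) for the bytes this slice owns."""
        stalled = False
        out = 0
        for pos in range(length):
            offset = start + pos
            if not self.owns(offset):
                continue  # masked: other slices supply this byte
            if offset not in self.cache:
                # Miss: fetch from the packet buffer and report a stall.
                self.cache[offset] = packet_buffer[offset]
                stalled = True
            shift = 8 * (length - 1 - pos)  # shift into its output position
            out |= self.cache[offset] << shift
        return stalled, out


def read_packet_bytes(slices, start: int, length: int, packet_buffer: bytes):
    """OR the shifted, masked slice outputs into one register file output."""
    stalled = False
    word = 0
    for s in slices:
        slice_stalled, part = s.read(start, length, packet_buffer)
        stalled = stalled or slice_stalled
        word |= part
    return stalled, word
```

Each slice contributes zeros in the byte positions it does not own, so a plain OR is enough to assemble the final word.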
Processor that leapfrogs MOV instructions
A processor performs out-of-order execution of a first instruction and a second instruction that follows the first instruction in program order. The first instruction includes a source indicator, which specifies a source of data, and a destination indicator, which specifies a destination of the data; it instructs the processor to move the data from the source to the destination. The second instruction includes a source indicator that specifies a source of data. A rename unit updates the second instruction source indicator with the first instruction source indicator if there are no intervening instructions that write to the source or to the destination of the first instruction and the second instruction source indicator matches the first instruction destination indicator.
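The rename rule can be illustrated with a small pass over a list of instructions. This is a sketch under assumptions: instructions are modeled as hypothetical `(op, dest, sources)` tuples with architectural register names, and a real rename unit would operate on physical register tags rather than rewrite a program list.

```python
def leapfrog_movs(program):
    """Rewrite sources that read a MOV destination to read the MOV source.

    program: list of (op, dest, sources) tuples, in program order.
    """
    alias = {}  # MOV destination -> MOV source, while still valid
    renamed = []
    for op, dest, sources in program:
        # Replace any source that matches a live MOV destination.
        new_sources = tuple(alias.get(s, s) for s in sources)
        # A write to dest invalidates aliases that use dest as either
        # their destination or their source (the "intervening write" rule).
        alias = {d: s for d, s in alias.items() if d != dest and s != dest}
        if op == "mov" and new_sources[0] != dest:
            alias[dest] = new_sources[0]
        renamed.append((op, dest, new_sources))
    return renamed
```

After the rewrite, the second instruction no longer depends on the MOV result, so it can issue as soon as the MOV's own source is ready.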
Indirect instruction predication
A method for selectively predicating instructions in an instruction stream by determining a first register address from an instruction, determining a second register address based on a value stored at the first register address, and determining whether to predicate the instruction based at least in part on a value stored at the second register address. Predication logic may analyze the instruction to determine the first register address, analyze a register corresponding to the first register address to determine the second register address, and communicate a predication signal to an execution unit based at least in part on the value stored at the second register address.
HASHING FOR DEDUPLICATION THROUGH SKIPPING SELECTED DATA
A system for calculating a fingerprint across a data set by identifying a data set to hash, the data set comprising a set of data blocks, identifying data within the data set to skip, generating, by a hash engine, a hash for each data block in the set of data blocks within the data set except for the data within the data set to skip, and compressing the data.
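One way to picture this is a per-block hash that ignores designated byte ranges. The sketch below assumes the data to skip is given as byte ranges, uses SHA-256 as the hash and zlib for compression, and compresses the whole data set; the abstract does not name a hash function, a compressor, or how the skipped data is identified.

```python
import hashlib
import zlib


def fingerprint(data: bytes, block_size: int, skip_ranges):
    """Hash each block while ignoring bytes inside skip_ranges.

    skip_ranges: iterable of (start, end) byte offsets, end exclusive.
    Returns (list of hex digests, compressed data).
    """
    skipped = set()
    for start, end in skip_ranges:
        skipped.update(range(start, end))

    digests = []
    for base in range(0, len(data), block_size):
        block = data[base:base + block_size]
        # Keep only the bytes that are not marked to be skipped.
        kept = bytes(b for i, b in enumerate(block, start=base)
                     if i not in skipped)
        digests.append(hashlib.sha256(kept).hexdigest())

    return digests, zlib.compress(data)
```

With a skip range covering, say, a per-copy header, two data sets that differ only in that header produce identical block hashes and can be deduplicated.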
Branch prediction method, branch prediction apparatus, processor, medium, and device
A branch prediction method includes obtaining an instruction block containing an instruction, performing detection on the instruction block according to branch instruction information stored in a branch target buffer of a branch predictor of a processor, and in response to detecting that the instruction is a branch instruction, detecting a type of the branch instruction. The method further includes, in response to the type of the branch instruction being a type other than a target type, searching for a predicted jump address of the branch instruction in the branch target buffer, and, in response to the type of the branch instruction being the target type, searching for the predicted jump address of the branch instruction in other address areas of the branch predictor. The target type includes at least one of a function call instruction type, a function return instruction type, or a loop instruction type.
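The type-dependent lookup can be sketched as follows. The class name, the dictionary-based tables, and the type strings are assumptions; in particular, a single `other_areas` table stands in for whatever other address areas the predictor actually provides for call, return, and loop branches.

```python
TARGET_TYPES = {"call", "return", "loop"}  # the "target type" of the abstract


class BranchPredictor:
    def __init__(self):
        self.btb = {}          # pc -> (branch_type, target or None)
        self.other_areas = {}  # pc -> target, for target-type branches

    def train(self, pc: int, branch_type: str, target: int) -> None:
        """Record a resolved branch."""
        if branch_type in TARGET_TYPES:
            # The BTB keeps only the branch information, not the address.
            self.btb[pc] = (branch_type, None)
            self.other_areas[pc] = target
        else:
            self.btb[pc] = (branch_type, target)

    def predict(self, block_pcs):
        """Return (pc, predicted_target) for the first branch in the block."""
        for pc in block_pcs:
            entry = self.btb.get(pc)
            if entry is None:
                continue  # not a known branch instruction
            branch_type, target = entry
            if branch_type in TARGET_TYPES:
                return pc, self.other_areas.get(pc)
            return pc, target
        return None
```

In this model, target-type branches do not occupy address storage in the branch target buffer, which leaves that space for other branch types.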
BITWISE PRODUCT-SUM ACCUMULATIONS WITH SKIP LOGIC
A method, device, and system for performing a partial sum accumulation of a product of input vectors and weight vectors in a wordwise-input and bitwise-weight manner results in a partial accumulated product sum. The partial accumulated product sum is compared with a threshold condition after each weight bit, and when the partial accumulated product sum meets the threshold condition, a skip indicator is asserted to indicate that remaining computations of a sum accumulation are skipped.
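The skip mechanism can be illustrated with a loop over weight bit planes. This sketch assumes unsigned weights, non-negative inputs, most-significant-bit-first processing, and a threshold condition of the form "partial sum >= threshold"; the abstract leaves all of these open.

```python
def product_sum_with_skip(inputs, weights, weight_bits: int, threshold: int):
    """Accumulate sum(x * w) one weight bit plane at a time.

    Returns (partial_sum, skip_asserted, bit_planes_skipped).
    """
    partial = 0
    for bit in reversed(range(weight_bits)):  # most significant bit first
        for x, w in zip(inputs, weights):
            if (w >> bit) & 1:
                partial += x << bit  # whole input word times one weight bit
        if partial >= threshold:
            # Skip indicator asserted: the remaining `bit` planes are skipped.
            return partial, True, bit
    return partial, False, 0
```

Under the non-negative assumption the partial sum never decreases, so once it reaches the threshold the comparison outcome cannot change and the remaining bit planes need not be computed.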
Control register for storing instruction size information
A processor circuit that includes a plurality of register circuits and an event handler circuit is disclosed. The event handler circuit may detect a processing event that causes the processor circuit to halt execution of a current instruction and transfer control to a kernel. In response to a detection of the processing event, the event handler circuit may store a program counter value corresponding to the current instruction, information indicative of a cause of the processing event, and a size of the current instruction in corresponding register circuits of the plurality of register circuits.
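A minimal model of the three registers is shown below. The register names (`epc`, `cause`, `insn_size`) and the `resume_address` helper are hypothetical; the abstract states what is stored, not how the kernel uses it.

```python
from dataclasses import dataclass


@dataclass
class ControlRegisters:
    epc: int = 0        # program counter of the interrupted instruction
    cause: int = 0      # code indicating the cause of the processing event
    insn_size: int = 0  # size in bytes of the interrupted instruction


def handle_event(regs: ControlRegisters, pc: int, cause: int,
                 insn_size: int) -> None:
    """Record the event state before control transfers to the kernel."""
    regs.epc = pc
    regs.cause = cause
    regs.insn_size = insn_size


def resume_address(regs: ControlRegisters) -> int:
    """Assumed use: address of the instruction after the interrupted one."""
    return regs.epc + regs.insn_size
```

With variable-length instructions, having the size available lets a handler step past the interrupted instruction without decoding it again.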