Patent classifications
G06F9/30021
TRUE/FALSE VECTOR INDEX REGISTERS AND METHODS OF POPULATING THEREOF
Disclosed herein are vector index registers for storing or loading indexes of true and/or false results of comparison operations in vector processors. Each of the vector index registers store multiple addresses for accessing multiple positions in operand vectors.
CONDITIONAL MODULAR SUBTRACTION INSTRUCTION
One embodiment provides a processor comprising first circuitry to decode an instruction into a decoded instruction, the instruction to indicate a first source operand and a second source operand and second circuitry including a processing resource to execute the decoded instruction, wherein responsive to the decoded instruction, the processing resource is to output a result of first source operand data minus second source operand data in response to a determination by the processing resource that the first source operand data is greater than or equal to the second source operand data, otherwise the processing resource is to output the first source operand data.
METHOD FOR EXTRACTING TARGET STRING AT HIGH-SPEED USING VECTOR INSTRUCTION
The present disclosure provides a computer-implemented method for extracting a target string excluding delimiter from a character string, which comprises a first step of loading a unit string into a 1-0 register; a second step of loading a delimiter boundary value into a 1-1 register; a third step of loading a value calculated based on the comparison result between the 1-0 register and the 1-1 register, into a 1-2 register; a fourth step of creating a mask by transferring a feature value of the value loaded on the 1-2 register to a second register; a fifth step of creating delimiter array by calculating offset of the delimiter based on the feature value; and a sixth step of extracting the target string based on the delimiter array.
BFLOAT16 COMPARISON INSTRUCTIONS
Techniques for comparing BF16 data elements are described. An exemplary BF16 comparison instruction includes fields for an opcode, an identification of a location of a first packed data source operand, and an identification of a location of a second packed data source operand, wherein the opcode is to indicate that execution circuitry is to perform, for a particular data element position of the packed data source operands, a comparison of a data element at that position, and update a flags register based on the comparison.
METHOD AND APPARATUS FOR PERFORMING REDUCTION OPERATIONS ON A PLURALITY OF ASSOCIATED DATA ELEMENT VALUES
Embodiments detailed herein relate to reduction operations on a plurality of data element values. In one embodiment, a process comprises decoding circuitry to decode an instruction and execution circuitry to execute the decoded instruction. The instruction specifies a first input register containing a plurality of data element values, a first index register containing a plurality of indices, and an output register, where each index of the plurality of indices maps to one unique data element position of the first input register. The execution includes to identify data element values that are associated with one another based on the indices, perform one or more reduction operations on the associated data element values based on the identification, and store results of the one or more reduction operations in the output register.
BFLOAT16 CLASSIFICATION AND MANIPULATION INSTRUCTIONS
Techniques for BF16 classification or manipulation using single instructions are described. An exemplary instruction includes fields for an opcode, an identification of a location of a packed data source operand, an indication of one or more classification checks to perform, and an identification of a packed data destination operand, wherein the opcode is to indicate that execution circuitry is to perform, for each data element position of the packed data source operand, a classification according to the indicated one or more classification checks and store a result of the classification in a corresponding data element position of the destination operand.
Vector population count determination via comparsion iterations in memory
Examples of the present disclosure provide apparatuses and methods for determining a vector population count in a memory. An example method comprises determining, using sensing circuitry, a vector population count of a number of fixed length elements of a vector stored in a memory array.
CALCULATOR AND CALCULATION METHOD
A calculator includes: registers each including sub-registers that hold pieces of data for use in operation; an operator that executes, in parallel, operations of the pieces of data; and a memory configured to hold a first vector and second vectors to be compared with the first vector. Each second vector is divided into sub-vectors and sub-vector groups each including the sub-vectors of the second vectors are arranged in units of sub-vector groups. A first process of transferring one of sub-vectors of the first vector to sub-registers of a first register among the registers, a second process of transferring the sub-vector group of the second vectors corresponding to the transferred sub-vector of the first vector to sub-registers of a second register, the sub-vector group being held in the memory, and a third process of calculating and integrating numbers of mismatches between bit values of the sub-vectors held are repeatedly executed.
ACCELERATED REASONING GRAPH EVALUATION
Embodiments disclosed herein relate to methods, systems, and computer programs for automatically determining an outcome associated with a reasoning graph, based on one or more data sets. The methods, systems, and computer programs compare hash values associated with different data sets to determine if they match to assign the outcome associated with a pre-existing hash to the later provided hash and data set associated therewith.
Unified register file for supporting speculative architectural states
A method for supporting architecture speculation in an out of order processor is disclosed. The method comprises fetching two threads into the processor, wherein a first thread executes in a speculative state and a second thread executes in a non-speculative state. The method also comprises enabling a speculative scope for an execution of the first thread and a non-speculative scope for an execution of the second thread in an architecture of the processor, wherein the speculative scope and the non-speculative scope can both be fetched into the architecture and be present concurrently.