Patent classifications
G06F9/3552
Cyclic buffer pointer fixing
A processor device is provided with hardware-implemented logic to receive an instruction including a pointer identifier and a pointer change value, the pointer identifier including a pointer address field encoded with an address of a line of memory corresponding to a location of a pointer of a particular one of the one or more cyclic buffers, one or more cushion bits, and a buffer identifier field encoded with a buffer identifier assigned to the particular cyclic buffer. The logic further enables the processor to identify that the instruction is to apply to the particular cyclic buffer based on the buffer identifier, determine that the pointer change value causes a wraparound of the pointer in the particular cyclic buffer, and fix location of the pointer in the particular cyclic buffer based on the wraparound.
FETCHING VECTOR DATA ELEMENTS WITH PADDING
A streaming engine employed in a digital data processor specifies a fixed read only data stream defined by plural nested loops. An address generator produces address of data elements for the nested loops. A steam head register stores data elements next to be supplied to functional units for use as operands. A stream template register specifies a circular address mode for the loop, first and second block size numbers and a circular address block size selection. For a first circular address block size selection the block size corresponds to the first block size number. For a first circular address block size selection the block size corresponds to the first block size number. For a second circular address block size selection the block size corresponds to a sum of the first block size number and the second block size number.
VECTOR GENERATING INSTRUCTION
An apparatus and method are provided for performing vector processing operations. In particular the apparatus has processing circuitry to perform the vector processing operations and an instruction decoder to decode vector instructions to control the processing circuitry to perform the vector processing operations specified by the vector instructions. The instruction decoder is responsive to a vector generating instruction identifying a scalar start value and wrapping control information, to control the processing circuitry to generate a vector comprising a plurality of elements. In particular, the processing circuitry is arranged to generate the vector such that the first element in the plurality is dependent on the scalar start value, and the values of the plurality of elements follow a regularly progressing sequence that is constrained to wrap as required to ensure that each value is within bounds determined from the wrapping control information. The vector generating instruction can be useful in a variety of situations, a particular use case being to implement a circular addressing mode within memory, where the vector generating instruction can be coupled with an associated vector memory access instruction. Such an approach can remove the need to provide additional logic within the memory access path to support such circular addressing.
STREAMING ENGINE WITH MULTI DIMENSIONAL CIRCULAR ADDRESSING SELECTABLE AT EACH DIMENSION
A streaming engine employed in a digital data processor may specify a fixed read-only data stream defined by plural nested loops. An address generator produces address of data elements for the nested loops. A steam head register stores data elements next to be supplied to functional units for use as operands. A stream template register independently specifies a linear address or a circular address mode for each of the nested loops.
STRIDESHIFT INSTRUCTION FOR TRANSPOSING BITS INSIDE VECTOR REGISTER
A processor includes a decode circuit to decode an instruction into a decoded instruction and an execution circuit to execute the decoded instruction to access a first bit of a first input vector located at a bit position indicated by an element of a second input vector, stride over bits of the first input vector using a stride to access bits of the first input vector that are located at a strided bit position with respect to the first bit of the first input vector, and store the first bit of the first input vector and the bits of the first input vector that are located at a strided bit position with respect to the first bit of the first input vector as consecutive bits in a destination vector.
STREAMING ENGINE WITH COMPRESSED ENCODING FOR LOOP CIRCULAR BUFFER SIZES
A streaming engine employed in a digital data processor specifies a fixed read only data stream defined by plural nested loops. An address generator produces address of data elements for the nested loops. A steam head register stores data elements next to be supplied to functional units for use as operands. A stream template register specifies a circular address mode for the loop, first and second block size numbers and a circular address block size selection. For a first circular address block size selection the block size corresponds to the first block size number. For a first circular address block size selection the block size corresponds to the first block size number. For a second circular address block size selection the block size corresponds to a sum of the first block size number and the second block size number.
SECURE CONTROL FLOW PREDICTION
Systems and methods are disclosed for secure control flow prediction. Some implementations may be used to eliminate or mitigate the Spectre-class of attacks in a processor. For example, an integrated circuit (e.g., a processor) for executing instructions includes a control flow predictor with entries that include respective indications of whether the entry has been activated for use in a current process, wherein the integrated circuit is configured to access the indication in one of the entries that is associated with a control flow instruction that is scheduled for execution; determine, based on the indication, whether the entry of the control flow predictor associated with the control flow instruction is activated for use in a current process; and responsive to a determination that the entry is not activated for use in the current process, apply a constraint on speculative execution based on control flow prediction for the control flow instruction.
Processor prefetch throttling based on short streams
In an embodiment, a processor comprises a prefetch history array and a prefetch circuit. The prefetch history array comprises a plurality of entries corresponding to prefetch addresses, each entry of the plurality of entries comprising a sublength value associated with a frequency that a stride is repeated. The prefetch circuit is to: for each entry of the plurality of entries, adjust the sublength value based on stride matches for an address of the entry; adjust a short stream counter based on the sublength values of the plurality of entries in the prefetch history array; determine whether the short stream counter has exceeded a throttling threshold; and in response to a determination that the short stream counter has exceeded the throttling threshold, throttle a prefetch level of the prefetch circuit. Other embodiments are described and claimed.
Streaming engine with compressed encoding for loop circular buffer sizes
A streaming engine employed in a digital data processor specifies a fixed read only data stream defined by plural nested loops. An address generator produces address of data elements for the nested loops. A steam head register stores data elements next to be supplied to functional units for use as operands. A stream template register specifies a circular address mode for the loop, first and second block size numbers and a circular address block size selection. For a first circular address block size selection the block size corresponds to the first block size number. For a first circular address block size selection the block size corresponds to the first block size number. For a second circular address block size selection the block size corresponds to a sum of the first block size number and the second block size number.
CYCLIC BUFFER POINTER FIXING
A processor device is provided with hardware-implemented logic to receive an instruction including a pointer identifier and a pointer change value, the pointer identifier including a pointer address field encoded with an address of a line of memory corresponding to a location of a pointer of a particular one of the one or more cyclic buffers, one or more cushion bits, and a buffer identifier field encoded with a buffer identifier assigned to the particular cyclic buffer. The logic further enables the processor to identify that the instruction is to apply to the particular cyclic buffer based on the buffer identifier, determine that the pointer change value causes a wraparound of the pointer in the particular cyclic buffer, and fix location of the pointer in the particular cyclic buffer based on the wraparound.