G06F9/30018

Two dimensional masked shift instruction
11544060 · 2023-01-03 · ·

An image processor is described. The image processor includes a two dimensional shift register array that couples certain ones of its array locations to support execution of a shift instruction. The shift instruction is to include mask information. The mask information is to specify which of the array locations are to be written to with information being shifted. The two dimensional shift register array includes masking logic circuitry to write the information being shifted into specified ones of the array locations in accordance with the mask information.

Control flow mechanism for execution of graphics processor instructions using active channel packing

An apparatus to facilitate control flow in a graphics processing system is disclosed. The apparatus includes logic a plurality of execution units to execute single instruction, multiple data (SIMD) and flow control logic to detect a diverging control flow in a plurality of SIMD channels and reduce the execution of the control flow to a subset of the SIMD channels.

Compression assist instructions
11537399 · 2022-12-27 · ·

In an embodiment, a processor supports one or more compression assist instructions which may be employed in compression software to improve the performance of the processor when performing compression/decompression. That is, the compression/decompression task may be performed more rapidly and consume less power when the compression assist instructions are employed then when they are not. In some cases, the cost of a more effective, more complex compression algorithm may be reduced to the cost of a less effective, less complex compression algorithm.

GENERATING MASKS FOR FORMATS INCLUDING MASKING RESTRICTIONS
20220405099 · 2022-12-22 ·

An example system includes a processor to receive an instance of a composite format comprising a masking restriction. The processor can generate a mask for the instance of the composite format based on the masking restriction. The processor can output the generated mask.

OPERATING SYSTEM DEACTIVATION OF STORAGE BLOCK WRITE PROTECTION ABSENT QUIESCING OF PROCESSORS

Operating system deactivation of write protection for a storage block is provided absent quiescing of processors in a multi-processor computing environment. The process includes receiving an address translation protection exception interrupt resulting from an attempted write access by a processor to a storage block, and determining by the operating system whether write protection for the storage block is active. Based on write protection for the storage block not being active, the operating system issues an instruction to clear or modify translation lookaside buffer entries of the processor associated with the storage block, absent waiting for an action by another processor of multiple processors of the computing environment, to facilitate write access to the storage block proceeding at the processor.

APPARATUS AND METHOD FOR IMPLEMENTING VECTOR MASK IN VECTOR PROCESSING UNIT
20220382546 · 2022-12-01 · ·

The mask data corresponding to each data element of the issued instruction may be handled by a mask queue, where only the valid mask data are stored to the mask queue. The mask data of multiple vector instructions may be stored in the mask queue. The corresponding mask data may be accessed from the mask queue when the vector instruction(s) is dispatched from the execution queue to the functional unit for execution. In the case of 512-bit wide mask data is needed, the issuing of the vector instruction from the decode/issue unit to the execution queue may be stalled until the mask queue is available. In some embodiments, one mask queue may be dedicated to one execution queue. Alternatively, one mask queue may be shared between two different execution queues. In the disclosure, resources are conserved without dedicating additional storage space for handling mask data of the vector instruction.

Vector logical operation and test instructions with result negation
11593105 · 2023-02-28 · ·

Systems, methods, and apparatuses relating to performing logical operations on packed data elements and testing the results of that logical operation to generate a packed data resultant are described. In one embodiment, a processor includes a decoder to decode an instruction into a decoded instruction, the instruction having fields that identify a first packed data source, a second packed data source, and a packed data destination, and an opcode that indicates a bitwise logical operation to perform on the first packed data source and the second packed data source and indicates a width of each element of the first packed data source and the second packed data source; and an execution circuit to execute the decoded instruction to perform the bitwise logical operation indicated by the opcode on the first packed data source and the second packed data source to produce a logical operation result of packed data elements having a same width as the width indicated by the opcode, perform a test operation on each element of the logical operation result to set a corresponding bit in a packed data test operation result to a first value when any of the bits in a respective element of the logical operation result are set to the first value, and set the corresponding bit to a second value otherwise, and store the packed data test operation result into the packed data destination.

True/false vector index registers and methods of populating thereof
11507374 · 2022-11-22 · ·

Disclosed herein are vector index registers for storing or loading indexes of true and/or false results of comparison operations in vector processors. Each of the vector index registers store multiple addresses for accessing multiple positions in operand vectors.

Operating system deactivation of storage block write protection absent quiescing of processors

Operating system deactivation of write protection for a storage block is provided absent quiescing of processors in a multi-processor computing environment. The process includes receiving an address translation protection exception interrupt resulting from an attempted write access by a processor to a storage block, and determining by the operating system whether write protection for the storage block is active. Based on write protection for the storage block not being active, the operating system issues an instruction to clear or modify translation lookaside buffer entries of the processor associated with the storage block, absent waiting for an action by another processor of multiple processors of the computing environment, to facilitate write access to the storage block proceeding at the processor.

MASKING FOR COARSE GRAINED RECONFIGURABLE ARCHITECTURE
20230058355 · 2023-02-23 ·

A reconfigurable compute fabric can include multiple nodes, and each node can include multiple tiles with respective processing and storage elements. The tiles can be arranged in an array or grid and can be communicatively coupled. In an example, a first node can include a tile cluster of N memory-compute tiles, and the N memory-compute tiles can be coupled using a first portion of a synchronous compute fabric. Operations performed by the respective processing and storage elements of the N memory-compute tiles can be selectively enabled or disabled based on information in a mask field of data propagated through the first portion of the synchronous compute fabric.