G06F9/30101

Operating system assisted prioritized thread execution

The present disclosure is directed to dynamically prioritizing, selecting or ordering a plurality threads for execution by processor circuitry based on a quality of service and/or class of service value/indicia assigned to the thread by an operating system executed by the processor circuitry. As threads are executed by processor circuitry, the operating system dynamically updates/associates respective class of service data with each of the plurality of threads. The current quality of service/class of service data assigned to the thread by the operating system is stored in a manufacturer specific register (MSR) associated with the respective thread. Selection circuitry polls the MSRs on a periodic, aperiodic, intermittent, continuous, or event-driven basis and determines an execution sequence based on the current class of service value associated with each of the plurality of threads.

True/false vector index registers and methods of populating thereof
11507374 · 2022-11-22 · ·

Disclosed herein are vector index registers for storing or loading indexes of true and/or false results of comparison operations in vector processors. Each of the vector index registers store multiple addresses for accessing multiple positions in operand vectors.

System and method for performing computations for deep neural networks

A computation unit for performing a computation of a neural network layer is disclosed. A number of processing element (PE) units are arranged in an array. First input values are provided in parallel in an input dimension of the array during a first processing period, and a second input values are provided in parallel in the input dimension during a second processing period. Computations are performed by the PE units based on stored weight values. An adder coupled to the first set of PE units generates a first sum of results of the computations by the first set of PE units during the first processing cycle, and generates a second sum of results of the computations during the second processing cycle. A first accumulator coupled to the first adder stores the first sum, and further shifts the first sum to a second accumulator prior to storing the second sum.

AN APPARATUS AND METHOD FOR CONTROLLING ACCESS TO A SET OF MEMORY MAPPED CONTROL REGISTERS
20230056039 · 2023-02-23 ·

A technique for controlling access to a set of memory mapped control registers. The apparatus has processing circuitry for executing program code to perform data processing operations, and a set of memory mapped control registers for storing control information used to control operation of the processing circuitry. Further, a lockdown register used to store a lockdown value. The processing circuitry is arranged to execute store instructions to perform write operations to a memory address space . Thethe processing circuitry is arranged to prevent a write operation being performed to change the control information in the memory mapped control registers . This significantly reduces the prospect of an attacker seeking to exploit a software vulnerability to change the control information in the memory mapped control registers.

CONTROLLER WITH CACHING AND NON-CACHING MODES

An apparatus includes a CPU core, a first cache subsystem coupled to the CPU core, and a second memory coupled to the cache subsystem. The first cache subsystem includes a configuration register, a first memory, and a controller. The controller is configured to: receive a request directed to an address in the second memory and, in response to the configuration register having a first value, operate in a non-caching mode. In the non-caching mode, the controller is configured to provide the request to the second memory without caching data returned by the request in the first memory. In response to the configuration register having a second value, the controller is configured to operate in a caching mode. In the caching mode the controller is configured to provide the request to the second memory and cache data returned by the request in the first memory.

SYSTEM AND METHOD FOR HALTING PROCESSING CORES IN A MULTICORE SYSTEM

A multicore computing device includes a memory and a processor coupled to the memory. The processor includes plural cores and a multiple input multiple output (MIMO) block coupled to the cores. The MIMO block receives a halt request from a first core of the cores, transmits a core-halt request to one or more other cores other than the first core, to halt execution of the one or more other cores, and permits the first core to lock with a shared resource.

MANAGING RETURN PARAMETER ALLOCATION
20230058935 · 2023-02-23 ·

A hybrid threading processor (HTP) supports thread creation by executing an instruction that indicates an amount of storage space to reserve for return values. Before a thread is created, the indicated amount of space is reserved. The newly created child thread sends a return packet back to the parent thread when the child thread completes. The thread writes its return information into the reserved space and waits for the parent thread to execute a thread join instruction. The thread join instruction takes the returned information from the reserved space and transfers it to the parent thread's register state. The reserved space is released once the child thread is joined. Using a configurable amount of space for each child thread may allow for more child threads to be executed simultaneously.

POST ERROR CORRECTION CODE REGISTERS FOR CACHE METADATA

Methods, systems, and devices for post error correction code (ECC) registers for cache metadata are described. A device may read metadata from a memory array included in the device. The metadata may include information for operating a volatile memory as a cache for a non-volatile memory. The device may perform an ECC operation on the metadata based on reading the metadata from the memory array. After performing the ECC operation on the metadata, the device may write the metadata to a register that is coupled with the memory array. The device may then write the metadata from the register to the memory array.

INFERRING FUTURE VALUE FOR SPECULATIVE BRANCH RESOLUTION

Aspects of the invention include includes determining a first instruction in a processing pipeline, wherein the first instruction includes a compare instruction, determining a second instruction in the processing pipeline, wherein the second instruction includes a conditional branch instruction relying on the compare instruction, determining a predicted result of the compare instruction, and completing the conditional branch instruction using the predicted result prior to executing the compare instruction.

Method for producing an association list
11586793 · 2023-02-21 · ·

A method for creating an allocation map, wherein the allocation map is created based on an FPGA source code, wherein the source code uses at least a first signal at a first location, wherein at least a first register is mapped to the first signal, wherein in the allocation map, the first signal and the first register are listed as mapped to one another, wherein a second signal is used at a second location in the FPGA source code, wherein it is automatically detected that the value of the second signal can be determined from the value of the first signal according to a first calculation rule, wherein in the allocation map, the second signal, the first register and the first calculation rule are listed as mapped to one another.