G06F9/30123

SYSTEM AND METHOD FOR GENERATING AND USING A CONTEXT BLOCK BASED ON SYSTEM PARAMETERS
20220206837 · 2022-06-30

A system and method for generating a context block using system parameters. The system parameters include objective parameters, functionality parameters, and interface definitions. Context field definitions are received. The system parameters and context field definitions may be used to determine context fields and context entries. The system parameters may be used to determine the context fields and the number of context entries. The context module hardware description may be created using the context fields, the number of context entries, and the context field definitions.

Flash memory controller and method capable of efficiently reporting debug information to host device
11372589 · 2022-06-28

A method used in a flash memory controller includes: using a watchdog timer to automatically count a number and to generate a reset trigger signal to a processor if the number counted by the watchdog timer is higher than a threshold; and, after receiving the reset trigger signal from the watchdog timer, using the processor to copy registry information from at least one of the processor, a flash memory interface controller, and a protocol controller, and then to control the memory controller to write the copied registry information into a dynamic random access memory device without rebooting a system of the flash memory controller.
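The watchdog-then-dump flow in this abstract can be sketched in software. This is only an illustrative model, not the patented controller: the `Watchdog` class, the `dump_registers` helper, the register names, and the dictionary standing in for DRAM are all hypothetical.

```python
class Watchdog:
    """Hypothetical software model of a counting watchdog timer."""

    def __init__(self, threshold: int):
        self.threshold = threshold
        self.count = 0

    def tick(self) -> bool:
        """Advance the counter; return True once the count is higher than the threshold."""
        self.count += 1
        return self.count > self.threshold


def dump_registers(sources: dict, dram: dict) -> None:
    """Copy register contents from each source block into DRAM (no reboot modelled)."""
    dram["debug_dump"] = {name: dict(regs) for name, regs in sources.items()}


# Assumed register contents, invented for illustration only.
sources = {
    "processor": {"pc": 0x1000, "status": 0x3},
    "flash_interface": {"cmd": 0x70},
    "protocol": {"state": 0x2},
}
dram = {}
wd = Watchdog(threshold=3)
triggers = [wd.tick() for _ in range(4)]
if triggers[-1]:
    dump_registers(sources, dram)
```

Copying into DRAM rather than rebooting is what keeps the captured state available for the host to read afterwards.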

Hierarchical workload allocation in a storage system
11366700 · 2022-06-21

A method for hierarchical workload allocation in a storage system. The method may include determining to reallocate a compute workload of a current compute core of the storage system, wherein the current compute core is responsible for executing a workload allocation unit that comprises one or more first type shards; and reallocating the compute workload by (a) maintaining the responsibility of the current compute core for executing the workload allocation unit, and (b) reallocating at least one first type shard of the one or more first type shards to a new workload allocation unit that is allocated to a new compute core of new compute cores.
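Steps (a) and (b) of this abstract can be illustrated with a small data-structure sketch. The `AllocationUnit` type, the `reallocate` function, and the shard and core names are assumptions made for illustration; the abstract does not specify any of them.

```python
from dataclasses import dataclass, field


@dataclass
class AllocationUnit:
    """Hypothetical workload allocation unit: an owning core plus its shards."""
    core: str
    shards: list = field(default_factory=list)


def reallocate(unit: AllocationUnit, shards_to_move: set, new_core: str) -> AllocationUnit:
    """Move selected shards into a new unit owned by new_core.

    (a) The current core keeps responsibility for the original unit.
    (b) The moved shards form a new unit allocated to the new core.
    """
    moved = [s for s in unit.shards if s in shards_to_move]
    unit.shards = [s for s in unit.shards if s not in shards_to_move]
    return AllocationUnit(core=new_core, shards=moved)


unit = AllocationUnit("core-0", ["s0", "s1", "s2", "s3"])
new_unit = reallocate(unit, {"s2", "s3"}, "core-1")
```

In this sketch the original unit is never handed to another core; only shards migrate, which is the hierarchical aspect the abstract describes.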

HARDWARE SUPPORT FOR SOFTWARE POINTER AUTHENTIFICATION IN A COMPUTING SYSTEM

A processor and method for processing information are disclosed that, in response to encountering a function entry instruction while running an application, compute an entry hash value using a hash of three hash input parameters, wherein one of the input parameters is a secret key stored in a special purpose register; in response to encountering a function exit instruction, compute an exit hash value using the same three input parameters and the same hash used when computing the entry hash value; and determine whether the entry hash value is the same as the exit hash value.
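The entry/exit comparison can be modelled in software as follows. This is a sketch under stated assumptions: the abstract names only the secret key as an input, so the return address and stack pointer used here as the other two inputs are hypothetical, and HMAC-SHA-256 stands in for whatever hash the hardware actually uses.

```python
import hashlib
import hmac


def frame_hash(secret_key: bytes, return_address: int, stack_pointer: int) -> bytes:
    """Hash three inputs: the secret key plus two assumed per-call values."""
    msg = return_address.to_bytes(8, "little") + stack_pointer.to_bytes(8, "little")
    return hmac.new(secret_key, msg, hashlib.sha256).digest()


def check_return(entry_hash: bytes, secret_key: bytes,
                 return_address: int, stack_pointer: int) -> bool:
    """Recompute the hash at function exit and compare it with the entry value."""
    exit_hash = frame_hash(secret_key, return_address, stack_pointer)
    return hmac.compare_digest(entry_hash, exit_hash)


key = b"\x01" * 16  # stands in for the special purpose register contents
entry = frame_hash(key, 0x400123, 0x7FFD0000)
ok = check_return(entry, key, 0x400123, 0x7FFD0000)
tampered = check_return(entry, key, 0x400999, 0x7FFD0000)
```

A mismatch between the two hashes indicates that one of the inputs changed between function entry and exit, for example an overwritten return address.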

Instruction-level context switch in SIMD processor

Techniques are disclosed relating to context switching in a SIMD processor. In some embodiments, an apparatus includes pipeline circuitry configured to execute graphics instructions included in threads of a group of single-instruction multiple-data (SIMD) threads in a thread group. In some embodiments, context switch circuitry is configured to atomically: save, for the SIMD group, a program counter and information that indicates whether threads in the SIMD group are active using one or more context switch registers, set all threads to an active state for the SIMD group, and branch to handler code for the SIMD group. In some embodiments, the pipeline circuitry is configured to execute the handler code to save context information for the SIMD group and subsequently execute threads of another thread group. Disclosed techniques may allow instruction-level context switching even when some SIMD threads are non-active.

SYSTEMS AND METHODS FOR REDUCING REGISTER BANK CONFLICTS BASED ON SOFTWARE HINT AND HARDWARE THREAD SWITCH

Mechanisms for reducing register bank conflicts based on software hint and hardware thread switch are disclosed. In some embodiments, an apparatus for thread switching includes a graphics processing unit (GPU) that includes a plurality of register banks to store operands that are assigned at least partially to avoid register bank conflicts. Decoding circuitry checks a thread switching field of a first instruction to be executed by a first thread. The GPU performs a thread switch mechanism to cause a second instruction to be executed by a second thread when the thread switching field of the first instruction is set.

Multi-Threaded Processor with Thread Granularity

A multi-thread processor has a canonical thread map register which outputs a sequence of thread_id values indicating a current thread for execution. The thread map register is programmable to provide granularity of number of cycles of the canonical sequence assigned to each thread. In one example of the invention, the thread map register has repeating thread identifiers in a sequential or non-sequential manner to overcome memory latency and avoid thread stalls. In another example of the invention, separate interrupt tasks are placed on each thread to reduce interrupt processing latency.
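To illustrate how a programmable thread map sets per-thread granularity, the following sketch replays a map as a repeating schedule and measures each thread's share of cycles. The map contents and the `thread_schedule` helper are hypothetical; the abstract does not give a concrete map.

```python
from itertools import cycle, islice


def thread_schedule(thread_map: list, n_cycles: int) -> list:
    """Repeat the canonical thread map to give the thread_id issued on each cycle."""
    return list(islice(cycle(thread_map), n_cycles))


# Assumed map: thread 0 appears in every other slot, so it gets half the cycles.
thread_map = [0, 1, 0, 2, 0, 1, 0, 3]
schedule = thread_schedule(thread_map, 16)
share = {t: schedule.count(t) / len(schedule) for t in set(thread_map)}
```

Repeating an identifier non-sequentially, as thread 0 is here, spaces out that thread's slots so a memory access issued in one slot has time to complete before the next.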

Secure control flow prediction
11347507 · 2022-05-31

Systems and methods are disclosed for secure control flow prediction. Some implementations may be used to eliminate or mitigate the Spectre-class of attacks in a processor. For example, an integrated circuit (e.g., a processor) for executing instructions includes a control flow predictor with entries that include respective indications of whether the entry has been activated for use in a current process, wherein the integrated circuit is configured to access the indication in one of the entries that is associated with a control flow instruction that is scheduled for execution; determine, based on the indication, whether the entry of the control flow predictor associated with the control flow instruction is activated for use in a current process; and responsive to a determination that the entry is not activated for use in the current process, apply a constraint on speculative execution based on control flow prediction for the control flow instruction.
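The per-entry activation check can be sketched as a small predictor model. Everything named here is an assumption: the `Predictor` class, clearing the indications on a process switch, and using "return no prediction" as the constraint on speculation are illustrative choices, not details taken from the abstract.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    taken: bool
    activated: bool = False  # indication: activated for use in the current process


class Predictor:
    """Hypothetical branch predictor whose entries carry an activation indication."""

    def __init__(self):
        self.entries = {}

    def update(self, pc: int, taken: bool) -> None:
        """Training by the current process activates the entry for that process."""
        self.entries[pc] = Entry(taken=taken, activated=True)

    def process_switch(self) -> None:
        """Assumed policy: deactivate every entry when a new process starts running."""
        for entry in self.entries.values():
            entry.activated = False

    def predict(self, pc: int):
        """Return a prediction, or None to signal that speculation is constrained."""
        entry = self.entries.get(pc)
        if entry is None or not entry.activated:
            return None
        return entry.taken


bp = Predictor()
bp.update(0x40, True)
before = bp.predict(0x40)
bp.process_switch()
after = bp.predict(0x40)
```

Because stale entries trained by another process yield no prediction, that process cannot steer the current process's speculative path through the predictor state.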

Floating Point Norm Instruction
20230273791 · 2023-08-31

A hardware module is provided in an execution unit and is responsive to execution of multiple instances of a first type of instruction to perform a plurality of reductions in parallel. The hardware module comprises: a first accumulator storing first state associated with a first of the reductions; and a second accumulator storing second state associated with a second of the reductions. Upon execution of each of the multiple instances of the first type of instruction: an input value for the respective instance is provided to a first processing circuit of the hardware module such that the first processing circuit performs a first type of operation to update the first state; and the same input value is provided to a second processing circuit of the hardware module such that the second processing circuit performs a second type of operation to update the second state.
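A software analogue of the two-accumulator arrangement is sketched below. The abstract does not name the two operations, so the choice of a sum of squares and a running maximum of absolute values is an assumption, as are the `DualReducer` class and its method names.

```python
class DualReducer:
    """Hypothetical model: one input value updates two reduction states at once."""

    def __init__(self):
        self.sum_sq = 0.0   # first accumulator (assumed: sum of squares)
        self.max_abs = 0.0  # second accumulator (assumed: maximum absolute value)

    def execute(self, x: float) -> None:
        """One instruction instance: feed the same input to both processing paths."""
        self.sum_sq += x * x
        self.max_abs = max(self.max_abs, abs(x))


reducer = DualReducer()
for value in [3.0, -4.0]:
    reducer.execute(value)
```

Each input is read once and consumed by both reductions, which is the benefit of performing them in parallel rather than in two separate passes over the data.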

Method and apparatus for balancing binary instruction burstization and chaining

A method for grouping computer instructions includes receiving a set of computer instructions, grouping the set of computer instructions by register dependencies, identifying a plurality of single-definition-use flow (SDF) bundles based on a burstization criterion and a chaining criterion, and, based on the SDF bundles, transforming the set of computer instructions. The transformation may include splitting one of the set of computer instructions and setting a burst parameter for the one of the set of computer instructions. The transformation may include grouping a plurality of the set of computer instructions and replacing a pair of register file accesses with a pair of temporary register accesses.