IPIQ

G06F9/30123

COMMUNICATION BETWEEN THREADS OF MULTI-THREAD PROCESSOR

20170351518 · 2017-12-07 ·

Thang Tran

Embodiments of the present disclosure support hardware based thread switching in a multithreading environment. The thread switching is implemented on a multithread microprocessor by utilizing thread mailbox registers and other auxiliary registers that can be pre-programmed for hardware based thread switching. A set of mailbox registers can be allocated to each thread of a plurality of threads that can be executed in the microprocessor. A mailbox register in the set of mailbox registers comprises an identifier of a next thread of the plurality of threads to which an active thread switches based on a thread switch condition further indicated in the mailbox register. The auxiliary registers in the microprocessor can be used to configure a number of threads for simultaneous execution in the microprocessor, a priority for thread switching, and to store a program counter of each thread and states of registers of each thread.
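As a rough software illustration of the mailbox idea in this abstract, the sketch below models each mailbox register as a pair of switch condition and next-thread identifier. The names `Mailbox`, `next_active`, and the condition strings are hypothetical and not taken from the patent, which describes a hardware mechanism.

```python
from dataclasses import dataclass


@dataclass
class Mailbox:
    # Hypothetical model of one mailbox register: a switch condition
    # and the identifier of the thread to switch to when it occurs.
    condition: str
    next_thread: int


def next_active(active, mailboxes, event):
    """Return the thread that should run after `event` occurs.

    `mailboxes` maps a thread id to the list of Mailbox entries
    allocated to that thread. If no entry matches, the active thread
    keeps running.
    """
    for mb in mailboxes.get(active, []):
        if mb.condition == event:
            return mb.next_thread
    return active


# Assumed example configuration: thread 0 yields to thread 1 on a cache
# miss, and thread 1 hands control back to thread 0 on a timer event.
mailboxes = {
    0: [Mailbox("cache_miss", 1)],
    1: [Mailbox("timer", 0)],
}
```

In this toy model, `next_active(0, mailboxes, "cache_miss")` selects thread 1, while an event with no matching mailbox entry leaves the active thread unchanged.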

THREAD SWITCHING IN MICROPROCESSOR WITHOUT FULL SAVE AND RESTORE OF REGISTER FILE

20170351520 · 2017-12-07 ·

Thang Tran

Certain embodiments of the present disclosure support a method and apparatus for efficient multithreading on a single core microprocessor. Thread switching in the single core microprocessor presented herein is based on a reserved space in a memory allocated to each thread for storing and restoring of registers in a register file. The thread switching is achieved without full save and restore of the register file, and only those registers referenced in the memory are saved and restored during thread switching.

DISTRIBUTED PROCESSOR SYSTEM

20220374389 · 2022-11-24 ·

This disclosure relates to a distributed processing system for configuring multiple processing channels. The distributed processing system includes a main processor, such as an ARM processor, communicatively coupled to a plurality of co-processors, such as stream processors. The co-processors can execute instructions in parallel with each other and interrupt the ARM processor. Longer latency instructions can be executed by the main processor and lower latency instructions can be executed by the co-processors. There are several ways that a stream can be triggered in the distributed processing system. In an embodiment, the distributed processing system is a stream processor system that includes an ARM processor and stream processors configured to access different register sets. The stream processors can include a main stream processor and stream processors in respective transmit and receive channels. The stream processor system can be implemented in a radio system to configure the radio for operation.

ADAPTIVE CREDIT-BASED REPLENISHMENT THRESHOLD USED FOR TRANSACTION ARBITRATION IN A SYSTEM THAT SUPPORTS MULTIPLE LEVELS OF CREDIT EXPENDITURE

20220374358 · 2022-11-24 ·

Daniel Brad Wu

A device includes an arbiter circuit configured to receive a first request for a resource. The first request is associated with a first credit cost. The arbiter circuit is further configured to receive a second request for the resource. The second request is associated with a second credit cost. The arbiter circuit is further configured to select the first request for the resource as an arbitration winner. The arbiter circuit is further configured to decrement a number of available credits associated with the resource by the first credit cost. The arbiter circuit is further configured to, in response to the number of available credits associated with the resource falling to a lower credit threshold, wait until the number of available credits associated with the resource reaches an upper credit threshold to select an additional arbitration winner for the resource.
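The two-threshold behaviour in this abstract can be sketched in software as follows. This is a toy model under stated assumptions: the class name `CreditArbiter`, the "first affordable request wins" selection policy, and the `replenish` method are illustrative choices, not details from the patent.

```python
class CreditArbiter:
    """Toy model of credit-based arbitration with lower/upper thresholds."""

    def __init__(self, credits, lower, upper):
        self.credits = credits  # credits currently available for the resource
        self.lower = lower      # falling to this level pauses arbitration
        self.upper = upper      # arbitration resumes once credits reach this level
        self.paused = False

    def arbitrate(self, requests):
        """Pick a winner from `requests`, a list of (name, credit_cost).

        Returns the winner's name, or None if arbitration is paused or
        no request is affordable.
        """
        if self.paused:
            if self.credits < self.upper:
                return None
            self.paused = False
        # Assumed policy: the first request that fits in the credits wins.
        for name, cost in requests:
            if cost <= self.credits:
                self.credits -= cost
                if self.credits <= self.lower:
                    self.paused = True
                return name
        return None

    def replenish(self, amount):
        """Return credits to the pool (for example when a transaction completes)."""
        self.credits += amount
```

Once the pool has fallen to the lower threshold, the arbiter keeps declining requests until the pool has been replenished up to the upper threshold, even if individual requests would already be affordable.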

SOFTWARE ISOLATION USING EVENT DRIVEN MULTI-THREADING

20230169163 · 2023-06-01 ·

Enhanced security of multiple software processes executing on a computer system is provided by isolating those processes from each other and from access to system hardware resources. Embodiments provide such isolation by executing kernel software that manages hardware and controls physical address space on a separate hardware thread (e.g., in an isolation domain) from the process threads executing application programs (e.g., in execution domains). This renders the software executing in the isolation domain safe from privilege escalation attacks and permits implementation of enforceable isolation between execution systems. A multithreaded processor having switch-on-event multithreading is used to provide software isolation and hardware-controlled handling of a subset of system services by a different hardware thread than the one requesting the service.

Vector processor storage

11263145 · 2022-03-01 ·

Nyriad Limited

A method comprising: receiving, at a vector processor, a request to store data; performing, by the vector processor, one or more transforms on the data; and directly instructing, by the vector processor, one or more storage devices to store the data; wherein performing one or more transforms on the data comprises: erasure encoding the data to generate n data fragments configured such that any k of the data fragments are usable to regenerate the data, where k is less than n; and wherein directly instructing one or more storage devices to store the data comprises: directly instructing the one or more storage devices to store the plurality of data fragments.
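The "any k of n fragments" property can be illustrated with the simplest possible erasure code: two data halves plus one XOR parity fragment (k = 2, n = 3). This is only a minimal sketch of the property; the patent does not say which code is used, and the function names `encode` and `decode` are hypothetical.

```python
def encode(data: bytes):
    """Split `data` into 3 fragments such that any 2 can rebuild it."""
    half = (len(data) + 1) // 2
    a = data[:half]
    b = data[half:].ljust(half, b"\0")       # pad second half to equal length
    p = bytes(x ^ y for x, y in zip(a, b))   # XOR parity fragment
    return [a, b, p]


def decode(fragments, length):
    """Rebuild the original data from [a, b, p] with at most one entry None.

    `length` is the original data length, needed to strip the padding.
    """
    a, b, p = fragments
    if a is None:
        a = bytes(x ^ y for x, y in zip(b, p))
    elif b is None:
        b = bytes(x ^ y for x, y in zip(a, p))
    return (a + b)[:length]
```

For example, after `a, b, p = encode(b"hello")`, calling `decode([None, b, p], 5)` recovers the original five bytes even though the first fragment is missing.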

Semiconductor device

11263046 · 2022-03-01 ·

Renesas Electronics Corporation

Yasuo SASAKI

A semiconductor device capable of executing a plurality of tasks in real time and improving performance is provided. The semiconductor device comprises a plurality of processors and a plurality of DMA controllers as masters, a plurality of memory ways as slaves, and a real-time schedule unit for controlling the plurality of masters such that the plurality of tasks are executed in real time. The real-time schedule unit (RTSD) uses a memory access monitor circuit and a data determination register to determine whether the input data of a task has been determined, and causes a task whose input data has been determined to be executed preferentially.

Automatic generation of efficient vector code with low overhead in a time-efficient manner independent of vector width

11262989 · 2022-03-01 ·

Advanced Micro Devices, Inc.

A computing system includes a compatibility graph builder to generate a compatibility graph based on a dependency graph representing program source code, where the compatibility graph indicates compatibility relationships between operations represented in the dependency graph, a clique generator coupled with the compatibility graph builder to generate a set of candidate vector packings based on the compatibility relationships indicated in the compatibility graph, a set cover generator coupled with the clique generator to select a subset of vector packings from the set of candidate vector packings, and a vector code generator coupled with the set cover generator to generate the vector code based on the selected subset of vector packings.

FPGA-BASED RESEQUENCING ANALYSIS METHOD AND DEVICE

20220058321 · 2022-02-24 ·

Proposed by the present disclosure are an FPGA-based resequencing analysis method and device, wherein the method comprises: receiving genomic resequencing data; using the resequencing data as an input of an FPGA, determining a comparison result in the resequencing process according to an output of the FPGA, and simultaneously performing sorting and deduplication processing on the comparison result; correcting a base quality value of the comparison result after sorting and deduplication processing; and detecting a mutation result according to the corrected comparison result. The described method may save program running time, save calculation costs, and improve resequencing efficiency.

Fast mapping table register file allocation algorithm for SIMT processors

09798543 · 2017-10-24 ·

Nvidia Corporation

One embodiment of the present invention sets forth a technique for allocating register file entries included in a register file to a thread group. A request to allocate a number of register file entries to the thread group is received. A required number of mapping table entries included in a register file mapping table (RFMT) is determined based on the request, where each mapping table entry included in the RFMT is associated with a different plurality of register file entries included in the register file. The RFMT is parsed to locate an available mapping table entry in the RFMT for each of the required mapping table entries. For each available mapping table entry, a register file pointer is associated with an address that corresponds to a first register file entry in the plurality of register file entries associated with the available mapping table entry.
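The allocation steps in this abstract can be sketched as below. The sketch assumes each mapping table entry covers a fixed block of 4 register file entries and that the table is scanned first-fit; the constant `ENTRIES_PER_MAPPING`, the function name `allocate`, and the list-based table are illustrative assumptions rather than details from the patent.

```python
import math

ENTRIES_PER_MAPPING = 4  # assumed block size per mapping table entry


def allocate(rfmt, thread_group, num_regs):
    """Allocate register file entries to `thread_group`.

    `rfmt` is a list in which rfmt[i] holds the owning thread group of
    mapping table entry i, or None if the entry is available.
    Returns a list of register file pointers (the address of the first
    register file entry of each allocated block), or None if the table
    lacks enough available entries. The table is modified only on success.
    """
    needed = math.ceil(num_regs / ENTRIES_PER_MAPPING)
    # Scan the table for available entries, taking only as many as needed.
    free = [i for i, owner in enumerate(rfmt) if owner is None][:needed]
    if len(free) < needed:
        return None
    for i in free:
        rfmt[i] = thread_group
    return [i * ENTRIES_PER_MAPPING for i in free]
```

Because each pointer is tracked per mapping table entry, the allocated blocks do not have to be contiguous in the register file.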
