Patent classifications
G06F15/7885
RECONFIGURABLE PARALLEL PROCESSING
Processors, systems and methods are provided for thread level parallel processing. A processor may comprise a plurality of processing elements (PEs) that each may comprise a configuration buffer, a sequencer coupled to the configuration buffer of each of the plurality of PEs and configured to distribute one or more PE configurations to the plurality of PEs, and a gasket memory coupled to the plurality of PEs and being configured to store at least one PE execution result to be used by at least one of the plurality of PEs during a next PE configuration.
Shared Memory Structure for Reconfigurable Parallel Processor
Processors, systems and methods are provided for thread level parallel processing. A processor may comprise a plurality of processing elements (PEs) each having a plurality of arithmetic logic units (ALUs) that are configured to execute a same instruction in parallel threads and a plurality of memory ports (MPs) for the plurality of PEs to access a memory unit. Each of the plurality of MPs may comprise an address calculation unit configured to generate respective memory addresses for each thread to access a common area in the memory unit.
CIRCULAR RECONFIGURATION FOR RECONFIGURABLE PARALLEL PROCESSOR
Processors, systems and methods are provided for thread level parallel processing. A processor may comprise a plurality of reconfigurable units that may include a plurality of processing elements (PEs) and a plurality of memory ports (MPs) for the plurality of PEs to access a memory unit. Each of the plurality of reconfigurable units may comprise a configuration buffer and a reconfiguration counter. The processor may further comprise a sequencer coupled to the configuration buffer of each of the plurality of reconfigurable units and configured to distribute a plurality of configurations to the plurality of reconfigurable units for the plurality of PEs and the plurality of MPs to execute a sequence of instructions.
Private Memory Structure for Reconfigurable Parallel Processor
Processors, systems and methods are provided for thread level parallel processing. A processor may comprise a plurality of processing elements (PEs) and a plurality of memory ports (MPs) for the plurality of PEs to access a memory unit. Each PE may have a plurality of arithmetic logic units (ALUs) that are configured to execute a same instruction in parallel threads. Each of the plurality of MPs may comprise an address calculation unit configured to generate respective memory addresses for each thread to access a different memory bank in the memory unit.
Comparison-based sort in a reconfigurable array processor having multiple processing elements for sorting array elements
An array processor includes a managing element having a load streaming unit coupled to multiple processing elements. The load streaming unit provides input data portions to each of a first subset of the processing elements and also receives output data from each of a second subset of the processing elements based on a comparatively sorted combination of the input data portions provided to the first subset of processing elements. Furthermore, each of processing elements is configurable by the managing element to compare input data portions received from either the load streaming unit or two or more of the other processing elements, wherein the input data portions are stored for processing in respective queues. Each processing unit is further configurable to select an input data portion to be output data based on the comparison, and in response to selecting the input data portion, remove a queue entry corresponding to the selected input data portion. Each processing element may be further configured to provide the selected output data portion to either the managing element or as an input to one of the processing elements.
Reconfigurable parallel processor with a plurality of chained memory ports
Processors, systems and methods are provided for thread level parallel processing. A processor may comprise a plurality of processing elements (PEs) that each may comprise a configuration buffer, a sequencer coupled to the configuration buffer of each of the plurality of PEs and configured to distribute one or more PE configurations to the plurality of PEs, and a gasket memory coupled to the plurality of PEs and being configured to store at least one PE execution result to be used by at least one of the plurality of PEs during a next PE configuration.
Circular reconfiguration for a reconfigurable parallel processor using a plurality of chained memory ports
Processors, systems and methods are provided for thread level parallel processing. A processor may comprise a plurality of reconfigurable units that may include a plurality of processing elements (PEs) and a plurality of memory ports (MPs) for the plurality of PEs to access a memory unit. Each of the plurality of reconfigurable units may comprise a configuration buffer and a reconfiguration counter. The processor may further comprise a sequencer coupled to the configuration buffer of each of the plurality of reconfigurable units and configured to distribute a plurality of configurations to the plurality of reconfigurable units for the plurality of PEs and the plurality of MPs to execute a sequence of instructions.
Shared memory access for a reconfigurable parallel processor with a plurality of chained memory ports
Processors, systems and methods are provided for thread level parallel processing. A processor may comprise a plurality of processing elements (PEs) each having a plurality of arithmetic logic units (ALUs) that are configured to execute a same instruction in parallel threads and a plurality of memory ports (MPs) for the plurality of PEs to access a memory unit. Each of the plurality of MPs may comprise an address calculation unit configured to generate respective memory addresses for each thread to access a common area in the memory unit.
Private memory access for a reconfigurable parallel processor using a plurality of chained memory ports
Processors, systems and methods are provided for thread level parallel processing. A processor may comprise a plurality of processing elements (PEs) and a plurality of memory ports (MPs) for the plurality of PEs to access a memory unit. Each PE may have a plurality of arithmetic logic units (ALUs) that are configured to execute a same instruction in parallel threads. Each of the plurality of MPs may comprise an address calculation unit configured to generate respective memory addresses for each thread to access a different memory bank in the memory unit.
Reconfigurable Parallel Processing
Processors, systems and methods are provided for thread level parallel processing. A processor may comprise a plurality of processing elements (PEs) that each may comprise a configuration buffer, a sequencer coupled to the configuration buffer of each of the plurality of PEs and configured to distribute one or more PE configurations to the plurality of PEs, and a gasket memory coupled to the plurality of PEs and being configured to store at least one PE execution result to be used by at least one of the plurality of PEs during a next PE configuration.