G06F15/7885

Static Shared Memory Access for Reconfigurable Parallel Processor
20180267809 · 2018-09-20 ·

Processors, systems and methods are provided for thread level parallel processing. A processor may comprise a plurality of processing elements (PEs) that each may comprise an arithmetic logic unit (ALU), a data buffer associated with the ALU, and an indicator associated with the data buffer to indicate whether a piece of data inside the data buffer is to be reused for repeated execution of a same instruction as a pipeline stage.

Reconfigurable Parallel Processing
20180267929 · 2018-09-20 ·

Processors, systems and methods are provided for thread level parallel processing. A processor may comprise a plurality of processing elements (PEs) that each may comprise a configuration buffer, a sequencer coupled to the configuration buffer of each of the plurality of PEs and configured to distribute one or more PE configurations to the plurality of PEs, and a gasket memory coupled to the plurality of PEs and being configured to store at least one PE execution result to be used by at least one of the plurality of PEs during a next PE configuration.

Circular Reconfiguration for Reconfigurable Parallel Processor
20180267930 · 2018-09-20 ·

Processors, systems and methods are provided for thread level parallel processing. A processor may comprise a plurality of reconfigurable units that may include a plurality of processing elements (PEs) and a plurality of memory ports (MPs) for the plurality of PEs to access a memory unit. Each of the plurality of reconfigurable units may comprise a configuration buffer and a reconfiguration counter. The processor may further comprise a sequencer coupled to the configuration buffer of each of the plurality of reconfigurable units and configured to distribute a plurality of configurations to the plurality of reconfigurable units for the plurality of PEs and the plurality of MPs to execute a sequence of instructions.

Private Memory Structure for Reconfigurable Parallel Processor
20180267931 · 2018-09-20 ·

Processors, systems and methods are provided for thread level parallel processing. A processor may comprise a plurality of processing elements (PEs) and a plurality of memory ports (MPs) for the plurality of PEs to access a memory unit. Each PE may have a plurality of arithmetic logic units (ALUs) that are configured to execute a same instruction in parallel threads. Each of the plurality of MPs may comprise an address calculation unit configured to generate respective memory addresses for each thread to access a different memory bank in the memory unit.

Shared Memory Structure for Reconfigurable Parallel Processor
20180267932 · 2018-09-20 ·

Processors, systems and methods are provided for thread level parallel processing. A processor may comprise a plurality of processing elements (PEs) each having a plurality of arithmetic logic units (ALUs) that are configured to execute a same instruction in parallel threads and a plurality of memory ports (MPs) for the plurality of PEs to access a memory unit. Each of the plurality of MPs may comprise an address calculation unit configured to generate respective memory addresses for each thread to access a common area in the memory unit.

Comparison-based sort in a reconfigurable array processor having multiple processing elements for sorting array elements

An array processor includes a managing element having a load streaming unit coupled to multiple processing elements. The load streaming unit provides input data portions to each of a first subset of the processing elements and also receives output data from each of a second subset of the processing elements based on a comparatively sorted combination of the input data portions provided to the first subset of processing elements. Furthermore, each of processing elements is configurable by the managing element to compare input data portions received from either the load streaming unit or two or more of the other processing elements, wherein the input data portions are stored for processing in respective queues. Each processing unit is further configurable to select an input data portion to be output data based on the comparison, and in response to selecting the input data portion, remove a queue entry corresponding to the selected input data portion. Each processing element may be further configured to provide the selected output data portion to either the managing element or as an input to one of the processing elements.

Method for configuring an interface unit of a computer system

A method for configuring an interface unit of a computer system with a first processor and a second processor stored in the interface unit. A data link is set up between the first processor and the second processor. A peripheral of the computer system is configured to store input data in an input data channel and to read output data from an output data channel, and the second processor is configured to read the input data from the input data channel and to store output data in the output data channel. A sequence of processor commands for the second processor is created such that a number of subsequences is created.

Configuration Data Store in a Reconfigurable Data Processor Having Two Access Modes

A reconfigurable processor is disclosed, featuring an array of configurable units interconnected by a bus system. Each configurable unit includes a configuration data store structured as a shift register that includes individually addressable argument registers. Program load logic is responsible for receiving sub-files of configuration data through the bus system and sequentially shifting them into the configuration data store, including the argument registers. Argument load logic is designed to receive argument data via the bus system and directly load it into the argument registers without the need for shifting through the shift register.

ANALYZING DATA USING A HIERARCHICAL STRUCTURE
20180096213 · 2018-04-05 ·

Apparatus, systems, and methods for analyzing data are described. The data can be analyzed using a hierarchical structure. One such hierarchical structure can comprise a plurality of layers, where each layer performs an analysis on input data and provides an output based on the analysis. The output from lower layers in the hierarchical structure can be provided as inputs to higher layers. In this manner, lower layers can perform a lower level of analysis (e.g., more basic/fundamental analysis), while a higher layer can perform a higher level of analysis (e.g., more complex analysis) using the outputs from one or more lower layers. In an example, the hierarchical structure performs pattern recognition.

Comparison-based sort in a reconfigurable array processor having multiple processing elements for sorting array elements

A method for sorting data in an array processor. Each of a first tier of processing elements in the array processor receives data inputs from a load streaming unit. Each of the first tier processing elements compares input data portions received from the load streaming unit, wherein the input data portions are stored for processing in respective queues. The first tier processing elements select one of the input data portions to be an output data portion based on the comparison, and in response to the selection, remove a corresponding queue entry and request next input data from the load streaming unit. Each of the first tier processing elements further provides the output data portion as an input data portion to a second tier processing element that generates output data based on a comparison of output data received from at least two first tier processing elements.