Patent classifications
G06F15/7878
PROCESSORS AND METHODS FOR PIPELINED RUNTIME SERVICES IN A SPATIAL ARRAY
Methods and apparatuses relating to pipelined runtime services in spatial arrays are described. In one embodiment, a processor includes processing elements; an interconnect network between the processing elements; a first configuration controller coupled to a first subset of the processing elements; and a second configuration controller coupled to a second, different subset of the processing elements, the first configuration controller and the second configuration controller are to configure the first subset and the second, different subset according to configuration information for a first context, and, for a context switch, the first configuration controller is to configure the first subset according to configuration information for a second context after pending operations of the first context are completed in the first subset and block second context dataflow into the second, different subset's input from the first subset's output until pending operations of the first context are completed in the second, different subset.
Reconfigurable Parallel Processing
Processors, systems and methods are provided for thread level parallel processing. A processor may comprise a plurality of processing elements (PEs) that each may comprise a configuration buffer, a sequencer coupled to the configuration buffer of each of the plurality of PEs and configured to distribute one or more PE configurations to the plurality of PEs, and a gasket memory coupled to the plurality of PEs and being configured to store at least one PE execution result to be used by at least one of the plurality of PEs during a next PE configuration.
Information processing device, information processing method, information processing program, and recording medium
An image processing device configured to process image data obtained externally, the image processing device including a preprocessing circuit configured to carry out preprocessing during image processing; and a circuit configuration controller configured to carry out partial reconfiguration of the preprocessing circuit; the preprocessing circuit including: a plurality of arithmetic converter circuits configured to perform arithmetic computations on image data to convert the image data; and a timing control circuit provided between each one in the plurality of arithmetic converter circuits connected in order of processing, the timing control circuit configured to secure the reliable exchange of data; and the circuit configuration controller partially reconfiguring at least one arithmetic converter circuit in the plurality of arithmetic converter circuits while not partially reconfiguring the timing control circuit.
Reconfigurable processor and configuration method
The present disclosure discloses a reconfigurable processor and a configuration method. The reconfigurable processor includes a reconfiguration configuration unit and a reconfigurable array. The reconfiguration configuration unit is configured to provide, according to an algorithm matched with a current application scenario, reconfiguration information used for reconfiguring a computation structure in the reconfigurable array. The reconfigurable array includes at least two stages of computational arrays, the reconfigurable array is configured to connect, according to the reconfiguration information provided by the reconfiguration configuration unit, adjacent two stages of the computational arrays to form a data path pipeline structure satisfying computation requirements of the algorithm. In the same stage of the computational array, pipeline depths of different computation modules connected to the data path pipeline structure are equal, such that the different computation modules connected to the data path pipeline structure synchronously output data.
Static Shared Memory Access for Reconfigurable Parallel Processor
Processors, systems and methods are provided for thread level parallel processing. A processor may comprise a plurality of processing elements (PEs) that each may comprise an arithmetic logic unit (ALU), a data buffer associated with the ALU, and an indicator associated with the data buffer to indicate whether a piece of data inside the data buffer is to be reused for repeated execution of a same instruction as a pipeline stage.
Reconfigurable Parallel Processing
Processors, systems and methods are provided for thread level parallel processing. A processor may comprise a plurality of processing elements (PEs) that each may comprise a configuration buffer, a sequencer coupled to the configuration buffer of each of the plurality of PEs and configured to distribute one or more PE configurations to the plurality of PEs, and a gasket memory coupled to the plurality of PEs and being configured to store at least one PE execution result to be used by at least one of the plurality of PEs during a next PE configuration.
Circular Reconfiguration for Reconfigurable Parallel Processor
Processors, systems and methods are provided for thread level parallel processing. A processor may comprise a plurality of reconfigurable units that may include a plurality of processing elements (PEs) and a plurality of memory ports (MPs) for the plurality of PEs to access a memory unit. Each of the plurality of reconfigurable units may comprise a configuration buffer and a reconfiguration counter. The processor may further comprise a sequencer coupled to the configuration buffer of each of the plurality of reconfigurable units and configured to distribute a plurality of configurations to the plurality of reconfigurable units for the plurality of PEs and the plurality of MPs to execute a sequence of instructions.
Private Memory Structure for Reconfigurable Parallel Processor
Processors, systems and methods are provided for thread level parallel processing. A processor may comprise a plurality of processing elements (PEs) and a plurality of memory ports (MPs) for the plurality of PEs to access a memory unit. Each PE may have a plurality of arithmetic logic units (ALUs) that are configured to execute a same instruction in parallel threads. Each of the plurality of MPs may comprise an address calculation unit configured to generate respective memory addresses for each thread to access a different memory bank in the memory unit.
Shared Memory Structure for Reconfigurable Parallel Processor
Processors, systems and methods are provided for thread level parallel processing. A processor may comprise a plurality of processing elements (PEs) each having a plurality of arithmetic logic units (ALUs) that are configured to execute a same instruction in parallel threads and a plurality of memory ports (MPs) for the plurality of PEs to access a memory unit. Each of the plurality of MPs may comprise an address calculation unit configured to generate respective memory addresses for each thread to access a common area in the memory unit.
Pipelined Configurable Processor
A configurable processing circuit capable of handling multiple threads simultaneously, the circuit comprising a thread data store, a plurality of configurable execution units, a configurable routing network for connecting locations in the thread data store to the execution units, a configuration data store for storing configuration instances that each define a configuration of the routing network and a configuration of one or more of the plurality of execution units, and a pipeline formed from the execution units, the routing network and the thread data store that comprises a plurality of pipeline sections configured such that each thread propagates from one pipeline section to the next at each clock cycle, the circuit being configured to: (i) associate each thread with a configuration instance; and (ii) configure each of the plurality of pipeline sections for each clock cycle to be in accordance with the configuration instance associated with the respective thread that will propagate through that pipeline section during the clock cycle.