G06F5/16

HIGH-THROUGHPUT ASYNCHRONOUS DATA PIPELINE
20230047511 · 2023-02-16 ·

One embodiment of the present invention sets forth a data pipeline, which includes a first mousetrap element and a second mousetrap element in a first pipeline stage. Each mousetrap element includes a request latch that, when enabled, allows a request signal to pass from the first pipeline stage to a second pipeline stage following the first pipeline stage in the data pipeline. Each mousetrap element also includes a data latch that, when enabled, allows a data element to pass from the first pipeline stage to the second pipeline stage. Each mousetrap element further includes a latch controller that enables and disables the request and data latches based on a phase signal that alternates between a first value that configures the first mousetrap element to transmit data to the second pipeline stage and a second value that configures the second mousetrap element to transmit data to the second pipeline stage.

Hardware architecture for a neural network accelerator

Examples herein describe hardware architecture for processing and accelerating data passing through layers of a neural network. In one embodiment, a reconfigurable integrated circuit (IC) for use with a neural network includes a digital processing engine (DPE) array, each DPE having a plurality of neural network units (NNUs). Each DPE generates different output data based on the currently processing layer of the neural network, with the NNUs parallel processing different input data sets. The reconfigurable IC also includes a plurality of ping-pong buffers designed to alternate storing and processing data for the layers of the neural network.

Hardware architecture for a neural network accelerator

Examples herein describe hardware architecture for processing and accelerating data passing through layers of a neural network. In one embodiment, a reconfigurable integrated circuit (IC) for use with a neural network includes a digital processing engine (DPE) array, each DPE having a plurality of neural network units (NNUs). Each DPE generates different output data based on the currently processing layer of the neural network, with the NNUs parallel processing different input data sets. The reconfigurable IC also includes a plurality of ping-pong buffers designed to alternate storing and processing data for the layers of the neural network.

Processor device for executing SIMD instructions
11500632 · 2022-11-15 · ·

In a processor device according to the present invention, a memory access unit reads data to be processed from an external memory and writes the data to a first register group that a plurality of processors does not access among a plurality of register groups. A control unit sequentially makes each of the plurality of processors implement a same instruction, in parallel with changing an address of a register group that stores the data to be processed. A scheduler, based on specified scenario information, specifies an instruction to be implemented and a register group to be accessed for the plurality of processors, and specifies a register group to be written to among the plurality of register groups and data to be processed that is to be written for the memory access unit.

Information processing apparatus and memory control method
11663453 · 2023-05-30 · ·

There is provided with an information processing apparatus. A control unit controls writing of weight data to a first memory and a second memory, and controls readout of the weight data from the first memory and the second memory. The control unit further switches an operation between a first operation in which a processing unit reads out first weight data from the first memory and performs the convolution operation processing using the first weight data while the processing unit writes second weight data to the second memory in parallel, and a second operation in which the processing unit reads out the first weight data from both the first memory and the second memory and performs the convolution operation processing using the first weight data.

Information processing apparatus and memory control method
11663453 · 2023-05-30 · ·

There is provided with an information processing apparatus. A control unit controls writing of weight data to a first memory and a second memory, and controls readout of the weight data from the first memory and the second memory. The control unit further switches an operation between a first operation in which a processing unit reads out first weight data from the first memory and performs the convolution operation processing using the first weight data while the processing unit writes second weight data to the second memory in parallel, and a second operation in which the processing unit reads out the first weight data from both the first memory and the second memory and performs the convolution operation processing using the first weight data.

IC including logic tile, having reconfigurable MAC pipeline, and reconfigurable memory
11663016 · 2023-05-30 · ·

An integrated circuit including configurable multiplier-accumulator circuitry, wherein, during processing operations, a plurality of the multiplier-accumulator circuits are serially connected into pipelines to perform concatenated multiply and accumulate operations. The integrated circuit includes a first memory and a second memory, and a switch interconnect network, including configurable multiplexers arranged in a plurality of switch matrices. The first and second memories are configurable as either a dedicated read memory or a dedicated write memory and connected to a given pipeline, via the switch interconnect network, during a processing operation performed thereby; wherein, during a first processing operations, the first memory is dedicated to write data to a first pipeline and the second memory is dedicated to read data therefrom and, during a second processing operation, the first memory is dedicated to read data from a second pipeline and the second memory is dedicated to write data thereto.

IC including logic tile, having reconfigurable MAC pipeline, and reconfigurable memory
11663016 · 2023-05-30 · ·

An integrated circuit including configurable multiplier-accumulator circuitry, wherein, during processing operations, a plurality of the multiplier-accumulator circuits are serially connected into pipelines to perform concatenated multiply and accumulate operations. The integrated circuit includes a first memory and a second memory, and a switch interconnect network, including configurable multiplexers arranged in a plurality of switch matrices. The first and second memories are configurable as either a dedicated read memory or a dedicated write memory and connected to a given pipeline, via the switch interconnect network, during a processing operation performed thereby; wherein, during a first processing operations, the first memory is dedicated to write data to a first pipeline and the second memory is dedicated to read data therefrom and, during a second processing operation, the first memory is dedicated to read data from a second pipeline and the second memory is dedicated to write data thereto.

Slip detection on multi-lane serial datalinks
11687320 · 2023-06-27 · ·

The disclosure relates to detecting phase slips that may occur in a multi-lane serial datalink. Phase slips may occur when a lane experiences lane skew, which may introduce a phase slip with respect to another lane. To detect phase slippage, the system may select a reference lane from among the lanes. The system may generate a pre-deskew delta value based on a difference between the FIFO filling level of the reference lane before a deskew and the FIFO filling level of a second lane before the deskew. The system may generate a post-deskew delta value based on a difference between the FIFO filling level of the reference lane after the deskew and the FIFO filling level of the second lane after the deskew. The system may use a difference between the post-deskew delta and the pre-deskew delta to detect phase slip on the second lane relative to the reference lane.

Slip detection on multi-lane serial datalinks
11687320 · 2023-06-27 · ·

The disclosure relates to detecting phase slips that may occur in a multi-lane serial datalink. Phase slips may occur when a lane experiences lane skew, which may introduce a phase slip with respect to another lane. To detect phase slippage, the system may select a reference lane from among the lanes. The system may generate a pre-deskew delta value based on a difference between the FIFO filling level of the reference lane before a deskew and the FIFO filling level of a second lane before the deskew. The system may generate a post-deskew delta value based on a difference between the FIFO filling level of the reference lane after the deskew and the FIFO filling level of the second lane after the deskew. The system may use a difference between the post-deskew delta and the pre-deskew delta to detect phase slip on the second lane relative to the reference lane.