Patent classifications
G06F5/06
Fabric vectors for deep learning acceleration
Techniques in advanced deep learning provide improvements in one or more of accuracy, performance, and energy efficiency. An array of processing elements performs flow-based computations on wavelets of data. Each processing element has a respective compute element and a respective routing element. Instructions executed by the compute element include operand specifiers, some of which specify a data structure register that stores a data structure descriptor describing an operand as either a fabric vector or a memory vector. The data structure descriptor further describes various attributes of the fabric vector: length, microthreading eligibility, number of data elements to receive, transmit, and/or process in parallel, virtual channel and task identification information, whether to terminate upon receiving a control wavelet, and whether to mark an outgoing wavelet as a control wavelet.
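The descriptor attributes listed in this abstract can be pictured as a small record looked up through a data structure register. The following is only an illustrative sketch: the field names, types, the register index 3, and the helper resolve_operand are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass
from enum import Enum


class OperandKind(Enum):
    FABRIC_VECTOR = "fabric"
    MEMORY_VECTOR = "memory"


@dataclass(frozen=True)
class DataStructureDescriptor:
    # All field names below are hypothetical; they mirror the attributes
    # named in the abstract, not an actual register layout.
    kind: OperandKind
    length: int
    microthreading_eligible: bool
    parallel_elements: int          # elements received/transmitted/processed in parallel
    virtual_channel: int
    task_id: int
    terminate_on_control: bool      # stop when a control wavelet is received
    mark_outgoing_as_control: bool  # tag outgoing wavelets as control wavelets


# A toy register file: data structure register index -> descriptor.
dsr_file = {
    3: DataStructureDescriptor(
        kind=OperandKind.FABRIC_VECTOR,
        length=16,
        microthreading_eligible=True,
        parallel_elements=4,
        virtual_channel=2,
        task_id=7,
        terminate_on_control=True,
        mark_outgoing_as_control=False,
    )
}


def resolve_operand(dsr_index: int) -> DataStructureDescriptor:
    """Return the descriptor that an operand specifier points at."""
    return dsr_file[dsr_index]


descriptor = resolve_operand(3)
```

In this sketch an instruction's operand specifier would carry only the index, and the compute element would consult the descriptor to decide how to fetch or emit the operand.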
Semiconductor integrated circuit
An asynchronous FIFO is arranged between an input bus and an output bus that have different numbers of lanes. The asynchronous FIFO supplies a write clock and a read clock. A circuit block receives output data from the asynchronous FIFO via the output bus and executes predetermined processing. In a test mode, a test circuit supplies a test pattern as interrupt data to the input bus and detects the presence or absence of an abnormality based on the relation between the output data and the expected value derived from the test pattern.
MIMD processor emulated on SIMD architecture
A processor has a SIMD architecture and includes an array of elementary processors, each associated with an elementary memory cell, and a central controller connected to the elementary processors by an instruction bus and a status bus. The central controller transmits a sequence of instructions in a loop, each instruction including a calculation flow identifier. Each elementary processor has an instruction filter that rejects or accepts an instruction depending on the identifier it contains. This operating mode makes it possible to emulate a MIMD processor on a SIMD architecture.
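The filtering scheme in this abstract can be sketched in software as follows. This is a minimal model under stated assumptions, not the patented circuit: the class names, the assignment of one flow per elementary processor, and the use of Python callables as operations are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Instruction:
    flow_id: int                 # identifier of the calculation flow the instruction belongs to
    op: Callable[[int], int]     # operation applied to the local memory cell (assumption)


@dataclass
class ElementaryProcessor:
    flow_id: int                 # flow this processor follows (hypothetical assignment)
    memory: int = 0              # stands in for the elementary memory cell

    def receive(self, instr: Instruction) -> bool:
        # Instruction filter: ignore instructions tagged with another flow.
        if instr.flow_id != self.flow_id:
            return False
        self.memory = instr.op(self.memory)
        return True


def broadcast_loop(processors: List[ElementaryProcessor],
                   program: List[Instruction],
                   iterations: int) -> None:
    """Central controller: broadcast every instruction to every processor, in a loop."""
    for _ in range(iterations):
        for instr in program:
            for pe in processors:
                pe.receive(instr)


# Two interleaved flows: flow 0 increments, flow 1 doubles.
processors = [ElementaryProcessor(flow_id=i % 2, memory=1) for i in range(4)]
program = [Instruction(0, lambda x: x + 1), Instruction(1, lambda x: x * 2)]
broadcast_loop(processors, program, iterations=3)
memories = [pe.memory for pe in processors]  # [4, 8, 4, 8]
```

Every processor sees the same broadcast stream, as in SIMD, yet processors on different flows end up running different programs, which is the MIMD-like behavior the abstract describes.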
On-chip memory block circuit
A memory block circuit can include a plurality of data interfaces, a switch connected to each data interface of the plurality of data interfaces, and a plurality of memory banks each coupled to the switch. Each memory bank can include a memory controller and a random access memory connected to the memory controller. The memory block circuit also includes a control interface and a management controller connected to the control interface and each memory bank of the plurality of memory banks. Each memory bank can be independently controlled by the management controller.
Priority-arbitrated access to a set of one or more computational engines
The present invention discloses a method for managing priority-arbitrated access to a set of one or more computational engines of a physical computing device. The method includes providing a multiplexer module and a network bus in the physical computing device, wherein the multiplexer module is connected to the network bus. The method further includes receiving, by the multiplexer module, a first data processing request from a driver and inferring, by the multiplexer module, a first priority class from the first data processing request according to at least one property of the first data processing request. The method further includes manipulating, by the multiplexer module, a priority according to which the physical computing device handles data associated with the first data processing request in relation to data associated with other data processing requests, wherein the priority is determined by the first priority class.
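One way to picture the multiplexer's behavior is a queue that infers a priority class from a property of each request and serves lower-numbered classes first. The sketch below is only an illustration: the class PriorityMultiplexer, the payload_bytes property, and the size-based classification rule are assumptions, not details from the patent.

```python
import heapq
from itertools import count
from typing import Callable, Optional


class PriorityMultiplexer:
    """Toy model: infer a priority class per request, then serve by class."""

    def __init__(self, classify: Callable[[dict], int]):
        self._classify = classify   # maps a request to a priority class (0 = highest)
        self._queue = []            # heap of (priority, arrival order, request)
        self._seq = count()         # tie-breaker that preserves arrival order

    def submit(self, request: dict) -> None:
        priority = self._classify(request)
        heapq.heappush(self._queue, (priority, next(self._seq), request))

    def next_request(self) -> Optional[dict]:
        if not self._queue:
            return None
        _, _, request = heapq.heappop(self._queue)
        return request


def classify_by_size(request: dict) -> int:
    # Hypothetical rule: small requests are treated as latency-sensitive.
    return 0 if request["payload_bytes"] <= 1024 else 1


mux = PriorityMultiplexer(classify_by_size)
mux.submit({"id": "a", "payload_bytes": 4096})
mux.submit({"id": "b", "payload_bytes": 128})
mux.submit({"id": "c", "payload_bytes": 8192})

served = []
while (req := mux.next_request()) is not None:
    served.append(req["id"])
# served == ["b", "a", "c"]
```

The arrival counter keeps requests of the same class in first-come order, so classification changes only the relative order between classes.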
Real-time data processing and storage apparatus
A stream processor is disclosed that includes a first-in-first-out (FIFO) memory, a calculation unit, and a cache. The FIFO receives current stream information, which carries a target stream number and target data. When the FIFO receives a read valid signal, it sends the target stream number and the target data to the calculation unit and sends the target stream number to the cache. The cache obtains, based on the target stream number, the old data that corresponds to that stream number and sends it to the calculation unit. The calculation unit then performs, based on the target data, a calculation on the old data to obtain new data, and sends the new data to the cache.
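The data path in this abstract (FIFO to calculation unit, cache lookup by stream number, result written back) can be modeled in a few lines. This is a minimal sketch under stated assumptions: the class StreamProcessor, the default old value of 0 for an unseen stream, and the additive combine function are hypothetical choices, not part of the disclosure.

```python
from collections import deque
from typing import Callable, Dict


class StreamProcessor:
    def __init__(self, combine: Callable[[int, int], int]):
        self.fifo = deque()               # holds (stream number, target data) pairs
        self.cache: Dict[int, int] = {}   # stream number -> latest data
        self.combine = combine            # stands in for the calculation unit

    def push(self, stream_number: int, data: int) -> None:
        """FIFO receives current stream information."""
        self.fifo.append((stream_number, data))

    def step(self) -> bool:
        """Model one read valid signal; return False when the FIFO is empty."""
        if not self.fifo:
            return False
        stream_number, target = self.fifo.popleft()
        old = self.cache.get(stream_number, 0)   # assumption: unseen streams start at 0
        new = self.combine(old, target)          # calculation on old data using target data
        self.cache[stream_number] = new          # new data is written back to the cache
        return True


# Example: keep a running sum per stream.
sp = StreamProcessor(combine=lambda old, target: old + target)
for stream_number, data in [(7, 10), (3, 1), (7, 5)]:
    sp.push(stream_number, data)
while sp.step():
    pass
# sp.cache == {7: 15, 3: 1}
```

Because the cache is keyed by stream number, interleaved items from different streams update independent state.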
Allocation of buffer interfaces for moving data, and related systems, methods and devices
A buffer interface, data transport method, and computing system are described in which a buffer interface may be configured for communicating data samples to and from frame buffers defined in a memory. The configurable buffer interfaces and frame buffers provide a flexible and scalable platform for use with many applications.