Patent classifications
G06F15/7825
Scalable Network-on-Chip for High-Bandwidth Memory
Described herein are memory controllers for integrated circuits that implement a network-on-chip (NoC) to couple processing cores of the integrated circuit to a memory device and provide them with access to memory. The NoC may be dedicated to servicing the memory controller and may include one or more routers to facilitate management of access to the memory controller.
BROADCAST ADAPTERS IN A NETWORK-ON-CHIP
A broadcast adapter in a network-on-chip (NoC) is used for broadcasting transactions in the form of packets from an initiator to multiple targets and for receiving responses from the targets that are combined and sent to the initiator. The transactions originate from an initiator and are sent, using the NoC, to broadcast adapters using a special range of addresses. The broadcast adapters receive the transactions from the initiator. The broadcast adapters duplicate the transactions and send the duplicated transactions to multiple targets. The targets send responses, which are transported back by the NoC to the corresponding initiator.
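The duplicate-and-combine flow described in this abstract can be sketched in Python as follows. This is a minimal illustration only: the address window, the `Packet` and `Response` types, the `BroadcastAdapter` class, and the policy that a broadcast succeeds only when every target succeeds are all assumptions made for the example, not details taken from the abstract.

```python
from dataclasses import dataclass, replace
from typing import Callable, Dict

# Hypothetical address window reserved for broadcast traffic (assumption).
BROADCAST_BASE = 0xF000_0000
BROADCAST_LIMIT = 0xF100_0000  # exclusive upper bound


@dataclass(frozen=True)
class Packet:
    initiator: str
    address: int
    payload: bytes


@dataclass(frozen=True)
class Response:
    source: str
    ok: bool


def is_broadcast(address: int) -> bool:
    """Return True if the address falls inside the assumed broadcast window."""
    return BROADCAST_BASE <= address < BROADCAST_LIMIT


class BroadcastAdapter:
    """Duplicates one transaction to every target and merges the responses."""

    def __init__(self, name: str, targets: Dict[str, Callable[[Packet], Response]]):
        self.name = name
        self.targets = targets

    def handle(self, packet: Packet) -> Response:
        if not is_broadcast(packet.address):
            raise ValueError("address is outside the broadcast window")
        # Duplicate the transaction once per target and collect the responses.
        responses = [deliver(replace(packet)) for deliver in self.targets.values()]
        # Combine the responses into one (assumed policy: all targets must succeed).
        return Response(source=self.name, ok=all(r.ok for r in responses))
```

The initiator sees a single response per broadcast, regardless of how many targets the adapter fans out to.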
DEVICE WITH DATA PROCESSING ENGINE ARRAY THAT ENABLES PARTIAL RECONFIGURATION
A device may include a processor system and an array of data processing engines (DPEs) communicatively coupled to the processor system. Each of the DPEs includes a core and a DPE interconnect. The processor system is configured to transmit configuration data to the array of DPEs, and each of the DPEs is independently configurable based on the configuration data received at the respective DPE via the DPE interconnect of the respective DPE. The array of DPEs enables, without modifying operation of a first kernel of a first subset of the DPEs of the array of DPEs, reconfiguration of a second subset of the DPEs of the array of DPEs.
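As a rough software analogy for the partial reconfiguration described in this abstract, the sketch below models an array in which each DPE holds its own configuration, so writing new configuration data to one subset leaves the other subset untouched. The `DPE` and `DPEArray` classes, the grid coordinates, and the kernel names are hypothetical and serve only to illustrate the idea.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Tuple

Coord = Tuple[int, int]  # (row, column) position in the array (assumed layout)


@dataclass
class DPE:
    kernel: Optional[str] = None  # name of the kernel currently loaded
    reconfig_count: int = 0       # how many times this DPE has been configured


class DPEArray:
    def __init__(self, rows: int, cols: int):
        self.dpes: Dict[Coord, DPE] = {
            (r, c): DPE() for r in range(rows) for c in range(cols)
        }

    def configure(self, subset: Iterable[Coord], kernel: str) -> None:
        """Load `kernel` onto the listed DPEs only; all others are not touched."""
        for coord in subset:
            dpe = self.dpes[coord]
            dpe.kernel = kernel
            dpe.reconfig_count += 1
```

Calling `configure` a second time on one row of a two-row array changes only that row's kernel and counters; the other row keeps running whatever was loaded first.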
MACHINE LEARNING MODEL UPDATES TO ML ACCELERATORS
Examples herein describe a peripheral I/O device with a hybrid gateway that permits the device to have both I/O and coherent domains. As a result, the compute resources in the coherent domain of the peripheral I/O device can communicate with the host in a similar manner as CPU-to-CPU communication in the host. The dual domains in the peripheral I/O device can be leveraged for machine learning (ML) applications. While an I/O device can be used as an ML accelerator, these accelerators previously only used an I/O domain. In the embodiments herein, compute resources can be split between the I/O domain and the coherent domain where an ML engine is in the I/O domain and an ML model is in the coherent domain. An advantage of doing so is that the ML model can be coherently updated using a reference ML model stored in the host.
NoC timing power estimating device and method thereof
A NoC timing power estimating method includes: estimating a plurality of transmission timings of a plurality of transmission units of at least one packet, the transmission timings indicating respective time points at which the transmission units enter/leave a plurality of passing elements of the NoC; based on the transmission timings of the transmission units, estimating respective circuit states and respective power states of the passing elements of the NoC, each circuit state indicating an operation state of the respective passing element and each power state being related to the respective circuit state; and based on the power states of the passing elements of the NoC, estimating power consumption of the NoC.
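The three steps in this abstract (timings, then states, then power) can be illustrated with a deliberately simplified Python sketch. It assumes only two circuit states per passing element (active while any transmission unit is inside it, idle otherwise) and one fixed power figure per state; the function names, the two-state model, and the units are assumptions for the example and not the patented method. With power in mW and time in ns, the result is in pJ.

```python
from typing import Dict, List, Tuple

Interval = Tuple[float, float]  # (enter_ns, leave_ns) for one transmission unit


def busy_time(intervals: List[Interval]) -> float:
    """Total time covered by the union of possibly overlapping intervals."""
    total = 0.0
    current_start = current_end = None
    for start, end in sorted(intervals):
        if current_end is None or start > current_end:
            # A gap: close the previous merged interval and start a new one.
            if current_end is not None:
                total += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlap: extend the merged interval.
            current_end = max(current_end, end)
    if current_end is not None:
        total += current_end - current_start
    return total


def estimate_noc_energy_pj(
    timings: Dict[str, List[Interval]],
    active_mw: float,
    idle_mw: float,
    horizon_ns: float,
) -> float:
    """Sum per-element energy from time spent in the active and idle states."""
    energy = 0.0
    for intervals in timings.values():
        active = busy_time(intervals)   # circuit state: active
        idle = horizon_ns - active      # circuit state: idle
        energy += active * active_mw + idle * idle_mw
    return energy
```

A finer-grained model would track more states per element (for example buffering versus switching), but the flow from timing to state to power stays the same.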
Method and apparatus for message interactive processing
Provided are a message interaction processing method and device. The method includes: applying to a Central Processing Unit (CPU) and/or a chip for a first buffer with a preset size; and performing message interaction between the CPU and the chip through the first buffer, wherein the first buffer is used for storing at least two messages. The disclosure solves the problem that frequent switching between states may cause high resource overhead and low message transmission efficiency under heavy message interaction between the CPU and the chip, and thereby markedly improves the message sending and receiving efficiency and the performance of network equipment.
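The benefit of a buffer that holds at least two messages can be illustrated with a small Python sketch that counts how many CPU-to-chip handoffs are needed when messages are batched. The `SharedMessageBuffer` class, the `send_all` helper, and the use of a handoff count as a stand-in for state switches are assumptions for illustration, not the mechanism claimed in the abstract.

```python
from typing import Callable, List


class SharedMessageBuffer:
    """Fixed-capacity buffer shared by the CPU side and the chip side."""

    def __init__(self, capacity: int):
        if capacity < 2:
            raise ValueError("buffer must hold at least two messages")
        self.capacity = capacity
        self._messages: List[bytes] = []

    def put(self, message: bytes) -> bool:
        """Store a message; return False if the buffer is already full."""
        if len(self._messages) >= self.capacity:
            return False
        self._messages.append(message)
        return True

    def drain(self) -> List[bytes]:
        """Hand over every buffered message at once and empty the buffer."""
        batch, self._messages = self._messages, []
        return batch


def send_all(
    messages: List[bytes],
    buffer: SharedMessageBuffer,
    deliver: Callable[[List[bytes]], None],
) -> int:
    """Send messages through the buffer; return the number of handoffs used."""
    handoffs = 0
    for message in messages:
        if not buffer.put(message):
            # Buffer full: hand the whole batch over, then retry this message.
            deliver(buffer.drain())
            handoffs += 1
            buffer.put(message)
    pending = buffer.drain()
    if pending:
        deliver(pending)
        handoffs += 1
    return handoffs
```

With a capacity of 4, ten messages need three handoffs instead of ten, which is the kind of reduction in switching the abstract points to.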
Loop execution control for a multi-threaded, self-scheduling reconfigurable computing fabric using a reenter queue
Representative apparatus, method, and system embodiments are disclosed for configurable computing. A representative system includes an interconnection network; a processor; and a plurality of configurable circuit clusters. Each configurable circuit cluster includes a plurality of configurable circuits arranged in an array; a synchronous network coupled to each configurable circuit of the array; and an asynchronous packet network coupled to each configurable circuit of the array. A representative configurable circuit includes a configurable computation circuit and a configuration memory having a first, instruction memory storing a plurality of data path configuration instructions to configure a data path of the configurable computation circuit; and a second, instruction and instruction index memory storing a plurality of spoke instructions and data path configuration instruction indices for selection of a master synchronous input, a current data path configuration instruction, and a next data path configuration instruction for a next configurable computation circuit.
SYSTEMS AND METHODS FOR MINIMUM-IMPLANT-AREA AWARE DETAILED PLACEMENT
The present disclosure is directed to systems and methods for a minimum-implant-area (MIA) aware detailed placement. In embodiments, the present disclosure clusters a violation cell with the cells having a same threshold voltage (Vt) and determines an optimal region for a cluster to minimize the wire-length. In further embodiments, an MIA-aware cell flipping technique minimizes a design area while satisfying the MIA constraint.
Avoiding deadlock with a fabric having multiple systems on chip
Devices and techniques to avoid deadlock in a multi-SOC fabric are described herein. An apparatus comprises a network on chip (NOC) interface to receive on a first virtual channel, from a processor, a memory request for a memory controller; a fabric interface configured to include the first virtual channel and a second virtual channel, connected to a scale fabric; and circuitry to: transmit the memory request toward the memory controller on the first virtual channel via the scale fabric; receive a response from the memory controller over the second virtual channel via the scale fabric; and relay the response to the processor over the second virtual channel.
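One common reason for carrying requests and responses on separate virtual channels is that a backlog of requests can then never block a response from making progress. The Python sketch below illustrates that property with one bounded queue per virtual channel; the `FabricInterface` class, the channel numbering, and the queue depth are hypothetical and are not taken from the abstract.

```python
from collections import deque
from typing import Any, Deque, Dict, Optional

REQUEST_VC = 0   # first virtual channel: memory requests
RESPONSE_VC = 1  # second virtual channel: responses from the memory controller


class FabricInterface:
    """Models a fabric port with an independent bounded queue per virtual channel."""

    def __init__(self, depth: int):
        self.depth = depth
        self.queues: Dict[int, Deque[Any]] = {
            REQUEST_VC: deque(),
            RESPONSE_VC: deque(),
        }

    def try_send(self, vc: int, item: Any) -> bool:
        """Enqueue on the given channel; return False if that channel is full."""
        queue = self.queues[vc]
        if len(queue) >= self.depth:
            return False
        queue.append(item)
        return True

    def receive(self, vc: int) -> Optional[Any]:
        """Dequeue the oldest item on the given channel, or None if it is empty."""
        queue = self.queues[vc]
        return queue.popleft() if queue else None
```

Even when the request channel is full and refuses new requests, a response can still be sent and received on the second channel, so outstanding requests can complete and drain.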
Electronic device and operation method of sleep mode thereof
An operation method of a sleep mode of an electronic device includes the following steps. A first sub-module of a first module sends a sleep command to a second sub-module of the first module and a third sub-module and a fourth sub-module of a second module, wherein the first sub-module includes first and second nodes, the second sub-module includes third and fourth nodes, the third sub-module includes fifth and sixth nodes, and the fourth sub-module includes seventh and eighth nodes. The second sub-module, the third sub-module and the fourth sub-module execute a sleep sequence in sequence to enter a sleep mode according to the sleep command. The first node sends the sleep command to the second node to execute the sleep sequence to enter the sleep mode. The first node sends the sleep command to the first node to execute the sleep sequence to enter the sleep mode.