G06F15/7825

TILE-BASED RESULT BUFFERING IN MEMORY-COMPUTE SYSTEMS
20230067771 · 2023-03-02 ·

A reconfigurable compute fabric can include multiple nodes, and each node can include multiple tiles with respective processing and storage elements. A first tile in a first node can include a processor with a processor output and a first register network configured to receive information from the processor output and information from one or more of the multiple other tiles in the first node. In response to an output instruction and a delay instruction, the register network can provide an output signal to one of the multiple other tiles in the first node. Based on the output instruction, the output signal can include one or the other of the information from the processor output and the information from one or more of the multiple other tiles in the first node. A timing characteristic of the output signal can depend on the delay instruction.

MECHANISM TO CONTROL ORDER OF TASKS EXECUTION IN A SYSTEM-ON-CHIP (SoC) BY OBSERVING PACKETS IN A NETWORK-ON-CHIP (NoC)
20230111522 · 2023-04-13 · ·

A network-on-chip (NoC) provides packet-based communication between a plurality of initiator computing elements and a plurality of target computing elements. The NoC includes a plurality of observer processors upstream of and corresponding to the target computing elements. Each observer processor is configured to perform packet inspection and generate information in real-time about traffic load on its corresponding target computing element. An aggregator processor is configured to process the traffic load information from the observer processors to identify those target computing elements that are most heavily contended.

Overlay layer for network of processor cores

Methods and systems related to the efficient execution of complex computations by a multicore processor and the movement of data among the various processing cores in the multicore processor are disclosed. A multicore processor stack for the multicore processor can include a computation layer, for conducting computations using the processing cores in the multicore processor, with executable instructions for processing pipelines in the processing cores. The multicore processor stack can also include a network-on-chip layer, for connecting the processing cores in the multicore processor, with executable instructions for routers and network interface units in the multicore processor. The computation layer and the network-on-chip layer can be logically isolated by a network-on-chip overlay layer.

Interconnect for direct memory access controllers

A computing device is provided, including a plurality of memory devices, a plurality of direct memory access (DMA) controllers, and an on-chip interconnect. The on-chip interconnect may be configured to implement control logic to convey a read request from a primary DMA controller of the plurality of DMA controllers to a source memory device of the plurality of memory devices. The on-chip interconnect may be further configured to implement the control logic to convey a read response from the source memory device to the primary DMA controller and one or more secondary DMA controllers of the plurality of DMA controllers.

LOCALIZED NOC SWITCHING INTERCONNECT FOR HIGH BANDWIDTH INTERFACES
20220337923 · 2022-10-20 ·

Embodiments herein describe an integrated circuit that includes a NoC with at least two levels of switching: a sparse network and a non-blocking network. In one embodiment, the non-blocking network is a localized interconnect that provides an interface between the sparse network in the NoC and a memory system that requires additional bandwidth such as HBM2/3 or DDR5. Hardware elements connected to the NoC that do not need the additional benefits provided by the non-blocking network can connect solely to the sparse network. In this manner, the NoC provides a sparse network (which has a lower density of switching elements) for providing communication between lower bandwidth hardware elements and a localized non-blocking network for facilitating communication between the sparse network and higher bandwidth hardware elements.

DATA TRANSMISSION METHOD AND ELECTRONIC CHIP OF THE MANYCORE TYPE

A method for transmitting data between functions implemented on a first electronic chip of the manycore type. The first electronic chip includes a plurality of execution cores, the execution cores being grouped in clusters, the clusters being interconnected by at least two communication systems. The data transmission method includes the steps of: implementing a first function on a first cluster; implementing a second function on a second cluster, characterised in that the second function is also implemented on a third cluster distinct from the first and second clusters; and transmitting at least one data item between the first function and the second function.

SCALABLE ARRAY ARCHITECTURE FOR IN-MEMORY COMPUTING

Various embodiments comprise systems, methods, architectures, mechanisms and apparatus for providing programmable or pre-programmed in-memory computing (IMC) operations via an array of configurable IMC cores interconnected by a configurable on-chip network to support scalable execution and dataflow of an application mapped thereto.

Process-variability-based encryption for photonic communication architectures

The exemplified methods and systems provide hardware-circuit-level encryption for inter-core communication of photonic communication devices such as photonic network-on-chip devices. In some embodiments, the hardware-circuit level encryption uses authentication signatures that are based on process variation that inherently occur during the fabrication of the photonic communication device. The hardware level encryption can facilitate high bandwidth on-chip data transfers while preventing hardware-based trojans embedded in components of the photonic communication device such as PNoC devices or preventing external snooping devices from snooping data from the neighboring photonic signal transmission medium in a shared photonic signal transmission medium. In some embodiments, the hardware-circuit-level encryption is used for unicast/multicast traffic.

Enabling stateless accelerator designs shared across mutually-distrustful tenants
11651112 · 2023-05-16 · ·

An apparatus to facilitate enabling stateless accelerator designs shared across mutually-distrustful tenants is disclosed. The apparatus includes a fully-homomorphic encryption (FHE)-capable circuitry to establish a secure session with a trusted environment executing on a host device communicably coupled to the apparatus; generate, as part of establishing the secure session, per-tenant FHE keys for each tenant utilizing the FHE-capable circuitry, the per-tenant FHE keys utilized to encrypt tenant data provided to an FHE-capable compute kernel of the FHE-capable circuitry; process tenant data that is in an FHE-encrypted format encrypted with a per-tenant FHE key of the per-tenant FHE keys; and store the tenant data that is in the FHE-encrypted format encrypted with the per-tenant FHE key of the per-tenant FHE keys.

Hardware-software design flow with high-level synthesis for heterogeneous and programmable devices

Implementing an application within an integrated circuit (IC) having a data processing engine (DPE) array coupled to a Network-on-Chip (NoC) can include determining, using computer hardware, data transfer requirements for a software portion of the application intended to execute on the DPE array by simulating data traffic to the NoC as generated by the software portion, generating, using the computer hardware, a NoC routing solution for data paths of the application implemented by the NoC based, at least in part, on the data transfer requirements for the software portion. The software portion can be compiled for execution by different ones of a plurality of DPEs of the DPE array based, at least in part, on the NoC routing solution. Configuration data can be generated using the computer hardware. The configuration data, when loaded into the IC, configures the NoC to implement the NoC routing solution.