G06F15/7825

ON-CHIP SYNCHRONOUS SELF-REPAIRING SYSTEM BASED ON LOW-FREQUENCY REFERENCE SIGNAL

The present disclosure discloses an on-chip synchronous self-repairing system based on a low-frequency reference signal. The system adopts a dual-input PLL stellate coupled structure or a dual-input PLL butterfly-shaped coupled structure. The transmitted reference signal is synchronized with the received reference signal so that the delay of the whole loop is an integral multiple of the period of the reference signal, which ensures synchronization of the local oscillation signals of the IC chips. A transmission wire based on an adjustable left-handed material is used as a delay wire to connect the dual-input PLLs, thereby achieving low loss and reducing the physical length of the delay wire. The system has the advantages of small area, low loss, strong adaptability and strict synchronization in various environments.
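
The loop-delay condition can be illustrated with a little arithmetic. The sketch below is not taken from the disclosure: it assumes the condition means "loop delay equals a whole number of reference periods", and the function name `delay_adjustment_ps` and the use of integer picoseconds are illustrative choices only.

```python
def delay_adjustment_ps(loop_delay_ps: int, ref_period_ps: int) -> int:
    """Extra delay (in ps) that brings the loop delay up to the next
    whole multiple of the reference period.

    Hypothetical helper: the abstract does not say how the adjustable
    delay wire is tuned, only that the total loop delay ends up as an
    integral multiple of the reference.
    """
    if ref_period_ps <= 0:
        raise ValueError("reference period must be positive")
    if loop_delay_ps < 0:
        raise ValueError("loop delay cannot be negative")
    # Python's % always returns a non-negative result for a positive
    # modulus, so this is the distance to the next multiple (0 if aligned).
    return (-loop_delay_ps) % ref_period_ps
```

With a 1 µs reference period (1,000,000 ps) and a measured loop delay of 2.3 µs, the helper reports that 0.7 µs of additional delay would align the loop to three reference periods.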

SAFE TRANSMIT PACKET PROCESSING FOR NETWORK FUNCTION VIRTUALIZATION APPLICATIONS
20170249162 · 2017-08-31 ·

A transmit packet processing system includes a NIC, a memory, one or more processors in communication with the memory, and a device driver. The memory has a first set and a second set of physical memory pages. The device driver is loaded in an OS and is configured to initialize the NIC. The device driver is further configured to assign a plurality of rings to specific physical memory pages. The plurality of rings includes transmit rings and receive rings. The transmit rings are utilized by an application in the application memory space. The transmit rings are assigned to the first set of physical memory pages which are writable by the application. The receive rings are assigned to the second set of physical memory pages which are not writable by the application. The device driver is further configured to initiate a mapping of the transmit rings into the application memory space.

NETWORK-ON-CHIP DATA PROCESSING METHOD AND DEVICE
20220035762 · 2022-02-03 ·

The present application relates to a network-on-chip data processing method. The method is applied to a network-on-chip processing system, the network-on-chip processing system is used for executing machine learning calculation, and the network-on-chip processing system comprises a storage device and a calculation device. The method comprises: accessing the storage device in the network-on-chip processing system by means of a first calculation device in the network-on-chip processing system, and obtaining first operation data; performing an operation on the first operation data by means of the first calculation device to obtain a first operation result; and sending the first operation result to a second calculation device in the network-on-chip processing system. According to the method, operation overhead can be reduced and data read/write efficiency can be improved.

HARDWARE-SOFTWARE DESIGN FLOW WITH HIGH-LEVEL SYNTHESIS FOR HETEROGENEOUS AND PROGRAMMABLE DEVICES

Implementing an application within an integrated circuit (IC) having a data processing engine (DPE) array coupled to a Network-on-Chip (NoC) can include determining, using computer hardware, data transfer requirements for a software portion of the application intended to execute on the DPE array by simulating data traffic to the NoC as generated by the software portion, and generating, using the computer hardware, a NoC routing solution for data paths of the application implemented by the NoC based, at least in part, on the data transfer requirements for the software portion. The software portion can be compiled for execution by different ones of a plurality of DPEs of the DPE array based, at least in part, on the NoC routing solution. Configuration data can be generated using the computer hardware. The configuration data, when loaded into the IC, configures the NoC to implement the NoC routing solution.

COMMUNICATION METHODS IN A NETWORK-ON-CHIP
20220311699 · 2022-09-29 ·

A method for multi-source communication for triple-modular redundancy (TMR), a method for branched communication, and a method for virtual buses are disclosed. A method includes a) transmitting, by at least two different source nodes in each case, at least two identical messages which contain at least flow control data, payload data and check data to at least one predetermined receive node, where the messages reach the receive node together at a predetermined time, b) combining, by the receive node, the messages received by the receive node into a combined message containing flow control data, payload data and check data, or comparing, by the receive node, the messages received by the receive node, and c) further processing of the combined message by the receive node, or further processing of at least one of the messages received by the receive node based on the comparison from step b).
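
Step b) can be pictured as a vote over redundant copies. The following is a minimal sketch under stated assumptions, not the disclosed method: the `Message` layout, the use of CRC-32 as the check data, and the strict-majority rule in `combine` are all hypothetical choices made for illustration.

```python
import zlib
from collections import Counter
from typing import NamedTuple, Optional, Sequence


class Message(NamedTuple):
    flow_control: int  # assumed flow control field
    payload: bytes
    check: int         # assumed check data: CRC-32 of the payload


def make_message(flow_control: int, payload: bytes) -> Message:
    """Build a message whose check field matches its payload."""
    return Message(flow_control, payload, zlib.crc32(payload))


def combine(messages: Sequence[Message]) -> Optional[Message]:
    """Compare copies that arrived together and return the majority copy.

    Copies with a failing check are discarded first. The winner must be
    backed by a strict majority of all copies received, otherwise None is
    returned so the caller can treat the slot as failed.
    """
    valid = [m for m in messages if zlib.crc32(m.payload) == m.check]
    if not valid:
        return None
    winner, votes = Counter(valid).most_common(1)[0]
    return winner if votes * 2 > len(messages) else None
```

For three source nodes, one corrupted copy is outvoted by the two that agree; if all three copies differ, no copy reaches a majority and the sketch returns `None`.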

SELF-SCHEDULING THREADS IN A PROGRAMMABLE ATOMIC UNIT
20220237020 · 2022-07-28 ·

Devices and techniques for self-scheduling threads in a programmable atomic unit are described herein. When it is determined that an instruction will not complete within a threshold prior to insertion into a pipeline of the processor, a thread identifier (ID) can be passed with the instruction. Here, the thread ID corresponds to a thread of the instruction. When a response to completion of the instruction is received that includes the thread ID, the thread is rescheduled using the thread ID in the response.
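
The thread ID round trip can be sketched in a few lines. This is an illustrative model only: the `Scheduler` class, the cycle-count estimate, the dictionary-shaped request and response, and the idea of parking the thread while it waits are assumptions, not details from the disclosure.

```python
from collections import deque


class Scheduler:
    """Toy model: long-running instructions carry their thread ID so the
    thread can be rescheduled when the response comes back."""

    def __init__(self, threshold_cycles: int) -> None:
        self.threshold_cycles = threshold_cycles
        self.ready = deque()   # thread IDs that may run again
        self.parked = set()    # thread IDs waiting on a response

    def issue(self, thread_id: int, estimated_cycles: int) -> dict:
        """Decide, before pipeline insertion, whether to attach the thread ID."""
        if estimated_cycles > self.threshold_cycles:
            self.parked.add(thread_id)
            return {"thread_id": thread_id, "cycles": estimated_cycles}
        # Short instruction: no thread ID travels with it.
        return {"thread_id": None, "cycles": estimated_cycles}

    def on_response(self, response: dict) -> None:
        """Reschedule the thread named in the response, if any."""
        thread_id = response.get("thread_id")
        if thread_id is not None and thread_id in self.parked:
            self.parked.remove(thread_id)
            self.ready.append(thread_id)
```

In this model, only instructions expected to exceed the threshold pay the cost of carrying a thread ID, and the response itself is what puts the thread back on the ready queue.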

Exchange Between Stacked Die

Two or more die are stacked together in a stacked integrated circuit device. Each of the processors on these die is able to communicate with other processors on its die by sending data over the switching fabric of its respective die. The mechanism for sending data between processors on the same die (i.e. intradie communication) is reused for sending data between processors on different die (i.e. interdie communication). The reuse of the mechanism is enabled by assigning each processor a vertical neighbour on its opposing die. Each processor has an interdie connection that connects it to the output exchange bus of its neighbour. A processor is able to borrow the output exchange bus of its neighbour by sending data along the output exchange bus of its neighbour.

SYSTEM AND METHOD FOR PERFORMING TRANSACTION AGGREGATION IN A NETWORK-ON-CHIP (NoC)
20210382847 · 2021-12-09 ·

Systems and methods are disclosed for aggregating identical requests sent to a target from multiple initiators through a network-on-chip (NoC). The requests are marked for aggregation. The NoC uses request aggregators (RAs) as aggregation points to aggregate the identical requests that are marked for aggregation. At an aggregation point, the identical requests are reduced to a single request. The single request is sent to the target. The process is repeated in a cascaded fashion through the NoC, possibly involving multiple request aggregators. When a response transaction from the target is received at the aggregation point closest to the target, the response transaction is duplicated and sent to every original requester, either directly or through other request aggregators, which further duplicate the already duplicated response transaction.
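
The reduce-then-duplicate behaviour of a single aggregation point can be sketched as follows. This is a minimal model under assumptions: the `RequestAggregator` class, the `(initiator, address, marked)` tuple format, and treating "identical" as "same address" are hypothetical, and each initiator is assumed to issue at most one request per batch.

```python
from typing import Callable, Dict, Hashable, Iterable, List, Tuple

# (initiator id, target address, marked-for-aggregation flag) -- assumed format
Request = Tuple[Hashable, int, bool]


class RequestAggregator:
    """Toy aggregation point: identical marked requests become one
    downstream request, and its response is copied to every requester."""

    def __init__(self, downstream: Callable[[int], object]) -> None:
        # downstream may be the target itself or another aggregator's entry
        # point, which is how cascading could be modelled.
        self.downstream = downstream

    def issue(self, requests: Iterable[Request]) -> Dict[Hashable, object]:
        groups: Dict[int, List[Hashable]] = {}
        responses: Dict[Hashable, object] = {}
        for initiator, address, marked in requests:
            if marked:
                groups.setdefault(address, []).append(initiator)
            else:
                # Unmarked requests pass through individually.
                responses[initiator] = self.downstream(address)
        for address, initiators in groups.items():
            data = self.downstream(address)   # single request to the target
            for initiator in initiators:      # duplicate the response
                responses[initiator] = data
        return responses
```

With three marked reads and one unmarked read of the same address, the target in this model sees two requests instead of four, while all four initiators still receive a response.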

Network-on-chip topology generation

The present disclosure provides a computer-based method and system for synthesizing a NoC. Physical data, device data, bridge data and traffic data are determined based on an input specification for the NoC. A virtual channel (VC) is assigned to each traffic flow. A head of line (HoL) conflict graph (HCG) is constructed based on the traffic data and the VC assignments. The HCG is modified based on the bridge data and the traffic data to generate a modified HCG. A plurality of traffic graphs (TGs) are constructed based on the physical data, the bridge data, the traffic data and the modified HCG. A candidate topology is generated for each TG, which includes the bridge ports, routers and connections. The candidate topologies are merged to create a merged candidate topology, and the routers within the merged candidate topology are merged to generate a final topology for the NoC.

Conditional Branching Control for a Multi-Threaded, Self-Scheduling Reconfigurable Computing Fabric
20210373890 · 2021-12-02 ·

Representative apparatus, method, and system embodiments are disclosed for configurable computing. A representative system includes an interconnection network; a processor; and a plurality of configurable circuit clusters. Each configurable circuit cluster includes a plurality of configurable circuits arranged in an array; a synchronous network coupled to each configurable circuit of the array; and an asynchronous packet network coupled to each configurable circuit of the array. A representative configurable circuit includes a configurable computation circuit and a configuration memory having a first, instruction memory storing a plurality of data path configuration instructions to configure a data path of the configurable computation circuit; and a second, instruction and instruction index memory storing a plurality of spoke instructions and data path configuration instruction indices for selection of a master synchronous input, a current data path configuration instruction, and a next data path configuration instruction for a next configurable computation circuit.