G06F15/17381

Multiple Independent On-chip Interconnect

In an embodiment, a system on a chip (SOC) comprises a semiconductor die on which circuitry is formed, wherein the circuitry comprises a plurality of agents and a plurality of network switches coupled to the plurality of agents. The plurality of network switches are interconnected to form a plurality of physically and logically independent networks. A first network of the plurality of physically and logically independent networks is constructed according to a first topology and a second network of the plurality of physically and logically independent networks is constructed according to a second topology that is different from the first topology. For example, the first topology may be a ring topology and the second topology may be a mesh topology. In an embodiment, coherency may be enforced on the first network and the second network may be a relaxed order network.
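
As a rough illustration of the ring and mesh topologies named in the abstract, the sketch below builds the link sets of two networks and tags every switch with its network name so that no switch or link is shared. The function names, the network names, and the sizes (a 6-switch ring, a 2 x 3 mesh) are assumptions chosen for the example and do not come from the patent.

```python
from itertools import product


def ring_links(n):
    """Undirected links of an n-switch ring, as sorted index pairs."""
    return {tuple(sorted((i, (i + 1) % n))) for i in range(n)}


def mesh_links(rows, cols):
    """Undirected links of a rows x cols 2D mesh, switches numbered row-major."""
    links = set()
    for r, c in product(range(rows), range(cols)):
        here = r * cols + c
        if c + 1 < cols:            # link to the right-hand neighbour
            links.add((here, here + 1))
        if r + 1 < rows:            # link to the neighbour below
            links.add((here, here + cols))
    return links


def tag(name, links):
    """Prefix every switch index with its network name (hypothetical naming)."""
    return {((name, a), (name, b)) for a, b in links}


# Assumed example: a coherent ring network and a relaxed-order mesh network.
networks = {
    "coherent": tag("coherent", ring_links(6)),
    "relaxed": tag("relaxed", mesh_links(2, 3)),
}

# Independent networks share no links.
shared = networks["coherent"] & networks["relaxed"]
```

Tagging switches per network is one simple way to model physical independence in software; the abstract itself does not prescribe any particular representation.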

Network computer with two embedded rings
11625356 · 2023-04-11

A computer comprising a plurality of interconnected processing nodes arranged in a configuration in which multiple layers of interconnected nodes are arranged along an axis, each layer comprising at least four processing nodes connected in a non-axial ring by at least one respective intralayer link between each pair of neighbouring processing nodes, wherein each of the at least four processing nodes in each layer is connected to a respective corresponding node in one or more adjacent layers by a respective interlayer link, the computer being programmed to provide in the configuration two embedded one-dimensional paths and to transmit data around each of the two embedded one-dimensional paths, each embedded one-dimensional path using all processing nodes of the computer in such a manner that the two embedded one-dimensional paths operate simultaneously without sharing links.

COMPUTER SYSTEM AND COMPUTER
20170371395 · 2017-12-28

A computer system comprising a plurality of computers, each of the plurality of computers including at least one processor chip, each processor chip including a plurality of processor cores and constructing a plurality of regions, each region constructed by at least one processor core, wherein each of the plurality of processor cores carries out calculation processing for executing a predetermined program and inter-core communication processing, which is communication between the plurality of processor cores, the computer system comprising: a regulation module which controls a voltage and a frequency that are supplied to each of the plurality of regions; and a determination module which determines a power mode of each of the plurality of regions and outputs an instruction to the regulation module.

SCALABLE COMPUTER ARCHITECTURAL FRAMEWORK FOR QUANTUM OR CLASSICAL DIGITAL COMPUTERS
20230205728 · 2023-06-29

Scalable computer architectural frameworks for quantum computers or for classical digital computers. According to one implementation, the framework includes a plurality of processing nodes, each processing node including at least three processing elements, and a plurality of couplings. The processing elements in a processing node (11) are connected in series forming a string comprising two end processing elements and at least one intermediate processing element. Each processing node is connected to each of the other processing nodes by means of only one external coupling. The intermediate processing elements are connected to processing elements of other processing nodes by the same number of external couplings, and the end processing elements are connected to processing elements of other processing nodes by the same number of external couplings or one more.

Execution engine for executing single assignment programs with affine dependencies

The execution engine is a new organization for a digital data processing apparatus, suitable for highly parallel execution of structured fine-grain parallel computations. The execution engine includes a memory for storing data and a domain flow program, a controller for requesting the domain flow program from the memory, and further for translating the program into programming information, a processor fabric for processing the domain flow programming information and a crossbar for sending tokens and the programming information to the processor fabric.

Multiple independent on-chip interconnect

In an embodiment, a system on a chip (SOC) comprises a semiconductor die on which circuitry is formed, wherein the circuitry comprises a plurality of agents and a plurality of network switches coupled to the plurality of agents. The plurality of network switches are interconnected to form a plurality of physically and logically independent networks. A first network of the plurality of physically and logically independent networks is constructed according to a first topology and a second network of the plurality of physically and logically independent networks is constructed according to a second topology that is different from the first topology. For example, the first topology may be a ring topology and the second topology may be a mesh topology. In an embodiment, coherency may be enforced on the first network and the second network may be a relaxed order network.

OPTIMIZING NOC PERFORMANCE USING CROSSBARS
20230176736 · 2023-06-08

A system including an array of processing elements, a plurality of periphery crossbars and a plurality of storage components is described. The array of processing elements is interconnected in a grid via a network on an integrated circuit. The periphery crossbars are connected to a plurality of edges of the array of processing elements. The storage components are connected to the periphery crossbars.

SYSTEMS AND METHODS FOR IMPLEMENTING A MACHINE PERCEPTION AND DENSE ALGORITHM INTEGRATED CIRCUIT AND ENABLING A FLOWING PROPAGATION OF DATA WITHIN THE INTEGRATED CIRCUIT

Systems and methods of propagating data within an integrated circuit include: identifying a coarse data propagation path for distinct subsets of data of an input dataset that includes: setting inter-core data movements for the distinct subsets of data, the inter-core data movements defining a predetermined propagation of a given subset of data between two or more of a plurality of cores of an integrated circuit array of the integrated circuit; identifying a granular data propagation path for each distinct subset of data that includes: setting intra-core data movements for each distinct subset of data, the intra-core data movements defining a predetermined propagation of the given subset of data within one or more of the plurality of cores of the integrated circuit array of the integrated circuit; and enabling a flow of the input dataset within the integrated circuit based on the coarse data propagation path and the granular data propagation path.

Methods, Network Node and Wireless Device for Handling Device Capabilities

A network node (200), a wireless device (202) and methods therein, for handling device capabilities. The wireless device (202) sends (2:1) a capability pointer to the network node (200), which capability pointer is associated with a capability configuration of the wireless device (202). The network node (200) then retrieves (2:2) said capability configuration based on the capability pointer from a capability database (204) or the like where a range of predefined capability configurations and associated capability pointers are maintained. The retrieved capability configuration can then be used in radio communication (2:3) with the wireless device (202).
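
The pointer-based lookup described above can be sketched as a keyed table on the network side. This is a minimal illustration only: the `CapabilityConfig` fields, the pointer values, and the function name are hypothetical and are not taken from the application.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CapabilityConfig:
    # Hypothetical fields; the abstract does not say what a configuration holds.
    max_bandwidth_mhz: int
    mimo_layers: int


# Stand-in for the capability database (204): pointer -> predefined configuration.
CAPABILITY_DB = {
    0x01: CapabilityConfig(max_bandwidth_mhz=20, mimo_layers=2),
    0x02: CapabilityConfig(max_bandwidth_mhz=100, mimo_layers=4),
}


def retrieve_capabilities(pointer, db=CAPABILITY_DB):
    """Resolve a capability pointer sent by the device into its configuration."""
    config = db.get(pointer)
    if config is None:
        raise KeyError(f"unknown capability pointer: {pointer:#x}")
    return config


# The device sends only the short pointer (step 2:1); the node looks it up (step 2:2).
config = retrieve_capabilities(0x02)
```

The point of the sketch is that the device transmits a short pointer and not the full capability set. How an unknown pointer should be handled is not stated in the abstract, so raising an error here is an assumption.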

Processor architecture

A processor having a functional slice architecture is divided into a plurality of functional units (“tiles”) organized into a plurality of slices. Each slice is configured to perform specific functions within the processor, which may include memory slices (MEM) for storing operand data, and arithmetic logic slices for performing operations on received operand data. The tiles of the processor are configured to stream operand data across a first dimension, and receive instructions across a second dimension orthogonal to the first dimension. The timing of data and instruction flows is configured such that corresponding data and instructions are received at each tile with a predetermined temporal relationship, allowing operand data to be transmitted between the slices of the processor without any accompanying metadata. Instead, each slice is able to determine what operations to perform on received data based upon the timing at which the data is received.
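
The idea of choosing an operation from arrival time alone, with no metadata attached to the operands, can be illustrated with a toy model. The `Tile` class, its cycle-indexed schedule, and the operand values are assumptions made for this sketch; the actual hardware scheduling is not described at this level in the abstract.

```python
import operator


class Tile:
    """Toy tile: the cycle at which operands arrive selects the operation."""

    def __init__(self, schedule):
        # schedule maps a cycle number to a binary operation (hypothetical format).
        self.schedule = schedule

    def receive(self, cycle, a, b):
        """Apply whatever operation is scheduled for this cycle, if any."""
        op = self.schedule.get(cycle)
        if op is None:
            return None  # nothing scheduled: operands are ignored in this model
        return op(a, b)


# Instructions were delivered ahead of time: add at cycle 0, multiply at cycle 1.
tile = Tile({0: operator.add, 1: operator.mul})

# The same bare operands produce different results depending only on timing.
results = [tile.receive(cycle, 3, 4) for cycle in range(3)]
```

Here `results` evaluates to `[7, 12, None]`: the operands carry no tag, and the tile relies entirely on the predetermined schedule to interpret them.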