G06F15/17368

Scalable System on a Chip

An integrated circuit (IC) including a plurality of processor cores, a plurality of graphics processing units, a plurality of peripheral circuits, and a plurality of memory controllers is configured to support scaling of the system using a unified memory architecture. For example, the IC may include an interconnect fabric configured to provide communication between the memory controllers and the processor cores, graphics processing units, and peripheral circuits; and an off-chip interconnect coupled to the interconnect fabric and configured to couple the interconnect fabric to a corresponding interconnect fabric on another instance of the integrated circuit, wherein the interconnect fabric and the off-chip interconnect provide an interface that transparently connects the memory controllers, processor cores, graphics processing units, and peripheral circuits in either a single instance of the integrated circuit or two or more instances of the integrated circuit.
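A minimal sketch of the transparency claim, in Python (the names `MemoryController`, `build_fabric`, and `route_address`, and the 4-controller-per-die count, are illustrative assumptions, not from the abstract): software addresses memory the same way whether one die instance or two back the unified memory; only the controller count changes.

```python
class MemoryController:
    """One memory controller circuit on one die instance."""
    def __init__(self, die, index):
        self.die = die
        self.index = index

def build_fabric(num_dies, controllers_per_die=4):
    """Enumerate every memory controller across all coupled die instances."""
    return [MemoryController(d, i)
            for d in range(num_dies)
            for i in range(controllers_per_die)]

def route_address(addr, fabric):
    """Interleave physical addresses across all controllers in the system.

    A requesting agent (CPU core, GPU, peripheral) uses the same mapping
    regardless of how many die instances are coupled together.
    """
    stripe = (addr >> 12) % len(fabric)  # interleave at 4 KiB granularity
    return fabric[stripe]

single = build_fabric(num_dies=1)   # one IC instance
paired = build_fabric(num_dies=2)   # two instances coupled off-chip
mc = route_address(0x4000_3000, paired)
```

The point of the sketch is that `route_address` is the same call in both configurations; the off-chip interconnect only grows the pool it stripes across.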

INTEGRATED CIRCUIT, DATA PROCESSING DEVICE AND METHOD

An integrated circuit, a data processing device, and a method are provided. The integrated circuit includes a processor circuit and an accelerator circuit. The processor circuit includes a processor, a first data storage section, and a first data input/output interface. The accelerator circuit includes an accelerator and a second data input/output interface. The second data input/output interface is electrically connected to the first data input/output interface, so that the accelerator circuit can exchange data with the first data storage section.
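A rough sketch of the paired interfaces (all class names are hypothetical): the accelerator's interface owns no storage of its own, so reads and writes it issues are forwarded through the electrical connection to the processor's first data storage section.

```python
class DataStorage:
    """The first data storage section, inside the processor circuit."""
    def __init__(self, size):
        self.mem = bytearray(size)

class DataIOInterface:
    """One side of the connection; a peer without storage forwards to the other."""
    def __init__(self, storage=None):
        self.storage = storage      # only the processor side owns storage
        self.peer = None

    def connect(self, other):
        self.peer, other.peer = other, self

    def read(self, addr, n):
        target = self.storage or self.peer.storage
        return bytes(target.mem[addr:addr + n])

    def write(self, addr, data):
        target = self.storage or self.peer.storage
        target.mem[addr:addr + len(data)] = data

# Processor circuit: storage plus first interface; accelerator: second interface.
first_if = DataIOInterface(DataStorage(64))
second_if = DataIOInterface()
second_if.connect(first_if)

second_if.write(0, b"\x01\x02")   # accelerator writes into processor storage
```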

Execution or Write Mask Generation for Data Selection in a Multi-Threaded, Self-Scheduling Reconfigurable Computing Fabric
20210406015 · 2021-12-30

Representative apparatus, method, and system embodiments are disclosed for configurable computing. A representative system includes an asynchronous packet network having a plurality of data transmission lines forming a data path transmitting operand data; a synchronous mesh communication network; a plurality of configurable circuits arranged in an array, each configurable circuit of the plurality of configurable circuits coupled to the asynchronous packet network and to the synchronous mesh communication network, each configurable circuit of the plurality of configurable circuits adapted to perform a plurality of computations; each configurable circuit of the plurality of configurable circuits comprising: a memory storing operand data; and an execution or write mask generator adapted to generate an execution mask or a write mask identifying valid bits or bytes transmitted on the data path or stored in the memory for a current or next computation.
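The mask idea can be sketched briefly (the function names and the 8-byte operand width are assumptions, not from the claims): one bit per byte marks which bytes of an operand are valid, and a masked write merges only those bytes, leaving the rest of the stored word untouched.

```python
WORD_BYTES = 8  # assumed operand width

def make_write_mask(valid_bytes):
    """Set bit i of the mask when byte i is valid for this computation."""
    mask = 0
    for i in valid_bytes:
        mask |= 1 << i
    return mask

def apply_write_mask(old_word, new_word, mask):
    """Merge: take each byte from new_word only where its mask bit is set."""
    out = 0
    for i in range(WORD_BYTES):
        byte_lane = 0xFF << (8 * i)
        src = new_word if (mask >> i) & 1 else old_word
        out |= src & byte_lane
    return out

mask = make_write_mask([0, 1])  # only the two low bytes are valid
merged = apply_write_mask(0xFFFF_FFFF_FFFF_FFFF, 0x1234, mask)
```

Here the two low bytes come from the new operand and the six high bytes are preserved, which is exactly the selection an execution or write mask expresses.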

Platooning of computational resources in automated vehicles networks
11354158 · 2022-06-07

Novel techniques are described for platooning of computational resources in automated vehicle networks. An on-board computational processor of an automated vehicle typically performs a large number of computational tasks, and some of those computational tasks can be computationally intensive. Some such tasks, referred to as platoonable tasks herein, are well-suited for parallel processing by multiple processors. Embodiments can detect one or more on-board computational processors in one or more automated vehicles that are likely, during the time window in which the platoonable task will be executed, to have available computational resources and to be traveling along respective paths that are close enough to each other to allow for ad hoc network communications to be established between the processors. In response to detecting such cases, embodiments can schedule and instruct shared execution of the platoonable tasks by the multiple processors via the ad hoc network.
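One way to read the detection-and-scheduling step is as candidate selection, sketched here in toy form (the `Processor` fields, the radio range, and the greedy policy are all assumptions, not from the patent): pick processors predicted to be both idle enough during the task's window and close enough for ad hoc networking.

```python
from dataclasses import dataclass

@dataclass
class Processor:
    vehicle_id: str
    free_capacity: float          # fraction of compute available in the window
    predicted_distance_m: float   # max separation from requester in the window

def select_platoon(candidates, needed_capacity, radio_range_m=300.0):
    """Greedily pick the nearest in-range processors until demand is covered.

    Returns the selected members, or an empty list when the platoonable
    task cannot be covered and should run locally instead.
    """
    members, total = [], 0.0
    for p in sorted(candidates, key=lambda p: p.predicted_distance_m):
        if p.predicted_distance_m <= radio_range_m and p.free_capacity > 0:
            members.append(p)
            total += p.free_capacity
            if total >= needed_capacity:
                break
    return members if total >= needed_capacity else []
```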

A HIGH-PERFORMANCE COMPUTING SYSTEM
20230325315 · 2023-10-12

A high-performance computing system having at least one computational group of at least one core, each computational group being associated with a computational memory, arranged to form a computational resource used for performing computations, and a concierge module with at least one concierge group of at least one core associated with a concierge memory, arranged to form a reserved support resource used for performing support functions for the computational resource. The computational resource is coupled to the concierge module through a cache-coherent interconnection that maintains uniformity of shared data stored in the computational and concierge memories, so that the high-performance computing system is functionally transparent to software code that runs on the computational group. The cores in the computational and concierge groups are interchangeable from the software's perspective: any core may perform either computations or support functions, and any core may use either the computational or the concierge memory.
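The interchangeability property might be sketched as follows (hypothetical names; the real system enforces transparency in hardware via cache coherence, whereas this toy simply ignores group membership when dispatching): any idle core, computational or concierge, may take any task.

```python
class Core:
    def __init__(self, name):
        self.name = name
        self.busy = False

def dispatch(task_kind, comp_cores, concierge_cores):
    """Assign a task to the first idle core in either group.

    Because shared data is kept coherent across both memories, software
    need not care which group the chosen core belongs to.
    """
    for core in comp_cores + concierge_cores:
        if not core.busy:
            core.busy = True
            return core, task_kind
    return None, task_kind
```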

Execution or write mask generation for data selection in a multi-threaded, self-scheduling reconfigurable computing fabric
11782710 · 2023-10-10

Representative apparatus, method, and system embodiments are disclosed for configurable computing. A representative system includes an asynchronous packet network having a plurality of data transmission lines forming a data path transmitting operand data; a synchronous mesh communication network; a plurality of configurable circuits arranged in an array, each configurable circuit of the plurality of configurable circuits coupled to the asynchronous packet network and to the synchronous mesh communication network, each configurable circuit of the plurality of configurable circuits adapted to perform a plurality of computations; each configurable circuit of the plurality of configurable circuits comprising: a memory storing operand data; and an execution or write mask generator adapted to generate an execution mask or a write mask identifying valid bits or bytes transmitted on the data path or stored in the memory for a current or next computation.

Scalable System on a Chip

A system including a plurality of processor cores, a plurality of graphics processing units, a plurality of peripheral circuits, and a plurality of memory controllers is configured to support scaling of the system using a unified memory architecture.

PLATOONING OF COMPUTATIONAL RESOURCES IN AUTOMATED VEHICLES NETWORKS
20210334136 · 2021-10-28

Novel techniques are described for platooning of computational resources in automated vehicle networks. An on-board computational processor of an automated vehicle typically performs a large number of computational tasks, and some of those computational tasks can be computationally intensive. Some such tasks, referred to as platoonable tasks herein, are well-suited for parallel processing by multiple processors. Embodiments can detect one or more on-board computational processors in one or more automated vehicles that are likely, during the time window in which the platoonable task will be executed, to have available computational resources and to be traveling along respective paths that are close enough to each other to allow for ad hoc network communications to be established between the processors. In response to detecting such cases, embodiments can schedule and instruct shared execution of the platoonable tasks by the multiple processors via the ad hoc network.