G06F15/17325

Cost-effective and self-adaptive operators for distributed data processing
11184293 · 2021-11-23

Disclosed herein are system, method, and computer program product embodiments for efficiently maintaining distributed processing of data between a source and a sink. An embodiment operates by maintaining a scheduler in communication with the source and the sink, wherein the source and the sink communicate over a network. The scheduler identifies a utilization of a resource unit of the source, the sink and/or the network meeting or exceeding a predetermined threshold. After identifying that the utilization meets or exceeds the predetermined threshold, the scheduler triggers an operator of the source and/or the sink. The operator modifies a processing of data by at least one of the source and the sink.
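The trigger logic in this abstract can be illustrated with a minimal sketch. The `Scheduler` class, its `register` and `observe` methods, and the use of a fraction in [0, 1] for utilization are all assumptions made for illustration; the abstract does not specify an API or what an operator does.

```python
from typing import Callable, Dict, List


class Scheduler:
    """Hypothetical scheduler that watches utilization and triggers operators."""

    def __init__(self, threshold: float) -> None:
        # Assumed: utilization is reported as a fraction in [0, 1].
        self.threshold = threshold
        self.operators: Dict[str, Callable[[], None]] = {}

    def register(self, endpoint: str, operator: Callable[[], None]) -> None:
        # endpoint is "source" or "sink"; the operator changes how it processes data.
        self.operators[endpoint] = operator

    def observe(self, utilization: Dict[str, float]) -> List[str]:
        """Trigger every registered operator if any reading meets or exceeds the threshold.

        Returns the names of the endpoints whose operators were triggered.
        """
        if not any(u >= self.threshold for u in utilization.values()):
            return []
        for operator in self.operators.values():
            operator()
        return sorted(self.operators)
```

An operator here could be anything that changes processing on the source or sink, for example halving a batch size when the network reading reaches the threshold.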

Allocation of buffer interfaces for moving data, and related systems, methods and devices

A buffer interface, data transport method, and computing system are described in which a buffer interface may be configured for communicating data samples to and from frame buffers defined in a memory. The configurable buffer interfaces and frame buffers provide a flexible and scalable platform for use with many applications.

Apparatus and method for performance state matching between source and target processors based on interprocessor interrupts
11775336 · 2023-10-03

Apparatus, method, and machine-readable medium to provide performance state matching between source and target processors based on inter-processor interrupts. An exemplary apparatus includes a target processor to execute a receiving task at a first performance level and a source processor to execute a sending task at a second performance level higher than the first performance level. The sending task is to store interrupt routing data indicating a pairing between the sending task and the receiving task into a memory location and that the sending task is to dispatch work to be processed by the receiving task. The apparatus further includes a performance management unit to detect the pairing between the sending task and the receiving task based on the interrupt routing data and responsively adjust the performance level of the target processor from the first performance level to the second performance level based, at least in part, on the pairing.

Synchronizing systems on a chip using a shared clock
11775005 · 2023-10-03

An electronic eyewear device includes first and second systems on a chip (SoCs) having independent time bases that are synchronized by generating a common clock signal from a clock generator of the first SoC and simultaneously applying the common clock signal to a first counter of the first SoC and a second counter of the second SoC whereby the first counter and the second counter count clock edges of the common clock. The clock counts are shared through an interface between the first SoC and the second SoC and compared to each other. When the clock counts are different, a clock count of the first counter or the second counter is adjusted to cause the clock counts to match each other. The adjusted clock count is synchronized to the respective clocks of the first and second SoCs, thus synchronizing the first and second SoCs to each other.
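The compare-and-adjust step can be sketched as follows. `EdgeCounter` and `synchronize` are hypothetical names, and adjusting the second counter (rather than the first) is an assumption; the abstract allows either counter to be adjusted.

```python
class EdgeCounter:
    """Hypothetical per-SoC counter of common-clock edges."""

    def __init__(self, start: int = 0) -> None:
        self.count = start

    def tick(self) -> None:
        # Called once per edge of the shared clock signal.
        self.count += 1


def synchronize(first: EdgeCounter, second: EdgeCounter) -> int:
    """Compare the two counts and, if they differ, adjust the second to match.

    Returns the adjustment applied to the second counter (0 if already equal).
    """
    delta = first.count - second.count
    if delta != 0:
        second.count += delta
    return delta
```

Because both counters advance on the same clock edges, any initial offset stays constant until it is corrected, so a single adjustment is enough in this simplified model.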

Networked computer with multiple embedded rings

According to an aspect of the invention, there is provided a computer comprising a plurality of interconnected processing nodes arranged in a configuration with multiple stacked layers. Each layer comprises four processing nodes connected by respective links between the processing nodes. In end layers of the stack, the four processing nodes are interconnected in a ring formation by two links between the nodes, the two links adapted to operate simultaneously. Processing nodes in the multiple stacked layers provide four faces, each face comprising multiple layers, each layer comprising a pair of processing nodes. The processing nodes are programmed to operate a configuration to transmit data around embedded one-dimensional rings, each ring formed by processing nodes in two opposing faces.

Reset of a Multi-Node System
20230281018 · 2023-09-07

Each of the nodes stores a number, referred to herein as a generation number, which is updated whenever the respective node undergoes a reset and restart from checkpoint. Since the nodes of the system participate in the same reset event, at most times, each generation number held by a node will be the same across the system. However, in some cases, when one node resets before another node, the generation numbers between those two nodes will differ. The data frames sent between the nodes each comprise a generation number of the sending node, which is checked by the recipient and only accepted if the generation number in the frames matches the generation number of the recipient node.
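A minimal sketch of the generation-number check, assuming the number is a simple integer incremented on each reset. The `Frame` and `Node` types are hypothetical and only model the accept/reject decision, not the reset or checkpoint mechanism itself.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Frame:
    generation: int  # generation number of the sending node
    payload: bytes


class Node:
    """Hypothetical node that tags outgoing frames and filters incoming ones."""

    def __init__(self) -> None:
        self.generation = 0
        self.inbox: List[bytes] = []

    def reset(self) -> None:
        # Assumed update rule: increment on every reset and restart from checkpoint.
        self.generation += 1

    def send(self, payload: bytes) -> Frame:
        return Frame(self.generation, payload)

    def receive(self, frame: Frame) -> bool:
        # Accept only frames whose generation matches the recipient's own.
        if frame.generation != self.generation:
            return False
        self.inbox.append(frame.payload)
        return True
```

In this model a frame sent by a node that has already reset is rejected by a peer that has not, and is accepted again once the peer resets as well.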

Data exchange pathways between pairs of processing units in columns in a computer

A time deterministic computer is architected so that exchange code compiled for one set of tiles, e.g., a column, can be reused on other sets. The computer comprises: a plurality of processing units each having an input interface with a set of input wires, and an output interface with a set of output wires; a switching fabric connected to each of the processing units by the respective set of output wires and connectable to each of the processing units by the respective input wires via switching circuitry controllable by its associated processing unit; the processing units arranged in columns, each column having a base processing unit proximate the switching fabric and multiple processing units one adjacent the other in respective positions in the direction of the column.

Multi-processor synchronization
20230350842 · 2023-11-02

A method of synchronizing system state data is provided. The method includes executing a first processor based on initial state data during an update cycle, wherein the initial state data represents a state of the system prior to initiation of the update cycle, detecting changes in state of the system by the first processor using sensors, the changes in state being added to a record of modified state data until a predefined progress position within the update cycle, designating the modified state data as next state data, based on reaching the predefined progress position within the update cycle, and transitioning from execution of the first processor based on the initial state data to execution of the first processor based on the next state data, based on completion of the update cycle.
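The update cycle described here resembles a double-buffered state record, sketched below. `UpdateCycle`, the numeric `progress` value in [0, 1], and the choice to carry changes detected after the cutoff into the following cycle are assumptions; the abstract does not say what happens to late changes.

```python
from typing import Any, Dict, Optional


class UpdateCycle:
    """Hypothetical double-buffered state record for one processor."""

    def __init__(self, initial_state: Dict[str, Any], cutoff: float) -> None:
        self.active = dict(initial_state)    # state the processor executes on
        self.modified = dict(initial_state)  # running record of detected changes
        self.next_state: Optional[Dict[str, Any]] = None
        self.cutoff = cutoff                 # predefined progress position

    def record(self, progress: float, key: str, value: Any) -> None:
        # On reaching the cutoff, freeze the record as the next state data.
        if progress >= self.cutoff and self.next_state is None:
            self.next_state = dict(self.modified)
        # Assumed: changes after the cutoff stay in the record for the next cycle.
        self.modified[key] = value

    def complete(self) -> None:
        # At the end of the cycle, switch execution to the next state data.
        if self.next_state is None:
            self.next_state = dict(self.modified)
        self.active = self.next_state
        self.next_state = None
```

The active state never changes mid-cycle, so the processor sees a consistent snapshot for the whole cycle regardless of when sensors report changes.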

Memory management in a multiple processor system

Methods and apparatus for memory management are described. In one example, this disclosure describes a method that includes executing, by a first processing unit, first work unit operations specified by a first work unit message, wherein execution of the first work unit operations includes accessing data from shared memory included within the computing system, modifying the data, and storing the modified data in a first cache associated with the first processing unit; identifying, by the computing system, a second work unit message that specifies second work unit operations that access the shared memory; updating, by the computing system, the shared memory by storing the modified data in the shared memory; receiving, by the computing system, an indication that updating the shared memory with the modified data is complete; and enabling a second processing unit to execute the second work unit operations.
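The ordering constraint in this method (write back the first unit's cached data, wait for completion, then enable the second unit) can be sketched as follows. `ProcessingUnit`, `run`, and `flush` are hypothetical names, and shared memory is modeled as a plain dictionary for illustration.

```python
from typing import Any, Callable, Dict


class ProcessingUnit:
    """Hypothetical processing unit with a private write-back cache."""

    def __init__(self, shared: Dict[str, Any]) -> None:
        self.shared = shared
        self.cache: Dict[str, Any] = {}

    def run(self, key: str, operation: Callable[[Any], Any]) -> None:
        # Read from the cache if present, else from shared memory; store the
        # modified value in the cache only.
        value = self.cache[key] if key in self.cache else self.shared[key]
        self.cache[key] = operation(value)

    def flush(self) -> bool:
        # Write cached data back to shared memory; the return value stands in
        # for the "update complete" indication.
        self.shared.update(self.cache)
        self.cache.clear()
        return True


def hand_off(first: ProcessingUnit, second: ProcessingUnit,
             key: str, operation: Callable[[Any], Any]) -> None:
    """Enable the second unit's work only after the first unit's flush completes."""
    if first.flush():
        second.run(key, operation)
```

Without the flush, the second unit would read the stale value from shared memory, which is the hazard the described ordering avoids.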

Synchronisation for a Multi-Tile Processing Unit

A multi-tile processing unit is described in which the tiles may be divided between two or more different external sync groups for performing barrier synchronisations. In this way, different sets of tiles of the same processing unit each sync with different sets of tiles external to that processing unit.