Communication architecture for exchanging data between processing units

11216308 · 2022-01-04

Abstract

A communication architecture for exchanging data between processing units that are intended to operate in parallel comprises a communication system comprising: a set of interfaces, each intended to be linked to a processing unit; a set of sequencers able to define, for each processing unit, time intervals of access to a shared memory accessible by the processing units for writing and reading data, for the sequential arbitration of accesses to said memory; and a set of address managers able to allocate to each processing unit ports for access to the shared memory.

Claims

1. A communication architecture for exchanging data between processing units that are able to operate in parallel, wherein the communication architecture comprises a communication system comprising: a plurality of interfaces each individually linking a corresponding processing unit to an individual sequencer of a plurality of sequencers, the plurality of sequencers configured to define, for each processing unit, time intervals of access to a shared memory accessible by the processing units for writing and reading data, for the sequential arbitration of accesses to said memory, wherein the sequencers of the plurality of the sequencers are time-division multiple access (TDMA) sequencers, wherein the sizes of the time intervals of each slot are either the same or different, wherein the time intervals, for each period, are either loaded at startup or stored permanently in the sequencers, wherein the shared memory is accessible by a single interface within a particular time interval, and wherein each individual sequencer is couplable to one of the plurality of interfaces through programmable switches and without any intervening sequencers, and a plurality of address managers configured to allocate each processing unit ports for access to the shared memory.

2. The architecture as claimed in claim 1, furthermore comprising a configuration memory intended to communicate with the address managers and intended to receive access tables, for each processing unit, for accessing the access ports and the shared memory.

3. The architecture as claimed in claim 2, comprising an error detection means associated with at least one of said configuration memory and said shared memory.

4. The architecture as claimed in claim 1, in which the sequencers of the plurality of the sequencers comprise configurable means of selective linking of said sequencers to the interfaces.

5. The architecture as claimed in claim 1, in which the shared memory comprises, for each processing unit, at least two memory locations comprising at least one memory location for storing sampled messages and/or at least one memory location for storing data queues.

6. The architecture as claimed in claim 1, in which all or some of the constituent elements of the architecture are integrated into one and the same integrated circuit.

7. A system for processing multiple data, comprising a plurality of data processing units that are intended to operate in parallel, wherein said system comprises at least one communication architecture as claimed in claim 1, for exchanging data between the processing units.

8. The processing system as claimed in claim 7, comprising a first plurality of processing units and at least one second plurality of processing units, the units of each plurality of processing units being linked by means of at least one communication system each connected to one or more shared memories.

Description

DESCRIPTION OF THE DRAWINGS

(1) The foregoing aspects and many of the attendant advantages of this invention will become more readily appreciated as the same become better understood by reference to the following detailed description, when taken in conjunction with the accompanying drawings, wherein:

(2) FIG. 1 is a schematic diagram of a communication architecture in accordance with the invention;

(3) FIG. 2 illustrates the mechanism for sequencing the time-division multiple accesses implemented within the architecture of FIG. 1;

(4) FIGS. 3 and 4 illustrate the principle of the accesses to the shared memory by storage of sampled messages and of data queues; and

(5) FIG. 5 illustrates an exemplary implementation of a multiple data processing system using a communication architecture of FIG. 1.

DETAILED DESCRIPTION

(6) FIG. 1 schematically illustrates a communication architecture according to the invention, designated by the general numerical reference 1.

(7) In the envisaged exemplary embodiment, this architecture 1 is intended to ensure data communication between processing units u1, u2, . . . un, by way of a communication system 2 and of a shared memory 3. Said processing units u1, u2, . . . un are able to receive software applications or hardware applications.

(8) The shared memory 3 may be disposed inside the communication system 2, as represented in the embodiment of FIG. 1, or else outside said system 2.

(9) The architecture 1 of the invention may, for example, be used within a multicore data processing system comprising a set of processing units, for example processors networked to exchange data between the processors.

(10) The networked processors may be graphics calculation processors which cooperate with central processing units.

(11) Generally, the invention relates to a communication architecture allowing data exchange between processing units into which are loaded, or which are intended to receive, software applications or hardware applications operating in parallel.

(12) The communication system 2 comprises, successively, a set of interfaces 4-1, 4-2, . . . 4-n each linked to a single processing unit.

(13) These interfaces are adapted to communicate bidirectionally with the processing units.

(14) The interfaces 4-1, . . . 4-n are linked to sequencers 5-1, 5-2, . . . 5-m which receive the data streams arising from the interfaces and allocate to the processing units time windows for access to the shared memory 3.

(15) The sequencers 5-1, 5-2, . . . 5-m may be time-division multiple access (TDMA) sequencers. In this case, said sequencers 5-1, 5-2, . . . 5-m divide the available bandwidth for access to the shared memory 3 over time by defining time windows during which, for each sequencer 5-1, 5-2, . . . 5-m, a single one of the interfaces is authorized to transfer data to the shared memory 3 or to receive data originating from this memory 3.

(16) The system 2 is also endowed with a set of address managers 6-1, 6-2, . . . 6-m which each communicate with a corresponding sequencer and which control accesses to the memory 3.

(17) The managers 6-1, 6-2, . . . 6-m are supervised by a configuration memory 7 into which one or more access tables, for accessing the ports p1, p2, . . . , pm of the memory, can be loaded before any use or while in use.

(18) The loading of the access table or tables makes it possible to define the configurations allowing the managers 6-1, 6-2 . . . , 6-m to transfer the data during an access by a port p1, p2, . . . , pm from or to a specific physical location of the shared memory 3.

(19) In order to guarantee access to the shared memory 3, the number m of accessible ports corresponds to the number of address managers and to the number of sequencers in such a way that the sequencers can access the memory in parallel and simultaneously.

(20) The configuration memory 7 can be, for example, configured when starting up the processing system, during a configuration phase, in the course of which the ports to which each processing unit is intended to write are declared by the latter. These access tables are thereafter provided to the address managers to indicate the ports with which the processing units are authorized to communicate.
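
The role of the access tables described above can be sketched in a few lines of Python. This is an illustrative model only, not the implementation of the invention: the class name `AddressManager`, the `resolve` method, the table layout (`base`, `size`, `writer`, `readers`) and all addresses are assumptions introduced for this sketch.

```python
class AddressManager:
    """Hypothetical model of an address manager driven by an access table."""

    def __init__(self, access_table):
        # access_table maps a port name to its physical location and to the
        # units declared for it during the configuration phase.
        self.table = dict(access_table)

    def resolve(self, unit, port, mode):
        """Return (base_address, size) if `unit` may use `port` in `mode`."""
        entry = self.table.get(port)
        if entry is None:
            raise KeyError(f"port {port!r} is not configured")
        if mode == "w":
            allowed = unit == entry["writer"]
        elif mode == "r":
            allowed = unit in entry["readers"]
        else:
            raise ValueError("mode must be 'r' or 'w'")
        if not allowed:
            raise PermissionError(f"{unit} may not access {port} in mode {mode!r}")
        return entry["base"], entry["size"]


# Example table, as it might be loaded at startup (all values are illustrative).
manager = AddressManager({
    "p1": {"base": 0x0100, "size": 64, "writer": "u1", "readers": {"u2"}},
    "p2": {"base": 0x0140, "size": 32, "writer": "u2", "readers": {"u1"}},
})
```

In this sketch a processing unit names only a port; the physical address is returned by the manager and never has to be known by the unit itself.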

(21) The essential elements of the communication system and of the sequencers can be implemented using programmable hardware building blocks.

(22) Thus, accesses to the ports of the memory are programmed during the prior configuration phase and the configuration is stored in the configuration memory.

(23) It is indeed seen in FIG. 1 that the sequencers are capable of being linked to the set of interfaces 4-1, . . . 4-n.

(24) According to the embodiment represented in FIG. 1, the sequencers are endowed, at input, with configurable elements for selective linking to the interfaces.

(25) These configurable elements comprise here switches 8 which are selectively programmed so as to ensure the linking of a sequencer to a single interface.

(26) It is also seen in FIG. 1 that the configuration memory 7 and the shared memory 3 are each endowed with an error correction module 9, 10 which ensures detection and/or correction of errors in the data stored in the memories 3 and 7. This may for example entail detecting and correcting corrupted data bits on the basis of the value of the other bits of a word that is stored in memory.
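
As an illustration of correcting a corrupted bit from the value of the other bits of a stored word, the sketch below uses a Hamming(7,4) code. The description does not name a particular code, so the choice of Hamming(7,4), the function names and the bit layout are assumptions made purely for this example.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits [d1, d2, d3, d4] into a 7-bit codeword."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # parity over codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over codeword positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # parity over codeword positions 4, 5, 6, 7
    # Codeword positions 1..7 hold: p1 p2 d1 p3 d2 d3 d4
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(codeword):
    """Correct at most one flipped bit; return (data_bits, error_position)."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    # The syndrome is the 1-based position of the faulty bit, or 0 if none.
    position = s1 + 2 * s2 + 4 * s3
    if position:
        c[position - 1] ^= 1
    return [c[2], c[4], c[5], c[6]], position


stored = hamming74_encode([1, 0, 1, 1])
stored[4] ^= 1  # simulate a single corrupted bit in memory
recovered, faulty_position = hamming74_decode(stored)
```

A code of this kind corrects one flipped bit per word; detecting or correcting more errors would require a stronger code, which this sketch does not attempt to show.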

(27) According to a variant, only one or the other of the memories is endowed with such a module for correcting errors.

(28) With reference to FIG. 2, the TDMA sequencers 5-1, . . . , 5-m ensure arbitration of accesses to the shared memory 3 by using time intervals of access to the shared memory 3 for data writing and reading.

(29) According to the embodiment represented in FIG. 2, the accesses are performed based on periods P of access time intervals “slot1, . . . , slotn”. The sequencing used here is a sequencing based on fixed, but optionally inhomogeneous, access intervals. The size of the time intervals of each slot may be different. The configuration of these time intervals, for each period, is either loaded at startup or stored permanently in the sequencers.
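
The periodic sequencing with fixed but possibly unequal slots can be modelled as a simple lookup. The following is a minimal sketch under stated assumptions: the class `TdmaSchedule`, the method `owner_at`, the use of integer clock ticks and the example slot lengths are all hypothetical and are not taken from the description.

```python
from bisect import bisect_right
from itertools import accumulate


class TdmaSchedule:
    """One TDMA period made of slots that may have different lengths."""

    def __init__(self, slots):
        # slots: list of (interface_id, length_in_ticks), fixed for every period.
        if not slots or any(length <= 0 for _, length in slots):
            raise ValueError("each slot needs a positive length")
        self.owners = [owner for owner, _ in slots]
        # Cumulative end tick of each slot within one period.
        self.ends = list(accumulate(length for _, length in slots))
        self.period = self.ends[-1]

    def owner_at(self, tick):
        """Return the single interface allowed to access the memory at `tick`."""
        offset = tick % self.period
        return self.owners[bisect_right(self.ends, offset)]


# Illustrative configuration: three slots of 2, 5 and 1 ticks (period of 8).
schedule = TdmaSchedule([("4-1", 2), ("4-2", 5), ("4-3", 1)])
```

Because the table is fixed, the owner of any tick is fully determined in advance, which is what makes the arbitration sequential and predictable in this model.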

(30) According to the embodiment represented in FIGS. 3 and 4, access to the shared memory 3 is performed, for each port, using a memory location which may be a memory location for storing sampled messages, a memory location for storing data queues or another type of location.

(31) Thus, the memory 3 is split into sections U1, . . . , Un allocated respectively to the processing units. For each section U1, . . . , Un the shared memory 3 comprises one or more memory locations “portsampling 1”, . . . “portsampling p” for reading and writing sampled messages M. This reading and writing mechanism is generally referred to as “sampling”.

(32) For each section U1, . . . , Un the shared memory also comprises memory locations for storing data queues M1, M2 and M3, “port-queueing1”, . . . “port-queueingq”. These accesses are referred to as “queuing”.

(33) According to the access to the memory by sampled messages, the information contained in the port is updated at each new write. Accessing the port involves managing the size of the message M.

(34) As illustrated, an indicator I representative of the age of the data item stored in memory can be added to each memory location.
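
A sampling port with an age indicator might be modelled as follows. This is a hedged sketch: the class `SamplingPort`, the explicit `now` timestamps and the way the age is computed are assumptions chosen for illustration, not details given in the description.

```python
class SamplingPort:
    """Hypothetical sampling port: each write replaces the stored message."""

    def __init__(self, max_size):
        self.max_size = max_size  # largest message the port accepts, in bytes
        self.message = None
        self.written_at = None

    def write(self, message, now):
        if len(message) > self.max_size:
            raise ValueError("message exceeds the configured size")
        # The previous message is simply overwritten by the new one.
        self.message = bytes(message)
        self.written_at = now

    def read(self, now):
        """Return (message, age); reading does not consume the message."""
        if self.message is None:
            return None, None
        return self.message, now - self.written_at


port = SamplingPort(max_size=8)
port.write(b"abc", now=10)
port.write(b"xyz", now=15)  # replaces b"abc"
latest, age = port.read(now=18)
```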

(35) According to the memory access mechanism based on data queues, the first data item written by one of the processing units is the first data item read by the recipient processing unit receiving the data.

(36) Once the data item has been read, it is erased from the port. This reading and writing mechanism involves the use of a notion of message size and queue depth.
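
The queue-based mechanism, with its message size and queue depth, can be sketched as a bounded first-in first-out buffer. The class `QueuingPort` and, in particular, the choice to refuse a write when the queue is full are assumptions of this sketch; the description does not state an overflow policy.

```python
from collections import deque


class QueuingPort:
    """Hypothetical queuing port: first written, first read, erased on read."""

    def __init__(self, depth, max_size):
        self.depth = depth        # maximum number of queued messages
        self.max_size = max_size  # largest message accepted, in bytes
        self._queue = deque()

    def write(self, message):
        if len(message) > self.max_size:
            raise ValueError("message exceeds the configured size")
        if len(self._queue) >= self.depth:
            return False  # assumed policy: a full queue refuses the write
        self._queue.append(bytes(message))
        return True

    def read(self):
        """Remove and return the oldest message, or None if the queue is empty."""
        return self._queue.popleft() if self._queue else None


queue_port = QueuingPort(depth=2, max_size=4)
accepted = [queue_port.write(m) for m in (b"M1", b"M2", b"M3")]
```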

(37) In both cases, a single processing unit is authorized to write data to a port which is allocated to it. In the case of data queue based memory access, a single processing unit is authorized to read and to utilize the data written to the memory.

(38) By virtue of the use of the address managers, the processing units can access storage elements of the memory, such as Sampling port 1 to p and Queuing port 1 to q, without knowing the physical address at which the read and written message is stored.

(39) The address managers ensure the management of the ports.

(40) In the case of sample based writing, the address managers ensure the integrity of the messages.

(41) While a processing unit is accessing a message, that message is not altered by a new write, and the new write is nevertheless not blocked.

(42) Read-access to the storage element guarantees reading of the most recent data item.

(43) In the case of a read-access in progress, a new write does not alter the data item read by the processing unit in the course of reading. On the other hand, any new access is done on a new data item.

(44) In the case of data queue based access, the address managers guarantee that the read message is the first one available in the queue.

(45) During a read-access, the address managers return the data contained in the first message of the waiting queue.

(46) At the following access, the following message is provided to the recipient, without the latter having to worry about the physical addressing of the messages.

(47) In all cases, the address managers ensure that each of the storage elements possesses only a single provider and, in the case of queue based access, each of the storage elements possesses only a single consumer.
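
The single-provider and single-consumer rule can be expressed as a check over a port configuration. The sketch below is illustrative only: the function `validate_ports` and the dictionary fields `name`, `kind`, `writers` and `readers` are hypothetical and do not come from the description.

```python
def validate_ports(ports):
    """Return a list of rule violations found in a port configuration."""
    errors = []
    for port in ports:
        # Every storage element must have exactly one provider.
        if len(port["writers"]) != 1:
            errors.append(f"{port['name']}: exactly one writer required")
        # A queuing port must additionally have exactly one consumer.
        if port["kind"] == "queuing" and len(port["readers"]) != 1:
            errors.append(f"{port['name']}: exactly one reader required")
    return errors


# Illustrative configuration with one deliberate violation on "q2".
config = [
    {"name": "s1", "kind": "sampling", "writers": ["u1"], "readers": ["u2", "u3"]},
    {"name": "q1", "kind": "queuing", "writers": ["u2"], "readers": ["u1"]},
    {"name": "q2", "kind": "queuing", "writers": ["u1"], "readers": ["u2", "u3"]},
]
problems = validate_ports(config)
```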

(48) As regards the shared memory 3, the latter can in reality consist of one or more memories. It can also be integrated into the communication system or consist of a memory external to the latter.

(49) The communication system, or the main constituent elements of said system, can be integrated into one and the same integrated circuit.

(50) The invention which has just been described can be used to ensure data exchange between processing units of a multiple data processing system.

(51) It may for example entail ensuring communication between data processors that are partitioned and networked.

(52) An exemplary embodiment of such a system has been represented in FIG. 5.

(53) The system illustrated in FIG. 5 is a multicore system which here comprises two stages each comprising several processing units u1, . . . , un, and one or more additional sets of processing units u′1, . . . u′n. The system according to the invention can be a multicore system comprising three stages, four stages, or indeed more.

(54) The exchanges of data between the processing units of the first set and of the second set of processing units, and within each set of processing units, are performed by means of a communication system identical to the communication system 2 described previously, by way of one or more shared memories 3.

(55) According to the embodiment represented in FIG. 5, the communication systems 2 share the same shared memory 3.

(56) According to another variant, the communication systems 2 can each have one or more dedicated shared memories 3.

(57) While illustrative embodiments have been illustrated and described, it will be appreciated that various changes can be made therein without departing from the spirit and scope of the invention.