ENCODING OF FAULT SCENARIOS OF A MANYCORE PROCESSOR

Abstract

A method implemented by computer for compressing and decompressing all the fault scenarios of a processor comprising computation units interconnected by a communication network having topology symmetries, each fault scenario corresponding to the number and the location of one or more failing computation units and the method comprises the steps of reception or determination of one or more topology symmetries; determination of the equivalent scenarios by means of said topology symmetries; each of the fault equivalence classes being associated with a resource allocation solution in terms of mapping and routing. Different developments include the determination or the application of an inference engine, of identifiers associated with the fault scenarios, of combinatorial exploration techniques, of compression rates, of reconfiguration of the processor and of classification of the processor in a range. A program product and associated systems are also described.

Claims

1. A method implemented by computer for compressing all the fault scenarios of a processor comprising computation units interconnected by a communication network having topology symmetries, each fault scenario corresponding to the number and the location of one or more failing computation units and the method comprising the steps of: reception or determination of one or more topology symmetries; determination of the equivalent scenarios by means of said topology symmetries; each of the fault equivalence classes being associated with a resource allocation solution determining a specific mapping of the tasks of the applications on the computation units and a specific routing of the data exchanges over the communication network.

2. The method as claimed in claim 1, the determination of the equivalent scenarios being performed by application of an inference engine.

3. The method as claimed in claim 1, the determination of the equivalent scenarios comprising the association of a set of identifiers with each fault scenario.

4. The method as claimed in claim 3, the determination of the equivalent scenarios comprising the association of a unique identifier with each fault scenario.

5. The method as claimed in claim 4, the unique identifier being obtained by concatenation of the character strings forming said identifiers.

6. The method as claimed in claim 1, the symmetries comprising one or more axial rotations about an axis comprising one or more computation units, according to angles 90/+90/+180, and one or more vertical or horizontal displacements or shifts.

7. The method as claimed in claim 1, the equivalent fault scenarios being determined by combinatorial exploration, said combinatorial exploration comprising the construction of one or more equivalence class trees.

8. The method as claimed in claim 1, further comprising the creation of a file specifying the allocation of resources.

9. The method as claimed in claim 1, further comprising the determination of a compression rate of the fault scenario space, the compression rate being associated with the type of architecture of the processor.

10. The method as claimed in claim 1 for decompressing all the fault scenarios of a processor, after the startup of the processor, further comprising: the determination of one or more faults associated with one or more computation units on starting up the processor; the characterization of the corresponding fault scenario; the identification of the fault equivalence class and of the computation resource allocation solution associated with said fault equivalence class.

11. The method as claimed in claim 10, further comprising the reconfiguration of the processor started up, the reconfiguration comprising the application to the processor of the computation resource allocation solution.

12. The method as claimed in claim 11, the reconfiguration being performed at the moment of, or after, the occurrence of one or more faults of one or more computation units of the processor in order to continue the execution of the application on the processor.

13. The method as claimed in claim 11, the reconfiguration being performed before the occurrence of one or more faults of one or more computation units of the processor preventively, one or more faults being simulated or anticipated.

14. The method as claimed in claim 1, further comprising the classification of the processor in a range grouping together processors with an identical number of failing computation units.

15. The method as claimed in claim 1, a failing computation unit being associated with a failure rate or confidence of execution rate, said failure being total or partial.

16. The method as claimed in claim 1, further comprising the identification of one or more symmetrical links from the determination or the reception of one or more topology symmetries.

17. A computer program product, said computer program comprising code instructions making it possible to perform the steps of the method as claimed in claim 1, when said program is run on a computer.

18. A system for compressing or decompressing the fault scenario space of a processor, the system comprising means for implementing the steps of the method as claimed in claim 1.

Description

DESCRIPTION OF THE FIGURES

[0059] Different aspects and advantages of the invention will become apparent from the description of a preferred, but nonlimiting, mode of implementation of the invention, with reference to the figures below:

[0060] FIG. 1 schematically illustrates a manycore processor with 9 clusters;

[0061] FIG. 2 illustrates an example of relabeling after a symmetry operation;

[0062] FIG. 3 illustrates the correlation between a fault scenario mapping/routing solution and an equivalent fault scenario;

[0063] FIG. 4 illustrates two examples of scenarios with three faults;

[0064] FIGS. 5A and 5B illustrate the concept of symmetrical distance;

[0065] FIG. 6 illustrates the equivalence of two examples of scenarios and the computation of identifiers;

[0066] FIGS. 7A, 7B and 7C illustrate different aspects of the method.

DETAILED DESCRIPTION OF THE INVENTION

[0067] A multicore microprocessor is a processor that has a number of physical cores which work in parallel. A physical core is a set of circuits capable of executing programs autonomously. All the functionalities necessary to the execution of a program are present in these cores: ordinal counter, registers, computation units, etc. The term multicore is generally employed to describe a processor composed of at least two cores (or computation units) etched within the same chip. These computation units can be clustered (or not). The computation units are interconnected by a communication network (Network on Chip or NoC).

[0068] The recent technological advances in the design and creation of circuits now make it possible to produce processors containing a large number of generic cores (from a few tens to several hundreds) with complex memory hierarchies: these are no longer called multicore but manycore systems.

[0069] A certain number of problems are due to the increase in the number of cores. For example, this increase brings about a situation in which the interconnections become the bottleneck in the quest for performance. To the different execution constraints are thus added particular parallel programming techniques to exploit all the potential of the circuit (run time layer development, performance analysis, distributed control, compilation, management of communications on a network on chip, etc).

[0070] The useand therefore the marketingof a degraded chip can be obtained by different means and/or methods.

[0071] A first approach consists in performing an allocation (mapping and routing) of an application (software) to the computation and communication resources of a degraded manycore processor. Generally, these operations are done statically and offline (that is to say, when the chip is not powered up). These techniques will also assume a total and entire knowledge of the architecture of the processors on which the software application will be deployed. For example, one approach to make it possible to sell these degraded processors may consist in defining computation power ranges. A given range then gathers together all the processors which guarantee the availability of a set overall minimum of computation resources. Thus, by following this marketing strategy and because of the absence of differentiation according to the location of the healthy resources, processors corresponding to different architectures may be found within a same range. For example, in the example of a manycore with an architecture providing 16 computation clusters for which the range of degraded processors derived from this manycore comprises only 15 functional clusters (a single cluster being faulty), there are then 16 possible architectures in this range. Each architecture corresponds to the case of one of the 16 clusters in failure state. There then arises the problem of the deployment of a same application on manycore processors with the same global computation capacity but with an architecture that differs from one processor to another. In effect, the information is known only very late, when the processor is switched on. Consequently, this mapping/routing technique is limited.

[0072] It is possible to provide a solution that makes it possible to reconfigure the deployment of the application on the chip according to the fault scenario thereof. In the absence of such a solution, the degraded processors cannot be fully exploited and cannot be marketed as is. Most of the current multi/manycore processors use shared buses (for example the ClearSpeed CSX manycore) or certain types of NoCs of RING type. These circuits tend to eliminate the problem of reconfiguration by eliminating the combinatorial explosion of possible fault scenarios thereof. In fact, the patent literature does not as yet even mention this technical problem (and a fortiori does not propose any technical solution). However, these circuits with shares buses or NoCs of RING type are not scalable and the next generations of manycore processors will probably have to dispense with this type of circuit to adopt NoCs of 2D Torus, 3D Torus or other such types for which this technical problem will be acute.

[0073] Another possible approach consists in modifying the hardware, for example with the implementation of special routers. This solution may complement the examples disclosed herein.

[0074] Regarding the exploration of the space of the architectures for solving the problem of deployment, a first solution consists in exploring all the space of the architectures of a range and reiterating the mapping/routing algorithms for each architecture (fault scenario) and saving the solutions obtained. When a chip is started up, it is then possible to detect what its architecture is and use the corresponding solution. By construction, this technique is exact (i.e. optimal) and provides the guarantee, in the compilation phase, of the possibility (or not) of deploying the application whatever the fault scenario within a range. However, this technique is generally costly in execution time because the number of configurations to be explored is very great. A problem of storage of the solutions also arises because of the huge size of the binary which results therefrom. In effect, with each fault scenario requiring its own binary, the size of the final binary could be very great if all the reconfigurations for the specified range had to be stored.

[0075] By contrast to the techniques previously mentioned, certain embodiments of the invention make it possible to advantageously process the faults in terms of computation resources. Certain embodiments of the invention in effect consider processors with an NoC that is entirely functional, that is to say without router (or link) fault. The communication links and the routers are generally less subject to faults.

[0076] It is stressed that the techniques disclosed hereinbelow apply to the mapping and to the routing (both). In order to avoid additional notations which would complicate the description, only the mapping will be mentioned hereinbelow, but the disclosure will apply equally to the routing (routing links).

[0077] The term resource allocation consists of mapping and routing operations that make it possible to allocate the computation and communication resources necessary to execute the different tasks of an application and have them communicate.

[0078] The term mapping therefore designates the operation which assigns each of the tasks of an application to execution media (i.e. computation units, e.g. core, processor, cluster, server, etc.), which implicitly presuppose memory means. The tasks assigned to one or more media will then be executed solely thereby. The scheduling at a medium level will determine the order and the times for which the tasks access these shared resources (which are allocated to them).

[0079] The term routing designates the operation which computes a path made up of communication links which link the computation units (Bus, NoC, local area network, internet, etc.) in order to ensure the transfer of data between tasks which communicate with one another and which are executed on different media.

[0080] FIG. 1 schematically illustrates a manycore processor 100 with 9 clusters 110 (or cores or computation units) of 33 2D Torus type. The clusters are numbered from 1 to 9 and are interconnected by an NoC or interconnections 120 (represented by the lines in the figure). Most manycore processors comprise symmetries. These symmetries make it possible to define operations which transform the initial architecture into another which is totally identical to it (the architecture is said to be equivalent) in terms of form and of functionalities. Through the architecture of its NoC, a 2D Torus homogeneous manycore processor comprises several symmetries. Some symmetries are shared with the underlying mesh (checkerboard) architecture. In the example of FIG. 1, the following can for example be encountered: axial rotations in the plane of the mesh (relative to a straight line which passes through it at right angles at the level of the cluster 5, of)90/+90/+180; a rotation of 180 in the space around an axis which passes through the clusters 2, 5 and 8; another rotation of 180 in the space around an axis which passes through the clusters 4, 5 and 6; a rotation of 180 in the space around an axis which passes through the clusters 1, 5 and 9; a rotation of 180 in the space around an axis which passes through the clusters 3, 5 and 7. The links on the edges of the 2D Torus give rise to additional symmetries: a vertical displacement or shift of the lines of clusters by one pitch upward or downward; and a horizontal displacement or shift of the columns of clusters by one pitch to the left or to the right. These last two symmetries sometimes justify the adoption of a 2D Torus architecture rather than an asymmetrical mesh architecture on the edges (mapping and routing are highly dependent thereon).

[0081] FIGS. 2A, 2B and 2C illustrate the labeling before and after a symmetry operation. FIG. 2A shows the labels before the symmetry (of rotation). FIG. 2B shows the labels after the rotation. FIG. 3B shows the labels after relabeling.

[0082] Each symmetry operation has associated with a relabeling function which makes it possible to restore the initial order of the labels of the architecture. For example, the cluster relabeling function [1,4,7,2,5,8,3,6,9]->[1,2,3,4,5,6,7,8,9] corresponds to the rotation of 180 in the space around the axis which passes through the clusters 1, 5 and 9. The same applies for the interconnects whose labels are not mentioned or represented in the interests of simplification.

[0083] A fault scenario is defined by the number and the location of the computation units (i.e. of the clusters in the present example) which are defective. For example, in a scenario with three faults, a scenario for which only the three clusters i, j and k are faulty will be denoted a scenario (i,j,k). The order of formulation of these faults is unimportant, i.e. the scenarios (i,j,k), (i,k,j), (j,i,k), (j,k,i), (k,i,j) or (k,j,i) are the same.

[0084] FIGS. 3A, 3B and 3C make it possible to show how the symmetry can be used to compress the space of the fault scenarios. FIG. 3A illustrates a simplified example of a scenario with two faults (1,2). FIG. 3B shows the scenario (1,2) after rotation. FIG. 3C shows the relabeling after the rotation.

[0085] By applying a symmetry operation (FIG. 3B) followed by a relabeling (FIG. 3C), it is illustrated that the two scenarios (1,2) and (1,4) are the two facets of a same situation (which is that of two adjacent faulty clusters). By combining even more symmetry operations like the horizontal and vertical shift, similar conclusions can be drawn for scenarios like (2,3), (1,3), (1,7), (3,6), (3,9), etc. All these scenarios are equivalent to the scenario (1,2) and form what can be called an equivalence class of fault scenarios. An equivalence class contains exclusively scenarios which can all be brought, by sequences of symmetry operations, to a same and unique fault scenario (which is therefore equivalent to them). The latter is thus designated as the representative of the equivalence class. Given the property of transitivity of an equivalence relationship, any scenario can be designated as representative of the class to which it belongs. In the present example, the scenario can be chosen and the equivalence class can then be designated by the class (1,2).

[0086] Defining equivalence classes presents the advantage of not having to solve the mapping/routing problem and of saving the solution only for the single and unique representative of each class. By having available the sequence of symmetry operations, to bring a scenario to the representative of its class, it is possible to use this same sequence to reverse the order of application of the symmetry operations in order to relabel the solution of mapping/routing of the representative as a solution of exactly equal quality for the scenario concerned. This scenario will be known on startup of the chip.

[0087] In the example of FIG. 3, to compute a mapping/routing solution for the scenario (1,4) represented in FIG. 3C from the solution associated with the scenario (1,2) represented in FIG. 3A, it is possible to apply the function of relabeling of the clusters and of the interconnects associated with the rotation of 180 about the axis passing through the clusters 1, 5 and 9.

[0088] In the example of the scenario (1,2) represented in FIG. 3A, two computation tasks t and t are placed respectively on the clusters 3 and 4 (not failing), and communicate by using the path 3->6->4 which is made up of the direct interconnects 3->6 and 6->4. If a scenario (1,4) applies, i.e. a degraded chip with the failing clusters 1 and 4, the task t can no longer be performed and a re-mapping should be carried out. After this re-mapping, the task t will be placed on the cluster 7, and the task t will be placed on the cluster 2 after a symmetry of rotation of 180 about this axis passing through the clusters 1, 5 and 9. The communication between t and t will be performed through the path 7->8->2. Regarding the impact of the reduction or compression of the scenario space by the exploitation of a symmetry (here a rotation), the class (1,2) contains 18 equivalent fault scenarios which corresponds to a potential compression rate of 11/18=94.44%. This compression rate applies equally to the computation time and to the mapping/routing solutions save file.

[0089] The method for putting this compression into practice is described hereinbelow.

[0090] After the identification of the fault scenario equivalence classes (when the symmetry operations which save the structure of an architecture are explored and itemized), these same classes are exploited in order to proceed with the partitioning of all the fault scenarios. The number of scenario space partitioning possibilities corresponds to the Bell number which is exponential in the size of the set concerned.

[0091] For the partitioning, a first solution could consist in implementing an inference engine based on rules of symmetry, implemented for example by means of languages such as LISP or Prolog. Starting from a given scenario, the inference engine sets out to search for a sequence of rules to be applied in order to establish an equivalence with the representative of a class already listed or to create a new equivalence class if no equivalence is established (i.e. absence of sequence). In the latter case, this scenario becomes the representative of the new class. To implement such an inference engine, it is then necessary to define a) the rules of symmetry and the relabeling operations which are associated with them and b) the priority (or the order or the occurrence) of application of each of these rules and do so on each step of the search process (the priorities, orders or occurrences being able to vary from one step to another of the search process). The symmetry operations are not provided with good characteristics with respect to the composition. For example, they are not generally commutative, i.e. for two operations O1 and O2, O1*O2 gives a different result from O2*O1. Similarly, they are not idempotent: O1*O1=/=O1. These characteristics generally make the definition of an order/occurrence of application of the symmetry operations very complex. It is then difficult to define a search strategy which optimizes the execution time. Furthermore, there is a risk of having therein a number of exceptions to the predefined order which have to be taken into account according to the architecture. Moreover, when the symmetry is used, it is tempting to rely on intuition to establish these parameters but this will soon prove, in many cases, to be very complex and counterproductive for the identification of the equivalence classes. The characteristics of the symmetry operations also pose a problem of definition of the stop conditions for an inference engine. Since the number of compositions of operations possible is very great, when no equivalence with a case already listed is found, it is not generally possible to say at which moment it is possible to stop the inference engine and to conclude on the presence of a new equivalence class. If the stop condition is too restrictive, the technique also risks not being exact and the maximum compression rate might not be reached (a class could correspond to several partitions, if the equivalence between all its elements is not established). The adaptation of such a method for each architecture will also take a long time and be very complex.

[0092] These reasons make the development of an effective and generic inference engine irrelevant, in the general case.

[0093] FIG. 4 serves as a support for illuminating the complexity of the technical problem consisting in identifying equivalent cases. FIG. 4 shows two scenarios with three faults. FIG. 4A shows a scenario (1,2,9) and a scenario (2,4,7). To show that these scenarios are equivalent, a rotation in the plane of +90 in the clockwise direction around the central cluster 5 changes the labels of the adjacent clusters 4 and 7 to 1 and 2. The cluster 2 becomes 6. To transform the fault of the cluster 6 into a fault of the cluster 9, the application of a horizontal shift downward to the scenario (1,2,6) leads to the scenario (4,5,9). Then, by applying to this latter scenario an axial rotation of 180 about the axis which passes through the units 4, 5 and 6, and after labeling, the scenario becomes (4,5,3). The application of a reverse horizontal shift, upward, and the scenario obtained becomes (1,2,9). By the application of this series or sequence of transformations, exploiting the symmetries of the circuit, it is therefore possible to establish the association of the scenarios (1,2,9) and (2,4,7) with the same equivalence class (that can be represented or denoted (1,2,6), by following, for example, a criterion of minimization of the sums of the labels).

[0094] The ex ante knowledge of the equivalence makes it possible to persevere in the combination and the search for the sequence of symmetry operations. Without this prior knowledge, the stop problem could become critical. The demonstration which has just been made for the example might not stop, or by default (i.e. if a stop was triggered) might culminate in the incorrect conclusion of the non-equivalence of the scenarios (and therefore in the conclusion of the existence of two equivalence classes). This first search approach remains possible, but it is limited.

[0095] Another embodiment of the compression of the space of the scenarios is described below. This method is rapid and generic. According to this embodiment, each scenario is associated with an identifier, so that two scenarios associated with a same identifier are necessarily or inevitably equivalent and are therefore interchangeable. According to one embodiment, the identifier can be that of the equivalence class itself. By construction, such an identifier incorporates the transformations of symmetry and does not depend on the labels associated with the failing clusters. To this end, a distance is introduced: an inter-cluster distance which adds the concept of symmetry to the usual inter-cluster distances. Then, by using this new distance as a basis, means and/or steps make it possible to identify a particular scenario.

[0096] In detail, the compression method comprises four steps or phases.

[0097] In a first step, a distance is defined that is called symmetrical inter-cluster distance. To define the fault scenario identifier, a means of identification of the clusters at equivalent distance from another cluster is described hereinbelow. To define an inter-cluster distance, a first solution consists in taking into account the number of links which separate these clusters (e.g. hop count). This measure makes it possible to assess the shortest inter-cluster routing path length and thus gives a measurement for the data bit rate consumed relative to the overall routing capacity of the network. The smaller this is, the more the congestion of the network can be avoided (and energy saved). The improved inter-cluster distance proposed here takes into accountin addition to the hop count and the nature of the interconnectsthe patterns of symmetry. By corollary, two pairs of clusters which are at different hop counts will have different symmetrical distances. These distances are introduced in FIG. 5.

[0098] FIGS. 5A and 5B illustrate the concept of symmetrical distance. Take a computation unit denoted Ui. Unit Uj and Uk are said to be at the same symmetrical distance from Ui if there is a sequence of symmetry operations that makes it possible to save the location of Ui while bringing the location of Uk back to that of Uj. FIG. 5A gives an example for the 33 2D Torus and the cluster 5. By rotations of 90/+90/+180 in the plane around the cluster 5, it emerges that the clusters 2, 4, 6 and 8 are at the same symmetrical distance from the cluster 5 labeled as distance A. Similarly, the clusters 1, 3, 7 and 9 are at the same symmetrical distance from the cluster 5 and are labeled B. The result of all the clusters can be grouped together in a symmetrical distance matrix M represented in FIG. 5B, of 33 dimension. Mij is the distance between Ui and Uj. By construction, there are as many equivalence classes with two faults as there are types of symmetrical links (i.e. of symmetrical distances). For example, for the matrix M of the 33 2D Torus, the number of symmetrical links is two (A and B). In this case, there are two equivalence classes for the scenarios with two faults. Since the number of possible architectures with two faults is 36 (the number of combinations of 2 out of 9), the compression rate is 12/36=94.44%.

[0099] According to the symmetries detected and exploited in order to define the symmetrical distances (for example by a developer implementing reconfiguration software for a given chip), different matrices can be obtained. The greater the number of symmetries exploited, the better the compression. For example, if a developer considers that the clusters 6 and 2 are not at the same symmetrical distance from 5 because he or she disregards the axial rotation about the center, then he or she will have to label one of these distances some other way, C for example. This will automatically increase to 3 the number of equivalence classes for the range with 2 faults and will thus reduce the compression rate to 91.66%. This rate will be reduced all the more if it is considered that this same developer would surely have neglected other symmetries on the distances denoted B initially in the matrix of the example and will then have to define additional distances D, E and F. In this case, the compression rate drops to 83.33%.

[0100] To designate the symmetrical distances, it is possible to use numerical labels in place of the alphabetical labels, but the latter greatly simplify the computation and the comparison of the identifiers of scenarios thereby allowing for the implementation of more generic and more effective algorithms.

[0101] FIG. 6 introduces the scenario identifiers and illustrates a use thereof to establish the equivalence of the fault scenarios (2,4,7) and (1,2,9) provided previously in FIG. 4.

[0102] In a second step of the compression method, a fault scenario is identified and characterized. To do this, the positions of the faults relative to one another are used, by means of the concept of symmetrical distance defined previously.

[0103] In one embodiment, the identifier of a scenario (and therefore of its equivalence class) is a word taken from an alphabet made up exclusively of the labels of the symmetrical distances. For a scenario with two faults (Ui,Uj), the identifier denoted Identity(Ui,Uj) is given by the distance Mij. It should be noted that the brackets in the identifier are necessary neither for the definition nor for the comparison of the identifiers. These brackets are represented only to make it easier to read and show the transition from one fault to the next, but they can be deleted on implementation.

[0104] For a scenario with three faults (Ui,Uj,Uk), the identifier Identity(Ui,Uj,Uk) is obtained by Mji(MkiMkj). For the scenarios with a greater number of faults, recurrence is applied by using a character string concatenation operator. By using Identity(Ui, Uj, . . . , Um) to designate the identifier of the scenario (Ui, Uj, . . . , Um), then Identity(Ui, Uj, Uk, . . . , Um, Uq) corresponds to the concatenation of Identity(Ui, Uj, . . . , Um) and (MqiMqjMqk . . . Mqm). Since the concatenation is not symmetrical, the identification depends on the order considered in the formulation of the faults of the scenario. In other words, with no additional method, several identifiers can be computed for a same scenario. In order to avoid equivalent scenarios being detected as such because of the order of the formulation of the faults, an additional and optional step makes it possible to make sure that they have a unique identifier and therefore ensures the accuracy of the technique and a maximum compression rate. In one embodiment, the order of formulation of faults which gives the smallest identification with respect to the comparison of the character strings is selected. The benefit of the concatenation for the comparison of the identifiers justifies the use of alphabetical labels. Alternatively, alphabetical, numeric and/or alphanumeric characters can be used. By applying this identification technique to the example of FIG. 6, it is possible to show that the scenarios (2,4,7) and (1,2,9) are equivalent. In effect, M47(M27M24)=A(BB) and M21(M91M92)=A(BB).

[0105] In conclusion, the use of identifiers encoding the symmetries makes it possible to dispense with the description of a sequence of symmetry operations in order to demonstrate the equivalence of the scenarios.

[0106] When the number of faults is greater than half the number of units initially provided on a chip, rather than defining a scenario through its units with faults, it may be prudent to define it through its functional units to optimize the size of the scenario identifiers. The technique for computing the identifier Identity will remain valid by relating the positions of the healthy units relative to themselves. For example, from 5 faults, it is more efficient to list the remaining 4 healthy clusters of the 33 2D Torus.

[0107] In a third step of the compression method, the space of the scenarios is explored. During this step, an equivalence class tree is constructed. This operation consists in scanning the space of the possible fault scenarios in order to partition it into equivalence classes according to the definition given during the preceding step. For each scenario scanned, a check to see if the tree does not contain a listed scenario which is equivalent to it is carried out. If that is not the case, a new node is added to the tree (the scenario and its identifier being the representatives of the new equivalence class). The different levels of the tree indicate the different ranges. For example, at the top of the tree, there are the classes of the range with a single fault. In the example of the 33 2D Torus type processor, it is made up of only a single node. The construction of the tree stops when the maximum fault level considered (i.e. the most degraded range) has been reached. This tree is constructed just once for each manycore architecture because it is independent of the applications which will be deployed therein. The supplier of the manycore processor will be able to construct and incorporate this tree in the development tools (for example). The users or developers will also be able to use such trees, for example when designing applications in order to compute and save the mapping and/or routing solutions for each of the nodes of the level which represents the range on which they want to deploy the application.

[0108] In the case of the exploration, given that it is not possible to know previously whether two given scenarios are equivalent or not, the use of an identifier is found to be particularly advantageous: it makes it possible to state that two scenarios are equivalent even though the sequence of symmetry operations necessary to transform one scenario into another remains unknown (not available, not documented, not accessible, etc).

[0109] Optionally, different variants of the explorer can be implemented, for example with the choice of a widthwise or depthwise search, according to different choices of the representative of an equivalence class (for example choosing the one for which the labels of the clusters are the smallest).

[0110] For most manycore processor architectures, the space explored in order to construct the tree of the equivalence classes can itself be reduced. In effect, it is not always necessary to scan the entire space. For the example of the 33 2D Torus, any fault can be brought backby symmetry operationsto be placed (or relabeled) as fault of the cluster 1. In order to reduce the space to be explored, it is therefore possible to consider that the computation unit number 1 is always faulty. Thus, the size of the space to be explored is divided by 9. Optionally, for greater certainty, a test can also be implemented in order to check that the tree of the equivalence classes obtained does indeed cover all the possible scenario cases.

[0111] In a fourth step of the compression method, a relabeling is performed. After the startup of the chip and the detection of the fault scenario, the identifier and therefore the equivalence class to which the chip belongs can be determined according to the technique presented in the second step. Thus, it is possible to extract, from the tree of the scenarios constructed during the phase 3, the equivalent scenario and the mapping/routing solution which are associated with the fault situation of the chip. It will then be necessary to adapt this equivalent solution to the scenario of the chip. This adaptation constitutes the relabeling step or phase.

[0112] This relabeling step is executed on the chip since there is a need to know the fault scenario. In an embedded execution context, this phase can be optimized because of or for resources that are limited in computation and memory terms (low complexity in execution time and space).

[0113] In some cases, the NoC architecture can be the same for all the manycore processors (when, by assumption, the NoC is entirely functional or fault-free). The relabeling step proposed consumes little in computation terms. In effect, contrary to the reasons which urge relativizing the use of an inference engine, the fact of knowing previously that two scenarios are equivalent (by computing their identifier and therefore the order of formulation of the faults) makes possible the determination of a sequence of symmetry operations which is both generic and effective for bringing a scenario back to another which is equivalent to it. For example, for a 44 2D Torus, the algorithm requires fewer than 3 instructions to detect the sequence and thus relabel a scenario to its equivalent. For that, from the moment when the first fault is known according to the order which has been obtained for the computation of the identifier, it is sufficient to bring this fault back to the fault 1 by performing the necessary vertical and horizontal shifts. Then, the axial rotation (if necessary) will determine the rest of the faults. This algorithm is of O(1) complexity, that is to say the lowest there is.

[0114] FIGS. 7A, 7B and 7C illustrate different aspects of the method. FIG. 7A illustrates examples of operations generally performed offline (processor off). Generally, these operations do not depend on the software applications mapped onto said processor. The symmetries are identified in the step 711. The symmetrical links are labeled in the step 712. The symmetrical inter-cluster distance matrix is constructed in the step 713. The exploration of the scenarios (equivalence classes and identifiers) is performed in the step 714. The storage of the identifiers and of the representatives of the equivalence classes is performed in the step 715.

[0115] FIG. 7B illustrates examples of operations generally performed offline (processor off), for the deployment of a software application for example. In the step 721, the scenarios representative of classes and their identifiers are received or identified or computed. In the step 722, for each class representative scenario, the appropriate mapping/routing solution is computed. In the step 723, the solutions are stored (each solution can specify the scenario for which it has been computed).

[0116] FIG. 7C illustrates examples of operations generally performed at the time of, or after, the startup of the chip (for example after the deployment of an application according to FIG. 7B). The steps itemized below can be performed (some steps may be optional or modifiable or modified or the required data received from external modules, etc). Step 731: startup of the chip. Step 732: identification of the fault scenario. Step 733: computation of the identifier of the fault. Step 734: recovery of the scenario representing the class bearing the same identifier as that of the fault scenario observed on the processor. Step 735: computation of one or more sequences of symmetry operations making it possible to bring the fault scenario of the processor back to that of the representative of its equivalence class. Step 736: computation of the relabeling from the sequence of symmetry operations having been obtained. Step 737: relabeling of the mapping/routing solution associated with the representative of the class. Step 738: deployment of the application according to the relabeled mapping/routing solution.

[0117] The invention will be advantageously applicable for Cloud Computing, distributed computation environments, Grid Computing, etc. In addition to the virtualization of applications (eliminating the specifics of code execution on a stock of heterogeneous processors), the invention can in fact make it possible to manage faults over a set of processors, because of the topological abstraction which is performed to exploit the symmetries. In a simplified example, a processor with 800 cores can be topologically equivalent to two hundred processors with 4 cores. Consequently, the exploitation of the symmetries in both cases carry the same advantages, namely static reconfigurations (packaging in ranges, pre-marketing) even dynamic reconfigurations during runtime.

[0118] The present invention can be implemented from hardware and/or software elements. It can be available as computer program product on a computer-readable medium. The medium can be electronic, magnetic, optical, electromagnetic or be a broadcast medium of infrared type.

ENCODING OF FAULT SCENARIOS OF A MANYCORE PROCESSOR

Inventors

Cpc classification

Classification Explorer

G01R31/31704

PHYSICS

Classification Explorer

G01R31/31703

PHYSICS

Classification Explorer

G06F11/1428

PHYSICS

Classification Explorer

G01R31/31718

PHYSICS

Classification Explorer

G06F11/0724

PHYSICS

Classification Explorer

G01R31/3177

PHYSICS

Classification Explorer

G06F11/203

PHYSICS

Classification Explorer

G06F11/2028

PHYSICS

Classification Explorer

G06F11/2051

PHYSICS

Classification Explorer

G06F11/0793

PHYSICS

International classification

Classification Explorer

G01R31/317

PHYSICS

Classification Explorer

G01R31/3177

PHYSICS

Abstract

Claims

Description