SYSTEM FOR ASSISTING AIRCRAFT FAULT RESOLUTION
20200183926 · 2020-06-11
Inventors
Cpc classification
G05B2219/45071
PHYSICS
B64D45/00
PERFORMING OPERATIONS; TRANSPORTING
B64D2045/0085
PERFORMING OPERATIONS; TRANSPORTING
G06F21/6227
PHYSICS
G06F16/2465
PHYSICS
G05B23/0227
PHYSICS
B64F5/60
PERFORMING OPERATIONS; TRANSPORTING
G06F21/6254
PHYSICS
International classification
G06F16/2458
PHYSICS
B64D45/00
PERFORMING OPERATIONS; TRANSPORTING
Abstract
A system for assisting aircraft fault resolution by statistical inference of big data includes secure equipment including a plurality of databases storing big data concerning variables monitored during aircraft monitoring as well as an aggregator module. The system also includes an analyst module, outside the secure equipment, in communication with the aggregator module. The analyst module is used to define a statistical query to be processed by the aggregator module, which performs a statistical inference on the big data stored in the plurality of databases in order to respond to the statistical query. The aggregator module checks that the result of the statistical inference anonymizes the big data in question, and transmits the result of the statistical inference to the analyst module. Thus, the confidentiality of the big data is respected.
Claims
1. A system for assisting aircraft fault resolution by statistical inference of big data, comprising: a plurality of databases storing big data concerning variables monitored during aircraft monitoring, a device implementing an aggregator module tasked with querying the databases in a framework of a statistical inference, secure equipment including the device implementing the aggregator module, as well as the plurality of databases, to prevent external access to the big data; and a device implementing an analyst module, outside the secure equipment, in communication with the aggregator module; wherein the analyst module comprises: means for defining a statistical query, the statistical query requiring searching for a possible correlation between at least one event relating to a fault and at least one of the variables among the big data in the plurality of databases; and means for transmitting the statistical query to the aggregator module; and wherein the aggregator module comprises: means for performing a statistical inference on the big data stored in the plurality of databases in order to respond to the statistical query; means for checking that a result of the statistical inference anonymizes the big data in question; means for transmitting the result of the statistical inference to the analyst module for a case in which the result of the statistical inference anonymizes the big data in question; and means for rejecting the statistical query for the case in which the result of the statistical inference does not anonymize the big data in question.
2. The system according to claim 1, wherein the statistical query comprises a context defining a framework for searching the big data in the plurality of databases, the context specifying whether the statistical inference relates to all the aircraft for which big data is present in the plurality of databases or only to a subset, and the aggregator module comprises means for limiting the statistical inference to the context.
3. The system according to claim 2, wherein the aggregator module comprises: means for checking that the context is not defined with respect to big data parameters excluded for confidentiality reasons, and means for rejecting the statistical query for a case in which the context is defined with respect to big data parameters excluded for confidentiality reasons.
4. The system according to claim 2, wherein the aggregator module checks that the plurality of databases contains at least K times more samples in the context than occurrences of each event considered in the statistical query, where K is a non-zero positive integer.
5. The system according to claim 2, wherein the analyst module comprises means for exporting a graphical user interface providing the following components: Context components to define the context; Event components to select at least one event to form the statistical query; Variable components to select at least one so-called variable to form the statistical query; and Combiner components to formulate the statistical query from Context, Event and Variable components, wherein the graphical user interface additionally provides the following operators: Logic operators to perform combinatorial operations between Variable components; Time operators to fix timeframes on Variable components and on Event components; Union operators to merge Variable components, combine Event components and combine Context components; and Filter operators to filter Variable components, filter Event components and filter Context components.
6. The system according to claim 1, wherein the plurality of databases is completed by at least one database storing big data in open access arrangement.
7. The system according to claim 1, wherein the aggregator module comprises means for storing in at least one dedicated database of the secure equipment private information supplied via the analyst module, and wherein the aggregator module additionally performs the statistical inference by exploiting the private information.
8. The system according to claim 1, wherein the aggregator module supplies to the analyst module a result of the statistical inference formed as a contingency table for each variable targeted by the statistical query.
9. The system according to claim 8, wherein the analyst module comprises: means for determining, for each contingency table, probability deviation values, with respect to a theoretical distribution of observations of the variable in question; means for determining, for each contingency table, values of test strength; means for classifying a content of the contingency tables as a function of the probability deviation values and test strength values; a classification indicating whether the content in question shows whether or not the variable considered is correlated with the event or events targeted by the statistical query, or whether an inference result is inconclusive.
10. The system according to claim 8, wherein the analyst module comprises means for producing a volcano plot visualization of a content of each contingency table.
11. A method for assisting aircraft fault resolution by statistical inference of big data, the method being implemented by a system comprising: a plurality of databases storing big data concerning variables monitored during aircraft monitoring; a device implementing an aggregator module tasked with querying the databases in a framework of a statistical inference; secure equipment including the device implementing the aggregator module as well as the plurality of databases so as to prevent external access to the big data; and a device implementing an analyst module, outside the secure equipment, in communication with the aggregator module; the method comprising the following steps implemented by the analyst module: defining a statistical query, the statistical query requiring searching for a possible correlation between at least one event relating to a fault and at least one of the variables among the big data in the plurality of databases; and transmitting the statistical query to the aggregator module; wherein the method comprises the following steps implemented by the aggregator module: performing a statistical inference on the big data stored in the plurality of databases in order to respond to the statistical query; checking that a result of the statistical inference anonymizes the big data in question; transmitting the result of the statistical inference to the analyst module for the case in which the result of the statistical inference anonymizes the big data in question; and rejecting the statistical query for a case in which the result of the statistical inference does not anonymize the big data in question.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0023] The abovementioned features of the invention, as well as others, will become clearer upon reading the following description of at least one example embodiment, the description being made with reference to the appended drawings, in which:
[0024]
[0025]
[0026]
[0027]
[0028]
[0029]
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
[0030]
[0031] The system for assisting aircraft fault resolution includes a first device implementing an analyst module ANA 110.
[0032] The system for assisting aircraft fault resolution includes a second device implementing an aggregator module AGG 121. The second device is part of secure equipment SEC 120.
[0033] For example, the first device is a PC (Personal Computer) and the second device is a supercomputer.
[0034] The aggregator module AGG 121 accesses a plurality of databases DB 130. At least one of these databases contains big data concerning variables monitored during aircraft monitoring, and more particularly relating to incidents that have arisen on these aircraft, and for which the contents must not be, and are not, displayed outside the environment of the secure equipment SEC 120. The secure equipment SEC 120 prevents any external access to the big data. In
[0035] One or more databases DB1 131, containing big data for which the contents are set up as open access, can enrich the plurality of databases DB 130. For example, the database DB1 131 contains meteorological big data indicating what the meteorological conditions were at such or such place at such or such moment.
[0036] The first device implementing the analyst module ANA 110 and the second device implementing the aggregator module AGG 121 are connected by a communication line. Since the information of confidential nature in the plurality of databases DB 130 does not leave the environment of the secure equipment SEC 120, this communication line does not need to be secure. The aggregator module AGG 121 and the databases DB2 132, DB3 133 and DB4 134 can be connected by secure tunnels, thereby providing for distributing the secure equipment SEC 120 over several sites.
[0037] The analyst module ANA 110 provides an input interface for defining a statistical query. The statistical query requires searching among the big data in the plurality of databases DB 130 for a possible correlation between at least one event relating to a fault and at least one variable monitored as part of aircraft monitoring. The statistical query preferably includes a context defining a framework for searching the big data in the plurality of databases DB 130, the context specifying whether the statistical inference relates to all the aircraft for which big data is present in the plurality of databases DB 130 or only to a subset.
[0038] According to an example, this input interface is a file in which the statistical query is formulated. Preferably, the first device is equipped with a screen and a control interface, such as a mouse-keyboard set. Thus, a human operator can formulate the statistical query by combining various components and operators, using as input interface a graphical user interface GUI exported by the analyst module ANA 110. Thus, the GUI of the analyst module ANA 110 provides the following components:
[0039] 1. Context
[0040] 2. Event
[0041] 3. Variable
[0042] 4. Combiner
[0043] The Context components provide for defining an aircraft context to be considered in the plurality of databases DB 130, for example: all the aircraft, climbing aircraft, specific aircraft models, etc.
[0044] The Event components provide for selecting events to be found in the plurality of databases DB 130, for example: time-stamped list of occurrences, particular fault for example identified by a specific code, etc.
[0045] The Variable components provide for selecting parameters to evaluate by statistical inference with respect to one or more defined events, for example: age of the aircraft or of a particular part, data of a particular sensor, rough landing, vibrations, etc. Each parameter is associated with at least one condition with respect to a value or a range of values to define a status that can be expressed at any moment, from parameters of any population sample of the context considered, as true, false or unknown. According to an example, a low altitude variable LOW_ALTITUDE relates to the altitude parameter associated with the condition <1000 feet (approximately 300 meters), a middle altitude variable MIDDLE_ALTITUDE relates to the altitude parameter associated with the condition ≥1000 feet (approximately 300 meters) and <25000 feet (7620 meters), and a high altitude variable HIGH_ALTITUDE relates to the altitude parameter associated with the condition ≥25000 feet (7620 meters). According to another example, a young age variable YOUNG relates to the age parameter associated with the condition <730 days, a middle age variable MIDDLE_AGE relates to the age parameter associated with the condition ≥730 days and <4380 days, and an old age variable OLD relates to the age parameter associated with the condition ≥4380 days.
[0046] Variables of the same type (age, altitude, etc.) can be grouped together in the same Variable components. For example, the variables YOUNG, MIDDLE_AGE and OLD can be grouped together in the same component, in order to easily define a statistical query which relates to a statistical inference simultaneously covering these three age brackets considered.
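By way of illustration only, the three-state status of a variable (true, false or unknown) and the grouping of variables of the same type can be sketched as follows. The sample layout, the parameter name `altitude_ft` and the helper names are assumptions made for this sketch, and the lower bound of each bracket is assumed to be inclusive.

```python
from typing import Callable, Dict, Optional

Sample = Dict[str, Optional[float]]
VariableFn = Callable[[Sample], Optional[bool]]


def make_variable(parameter: str, condition: Callable[[float], bool]) -> VariableFn:
    """Build a variable: a parameter name plus a condition on its value."""
    def evaluate(sample: Sample) -> Optional[bool]:
        value = sample.get(parameter)
        if value is None:
            return None  # status "unknown": the parameter is missing in this sample
        return condition(value)
    return evaluate


# Altitude brackets; "altitude_ft" is a hypothetical parameter name.
LOW_ALTITUDE = make_variable("altitude_ft", lambda v: v < 1000)
MIDDLE_ALTITUDE = make_variable("altitude_ft", lambda v: 1000 <= v < 25000)
HIGH_ALTITUDE = make_variable("altitude_ft", lambda v: v >= 25000)

# Variables of the same type grouped in one component.
ALTITUDE_GROUP: Dict[str, VariableFn] = {
    "LOW_ALTITUDE": LOW_ALTITUDE,
    "MIDDLE_ALTITUDE": MIDDLE_ALTITUDE,
    "HIGH_ALTITUDE": HIGH_ALTITUDE,
}


def evaluate_group(group: Dict[str, VariableFn], sample: Sample) -> Dict[str, Optional[bool]]:
    """Evaluate every variable of a group on one sample."""
    return {name: variable(sample) for name, variable in group.items()}
```

Returning `None` for a missing parameter keeps the unknown status distinct from false, so that such samples can be left out of the true and false counts.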
[0047] The Combiner components provide for assembling the Event components and the Variable components in a big data population defined by the Context components in order to formulate a statistical query, typically: Is there a correlation between VARIABLES and EVENTS considering the defined CONTEXT?
[0048] In addition to the above components, the GUI of the analyst module ANA 110 preferably provides operators, which enable the statistical query to be made more complex and therefore refined:
[0049] 1. Logic (logic AND, logic OR, EXCLUSIVE OR, etc.)
[0050] 2. Time (minimum duration, lag, etc.)
[0051] 3. Union
[0052] 4. Filter
[0053] The Logic operators provide for performing a combinatorial operation between Variable components.
[0054] The Time operators provide for fixing a timeframe to an Event component or a Variable component.
[0055] The Union operators provide for merging Context components, merging Event components and merging Variable components.
[0056] The Filter operators provide for filtering Context components, filtering Event components and filtering Variable components.
[0057] An example is schematically illustrated in
[0058] An Event component, indicated as EVT 320, is defined therein. A certain type of event is thus selected, representative of a particular fault. For example, the component EVT 320 identifies an EDP (Engine Driven Pump) fault.
[0059] A Variable component, indicated as VAR1 331, is defined therein. A first parameter is thus selected to be evaluated by statistical inference with respect to the event described in the component EVT 320. For example, this first parameter concerns age. Another Variable component, indicated as VAR2 332, is defined therein. A second parameter is thus selected to be evaluated by statistical inference with respect to the event described in the component EVT 320. For example, this second parameter concerns altitude.
[0060] A Union operator, indicated as U 350, is defined therein. The operator U 350 is applied to the component VAR2 332 after application of the operator T 340 and to the component VAR1 331. The age and altitude parameters are then simultaneously evaluated by statistical inference.
[0061] A Combiner component, indicated as COMB 360, is defined therein. The component COMB 360 defines the statistical query: Is there a correlation between VAR1 or VAR2 and EVT considering the context CTXT? This example provides for attempting to determine whether the EDP faults are statistically linked to the age of the EDP and/or to atmospheric pressure matters.
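As a non-limiting sketch, the assembly of this example query from components can be represented by a small data structure. The class and function names below are hypothetical and do not describe the actual interface of the analyst module ANA 110; the Time operator is left out for brevity.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class Component:
    """A named Context, Event or Variable component (illustrative only)."""
    label: str
    description: str


@dataclass(frozen=True)
class StatisticalQuery:
    context: Component
    events: Tuple[Component, ...]
    variables: Tuple[Component, ...]

    def describe(self) -> str:
        """Render the query as the question posed by the Combiner component."""
        variables = " or ".join(v.label for v in self.variables)
        events = " or ".join(e.label for e in self.events)
        return (
            f"Is there a correlation between {variables} and {events} "
            f"considering the context {self.context.label}?"
        )


def union(*groups: Iterable[Component]) -> Tuple[Component, ...]:
    """Union operator: merge groups of components, keeping order, dropping duplicates."""
    merged = []
    for group in groups:
        for component in group:
            if component not in merged:
                merged.append(component)
    return tuple(merged)


def combine(context: Component, events: Iterable[Component],
            variables: Iterable[Component]) -> StatisticalQuery:
    """Combiner component: assemble context, events and variables into a query."""
    return StatisticalQuery(context, tuple(events), tuple(variables))


CTXT = Component("CTXT", "context to be considered")
EVT = Component("EVT", "EDP (Engine Driven Pump) fault")
VAR1 = Component("VAR1", "age")
VAR2 = Component("VAR2", "altitude")

QUERY = combine(CTXT, (EVT,), union((VAR1,), (VAR2,)))
print(QUERY.describe())
```

Immutable components are used here so that a formulated query cannot be altered between its definition and its transmission.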
[0062] Once formulated, the statistical query is transmitted by the analyst module ANA 110 to the aggregator module AGG 121. The aggregator module AGG 121 then processes the statistical query by analyzing the big data stored in the plurality of databases DB 130. This aspect is detailed hereafter with reference to
[0063] The analyst module ANA 110 provides an output interface for providing the response to the statistical query posed. According to an example, this output interface is a file. Preferably, the GUI of the analyst module ANA 110 provides the response to the statistical query posed, for example in the form of a volcano plot visualization. This provides for more easily and rapidly identifying the variables of interest, particularly when the statistical query relates to a combination of a multitude of variables. This aspect is detailed hereafter with reference to
[0064] In a particular embodiment, one or more dedicated databases of the plurality of databases DB 130 serve to enrich the context of the statistical query with private information. This enriching can be carried out by the analyst module ANA 110, through which the private data is supplied to the aggregator module AGG 121 which is then tasked with storing the private data in a dedicated database. The dedicated database or databases are included in the secure equipment SEC 120 and can then be accessed only as part of a statistical inference, in order to respect their confidentiality. For example, assume an expert wishes to study the impact of solar radiation on aircraft equipment for which faults have been picked up. The expert has a model for calculating the solar flux as a function of time and geographic position, but the expert does not have direct access to a sufficient amount of position data for aircraft that have endured this fault. The expert can then import the model as private data and thus enrich the big data with the solar flux parameter. The expert still does not have direct access to the resulting big data, but statistical queries can then be posed on a possible correlation between the effect of the solar flux and one or several events relating to the faults picked up.
[0065]
[0066] The device DEV 200 includes, connected by a communication bus 210: a processor 201; random-access memory 202; read-only memory (ROM) 203, or EEPROM (Electrically-Erasable Programmable Read Only Memory); a storage unit 204, such as a hard disk drive HDD, or a storage medium reader, such as an SD (Secure Digital) card reader; and an input-output interface manager 205.
[0067] For the first device implementing the analyst module ANA 110, the input-output interface manager 205 provides for communicating with the second device implementing the aggregator module AGG 121. Preferably, the input-output interface manager 205 provides for interacting with a human operator, as already described.
[0068] For the second device implementing the aggregator module AGG 121, the input-output interface manager 205 provides for communicating with the first device implementing the analyst module ANA 110, as well as with the plurality of databases DB 130.
[0069] The processor 201 is capable of executing instructions loaded into the random-access memory 202 from the read-only memory 203, from an external memory, from a storage medium (such as an SD card) or from a communication network. When the device DEV 200 is powered up, the processor 201 is capable of reading instructions from the random-access memory 202 and executing them. These instructions form a computer program bringing about the implementation, by the processor 201, of all or part of the algorithm and steps described hereafter.
[0070] All or part of the algorithms and steps described hereafter can thus be implemented in software form by the execution of a set of instructions by a programmable machine, for example a DSP (Digital Signal Processor) processor or a microcontroller, or be implemented in hardware form by a dedicated component or machine, for example an FPGA or ASIC component. Generally, the device DEV 200 includes electronic circuitry adapted and configured to implement, in software and/or hardware form, the algorithms and steps described hereafter in relation to the device DEV 200 in question.
[0071]
[0072] In a step 400, the analyst module ANA 110 defines a statistical query.
[0073] Step 400 can be detailed by a set of steps 401 to 405.
[0074] In step 401, the analyst module ANA 110 acquires a context definition, preferably via a Context component or several Context components connected by an operator.
[0075] In step 402, the analyst module ANA 110 acquires a selection of at least one event, preferably via an Event component or several Event components connected by an operator. When several events are thus selected, these events are typically suspected of having the same triggering cause.
[0076] A list of events available to the analyst module ANA 110 can depend on the context defined at step 401, i.e., the context must not have samples of big data for which no information concerning this event has been listed. For example, if the events have been recorded only for one model of aircraft, the context should contain big data samples only for this model of aircraft. In a particular embodiment, once the context is defined, the analyst module ANA 110 queries the aggregator module AGG 121 to determine which events have been logged for the defined context, the aggregator module AGG 121 searching (or having searched) the plurality of databases DB 130 in order to do this.
[0077] In step 403, the analyst module ANA 110 acquires a selection of at least one variable, preferably via a Variable component or several Variable components connected by one or more Logic or Union or Filter operators.
[0078] In step 404, the analyst module ANA 110 can acquire a timeframe definition on one or more variables and/or one or more events, preferably via one or more Time operators.
[0079] In step 405, the analyst module ANA 110 acquires a definition for combining the context (step 401), the event or events (step 402), the variable or variables (step 403), and possibly the timeframe (step 404), in order to formulate the statistical query.
[0080] Step 400 is followed by a step 410 in which the analyst module ANA 110 transmits to the aggregator module AGG 121 the statistical query formulated at step 400. The method applied by the aggregator module AGG 121 is detailed hereafter with reference to
[0081] Then, in a step 420, the analyst module ANA 110 obtains the return from the aggregator module AGG 121 concerning the statistical query transmitted at step 410. As detailed hereafter, the aggregator module AGG 121 may have formulated a rejection or have returned a statistical result. The analyst module ANA 110 then carries out the appropriate processing (visualization, saving in a file, etc.). A particular embodiment is detailed hereafter with reference to
[0082]
[0083] In a step 501, the aggregator module AGG 121 receives the statistical query transmitted by the analyst module ANA 110 at step 410.
[0084] In a step 502, the aggregator module AGG 121 checks the acceptability of the statistical query. The aggregator module AGG 121, in particular, checks that the context is not defined with respect to big data parameters excluded for confidentiality reasons. For example, contexts limited to particular aircraft (identified by their manufacturer serial number MSN) or of a particular airline can be prohibited. According to another example, the aggregator module AGG 121 checks that the plurality of databases DB 130 contains many more samples in the defined context than occurrences of each event considered in the statistical query. In other words, the aggregator module AGG 121 checks that the plurality of databases DB 130 contains at least K times more samples in the defined context than occurrences of each event considered in the statistical query, where K is a non-zero positive integer, for example equal to 100 or 1000. According to yet another example, the aggregator module AGG 121 checks that the big data population concerned for the context of the statistical query is sufficient to provide for ensuring that the result of the statistical inference does not prejudice the confidentiality of the big data, i.e., the big data population thus concerned is greater than a predefined threshold.
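Purely as an illustration, the acceptability checks of step 502 can be sketched as a single function. The description presents these checks as alternative examples; combining all three, the parameter names `msn` and `airline`, and the default values of `k` and `min_population` are assumptions of this sketch.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical names of context parameters excluded for confidentiality reasons.
EXCLUDED_PARAMETERS = frozenset({"msn", "airline"})


def check_acceptability(
    context_parameters: Iterable[str],
    context_sample_count: int,
    event_occurrences: Dict[str, int],
    k: int = 100,
    min_population: int = 10_000,
) -> Tuple[bool, str]:
    """Return (accepted, reason) for a statistical query."""
    forbidden = EXCLUDED_PARAMETERS & set(context_parameters)
    if forbidden:
        return False, f"context uses excluded parameters: {sorted(forbidden)}"

    # The population of the context must exceed a predefined threshold.
    if context_sample_count <= min_population:
        return False, "context population too small"

    # At least K times more samples in the context than occurrences of each event.
    for event, occurrences in event_occurrences.items():
        if context_sample_count < k * occurrences:
            return False, f"fewer than {k} samples per occurrence of {event}"

    return True, "accepted"
```

For example, a context of 1,000,000 samples with 500 occurrences of an event passes the check for K equal to 100, whereas a context of 40,000 samples with the same 500 occurrences does not.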
[0085] In a step 503, the aggregator module AGG 121 determines whether the acceptability check for the statistical query is positive. If that is the case, a step 505 is carried out; otherwise, the aggregator module AGG 121 rejects the statistical query from the analyst module ANA 110 in a step 504 and the algorithm is ended.
[0086] In a step 505, the aggregator module AGG 121 queries the plurality of databases DB 130 to respond to the statistical query. The resulting processing time increases linearly with the quantity of samples considered in the population of the context defined in the statistical query. The aggregator module AGG 121 performs a statistical inference to respond to the statistical query.
[0087] The theoretical distribution of an event in a population is equivalent to that of a thrown die, where the result of the throw of the die is the occurrence of the event (true, false or unknown). By knowing the proportion of true, false or unknown in the population considered, binomial tests can be performed. A binomial test is an exact test of the statistical significance of deviations from a theoretical distribution of observations.
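A minimal sketch of an exact two-sided binomial test is given below, using only the Python standard library. It assumes integer counts and treats the outcome as true versus not true; how weighted (floating) counts or unknown statuses would be handled is not addressed here. The function names are chosen for this sketch, and the direct use of `math.comb` suits modest sample sizes only.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials of probability p."""
    return comb(n, k) * p**k * (1.0 - p) ** (n - k)


def binomial_test_two_sided(k: int, n: int, p: float) -> float:
    """Exact two-sided p-value: total probability of outcomes no more likely than k."""
    observed = binomial_pmf(k, n, p)
    # Small relative tolerance so that ties are not lost to floating-point rounding.
    threshold = observed * (1.0 + 1e-7)
    pmfs = (binomial_pmf(i, n, p) for i in range(n + 1))
    return min(1.0, sum(q for q in pmfs if q <= threshold))


# Illustrative figures only: 30 event occurrences, 12 of them with the variable
# at true, against a proportion of 0.2 of true in the context population.
p_value = binomial_test_two_sided(12, 30, 0.2)
```

A small p-value indicates that the observed proportion of true during the event deviates from the proportion expected from the context population.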
[0088] Given that the occurrence time interval or the uncertainty of the instant of the occurrence of an event can cover several samples, each sample has a weight of 1/n with respect to this event, where n is the quantity of samples covered by the event in question. This approach advantageously provides for producing an average count in a contingency table computation (floating values). Recall that contingency tables are matrices, potentially multi-variable, which show the frequency distribution of the variable or variables considered. The aggregator module AGG 121 then performs the following aggregations on the context samples for each variable of the statistical query:
[0089] Quantity of true; and
[0090] Quantity of false.
[0091] Furthermore, the aggregator module AGG 121 performs the following aggregations on the samples appearing during the occurrence of each event:
[0092] Sum of the weights of the samples with the variable at true; and
[0093] Sum of the weights of the samples with the variable at false.
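The aggregations above can be sketched as follows for a single variable. The sample layout (dictionaries), the grouping of samples per event occurrence and the key names of the resulting table are assumptions of this sketch. Samples for which the variable is unknown contribute to neither count.

```python
from typing import Callable, Dict, List, Optional

Sample = Dict[str, Optional[float]]


def contingency_table(
    context_samples: List[Sample],
    event_sample_groups: List[List[Sample]],
    variable: Callable[[Sample], Optional[bool]],
) -> Dict[str, float]:
    """Aggregate true/false counts over the context and weighted counts over events.

    Each inner list of event_sample_groups holds the n samples covered by one
    event occurrence; each of these samples carries a weight of 1/n.
    """
    table = {"context_true": 0.0, "context_false": 0.0,
             "event_true": 0.0, "event_false": 0.0}

    for sample in context_samples:
        status = variable(sample)
        if status is True:
            table["context_true"] += 1
        elif status is False:
            table["context_false"] += 1

    for group in event_sample_groups:
        weight = 1.0 / len(group) if group else 0.0
        for sample in group:
            status = variable(sample)
            if status is True:
                table["event_true"] += weight
            elif status is False:
                table["event_false"] += weight

    return table


def is_low_altitude(sample: Sample) -> Optional[bool]:
    """Illustrative variable: altitude below 1000 feet, unknown if missing."""
    altitude = sample.get("altitude_ft")
    return None if altitude is None else altitude < 1000


samples = [{"altitude_ft": 500}, {"altitude_ft": 800},
           {"altitude_ft": 2000}, {"altitude_ft": None}]
# First occurrence covers two samples (weight 1/2 each), second covers one.
occurrences = [[samples[0], samples[2]], [samples[1]]]
table = contingency_table(samples, occurrences, is_low_altitude)
```

With these illustrative samples, the first occurrence contributes 0.5 to each of the event counts, which is how floating values arise in the table.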
[0094] In a step 506, the aggregator module AGG 121 checks that the result of the statistical inference performed at step 505 anonymizes the big data concerned, i.e. it is not possible to deduce therefrom for example which airline the big data that has been used to obtain this result is from. In a particular embodiment, the aggregator module AGG 121 anonymizes the result. For example, generic pseudonyms can be used to mask the origin of the big data. An airline name can thus be replaced by the term AIRLINE followed by a random number assigned by the aggregator module AGG 121. However, this approach can be used only when the quantities of samples for airlines are similar in the defined context, to avoid being able to trace back the name of the airline in question by relying on the quantity of samples.
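The pseudonym approach can be sketched as follows. The criterion used to decide that sample quantities are "similar" (a maximum ratio between the largest and smallest count) and the range of the random numbers are assumptions of this sketch and are not specified in the description.

```python
import random
from typing import Dict, Optional


def pseudonymize_airlines(
    sample_counts: Dict[str, int],
    max_ratio: float = 1.5,
    rng: Optional[random.Random] = None,
) -> Optional[Dict[str, str]]:
    """Map each airline name to 'AIRLINE<number>', or return None when refused.

    Pseudonyms are refused when the sample quantities differ too much, since an
    airline could then be traced back from its quantity of samples.
    """
    rng = rng or random.Random()
    counts = list(sample_counts.values())
    if not counts or min(counts) <= 0:
        return None
    if max(counts) / min(counts) > max_ratio:
        return None

    # Draw distinct random numbers so that two airlines never share a pseudonym.
    numbers = rng.sample(range(1000, 10000), k=len(sample_counts))
    return {name: f"AIRLINE{number}" for name, number in zip(sample_counts, numbers)}
```

A `None` return corresponds to the case in which the result cannot be anonymized in this way, and the query would then be rejected.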
[0095] In a step 507, the aggregator module AGG 121 determines whether the check that the result of the statistical inference anonymizes the big data concerned (or whether the big data has been anonymized) is positive. If that is the case, a step 509 is carried out; otherwise, the aggregator module AGG 121 rejects the statistical query from the analyst module ANA 110 in a step 508 and the algorithm is ended.
[0096] In step 509, the aggregator module AGG 121 sends to the analyst module ANA 110 the result of the statistical inference, i.e. the response to the statistical query. Preferably, the result of the statistical inference takes the form of a contingency table for each variable targeted by the statistical query.
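The sequence of steps 501 to 509 can be summarized by the following sketch. The three callables stand in for the acceptability check, the statistical inference and the anonymization check; their names and the dictionary returned are hypothetical.

```python
from typing import Any, Callable, Dict


def process_query(
    query: Any,
    is_acceptable: Callable[[Any], bool],
    run_inference: Callable[[Any], Any],
    is_anonymized: Callable[[Any], bool],
) -> Dict[str, Any]:
    """Process a received statistical query (step 501) and return the outcome."""
    # Steps 502-503: acceptability check; step 504: rejection.
    if not is_acceptable(query):
        return {"status": "rejected", "step": 504}

    # Step 505: statistical inference on the databases.
    result = run_inference(query)

    # Steps 506-507: anonymization check; step 508: rejection.
    if not is_anonymized(result):
        return {"status": "rejected", "step": 508}

    # Step 509: the result is sent to the analyst module.
    return {"status": "ok", "step": 509, "result": result}
```

In this sketch the result leaves the function only on the last branch, which mirrors the principle that nothing is transmitted unless both checks are positive.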
[0097]
[0098] In a step 601, the analyst module ANA 110 receives the result of the statistical inference performed by the aggregator module AGG 121 in the form of a contingency table for each variable indicated in the statistical query previously formulated by the analyst module ANA 110 (see
[0099] In a step 602, the analyst module ANA 110 determines, for each contingency table, probability deviation values, called p-values, with respect to a theoretical distribution of observations of the variable concerned.
[0100] In a step 603, the analyst module ANA 110 determines, for each contingency table, values of test strength.
[0101] In a step 604, the analyst module ANA 110 classifies the content of the contingency tables as a function of the probability deviation values and test strength values. The classification indicates whether the content in question shows whether or not the variable considered is correlated with the event or events targeted by the statistical query, or whether the inference result is inconclusive. When the probability deviation values are lower than a predefined threshold TH1, for example 5%, the variable in question and the event or events considered in the statistical query previously formulated by the analyst module ANA 110 (see
[0102] In a step 605, the analyst module ANA 110 can perform a volcano plot visualization. A volcano plot is a type of scatter diagram which plots statistical significance (as ordinates, Y-axis) as a function of the statistical effect called fold change (as abscissas, X-axis). On the volcano plot, the ordinates (Y-axis) typically represent the negative of log10 of the p-values and the abscissas (X-axis) represent log2 of the statistical effect.
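As a minimal sketch, the coordinates of one point of a volcano plot can be computed as follows. The p-value and the fold change are taken as inputs; how the fold change is derived from a contingency table is not specified here, and the function name is chosen for this sketch.

```python
from math import log2, log10
from typing import Tuple


def volcano_point(p_value: float, fold_change: float) -> Tuple[float, float]:
    """Return (x, y): x is log2 of the fold change, y is the negative log10 of the p-value."""
    if not 0.0 < p_value <= 1.0:
        raise ValueError("p_value must be in (0, 1]")
    if fold_change <= 0.0:
        raise ValueError("fold_change must be positive")
    return log2(fold_change), -log10(p_value)
```

Variables with a strong effect and a high significance thus appear towards the upper left or upper right of the plot, away from the origin.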
[0103] While at least one exemplary embodiment of the present invention(s) is disclosed herein, it should be understood that modifications, substitutions and alternatives may be apparent to one of ordinary skill in the art and can be made without departing from the scope of this disclosure. This disclosure is intended to cover any adaptations or variations of the exemplary embodiment(s). In addition, in this disclosure, the terms "comprise" or "comprising" do not exclude other elements or steps, the terms "a" or "one" do not exclude a plural number, and the term "or" means either or both. Furthermore, characteristics or steps which have been described may also be used in combination with other characteristics or steps and in any order unless the disclosure or context suggests otherwise. This disclosure hereby incorporates by reference the complete disclosure of any patent or application from which it claims benefit or priority.