METHOD FOR AUTOMATICALLY PROCESSING A NUMBER OF LOG FILES OF AN AUTOMATION SYSTEM

20170017693 · 2017-01-19

Abstract

A method for automatically processing log files of different types of an automation system, said method determining a message part of a data set of the log file. All contents of the respective data sets of the log files are concatenated. The data sets of all the log files are summarized in a summary file. The data sets of all the log files are chronologically sorted in the summary file. The data sets are compressed in the summary file. The compressed data sets in the summary file are coded. The compressed data sets are assigned to groups with associated group codes. The group code of the assigned group is decoded. The decoded group code is output as the alphanumeric message part of the log file. The alphanumeric message part of the log file is stored in a memory.

Claims

1.-15. (canceled)

16. A method for automatically processing log files of different types of an automation system of a technical facility, said method comprising: concatenating all contents of a data set of a log file; summarizing data sets of all log files in a summary file; chronologically sorting the data sets of the log files in the summary file; compressing the data sets in the summary file; coding the compressed data sets in the summary file by a numeric, alphabetic or alphanumeric code; sorting and assigning the compressed data sets established on the basis of the numeric, alphabetic or alphanumeric code to groups with associated group codes; decoding the group code of the assigned one of the groups; outputting the decoded group code as an alphanumeric message part of the log file; and storing the alphanumeric message part of the log file in a memory.

17. The method of claim 16, wherein during compression of the data set, the data set is processed such that a data length and/or a data content of the data set is reduced.

18. The method of claim 17, further comprising automatically inserting spaces for special characters contained in the data set and for at least two or more consecutive spaces resulting therefrom, removing consecutive spaces extending beyond a single space from the data set.

19. The method of claim 16, further comprising resolving the concatenation of the contents of the data set and removing resultant separate alphanumeric contents with fewer than four characters.

20. The method of claim 19, further comprising analyzing and unifying the resultant separate alphanumeric contents, assigning a reference word to each of the separate alphanumeric contents and assigning a numeric code to each reference word.

21. The method of claim 20, further comprising converting the summary file with the separate alphanumeric contents established on the basis of the reference word and/or the numeric code into a two-dimensional alphanumeric code matrix.

22. The method of claim 21, wherein the numeric codes of the two-dimensional alphanumeric code matrix are sorted chronologically and/or as a function of character length.

23. The method of claim 21, wherein similar ones of the numeric codes of the two-dimensional alphanumeric code matrix are grouped by using a similarity operation.

24. The method of claim 23, wherein the similarity operation is a Levenshtein distance.

25. The method of claim 24, wherein when the numeric code being considered has a Levenshtein distance of zero to one of the groups, said numeric code being considered is assigned to said one of the groups.

26. The method of claim 24, wherein when the numeric code being considered has a Levenshtein distance of one to one of the groups, and when the numeric code being considered differs by only one digit from the one of the groups, said numeric code being considered is assigned to said one of the groups.

27. The method of claim 24, wherein a new numeric code being considered is compared to non-grouped numeric codes of a same length or to non-grouped numeric codes having a greater length of a maximum of one character.

28. The method of claim 27, wherein when the numeric code being considered has a Levenshtein distance of one to another non-grouped numeric code, a new group is formed from the numeric code being considered and the non-grouped numeric code such that when the numeric code being considered and the non-grouped numeric code are of different lengths, the new group is assigned a shorter numeric code, and when the numeric code being considered and the non-grouped numeric code are of same length with at least one different digit, the new group is assigned the numeric code without the different digit.

29. The method of claim 24, wherein when the numeric code being considered is determined to have no Levenshtein distance of less than two to any other non-grouped numeric code, the numeric code being considered is assigned to a new group.

30. A technical facility, comprising an automation system including log files of different types, said automation system configured to: concatenate all contents of a data set of a log file; summarize data sets of all log files in a summary file; chronologically sort the data sets of the log files in the summary file; compress the data sets in the summary file; code the compressed data sets in the summary file by a numeric, alphabetic or alphanumeric code; sort and assign the compressed data sets established on the basis of the numeric, alphabetic or alphanumeric code to groups with associated group codes; decode the group code of the assigned one of the groups; output the decoded group code as an alphanumeric message part of the log file; and store the alphanumeric message part of the log file in a memory.

31. An automation system of a technical facility, comprising: an operation and monitoring level; an automation level; a field level; and a communication system configured to connect the operation and monitoring level, the automation level and the field level to one another, at least one of the operation and monitoring level, the automation level and the field level being configured to: concatenate all contents of a data set of a log file; summarize data sets of all log files in a summary file; chronologically sort the data sets of the log files in the summary file; compress the data sets in the summary file; code the compressed data sets in the summary file by a numeric, alphabetic or alphanumeric code; sort and assign the compressed data sets established on the basis of the numeric, alphabetic or alphanumeric code to groups with associated group codes; decode the group code of the assigned one of the groups; output the decoded group code as an alphanumeric message part of the log file; and store the alphanumeric message part of the log file in a memory.

Description

[0037] The characteristics, features and advantages of this invention described above, and the manner in which they are achieved, will become clearer and easier to understand in conjunction with the following description of exemplary embodiments, which are explained in greater detail in conjunction with the drawings, in which:

[0038] FIG. 1 shows a schematic of an exemplary embodiment for an automation system for control and monitoring of a technical facility, and

[0039] FIG. 2 shows a schematic of an exemplary embodiment for a method for processing log files that are generated in the automation system.

[0040] Parts that correspond to one another are provided with the same reference characters in all figures.

[0041] FIG. 1 shows a schematic block diagram of an exemplary embodiment for an automation system 1 of a technical facility, such as a power plant or a chemical plant.

[0042] The automation system 1, in a process level AS (also called the automation level), comprises a number of automation devices 2, which are connected to one another via a data bus 3 and communicate with one another and with a control unit 4 in an operation and monitoring level BB. In addition the automation devices 2 are connected via a field bus 5 to field devices 6 at a field level FE and communicate with one another.

[0043] The data bus 3 can be an Ethernet bus or another suitable data transmission unit for example. The field bus 5 can be an Ethernet-based field bus 5 for example or can be embodied wired or wirelessly as another suitable data transmission unit. The data bus 3 and/or the field bus 5 can form a computer network in this case.

[0044] The automation devices 2 can be constructed both as freely-programmable and also as stored-program processors, especially as a processor unit, and in each case control, regulate and/or monitor a number of component groups 7 or subsystems of the individual parts of the technical facility. In particular the control, regulation and/or monitoring of the technical facility also includes an analysis and other processing of the data, such as measurement signals, control signals, input and output signals, intermediate signals, stored data, processing data.

[0045] The control units 4 can be data processing units, such as a personal computer or another suitable operation and monitoring unit, by means of which the technical facility is monitored and controlled.

[0046] The field devices 6 can for example be compact, freely-programmable or stored-program controls, especially a processor unit and/or at least one partly hard-wired or logical circuit arrangement, which control, regulate and/or monitor individual components 8 of parts of plants, such as measurement probes (sensors) and/or setting elements (actuators) and which are connected for the purposes of communication by means of the field bus 5 to the automation device 2 (also called control device).

[0047] During operation of a technical facility, such as a power plant, large volumes of data in the form of log files P1 to Pn, which comprise control commands, status messages and/or fault messages, which are triggered by control interventions in the control unit 4 or are displayed on screens or on other display units of the control unit 4, are moved via the data bus 3 and the field bus 5. In such cases, in particular during commissioning and during startup/shutdown of the facility, a plurality of control interventions are required that result in a correspondingly large number of responses to be considered. This leads to a not inconsiderable load on the operating personnel.

[0048] FIG. 2 shows a schematic of an exemplary embodiment for a method for processing, especially pre-processing, of log files P1 to Pn, which are generated in the automation system 1 and are transmitted via the data bus 3 and/or the field bus 5 and are exchanged between the units of the automation system, especially between the control units 4, the automation devices 2 and the field devices 6.

[0049] The log files P1 to Pn each comprise at least one acquisition time stamp and a text, especially a message, a notification, a status text, which describe at least one event. In addition the log files P1 to Pn can include as text a system time, a version/revision number and further information. In such cases for example a number of log files P1 to Pn can describe one event or a number of events. Various *log, *xml, *txt, *info and/or *dmp files are created and generated as log files P1 to Pn in an automation system 1 of a technical facility for example.

[0050] In at least one of the components of the automation system 1 a computer program product in particular is implemented, which is able to be loaded directly into a memory of a digital computer, such as a control unit 4, and which comprises program code sections that are suitable for executing steps of the method described below. As an alternative the computer program product can also be loaded into an automation device 2.

[0051] As an alternative a computer-readable storage medium, e.g. any given memory, can be provided, which comprises instructions (e.g. in the form of program code) able to be executed by a control unit 4, which are suitable for the control unit 4 to carry out steps of the method described below.

[0052] The method implemented as a log file compressor 9 for automatic processing, especially pre-processing, of the plurality of generated, especially heterogeneous or incompatible log files P1 to Pn, comprises at least the following steps:

[0053] Determining at least one message part N1 to Nn of each data set D1 to Dn of a log file P1 to Pn by:

[0054] Concatenating all the contents of the respective data set D1 to Dn of the log file P1 to Pn,

[0055] Summarizing the data sets D1 to Dn of all log files P1 to Pn in a summary file SD,

[0056] Chronological sorting of the data sets D1 to Dn of all log files P1 to Pn in the summary file SD,

[0057] Compressing each data set D1 to Dn in the summary file SD,

[0058] Coding the compressed data sets kD1 to kDn in the summary file SD by at least one or more numeric code(s) Kx,

[0059] Sorting and/or assigning (especially grouping) of the compressed data sets kD1 to kDn, on the basis of the established numeric code(s) Kx, to at least one group Gy that has at least one associated group code KGy, and

[0060] Outputting the alphanumeric message part N1 to Nn of the log file P1 to Pn by decoding the group code KGy of the assigned group Gy and output of the decoded group code KGy.

[0061] The message part N1 to Nn of each data set D1 to Dn of the log files P1 to Pn comprises text messages, such as notifications, statuses, texts, information, warnings etc. for example. The respective data set D1 to Dn can take the form of a table or a database or another suitable form with field divisions for example.

[0062] The message part N1 to Nn of the respective log file P1 to Pn can be contained for example in one or more fields of the associated data set D1 to Dn. Further fields of the data set D1 to Dn contain log and/or system information, such as for example system time, version number, revision number etc.

[0063] The inventive method makes provision for the contents of the fields of the respective data set D1 to Dn of each log file P1 to Pn to be concatenated in a first step and subsequently for all concatenated data sets D1 to Dn of all log files P1 to Pn to be summarized in a summary file SD, especially an individual table or database. For example two log files P1 and P2 comprise the following contents or entries:

Log File P1:

[0064]

TABLE-US-00001
28.03.2014 ABC
21.03.2014 ZDF

Log File P2:

[0065]

TABLE-US-00002
27.03.2014 XYZ
01.01.2014 SAP

[0066] After concatenation and summarizing of the contents in the summary file SD, said file summarizes the contents as follows:

TABLE-US-00003
28.03.2014 ABC
27.03.2014 XYZ
21.03.2014 ZDF
01.01.2014 SAP

[0067] Within this especially tabular summary file SD the data sets D1 to Dn with the concatenated contents are sorted on the basis of a respective time stamp in each case, especially chronologically sorted, for example sorted in ascending or descending order.
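The concatenation, summarizing and chronological sorting described above can be illustrated by the following minimal Python sketch. It is not part of the disclosed embodiment; the names `log_p1`, `log_p2`, `summarize` and `sort_chronologically` are hypothetical, and it is assumed that every data set starts with a time stamp in the form DD.MM.YYYY as in the example tables.

```python
from datetime import datetime

# Hypothetical in-memory stand-ins for the log files P1 and P2 shown above.
log_p1 = [("28.03.2014", "ABC"), ("21.03.2014", "ZDF")]
log_p2 = [("27.03.2014", "XYZ"), ("01.01.2014", "SAP")]


def summarize(*log_files):
    """Concatenate the fields of each data set and collect all of them in one list."""
    summary = []
    for log_file in log_files:
        for fields in log_file:
            # Concatenation step: join all fields of one data set into one string.
            summary.append(" ".join(fields))
    return summary


def sort_chronologically(summary, descending=True):
    """Sort concatenated data sets by their leading DD.MM.YYYY time stamp (assumed format)."""
    def stamp(entry):
        return datetime.strptime(entry.split(" ", 1)[0], "%d.%m.%Y")
    return sorted(summary, key=stamp, reverse=descending)


summary_file = sort_chronologically(summarize(log_p1, log_p2))
for entry in summary_file:
    print(entry)
```

With the two example log files this yields the order of the summary table above (descending); passing `descending=False` gives the ascending order.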

[0068] The associated time stamp can especially involve an acquisition time stamp for the underlying message, which especially describes an event, such as ambient temperature in permitted range, initial start activated, pump failed, turbine started etc. in the technical facility. As an alternative or in addition the data sets D1 to Dn can be sorted on the basis of alternate or additional log parameters, such as location of acquisition, type of acquisition and/or acquisition period.

[0069] On subsequent compression of the data sets D1 to Dn with the concatenated contents these are processed such that at least the data length and/or the data contents of the respective data set D1 to Dn is/are reduced.

[0070] For example spaces are inserted automatically in place of special characters, such as for example #, @, etc., or other non-numeric and non-alphabetic characters, such as -, /, etc., contained in a data set D1 to Dn. Subsequently, wherever the contents string of the respective data set D1 to Dn contains two or more consecutive spaces, these are automatically reduced to one space by removal of the consecutive spaces extending beyond one individual space. Through this each character that does not describe the event in greater detail is removed from the data set D1 to Dn, especially from the message part N1 to Nn (for example text part, notification part, status part) of the log file P1 to Pn. Thus the message part N1 to Nn is extracted in a simple manner from the system part (for example system time, version, revision) of the log file P1 to Pn, so that the further processing and analysis of the log file is restricted to the content-relevant message part N1 to Nn and is thus greatly simplified and accelerated.
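The replacement of special characters and the reduction of consecutive spaces can be sketched, for example, with two regular expressions. The function name `compress` and the restriction to ASCII letters and digits are assumptions made for this illustration only; the sample data set is invented.

```python
import re


def compress(data_set: str) -> str:
    """Replace special characters with spaces, then reduce runs of spaces to one."""
    # Assumption: everything except ASCII letters, digits and whitespace is "special".
    spaced = re.sub(r"[^A-Za-z0-9\s]", " ", data_set)
    # Collapse consecutive whitespace to a single space and trim both ends.
    return re.sub(r"\s+", " ", spaced).strip()


print(compress("28.03.2014 #pump-A/1 @ failed"))
```

For the invented input above the sketch returns `28 03 2014 pump A 1 failed`.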

[0071] In a further step the concatenation of the contents of the message part N1 to Nn is then resolved, through which the contents are separated, for example into words/phrases.

[0072] For further data compression of the data sets D1 to Dn, each of the separated words/phrases having fewer than four, especially fewer than three, characters is removed from the data set D1 to Dn for example. For example the word "is" or "yes" is deleted. This makes it possible, easily and effectively, to compare two or more different entries/contents with one another syntactically (without semantic meaning) and automatically.
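A minimal sketch of resolving the concatenation and removing short words is given below. The function name `split_and_filter` and the parameter `min_length` are hypothetical; splitting at whitespace is an assumption that presupposes a data set whose special characters have already been replaced by spaces.

```python
def split_and_filter(message: str, min_length: int = 4) -> list[str]:
    """Split a message into words and drop words shorter than min_length characters."""
    return [word for word in message.split() if len(word) >= min_length]


# Words with fewer than four characters ("is", "yes") are removed.
print(split_and_filter("pump is failed yes"))
# With min_length=3 only words with fewer than three characters are removed.
print(split_and_filter("pump is failed yes", min_length=3))
```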

[0073] There is also provision for the data sets D1 to Dn, especially their separate alphanumeric contents, such as the separated words/phrases with especially more than three characters, to be analyzed and unified such that each separate alphanumeric content is assigned a reference word Rx and each reference word Rx is assigned a numeric code Kx. Through this the memory requirement for archiving the data sets D1 to Dn is greatly reduced and their analysis is greatly accelerated.

[0074] For example the words and/or phrases of the message contents

[0075] ambient-temperature in permitted range,

[0076] turbine-temperature in green range,

[0077] first-start activated,

[0078] pump failed,

[0079] turbine started,

are converted into the following reference words Rx

[0080] ambient-temperature is permitted,

[0081] turbine-temperature is permitted,

[0082] first-start activated,

[0083] pump failure,

[0084] turbine start,

and/or are converted into the following numeric code Kx:

[0085] 123,

[0086] 423,

[0087] 56,

[0088] 7,

[0089] 8.

[0090] In this case individual words are encoded by means of a single-digit numeric code Kx for example. Phrases with more than one word are encoded by means of a numeric code Kx having a number of digits corresponding to the number of words for example. Identical words and/or phrases are encoded with the same reference word Rx and with the same numeric code Kx. Words and/or phrases of different data sets D1 to Dn with partly matching characters and/or words are encoded with a numeric code Kx matching at least at these digits.
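The coding scheme can be sketched as a small code book that assigns the next free digit to each new unit of a reference word Rx and reuses the digit for identical units. The class `Codebook` and its methods are hypothetical. The sketch assumes that the caller decides what counts as one unit (here "pump failure" and "turbine start" are each treated as a single unit, in line with the single-digit codes 7 and 8 above) and, because single digits are used, that at most nine distinct units occur.

```python
class Codebook:
    """Maps units of reference words to single digits (at most nine units in this sketch)."""

    def __init__(self):
        self._codes = {}  # unit -> digit character

    def encode(self, units):
        """Return the numeric code for a sequence of units, one digit per unit."""
        digits = []
        for unit in units:
            if unit not in self._codes:
                if len(self._codes) >= 9:
                    raise ValueError("single-digit sketch supports at most nine units")
                self._codes[unit] = str(len(self._codes) + 1)
            digits.append(self._codes[unit])
        return "".join(digits)

    def decode(self, code):
        """Return the units that a numeric code stands for."""
        reverse = {digit: unit for unit, digit in self._codes.items()}
        return [reverse[digit] for digit in code]


book = Codebook()
references = [
    ["ambient-temperature", "is", "permitted"],
    ["turbine-temperature", "is", "permitted"],
    ["first-start", "activated"],
    ["pump failure"],
    ["turbine start"],
]
codes = [book.encode(units) for units in references]
print(codes)
print(book.decode("423"))
```

Applied in this order, the sketch reproduces the codes 123, 423, 56, 7 and 8 of the example, and partly matching phrases share the digits 2 and 3.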

[0091] As an alternative to a numeric code Kx an alphabetic code and/or an alphanumeric code can be used. The coding of the data sets D1 to Dn by means of the numeric code Kx has the advantage of a simple and fast sorting and grouping of the data sets D1 to Dn of the log files P1 to Pn.

[0092] For further unification and compression of the data sets D1 to Dn of the various log files P1 to Pn and simple and fast analysis of these data sets D1 to Dn, the tabular summary file SD with the separate alphanumeric contents of all log files P1 to Pn is converted on the basis of the established reference words Rx and/or of the numeric code Kx into a two-dimensional, especially alphanumeric, code matrix KM.

[0093] In the two-dimensional code matrix KM the numeric codes Kx are then sorted chronologically, especially in ascending order of time and/or as a function of the respective character length. For example the numeric codes Kx 5632, 543, 64221, 123 are sorted as follows:

[0094] 123, 543, 5632, 64221.
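The ordering of the example is consistent with sorting first by character length and then by value within the same length. The following sketch assumes exactly this ordering rule and represents the numeric codes Kx as strings; both are assumptions for illustration.

```python
codes = ["5632", "543", "64221", "123"]

# Sort by character length first, then lexicographically within the same length.
sorted_codes = sorted(codes, key=lambda code: (len(code), code))
print(sorted_codes)
```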

[0095] In addition, in a further step, similar numeric codes Kx of the code matrix KM can be grouped by means of a similarity operation, especially the so-called Levenshtein distance, especially assigned to at least one group Gy (cluster).

[0096] In this case each group Gy is described or represented by an associated group code KGy. The associated group code KGy in this case can be generated from at least the numeric code Kx of a first data set D1 to Dn that will be assigned to this group Gy. As an alternative the respective group code KGy of one or more groups Gy can be predetermined.

[0097] For example, if a Levenshtein distance of zero to a group Gy is established for a new numeric code Kx to be considered, this numeric code Kx to be considered can be assigned to this group Gy, especially with the group code KGy representing this group Gy.

[0098] On the other hand, for a Levenshtein distance of one to a group Gy for a numeric code Kx to be considered and with a difference of the numeric code Kx to be considered at only one digit from this one group Gy, this numeric code Kx to be considered is assigned to this one group Gy. For example for a group Gy with an associated group code KGy of 12, this group will be assigned the numeric codes Kx with the following digits 123, 124, 12 and/or 13.
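The Levenshtein distance counts the minimum number of single-character insertions, deletions and substitutions needed to turn one string into another. A minimal sketch of the distance and of the assignment test follows; the function names `levenshtein` and `belongs_to_group` are hypothetical, and codes are represented as strings by assumption.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def belongs_to_group(code: str, group_code: str) -> bool:
    """A code is assigned to a group if its distance to the group code is zero or one."""
    return levenshtein(code, group_code) <= 1


for code in ["123", "124", "12", "13", "345"]:
    print(code, levenshtein(code, "12"), belongs_to_group(code, "12"))
```

For the group code 12 the codes 123, 124, 12 and 13 pass the test, whereas a code such as 345 does not.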

[0099] In order for example to improve and to accelerate search functions in a subsequent analysis and also in the grouping of the numeric codes Kx, the comparison of the new numeric code Kx to be considered with the already generated groups Gy will be started at the last group Gy set up or at the group Gy that has last been assigned a preceding numeric code Kx.

[0100] In particular, starting the grouping or assignment of a new numeric code Kx to be considered at the group Gy that has been assigned the preceding numeric code Kx accelerates the grouping algorithm, since neighboring numeric codes Kx that have previously been sorted chronologically, especially those following on from each other but also those preceding each other in time, are usually assigned to one and the same event and thus are able to be assigned to one and the same group Gy.

[0101] Over and above this each newly generated and thus new numeric code Kx to be considered can be compared with non-grouped numeric codes Kx of the same length or of a length that is greater by a maximum of one character.

[0102] In the event of the Levenshtein distance to another non-grouped numeric code Kx of a new numeric code Kx to be considered being equal to one, a new group Gy+1 with a new group code KGy+1 is formed from the two numeric codes Kx such that

[0103] If the two numeric codes Kx to be considered are of a different length, the new group Gy+1 is assigned the shorter numeric code Kx as the new group code KGy+1 (for example Kx=123 and 12 leads to a new group Gy+1 with a new group code KGy+1 of 12) or

[0104] If the two numeric codes Kx are the same length with at least one different digit, the new group Gy+1 will be assigned the numeric code Kx without the different digit as the new group code KGy+1 (for example Kx=133 and 134 leads to a new group Gy+1 with a new group code KGy+1 of 13).
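The two rules for deriving the new group code KGy+1 can be sketched as follows. The function name `new_group_code` is hypothetical, and the sketch assumes that the caller has already established a Levenshtein distance of one between the two codes; it does not check this itself.

```python
def new_group_code(code_a: str, code_b: str) -> str:
    """Derive a group code from two codes that are assumed to have a distance of one."""
    if len(code_a) != len(code_b):
        # Different lengths: the shorter code becomes the group code.
        return min(code_a, code_b, key=len)
    # Same length: keep only the positions at which both codes agree.
    return "".join(a for a, b in zip(code_a, code_b) if a == b)


print(new_group_code("123", "12"))
print(new_group_code("133", "134"))
```

For the examples given above this yields the group codes 12 and 13.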

[0105] In addition or as an alternative, when a new numeric code Kx to be considered that does not have a Levenshtein distance that is equal to one to another non-grouped numeric code Kx is identified, this numeric code Kx to be considered will be assigned to a further new group Gy+1. In this case this further new group Gy+1 will be assigned this new numeric code Kx to be considered as group code KGy+1. Future new numeric codes Kx to be considered, which, when compared to this new group code KGy+1, have a Levenshtein distance of one, will be assigned to this new group Gy+1.

[0106] The two-dimensional code matrix KM thus represents a dynamic expert system in which the established groups Gy, Gy+1 with group codes KGy, KGy+1 for grouping similar and/or identical codes Kx that describe an event, can be expanded or supplemented in an ongoing manner by addition of new numeric codes Kx of new data sets D1 to Dn of further/new log files P1 to Pn.
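The ongoing expansion of the groups can be illustrated by a simplified grouping loop. This sketch covers only part of the described behavior: a new code is compared first with the group that was assigned the preceding code, then with the remaining groups, and it opens a new group with itself as group code if no group code lies within a Levenshtein distance of one. The pairing of non-grouped codes is omitted, and all names (`levenshtein`, `group_codes`) as well as the sample codes are assumptions.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1, current[j - 1] + 1,
                               previous[j - 1] + cost))
        previous = current
    return previous[-1]


def group_codes(codes):
    """Assign each code to a group; returns a dict of group code -> member codes."""
    groups = {}
    last_group = None
    for code in codes:
        # Search order: the most recently used group first, then all other groups.
        candidates = [g for g in groups if g != last_group]
        if last_group is not None:
            candidates.insert(0, last_group)
        target = next((g for g in candidates if levenshtein(code, g) <= 1), None)
        if target is None:
            # No similar group found: the code itself becomes a new group code.
            target = code
            groups[target] = []
        groups[target].append(code)
        last_group = target
    return groups


result = group_codes(["12", "123", "124", "56", "13", "57", "999"])
for group_code, members in result.items():
    print(group_code, members)
```

With the invented input above, the codes 12, 123, 124 and 13 end up in group 12, the codes 56 and 57 in group 56, and 999 forms its own group.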

[0107] For further processing of the content of the numeric code Kx and/or group codes KGy, KGy+1, these are decoded and output as alphanumeric message part N1 to Nn of the log file P1 to Pn and can be stored for example in a memory 10.

[0108] As an alternative or in addition the decoded numeric codes Kx and group codes KGy, KGy+1, which represent the alphanumeric message part N1 to Nn, can be supplied to the control unit 4 for further analysis and assessment of the event(s).

[0109] Over and above this the method allows only a predetermined number of groups Gy to be used when the method is started and no new groups Gy+1 to be generated. Through this a robust grouping of new numeric codes Kx to be considered is initially made possible. The generation of new groups Gy+1 can then be allowed in ongoing operation.

[0110] Although the invention has been illustrated and described in greater detail by preferred exemplary embodiments, the invention is not restricted by the disclosed examples and other variations can be derived herefrom by the person skilled in the art, without departing from the scope of protection of the invention. In particular the log file compressor 9 can be implemented at a suitable location in a component of the automation system 1.