HIGH-PERFORMANCE MECHANISM FOR GENERATING LOGGING INFORMATION IN RESPECT OF A COMPUTER PROCESS

20170255540 · 2017-09-07

    Abstract

    Some embodiments are directed to logging within a software application executed over an assembly of information processing devices. More particularly, some embodiments relate to a method allowing process logging in the case of a software application operating with several processes and/or threads.

    Claims

    1. A logging method comprising: executing a process (P1) that includes at least one application thread (F1, F2) and at least one logging thread (FJ) over an assembly of information processing devices; detecting within the at least one application thread a logging event, and immediately transmitting first logging information to said at least one logging thread (FJ); receiving said first logging information and generating second logging information starting from said first logging information; and publishing said second logging information via a publication interface (IP) to at least one processing element (FS, PJ) previously registered with the at least one logging thread.

    2. The method as claimed in claim 1, wherein the at least one processing element includes an output thread (FS) belonging to the process (P1).

    3. The method as claimed in claim 1, wherein the at least one processing element includes a thread of a logging process (PJ) distinct from the process (P1).

    4. The method as claimed in claim 1, wherein the application thread (F1) transmits the first information to the at least one logging thread in an asynchronous manner via a communications interface (IC).

    5. The method as claimed in claim 1, wherein the communications interface and the publication interface are of the socket type and conform to the ZeroMQ library.

    6. The method as claimed in claim 1, wherein, at the start of said process (P1), the at least one application thread (F1, F2) waits for the initialization of the at least one logging thread (FJ) before continuing with its execution, and wherein said at least one logging thread is initialized by synchronizing itself with a sub-set of the at least one processing element (FS, PJ).

    7. The method as claimed in claim 1, wherein the first logging information comprises a name and a level.

    8. The method as claimed in claim 1, wherein, when the process (P1) is duplicated, the at least one logging thread (FJ) is terminated, then restarted within the initial parent process (P1).

    9. The method as claimed in claim 1, wherein, when the at least one logging thread receives a signal, it publishes second logging information associated with the signal, then triggers the processing code associated with the signal.

    10. The method as claimed in claim 1, wherein, when the application thread receives a signal, it transmits first logging information associated with the signal to the at least one logging thread, waits for a given time, then causes the termination of the process.

    Description

    BRIEF DESCRIPTION OF THE DRAWINGS

    [0025] FIG. 1 shows schematically one exemplary embodiment of the invention.

    [0026] FIG. 2 shows schematically a finite state machine according to one embodiment of the invention.

    DETAILED DESCRIPTION OF THE INVENTION

    [0027] From a software perspective, a software application is composed of one or more processes.

    [0028] A “process” may be seen as a program being executed by an information processing system (or computer). A process may be defined as comprising:

    [0029] A set of instructions to be executed, which may be in the read-only memory, but most often downloaded from the mass memory to the random access memory;

    [0030] An addressing space in a random access memory for storing the stack, the working data, etc.;

    [0031] Resources such as the network ports.

    [0032] The same process may comprise several threads (also called tasks or lightweight processes). Unlike a process, a thread does not have its own virtual memory but shares it with all the other threads of the same process.

    [0033] In the example illustrated by FIG. 1, an application comprises two processes P.sub.1, P.sub.2. The process P.sub.1 comprises three threads F.sub.1, F.sub.2, F.sub.3. Although not shown in the figure, the process P.sub.2 may also comprise several threads.

    [0034] It should be noted that the threads may have a lifetime different from that of the processes and, a fortiori, from the duration of execution of the application: threads may be created dynamically in the course of execution of the process and terminated at any moment, once the task for which they are provided is finished. FIG. 1 therefore illustrates a situation at a given moment in the execution of the software application.

    [0035] In the following, the threads F.sub.1, F.sub.2 will be called “application threads”, in order to distinguish them from the logging thread or threads F.sub.J and from the output thread F.sub.S which will be described hereinbelow.

    [0036] The information processing system allowing the software application to be executed is typically a parallel system formed from an assembly of information processing devices.

    [0037] Each information processing device may for example be a microprocessor with associated circuits (memory, etc.), and the system is then formed from an assembly of interconnected microprocessors. Another example is that of a single microprocessor forming a system, and composed of an assembly of physical cores. Such a microprocessor is generally called “multi-core”.

    [0038] Irrespective of the architecture implemented, the processes P.sub.1, P.sub.2, P.sub.J of the example in FIG. 1 and the threads that they comprise may be executed in parallel.

    [0039] When the process P.sub.1 starts up, at least one thread also starts up. This first thread can subsequently launch the execution of other threads. These threads may of course be application threads (in other words threads belonging to the application and implementing its “logic”), but also, notably, logging threads F.sub.J.

    [0040] In the following, as illustrated in FIG. 1, only one logging thread is described. It is however possible to provide several logging threads, notably in order to distribute the load.

    [0041] According to one embodiment, when the logging thread starts up, it creates:

    [0042] A communications interface I.sub.C, allowing the reception of first logging information from the application threads F.sub.1, F.sub.2, in other words from the main application thread and from any potential threads that it might subsequently create;

    [0043] A publication interface I.sub.P allowing second logging information to be published to processing elements.

    [0044] Here, “processing elements” refers to the threads and the processes. In FIG. 1, as will be seen later on, the logging process P.sub.J and the output thread F.sub.S form such processing elements which are capable of receiving the logging information.

    [0045] According to one embodiment of the invention, these interfaces are of the “socket” type. A “socket” is a communication mechanism well known to those skilled in the art, originally developed for operating systems of the “Unix” type, but today available on the majority of operating systems.

    [0046] They may for example conform to ZeroMQ. ZeroMQ is a platform of the “middleware” type which is inserted between the underlying operating system and the applications in order to provide additional infrastructure services that are independent of the operating system. Compared with competing platforms such as CORBA (Common Object Request Broker Architecture), for example, ZeroMQ offers great ease of use and excellent performance characteristics: the code required in the application threads is very short, leading to little processing overhead, and the processing within the ZeroMQ library itself is also very fast. Accordingly, ZeroMQ complies with the requirements of the invention and allows the desired services to be provided without any additional processing that may be detrimental.

    [0047] The mechanisms offered by ZeroMQ are notably accessible through a library that can be called from application code written in the C language. The mechanisms and advantages of the invention are therefore accessible to processes developed in the C language.

    [0048] According to one embodiment, the communications interface I.sub.C is an asynchronous interface: it allows the application thread to send out logging information, then to continue with its processing, without having to wait for an acknowledgement from the logging thread F.sub.J.

    [0049] In the framework of an implementation using ZeroMQ, this communications interface I.sub.C may be of the push/pull type. In this case, upon start-up, the logging thread creates a socket of the “pull” type. The application threads F.sub.1, F.sub.2 may then transmit the logging information via a socket of the “push” type connected to the “pull” socket of the logging thread F.sub.J.

    [0050] Between the two types of sockets forming the communications interface I.sub.C, an inter-thread and intra-process transport protocol such as the “inproc” protocol may be established. This protocol allows the transmission of messages between threads within the same process: the information is transmitted directly through the memory belonging to the context associated with the process. This mechanism therefore does not generate any input/output operations, and hence contributes to the high performance of the method according to the invention.
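
The asynchronous push/pull exchange described above can be sketched as follows. This is only an illustration in Python: the standard library's `queue.Queue` stands in for a ZeroMQ push/pull pair over the “inproc” transport, and the names `log_event`, `logging_thread`, `run_demo` and the `STOP` sentinel are assumptions made for the example, not elements of the described method. Note also that `queue.Queue` uses an internal lock, which a real “inproc” transport may avoid.

```python
import queue
import threading

STOP = object()  # hypothetical sentinel used to end the logging thread


def log_event(channel, name, level):
    """Application side: hand over first logging information and return at once."""
    channel.put_nowait((name, level))  # no acknowledgement is awaited


def logging_thread(channel, received):
    """Logging side: pull first logging information until STOP arrives."""
    while True:
        item = channel.get()
        if item is STOP:
            break
        received.append(item)


def run_demo():
    """Start a logging thread, send two events from the caller, return what it got."""
    channel = queue.Queue()  # unbounded, so put_nowait never blocks
    received = []
    worker = threading.Thread(target=logging_thread, args=(channel, received))
    worker.start()
    log_event(channel, "Module1.logger1", 3)
    log_event(channel, "Module1.logger2", 5)
    channel.put(STOP)
    worker.join()
    return received
```

The application side only enqueues a small tuple and continues; everything else happens in the logging thread.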

    [0051] The publication interface I.sub.P may comprise a socket of the “pub” type created within the logging thread F.sub.J. The processing elements (processes or threads) P.sub.J, F.sub.S can create sockets of the “sub” type in order to subscribe to the “pub” socket of the logging thread F.sub.J. The “publish-subscribe” model managed by ZeroMQ thus allows messages to be transmitted to all the processing elements already subscribed.
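
The publish-subscribe behavior can be illustrated with a small in-memory model. The `Publisher` class and the `drain` helper below are hypothetical stand-ins written with the Python standard library; they are not the ZeroMQ API. The sketch only shows the property stated above: a message reaches the elements that are subscribed at the moment of publication.

```python
import queue


class Publisher:
    """Hypothetical stand-in for a "pub" socket: fan-out to current subscribers."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self):
        """Register a processing element and return its private inbox."""
        inbox = queue.Queue()
        self._subscribers.append(inbox)
        return inbox

    def publish(self, message):
        """Deliver the message to every element subscribed at this moment."""
        for inbox in self._subscribers:
            inbox.put_nowait(message)


def drain(inbox):
    """Return everything currently waiting in an inbox."""
    items = []
    while not inbox.empty():
        items.append(inbox.get_nowait())
    return items
```

An element that subscribes after a publication does not receive it, which is why a synchronization step is useful when no message may be lost.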

    [0052] According to one embodiment of the invention, at the start of the process P.sub.1, the application thread waits for the initialization of this logging thread F.sub.J before continuing with its execution.

    [0053] When it launches the execution of the logging thread F.sub.J, the (main) thread of the process P.sub.1 can indicate a number of processing elements which must receive logging information. It will only receive an acknowledgement from the logging thread F.sub.J when this number of subscribed processing elements has been reached. Once the acknowledgement has been received, the thread can then continue with its execution.

    [0054] Similarly, the logging thread F.sub.J is initialized by synchronizing itself with the number of processing elements which must receive logging information.

    [0055] For this purpose, the logging thread F.sub.J can publish synchronization information. Upon receipt of this information, the processing elements having received it will transmit an acknowledgement to the logging thread. The latter can count the acknowledgements received and readily determine when the specified number is reached.
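
The counting of acknowledgements can be sketched as below. This is a minimal model under stated assumptions: queues from the Python standard library replace the sockets, the synchronization information is republished until enough acknowledgements arrive (the text does not say whether it is sent once or repeatedly), and the names `wait_for_subscribers`, `run_demo` and the `"SYNC"` payload are invented for the example.

```python
import queue
import threading


def wait_for_subscribers(publish_sync, ack_queue, expected, poll=0.05):
    """Publish synchronization information until `expected` elements have answered."""
    acknowledged = set()
    while len(acknowledged) < expected:
        publish_sync()  # (re)publish the synchronization information
        try:
            acknowledged.add(ack_queue.get(timeout=poll))
        except queue.Empty:
            pass  # nobody answered yet; publish again
    return acknowledged


def run_demo(expected=2):
    """Simulate `expected` processing elements that acknowledge the first SYNC."""
    ack_queue = queue.Queue()
    inboxes = [queue.Queue() for _ in range(expected)]

    def element(ident, inbox):
        inbox.get()           # wait for synchronization information
        ack_queue.put(ident)  # acknowledge to the logging thread

    threads = [
        threading.Thread(target=element, args=(ident, inbox))
        for ident, inbox in enumerate(inboxes)
    ]
    for thread in threads:
        thread.start()

    def publish_sync():
        for inbox in inboxes:
            inbox.put_nowait("SYNC")

    result = wait_for_subscribers(publish_sync, ack_queue, expected)
    for thread in threads:
        thread.join()
    return result
```

Only once `wait_for_subscribers` returns would the logging thread acknowledge its own initialization to the application thread.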

    [0056] This synchronization phase ensures that no logging information is lost: indeed, the establishment of the connection between the logging thread F.sub.J and the processing elements can take a certain time. Furthermore, the processing elements themselves may also be in the process of initialization. If the logging thread F.sub.J began to send out logging information immediately, any information published during this time would be lost, because the receiving processing elements do not acknowledge the logging information published to them and the logging thread therefore cannot detect the loss. This absence of acknowledgement is what allows good performance characteristics to be achieved.

    [0057] However, in certain situations, it can be important, or even crucial, not to lose any logging information.

    [0058] According to this implementation, it is therefore possible to specify the number of processing elements for which the receipt of all of the logging information must be guaranteed by this synchronization mechanism.

    [0059] This number of elements determines a sub-set of the set of subscribed processing elements because, once the initialization phase has finished, it is perfectly possible for other processing elements to subscribe to the publications of the logging thread F.sub.J. However, the latter may miss the first publications of the logging thread F.sub.J.

    [0060] The logging method according to the invention comprises a step for execution of the process P.sub.1. As previously described, this execution involves the execution of at least one application thread F.sub.1 and of one logging thread F.sub.J. In the case of a process written in C language, this application thread F.sub.1 may correspond to the execution of the function main( ).

    [0061] Once the initialization phase has finished, the application thread or threads execute the code of the software application.

    [0062] The method then consists in detecting, within the application thread or threads, a logging event and in immediately transmitting first logging information e1, e2 to the logging thread F.sub.J.

    [0063] The detection of a logging event is a technique known per se, which consists in inserting “loggers” into the code in order to trigger a logging event when certain conditions are met. These conditions may be quite simply the passage through a precise point in the code (in order to allow the sequence of operations of the code to be followed), or else a situation of error, etc.

    [0064] This logging information generated by the application thread F.sub.1, F.sub.2 is here referred to as “first logging information” in order to distinguish it from the second logging information which will be that published by the logging thread F.sub.J.

    [0065] This first information comprises a name and a level. According to one embodiment, it comprises only this name and this level.

    [0066] The name may be a character string identifying a logger within the application code.

    [0067] The level is generally an integer number, identifying a degree of criticality of the event. This level may belong to a previously-defined list which may comprise:

    [0068] “Critical”: to indicate a critical error which, in general, leads to the termination of the process.

    [0069] “Error”: to indicate a normal error.

    [0070] “Warning”: to indicate an unimportant error.

    [0071] “Output”: to indicate a normal message, not associated with an error.

    [0072] “Info”: to indicate a message for the attention of the user of the application (and not only for the developer or tester).

    [0073] “Debug”: to indicate a more detailed message, intended for the testers (or “debuggers”) of the application.

    [0074] “Trace”: to indicate a message associated with the most detailed level. Its use is clearly intended for the development stage of the application.

    [0075] This list is of course non-exhaustive. Many other levels may be defined by the developer of the application.
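
As an illustration, the levels listed above could be encoded as integers in the following way. The numeric values, the ordering convention (smaller means more critical) and the helper `is_enabled` are assumptions made for this sketch; the text only states that the level is generally an integer.

```python
from enum import IntEnum


class Level(IntEnum):
    """Severity levels named in the text; the numeric values are assumed."""

    CRITICAL = 0
    ERROR = 1
    WARNING = 2
    OUTPUT = 3
    INFO = 4
    DEBUG = 5
    TRACE = 6


def is_enabled(level, threshold):
    """Return True when `level` is at least as critical as `threshold`."""
    return level <= threshold
```

A developer-defined level would simply be one more member of the enumeration.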

    [0076] It is important that all the outputs intended for the developers or testers conform to this formalism in order to be taken into account by the mechanisms of the invention. It is therefore important for the developer to avoid direct outputs, in particular via the printf( ) function of the C language: they may be replaced by a logging event of the “Output” level, for example.

    [0077] This first information may also comprise:

    [0078] a timestamp of the occurrence of the logging event;

    [0079] an identifier of the process P.sub.1;

    [0080] other information on the execution context: identifier for the thread of the kernel, name of the software application, name of the file, number of the line of the application code, name of the function in the process of execution, etc.
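
The fields enumerated above can be gathered in a small record, sketched here in Python. The class name `FirstLogInfo`, the field names and the default values are hypothetical; `threading.get_native_id` (Python 3.8 or later) is used as one possible source of the kernel thread identifier.

```python
import os
import threading
import time
from dataclasses import dataclass, field


@dataclass(frozen=True)
class FirstLogInfo:
    """First logging information: name and level, plus optional context."""

    name: str    # logger name, e.g. "Module1.logger1"
    level: int   # degree of criticality
    timestamp: float = field(default_factory=time.time)           # time of the event
    pid: int = field(default_factory=os.getpid)                   # process identifier
    thread_id: int = field(default_factory=threading.get_native_id)  # kernel thread id
    application: str = ""
    file: str = ""
    line: int = 0
    function: str = ""
```

Only `name` and `level` are mandatory in this sketch, which matches the embodiment in which the first information is limited to these two items.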

    [0081] According to the invention, the application thread immediately transmits this first logging information to the logging thread F.sub.J via the communications interface I.sub.C. As was previously seen, this interface is asynchronous and does not require any acknowledgement. Nor is any lock installed, in such a manner that, once the transmission has been carried out (and without worrying about the receipt by the logging thread), the application thread F.sub.1, F.sub.2 can immediately continue with the application processing.

    [0082] According to one embodiment, no other processing is applied between the detection of a logging event and the generation of the first logging information.

    [0083] According to one embodiment, only a formatting processing operation is applied.

    [0084] In no case, according to the invention, does the application thread set up input/output mechanisms: these mechanisms are implemented by the logging thread F.sub.J, and hence transferred outside of the application thread F.sub.1, F.sub.2.

    [0085] As a result, for the application thread, the extra cost is reduced to a minimum.

    [0086] According to one embodiment of the invention, the application code is divided up into modules. Each module is associated with a name or identifier, which may be incorporated into the name of the logger.

    [0087] For example, a module “Module1” may contain loggers named “Module1.logger1”, “Module1.logger2”, etc.

    [0088] This mechanism allows the various logging events to be named more clearly: when the logging information is finally exploited, the location in the code where the event took place can be determined directly from the module name included in the logger name.
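
A short sketch of how a consumer of the logging information might recover the module from a logger name. It assumes the dotted naming shown in the example above; the helpers `module_of` and `group_by_module` are illustrative names, not part of the described method.

```python
from collections import defaultdict


def module_of(logger_name):
    """Return the module part of a dotted logger name, or '' if there is none."""
    module, separator, _ = logger_name.rpartition(".")
    return module if separator else ""


def group_by_module(logger_names):
    """Group logger names by the module that contains them."""
    groups = defaultdict(list)
    for logger_name in logger_names:
        groups[module_of(logger_name)].append(logger_name)
    return dict(groups)
```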

    [0089] The logging thread F.sub.J receives in an asynchronous manner the first logging information generated by the application thread or threads F.sub.1, F.sub.2. Its role is then to generate second logging information starting from the first logging information received from the application threads, and, potentially, from complementary information. This complementary information may be information common to all of the application threads of the process.

    [0090] The processing implemented by the logging thread F.sub.J may be limited to the generation of this second logging information. The generation may comprise, on the one hand, the addition of the potential complementary information and, on the other hand, a formatting according to a predefined format that allows the information to be exploited by the processing elements.

    [0091] This formatting may be very simple and consist solely in putting the second information into a format independent of the computer programming language used.
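
One possible reading of this generation step is sketched below. JSON is chosen here purely as an example of a language-independent format; the text does not name a format. The function `make_second_info` and the dictionary keys are assumptions made for the illustration.

```python
import json


def make_second_info(first, common):
    """Merge first logging information with process-wide complementary data."""
    record = dict(common)  # complementary information shared by all threads
    record.update(first)   # event-specific fields take precedence
    return json.dumps(record, sort_keys=True)
```

For instance, `make_second_info({"name": "Module1.logger1", "level": 1}, {"application": "demo"})` yields a single text line that any subscriber can parse, whatever language it is written in.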

    [0092] This second information is subsequently published by the logging thread F.sub.J via the publication interface I.sub.P. It can then be received by one or more processing elements F.sub.S, P.sub.J already registered with the logging thread, as previously described.

    [0093] These processing elements may comprise an output thread F.sub.S belonging to the process P.sub.1. This output thread may be designed to output the second logging information to a display terminal (screen, etc.), to a file stored in a memory, notably a mass memory, etc.

    [0094] These output mechanisms are generally costly in processing time owing to the interaction required with hardware and, in general, to the necessity for an acknowledgement (the thread must ensure that the information really has been stored on the hard disk, etc.).

    [0095] Thanks to the invention, these mechanisms do not impact the application thread, which continues its execution in parallel with the logging thread.
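
The role of an output thread can be sketched as follows. A `queue.Queue` again stands in for the subscription socket and an in-memory `io.StringIO` for the terminal or file; the names `output_thread`, `run_demo` and the `STOP` marker are hypothetical.

```python
import io
import queue
import threading

STOP = None  # hypothetical end-of-stream marker


def output_thread(inbox, stream):
    """Write each received message on its own line; the slow I/O happens here."""
    while True:
        message = inbox.get()
        if message is STOP:
            break
        stream.write(message + "\n")
    stream.flush()


def run_demo(messages):
    """Feed messages to an output thread and return what it wrote."""
    inbox = queue.Queue()
    stream = io.StringIO()
    worker = threading.Thread(target=output_thread, args=(inbox, stream))
    worker.start()
    for message in messages:
        inbox.put_nowait(message)  # the sender never waits for the write
    inbox.put(STOP)
    worker.join()
    return stream.getvalue()
```

Replacing the `StringIO` with an open file would move the cost of disk writes into this thread only.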

    [0096] The processing elements may also comprise a logging process P.sub.J distinct from said process P.sub.1.

    [0097] This process may also implement output mechanisms in the same way as an output thread F.sub.S.

    [0098] It may also implement more complex mechanisms for exploitation of the logging information: filtering, etc.
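
As an example of such exploitation, a subscriber might filter the published messages by level and by module. The sketch assumes that messages are JSON objects with `name` and `level` keys and that smaller level values are more critical; both are assumptions of this illustration, as are the function names.

```python
import json


def keep(message, max_level, prefix=""):
    """Decide whether a published message passes the filter."""
    record = json.loads(message)
    return record["level"] <= max_level and record["name"].startswith(prefix)


def filter_messages(messages, max_level, prefix=""):
    """Return the messages whose level and logger name match the filter."""
    return [message for message in messages if keep(message, max_level, prefix)]
```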

    [0099] According to one embodiment of the invention, when the process P.sub.1 is duplicated, the logging thread F.sub.J and the potential output thread F.sub.S are terminated, then restarted within the initial parent process P.sub.1. Within the child process, these two threads are not restarted (they go into a “finalized” state, as will be described hereinbelow). Indeed, very often, the child process will trigger the execution of a function “exec( )” which will replace and overwrite the content of the child process with a new program: it is therefore unnecessary to trigger an automatic restart of the logging and output threads, and it may even be counter-productive.

    [0100] The duplication, or “fork”, is the mechanism for creation of a new process in a software application operating under an operating system of the “Unix” type, or conforming to the Posix standard.

    [0101] The duplication of a process comprising several threads (“multithreaded process”) poses significant problems. This issue is described in the Posix standard, IEEE 1003.1, in particular in its “Rationale” part.

    [0102] The mechanism implemented by the invention allows these problems to be avoided.

    [0103] Furthermore, a state machine may be set up in order to best manage the duplications “fork( )”.

    [0104] FIG. 2 illustrates such a state machine of the logging thread F.sub.J. It is considered that the thread F.sub.J can be in five main states. These states are conventional:

    [0105] “unset”, corresponding to an “unstarted” state, in other words the state of the thread before the application thread triggers its initialization.

    [0106] “initializing”, corresponding to an initialization state of the thread, during which the synchronization step previously described notably takes place.

    [0107] “initialized”, corresponding to the normal operation of the logging thread.

    [0108] “finalizing”, corresponding to the termination of the logging thread.

    [0109] “finalized”, corresponding to a state where the thread has finished.

    [0110] In certain states, according to this embodiment of the invention, duplications are prohibited. This is the case for the “initializing” and “finalizing” states: the arrow “fork( )” leads to an “illegal” state.

    [0111] In the particular states “unset” and “finalized”, duplication may be permitted and not give rise to particular processing operations. The arrow “fork( )” loops back to the current state.

    [0112] In the “initialized” state, a duplication causes a transition to the “finalizing” state in order to terminate the logging thread.

    [0113] The same is true for the potential output thread F.sub.S.

    [0114] Once the thread has finished, the process can be duplicated.

    [0115] Once the duplication has been carried out, the threads F.sub.J and F.sub.S may be restarted. In the parent process (in other words the initial process P.sub.1), the threads are restarted in the state where they were prior to the duplication, in other words the “initialized” state. In the child process, the threads start in an “unset” state: the application thread of the child process must then launch its initialization in order to make it change state.
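
The transitions triggered by a duplication request can be summarized in a small table, sketched here in Python. The table is transcribed from the description of FIG. 2 given in the text; the identifiers `State`, `ON_FORK`, `on_fork` and `may_duplicate` are hypothetical names, and the sketch deliberately leaves out the restart of the threads after the duplication.

```python
from enum import Enum


class State(Enum):
    UNSET = "unset"
    INITIALIZING = "initializing"
    INITIALIZED = "initialized"
    FINALIZING = "finalizing"
    FINALIZED = "finalized"
    ILLEGAL = "illegal"


# State reached when fork( ) is requested, as read from the description of FIG. 2.
ON_FORK = {
    State.UNSET: State.UNSET,              # permitted, no particular processing
    State.INITIALIZING: State.ILLEGAL,     # duplication prohibited
    State.INITIALIZED: State.FINALIZING,   # terminate the logging thread first
    State.FINALIZING: State.ILLEGAL,       # duplication prohibited
    State.FINALIZED: State.FINALIZED,      # permitted, no particular processing
}


def on_fork(state):
    """Return the state entered when a duplication is requested in `state`."""
    return ON_FORK[state]


def may_duplicate(state):
    """The process can be duplicated once no logging thread is running."""
    return state in (State.UNSET, State.FINALIZED)
```

In a Python program, hooks of this kind could be attached to the actual duplication with `os.register_at_fork`; that choice is outside the scope of the text.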

    [0116] Furthermore, a process P.sub.1 operating under a system of the Posix or Unix type can receive signals. Some of these signals are provided for terminating the process, such as the signals SIGINT, SIGTERM, SIGQUIT, SIGSEGV, SIGBUS.

    [0117] When such a signal is received by the logging thread F.sub.J, the latter may choose to process it or not depending on its nature. For example, the signals SIGINT, SIGTERM and SIGQUIT may be considered as needing to be processed by the application thread and hence be ignored by the logging thread. The logging thread may, on the other hand, handle other types of signals such as the signals SIGSEGV and SIGBUS.

    [0118] Upon receiving such a signal, the logging thread F.sub.J may consider that this constitutes a logging event, and then publish logging information, associated with this signal.

    [0119] Subsequently, it may raise this signal again in order to trigger its “normal” processing. The normal processing of a signal is provided by a processing code, typically referred to as a “handler” and associated with this signal. The normal processing of the signal SIGSEGV or SIGBUS leads to the termination of the process.

    [0120] Thus, by virtue of this mechanism, the process adopts the expected behavior of coming to an end, but it is also ensured that a logging linked to the receipt of this signal takes place. Consequently, if a processing element is subscribed to the logging thread, it will be informed of the cause of the termination of the process P.sub.1.
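
The “publish, then run the normal processing” sequence can be sketched as a handler factory. This is an illustration only: `make_signal_logger`, the `publish` callable and the record layout are assumptions, and in CPython signal handlers always run in the main thread, so the sketch does not reproduce the delivery of the signal to a dedicated logging thread.

```python
import signal


def make_signal_logger(publish, normal_handler):
    """Build a handler that logs the signal, then runs its normal processing."""

    def handler(signum, frame):
        # Publish logging information associated with the signal.
        publish({"name": "signal", "signal": signal.Signals(signum).name})
        # Then trigger the processing code normally associated with the signal.
        normal_handler(signum, frame)

    return handler
```

In a real program, the normal processing could be obtained by restoring `signal.SIG_DFL` with `signal.signal` and raising the signal again with `signal.raise_signal` (Python 3.8 or later), which would end the process.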

    [0121] If a signal is received by the application thread, the latter may determine that it constitutes a logging event. For this purpose, a specific code may be associated with the signal as a “handler”: upon receipt of a given signal, it is this specific code which is triggered by the operating system.

    [0122] This specific code enables the immediate transmission of (first) logging information to the logging thread via the communications interface I.sub.C.

    [0123] This code may also include a wait time allowing the logging thread to generate second logging information starting from this first logging information and to publish it via the publication interface.

    [0124] Only then does the specific code either call the normal processing code in order to terminate the process, or terminate the process itself.
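
The sequence “transmit, wait, then terminate” can be sketched as follows. The names `make_terminating_handler`, `grace` and `terminate` are hypothetical, a `queue.Queue` stands in for the communications interface, and the termination action is injected as a parameter so that the example can be exercised without ending the interpreter.

```python
import queue
import time


def make_terminating_handler(channel, terminate, grace=0.1, sleep=time.sleep):
    """Build a handler that logs the signal, waits, then ends the process."""

    def handler(signum, frame):
        # Immediate transmission of first logging information.
        channel.put_nowait({"name": "signal", "signum": int(signum)})
        # Wait time that lets the logging thread generate and publish.
        sleep(grace)
        # Normal processing: termination of the process.
        terminate(signum)

    return handler
```

In practice `terminate` might restore the default handler and raise the signal again, or simply exit the process; the length of the wait is a trade-off between shutdown latency and the chance that the last message is published.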

    [0125] Thus, the subscribed processing elements can be informed of the cause of the termination of the process.

    [0126] It goes without saying that the present invention is not limited to the examples and to the embodiment described and shown, but is open to numerous variants accessible to those skilled in the art.