TECHNICAL INJECTION SYSTEM FOR INJECTING A RETRAINED MACHINE LEARNING MODEL

20230237324 · 2023-07-27

    Abstract

    A technical injection system for injecting a retrained machine learning model is provided, including: a. a first computing unit including a first storage medium, wherein the first computing unit is configured for providing the retrained machine learning model and preprocessing the retrained machine learning model, wherein the retrained machine learning model is stored in the first storage medium; b. a second computing unit comprising a second storage medium and an injection interface, wherein the injection interface is configured for injecting at least one relevant part of the retrained machine learning model, after processing, from the first storage medium of the first computing unit into the second storage medium of the second computing unit at runtime.

    Claims

    1. A technical injection system for injecting a retrained machine learning model, comprising: a. a first computing unit comprising a first storage medium, wherein the first computing unit is configured for: providing the retrained machine learning model; and preprocessing the retrained machine learning model; wherein the retrained machine learning model is stored in the first storage medium; b. a second computing unit comprising a second storage medium and an injection interface, wherein the injection interface is configured for: injecting at least one relevant part of the retrained machine learning model after processing from the first storage medium of the first computing unit into the second storage medium of the second computing unit by the injection interface at runtime; wherein a current machine learning model is stored in the second storage medium; and the injection comprises an identification of the at least one relevant part of the current machine learning model and a replacement of the identified at least one relevant part of the current machine learning model by the corresponding at least one relevant part of the retrained machine learning model.

    2. The technical injection system according to claim 1, wherein the first computing unit is hosted on or configured as an embedded device, an embedded control computer, or an edge device.

    3. The technical injection system according to claim 1, wherein the first storage medium is a volatile or non-volatile storage medium.

    4. The technical injection system according to claim 1, wherein the preprocessing comprises compressing, decompressing, quantizing, optimizing, deserializing, initializing and/or testing.

    5. The technical injection system according to claim 1, wherein the first computing unit further comprises an input interface configured for receiving a notification when the retrained machine learning model is available from a machine learning platform, another computing unit, or another technical system.

    6. The technical injection system according to claim 5, wherein the input interface is further configured for receiving or downloading the retrained machine learning model from a machine learning platform, another computing unit, or another technical system, after notification.

    7. The technical injection system according to claim 1, wherein the first computing unit is further configured for performing at least one test based on the retrained machine learning model before the injection is performed by the first computing unit through the injection interface of the second computing unit.

    8. The technical injection system according to claim 7, wherein the injection is performed in the case that the test is successful.

    9. The technical injection system according to claim 1, wherein the second computing unit is configured as an accelerator component, a field-programmable gate array (FPGA), or an application-specific integrated circuit (ASIC).

    10. The technical injection system according to claim 1, wherein the second storage medium is a volatile or non-volatile storage medium.

    11. The technical injection system according to claim 1, wherein the relevant part of the retrained machine learning model is at least one internal data structure, at least one layer, at least one bias, at least one weight and/or at least one parameter.

    12. The technical injection system according to claim 1, wherein the injection is synchronized with a control cycle clock.

    13. The technical injection system according to claim 1, wherein the injection interface is configured as an Ethernet, a parallel, or a serial communication interface, connected to a communication controller of the second computing unit.

    14. The technical injection system according to claim 1, wherein the current machine learning model and the retrained machine learning model are semantically equivalent and configured as neural networks, more particularly as feedforward neural networks, convolutional neural networks, or recurrent neural networks.

    Description

    BRIEF DESCRIPTION

    [0053] Some of the embodiments will be described in detail, with reference to the following figures, wherein like designations denote like members, wherein:

    [0054] FIG. 1 illustrates the technical injection system according to an embodiment of the invention.

    [0055] FIG. 2 illustrates a computer-implemented method for injection according to an embodiment of the invention.

    DETAILED DESCRIPTION

    [0056] FIG. 1 illustrates the technical injection system 1 for injecting the retrained machine learning model 30. The technical injection system 1 comprises two computing units 10, 20, each comprising a storage medium 12, 22 for storing the retrained and current machine learning models. The current machine learning model is the one currently in use by the underlying system, such as the high-performance controller (HPC) of a Simplex controller, which can be used, e.g., for motion planning by a robot unit or an autonomous vehicle. Additionally, other data can be stored.

    [0057] According to an embodiment, the first computing unit 10 is hosted on an embedded device, such as an embedded control computer or an edge device. The first computing unit 10 can be securely connected to a machine learning platform, which can provide a publish-subscribe mechanism. The first computing unit 10 can be notified via the publish-subscribe mechanism when the retrained machine learning model 30 is available.

    [0058] After reception of the notification, the first computing unit 10 downloads the retrained machine learning model 30. The first computing unit 10 processes the retrained machine learning model 30. For example, the first computing unit 10 deserializes the retrained machine learning model 30 in the first storage medium 12.
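The notification-and-download flow of paragraphs [0057] and [0058] can be sketched as follows. This is a minimal Python illustration with hypothetical names; the description specifies neither a concrete publish-subscribe API nor a serialization format, so `pickle` and the callback signature stand in for whatever the machine learning platform actually provides.

```python
import pickle  # stands in for the (unspecified) serialization format

def on_model_available(payload: bytes, first_storage: dict) -> None:
    """Hypothetical callback invoked via the publish-subscribe mechanism
    when a retrained model is published; deserializes the model into the
    first storage medium (modeled here as a plain dict)."""
    retrained_model = pickle.loads(payload)
    first_storage["retrained_model"] = retrained_model

# Simulated notification: the platform publishes a retrained model.
first_storage = {}
published = {"weights": [0.12, -0.5, 0.33], "biases": [0.01, -0.02]}
on_model_available(pickle.dumps(published), first_storage)
```

After the callback runs, the retrained model resides deserialized in the first storage medium, ready for preprocessing.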

    [0059] The current machine learning model is stored in a second storage medium 22 of the second computing unit 20. The second computing unit 20 can be based on a different architecture than the first computing unit 10. For example, the logic of the second computing unit 20 can be implemented in hardware using an accelerator device (e.g., FPGA or ASIC), whereas the first computing unit 10 is hosted on an edge device.

    [0060] The second computing unit 20 comprises the injection interface 40, coupled to the second storage medium 22, which stores the current machine learning model.

    [0061] As part of preprocessing, the retrained machine learning model can be scored at least once on the first computing unit using a first scorer and a test data set to test the model's quality properties (e.g., correctness of results, correct memory usage, etc.).

    [0062] According to an embodiment, one or more further tests can be performed based on the retrained machine learning model 30 as part of the preprocessing. For example, test and/or validation data, e.g., provided with the retrained machine learning model 30, can be used. Additionally or alternatively, memory integrity and/or stress tests can be performed based on the retrained machine learning model 30. The tests aim to prevent failures and malfunctions when the retrained model is used by the second computing unit.

    [0063] According to an embodiment, if these tests are successful, a flag, e.g., a quality flag (QF), is set to 1; otherwise it is set to 0.
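The scoring-and-flag step of paragraphs [0061] through [0063] could look like the following sketch. The scorer, the accuracy metric, and the 0.95 threshold are illustrative assumptions; the description leaves the concrete quality properties and pass criteria open.

```python
def score_model(predict, test_set):
    """Score the model on a test data set; returns the fraction of
    correct predictions (one possible quality property)."""
    correct = sum(1 for x, expected in test_set if predict(x) == expected)
    return correct / len(test_set)

def quality_flag(predict, test_set, threshold=0.95):
    """Return QF = 1 if the quality test succeeds, otherwise 0."""
    return 1 if score_model(predict, test_set) >= threshold else 0

# Toy stand-in model: classify inputs by sign.
predict = lambda x: x > 0
test_set = [(1, True), (-1, False), (2, True), (-3, False)]
qf = quality_flag(predict, test_set)  # all four correct -> QF = 1
```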

    [0064] The model injector 41 of the first computing unit 10 checks the flag and, if it is set to 1, extracts the relevant part of the retrained machine learning model 30. The model injector 41 can be implemented as a software function, which is triggered by the system's control cycle clock. For example, the arrays holding the weights and biases of a neural network can be the relevant part. The model injector 41 then sends the relevant part of the retrained machine learning model 30 to the second computing unit 20 via the injection interface 40. Then, the injection interface 40 injects the relevant part of the retrained machine learning model 30 from the first storage medium 12 into the second storage medium 22. Hence, the relevant part of the current machine learning model 50 is overwritten by the relevant part of the retrained machine learning model 30 without requiring any reinitialization of the current machine learning model on the second computing unit 20.
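The injector behavior of paragraph [0064] can be sketched in a few lines. The dict-based storage, the function name, and the return value are illustrative; in the described embodiment this logic would be a software function triggered by the control cycle clock, writing into accelerator memory rather than a Python dict.

```python
def inject_retrained_core(qf, retrained_model, second_storage):
    """Illustrative model injector: if the quality flag QF is 1, extract
    the relevant part of the retrained model (the weight and bias arrays)
    and overwrite the corresponding part of the current model in the
    second storage medium, without reinitializing the current model."""
    if qf != 1:
        return False  # tests failed; leave the current model untouched
    relevant = {
        "weights": retrained_model["weights"],
        "biases": retrained_model["biases"],
    }
    second_storage["model_core"].update(relevant)  # in-place overwrite
    return True

# Simulated second storage medium holding the current model's core.
second_storage = {"model_core": {"weights": [0.0, 0.0], "biases": [0.0]}}
retrained = {"weights": [0.7, -0.1], "biases": [0.05]}
injected = inject_retrained_core(1, retrained, second_storage)
```

Because only the weight and bias arrays are replaced, the surrounding model structure on the second computing unit stays untouched, which is what makes reinitialization unnecessary.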

    [0065] According to an embodiment, the injection is realized by overwriting the content of a specified, fixed-size segment of the random-access memory (“RAM”) 22, storing the relevant part of the current machine learning model on an FPGA 20.

    [0066] As a result, the current machine learning model is adapted by virtue of the replacement of its core (i.e., relevant part) at system runtime in an efficient way. This is because the retrained machine learning model 30 is preprocessed on a separate computing unit 10 and the model core is directly injected in the format expected by the second computing unit 20, such as binary format.

    [0067] The injection can, for example, transfer the memory in bulk as a bitstream or by value. In the latter case, the second computing unit 20 can provide a communication interface, such as an Ethernet, a serial, or a parallel port, whose controller can write the received payload data directly into the RAM segment holding the core of the current machine learning model, e.g., the different arrays holding the biases and weights of a neural network.
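The write-by-value path of paragraphs [0065] and [0067] can be modeled with a byte buffer standing in for the accelerator's RAM. Segment offset and size are invented for illustration; the key property from the description is that the payload exactly fills the fixed-size segment holding the model core.

```python
import struct

# Simulated second storage medium (RAM) of the accelerator, with a
# fixed-size segment reserved for the current model's core.
RAM = bytearray(64)
SEGMENT_OFFSET = 8    # illustrative location of the model core
SEGMENT_SIZE = 16     # e.g., four 32-bit floats

def on_payload_received(payload: bytes) -> None:
    """Illustrative communication handler: write received payload data
    directly into the RAM segment holding the current model's core."""
    if len(payload) != SEGMENT_SIZE:
        raise ValueError("payload must exactly fill the fixed-size segment")
    RAM[SEGMENT_OFFSET:SEGMENT_OFFSET + SEGMENT_SIZE] = payload

# Retrained weights, serialized as little-endian 32-bit floats.
payload = struct.pack("<4f", 0.1, -0.2, 0.3, 0.4)
on_payload_received(payload)
```

The size check mirrors the requirement in paragraph [0068] that the relevant parts of the retrained model match the size of the relevant parts of the current model.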

    [0068] It is the task of the external first computing unit 10 to compress the retrained machine learning model 30 using an efficient model compression technique so that the size of the relevant parts of the retrained model matches the size of the relevant parts of the current model.

    [0069] For example, such a model compression technique can quantize the values of the weights and biases of a neural network model in order to facilitate an efficient representation of the individual weights and biases as a byte or integer instead of a double word. The weights and/or biases arrays can then be packed into a compact bitstream, which can be directly written into the RAM of an accelerator device.
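The quantize-and-pack step of paragraph [0069] can be sketched as follows. The symmetric int8 scheme with a fixed scale of 127 is one simple choice among many quantization techniques; the description only requires that each weight or bias fit in a byte or integer instead of a double word.

```python
import struct

def quantize_int8(values, scale=127.0):
    """Quantize floats in [-1, 1] to signed bytes, so each value
    occupies one byte instead of a double word."""
    return [max(-127, min(127, round(v * scale))) for v in values]

def pack_model_core(weights, biases):
    """Pack the quantized weights and biases into a compact bitstream
    that can be written directly into the RAM of an accelerator device."""
    quantized = quantize_int8(weights) + quantize_int8(biases)
    return struct.pack(f"<{len(quantized)}b", *quantized)

stream = pack_model_core([0.5, -0.25, 1.0], [0.0])
# 4 values -> 4 bytes, versus 16 or 32 bytes unquantized
```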

    [0070] FIG. 2 illustrates a computer-implemented method for injection according to an embodiment of the invention.

    [0071] Although the present invention has been disclosed in the form of embodiments and variations thereon, it will be understood that numerous additional modifications and variations could be made thereto without departing from the scope of the invention.

    [0072] For the sake of clarity, it is to be understood that the use of “a” or “an” throughout this application does not exclude a plurality, and “comprising” does not exclude other steps or elements.