PERFORMING WEB TENSIONING ADJUSTMENTS IN A FOOD PACKAGING SYSTEM BASED ON REINFORCEMENT LEARNING

Abstract

Methods and apparatus, including computer program products, are described for controlling web tensioning in a food packaging machine comprising a plurality of sub-systems. One or more local variable values are received, which indicate measurements by the food packaging machine of one or more physical parameters for a web tensioning sub-system. One or more remote variable values are received, which indicate measurements by the food packaging machine of one or more physical parameters for one or more remote sub-systems. One or more control parameter values are determined for the web tensioning sub-system, by processing the remote and the local variable values using a reinforcement learning model and a local control model. One or more control parameters of the web tensioning sub-system are adjusted in accordance with the determined control parameter values.

Claims

1. A method for controlling web tensioning in a food packaging machine, the method comprising: receiving one or more local variable values indicating measurements by the food packaging machine of one or more physical parameters for a web tensioning sub-system of a plurality of local sub-systems; receiving one or more remote variable values indicating measurements by the food packaging machine of one or more physical parameters for one or more remote sub-systems; determining one or more control parameter values for the web tensioning sub-system, by processing the remote variable values and the local variable values using a reinforcement learning model and a local control model; and adjusting the one or more control parameters of the web tensioning sub-system in accordance with the determined control parameter values.

2. The method according to claim 1, wherein the reinforcement learning model comprises a deep reinforcement learning model including a neural network.

3. The method according to claim 2, wherein the web tensioning sub-system includes two stationary guide rolls and a movable guide roll.

4. The method according to claim 3, wherein the movable guide roll is located between the two stationary guide rolls along a path traversed by the web through the packaging machine, and is movable so as to increase or decrease the tension of the web in response to instructions received from the control parameter values.

5. The method according to claim 2, wherein the neural network comprises one of: a convolution neural network, a recurrent neural network, a Long Short-Term Memory neural network, or a fully connected neural network.

6. The method according to claim 1, wherein: the one or more local variable values include measurements relating to one or more of: a web tension set point or a current web tensioning system position, and the one or more remote variable values include measurements relating to one or more of: web movement control variables, a jaw motion profile, packaging material characteristics, or a filling status.

7. A food packaging machine comprising: a plurality of local sub-systems configured to control web tensioning; a memory; and a processor, wherein the memory stores instructions that, when executed by the processor, cause the processor to perform a method comprising: receiving one or more local variable values indicating measurements by the food packaging machine of one or more physical parameters for a web tensioning sub-system of the plurality of local sub-systems; receiving one or more remote variable values indicating measurements by the food packaging machine of one or more physical parameters for one or more remote sub-systems; determining one or more control parameter values for the web tensioning sub-system, by processing the remote variable values and the local variable values using a reinforcement learning model and a local control model; and adjusting one or more control parameters of the web tensioning sub-system in accordance with the determined control parameter values.

8. The food packaging machine according to claim 7, wherein the reinforcement learning model comprises a deep reinforcement learning model including a neural network.

9. The food packaging machine according to claim 7, wherein the web tensioning sub-system includes two stationary guide rolls and a movable guide roll.

10. The food packaging machine according to claim 9, wherein the movable guide roll is located between the two stationary guide rolls along a path traversed by the web through the packaging machine, and is movable so as to increase or decrease the tension of the web in response to instructions from the received control parameter values.

11. The food packaging machine according to claim 8, wherein the neural network comprises one of: a convolution neural network, a recurrent neural network, a Long Short-Term Memory neural network, or a fully connected neural network.

12. The food packaging machine according to claim 7, wherein: the one or more local variable values include measurements relating to one or more of: a web tension set point or a current web tensioning system position, and the one or more remote variable values include measurements relating to one or more of: web movement control variables, a jaw motion profile, packaging material characteristics, or a filling status.

13. A computer program product comprising a non-transitory computer readable storage medium with instructions that, when executed by a processor, cause the processor to carry out the method according to claim 1.

Description

DRAWINGS

[0026] Embodiments of the invention will now be described, by way of example, with reference to the accompanying schematic drawings.

[0027] FIG. 1 is a schematic diagram of a portion of a food packaging machine, in accordance with one embodiment.

[0028] FIG. 2 is a schematic diagram of a controller in a food packaging machine, in accordance with one embodiment.

DETAILED DESCRIPTION

[0029] As was mentioned above, a goal with the various embodiments of the invention is to provide improved control techniques for equipment and systems relating to food processing and packaging, and in particular with respect to web tensioning. Having a proper web tension is important, not only from a packaging machine operation point of view, but also from a functionality point of view, as an improper web tension might result in integrity issues of the packages formed by the packaging machine. By applying the general concepts of reinforcement learning and/or deep reinforcement learning techniques to control a web tensioning system of the food packaging machine, a larger range of factors can be taken into account compared to what is possible in existing systems and the web tensioning can be adjusted very precisely, such that the operation of the food packaging machine can be enhanced, and the package integrity, formation and appearance can be ensured.

[0030] Both reinforcement learning and deep reinforcement learning are examples of machine learning techniques. In general, reinforcement learning (RL) can be characterized as dynamically learning through the use of positive or negative rewards. A system performance is evaluated with respect to a desired target. If the target is reached, a positive reward is delivered, and if the target is not reached, a negative reward is delivered. As the positive and negative rewards accumulate over time, the RL model evolves a control policy for the system, with the goal of maximizing the outcome. Deep reinforcement learning (DRL) can be characterized as an enhancement of RL, in which RL is used together with a neural network when evolving the control policy for the system.
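By way of illustration only, the reward mechanism described above can be sketched with tabular Q-learning on a toy problem. The discretized tension levels, the target level, the reward values of +1 and -1, the learning rate, the discount factor, and all function names are assumptions introduced for this example and are not taken from the described embodiments. Every state-action pair is tried once per sweep, which stands in for exploration in a real training environment.

```python
from typing import Dict, List, Tuple

ACTIONS: Tuple[int, ...] = (-1, 0, 1)  # decrease, hold, or increase the tension level
N_LEVELS = 5                           # hypothetical discretized tension levels 0..4
TARGET = 2                             # hypothetical desired tension level


def reward(level: int) -> float:
    """Positive reward when the target is reached, negative reward otherwise."""
    return 1.0 if level == TARGET else -1.0


def environment_step(level: int, action: int) -> Tuple[int, float]:
    """Toy environment: the action shifts the level, clipped to the valid range."""
    next_level = min(max(level + action, 0), N_LEVELS - 1)
    return next_level, reward(next_level)


def train(sweeps: int = 200, alpha: float = 0.5, gamma: float = 0.9) -> Dict[Tuple[int, int], float]:
    """Tabular Q-learning that visits every state-action pair once per sweep."""
    q = {(s, a): 0.0 for s in range(N_LEVELS) for a in ACTIONS}
    for _ in range(sweeps):
        for s in range(N_LEVELS):
            for a in ACTIONS:
                s_next, r = environment_step(s, a)
                best_next = max(q[(s_next, b)] for b in ACTIONS)
                # Move the estimate toward reward plus discounted future value.
                q[(s, a)] += alpha * (r + gamma * best_next - q[(s, a)])
    return q


def greedy_policy(q: Dict[Tuple[int, int], float]) -> List[int]:
    """Control policy: in each state, pick the action with the highest value."""
    return [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_LEVELS)]


if __name__ == "__main__":
    print(greedy_policy(train()))  # [1, 1, 0, -1, -1]
```

With these settings, the learned policy raises the level when it is below the target, lowers it when above, and holds it at the target.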

[0031] In the context of food processing and packaging, RL (i.e., agent-environment interaction) can be used to evolve a control policy for a food processing and/or packaging machine. Using DRL (i.e., RL together with a neural network) can be particularly useful when evolving control policies for sub-systems, such as the filling sub-system, that must consider a large number of variables whose internal relations and effects on the sub-system may not be known. In addition, it should be noted that RL and DRL techniques can also be used to improve existing, local control techniques, in essence by filling in the gaps of conventional control techniques with this data-driven approach. Thus, the DRL algorithm can then directly (or indirectly through other control layers, e.g., by tuning the gains of a conventional PID controller to allow the PID controller to operate more efficiently compared to the conventional control techniques) control the actuators (e.g., servomotors, pneumatic actuators or other actuators) that adjust the web tensioning in a food packaging system.

[0032] In order to further illustrate these principles, various embodiments of the invention will now be described more fully by way of example of controlling a web tensioning sub-system in a food packaging machine to ensure proper web tensioning throughout the food packaging machine, and with reference to the accompanying drawings in which some, but not all, embodiments of the invention are shown. The invention may be embodied in many different forms and should not be construed as limited to the embodiments set forth herein.

[0033] As was mentioned above, a web tensioning sub-system is an important part of a food packaging machine, and its operation needs to be carefully controlled in order to ensure that the web moves smoothly and at a desired speed throughout the food packaging machine, despite the intermittent operation of the jaw system described above, so as to be able to ensure proper package integrity, formation, and appearance.

[0034] FIG. 1 shows a schematic view of a sub-section of a food packaging machine 100, where a web 102 of packaging material, which preferably includes at least one sealable surface thereon, is fed forwardly through a web feeder over guide rolls 106, 108, 110 in an S-shaped pattern and formed into a tube 112. The longitudinally overlaid side edges of the web 102 are sealed to close the tube along the longitudinal edge. The side edges may be overlaid either with the undersides against each other or overlapped with the undersides facing in the same direction. A strip of tape may also be provided along one or both of the longitudinal edges to assist in tube formation.

[0035] After forming the tube from the web, a food product is fed into the formed tube at a filling station (not shown). Food product in this context refers to anything that people or animals ingest, eat and/or drink or that plants absorb, including but not limited to liquid, semi-liquid, viscous, dry, powder and solid food products, drink products, and water. For the avoidance of doubt, food products also include ingredients for preparing food. Some examples of food products include milk, water and juice. The filled tube is then forwarded to a jaw system, which creates transverse seals of the filled tube and severs the sealed tube transversely of its length and within the bounds of the transversely sealed areas to form individual packages filled with the product. Thus, the jaw system creates a pulling force on the web 102 in a forward direction. As was mentioned above, a corresponding, opposite force is created on the web 102 as a result of the inertia from the large rolls holding the web 102 at the beginning of the packaging machine.

[0036] The right side of FIG. 1 shows a more detailed view of a web tensioning sub-system 200 of the food packaging machine 100. In the illustrated embodiment of the web tensioning sub-system 200, in order to regulate the tension on the web 102, the middle guide roll 108 is movable in a vertical direction, thus allowing the tension on the web 102 to be reduced during those times when the jaw system actively pulls the web 102 in a forward direction, and to be increased during those times when the jaw system does not actively pull on the web 102. It should be noted that FIG. 1 only shows one possible embodiment of a web tensioning sub-system 200, and that other embodiments may have a larger number of guide rolls in which several guide rolls may move. Likewise, the reference to a vertical motion of the middle guide roll 108 is merely done for purposes of explanation. In other embodiments, the guide rolls may be rotated, say, 90 degrees compared to what is illustrated in FIG. 1, such that the movement of the middle guide roll 108 instead is a movement in a left-right direction. Thus, many variations can be envisioned by those having ordinary skill in the art. Due to its movement, the middle guide roll 108 is also often referred to as a pendulum roll.

[0037] In the embodiment shown in FIG. 1, the movement of the middle guide roll 108 is performed in response to signals from a controller 114. The controller 114 receives input from a sensor 116 in the web tensioning sub-system 200 that measures the current tension of the web.

[0038] In addition, the controller 114 also receives input from one or more remote sub-systems of the food packaging machine 100, which may experience events that also influence the operation of the web tensioning sub-system 200. Some examples of such events may include splice events (i.e., when a tail end of packaging web on a used roll of packaging web, at the beginning of the food packaging machine, is joined with a front end of packaging web of a fresh roll of packaging web to create a continuous packaging web, thus creating a section of web having a thickness of two layers instead of a single layer); acceleration, deceleration or stops of the web 102 due to jaw movement or for other reasons; packaging material change; product change; package filling status; web length, etc.

[0039] These events can be represented by a set of variables, whose values indicate various states at different sub-systems of the food packaging machine. This is schematically illustrated in FIG. 2, which shows how the input from the local sensor 116 of the web tensioning sub-system 200 is input to the controller 114 along with the input values 204 from other sub-systems of the food packaging machine.

[0040] In one embodiment, some examples of variables representing physical parameters from the local web tensioning sub-system 200 include:

[0041] A web tension set point (i.e., a desired web tension for the particular type of web being used in the food packaging machine).

[0042] A current web tensioning system position (e.g., as represented by the physical displacement of the movable guide roll from its neutral position).

[0043] In one embodiment, some examples of variables from other sub-systems include:

[0044] Web movement and control variables (e.g., detection of splices, size of packages, speed, etc.).

[0045] Jaw system motion profile (e.g., how often and with how much force the jaws pull on the web, etc.).

[0046] Packaging material characteristics variables (e.g., packaging material stiffness, presence of closures, package volume, web length, etc.).

[0047] Filling status (e.g., filling flow, product level, etc.).

[0048] As can be realized, these are merely a few examples of possible influencing factors from other sub-systems, and should not be considered as an exhaustive list. However, they do represent influencing factors which cannot be considered by conventional web tensioning control systems, as it is difficult or impossible to determine how various possible combinations of these factors should influence the operation of the web tensioning sub-system 200.
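By way of illustration only, the local and remote variables listed above could be grouped and flattened into a single numeric input for a learning model, as sketched below. The class names, field names, and the choice of one representative field per category are assumptions made for this example; the description does not specify a data layout or units.

```python
from dataclasses import dataclass, astuple
from typing import List


@dataclass
class LocalVariables:
    """Hypothetical container for values from the web tensioning sub-system."""
    tension_set_point: float   # desired web tension
    roll_displacement: float   # movable guide roll offset from its neutral position


@dataclass
class RemoteVariables:
    """Hypothetical container for values from other sub-systems."""
    splice_detected: bool      # web movement and control variable
    web_speed: float           # web movement and control variable
    jaw_pull_rate: float       # jaw system motion profile
    material_stiffness: float  # packaging material characteristic
    filling_flow: float        # filling status


def to_observation(local: LocalVariables, remote: RemoteVariables) -> List[float]:
    """Flatten both groups into one numeric vector, local values first."""
    return [float(value) for value in astuple(local) + astuple(remote)]


if __name__ == "__main__":
    observation = to_observation(
        LocalVariables(tension_set_point=100.0, roll_displacement=2.5),
        RemoteVariables(True, 1.2, 3.0, 0.8, 0.5),
    )
    print(observation)  # [100.0, 2.5, 1.0, 1.2, 3.0, 0.8, 0.5]
```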

[0049] In accordance with the various embodiments described herein, the controller 114 uses a local control model 210 to process the local sub-system input variables 116, in combination with a reinforcement learning model 206 to process the input values from the other sub-systems, to determine how all the measured variables collectively influence the operation of the web tensioning sub-system 200. The local control model 210 can be an algorithm executed by a PID controller. The reinforcement learning model 206 can be a deep reinforcement learning model, which includes one or more neural networks, as described above. In some embodiments, the local sub-system input variables 116 can be processed by the reinforcement learning model 206. In some embodiments, the reinforcement learning model 206 can be used to figure out how different combinations of local and remote variables should influence the web tensioning sub-system and use this insight to improve the local control model 210. Based on the result of this processing and determination, the controller 114 generates a set of output control signals 208 for the local web tensioning sub-system 200, which control the position of the middle guide roll 108 to accomplish the proper web tension.
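By way of illustration only, one possible way of combining a local control model with a learned model is sketched below. The additive combination, the textbook discrete PID form, the function names, and the fixed linear placeholder policy are all assumptions made for this example; the description does not prescribe how the two model outputs are merged, and the placeholder is not a trained reinforcement learning model.

```python
from typing import Callable, Sequence


class PIDController:
    """Textbook discrete PID controller acting on the local tension error."""

    def __init__(self, kp: float, ki: float, kd: float) -> None:
        self.kp, self.ki, self.kd = kp, ki, kd
        self._integral = 0.0
        self._previous_error = 0.0

    def step(self, set_point: float, measured: float, dt: float) -> float:
        error = set_point - measured
        self._integral += error * dt
        derivative = (error - self._previous_error) / dt
        self._previous_error = error
        return self.kp * error + self.ki * self._integral + self.kd * derivative


def placeholder_policy(remote_values: Sequence[float]) -> float:
    """Stand-in for a trained model: a fixed linear map with made-up weights."""
    weights = [0.5, -0.2]
    return sum(w * v for w, v in zip(weights, remote_values))


def control_step(
    pid: PIDController,
    policy: Callable[[Sequence[float]], float],
    set_point: float,
    measured_tension: float,
    remote_values: Sequence[float],
    dt: float,
) -> float:
    """Add a correction derived from remote values to the local PID output (assumed scheme)."""
    local_command = pid.step(set_point, measured_tension, dt)
    learned_correction = policy(remote_values)
    return local_command + learned_correction


if __name__ == "__main__":
    pid = PIDController(kp=2.0, ki=0.0, kd=0.0)
    print(control_step(pid, placeholder_policy, 100.0, 95.0, [1.0, 2.0], dt=0.1))
```

In this sketch the returned value plays the role of an output control signal for the movable guide roll; how such a value maps to an actuator position is left open.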

[0050] Examples of neural networks that can be used in embodiments that use a deep reinforcement learning model 206 include, for example, a Convolution Neural Network (CNN) that has been trained using reinforcement learning and deep reinforcement learning, a Recurrent Neural Network (RNN), such as a Long Short-Term Memory (LSTM) neural network, which is often used in the field of deep learning, or a fully connected neural network. The LSTM network may be particularly useful since, unlike standard feedforward neural networks, the LSTM has feedback connections. This enables the LSTM to process not only single data points, but also entire sequences of data, which can be particularly useful in the context of a food packaging machine designed to generate a large number of packages.
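By way of illustration only, the feedback connections mentioned above can be seen in a single-unit LSTM cell written with standard gate equations. The weights below are hypothetical and untrained, chosen only so that the example runs; the function names are likewise assumptions for this sketch.

```python
import math
from typing import Dict, List, Tuple

GateWeights = Dict[str, Tuple[float, float, float]]  # (input weight, recurrent weight, bias)


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x: float, h: float, c: float, w: GateWeights) -> Tuple[float, float]:
    """One step of a single-unit LSTM cell.

    The recurrent weight of each gate multiplies the previous hidden state h,
    which is the feedback connection absent from a feedforward network.
    """
    def pre_activation(gate: str) -> float:
        wx, wh, b = w[gate]
        return wx * x + wh * h + b

    forget_gate = sigmoid(pre_activation("forget"))
    input_gate = sigmoid(pre_activation("input"))
    output_gate = sigmoid(pre_activation("output"))
    candidate = math.tanh(pre_activation("candidate"))
    c_next = forget_gate * c + input_gate * candidate  # cell state carries memory
    h_next = output_gate * math.tanh(c_next)
    return h_next, c_next


def run_sequence(xs: List[float], w: GateWeights) -> float:
    """Process a whole sequence and return the final hidden state."""
    h, c = 0.0, 0.0
    for x in xs:
        h, c = lstm_step(x, h, c, w)
    return h


# Hypothetical, untrained weights used only for demonstration.
WEIGHTS: GateWeights = {
    "forget": (0.5, 0.1, 0.0),
    "input": (0.6, 0.2, 0.0),
    "output": (0.7, 0.3, 0.0),
    "candidate": (0.8, 0.4, 0.0),
}

if __name__ == "__main__":
    # Same final input, different history: the outputs differ.
    print(run_sequence([1.0, 0.5], WEIGHTS))
    print(run_sequence([0.0, 0.5], WEIGHTS))
```

Because the hidden and cell states persist between steps, two sequences that end with the same value but differ earlier produce different outputs, which is the property that makes such a network suited to sequences of measurements.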

[0051] Thus, if the tension of the web 102 varies, for example, due to the changing nature of the jaw movement, or due to imprecise functioning of one or more mechanical elements of the filling machine, the data driven approach allows the controller 114 to detect such variance in web tension and adjust the position of the middle guide roll 108 to ensure proper web tension at all times, thereby avoiding potential damage to the packaging material and ensuring sealing and forming quality. Moreover, conventional control techniques often require a manual calibration for each different working setup. In contrast, this embodiment of the invention allows for a training environment to be provided, which enables the controller 114 to learn the optimal control policy given the goal for the web tensioning sub-system 200. This may save a considerable number of man-hours in setting up the packaging machine, and thereby also reduce the time to market of new packages and products. Furthermore, in some embodiments, the output from the reinforcement learning model can be used to tune the gains of a conventional PID controller, such that the PID controller can operate more efficiently compared to the conventional control techniques where the PID controller relies on local variable values only. Thus, embodiments of the invention can be beneficial even in situations where the only means for controlling the web tensioning sub-system 200 is a PID controller.
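By way of illustration only, tuning the gains of a PID controller from the output of a learned model could be sketched as follows. The gain values, the splice-based rule in the placeholder policy, and all names are assumptions made for this example; a trained reinforcement learning model would take the place of the placeholder.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class Gains:
    kp: float
    ki: float
    kd: float


class TunablePID:
    """Discrete PID controller whose gains can be replaced at run time."""

    def __init__(self, gains: Gains) -> None:
        self.gains = gains
        self._integral = 0.0
        self._previous_error = 0.0

    def step(self, set_point: float, measured: float, dt: float) -> float:
        error = set_point - measured
        self._integral += error * dt
        derivative = (error - self._previous_error) / dt
        self._previous_error = error
        g = self.gains
        return g.kp * error + g.ki * self._integral + g.kd * derivative


def placeholder_gain_policy(remote_values: Sequence[float]) -> Gains:
    """Stand-in for a trained model: softer proportional gain during a splice (made-up rule)."""
    splice_detected = remote_values[0] > 0.5
    if splice_detected:
        return Gains(kp=1.0, ki=0.1, kd=0.0)
    return Gains(kp=2.0, ki=0.1, kd=0.0)


def tuned_step(
    pid: TunablePID,
    gain_policy: Callable[[Sequence[float]], Gains],
    set_point: float,
    measured: float,
    remote_values: Sequence[float],
    dt: float,
) -> float:
    """Update the PID gains from remote values, then run the ordinary PID step."""
    pid.gains = gain_policy(remote_values)
    return pid.step(set_point, measured, dt)


if __name__ == "__main__":
    pid = TunablePID(Gains(0.0, 0.0, 0.0))
    print(tuned_step(pid, placeholder_gain_policy, 100.0, 90.0, [0.0], dt=1.0))
    print(tuned_step(pid, placeholder_gain_policy, 100.0, 90.0, [1.0], dt=1.0))
```

In this arrangement the PID controller remains the only element that drives the actuator, while the learned model influences it indirectly through the gains.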

[0052] It should be noted that even though a sub-system has been referred to above as a web tensioning system, a filling system, a sterilizing system, a package folding system, etc., it can also refer to a portion of the above-mentioned sub-system, or individual elements.

[0053] It should be noted that in some embodiments, the control models for the controller 114 can reside within the controller 114 itself, as illustrated in FIG. 1. In other embodiments, they may reside in and operate from external hardware/software (e.g., an external computer or similar processing equipment) to further accelerate the required computations, and the controller 114 in the food packaging machine may be a simpler controller that merely executes the functionality, as determined by the external hardware/software.

[0054] The systems and methods disclosed herein can be implemented as software, firmware, hardware or a combination thereof. In a hardware implementation, the division of tasks between functional units or components referred to in the above description does not necessarily correspond to the division into physical units; on the contrary, one physical component can perform multiple functionalities, and one task may be carried out by several physical components in collaboration.

[0055] Certain components or all components may be implemented as software executed by a digital signal processor or microprocessor, or be implemented as hardware or as an application-specific integrated circuit. Such software may be distributed on computer readable media, which may comprise computer storage media (or non-transitory media) and communication media (or transitory media). As is well known to a person skilled in the art, the term computer storage media includes both volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, optical or magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by a computer.

[0056] The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the blocks may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts or carry out combinations of special purpose hardware and computer instructions.

[0057] From the description above it follows that, although various embodiments of the invention have been described and shown, the invention is not restricted thereto, but may also be embodied in other ways within the scope of the subject-matter defined in the following claims.

CPC classification

B65H2511/112 (PERFORMING OPERATIONS; TRANSPORTING)
B65B3/26 (PERFORMING OPERATIONS; TRANSPORTING)
B65B65/006 (PERFORMING OPERATIONS; TRANSPORTING)
B65H2801/69 (PERFORMING OPERATIONS; TRANSPORTING)
G06N3/08 (PHYSICS)
B65B57/145 (PERFORMING OPERATIONS; TRANSPORTING)
B65H2301/5161 (PERFORMING OPERATIONS; TRANSPORTING)
B65B57/02 (PERFORMING OPERATIONS; TRANSPORTING)
B65H2301/3112 (PERFORMING OPERATIONS; TRANSPORTING)
G06N7/01 (PHYSICS)
B65H2557/24 (PERFORMING OPERATIONS; TRANSPORTING)
B65H2557/264 (PERFORMING OPERATIONS; TRANSPORTING)
G06N3/006 (PHYSICS)
B65B9/213 (PERFORMING OPERATIONS; TRANSPORTING)
B65B37/00 (PERFORMING OPERATIONS; TRANSPORTING)
B65H23/048 (PERFORMING OPERATIONS; TRANSPORTING)
B65B9/207 (PERFORMING OPERATIONS; TRANSPORTING)
B65B57/00 (PERFORMING OPERATIONS; TRANSPORTING)
B65H23/1888 (PERFORMING OPERATIONS; TRANSPORTING)
G05B13/027 (PHYSICS)
B65B41/16 (PERFORMING OPERATIONS; TRANSPORTING)

International classification

B65B41/16 (PERFORMING OPERATIONS; TRANSPORTING)
B65B9/213 (PERFORMING OPERATIONS; TRANSPORTING)
B65B57/02 (PERFORMING OPERATIONS; TRANSPORTING)
B65H23/04 (PERFORMING OPERATIONS; TRANSPORTING)
