ENHANCING PACKAGE FORMATION IN A FOOD PACKAGING SYSTEM BASED ON REINFORCEMENT LEARNING

Abstract

Methods and apparatus, including computer program products, are described for forming individual packages in a food packaging machine (100) comprising a plurality of sub-systems. One or more local variable values (142) are received, which indicate measurements by the food packaging machine (100) of one or more physical parameters for a local sub-system (120). One or more remote variable values (204) are received, which indicate measurements by the food packaging machine (100) of one or more physical parameters for one or more remote sub-systems. One or more control parameter values are determined for the local sub-system (120) of the food packaging machine (100), by processing the remote variable values (204) and the local variable values (142) using a reinforcement learning model (206) and a local control model (210). One or more control parameters of the local sub-system (120) are adjusted in accordance with the determined control parameter values. The formation of individual packages (122) by the food packaging machine (100) is controlled in accordance with the adjusted one or more control parameters.

Claims

1. A method for producing individual packages filled with one or more food products in a food packaging machine, the method comprising: receiving one or more local variable values indicating measurements by the food packaging machine of one or more physical parameters for a local sub-system of a plurality of local sub-systems; receiving one or more remote variable values indicating measurements by the food packaging machine of one or more physical parameters for one or more remote sub-systems; determining one or more control parameter values for the local sub-system, by processing the remote variable values and the local variable values using a reinforcement learning model and a local control model; adjusting the one or more control parameters of the local sub-system in accordance with the determined control parameter values; and controlling a formation of the individual packages by the food packaging machine in accordance with the adjusted one or more control parameters.

2. The method according to claim 1, wherein the reinforcement learning model comprises a deep reinforcement learning model including a neural network.

3. The method according to claim 1, wherein the local sub-system comprises a jaw configured to form the individual packages from a tube of packaging material filled with the food product.

4. The method according to claim 3, wherein adjusting one or more control parameters of the jaw comprises adjusting one or more of: a timing of a sealing jaws' engagement with the tube of packaging material to form the individual packages or a position of the sealing jaws' engagement with the tube of packaging material to form individual packages.

5. The method according to claim 2, wherein the neural network comprises one of: a convolution neural network, a recurrent neural network, a Long Short-Term Memory neural network, or a fully connected neural network.

6. The method according to claim 1, wherein: the one or more local variables comprises measurements relating to one or more of: synchronization marks printed on a packaging web, a jaw system motion profile or a physical position of mechanical forming adjustment tools, and the one or more remote variable values comprises measurements relating to one or more of: packaging web movement and control variables, packaging web tension variables, packaging filling state variables or packaging material.

7. A food packaging machine comprising: a plurality of local sub-systems configured to produce individual packages filled with one or more food products; a memory; and a processor, wherein the memory stores instructions that, when executed by the processor, cause the processor to perform a method comprising: receiving one or more remote variable values indicating measurements by the food packaging machine of one or more physical parameters for one or more remote sub-systems; receiving one or more local variable values indicating measurements by the food packaging machine of one or more physical parameters for a local sub-system of the plurality of local sub-systems; determining one or more control parameter values for the local sub-system, by processing the remote variable values and the local variable values using a reinforcement learning model and a local control model; adjusting the one or more control parameters of the local sub-system in accordance with the determined control parameter values; and controlling a formation of individual packages by the food packaging machine in accordance with the adjusted one or more control parameters.

8. The food packaging machine according to claim 7, wherein the reinforcement learning model comprises a deep reinforcement learning model comprising a neural network.

9. The food packaging machine according to claim 7, wherein the local sub-system comprises a jaw configured to form the individual packages from a tube of packaging material filled with the food product.

10. The food packaging machine according to claim 9, wherein adjusting one or more control parameters of the jaw comprises adjusting one or more of: a timing of a sealing jaws' engagement with the tube of packaging material to form the individual packages, or a position of the sealing jaws' engagement with the tube of packaging material to form individual packages.

11. The food packaging machine according to claim 8, wherein the neural network comprises one of: a convolution neural network, a recurrent neural network, a Long Short-Term Memory neural network, or a fully connected neural network.

12. The food packaging machine according to claim 7, wherein: the one or more local variables comprises measurements relating to one or more of: synchronization marks printed on a packaging web, a jaw system motion profile, or a physical position of mechanical forming adjustment tools, and the one or more remote variable values comprises measurements relating to one or more of: packaging web movement and control variables, packaging web tension variables, packaging filling state variables, and packaging material.

13. A computer program product comprising a non-transitory computer readable storage medium with instructions that, when executed by a processor, cause the processor to carry out the method according to claim 1.

14. The method according to claim 3, wherein adjusting the one or more control parameters of the jaw facilitates production of different form factors of the individual packages by the food packaging machine without requiring mechanical changes to the jaw.

15. The food packaging machine according to claim 9, wherein adjusting the one or more control parameters of the jaw facilitates production of different form factors of the individual packages by the food packaging machine without requiring mechanical changes to the jaw.

Description

DRAWINGS

[0026] Embodiments of the invention will now be described, by way of example, with reference to the accompanying schematic drawings.

[0027] FIG. 1 is a schematic diagram of a portion of a food packaging machine, in accordance with one embodiment.

[0028] FIG. 2 is a schematic diagram of a controller in a food packaging machine, in accordance with one embodiment.

DETAILED DESCRIPTION

[0029] As was mentioned above, a goal of the various embodiments of the invention is to provide improved control techniques for equipment and systems relating to food processing and packaging, and in particular with respect to forming individual packages by the food packaging machine. Having correctly formed packages is important, not only from a design and aesthetics point of view, but also from a functionality point of view, as very small inaccuracies in the formation of the individual packages may impact the functionality of the package. For some packages, very high accuracy (typically at a sub-millimeter level) is required. By applying the general concepts of reinforcement learning and/or deep reinforcement learning techniques to control the jaw system, misalignments (e.g., between the design on the packaging material and the sealing and cutting processes in the jaw system) can be corrected at a very precise level.

[0030] Both reinforcement learning and deep reinforcement learning are examples of machine learning techniques. In general, reinforcement learning (RL) can be characterized as dynamically learning through the use of positive or negative rewards. The performance of a system is evaluated with respect to a desired target. If the target is reached, a positive reward is delivered, and if the target is not reached, a negative reward is delivered. As the positive and negative rewards accumulate over time, the RL model evolves a control policy for the system, with the goal of maximizing the cumulative reward. Deep reinforcement learning (DRL) can be characterized as an enhancement of RL, in which RL is used together with a neural network when evolving the control policy for the system.
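The reward scheme described above can be illustrated with a minimal sketch. It is not the model used in the embodiments; it is a simple epsilon-greedy value learner over a handful of candidate corrections, and every name and number in it (`reward`, `train`, `TOLERANCE_MM`, the simulated 0.5 mm drift) is an assumption made for illustration only.

```python
import random

# Hypothetical values chosen for illustration only.
TOLERANCE_MM = 0.2                        # target: seal offset within +/- 0.2 mm
ACTIONS_MM = [-1.0, -0.5, 0.0, 0.5, 1.0]  # candidate jaw position corrections


def reward(offset_mm: float) -> float:
    """Positive reward if the target is reached, negative reward otherwise."""
    return 1.0 if abs(offset_mm) <= TOLERANCE_MM else -1.0


def train(drift_mm: float, steps: int = 200, epsilon: float = 0.2,
          learning_rate: float = 0.1, seed: int = 0) -> dict:
    """Learn a value estimate per action from the rewards accumulated over time."""
    rng = random.Random(seed)
    q = {a: 0.0 for a in ACTIONS_MM}
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS_MM)       # explore a random correction
        else:
            action = max(ACTIONS_MM, key=q.get)   # exploit the current policy
        # Simulated environment: residual offset after applying the correction.
        r = reward(drift_mm + action)
        # Move the estimate for this action toward the observed reward.
        q[action] += learning_rate * (r - q[action])
    return q


q_values = train(drift_mm=0.5)
best_action = max(q_values, key=q_values.get)
```

In this toy setting only the correction that cancels the simulated drift is ever rewarded positively, so its value estimate ends up highest and the greedy policy selects it.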

[0031] In the context of food processing and packaging, RL (i.e., agent-environment interaction) can be used to evolve a control policy for a food processing and/or packaging machine. Using DRL (i.e., RL together with a neural network) can be particularly useful when evolving control policies for sub-systems, such as the filling sub-system, that must consider a large number of variables whose internal relations and effects on the sub-system may not be known. In addition, it should be noted that RL and DRL techniques can also be used to improve existing, local control techniques, in essence by filling in the gaps of conventional control techniques with this data-driven approach. Thus, the DRL algorithm can then directly (or indirectly through other control layers, e.g., by tuning the gains of a conventional PID controller to allow the PID controller to operate more efficiently compared to the conventional control techniques) control the actuators (e.g., servomotors, pneumatic actuators or other actuators) that control how individual packages are formed in a food packaging system.
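The indirect route mentioned above, in which a learned policy tunes the gains of a conventional PID controller, might look roughly like the following sketch. The `PID` class is a textbook discrete controller; the `apply_gain_action` helper and its action encoding (three values in [-1, 1], one per gain) are hypothetical and are not taken from the described embodiments.

```python
from dataclasses import dataclass


@dataclass
class PID:
    """Textbook discrete PID controller (the conventional control layer)."""
    kp: float
    ki: float
    kd: float
    integral: float = 0.0
    prev_error: float = 0.0

    def step(self, error: float, dt: float) -> float:
        self.integral += error * dt
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def apply_gain_action(pid: PID, action: tuple, scale: float = 0.1) -> None:
    """Scale each gain by up to +/- `scale`, as requested by a learned policy.

    `action` holds three values, clipped to [-1, 1], one per gain
    (a hypothetical encoding chosen for this sketch).
    """
    d_kp, d_ki, d_kd = (max(-1.0, min(1.0, a)) for a in action)
    pid.kp *= 1.0 + scale * d_kp
    pid.ki *= 1.0 + scale * d_ki
    pid.kd *= 1.0 + scale * d_kd


pid = PID(kp=2.0, ki=0.5, kd=0.0)
output_before = pid.step(error=1.0, dt=0.1)      # conventional PID output
apply_gain_action(pid, action=(1.0, 0.0, -1.0))  # policy asks for a higher kp
```

Keeping the PID loop intact and letting the learned policy only nudge its gains is one way to bound the influence of the data-driven layer on the actuators.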

[0032] In order to further illustrate these principles, various embodiments of the invention will now be described more fully by way of example of controlling a jaw sub-system in a food packaging machine to perform alignment correction throughout the food packaging machine, and with reference to the accompanying drawings in which some, but not all, embodiments of the invention are shown. The invention may be embodied in many different forms and should not be construed as limited to the embodiments set forth herein.

[0033] As was mentioned above, the jaw system is an important sub-system of a food packaging machine, and its operation needs to be precisely controlled in order to comply with the design of the packaging material, and to properly form the individual packages. A misaligned design may result in folding and integrity issues of the packaging material.

[0034] FIG. 1 shows a schematic view of a food packaging machine 100, where a web 102 of packaging material, which preferably includes at least one sealable surface 104 thereon, is fed forwardly 106 through a web feeder over guide rolls 108, 110 and formed into a tube 112. The longitudinally overlaid side edges 114, 116 of the web 102 are sealed to close the tube along the longitudinal edge thereof. The side edges may be overlaid either with the undersides against each other or overlapped with the undersides facing in the same direction. A strip of tape (not shown) may be provided along one or both of the longitudinal edges 114, 116 to assist in tube formation.

[0035] A food product is fed into the formed tube from a food product filling device via a food product pipe 118 placed at least partly inside the formed tube. Food product in this context refers to anything that people or animals ingest, eat and/or drink or that plants absorb, including but not limited to liquid, semi-liquid, viscous, dry, powder and solid food products, drink products, and water. For the avoidance of doubt, food products also include ingredients for preparing food. Some examples of food products include milk, water and juice. The filled tube is then forwarded to a jaw sub-system 120, where the transverse seals of the package 122 are formed at, preferably, equally spaced apart locations along the length of the tube, although non-equal lengths may also be formed if so desired. Sealing may occur by heat or other known means. After the tube is sealed, it is severed transversely of its length and within the bounds of the transversely sealed areas to form individual packages filled with the product. Commonly, where equal sized packages are produced, each of the packages is filled with a consistent volume of product. In food packaging machines, in particular, consistency of volume is provided by making the individual packages of equal volume when sealed. Thus, the individual transverse seals are preferably formed at equally spaced apart locations along the length of the web.

[0036] In a preferred embodiment of the food packaging machine depicted in FIG. 1, the jaw sub-system 120 includes first and second sealing jaw subassemblies 124 and 126, respectively, which are disposed on opposite sides of the tube. These subassemblies 124, 126 include at least one carriage 128, 130 and preferably a plurality of carriages. The carriages 128, 130 are preferably mounted on respective tracks 132, 134 along closed loop paths. Alternatively, the carriages may be mounted on open loop paths. Preferably, instead of varying the velocity of the web 102, the positioning of the carriages 128, 130 and their associated sealing jaws 136, 138 is controlled by a controller 140, or other controlling mechanism, to ensure that each pair of sealing jaws 136, 138 registers with the appropriate portion of the tube at a preselected location. This serves to ensure proper package 122 size.

[0037] The controller 140 receives input from a registration sensor 142, such as an optical sensor that is capable of optically detecting synchronization marks 144, which are provided at spaced intervals on the packaging web. The synchronization marks 144 are constructed in such a way that there is little chance that the registration sensor 142 will misread them. For example, they may have high contrast with the background and/or have an easily recognizable shape. One example of a synchronization mark 144 is a UPC (Universal Product Code) bar code. In some embodiments, the registration sensor 142 may be an infrared or fluorescent ink sensor or a proximity probe, or any other type of position sensing device, such as a sensor capable of detecting magnetic ink.
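As a rough illustration of how readings of synchronization marks at spaced intervals could be turned into a registration error, the sketch below compares detected mark positions with an ideal, evenly spaced grid. It assumes the sensor readings have already been converted to web positions in millimeters; the function name, the units and the sample values are hypothetical.

```python
def registration_errors(mark_positions_mm: list, nominal_pitch_mm: float) -> list:
    """Deviation of each detected mark from an evenly spaced ideal position.

    The first detected mark is taken as the origin of the ideal grid.
    """
    origin = mark_positions_mm[0]
    return [
        position - (origin + index * nominal_pitch_mm)
        for index, position in enumerate(mark_positions_mm)
    ]


# Example readings (made up): the second and third marks arrive late.
detected = [0.0, 200.25, 400.5, 600.0]
errors = registration_errors(detected, nominal_pitch_mm=200.0)
```

A positive error here means the mark was detected further along the web than expected, which a controller could treat as the local input to be corrected.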

[0038] In addition, the controller also receives input from remote sub-systems of the food packaging machine 100, which may experience events that may have an effect on the operation of the local jaw sub-system. Some examples of such events may include splice events; acceleration, deceleration or stops of the packaging web; package format change; product change, etc.

[0039] These events can be represented by a set of remote variables, whose values represent various states at different sub-systems of the food packaging machine. This is schematically illustrated in FIG. 2, which shows how the input from the registration sensor 142 of the local jaw sub-system is input to the controller 140 along with the input values 204 from remote sub-systems of the food packaging machine.

[0040] In one embodiment, some examples of variables representing physical parameters from the local jaw sub-system include:

[0041] Synchronization marks printed on the packaging material.

[0042] A jaw system motion profile (i.e., stored motion data describing the movement of the jaw system over a period of time, for example, by logging the movement of a servo motor controlling the jaw system in a PLC (Programmable Logic Controller)).

[0043] A physical position of the mechanical forming adjustment tools (this position may change, e.g., based on the particular kind of packages being produced by the food packaging machine).

[0044] In one embodiment, some examples of variables representing physical parameters from the remote sub-systems include:

[0045] Web movement and control variables representing, e.g., detection of splices or size of packages, etc.

[0046] Web tension variables representing, e.g., positions and/or pressures on various rollers in the food packaging machine as the web travels through the food packaging machine.

[0047] Filling state, e.g., filling flow and product level.

[0048] Packaging material characteristics variables, e.g., packaging material stiffness, presence of closures, package volume, etc.
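The local and remote variables listed above could, for example, be gathered into a single numeric observation before being passed to a learning model. The following is only a sketch of one possible data layout; all field names and units are hypothetical and the selection of fields is a small subset of the examples given in the text.

```python
from dataclasses import astuple, dataclass


@dataclass
class LocalVariables:
    """Measurements from the local jaw sub-system (hypothetical fields)."""
    sync_mark_error_mm: float
    jaw_position_mm: float
    forming_tool_position_mm: float


@dataclass
class RemoteVariables:
    """Measurements from remote sub-systems (hypothetical fields)."""
    splice_detected: bool
    web_tension_n: float
    filling_flow_l_per_min: float
    product_level_mm: float
    material_stiffness: float


def to_observation(local: LocalVariables, remote: RemoteVariables) -> list:
    """Flatten both groups into one list of floats (booleans become 0.0 or 1.0)."""
    return [float(value) for value in astuple(local) + astuple(remote)]


observation = to_observation(
    LocalVariables(0.25, 12.0, 3.5),
    RemoteVariables(True, 80.0, 45.0, 150.0, 0.7),
)
```

Keeping the two groups as separate types makes it easy to feed only the local values to a conventional controller while the full observation goes to the learning model.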

[0049] As can be realized, these are merely a few examples of possible influencing factors from remote sub-systems, and should not be considered as an exhaustive list. However, they do represent influencing factors which cannot be considered by conventional control systems used today. The local and remote variables all affect the tube position in their own way, and it is difficult or impossible for conventional control systems to determine how various possible combinations of these remote and local variables should influence the operation of the local jaw sub-system.

[0050] In accordance with the various embodiments described herein, the controller 140 uses a local control model 210 to process the local sub-system input variables 142, in combination with a reinforcement learning model 206 to process the input values 204 from the remote sub-systems, to determine how the measured variables collectively influence the operation of the local jaw sub-system. The local control model 210 can be an algorithm executed by a PID controller. The reinforcement learning model can be a deep reinforcement learning model, which includes one or more neural networks, as described above. In some embodiments, the local sub-system input variables 142 can also be processed by the reinforcement learning model 206. In some embodiments, the reinforcement learning model 206 can be used to determine how different combinations of local and remote variables should influence the web tension sub-system and to use this insight to improve the local control model 210. Based on the result of this processing and determination, the controller 140 generates a set of output control signals 208 for the local jaw system 120, which control the timing of the sealing jaws of the two subassemblies and their movement into engagement with the moving tube 112 for the formation of the transverse seal.
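One way to picture the interplay between the local control model and the reinforcement learning model is sketched below. The text does not specify how the two outputs are combined; the additive combination, the clamping limit, the proportional stand-in for the local control model and the fixed linear stand-in for a trained policy are all assumptions made purely for illustration.

```python
from typing import Callable, Sequence


def control_step(
    local_error_mm: float,
    remote_observation: Sequence[float],
    local_control: Callable[[float], float],
    rl_policy: Callable[[Sequence[float]], float],
    max_correction_mm: float = 2.0,
) -> float:
    """Combine a local control output with a learned adjustment (assumed additive)."""
    base = local_control(local_error_mm)        # conventional, local-only response
    adjustment = rl_policy(remote_observation)  # contribution from remote variables
    total = base + adjustment
    # Clamp the command to a safe range before it reaches the actuators.
    return max(-max_correction_mm, min(max_correction_mm, total))


def proportional_control(error_mm: float) -> float:
    """Stand-in for the local control model (a bare proportional term)."""
    return -0.8 * error_mm


def placeholder_policy(observation: Sequence[float]) -> float:
    """Stand-in for a trained policy: a fixed linear map with made-up weights."""
    weights = [-0.25, 0.5]
    return sum(w * x for w, x in zip(weights, observation))


correction = control_step(0.5, [1.0, 0.0], proportional_control, placeholder_policy)
```

In a real system the placeholder policy would be replaced by the trained model, and the resulting value would be translated into the output control signals for the jaw actuators.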

[0051] Examples of neural networks that can be used in embodiments that use a deep reinforcement learning model include, for example, a Convolution Neural Network (CNN) that has been trained using reinforcement learning and deep reinforcement learning, a Recurrent Neural Network (RNN), such as a Long Short-Term Memory (LSTM) neural network, which is often used in the field of deep learning, or a Fully Connected Neural Network. The LSTM network may be particularly useful since, unlike standard feedforward neural networks, the LSTM has feedback connections. This enables the LSTM to process not only single data points, but also entire sequences of data, which can be particularly useful in the context of a food packaging machine designed to generate a large number of packages.
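To make the role of the feedback connections concrete, the sketch below implements one step of a single-unit LSTM cell using the standard gate equations and runs it over short sequences. It is a toy with hand-picked weights, not a trained network; the `weights` layout (one input weight, one recurrent weight and one bias per gate) is an assumption of this sketch.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x: float, h_prev: float, c_prev: float, weights: dict) -> tuple:
    """One step of a single-unit LSTM cell.

    `weights` maps a gate name to (input weight, recurrent weight, bias).
    """
    def pre_activation(gate: str) -> float:
        w_x, w_h, bias = weights[gate]
        return w_x * x + w_h * h_prev + bias   # h_prev is the feedback connection

    i = sigmoid(pre_activation("input"))
    f = sigmoid(pre_activation("forget"))
    o = sigmoid(pre_activation("output"))
    g = math.tanh(pre_activation("candidate"))
    c = f * c_prev + i * g                     # cell state carries memory forward
    h = o * math.tanh(c)
    return h, c


def run_sequence(xs: list, weights: dict) -> float:
    """Feed a whole sequence through the cell and return the final hidden state."""
    h = c = 0.0
    for x in xs:
        h, c = lstm_step(x, h, c, weights)
    return h


gates = ("input", "forget", "output", "candidate")
weights = {gate: (1.0, 1.0, 0.0) for gate in gates}
h_early = run_sequence([1.0, 0.0], weights)  # disturbance seen first
h_late = run_sequence([0.0, 1.0], weights)   # same values, different order
```

The two runs contain the same values in a different order and end in different hidden states, which is the property that lets such a network react to the history of a measurement rather than to a single data point.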

[0052] Thus, if the velocity of the moving tube 112 varies, for example, due to changes in tension in the tube, or due to imprecise functioning of one or more mechanical elements of the filling machine, the data driven approach allows the controller 140 to detect such variance in velocity of the tube and adjust the position of the registered sealing jaws to ensure that the sealing jaws engage the tube at a proper time, thereby avoiding misalignments with respect to the design of the individual packages. As a result, the food packaging machine can operate more efficiently and fewer packages need to be discarded, compared to existing solutions which may not be able to take such variables into account, resulting both in financial and environmental advantages.
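The timing adjustment described above can be illustrated with simple kinematics: if the tube moves at a different velocity, the time until the intended seal location reaches the jaws changes. The sketch assumes the velocity is constant over the remaining distance; the function name, units and sample values are hypothetical.

```python
def engagement_delay_s(distance_to_seal_mm: float, tube_velocity_mm_per_s: float) -> float:
    """Time until the intended seal location reaches the sealing jaws.

    Assumes a constant tube velocity over the remaining distance.
    """
    if tube_velocity_mm_per_s <= 0:
        raise ValueError("tube velocity must be positive")
    return distance_to_seal_mm / tube_velocity_mm_per_s


nominal_delay = engagement_delay_s(100.0, 500.0)  # tube at nominal speed
slowed_delay = engagement_delay_s(100.0, 400.0)   # tube slowed, e.g. by tension
timing_shift = slowed_delay - nominal_delay       # extra time the jaws must wait
```

With these made-up numbers the jaws would need to engage about 50 ms later than at nominal speed to hit the same location on the tube.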

[0053] Furthermore, in some embodiments, the output from the reinforcement learning model can be used to tune the gains of a conventional PID controller, such that the PID controller can operate more efficiently compared to the conventional control techniques where the PID controller relies on local variable values only. Thus, embodiments of the invention can be beneficial even in situations where the only means for controlling the jaw sub-system is a PID controller. Further, as a result of the flexibility of the present system to position the sealing jaws as a function of the variables collected from the different sub-systems of the food packaging machine, among other things, the present system may be employed to produce any of a variety of package sizes without requiring any mechanical changes to the system.

[0054] It should be noted that even though a sub-system has been referred to above as a jaw system, a filling system, a sterilizing system, a package folding system, etc., the term can also refer to a portion of the above-mentioned sub-systems, or to individual elements thereof.

[0055] It should be noted that in some embodiments, the control models for the controller 140 can reside within the controller 140 itself, as illustrated in FIG. 1. In other embodiments, they may reside in and operate from external hardware/software (e.g., an external computer or similar processing equipment) to further accelerate the required computations and the controller 140 in the food packaging machine may be a simpler controller that merely executes the functionality, as determined by the external hardware/software.

[0056] The systems and methods disclosed herein can be implemented as software, firmware, hardware or a combination thereof. In a hardware implementation, the division of tasks between functional units or components referred to in the above description does not necessarily correspond to the division into physical units; on the contrary, one physical component can perform multiple functionalities, and one task may be carried out by several physical components in collaboration.

[0057] Certain components or all components may be implemented as software executed by a digital signal processor or microprocessor, or be implemented as hardware or as an application-specific integrated circuit. Such software may be distributed on computer readable media, which may comprise computer storage media (or non-transitory media) and communication media (or transitory media). As is well known to a person skilled in the art, the term computer storage media includes both volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, optical or magnetic storage devices, or any other medium which can be used to store the desired information, and which can be accessed by a computer.

[0058] The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the blocks may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts or carry out combinations of special purpose hardware and computer instructions.

[0059] From the description above it follows that, although various embodiments of the invention have been described and shown, the invention is not restricted thereto, but may also be embodied in other ways within the scope of the subject-matter defined in the following claims.

CPC classification

B65H2511/112: PERFORMING OPERATIONS; TRANSPORTING
B65H2557/24: PERFORMING OPERATIONS; TRANSPORTING
B65B59/003: PERFORMING OPERATIONS; TRANSPORTING
B65H2801/69: PERFORMING OPERATIONS; TRANSPORTING
B65B3/00: PERFORMING OPERATIONS; TRANSPORTING
B65B2210/04: PERFORMING OPERATIONS; TRANSPORTING
B65B37/00: PERFORMING OPERATIONS; TRANSPORTING
B65B9/20: PERFORMING OPERATIONS; TRANSPORTING
B65B51/146: PERFORMING OPERATIONS; TRANSPORTING
B65B57/00: PERFORMING OPERATIONS; TRANSPORTING
B65H23/1888: PERFORMING OPERATIONS; TRANSPORTING
G05B13/027: PHYSICS
B65B41/16: PERFORMING OPERATIONS; TRANSPORTING

International classification

B65B57/00: PERFORMING OPERATIONS; TRANSPORTING
G05B13/02: PHYSICS
B65B9/20: PERFORMING OPERATIONS; TRANSPORTING
B65B59/00: PERFORMING OPERATIONS; TRANSPORTING
B65B37/00: PERFORMING OPERATIONS; TRANSPORTING
B65B3/00: PERFORMING OPERATIONS; TRANSPORTING
B65B41/16: PERFORMING OPERATIONS; TRANSPORTING
B65B51/14: PERFORMING OPERATIONS; TRANSPORTING
B65H23/188: PERFORMING OPERATIONS; TRANSPORTING
