METHOD AND SYSTEM FOR EVALUATING OPTIMIZED CONCENTRATION TRAJECTORIES FOR DRUG ADMINISTRATION
20220406434 · 2022-12-22
Assignee
- Deutsches Krebsforschungszentrum Stiftung Des Oeffentlichen Rechts (Heidelberg, DE)
- BERLINER INSTITUT FUER GESUNDHEITSFORSCHUNG ZENTRUM DIGITALE GESUNDHEIT (Berlin, DE)
Inventors
- Stefan KALLENBERGER (Heidelberg, DE)
- Tim TREIS (Heidelberg, DE)
- Chiara DI PONZIO (Heidelberg, DE)
- Roland EILS (Berlin, DE)
Cpc classification
G01N33/48728
PHYSICS
G16H20/40
PHYSICS
G16H40/40
PHYSICS
G16H20/10
PHYSICS
International classification
G16H20/40
PHYSICS
G01N33/50
PHYSICS
Abstract
The present invention is in the field of experimental data acquisition. In particular, the present invention relates to a live-cell imaging method and a corresponding system for acquiring experimental data of one or more biological probes. More specifically, the present invention relates to methods and systems for evaluating an optimized concentration trajectory for administration of a drug, in particular a chemotherapeutic drug.
Claims
1. A method for evaluating an optimized concentration trajectory for administration of a drug, in particular a chemotherapeutic drug, the method comprising: executing, by a processing module (30), a machine learning scheme configured to learn, based on an initial model of a cellular signal transduction pathway that is affected or targeted by the drug, to determine an optimized drug concentration trajectory such that at least one predefined cellular parameter of a biological probe (20) is improved when the drug is applied to the biological probe (20) according to the optimized drug concentration trajectory; experimentally applying, by a probe manipulation device (16), the drug to the biological probe (20) according to the optimized drug concentration trajectory determined by the machine learning scheme; obtaining, by an imaging device (12), optical measurements of the biological probe (20); determining, by the processing module (30), at least one measurement value of the at least one predefined cellular parameter of the biological probe (20) from the optical measurements; and fitting, by the processing module (30) based on the at least one measurement value of the at least one predefined cellular parameter of the biological probe (20), the initial model to obtain a first refined model, and repeating execution of the machine learning scheme based on the first refined model.
2. The method according to claim 1, wherein an improvement of the at least one predefined cellular parameter of the biological probe (20) comprises maximizing the number of dead cells contained in the biological probe (20) and/or minimizing the number of dividing cells contained in the biological probe (20).
3. The method according to claim 1, wherein the step of experimentally applying the drug to the biological probe (20) according to the optimized drug concentration trajectory determined by the machine learning scheme is performed when a first convergence criterion is fulfilled, wherein the first convergence criterion may be defined by a predefined number of learning cycles of the machine learning scheme.
4. The method according to claim 1, wherein the step of repeating execution of the machine learning scheme based on a refined model is performed until a second convergence criterion is fulfilled, wherein the second convergence criterion may be defined by a predefined number of experimental cycles, or wherein the second convergence criterion may be defined by the determination that a measurement value of the at least one predefined cellular parameter corresponds to a target value of the at least one predefined cellular parameter within a predefined tolerance.
5. The method according to claim 1, wherein the machine learning scheme includes a reinforcement learning framework including an agent configured to apply a time series of actions A(t) on an environment resulting in observations O(t) and rewards R(t), wherein the environment is defined by the model of a cellular signal transduction pathway that is affected or targeted by the drug.
6. The method according to claim 5, wherein the agent of the reinforcement learning framework is configured to select drug concentration trajectories according to a policy associated with a neural network.
7. The method according to claim 5, wherein the policy is iteratively updated in order to maximize the rewards R(t), wherein the rewards R(t) are defined based on an improvement of the at least one predefined cellular parameter of the biological probe (20) effected by applying a drug concentration trajectory selected by the agent to the biological probe (20).
8. The method according to claim 1, wherein determining the at least one measurement value of the at least one predefined cellular parameter comprises classifying, counting and/or identifying cells in the corresponding biological probe (20) with respect to the cellular parameter, wherein the cells are preferably classified as living or dead; and/or wherein the cells are classified, counted and/or identified by a neural network algorithm trained for classifying, counting and/or identifying cells of the one or more biological probes (20) based on one or more optical measurements with respect to the cellular parameter; and/or wherein the cellular parameter comprises one or more of cell number, living cell number, living cell fraction, dead cell number, dead cell fraction, cell proliferation rate, cell death rate, cell division rate, cell differentiation rate, cell exocytosis rate, cell endocytosis rate, cell size, cell dimensions, cell adherence area, beating frequency, cell depolarization rate, and drug concentration.
9. A system for evaluating an optimized concentration trajectory for administration of a drug, in particular for execution of a method according to claim 1, the system comprising: a processing module (30) that is configured for executing a machine learning scheme configured to learn to determine, based on an initial model of a cellular signal transduction pathway that is affected or targeted by the drug, an optimized drug concentration trajectory such that at least one predefined cellular parameter of a biological probe (20) is improved when the drug is applied to the biological probe (20) according to the optimized drug concentration trajectory; a probe manipulation device (16) configured for experimentally applying the drug to the biological probe (20) according to the optimized drug concentration trajectory determined by the machine learning scheme; an imaging device (12) configured for obtaining optical measurements of the biological probe (20); wherein the processing module (30) is further configured to determine at least one measurement value of the at least one predefined cellular parameter of the biological probe (20) from the optical measurements; and to fit, based on the at least one measurement value of the at least one predefined cellular parameter of the biological probe (20), the initial model to obtain a first refined model, and to repeat execution of the machine learning scheme based on the first refined model.
10. The system according to claim 9, further comprising a control unit (18) configured for controlling the operation of the imaging device (12) and the probe manipulation device (16) based on control instructions received over a functional connection (40) from the processing module (30).
11. The system according to claim 9, wherein the imaging device (12) is preferably a live-cell imaging device (12) and comprises an optical device (126), in particular one or more of a microscope, a digital camera, a CCD, one or more mirrors, one or more deflectors and/or one or more focusing lenses; and/or wherein the imaging device (12) or the optical device (126) is movable for scanning the one or more probes, and wherein the control unit (18) is further configured for controlling a movement of the imaging device (12) or the optical device (126); and/or wherein the imaging device (12) comprises a housing (121) enclosing at least some of the remaining components of the imaging device (12); wherein the housing (121) preferably comprises a cover plate (122), a bottom plate (132) and at least a lateral wall (124) extending between the cover plate (122) and the bottom plate (132), wherein the cover plate (122) preferably is at least partly transparent and is configured for supporting the one or more biological probes (20) and/or one or more probe carriers containing the one or more biological probes (20), and/or wherein the housing (121) preferably comprises a metallic bottom plate (132).
12. The system according to claim 9, further comprising a reflective element (50) for directing illumination light to and/or through the biological probes (20) and an illumination light source for generating the illumination light for illuminating the one or more biological probes (20) for obtaining the at least one optical measurement by the imaging device (12), wherein the probe manipulation device (16) preferably comprises a perfusion device for perfusing the one or more biological probes (20) with an experimental fluid; and/or wherein the probe manipulation device (16) preferably comprises a light source, preferably an LED, for emitting experimental light on the one or more biological probes (20).
13. A processing module (30) connectable to a functional connection (40) of a live-cell imaging system (10), wherein the processing module (30) is configured for: executing a machine learning scheme configured to learn to determine, based on an initial model of a cellular signal transduction pathway that is affected or targeted by the drug, an optimized drug concentration trajectory such that at least one predefined cellular parameter of the biological probe (20) is improved when the drug is applied to the biological probe (20) according to the optimized drug concentration trajectory; providing the optimized drug concentration trajectory determined by the machine learning scheme to the live-cell imaging system (10) via the functional connection (40); receiving via the functional connection (40) optical measurements of the biological probe (20) obtained by an imaging device (12) of the live-cell imaging system (10) after experimental application of the drug to the biological probe (20) by a probe manipulation device (16) of the live-cell imaging system (10) according to the optimized drug concentration trajectory determined by the machine learning scheme; determining at least one measurement value of the at least one predefined cellular parameter of the biological probe (20) from the optical measurements; and fitting, based on the at least one measurement value of the at least one predefined cellular parameter of the biological probe (20), the initial model to obtain a first refined model, and repeating execution of the machine learning scheme based on the first refined model.
14. (canceled)
15. A non-transitory computer readable medium comprising processor executable instructions that, when executed by one or more processors, cause the one or more processors to operate as a processing module (30) according to claim 13.
16. A processing module (30) connectable to a functional connection (40) of a live-cell imaging system (10), wherein the processing module (30) is configured for: executing a machine learning scheme configured to learn to determine, based on an initial model of a cellular signal transduction pathway that is affected or targeted by the drug, an optimized drug concentration trajectory such that at least one predefined cellular parameter of the biological probe (20) is improved when the drug is applied to the biological probe (20) according to the optimized drug concentration trajectory; providing the optimized drug concentration trajectory determined by the machine learning scheme to the live-cell imaging system (10) via the functional connection (40); receiving via the functional connection (40) optical measurements of the biological probe (20) obtained by an imaging device (12) of the live-cell imaging system (10) after experimental application of the drug to the biological probe (20) by a probe manipulation device (16) of the live-cell imaging system (10) according to the optimized drug concentration trajectory determined by the machine learning scheme; determining at least one measurement value of the at least one predefined cellular parameter of the biological probe (20) from the optical measurements; and fitting, based on the at least one measurement value of the at least one predefined cellular parameter of the biological probe (20), the initial model to obtain a first refined model, and repeating execution of the machine learning scheme based on the first refined model, wherein the processing module is further configured for controlling the live-cell imaging system (10) so as to implement the method defined in claim 1.
17. A non-transitory computer readable medium comprising processor executable instructions that, when executed by one or more processors, cause the one or more processors to operate as a processing module (30) according to claim 16.
Description
BRIEF DESCRIPTION OF THE FIGURES
[0110]
[0111]
[0112]
[0113]
[0114]
[0115]
[0116]
[0117]
[0118]
[0119]
[0120]
[0121]
[0122]
[0123]
DESCRIPTION OF PREFERRED EMBODIMENTS OF THE INVENTION
[0124]
[0125] The live-cell imaging system 10 further comprises a probe manipulation device 16 configured for applying at least one experimental stimulus to the one or more biological probes 20. The arrow between the probe manipulation device 16 and the biological probes 20 in
[0126] Additionally or alternatively, the probe manipulation device 16 can comprise a light source, such as an LED (not shown in the figure), for emitting experimental light on the one or more biological probes 20, thereby influencing an environmental condition thereof by means of the emitted light.
[0127] The live-cell imaging system 10 further comprises a control unit 18 that is operatively connected to the imaging device 12 and to the probe manipulation device 16. The control unit 18, which in the exemplary embodiment shown is a software-based control unit 18 supported on an internal processing unit of the live-cell imaging system 10, is configured for controlling the operation of the imaging device 12 and the probe manipulation device 16 based on control instructions received over a functional connection 40, over which the live-cell imaging system 10, and in particular the control unit 18, is connectable to external devices. In the embodiment shown, the functional connection 40 is a wired connection, for instance an Ethernet connection for inputting and outputting data over the internet. However, in other embodiments, the functional connection 40 may be another type of wired connection or a wireless connection.
[0128] By means of the functional connection 40, the control unit 18 is connected to an external processing module 30. The control unit 18 is configured for receiving control instructions from the external processing module 30, and for controlling the operation of the imaging device 12 and the probe manipulation device 16 based on such control instructions. For example, the probe manipulation device 16 sets the concentration of the bioactive agent in the experimental fluid as controlled by the control unit 18, and the control unit 18 controls a position of the imaging device 12 with respect to the biological probes 20 for obtaining optical measurements according to a measuring routine received through the functional connection 40.
[0129] Further, the control unit 18 is configured for outputting to the processing module 30, over the functional connection 40, the optical measurements of the biological probes 20 obtained by the imaging device 12. The processing module 30 is configured for analysing the optical measurements obtained by the imaging device 12 received over the functional connection 40 and for determining at least one measurement value of a cellular parameter of the one or more biological probes 20. The processing module 30 comprises a neural network algorithm trained for classifying cells of the biological probes 20 based on corresponding optical measurements with respect to a chosen cellular parameter. For instance, the algorithm may be configured and trained for determining a number of living cells and/or a number of dead cells in a biological probe 20 from an optical measurement thereof, for example by means of image segmentation.
[0130] In the exemplary embodiment illustrated in
[0131] The processing module 30 may be a hardware-based module or a software-based module, for example in the form of software code loaded on a processor. The processing module 30 is operatively connected, via the functional connection 40, to the control unit 18.
[0132] The processing module 30 is further configured for determining, if the convergence criterion is not satisfied, at least one experimental stimulus for setting an environmental condition of the biological probes 20 by means of the probe manipulation device 16. The processing module 30 is configured for determining the at least one experimental stimulus such that the at least one experimental stimulus, upon being applied by the probe manipulation device 16 to the one or more biological probes 20, sets corresponding environmental conditions of the biological probes 20 so that the convergence criterion comes closer to being fulfilled.
[0133] The processing module 30 is further configured for sending control instructions to the live-cell imaging system, in particular to the control unit 18, via the functional connection 40, for controlling the probe manipulation device 16 to apply the at least one experimental stimulus determined by the processing module 30 to the one or more biological probes 20. This can be achieved by different configurations that may allow the processing module 30 to control a live-cell imaging system so as to implement a live-cell imaging method according to embodiments of the present invention. Examples of some of these configurations are discussed below.
[0134]
[0135]
[0136] The microscope 126 is movable for scanning the one or more biological probes 20. For this purpose, the imaging device 12 comprises a first guide structure 128 for guiding the movement of the microscope 126 in the x direction and a first stepper motor 130 for driving the movement of the microscope 126 along the first guide structure 128, i.e. in the x direction. Although it is not shown in
[0137] The control unit 18 is further configured for controlling a movement and corresponding positioning of the optical device by correspondingly controlling the settings of the stepper motors or of the corresponding motor units for scanning the one or more biological probes 20 according to a measuring routine, which can be stored in the control unit 18 or be inputted to the control unit 18 by the processing module 30. The measuring routine specifies a sequence of measuring events, i.e. of measuring positions corresponding to one of the probes and respective measuring times.
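The measuring routine described above, i.e. a sequence of measuring events each pairing a measuring position with a measuring time, could be represented as in the following minimal sketch. The names `MeasuringEvent`, `build_routine`, `probe_id`, `x_mm`, `y_mm`, `time_s` and `dwell_s` are hypothetical and do not appear in the application; the round-robin scheduling is only an assumed example of how such a routine might be generated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MeasuringEvent:
    """One measuring event: where to position the optical device and when."""
    probe_id: str   # hypothetical label of the probe to be imaged
    x_mm: float     # hypothetical stage coordinate in the x direction
    y_mm: float     # hypothetical stage coordinate in the y direction
    time_s: float   # measuring time relative to the start of the routine


def build_routine(positions, cycle_interval_s, cycles, dwell_s=1.0):
    """Visit every probe once per cycle and return events ordered by time.

    positions maps a probe label to its (x, y) coordinates. dwell_s is an
    assumed time offset between consecutive probes within one cycle.
    """
    events = []
    for cycle in range(cycles):
        for index, (probe_id, (x_mm, y_mm)) in enumerate(sorted(positions.items())):
            time_s = cycle * cycle_interval_s + index * dwell_s
            events.append(MeasuringEvent(probe_id, x_mm, y_mm, time_s))
    return events


# Two probes imaged every 600 s for two cycles.
routine = build_routine({"A1": (0.0, 0.0), "A2": (9.0, 0.0)}, 600.0, 2)
```

Such a list could equally be stored in the control unit 18 or supplied by the processing module 30; the sketch makes no assumption about which.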
[0138] When optical measurements are to be obtained from a particular biological probe 20, the position of the microscope 126 is adjusted such that the microscope 126 is located directly below said particular biological probe 20 and can optically access the probe through the transparent cover plate 122 and through the corresponding transparent well plate, dish, well or the like for obtaining the optical measurements. The coordinates defining the position of each biological probe on the plate 122 can be stored in the control unit 18 or in an external processing module 30. The microscope 126 can then be moved so as to scan the biological probes 20 in order to obtain optical measurements of the biological probes 20, for example based on control instructions received from the processing module 30 over the functional connection 40 corresponding to the positions of the biological probes 20 on the cover plate 122. In other examples, the control unit 18 or the processing module 30 can be configured for identifying non-pre-stored positions of the biological probes 20.
[0139] In the embodiment shown in
[0140] In the embodiment shown in
[0141] The live-cell imaging system 10 illustrated in
[0142] The mirror plate 50 can be pivoted to the open position, shown in
[0143]
[0144]
[0145] According to the method 200, at least one optical measurement of the biological probes 20 is obtained by the imaging device 12 at operation 202. The at least one optical measurement can correspond to digital image data of the one or more biological probes 20. For instance, each optical measurement can correspond to digital image data of one corresponding biological probe 20 at a given time. The control unit 18 controls the operation of the imaging device 12 for obtaining the at least one optical measurement, for example by determining the settings of the optical device of the imaging device 12 for the optical measurement, such as focusing settings or image size and definition, and/or by determining a positioning or sequence of positionings of the optical device with respect to the one or more biological probes 20. The optical measurements obtained are then received by the control unit 18 and forwarded to the processing module 30 via the functional connection 40. The optical measurements can correspond to one of the biological probes 20 or to different biological probes 20, in which case the information received by the control unit 18 from the imaging device 12 can further comprise, for each optical measurement, information about the corresponding biological probe, for example spatial coordinates or an identification label obtained during the measurement by the imaging device 12.
[0146] According to the method 200, at 204, the processing module 30 analyses the at least one optical measurement obtained by the imaging device 12 for determining at least one measurement value of a cellular parameter of the one or more biological probes 20. The cellular parameter can for example be an apoptosis rate based on counts of living and/or dead cells extracted from optical measurements of the same biological probe distributed over time, the measurement value then corresponding to the value of the apoptosis rate in each case. For this purpose, the processing module 30 of the embodiment considered comprises a convolutional neural network algorithm that has been trained using a large number of optical measurements for identifying cells as living and/or dead cells from an image of the corresponding biological probe obtained as an optical measurement. From the values of the measurement value “number of apoptotic cells”, a cellular parameter “apoptosis rate” or a cellular parameter “drug concentration at half-maximal apoptosis rate” can be estimated using a biological model, as will exemplarily be shown below.
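As an illustration of how an apoptosis rate might be derived from counts of living and dead cells taken at successive times, the following sketch uses a simple finite-difference estimate. The estimator is an assumption made for illustration only (newly dead cells divided by the living cells at the start of the interval and by the elapsed time); it is not the biological model referred to in the application, and the function name `apoptosis_rates` is hypothetical.

```python
def apoptosis_rates(times_h, living, dead):
    """Estimate an apoptosis rate (per hour) for each consecutive pair of images.

    Assumed estimator: cells that died during the interval, divided by the
    number of living cells at the start of the interval and by its duration.
    """
    if not (len(times_h) == len(living) == len(dead)):
        raise ValueError("times_h, living and dead must have the same length")
    rates = []
    for i in range(1, len(times_h)):
        dt = times_h[i] - times_h[i - 1]
        if dt <= 0:
            raise ValueError("measurement times must be strictly increasing")
        if living[i - 1] <= 0:
            raise ValueError("no living cells at the start of the interval")
        newly_dead = dead[i] - dead[i - 1]
        rates.append(newly_dead / (living[i - 1] * dt))
    return rates


# Counts from three images of the same probe taken one hour apart.
rates = apoptosis_rates([0.0, 1.0, 2.0], [100, 90, 81], [0, 10, 19])
```

With these illustrative counts, one tenth of the living cells die in each hour, which matches the exemplary target rate of 0.1/h mentioned for the regulatory task.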
[0147] The processing module 30, at 206, determines whether the at least one measurement value satisfies a convergence criterion of a regulatory task. In the embodiment under consideration, an example of which is provided in detail below as Example 1, the regulatory task may define a target apoptosis rate to be achieved and held, for example an apoptosis rate of 0.1/h.
[0148] If the result of operation 206 is positive, i.e. if the processing module 30 determines that the convergence criterion is satisfied, this means that the apoptosis rate has a value corresponding to the target apoptosis rate and the method is terminated, at 208, as illustrated in
[0149] If the result of operation 206 is negative, i.e. if the processing module 30 determines that the convergence criterion is not satisfied, this means that the apoptosis rate does not (yet) have a value corresponding to the target apoptosis rate. The method 200 then continues, at 210, with the probe manipulation device 16 of the live-cell imaging system 10 applying to the one or more biological probes 20 an experimental stimulus, for example a drug concentration determined by the processing module 30, such that the apoptosis rate approaches or achieves the target apoptosis rate. If the regulatory task is not fulfilled because the determined apoptosis rate is below the target apoptosis rate, the processing module 30 defines an experimental stimulus corresponding to an increase in the concentration of the bioactive agent in the experimental fluid and correspondingly instructs the probe manipulation device 16, via the control unit 18, to set the concentration accordingly. As a result, the apoptosis rate of the biological probes, to which the experimental stimulus is applied, increases.
[0150] Conversely, if the determined apoptosis rate is above the target apoptosis rate, the processing module may determine and transmit to the control unit 18 an experimental stimulus corresponding to a reduction in the concentration of the bioactive agent in the experimental fluid with which the biological probes 20 are being perfused. As a result, the apoptosis rate of the biological probes, to which the experimental stimulus is applied, decreases.
[0151] The method then goes back to operation 202 and reiterates until the regulatory task is fulfilled, or reiterates in order to keep fulfilment of the regulatory task.
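The loop of operations 202 to 210 can be sketched as a simple feedback controller. This is a minimal sketch under stated assumptions: the fixed concentration step, the tolerance, the function names `regulate` and `toy_response`, and the saturating toy response curve standing in for the real experiment are all hypothetical and are not taken from the application.

```python
def toy_response(concentration):
    """Hypothetical stand-in for the experiment: a saturating apoptosis rate in 1/h."""
    return 0.2 * concentration / (concentration + 1.0)


def regulate(measure_rate, target, tolerance, concentration, step, max_cycles=100):
    """Raise or lower the drug concentration until the measured rate meets the target."""
    for cycle in range(max_cycles):
        rate = measure_rate(concentration)        # operations 202 and 204
        if abs(rate - target) <= tolerance:       # operation 206: criterion satisfied
            return concentration, rate, cycle     # operation 208: terminate
        if rate < target:                         # operation 210: adjust the stimulus
            concentration += step
        else:
            concentration = max(0.0, concentration - step)
    raise RuntimeError("target rate not reached within max_cycles")


final_concentration, final_rate, cycles_used = regulate(
    toy_response, target=0.1, tolerance=0.005, concentration=0.0, step=0.25
)
```

A fixed step is the simplest possible choice; the application leaves open how the processing module 30 determines the size of each concentration change. The sketch also stops once the target is met, whereas the method may keep iterating in order to hold the target rate.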
[0152]
[0153] According to the method 300, the processing module 30 is configured for recording a sequence of N measurement values and associated environmental conditions for each of the biological probes 20 corresponding to N cyclic repetitions of operations 302 and 304. In the example illustrated in
[0154] The associated environmental conditions correspond to the environmental conditions of a biological probe when the optical measurement is obtained and may be estimated from the optical measurement and/or from an experimental stimulus applied to the biological probe, or directly measured by other means such as sensors and the like. For example, when the at least one experimental stimulus corresponds to a light intensity, for instance provided by an LED light source, to which the one or more biological probes are exposed, the imaging device may comprise a light intensity detector for detecting the light intensity applied to the biological probe from which the optical measurement is being obtained. Additionally or alternatively, the probe manipulation device may be calibrated such that an environmental condition can be directly obtained from the experimental stimulus applied to the biological probe from which the optical measurement is being obtained.
[0155] N may for example be 10 or 100. If the processing module 30 determines at 306 that the regulatory task is not fulfilled, the processing module 30 evaluates, at 308, whether a number of consecutive cyclic repetitions of operations 302 to 306 corresponds to N or to a multiple thereof, i.e. to k·N with k being an integer (k ∈ Z). If the number of cyclic repetitions of operations 302 to 306 does not correspond to k·N, the method 300 continues with operation 310, which is analogous to operation 210 described above for the method 200 of
[0156]
[0157] In the exemplary method illustrated in
[0158] The probe manipulation device 16 is configured for controlling the concentration of a bioactive agent and for perfusing the biological probes 20 with the experimental fluid via experimental fluid conduits 162 as shown in
[0159] In the case of method 400 illustrated in
[0160] In the embodiment under consideration, the stimulus trajectory is applied to each of the biological probes 20 in parallel. Thus, with reference to the arrangement illustrated in
[0161] Each time an optical measurement is obtained for one of the biological probes 20, i.e. each time a measurement value of the cellular parameter is determined, the measurement value and the corresponding concentration of the bioactive agent, i.e. the corresponding value U.sub.i, or a value of an environmental condition related thereto, are stored, for example in the processing module 30 or in a storage device connected thereto.
[0162] In operation 406, the processing module 30 determines whether a number of repetitions of operations 402 and 404 is smaller than or equal to (i.e. does not exceed) a number M corresponding to the number of experimental stimuli U.sub.1, . . . , U.sub.M of the stimulus trajectory. If this is the case, i.e. if the stimulus trajectory U.sub.1, . . . , U.sub.M has not been completed yet, the method 400 proceeds to operation 416.
[0163] In operation 416, the processing module 30 determines, like in operation 308 of method 300 illustrated in
[0164] If the number of cyclic repetitions of operations 402 to 406 does not correspond to k·N, the method 400 continues with operation 420, which is analogous to operations 210 and 310 described above, respectively, for method 200 illustrated in
[0165] If the processing module 30 determines at 416 that the number of cyclic repetitions of operations 402 to 406 corresponds to N or a multiple thereof, method 400 proceeds to operation 418, in which the processing module uses the sequence of measurement values or the sequence of measurement values and associated environmental conditions for fitting model parameters of a biological model for estimating values of the cellular parameter of the one or more biological probes as a function of the corresponding environmental condition and of the model parameters, corresponding to operation 312 of the method 300 illustrated in
[0166] If the processing module determines, at 406, that the number of repetitions of operations 402 and 404 exceeds the number M, i.e. that the current stimulus trajectory has been completed, it goes on to operation 408, in which the processing module determines whether a convergence criterion of the regulatory task is fulfilled. In the embodiment illustrated in
[0167] Otherwise, if condition 408 has a negative result, meaning that the convergence criterion is not fulfilled yet, the method 400 proceeds to 410. In 410, the processing module 30 determines an updated stimulus trajectory U.sub.1.sup.updated, . . . , U.sub.M.sup.updated. Examples of the determination of the updated stimulus trajectory shall be provided below.
[0168] After operation 410, the method 400 proceeds to operation 414, in which the processing module replaces the (previous) stimulus trajectory U.sub.1, . . . , U.sub.M by the updated stimulus trajectory U.sub.1.sup.updated, . . . , U.sub.M.sup.updated and goes back to operation 402 to restart the sequence of operations 402 to 420 for a further iteration as long as the convergence criterion evaluated in operation 408 is not fulfilled. Thus, when, in subsequent iterations, operation 420 is carried out, the experimental stimulus applied is an experimental stimulus of the corresponding updated stimulus trajectory U.sub.i.sup.updated.
[0169] Method 400 thereby achieves an optimal experimental set-up for estimating the model parameters. A detailed example of an application of method 400 is described below as Example 1.
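The control flow of method 400 can be summarised in the following sketch. It is a simplified reading of operations 402 to 420 in which the experiment, the model fit, the convergence test and the trajectory update are injected as callables. All function and parameter names are hypothetical, the mapping of code lines to operation numbers is approximate, and the stub functions in the usage example are placeholders rather than the procedures of the application.

```python
def run_method_400(trajectory, apply_and_measure, fit_model, converged,
                   update_trajectory, refit_every, max_iterations=10):
    """Apply stimulus trajectories repeatedly, refitting the model every N records."""
    records = []   # stored (stimulus, measurement value) pairs
    model = None
    for _ in range(max_iterations):
        for stimulus in trajectory:                          # operations 420, 402, 404
            records.append((stimulus, apply_and_measure(stimulus)))
            if len(records) % refit_every == 0:              # operation 416: k*N reached
                model = fit_model(records)                   # operation 418: fit parameters
        if converged(model, records):                        # operation 408
            return trajectory, model, records
        trajectory = update_trajectory(trajectory, model)    # operations 410 and 414
    raise RuntimeError("convergence criterion not fulfilled within max_iterations")


# Usage with placeholder stubs (purely illustrative, no biological meaning).
final_trajectory, final_model, history = run_method_400(
    trajectory=[1.0, 2.0],
    apply_and_measure=lambda u: 2.0 * u,
    fit_model=lambda recs: sum(v for _, v in recs) / len(recs),
    converged=lambda model, recs: model is not None and model >= 4.0,
    update_trajectory=lambda traj, model: [u + 1.0 for u in traj],
    refit_every=2,
)
```

Passing the four steps as callables keeps the loop structure separate from the choices the application leaves to its examples, such as how the updated trajectory is determined.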
[0170]
[0171] However, according to method 500, when the processing module 30 determines negative outputs of conditions 506 and 508, the method 500 proceeds to 510, wherein the processing module 30 determines a subsequent experimental stimulus U.sub.M+1 to be added to the stimulus trajectory.
[0172] The subsequent experimental stimulus is determined for setting a “significant environmental condition” of at least one of the biological probes to which it is applied. The “significant environmental condition” is defined as an environmental condition having a value greater than a first environmental condition and smaller than a second environmental condition. The first and second environmental conditions are defined as environmental conditions respectively set by a first experimental stimulus and a second experimental stimulus, wherein a variation of the cellular parameter as a function of the environmental condition is extremal, e.g. maximal or minimal, between the first and second environmental conditions. Thus, the “significant environmental condition” is defined to be in a region of the stimulus trajectory at which an extremal variation of the cellular parameter is determined. The subsequent experimental stimulus U.sub.M+1 is defined as a function of the first and second experimental stimuli. A detailed example of an application of method 500, including an exemplary manner for determining the subsequent experimental stimulus, is described below as Example 2.
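One possible way of selecting the subsequent experimental stimulus is sketched below. The sketch assumes that the environmental condition equals the applied stimulus, takes "extremal variation" to mean the largest absolute change of the cellular parameter between two neighbouring stimuli, and uses the midpoint as the function of the first and second stimuli. These three choices and the name `next_stimulus` are assumptions for illustration; the application leaves the exact function to Example 2.

```python
def next_stimulus(stimuli, values):
    """Return a stimulus between the two neighbours with the largest parameter change.

    stimuli: stimuli applied so far (e.g. drug concentrations).
    values:  measured cellular parameter for each stimulus.
    """
    if len(stimuli) != len(values):
        raise ValueError("stimuli and values must have the same length")
    pairs = sorted(zip(stimuli, values))       # order data points by stimulus
    if len(pairs) < 2:
        raise ValueError("at least two data points are required")
    # Index of the neighbouring pair with the maximal absolute variation.
    best = max(range(len(pairs) - 1),
               key=lambda i: abs(pairs[i + 1][1] - pairs[i][1]))
    first, second = pairs[best][0], pairs[best + 1][0]
    return 0.5 * (first + second)              # assumed function: the midpoint


# The parameter changes most between stimuli 1.0 and 2.0.
u_next = next_stimulus([0.0, 1.0, 2.0, 4.0], [0.0, 0.1, 0.8, 0.9])
```

Placing new data points where the response changes most concentrates measurements on the steep part of a response curve, which is where the model parameters are typically least constrained.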
[0173] After the subsequent experimental stimulus U.sub.M+1 is determined in operation 510, the method proceeds to operation 514, wherein the probe manipulation device 16 applies the determined subsequent experimental stimulus U.sub.M+1 to the corresponding biological probes 20, and then proceeds back to operation 502, as illustrated in
[0174] Thus, according to method 500, a closed loop can be defined for determining data points, fitting the model parameters and determining the next experimental stimulus to be applied (e.g. a drug concentration).
[0175] The method 500 may allow for a more comprehensive characterisation of drug response curves, for example IC50 curves, which can be relevant for selecting drug candidates. Further, the method 500 allows determining drug response curves for cell proliferation and for cell death rates separately, without confusing the sources or effects thereof.
[0176]
[0177] The processing module 30 can also be configured for controlling the probe manipulation devices of each of the live-cell imaging systems 10-1, . . . , 10-K such that one regulatory task is distributed over the live-cell imaging systems. For example, a stimulus trajectory can be applied to the live-cell imaging systems 10-1, . . . , 10-K, with a first part of the stimulus trajectory, for example a first number of experimental stimuli of the stimulus trajectory, being applied to a first live-cell imaging system 10-1, a second part of the stimulus trajectory, for example a second number of other experimental stimuli of the stimulus trajectory, being applied to a second live-cell imaging system 10-2, and so on. The measurement values based on the optical measurements obtained by each of the live-cell imaging systems can all be received and analysed by the processing module 30.
[0178]
[0179] According to method 600, K stimulus trajectories U.sup.j.sub.i (j=1, . . . , K), each with M experimental stimuli (i=1, . . . , M), are applied, at 602, to K different sets of biological probes by the probe manipulation devices 16 of respective live-cell imaging systems 10-1, . . . , 10-K. “Set of biological probes” refers herein to the biological probes that are monitored and acted upon by a given live-cell imaging system.
[0180] The different stimulus trajectories are defined by different values of a stimulus parameter. For example, different time-dependent stimulus trajectories can be defined as a function of a stimulation parameter α. A first time-dependent stimulus trajectory U(t.sub.i, α.sub.1)=U(t.sub.1, α.sub.1), . . . , U(t.sub.M, α.sub.1) is applied to a first set of biological probes by a first live-cell imaging system 10-1, a second time-dependent stimulus trajectory U(t.sub.i, α.sub.2)=U(t.sub.1, α.sub.2), . . . , U(t.sub.M, α.sub.2) is applied to a second set of biological probes by a second live-cell imaging system 10-2, and so on until the K-th time-dependent stimulus trajectory U(t.sub.i, α.sub.K)=U(t.sub.1, α.sub.K), . . . , U(t.sub.M, α.sub.K) is applied to a set of biological probes by the K-th live-cell imaging system 10-K.
[0181] At 602, the processing module controls the probe manipulation devices 16 of the live-cell imaging systems 10-1, . . . , 10-K to simultaneously apply a respective one of the stimulus trajectories U(t.sub.i, α.sub.1), . . . , U(t.sub.i, α.sub.K) to the corresponding sets of biological probes.
[0182] In operation 604, the processing module 30 controls the imaging devices 12 of the live-cell imaging systems 10-1, . . . , 10-K to obtain optical measurements of the corresponding sets of biological probes. Operations 602 and/or 604 may overlap in time at least in part. In operation 606, the processing module 30 obtains the measurement values from the optical measurements obtained by the imaging device 12.
[0183] At 608, the processing module 30 determines whether a convergence criterion of a regulatory task is fulfilled, for example if a corresponding stimulus trajectory has been completed. The processing module 30 stores, for each obtained measurement value, the measurement value and the associated experimental stimulus.
[0184] If, at 608, it is determined that the stimulus trajectory has not been completed yet, the method proceeds to operation 610, in which an extremal value α* of the stimulus parameter α is determined by the processing module 30. The extremal value α* corresponds to the stored experimental stimulus U(t,α.sub.j), which, from all experimental stimuli of all stimulus trajectories applied to the biological probes, results in an extremal, e.g. maximal or minimal, measurement value.
[0185] The processing module then proceeds to determine, in 612, a first reference value α.sub.1 and a second reference value α.sub.2. The first and second reference values α.sub.1, α.sub.2 are determined such that α.sub.1≤α*≤α.sub.2, preferably such that α.sub.1<α*<α.sub.2. For example, if α*=α.sub.j, the first and second reference values can be defined as the values immediately preceding and immediately following the stimulus parameter α*, i.e. α.sub.1=α.sub.j−1 and α.sub.2=α.sub.j+1.
[0186] Then, in operation 614, the processing module 30 replaces the K different stimulus trajectories that are applied to the K different sets of biological cells by the K updated stimulus trajectories U.sup.updated(t.sub.i,α.sub.j) with i=1, . . . , M and j=1, . . . , K, respectively. The updated stimulus trajectories are respectively determined by a value of the stimulus parameter α lying between the first reference value α.sub.1 and the second reference value α.sub.2, i.e. for α.sub.1≤α.sub.j≤α.sub.2 for all j.
[0187] The method 600 then proceeds to operation 616, in which the processing module determines whether a difference between the first and second reference values |α.sub.2−α.sub.1| is smaller than a predefined threshold value δ. If this is the case, the method 600 is terminated at 618.
[0188] Otherwise, the method 600 goes back to operation 602 for a further iteration of method operations 602 to 616.
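The iteration of operations 602 to 616 can be illustrated by the following minimal sketch. It is not part of the described system: the function `measure` stands in for applying a stimulus trajectory and evaluating the optical measurements, the equally spaced choice of the K parameter values within the current interval is an assumption, and the bell-shaped `toy_response` is a hypothetical response curve used only to exercise the loop.

```python
import math

def narrow_stimulus_parameter(measure, alpha_min, alpha_max, k=5,
                              delta=1e-3, max_iter=50):
    """Narrow [alpha_min, alpha_max] around the value giving the maximal measurement."""
    lo, hi = alpha_min, alpha_max
    best = None
    for _ in range(max_iter):
        # K parameter values, one per live-cell imaging system (assumed equally spaced).
        alphas = [lo + (hi - lo) * j / (k - 1) for j in range(k)]
        values = [measure(a) for a in alphas]            # operations 602 to 606
        j_star = max(range(k), key=values.__getitem__)   # operation 610: extremal value
        best = alphas[j_star]
        # Operation 612: reference values are the neighbours of alpha*,
        # clipped at the edges of the current grid.
        lo = alphas[max(j_star - 1, 0)]
        hi = alphas[min(j_star + 1, k - 1)]
        if abs(hi - lo) < delta:                         # operation 616
            break
    return best, (lo, hi)

def toy_response(alpha):
    """Hypothetical bell-shaped measurement value peaking at alpha = 2.0."""
    return math.exp(-(alpha - 2.0) ** 2)

alpha_star, interval = narrow_stimulus_parameter(toy_response, 0.0, 10.0)
```

For a response with a single maximum, the neighbours of the best grid value always bracket the true maximum, so the interval can shrink in every iteration without losing the optimum.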
PRACTICAL EXAMPLES
Example 1
[0189] Example 1 is an example of a live-cell imaging method 400 as illustrated in
[0190] Apoptotic cells are identified by the processing module 30 using a correspondingly trained convolutional network algorithm that identifies dead cells by means of image segmentation on the basis of local image contrast using images of one of the biological probes 20.
[0191] The probe manipulation device 16 is configured for controlling the concentration of a bioactive agent in an experimental fluid and for perfusing the biological probes with the experimental fluid by means of a pump for pumping the experimental fluid, an output reservoir for collecting experimental fluid residues after they have flowed in contact with the biological probes, and fluid conduits fluidly connecting each of the biological probes between the pump and the output reservoir. The bioactive agent is T4-CD95L, which promotes cell death through binding of the cell death ligand (CD95L) and which determines the concentration of CD95L in the experimental fluid.
[0192] In the exemplary embodiment under consideration, a stimulus trajectory is defined by a time-dependent concentration of the CD95L death ligand as given by the function:
u(t)=U.sub.0 exp(−U.sub.1·t) (1)
U.sub.0 being the initial concentration of the death ligand at an initial time t=0, which can be, for example, 500 ng/ml. The stimulus trajectory is hence defined by a sequence of concentrations of the CD95L death ligand corresponding to evaluations of the function u(t) of equation (1) at times corresponding to multiples of a predefined time interval Δt, i.e. U.sub.i=u(i·Δt) with i being a natural number, i=1, . . . , M.
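As a minimal sketch, the sampling of equation (1) into a stimulus trajectory can be written as follows. The initial concentration of 500 ng/ml is taken from the text; the decay rate, the time interval and the number of stimuli are illustrative assumptions.

```python
import math

def stimulus_trajectory(u0, decay_rate, dt, m):
    """Return [U_1, ..., U_M] with U_i = u(i*dt) and u(t) = u0 * exp(-decay_rate * t)."""
    return [u0 * math.exp(-decay_rate * i * dt) for i in range(1, m + 1)]

# u0 = 500 ng/ml as in the text; decay_rate, dt and m are assumed values.
trajectory = stimulus_trajectory(u0=500.0, decay_rate=0.1, dt=1.0, m=5)
```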
[0193] The used biological model is based on a set of three coupled ODEs and describes binding of the cell death ligand CD95L to cell death receptors CD95R, release of the ligand, activation of an effector caspase C* by active death receptors, and inactivation of the effector caspase C*:
[CD95R] is the concentration of the cell death receptor CD95R, [CD95L] is the concentration of the cell death ligand CD95L, [CD95R*] is the concentration of active CD95R cell death receptors, and [C*] is the concentration of active effector caspase. The biological model is further defined by model parameters k.sub.on, k.sub.off, k.sub.act, and k.sub.inact, which are kinetic parameters respectively describing binding, unbinding, activation of the effector caspase and inactivation of the active effector caspase, and by the Hill-type function parameters h and K, which are used for modelling effector caspase activation. According to this biological model, a number or fraction of dead cells can be associated with a fraction of active effector caspase. Thus, the biological model defined by equations (2) to (4) provides estimated values of the cellular parameter (i.e. here a fraction of apoptotic cells) as a function of the environmental condition defined by the experimental stimuli (concentration of the cell death ligand [CD95L]) and the model parameters k.sub.on, k.sub.off, k.sub.act, k.sub.inact, h and K.
[0194] The updated stimulus trajectory is determined by the processing module such that a covariance of the one or more model parameters as a function of the stimulus trajectory is minimised. For a parameter vector θ (e.g. θ=[k.sub.on, k.sub.off, k.sub.act, k.sub.inact, h, K].sup.T), and an estimate θ̂ of the parameters obtained from model fitting, the covariance matrix of the model parameters can be defined as an expected value C.sub.θ=E((θ̂−θ)(θ̂−θ).sup.T). For example, the processing module can be configured to calculate a sensitivity matrix
defined for a function y{x(t.sub.i),u(t.sub.i),θ}, with x being a vector containing the concentrations of model quantities of the biological model given by equations (2) to (4), x=[[CD95R], [CD95R*], [C*]].sup.T at time points t.sub.i, with i=1, . . . , M. Using an estimate of the covariance matrix C.sub.y defined for the optical measurements obtained by the imaging device 12 and sensitivities at different time points S.sub.t.sub.
[0195] Thus, the processing module can be configured to determine the updated stimulus trajectory such that the covariance of the one or more model parameters as a function of the stimulus trajectory, i.e. the trace or the determinant of C.sub.θ or of F.sup.−1, is minimised to obtain more accurate estimates of the model parameters. The updated stimulus trajectory is then obtained (in operation of the method illustrated in
[0196] When the convergence criterion defined for condition 408 is fulfilled, an improved estimation of the model parameters k.sub.on, k.sub.off, k.sub.act, k.sub.inact, h and K is obtained. Up to that point, the experiment has been designed in an optimised manner by choosing experimental conditions such that the uncertainty of the model parameter estimates is minimised. Thus, an improved or more reliable version of the biological model defined by equations (2) to (4) is obtained in an efficient and automated manner based on closed-loop live-cell imaging.
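The selection of a stimulus trajectory by minimising the determinant of F.sup.−1 can be sketched as follows. This is only an illustration under stated assumptions: the saturating toy model y=a·u/(b+u) with two parameters is hypothetical and replaces the biological model of equations (2) to (4), the measurement noise is assumed independent with a common standard deviation, the Fisher information is taken in its standard form as the sum of outer products of the sensitivities divided by the noise variance, and the candidate decay rates and the current estimate `theta_hat` are invented values.

```python
import math

def fisher_information(theta, stimuli, sigma=0.05):
    """Entries (F11, F12, F22) of the 2x2 Fisher information for y = a*u/(b+u)."""
    a, b = theta
    f11 = f12 = f22 = 0.0
    for u in stimuli:
        s_a = u / (b + u)              # sensitivity dy/da
        s_b = -a * u / (b + u) ** 2    # sensitivity dy/db
        f11 += s_a * s_a / sigma ** 2
        f12 += s_a * s_b / sigma ** 2
        f22 += s_b * s_b / sigma ** 2
    return f11, f12, f22

def d_criterion(theta, stimuli):
    """det(F^-1) = 1/det(F); smaller values mean tighter joint parameter estimates."""
    f11, f12, f22 = fisher_information(theta, stimuli)
    det = f11 * f22 - f12 * f12
    return math.inf if det <= 0.0 else 1.0 / det

def trajectory(u0, decay, dt=1.0, m=10):
    """Stimuli U_i = u0 * exp(-decay * i * dt), following the form of equation (1)."""
    return [u0 * math.exp(-decay * i * dt) for i in range(1, m + 1)]

theta_hat = (1.0, 100.0)                        # current parameter estimate (assumed)
candidates = [0.01, 0.05, 0.1, 0.2, 0.5, 1.0]   # candidate decay rates (assumed)
best_decay = min(candidates,
                 key=lambda d: d_criterion(theta_hat, trajectory(500.0, d)))
```

A grid over candidate decay rates is used here for brevity; any optimiser over the trajectory parameters could take its place.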
Example 2
[0197] Example 2 is a detailed example of a live-cell imaging method 500 as illustrated in
[0198] The probe manipulation device 16 is configured for controlling a concentration d of a chemotherapeutic drug in an experimental fluid and for perfusing the biological probes with the experimental fluid, thereby respectively setting environmental conditions thereof. The stimulus trajectory is initially defined here as a sequence of (initially three) drug concentrations U.sub.1=d.sub.max/100, U.sub.2=d.sub.max/10 and U.sub.3=d.sub.max=U.sub.M.
[0199] The biological model being used is a model that describes the number of living cells L and the number of dying cells A, a growth rate θ.sub.g and an apoptosis rate θ.sub.a as a function of the chemotherapeutic drug concentration d and of a set of model parameters k.sub.g, k.sub.d, K.sub.i, K.sub.d, h and l:
where k.sub.g denotes a maximal speed of growth, k.sub.d denotes a maximal speed of cell death, K.sub.i denotes a drug concentration at which the proliferation rate is decreased to half its maximal value, K.sub.d denotes the chemotherapeutic drug concentration resulting in the half-maximal cell death rate, and h and l are Hill-function parameters describing the steepness of the involved sigmoidal curves.
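The roles of the parameters described above can be illustrated with a minimal sketch. The two functions below assume standard Hill-type forms that are consistent with the stated parameter meanings (half-maximal growth at d=K.sub.i, half-maximal cell death at d=K.sub.d); they are an assumption for illustration and are not a reproduction of the equations of the biological model, and all numerical values are hypothetical.

```python
def growth_rate(d, k_g, K_i, h):
    """Assumed inhibitory Hill form: k_g at d = 0, falling to k_g / 2 at d = K_i."""
    return k_g * K_i ** h / (K_i ** h + d ** h)

def death_rate(d, k_d, K_d, l):
    """Assumed activating Hill form: 0 at d = 0, rising to k_d / 2 at d = K_d."""
    return k_d * d ** l / (K_d ** l + d ** l)

# Hypothetical parameter values, chosen only to exercise the functions.
params = dict(k_g=0.03, K_i=2.0, h=2.0, k_d=0.05, K_d=5.0, l=3.0)
theta_g_half = growth_rate(params["K_i"], params["k_g"], params["K_i"], params["h"])
theta_a_half = death_rate(params["K_d"], params["k_d"], params["K_d"], params["l"])
```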
[0200] In method 500, operation 518 comprises (re)fitting the model parameters k.sub.g, k.sub.d, K.sub.i, K.sub.d, h and l, to the set of data points (measurement values, i.e. values of A and L+environmental conditions, i.e. values of d) previously collected and stored by the processing module 30 based on a previous estimation of the growth and cell death rates θ.sub.g and θ.sub.a. For parameter estimation, the biological model can for example be fitted in a two-step procedure: first, a current sequence of measurement values of L and A corresponding to optical measurements obtained by the imaging device 12 at a corresponding drug concentration d can be fitted based on equations (5) and (6) of the biological model to determine the growth and apoptosis (cell death) rates θ.sub.g and θ.sub.a. Then, equations (7) and (8) of the biological model can be fitted to the obtained values of θ.sub.g and θ.sub.a to determine the parameters k.sub.g, k.sub.d, K.sub.i, K.sub.d, h and l. In general, the model parameter K.sub.d is of major interest to describe the effectiveness of the chemotherapeutic drug against cancer cells and should therefore be accurately determined.
[0201] In some examples, the probe manipulation device 16 may be configured for simultaneously applying each of the different drug concentrations to different biological probes, for example different biological probes monitored and influenced by the same live-cell imaging system (e.g. different of the well plates shown in
[0202] The different live-cell imaging systems can be differently located, remote from each other, and centrally controlled by a common processing module, for example via an internet connection, as in the example illustrated in
[0203] In operation 510 of method 500 illustrated in
E.sub.i,j=k.sub.d(θ.sub.a,j−θ.sub.a,i) (9)
[0204] In an iterative sequence based on the three last applied drug concentrations U.sub.M−2<U.sub.M−1<U.sub.M with n≥3, the subsequent experimental stimulus U.sub.M+1 to be applied can then be defined, in operation 510 of method 500 illustrated in
[0205] A sequence of iteration steps consisting of (I) automatically applying a drug concentration according to the current stimulus trajectory, (II) obtaining optical measurements of the biological probes and corresponding measurement values, (III) fitting the model parameters and (IV) determining a subsequent experimental stimulus to be applied based on the scheme defined by equations (9) and (10) can be applied until a certain termination criterion is fulfilled. This termination criterion can be defined as a maximal number of iterations of steps (I) to (IV) or based on the comparative parameter E defined in equation (9). For example, the method can be terminated when the number of iterations of steps (I) to (IV) reaches or exceeds a predefined threshold M.sub.max, or when the comparative parameter E.sub.M,M−1 falls below a certain predefined threshold E.sub.min. Thereby, the parameter estimation of the model parameters k.sub.g, k.sub.d, K.sub.i, K.sub.d, h and l, in particular of the most relevant model parameter K.sub.d, is efficiently optimised by iteratively selecting the most informative data points and hence concentrating data acquisition in relevant regions of the variable space.
[0206] As an alternative to the iteration scheme defined by equations (9) and (10), other iterative schemes can be used.
Example 3
[0207] Example 3 is a further example of a live-cell imaging method according to the present invention, for example according to the method 600 illustrated in
[0208] It is known that the activity of CD95 cell death receptors stimulated by the ligand T4-CD95L follows an inverse bell shape, depending on the dose or concentration of the ligand. An increased concentration of T4-CD95L results in an increased receptor activity and faster cell death. However, after a certain peak concentration is exceeded, further increasing the concentration of the ligand T4-CD95L results in decreased receptor activity and hence decreased cell death. The observation that cell viability is lowest for intermediate cell death ligand concentrations has important implications for using a cell death ligand, which is injected into the cellular compartment and thereafter eliminated from the compartment, to efficiently induce apoptosis in a population of cells. For a pre-defined total amount of cell death ligand, injecting the ligand too slowly will not result in sufficient activation of CD95 receptors, whereas injecting the ligand too fast will result in less cell death than in the case of an intermediate injection speed. The aim of applying the method according to this example of closed-loop live-cell imaging is to find, in an automated manner, an optimal injection speed that maximizes cell death.
[0209] The aforesaid optimal injection speed of the T4-CD95L, at which a maximal fraction of cells undergoing apoptosis is achieved, can be determined by means of method 600 illustrated in
[0210] According to an example, different stimulus trajectories applied to different biological probes are based on the following ODE model that simulates a ligand concentration L in the blood serum of a patient after intravenous administration of a drug, and a concentration L.sub.m of the drug at the site of a tumour within the body of the patient:
wherein the drug transport to the site of the tumour is described by the injection rate k.sub.in and drug removal from the site of the tumour is described by the parameter k.sub.out.
[0211] The solution to the coupled ODEs (11) and (12) that describe a pulse of transient ligand accumulation and removal is:
[0212] Concentration trajectories described by equation (13) with different values for the injection rate k.sub.in can be applied by the probe manipulation device 16 of a live-cell imaging system 10 to simulate biological processes at corresponding biological probes according to method 600.
[0213] In operation 602, M different sets of biological probes are treated with stimulus trajectories L.sub.m(t, k.sub.in,i) at different injection rates k.sub.in,i for i=1, . . . , M (i.e. wherein k.sub.in,i corresponds here to the stimulus parameter α.sub.i as described in
[0214] The different injection rates k.sub.in,i can be chosen to be
with k.sub.in,min,0 and k.sub.in,max,0 being, respectively, the minimum and the maximum injection rates supported by the probe manipulation device 16.
[0215] In operations 604 and 606, numbers of living and dead cells of the biological probes are segmented from optical measurements obtained by the imaging device 12 with a pre-defined time interval between subsequent optical measurements of e.g. 1 hour and the cell death rate θ.sub.a is estimated, for example using the biological model described for Example 2, i.e. using equations (5) to (8), fitted to the measurement values obtained by the processing module 30.
[0216] In operation 610, after the M sets of biological probes have been treated with the respective stimulation trajectories, the injection rate k.sub.in,m=k.sub.in*=α* resulting in the maximum cell death rate is determined by the processing module 30.
[0217] The first and second reference values for this initial iteration are then determined, at operation 612, as k.sub.in,m−1 and k.sub.in,m+1 according to equation (14).
[0218] Accordingly, in each of the subsequent iteration steps with indices l=1 . . . Q, the new minimum and maximum injection rates for the updated stimulus trajectories are respectively determined as a function of the injection rate k.sub.in,m=k.sub.in* resulting in the maximum cell death rate in the previous iteration, as:
[0219] The stimulus trajectories L.sub.m,i(t) are then replaced, in operation 614, by updated stimulus trajectories defined by the subsequent set of injection rates comprised between k.sub.in,m−1 and k.sub.in,m+1, with j=1 . . . M:
[0220] Operations 602 to 614 are then repeated as long as, in 616, the processing module does not determine that |k.sub.in,max,t−k.sub.in,min,t| (cf. |α.sub.2−α.sub.1|) is below a threshold value δ, or, additionally or alternatively, in case the maximum number of overall iterations l=Q is reached.
[0221] Thus, the method 600 allows automatically determining dose-response curves in an efficient manner by enabling feedback between an experimental variable (cf. drug concentration) and a cellular parameter monitored by a live-cell imaging system, based on a pharmacokinetic model describing the drug concentration at the site of a tumour.
[0222] In related examples, stimulus trajectories can be optimised for more than one stimulus parameter.
Example 4
[0223] Example 4 is a further example of a live-cell imaging method according to the present invention, directed to the evaluation of an optimized concentration trajectory for administration of a chemotherapeutic drug. According to this example, the cellular parameter being measured or estimated is an apoptosis rate in each of the biological probes 20.
[0224] According to the present example, a computer-implemented method and a system are provided in which several hardware and software components are combined to systematically optimize concentration trajectories of chemotherapeutic drugs. Specifically, the example relates to a method/system combining: (1) automated live-cell imaging of cells, in particular cancer cells, (2) drug perfusion, (3) detection of cell fates, preferably by a convolutional neural network (CNN), and (4) optimization of concentration trajectories by reinforcement learning based on mathematical models of cellular signal transduction pathways affected by the applied chemotherapeutic drugs that are fitted to the recorded experimental data. For instance, the signal transduction pathways may include the CD95L signal transduction pathway, MAP kinase signal transduction pathway, PI3K/Akt signal transduction pathway, signal transduction pathways associated with apoptosis, cell division, DNA replication, DNA-damage repair mechanism and antigen-specific immune responses, or the like. In the following, components of the systems and how they interact will be described in more detail.
[0225] The hardware component of the system comprises an imaging system, for example the live-cell imaging system 10 of
[0226] The software components, which will be described in more detail below, are configured to analyze and recognize cell states using a CNN and to train mathematical models to estimate cellular systems parameters during the experiment. Based on model fitting results, an RL framework is used to predict optimal drug concentration trajectories, as further described below, that can be experimentally tested. All components are controlled by the processing module 30, which may include a small low-cost single-board computer. The component may be operated via wireless LAN or an Ethernet connection, and therefore supports performing long-term live-cell experiments by remote control via a local network or the internet.
[0227] The processing module 30 of the present embodiment can be used for quantitatively analyzing effects of chemotherapeutic drugs on cancer cells based on live-cell imaging together with an automated control of the applied drug concentration. It can be used to automatically estimate cell growth and death rates for quantitatively characterizing drug efficiencies based on dose-response curves.
[0228] To evaluate microscope images in parallel to the experiment, a CNN was trained to automatically classify the state of cells. More specifically, as shown in
[0229] The principle of optimizing concentration trajectories based on a cellular pathway model can be applied in an exemplary scenario of programmed cell death stimulated by the CD95 cell death ligand (CD95L; CD, cluster of differentiation), for reference, see C Kallenberger et al.: “Intra- and Interdimeric Caspase-8 Self-Cleavage Controls Strength and Timing of CD95-Induced Apoptosis”, in Science Signaling 2014, the entire disclosure of which is hereby incorporated by reference herein. The model, which describes extrinsic apoptosis, links a concentration trajectory of injected CD95L to the percentage of cells undergoing apoptosis in a heterogeneous cell population. It can be demonstrated that, based on the calibrated model, optimal injection speeds can be predicted for different total CD95 doses to minimize the surviving fraction of cells.
[0230]
[0231] To describe heterogeneous cell populations, multivariate (log-normal) distributions of the initial concentrations of model species as well as fractions of tBID sufficient for inducing cell death were estimated by model fitting to experimental data. Based on distribution parameters and estimates of model parameters, CD95L injection speeds can be related to fractions of cells that survive or undergo apoptosis. Thereby, optimal injection speeds can be predicted which maximize cell-death induction caused by using a defined total amount of CD95L.
[0232]
[0233] The method 1200 may be implemented by a live-cell imaging system, for example by the live-cell imaging system 10 of
[0234] As already mentioned earlier, the RL framework 60 for optimizing drug concentration trajectories is based on a pathway model, for instance the programmed cell death model described above in connection with
[0235] According to an embodiment, the processing module 30 may be configured to apply the procedure for optimizing drug concentration trajectories according to the algorithm depicted in
[0236] In case a first convergence criterion is fulfilled (as shown at 1210), which can be defined by a certain number of learning cycles, the optimal concentration trajectory is handed over to the live cell imaging system 10, i.e. the optimal concentration trajectory is experimentally applied by the component for live-cell imaging 12 and drug perfusion 16. Using the recorded microscope images, time series of cell numbers in biological states are segmented using a convolutional neural network, as explained in connection with
[0237] Next, as shown at 1230, the model underlying the RL environment 61 is fitted to the experimental dataset to estimate model parameters.
[0238] Considering the process as a whole, operations 1220 and 1230 can be regarded as an outer loop of the procedure.
[0239] As shown at 1240, a second convergence criterion is evaluated that can be defined by a maximal number of experimental cycles or a desired improvement of the cytostatic or cytotoxic effects caused by the administered drug dose. If the convergence criterion is not fulfilled, the model parameter estimates are used to update the RL environment 61. Thereafter, the RL procedure comprising cycles of choosing infusion rates, model simulations and policy updates, depending on the reward function R(t) associated with the updated environment 61, is repeated. Thereby, again, an optimized drug concentration trajectory is obtained that can be experimentally tested. Cycles of RL, experiments and updates of the environment are performed until the second convergence criterion is fulfilled. As a result, the cytostatic or cytotoxic effect that can be achieved from administering a certain drug dose according to an optimized sequence of infusion rates can be improved.
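The interplay of the inner loop (optimization against the model) and the outer loop (experiment and refit) can be sketched as follows. Everything in this sketch is a simplifying assumption: the bell-shaped `effect` function is a toy stand-in for the pathway model, a plain random search replaces the RL agent (it is not an actor-critic method), the "experiment" is simulated with a hidden parameter `true_peak` instead of live-cell imaging, and the model has a single fitted parameter.

```python
import math
import random

def effect(conc, peak):
    """Toy bell-shaped effect of one step, maximal (= 1) at conc == peak."""
    return (conc / peak) * math.exp(1.0 - conc / peak)

def simulate(trajectory, peak):
    """Model prediction: summed effect of a concentration trajectory."""
    return sum(effect(c, peak) for c in trajectory)

def optimize_trajectory(peak, total, steps, rng, cycles=2000):
    """Inner loop: random-search stand-in for the RL agent, for a fixed total dose."""
    best = [total / steps] * steps
    best_reward = simulate(best, peak)
    for _ in range(cycles):                  # first criterion: number of learning cycles
        i, j = rng.sample(range(steps), 2)
        shift = rng.uniform(0.0, best[i])    # move part of the dose from step i to step j
        cand = list(best)
        cand[i] -= shift
        cand[j] += shift
        reward = simulate(cand, peak)
        if reward > best_reward:
            best, best_reward = cand, reward
    return best

def fit_peak(trajectory, measured, grid):
    """Refit: choose the model parameter that best explains the measurement."""
    return min(grid, key=lambda p: (simulate(trajectory, p) - measured) ** 2)

def closed_loop(true_peak, initial_peak, total=400.0, steps=10,
                max_experiments=5, seed=0):
    rng = random.Random(seed)
    peak = initial_peak
    grid = [5.0 * k for k in range(1, 41)]   # candidate parameter values (assumed)
    traj = []
    for _ in range(max_experiments):         # second criterion: number of experiments
        traj = optimize_trajectory(peak, total, steps, rng)   # inner loop
        measured = simulate(traj, true_peak)                  # stands in for the experiment
        peak = fit_peak(traj, measured, grid)                 # update the environment
    return peak, traj

fitted_peak, best_traj = closed_loop(true_peak=50.0, initial_peak=20.0)
```

In the sketch, each outer cycle hands the trajectory found on the current model to the simulated experiment and then refits the model before the next round of optimization, mirroring operations 1210 to 1240.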
[0240] Briefly, the processing steps of the embodiment of
[0248] Collectively, a combination of live-cell imaging, mathematical modeling and AI is used to improve the effectiveness of chemotherapies. Thereby, applications of the present invention can fill a gap between knowledge from systems biology studies of pathways in cells and clinical applications of chemotherapeutic drugs, which represents an innovative concept on the way from systems biology to systems medicine.
[0249] Accordingly, embodiments of the present invention achieve optimizations of drug concentration trajectories by combining a device for live-cell imaging and drug perfusion with the utilization of RL as an AI method, connected with a mathematical model of the signal transduction pathways in cells affected by the applied drugs. The mathematical model, which translates an applied drug concentration trajectory to a trajectory of cell numbers in certain biological states (such as ‘living’, ‘apoptotic’ or ‘dividing’), is used to define the RL environment. To close the loop, the model is calibrated with experimental data to make realistic model predictions.
[0250] Combined cycles of RL based on the environment defined by the cellular pathway model (inner loop) and of experiments with live-cell imaging and drug perfusion to calibrate the model and update the environment (outer loop) in accordance with embodiments of the invention achieve several advantages. First of all, embodiments of the invention achieve an optimization of the therapeutic effect caused by a certain dose of one oncological drug or doses of several drugs by the RL-based selection of a sequence of infusion rates that result in stronger cell growth inhibition or higher cytotoxicity. As a further effect, after having optimized the therapeutic effect of a certain drug dose, it would be possible to utilize the combination of RL, the mathematical model and the device for live-cell imaging and drug perfusion in accordance with the methods described herein to minimize the total dose of the applied drug. The minimization could be performed in such a way that the total dose, assuming an optimized concentration trajectory, only just suffices to achieve a certain measurement value of a cellular parameter, for example to effect a certain predefined degree of cell death, for instance a rate of apoptosis of >=95%.
[0251] In addition, embodiments of the invention provide a mechanism for the parallelization of RL-based optimization experiments on several technical devices for microscopy and drug perfusion simultaneously controlled by one processing unit that distributes experiments to the devices. In this context, it should be noted that defining an RL environment based on a mathematical model of cellular signalling pathways affected by the applied drugs is not only biologically reasonable, but also supportive for managing the experimental effort of the procedure since the experimental repetitions to be performed can be reduced to a manageable number.
[0252] Hereinafter, an application example with experimental measurements will be described in detail. The described approach was applied to find an optimal concentration trajectory of the CD95 cell death ligand (CD95L) that maximizes the effect of cell death induction achieved by administering a certain CD95L dose. To this end, the live-cell imaging system with infusion pumps was applied in combination with a model of the involved pathway and an RL agent.
Experimental Setup
[0253] The experimental setup consisted of an automated live-cell microscope and infusion pumps for empty medium or CD95L stock solution placed inside a standard cell culture incubator. The microscope was combined with a surrounding, separable, autoclavable box for ensuring a sterile environment inside the incubator. Empty medium and the CD95L stock solution were guided through silicone tubes (inner diameter: 1 mm) connected to a T-shaped adapter piece. The T-adapter was connected with a microfluidic chip containing approximately 80,000 CD95-HeLa cells (human cervix carcinoma cell line overexpressing CD95 death receptors). Thereby, cells could be perfused by a mixture of CD95L stock solution and empty medium. For adjusting the CD95L concentration in each step of an experiment, a constant volume of 0.8 ml was infused. This volume comprised varying fractions of CD95L stock solution and empty medium. The microfluidic chip together with the T-adapter had an inner volume of 0.25 ml that was thus fully replaced by the infused volume. The CD95L stock solution had a concentration of 250 ng/ml. The total dose was set to 400 ng of CD95L, corresponding to a stock solution volume of 1.6 ml. Accordingly, cells could be perfused with CD95L at concentrations between zero and the stock concentration. During the experiment, images of the perfused cells were recorded every 3 minutes using the live-cell microscope. Within 180 minutes, the CD95L concentration was changed 10 times in time intervals of 18 minutes to adjust the CD95L concentration in the microfluidic chip as predicted by the RL agent based on an ODE model of programmed cell death. In this setting, an equal distribution of the total dose of 400 ng CD95L to 40 ng per time interval in the exchanged volume of 0.8 ml resulted in a constant CD95L concentration of 50 ng/ml. The device for live-cell imaging and drug perfusion was controlled remotely using an Ethernet connection to the inner space of the incubator.
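The dosing arithmetic of this setup can be checked with a short sketch. The constants are taken from the description above; the helper `mixing_volumes` and its name are hypothetical and only illustrate how a dose per step maps to volumes of stock solution and empty medium.

```python
STOCK_CONC = 250.0   # ng/ml, CD95L stock solution
STEP_VOLUME = 0.8    # ml infused in each step
TOTAL_DOSE = 400.0   # ng CD95L over the whole experiment
N_STEPS = 10         # concentration changes within 180 minutes

def mixing_volumes(dose_ng):
    """Return (stock_ml, medium_ml, concentration_ng_per_ml) for one infusion step."""
    stock_ml = dose_ng / STOCK_CONC
    if not 0.0 <= stock_ml <= STEP_VOLUME:
        raise ValueError("dose cannot be delivered within one infused volume")
    return stock_ml, STEP_VOLUME - stock_ml, dose_ng / STEP_VOLUME

uniform_dose = TOTAL_DOSE / N_STEPS                # 40 ng per 18-minute interval
stock_ml, medium_ml, conc = mixing_volumes(uniform_dose)
total_stock_ml = TOTAL_DOSE / STOCK_CONC           # stock volume for the total dose
```

For the uniform distribution, each step mixes 0.16 ml of stock solution with 0.64 ml of empty medium, giving the constant concentration of 50 ng/ml stated above; the largest dose deliverable in a single step is 200 ng (0.8 ml of undiluted stock).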
Combination with ODE Model of Programmed Cell Death
[0254] According to the description, an actor-critic RL agent was combined with an ODE model describing programmed cell death. The ODEs and model parameters that were obtained by fitting the model to experimental data were taken from Kallenberger et al. (Science Signaling 2014).
[0255] The model describes the activation process of procaspase-8. After ligation of CD95 death receptors with CD95L, the adapter protein FADD binds to the intracellular part of active CD95 receptors. Then, procaspase-8 dimerizes at CD95-FADD complexes and becomes activated by self-cleavage reactions. Active forms of caspase-8, p43 and p18, cleave the protein BID to truncated BID (tBID). The model further describes concentrations of two fluorescent cleavage probes that were stably expressed in cells to experimentally measure caspase-8 activities. One of the probes indicated the cleavage activity of p43 and p18; the other probe indicated only the cleavage activity of p18. As described by the model, if the concentration of tBID exceeds a certain threshold, mitochondrial outer membrane permeabilization (MOMP) is caused, which irreversibly triggers apoptosis. To use the model for describing cell death in heterogeneous cell populations, the distributions of initial protein concentrations of the model species CD95, FADD, procaspase-8 (p55), BID and two cleavage probes were determined by fluorescence-activated cell sorting (Kallenberger et al., Science Signaling 2014).
[0256] For this application example, to simulate cell death based on the model by Kallenberger et al., single-cell fractions of tBID at the time of apoptosis, relative to the total amount of BID (the sum of cleaved and intact BID), were estimated. To simulate apoptosis in single cells, cell death was assumed in case the fraction of tBID exceeded this threshold. For tBID fractions as well as initial protein concentrations, parameters of log-normal distributions were determined.
[0257] These log-normal distribution parameters were used together with estimated model parameters for simulating apoptosis in heterogeneous cell populations. To simulate single cells of the population, initial protein concentrations and tBID thresholds were sampled. By integrating the ODE model, caspase-8 activation and cleavage of BID to tBID were simulated until the tBID threshold value was exceeded. Different CD95L concentration trajectories were used as model input to simulate accumulation of tBID and cell death over time. For RL, a population of 50 cells was simulated.
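The population simulation can be illustrated with a deliberately simplified sketch. It does not reproduce the ODE system of Kallenberger et al.: a single hypothetical equation for the tBID fraction stands in for the pathway model, and the log-normal parameters, the rate constant `k` and the Euler step `dt` are placeholder assumptions. Only the population size of 50 cells and the 18-minute interval are taken from the text.

```python
import random

STEP_MIN = 18.0   # duration of one concentration interval (from the text)
N_CELLS = 50      # population size used for RL (from the text)


def simulate_population(trajectory_ng_per_ml, seed=0, k=2e-4, dt=0.1):
    """Return the fraction of dead cells after each concentration interval.

    Toy one-variable model (an assumption, not the published ODE system):
        df/dt = k * ligand * receptor * (1 - f)
    where f is the tBID fraction of a single cell.
    """
    rng = random.Random(seed)
    # Hypothetical log-normal parameters; fitted values are not given in the text.
    receptors = [rng.lognormvariate(0.0, 0.4) for _ in range(N_CELLS)]
    thresholds = [min(0.95, rng.lognormvariate(-1.0, 0.3)) for _ in range(N_CELLS)]
    tbid = [0.0] * N_CELLS
    dead = [False] * N_CELLS
    dead_fraction = []
    substeps = int(round(STEP_MIN / dt))
    for ligand in trajectory_ng_per_ml:
        for _ in range(substeps):
            for i in range(N_CELLS):
                if dead[i]:
                    continue  # integration stops once the threshold was exceeded
                # Explicit Euler update of the tBID fraction.
                tbid[i] += dt * k * ligand * receptors[i] * (1.0 - tbid[i])
                if tbid[i] > thresholds[i]:
                    dead[i] = True
        dead_fraction.append(sum(dead) / N_CELLS)
    return dead_fraction


constant = simulate_population([50.0] * 10)
front_loaded = simulate_population([250.0, 250.0] + [0.0] * 8)
print(constant)
print(front_loaded)
```

Both calls use the same seed and therefore the same sampled cells, so the two trajectories can be compared cell by cell. A fitted multi-species model would replace the single Euler update.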
Optimization of CD95L Concentration Trajectory
[0258] An actor-critic (AC) RL agent was defined by two neural networks. The neural network for choosing actions received 19 state variables as input (the current amount of CD95L, average concentrations of model species, time, fraction of tBID, fraction of dead cells). The agent was linked to 11 actions representing CD95L concentrations ranging from zero to the CD95L stock concentration in steps of one tenth of the stock concentration. Inside the AC agent, a stochastic actor representation was linked to a neural network consisting of the following sequence of layers: (1) an input layer linked to state variables, (2) a fully connected layer with 128 neurons, (3) a hyperbolic tangent activation layer, (4) a fully connected layer with 128 neurons, (5) a hyperbolic tangent activation layer, (6) a fully connected layer with 64 neurons, (7) a rectified linear unit (ReLU) activation layer, (8) a fully connected layer with 11 output neurons associated with actions. The neural network serving as critic consisted of the same sequence of layers as the actor network, except that the last fully connected layer had only one output neuron. For training of the AC agent, the Adam optimization algorithm, an extended stochastic gradient descent algorithm, was applied with a learning rate of 0.001 and a gradient threshold of 1. In training episodes, a discount factor of 0.9 was used. A total of 1000 RL training episodes were performed.
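The layer sequence of the two networks can be sketched as a plain forward pass. The layer sizes, activations, 19 inputs and 11 actions are taken from the text; the weight initialisation, the softmax used to turn the 11 actor outputs into action probabilities, the placeholder state vector and the reading of the action set as 0/10 to 10/10 of the stock concentration are assumptions made for illustration. Training (Adam, learning rate 0.001) is not shown.

```python
import math
import random

N_STATE = 19              # state variables (from the text)
N_ACTIONS = 11            # discrete CD95L concentrations (from the text)
STOCK_NG_PER_ML = 250.0   # stock concentration (from the text)

# Assumed action set: 0, 1/10, ..., 10/10 of the stock concentration.
ACTIONS_NG_PER_ML = [STOCK_NG_PER_ML * i / 10 for i in range(N_ACTIONS)]

# (number of neurons, activation) per fully connected layer, as listed in the text.
ACTOR_LAYERS = [(128, "tanh"), (128, "tanh"), (64, "relu"), (N_ACTIONS, None)]
CRITIC_LAYERS = ACTOR_LAYERS[:-1] + [(1, None)]


def init_network(layers, n_in, rng):
    """Create random weights; the initialisation scheme is an arbitrary choice."""
    params = []
    for n_out, _ in layers:
        scale = 1.0 / math.sqrt(n_in)
        weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
                   for _ in range(n_out)]
        params.append((weights, [0.0] * n_out))
        n_in = n_out
    return params


def forward(params, layers, x):
    """Apply each fully connected layer followed by its activation, if any."""
    for (weights, biases), (_, activation) in zip(params, layers):
        x = [sum(w * v for w, v in zip(row, x)) + b
             for row, b in zip(weights, biases)]
        if activation == "tanh":
            x = [math.tanh(v) for v in x]
        elif activation == "relu":
            x = [max(0.0, v) for v in x]
    return x


def softmax(logits):
    """Numerically stable softmax over the actor outputs."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


rng = random.Random(0)
actor = init_network(ACTOR_LAYERS, N_STATE, rng)
critic = init_network(CRITIC_LAYERS, N_STATE, rng)
state = [0.1] * N_STATE   # placeholder state vector
probs = softmax(forward(actor, ACTOR_LAYERS, state))
value = forward(critic, CRITIC_LAYERS, state)[0]
action = rng.choices(range(N_ACTIONS), weights=probs)[0]
print(ACTIONS_NG_PER_ML[action], value)
```

The actor samples one of the 11 concentrations from `probs`, while the critic returns a single scalar value estimate for the same state.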
[0259] One training episode comprised a sequence of the following steps: (1) adjustment of the CD95L concentration depending on the remaining amount of CD95L, (2) simulation of model species concentrations in all cells by numerically integrating the ODE model over the duration of one step, (3) testing whether the tBID threshold was exceeded in any cell, (4) documentation of model species concentrations, tBID fractions and dead cells for the next episode step. The episodes resembled the experimental time course of 10 intervals with different applied CD95L concentrations within a total duration of 180 minutes. The reward of the AC agent was defined in each episode step by the sum of the fraction of dead cells and the average tBID fraction multiplied by a scaling factor. The agent was penalized if the selected infusion rates were non-zero although the available amount of CD95L had already been consumed. By summing up the rewards in each episode, the agent was rewarded if high numbers of dead cells and high tBID fractions were already reached at early time steps. Thereby, the agent was trained to select CD95L infusion rates that caused cell death in the population of CD95-HeLa cells as fast as possible using the available dose of CD95L.
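The reward logic can be sketched as follows. The discount factor of 0.9 is taken from the text; the values of `tbid_scale` and `penalty` are hypothetical, since the text states neither, and applying the scaling factor to the tBID term only is one possible reading of the reward definition.

```python
GAMMA = 0.9   # discount factor (from the text)


def step_reward(dead_fraction, mean_tbid_fraction, requested_ng_per_ml,
                remaining_ng, tbid_scale=0.5, penalty=1.0):
    """Reward for one episode step.

    tbid_scale and penalty are hypothetical values; the text gives neither.
    """
    reward = dead_fraction + tbid_scale * mean_tbid_fraction
    if requested_ng_per_ml > 0.0 and remaining_ng <= 0.0:
        reward -= penalty   # non-zero infusion requested with no CD95L left
    return reward


def discounted_return(rewards, gamma=GAMMA):
    """Sum of per-step rewards, discounted from the first step onwards."""
    total = 0.0
    for r in reversed(rewards):
        total = r + gamma * total
    return total


# Two illustrative reward sequences over 10 steps: all cells dead from the
# first step versus all cells dead only in the last step.
early = [1.0] * 10
late = [0.0] * 9 + [1.0]
print(discounted_return(early), discounted_return(late))
```

Because rewards are accumulated over all steps, a trajectory that kills cells early collects the dead-cell reward in every subsequent step, which is what drives the agent toward fast cell death.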
[0260] According to the procedure of the embodiment, the following steps were performed in this application example:
[0261] a) Perfusion of cells with a constant CD95L concentration of 50 ng/ml, serving as a reference experiment. Detection of living or apoptotic cells at different time points within 3 hours.
[0262] b) RL based on an environment defined by the described original model of CD95L-induced apoptosis to predict an improved CD95L concentration trajectory.
[0263] c) Experimental application of an optimized CD95L concentration trajectory as predicted by the RL agent and translated to infusion speeds.
[0264] d) Identification of living or apoptotic cells.
[0265] e) Fitting of the cellular pathway model to the fractions of apoptotic cells at different time points within 3 hours.
[0266] f) Second iteration of the RL step based on the environment defined by the calibrated pathway model to again predict an improved CD95L concentration trajectory.
[0267] g) Second iteration of experiments using the improved CD95L concentration trajectory.
[0268] h) Identification of living or apoptotic cells for the second experimental iteration.
[0269] Predicted trajectories of CD95L infusion rates were experimentally applied in the system combining an automated live-cell microscope with infusion pumps (cf.
[0270] According to the trained AC agent, the total dose of 400 ng CD95L was spent at the beginning of the experiment to perfuse cells with a high CD95L concentration in a limited time interval. Results obtained by applying the first RL-optimized CD95L trajectory were compared with the case of applying the same total dose of CD95L at a constant infusion rate [steps a) to d),
[0271] The updated RL-environment, defined by the calibrated model, was used for another round of RL to again predict an optimized CD95L concentration trajectory [step f),
[0272]
[0273] Taken together, the experimental results indicate that the method and system described herein can serve to increase the effect resulting from a given dose of a cytotoxic drug by optimizing the applied concentration trajectory.
[0274] Many modifications and other embodiments of the invention set forth herein will come to mind to one skilled in the art to which the invention pertains having the benefit of the teachings presented in the foregoing description and the associated drawings. Therefore, it is to be understood that the invention is not to be limited to the specific embodiments disclosed and that modifications and other embodiments are intended to be included within the scope of the appended claims. Although specific terms are employed herein, they are used in a generic and descriptive sense only and not for purposes of limitation.