METHODS AND APPARATUSES FOR PREDICTION OF MECHANISM OF ACTIVITY OF COMPOUNDS
20180372724 · 2018-12-27
Inventors
- Michelle Khine (Irvine, CA)
- Eugene Lee (Irvine, CA, US)
- Tang Wai Ronald Adolphus Li (Pok Fu Lam, HK)
- David Dan Tran (Aliso Viejo, CA, US)
Cpc classification
G16B40/00
PHYSICS
G16C20/30
PHYSICS
G01N33/48785
PHYSICS
G16H40/00
PHYSICS
G01N33/48792
PHYSICS
G16C20/00
PHYSICS
G01N33/48728
PHYSICS
G01N30/8675
PHYSICS
G16H50/00
PHYSICS
G16C20/20
PHYSICS
International classification
Abstract
A platform configured to predict type or family of an unknown drug candidate compound, the platform including: a living cell or a tissue; a detector that measures an indicator of a cellular response by the living cell or tissue upon exposure to the unknown drug candidate compound; a memory configured to store data related to the indicator of the cellular response detected by the detector from a library of drug types and/or families; and one or more processing unit(s) configured to: process the data related to the indicator of the cellular response of the living cell or tissue upon exposure to the unknown drug candidate compound, and compare cellular response data from the library of drug types and/or families, so that a drug type and/or a drug family and/or a mechanism of action of the unknown drug candidate compound can be predicted on the basis of a similarity between the detected cellular response data of the unknown drug candidate compound and the cellular response data of the library of drug types and/or families. Also disclosed are methods of screening an unknown drug, including: comparing the data measured from a test cell to corresponding cellular response data in a library of known drug types, and determining a relationship between the unknown drug and a known drug type or a known drug family to predict the type or family of the unknown drug.
Claims
1. A platform configured to predict type or family of an unknown drug candidate compound, the platform comprising: (a) a living cell or a tissue that is capable of exerting a force in response to exposure to the unknown drug candidate compound; (b) a detector that measures an indicator of a cellular response by the living cell or tissue upon exposure to the unknown drug candidate compound; (c) a memory configured to store data related to the indicator of the cellular response detected by the detector from a library of drug types and/or families; and (d) one or more processing unit(s) configured to: (i) process the data related to the indicator of the cellular response of the living cell or tissue upon exposure to the unknown drug candidate compound, and (ii) compare cellular response data from the library of drug types and/or families, so that a drug type and/or a drug family and/or a mechanism of action of the unknown drug candidate compound can be predicted on the basis of a similarity between the detected cellular response data of the unknown drug candidate compound and the cellular response data of the library of drug types and/or families.
2. The platform according to claim 1, wherein the living cell or tissue is a model of cardiac muscle fiber.
3. The platform according to claim 1, wherein the living cell or tissue is configured as a human cardiac tissue strips (hCTS).
4. The platform according to claim 1, wherein the platform is configured to electrically pace the living cell or tissue and wherein the cellular response data is captured at a variety of pacing frequencies.
5. The platform according to claim 1, wherein the processing unit is configured to implement machine learning.
6. The platform according to claim 5, wherein the machine learning utilizes predetermined parameters of cellular response data to classify the cellular response data measured in response to the unknown drug and the cellular response data from the library of drug types and/or families.
7. The platform according to claim 6, wherein the predetermined parameters of the cellular response data comprise force data, the force data comprising one or more of the following parameters: (a) pacing frequency; (b) a captured pacing frequency; (c) a maximum force generated (amplitude); (d) a duration of rise from a cutoff level to maximum force in a contraction phase; (e) a duration of decline from maximum force to a cutoff level in a relaxation phase; (f) an area under the curve of rise from a cutoff level to maximum force; (g) an area under the curve of decline from maximum force to a cutoff level; (h) a maximum change of force over time (F/t) of contraction phase; and (i) a maximum change of force over time (F/t) of relaxation phase.
8. The platform according to claim 7, wherein the predetermined parameters of the cellular response data comprise force data, the force data comprising one or more of the following parameters: desired pacing frequency, captured pacing frequency, max force generated (amplitude), duration of rise from 95% cutoff to max force (contraction phase), duration of decline from max force to 95% cutoff (relaxation phase), area under the curve of rise from 95% cutoff to max force, area under the curve of decline from max force to 95% cutoff, max change of force over time (F/t) of contraction phase, max change of force over time (F/t) of relaxation phase, duration of rise from 50% cutoff to max force, duration of decline from max force to 50% cutoff, area under the curve of rise from 50% cutoff to max force, area under the curve of decline from max force to 50% cutoff, duration of rise from 25% cutoff to max force, duration of decline from max force to 25% cutoff to max force, area under the curve of rise from 50% cutoff to max force, and area under the curve of decline from max force to 50% cutoff.
9. The platform according to claim 1, wherein the cellular response data comprises a measure of cell or tissue motion and/or electrical conduction and/or calcium flux and the detector is capable of detecting motion and/or electrical conduction and/or calcium flux in the living cell or tissue following exposure to the drug.
10. The platform according to claim 9, wherein the electrical conduction detected corresponds to one or more of a micro-impedance signal and an electrophysiological signal.
11. The platform according to claim 1, wherein the processing unit is configured to output dosing information of the unknown drug candidate compound based upon a comparison to the cellular response data of one or more members of the library.
12. The platform according to claim 1, further comprising a library of drug types and/or families stored in the memory.
13. The platform according to claim 12, wherein each drug type or drug family is characterized by a plurality of distinct compounds within the drug type or drug family.
14. A method of screening an unknown drug, comprising: (a) exposing a test cell or a tissue to the unknown drug, (b) quantifying a cellular response by obtaining cellular response data measured from the test cell in response to the unknown drug, (c) comparing the data measured from the test cell to corresponding cellular response data in a library of known drug types, (d) determining a relationship between the unknown drug and a known drug type or a known drug family to predict the type or family of the unknown drug.
15. The method of claim 14, wherein the cellular response data is indicative of cardioactivity.
16. The method of claim 14 wherein the test cell or tissue is a human cardiac tissue construct.
17. The method according to claim 15, wherein a degree to which a compound is cardiotoxic/cardioactive is predicted.
18. The method according to claim 14, wherein machine learning is used to form the library of cellular response data of known drug types.
19. The method according to claim 14, wherein said comparing the cellular response data of the test cell to a library of corresponding cellular response data of known drug types is done by a series of binary classifications.
20. The method according to claim 14 comprising calculating a singular quantitative index generated by a supervised learning algorithm to consolidate a plurality of parameters of a cellular response into a singular quantitative index.
Description
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT
[0090] We hypothesize that multi-classification algorithms can be implemented to create a model to define drug classes and subsequently predict an unknown compound's mechanistic action. Such information would assist in streamlining the drug discovery pipeline, allowing for the rapid identification of select compounds for more in-depth follow up assays. In addition, this information coupled with knowledge of a predicted class can guide scientists to efficiently and selectively screen for specific drug-to-drug interactions that prompt cardiotoxicity (e.g., disruption of Ca.sup.2+ handling when sofosbuvir and amiodarone are combined) instead of relying on the traditional brute force approach (Millard et al., 2016). Furthermore, drug response relationships between the unknown compound and the library can be determined.
[0091] We examined a database containing drug screens of various compounds on twitch force measurements from human ventricular cardiac tissue strips (hvCTS) engineered from hPSC-CMs embedded in a 3D collagen-based matrix (Turnbull et al., 2014). A unique aspect of these screens was that the hvCTSs were electrically paced at four different frequencies from 0.5 to 2.0 Hz, spanning a physiologic range. These measurements interrogated the influence of cardioactive compounds on the hvCTS force-frequency relationship, and contributed to a high dimensional dataset. We selected a total of twelve compounds with acute cardiac effects that represented five drug classes (1. Ca.sup.2+ channel blockers, 2. adrenergic agonists, 3. cardiac glycosides, 4. hERG K.sup.+ channel blockers, and 5. angiotensin converting enzyme (ACE) inhibitors) along with one reference compound (aspirin). We report for the first time the use of machine learning to establish a drug classification model based on hvCTS contractile behavior (using half of the selected compounds) and subsequently demonstrate predictive capabilities by having the model classify unknown compounds, which were withheld from the machine during training.
[0092] Accurately predicting cardioactive effects of new molecular entities for therapeutics remains a daunting challenge. Immense research effort has been focused towards creating new screening platforms that utilize human pluripotent stem cell (hPSC)-derived cardiomyocytes and three-dimensional engineered cardiac tissue constructs to better recapitulate human heart function and drug responses. As these new platforms become increasingly sophisticated and high-throughput, the drug screens result in larger, higher-dimensional datasets. New automated analysis methods must therefore be developed in parallel to fully comprehend the cellular response across a multidimensional parameter space. Here, we describe the use of machine learning to comprehensively analyze, in one embodiment, 17 functional parameters derived from force readouts of hPSC-derived ventricular cardiac tissue strips (hvCTS) electrically paced at a range of frequencies and exposed to a library of compounds. A generated metric is effective for then determining the cardioactivity of a given drug. Furthermore, we demonstrate a classification model that can automatically predict the mechanistic action of an unknown cardioactive drug.
Formation of Drug Classification Model
[0093] To form the drug classification model, the screens of twelve compounds (Table 1) acquired on the hvCTS platform were used.
TABLE 1 Library Compounds
Compound Name | Class | Description | Test Range (M)
Nifedipine | Ca.sup.2+ Channel Blocker | An L-type Ca.sup.2+ channel blocker known to shorten action potential duration (Harris et al., 2013). | 10.sup.-8 to 3.0 × 10.sup.-5
Mibefradil | Ca.sup.2+ Channel Blocker | A tetralol derivative that blocks both L- and T-type Ca.sup.2+ channels, with higher affinity for T-type (Martin et al., 2000). | 10.sup.-9 to 3.0 × 10.sup.-6
Isoproterenol | Adrenergic Agonist | A mixed beta-adrenergic agonist. Compound is nonselective in terms of beta receptors (Steinberg, 1999). | 10.sup.-8 to 10.sup.-4
Norepinephrine | Adrenergic Agonist | Mixed adrenergic agonist that stimulates both alpha- and beta-receptors (Yang et al., 2014). | 10.sup.-9 to 10.sup.-5
Digoxin | Cardiac Glycoside | A cardiac glycoside that inhibits the Na.sup.+/K.sup.+-ATPase, resulting in higher intracellular Na.sup.+. Higher Na.sup.+ concentration suppresses the Na.sup.+/Ca.sup.2+ exchanger, causing the accumulation of intracellular Ca.sup.2+ (Katz et al., 2010). | 10.sup.-8 to 10.sup.-4
Ouabain | Cardiac Glycoside | A cardiac glycoside that affects Na.sup.+/K.sup.+-ATPase, which consists of both alpha and beta subunits. Has a lower affinity for alpha subunits than digoxin (Katz et al., 2010). | 10.sup.-8 to 10.sup.-4
Flecainide | hERG K.sup.+ Channel Blocker | A mixed hERG K.sup.+ blocker that also inhibits Na.sup.+ channels, causing effects on the action potential's repolarization and conduction (Harris et al., 2013). | 10.sup.-8 to 10.sup.-4
E-4031 | hERG K.sup.+ Channel Blocker | A pure hERG K.sup.+ channel blocker known for pro-arrhythmic potential (Ziupa et al., 2014). | 10.sup.-8 to 10.sup.-4
Cisapride | hERG K.sup.+ Channel Blocker | A serotonin (5-HT.sub.4) receptor agonist that also inhibits the hERG K.sup.+ channel (Wong et al., 2010). | 10.sup.-8 to 10.sup.-4
Lisinopril | ACE Inhibitor | An ACE inhibitor, which reduces vasoconstriction and lowers blood pressure in patients (Williams, 1988). | 10.sup.-8 to 10.sup.-4
Ramipril | ACE Inhibitor | An ACE inhibitor. It does not block ACE until it is converted by the liver (Williams, 1988). | 10.sup.-9 to 10.sup.-5
Aspirin | Non-cardioactive Reference | Nonsteroidal anti-inflammatory drug that has been shown to have no cardioactive effects in screening platforms (Maddah et al., 2015). | 10.sup.-8 to 10.sup.-4
hERG: Human ether-a-go-go-related gene. ACE: Angiotensin-converting enzyme.
[0094] Each of the compounds, with the exception of aspirin, belonged to one of five classes with each class comprising a minimum of two compounds. Aspirin functioned as a reference for a cardiac-neutral compound. To quantify the cardioactive effects of these compounds, 17 parameters were derived from each contractile event recorded in the hvCTS twitch force vs. time tracings (
[0095] 1. Desired pacing frequency.
[0096] 2. Captured pacing frequency.
[0097] 3. Max force generated (amplitude).
[0098] 4. Duration of rise from 95% cutoff to max force (contraction phase).
[0099] 5. Duration of decline from max force to 95% cutoff (relaxation phase).
[0100] 6. Area under the curve of rise from 95% cutoff to max force.
[0101] 7. Area under the curve of decline from max force to 95% cutoff.
[0102] 8. Max change of force over time (ΔF/Δt) of contraction phase.
[0103] 9. Max change of force over time (ΔF/Δt) of relaxation phase.
[0104] 10. Duration of rise from 50% cutoff to max force.
[0105] 11. Duration of decline from max force to 50% cutoff.
[0106] 12. Area under the curve of rise from 50% cutoff to max force.
[0107] 13. Area under the curve of decline from max force to 50% cutoff.
[0108] 14. Duration of rise from 25% cutoff to max force.
[0109] 15. Duration of decline from max force to 25% cutoff.
[0110] 16. Area under the curve of rise from 50% cutoff to max force.
[0111] 17. Area under the curve of decline from max force to 50% cutoff.
[0112] These are example parameters. Parameters 4, 10 and 14 can be measured as duration of rise from various cutoffs to max force during the contraction phase, for example, cutoffs of 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85% and 90%. Parameters 5, 11 and 15 can be measured as duration of decline from max force to various cutoffs during the relaxation phase, for example, cutoffs of 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85% and 90%. Parameters 6, 12 and 16 can be measured as area under the curve of rise from various cutoffs to max force, for example using cutoffs of 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85% and 90%. Parameters 7, 13 and 17 can be measured as area under the curve of decline from max force to various cutoffs, for example using cutoffs of 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85% and 90%.
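As an illustration of how parameters of this kind can be derived from a single contraction, the following is a minimal sketch and not the implementation used in the study. The function name `twitch_parameters`, the dictionary keys, and the argument `level` are hypothetical. In particular, the sketch assumes that a cutoff corresponds to a threshold at a given fraction of peak force; the text does not specify how a percentage cutoff maps onto the tracing, so that mapping is an assumption.

```python
from typing import Dict, List


def twitch_parameters(t: List[float], f: List[float], level: float) -> Dict[str, float]:
    """Summarise one contraction sampled at times t with forces f.

    `level` is the fraction of peak force used as the cutoff threshold
    (an assumed interpretation of a percentage cutoff).
    """
    peak_i = max(range(len(f)), key=lambda i: f[i])
    peak = f[peak_i]
    threshold = level * peak

    # Walk outward from the peak until force has fallen to the threshold.
    start = peak_i
    while start > 0 and f[start] > threshold:
        start -= 1
    end = peak_i
    while end < len(f) - 1 and f[end] > threshold:
        end += 1

    def area(i0: int, i1: int) -> float:
        # Trapezoidal area under the force curve between two sample indices.
        return sum(0.5 * (f[i] + f[i + 1]) * (t[i + 1] - t[i]) for i in range(i0, i1))

    # Finite-difference slopes between consecutive samples.
    slopes = [(f[i + 1] - f[i]) / (t[i + 1] - t[i]) for i in range(len(f) - 1)]

    return {
        "amplitude": peak,
        "rise_duration": t[peak_i] - t[start],
        "decline_duration": t[end] - t[peak_i],
        "rise_area": area(start, peak_i),
        "decline_area": area(peak_i, end),
        "max_rise_rate": max(slopes),
        "max_decline_rate": min(slopes),
    }


# Synthetic triangular twitch, for illustration only.
params = twitch_parameters([0.0, 1.0, 2.0, 3.0, 4.0], [0.0, 1.0, 2.0, 1.0, 0.0], level=0.5)
print(params)
```

Calling the function repeatedly with different values of `level` yields the families of duration and area parameters described above; the pacing-frequency parameters would come from the stimulation protocol and beat detection, which this sketch does not cover.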
[0113] Other parameters can be added without changing the basic methodology. For instance, electrical conduction versus time tracings can provide one or a plurality of additional parameters that can be used instead of or in addition to the foregoing parameters to provide further insights using the methods and apparatuses disclosed herein.
[0114] Once the parameters characterizing each contraction were calculated, establishing the library for machine learning consisted of two primary steps. The first step was determining the degree of cardioactivity for each compound at a given dosage by calculating a singular quantitative index generated by a binary SVM approach (
[0115] The binary SVM is capable of summarizing all parameters and providing a simple metric that expresses a compound's degree of cardioactivity at a given dosage (Lee et al., 2015). Specifically, the machine is tasked with creating a decision boundary that separates between two groups (data from untreated hvCTSs and those from hvCTSs exposed to a concentration of a compound) as seen in
[0116] The second step was the utilization of multi-class SVM to create and evaluate a model. The eleven compounds (excluding aspirin) were divided into two groups (
Control Experiments
[0117] Although the hvCTSs (
[0118] To ensure that the number of hvCTSs in the subset had no effect, the calculations were performed with the sample size, n, equal to 6-10, which matched the range of numbers of strips used in each drug study of the 12 tested compounds. As expected, the SVM accuracy, regardless of the size of n, was approximately 50% for all serial additions (
[0119] To validate this reference of non-cardioactivity, drug screens of aspirin from the database were used as negative controls. Aspirin is known to have no cardioactive effects on hPSC-CMs (Lu et al., 2015; Maddah et al., 2015; Scott et al., 2014). The SVM accuracies of the aspirin drug screens (n=6) had an average of 52.85±1.77% among all serial additions (10 nM to 100 μM). None of the conditions were statistically different from vehicle counterparts, indicating non-cardioactivity by aspirin (
Generalizability of Drug Classification Model
[0120] In setting up the drug classification model, the eleven non-reference compounds were compared to vehicle-treated tissue strips with the aforementioned binary SVM approach. At one or more of the tested concentrations, all but two compounds, lisinopril and ramipril, had SVM accuracies that were significantly greater than those of the respective vehicle studies (
[0121] A subset of the data was always withheld from the machine prior to training in each of the runs. This withheld set quantified the generalizability of the models and ensured that overfitting had not occurred. When the machine was asked to classify these test sets, the multi-class models demonstrated good generalizability, correctly classifying the withheld data at average accuracy rates of 76.09±6.43, 78.29±5.34, and 73.61±5.19% for the flecainide only, E-4031 only, and flecainide & E-4031 conditions, respectively (
[0122] In all three conditions, the multi-class models behaved similarly in that both the nifedipine and isoproterenol classifiers performed the best by always achieving the highest F.sub.1 score values, a metric that ranges from 0 to 1 with 1 representing perfection in model's classification. This performance indicates that the data points of the nifedipine and isoproterenol compounds occupied very distinct boundaries compared to the other two classes, allowing for the binary learners to more accurately separate these compounds from others. For perspective on the quality of model performance, if there were no discernable differences among the four compounds for the machine to use, the expected values for precision (i.e., positive predictive value), recall (i.e., sensitivity), and accuracy would be a rate of 25% with a F.sub.1 score of 0.25. As all three multi-class models demonstrated good generalizability with average accuracy rates exceeding 70%, these results suggest the setup of the model was robust to the choice of compound representing the hERG K.sup.+ channel blocker family.
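The chance-level figures quoted above can be checked with a short calculation. The sketch below is illustrative only: the function names `per_class_scores` and `macro_f1` are hypothetical, and the label lists are synthetic rather than taken from the dataset. It computes precision, recall, and F.sub.1 per class, then shows that a four-class classifier whose predictions are spread evenly across the classes scores 0.25 on each metric.

```python
from typing import Dict, List


def per_class_scores(true: List[str], pred: List[str]) -> Dict[str, Dict[str, float]]:
    """Precision, recall, and F1 for each class, treating it as the positive class."""
    scores = {}
    for c in sorted(set(true) | set(pred)):
        tp = sum(1 for a, b in zip(true, pred) if a == c and b == c)
        fp = sum(1 for a, b in zip(true, pred) if a != c and b == c)
        fn = sum(1 for a, b in zip(true, pred) if a == c and b != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[c] = {"precision": precision, "recall": recall, "f1": f1}
    return scores


def macro_f1(true: List[str], pred: List[str]) -> float:
    """Unweighted mean of the per-class F1 scores."""
    scores = per_class_scores(true, pred)
    return sum(s["f1"] for s in scores.values()) / len(scores)


classes = ["Ca2+ channel blocker", "adrenergic agonist",
           "cardiac glycoside", "hERG K+ channel blocker"]

# Synthetic labels: every true class is predicted as each class equally often.
true_labels = [c for c in classes for _ in classes]
uninformative = [p for _ in classes for p in classes]

print(macro_f1(true_labels, uninformative))  # chance-level baseline
print(macro_f1(true_labels, true_labels))    # perfect classification
```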
Prediction on Unknown Compounds
[0123] With each condition's model established and evaluated, the machine was then asked to predict the data from the unknown compounds group. In the first scenario with flecainide as the only hERG K.sup.+ channel blocker representative, the multi-class model was able to correctly assign the four unknown compounds to their corresponding counterparts with an average accuracy of 71.69±1.96% (
[0124] When the second drug class model (only E-4031 defining hERG K.sup.+ channel blocker class) was used to predict the unknown compounds, the average accuracy diminished to 65.37±2.33% (
[0125] In the last condition where the machine was trained with both flecainide and E-4031 representing the hERG K.sup.+ channel blocker family, the average accuracy was 71.43±2.09% (
Class Relationship Metrics
[0126] Once the drug classes were predicted, the concentrations of library compounds that induced the most similar cardioactive effects as the unknown compounds were computed (
TABLE 2 Summary of estimated drug response relationships of all four classes.
Compound (M) | Predicted Class | Estimated Similar Concentration (M)
Mibefradil (3.0 × 10.sup.-6) | Ca.sup.2+ channel blocker | Nifedipine (1.28 × 10.sup.-7)
Norepinephrine (1.0 × 10.sup.-5) | Adrenergic agonist | Isoproterenol (1.44 × 10.sup.-6)
Ouabain (1.0 × 10.sup.-5) | Cardiac glycoside | Digoxin (5.35 × 10.sup.-5)
Cisapride (1.0 × 10.sup.-4) | hERG K.sup.+ channel blocker | Flecainide (1.03 × 10.sup.-5)
[0127] For example, an estimated 5.35 × 10.sup.-5 M of digoxin would be needed to evoke a level of cardioactivity that matches ouabain tested at 1.0 × 10.sup.-5 M. Such relationships could provide insights about drug potency. In the aforementioned example, ouabain would be considered the more potent compound as it requires an approximately 5-fold lower concentration to achieve the same level of cardioactivity. Ouabain's higher potency has been observed in other in vitro studies (Guo et al., 2011; Katz et al., 2010).
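The fold difference in this example follows directly from the two concentrations. The snippet below is a small worked calculation; the helper name `fold_difference` is hypothetical, and the two concentrations are the ouabain and digoxin values given in Table 2.

```python
def fold_difference(tested_conc_m: float, matched_conc_m: float) -> float:
    """Ratio of a library compound's matched concentration to the tested concentration."""
    if tested_conc_m <= 0 or matched_conc_m <= 0:
        raise ValueError("concentrations must be positive")
    return matched_conc_m / tested_conc_m


ouabain_tested_m = 1.0e-5    # ouabain concentration tested (M)
digoxin_matched_m = 5.35e-5  # digoxin concentration estimated to match it (M)

ratio = fold_difference(ouabain_tested_m, digoxin_matched_m)
print(f"digoxin / ouabain concentration ratio: {ratio:.2f}")
```

A ratio above 1 means the library compound needs a higher concentration than the unknown compound to reach the same level of cardioactivity, which is the sense in which ouabain is described as the more potent of the two.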
Decoupling Force Frequency Relationship
[0128] While the concept of examining multiple parameters from waveforms has been pursued lately, some studies have suggested that only a few select parameters (e.g., peak count) are necessary in assessing a compound's cardioactivity as other parameters provide no further mechanistic insight (Lu et al., 2015; Pointon et al., 2016; Sirenko et al., 2013). This is primarily true when the hPSC-CMs are spontaneously beating, meaning the force generated is linked to beating frequency. This study's dataset affirmed the importance of decoupling this force-frequency relationship through the pacing of the tissues. By setting a fixed pacing frequency, any changes to the force waveform can be attributed to a compound's inotropic and lusitropic effects. For example, if the nifedipine-treated strips were allowed to beat spontaneously, a positive chronotropic effect would have most likely been observed (Guo et al., 2011; Harris et al., 2013; Pillekamp et al., 2012). As the hvCTSs displayed a negative force-frequency relationship (
Examination of Cardioactive Effects
[0129] The data was further examined on an individual parameter basis to better comprehend the performance of the multi-class models and their ability to differentiate between different mechanistic actions. The adrenergic agonists and cardiac glycosides were expected to induce a positive inotropic response in the hvCTSs, while the Ca.sup.2+ and hERG K.sup.+ channel blockers would induce a negative inotropic response. The negative inotropic agents prompted distinct decreases in maximum force generated among hvCTSs; however, the hvCTS sensitivity to positive inotropic agents was not very apparent. For example, hvCTSs exposed to 10 μM of isoproterenol and paced at 0.5 Hz had a similar increase in maximum developed force to that of respective vehicle-treated strips (10.42±16.23% and 15.76±21.05%, respectively), suggesting the compound had negligible inotropic effects (
[0130] Cardiac glycoside-treated hvCTSs also demonstrated the system's sensitivity to positive inotropes. Typically, these compounds increase the waveform amplitude (Ca.sup.2+ transients or microelectrode array measurements); however, in this dataset, the hvCTSs decreased in maximum developed force as the concentration increased (
[0131] Like those of isoproterenol, cardioactive effects of the cardiac glycosides at lower concentrations appeared in other parameters. When either the concentration of cisapride or digoxin increased, the maximum developed force decreased while the duration of the relaxation phase increased (
DISCUSSION
[0132] In recognition of the need for better detection of drug-induced cardiotoxicity, numerous methodologies have emerged to capture and quantify the attributes of hPSC-CMs when exposed to cardioactive compounds, ranging from phenotype to calcium transients to contractile force. The nature of this outputted data becomes high-dimensional when multiple experimental conditions are present or a multiplex system is used (Dempsey et al., 2016). In this study, we present the use of supervised machine learning to exploit high dimensional data and provide relevant information in an automated manner. Besides indicating if a compound was cardioactive, the machine constructed a multi-class drug model that accurately classified cardioactive compounds that it had never previously encountered. This comprehensive approach can be readily applied to other screening platforms to more fully utilize generated datasets and enhance evidence-based decision-making for drug development.
[0133] With multi-class SVM, drug classification libraries were established under various conditions to examine effects on predictive performance. The conditions that yielded the best performance in predicting mechanistic action were the two libraries that included flecainide as a representative of the hERG K.sup.+ channel blocker family. In both libraries, the macro-averages of F.sub.1 scores were 0.71 (macro-average of F.sub.1 scores would be 0.25 if random classifiers were used). While this clear difference in F.sub.1 scores indicates that the models have the capability to predict a compound's mechanistic action, there are opportunities to further improve model performance and obtain F.sub.1 scores closer to 1, indicating reduction in errors.
[0134] One method to improve model performance is to define each drug family with multiple compounds. By having only one compound define a class, there is a risk of only defining a partial region of space that the drug class truly encompasses. The data of E-4031 exemplified this when it was tasked with defining the hERG K.sup.+ channel blocker family. E-4031's defined boundaries did not match or include that of cisapride's, another hERG K.sup.+ channel blocker, causing classification of cisapride to be closer to that of the cardiac glycoside family. The inclusion of flecainide, a mixed hERG K.sup.+ channel blocker, with E-4031 in the definition of the class allowed for the correct prediction of cisapride without adversely affecting the predictive capability of the remaining classes. Although the addition of E-4031 to the hERG K.sup.+ channel blocker definition does not necessarily improve the predictive capability with respect to cisapride classification, establishing a more expansive region of space to define the hERG K.sup.+ channel blocker class may improve prediction of other unknown hERG K.sup.+ channel blockers that have effects more similar to E-4031 than flecainide. These results also suggest the potential of having subgroups within classes of the model, which can be achieved through a series of multi-class classifications. For instance, a compound can be predicted as a Ca.sup.2+ channel blocker in the first classification; within this family, the compound can be subsequently categorized into a subgroup (e.g., defined by frequency-dependent cardioactivity). As machine learning does not define drug classes with a priori knowledge (e.g. guidelines on how parameters are expected to change), the number of drug families and subclasses that can be defined within a model are not limited. 
The unbiased and automated nature of machine learning is also advantageous when a new drug family needs to be added, because no rubric needs to be manually amended and re-evaluated.
[0135] This study demonstrates the potential of machine learning for providing insights in the detection of cardioactivity using hPSC-CMs. The basis of this study's libraries was an error-correcting output codes approach with SVMs as the binary learners. Different binary learners, such as decision trees, should be explored alongside completely different approaches (e.g., neural networks). The ideal machine learning technique should balance predictive capabilities and use of computational resources. In this study, all models were generated with a standard desktop. Each calculated instance of a model took approximately four hours. However, once all models were formed, the predictions made on unknown compounds were on the timescale of seconds.
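To make the error-correcting output codes idea concrete, the following is a minimal sketch under stated assumptions, not the configuration used in the study. The text does not give the coding design, kernel, or hyperparameters, so the sketch assumes a one-vs-rest code matrix, a linear SVM trained by batch sub-gradient descent on the hinge loss, and hinge-loss decoding. All function names, the class labels "A" to "C", and the toy feature vectors are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Model = Tuple[List[float], float]  # (weights, bias)


def train_linear_svm(X: Sequence[Sequence[float]], y: Sequence[int],
                     epochs: int = 200, lr: float = 0.1, lam: float = 0.01) -> Model:
    """Linear SVM via batch sub-gradient descent; labels must be +1 or -1."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the L2 penalty
        grad_b = 0.0
        for xi, yi in zip(X, y):
            # Only samples inside the margin contribute to the hinge gradient.
            if yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b) < 1.0:
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def decision(model: Model, x: Sequence[float]) -> float:
    w, b = model
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def fit_ecoc(X, labels: Sequence[str], code: Dict[str, Tuple[int, ...]]) -> List[Model]:
    """Train one binary learner per column of the code matrix."""
    n_learners = len(next(iter(code.values())))
    return [train_linear_svm(X, [code[c][k] for c in labels]) for k in range(n_learners)]


def predict_ecoc(models: List[Model], code: Dict[str, Tuple[int, ...]],
                 x: Sequence[float]) -> str:
    """Pick the class whose codeword has the lowest total hinge loss."""
    scores = [decision(m, x) for m in models]

    def loss(c: str) -> float:
        return sum(max(0.0, 1.0 - bit * s) for bit, s in zip(code[c], scores))

    return min(code, key=loss)


# One-vs-rest code matrix (an assumed design) and synthetic toy data.
code = {"A": (1, -1, -1), "B": (-1, 1, -1), "C": (-1, -1, 1)}
X = [[3.0, 0.0, 0.0], [2.8, 0.2, 0.1], [0.0, 3.0, 0.0],
     [0.1, 2.9, 0.2], [0.0, 0.0, 3.0], [0.2, 0.1, 3.1]]
labels = ["A", "A", "B", "B", "C", "C"]

models = fit_ecoc(X, labels, code)
print(predict_ecoc(models, code, [2.5, 0.3, 0.2]))
```

Swapping in a different binary learner, as suggested above, only requires replacing `train_linear_svm` and `decision`; the coding and decoding steps stay the same.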
[0136] Improvements of the multi-class drug libraries can also be achieved from enhancements of the hvCTSs and acquisition system. In particular, the sensitivity of this system to positive inotropic compounds can be increased by addressing two issues, the maturity of stem-cell derived cardiomyocytes and the drifting baseline of vehicle-treated strips. Studies have shown that hPSC-CMs elicit a minimal to non-existent response to certain positive inotropic compounds, such as beta-adrenergic agonists, because of immature intracellular structures (Lundy et al., 2013; Pillekamp et al., 2012). When these diminished responses are paired with a baseline that has increasing contractility over time, positive inotropic effects of a compound can get masked and harder to detect as seen in the aforementioned isoproterenol drug screen. While hvCTSs were arranged in an aligned manner and co-cultured with fibroblasts, they can be further matured through additional techniques, such as conditioning by electrical stimulation, a cellular tri-culture including endothelial cells, or forced expression of selected proteins (Eng et al., 2016; Liu et al., 2009; Ravenscroft et al., 2016). As for stabilization of the baseline, different components of the setup, ranging from pH to CO.sub.2 levels in ambient environment, should be re-evaluated to minimize overall drift during serial additions. Increasing the system's sensitivity to positive inotropic agents would yield even more distinct boundaries and subsequently better predictability in the drug classification libraries.
[0137] In summary, we present the implementation of supervised machine learning on high dimensional data of hvCTSs exposed to drugs while paced at various frequencies. In an automated fashion, this machine learning approach is able to not only determine if a compound is cardioactive, but it can predict the mechanistic action along with other metrics. Furthermore, this approach can be adapted to state of the art tissue engineered cardiac models, including different forms of signals (e.g., calcium transients, micro electrode array and optical recordings), and has the potential to integrate diverse output data of multiplex systems or even those across platforms. Along with analyses of compounds with acute cardioactive effects, machine learning can be readily applied with non-invasive techniques (e.g., force calculation with hvCTS) to longitudinal studies to inspect a compound's chronic effects. Moreover, machine learning can be utilized on a grander scale by incorporating past clinical data to determine the optimal combination of in vitro and in silico data for the prediction of drug-induced cardiotoxicity in patients.
Software
[0138] The software utilized enables the use of readouts of cardiac tissue strips or other models of human myocardium to provide relevant information about a compound's cardioactivity potential for the streamlining of the drug discovery pipeline.
[0139] The software utilizes machine learning to analyze the curves or shapes of each contractile event (cardiac beat) acquired from a screening platform's readout. To describe these curves or shapes, parameters are derived (e.g., amplitude). For example, with the human ventricular cardiac tissue strips, 17 parameters are calculated from the force readouts. The machine learning is able to simultaneously analyze all parameters and any underlying relationships. Using machine learning, we generate a single quantitative index that determines a compound's level of cardioactivity. This is achieved by comparing the readouts of each drug concentration to readouts of cardiac models at a control state. The control tissues could be either healthy or diseased, allowing the added possibility of using disease models to screen for disease-specific cardioactivity. If the compound is deemed cardioactive, we can then predict the mechanism of cardioactivity with a library of defined drug classes. Since machine learning is implemented, guidelines and rubrics that define a drug class are not required. This allows for the addition of a drug class into the library with relative ease. Furthermore, drug response relationships between compounds can be determined (e.g., what concentration of Compound A is required to elicit the same response as a given concentration of Compound B).
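As a rough illustration of how shape parameters might be derived from a single contractile event, the following Python sketch computes three descriptors from a sampled force trace. Only amplitude is named in the text; the other two descriptors, the function name `beat_parameters`, and the sample values are assumptions made for illustration and are not the 17 parameters of the described software.

```python
def beat_parameters(force, sample_rate_hz):
    """Return illustrative shape descriptors for one contractile event.

    force: list of force samples covering a single beat.
    sample_rate_hz: acquisition rate in samples per second.
    """
    if not force:
        raise ValueError("force trace is empty")
    baseline = min(force)
    peak = max(force)
    amplitude = peak - baseline
    peak_index = force.index(peak)  # first sample at the peak value
    half_level = baseline + 0.5 * amplitude
    # Count samples at or above half amplitude as a coarse width estimate.
    samples_above_half = sum(1 for f in force if f >= half_level)
    return {
        "amplitude": amplitude,
        "time_to_peak_s": peak_index / sample_rate_hz,
        "duration_at_half_amplitude_s": samples_above_half / sample_rate_hz,
    }


# Hypothetical triangular beat sampled at 100 samples per second.
beat = [0.0, 1.0, 2.0, 3.0, 4.0, 3.0, 2.0, 1.0, 0.0]
params = beat_parameters(beat, sample_rate_hz=100)
```

Each beat would yield one such parameter vector, and the collection of vectors is what a classifier would consume.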
[0140] The relevant information mentioned above assists researchers in more efficient drug development. The prediction of a drug compound class allows for the rapid identification of select compounds for more in-depth follow-up assays. In addition, this information coupled with knowledge of predicted class guides scientists to efficiently and selectively screen for specific drug-to-drug interactions that prompt cardiotoxicity (e.g., disruption of Ca.sup.2+ handling when sofosbuvir and amiodarone are combined) instead of taking a brute-force approach. Collectively, the information enables better evidence-based decision-making in drug development. Applications include, but are not limited to, drug screening and basic research of cardiomyocytes.
[0141] Software takes into account hardware requirements, operating system requirements, programming language (e.g., MATLAB), user interfaces, drawings, schematics and flow charts, required utilities, required distribution format(s), and significance of third party code.
[0142] In some embodiments, the software is used for drug screening and toxicity studies.
[0143] The concept of integrating parameters derived from an assay or across multiple assays has been explored. Software tools such as ToxPi analyze and weight different input data. Other studies, such as Clements et al. (Bridging functional and structural cardiotoxicity assays using human embryonic stem cell-derived cardiomyocytes for a more comprehensive risk assessment, Toxicological Sciences, 2015), have clustered compounds into groups by integrating data of multiple assays (e.g., functional readouts of micro-electrode arrays paired with structural readouts of high content analysis). The software described here is able to integrate the data of multiple parameters derived from a single assay or multiple assays and create a drug classification library in an automated manner. The establishment of this library does not require any guidelines, rubrics or thresholds (e.g., a compound is deemed chronotropic if the beating frequency exceeds 20% of that of healthy cardiomyocytes). The library is then able to predict the mechanism of cardioactivity of unknown compounds (those never seen by the computer), a capability that is not evidently present in the aforementioned software and studies. In addition, this software is adaptable to various readouts of different screening platforms.
Example 1
[0144] hvCTS Formation
[0145] Human ventricular cardiomyocytes were differentiated from a hES2 stem cell line with a Wnt inhibitor-based protocol as previously described (Weng et al., 2014). Human ventricular cardiac tissue strips (hvCTS) were then formed by mixing cardiomyocytes (100 k cells per strip) at 14-16 days post differentiation with a solution of bovine collagen I (2 mg/mL), Matrigel (0.9 mg/mL), and human foreskin fibroblasts (100 k cells per strip) as previously described (Turnbull et al., 2014). The cell-matrix solution (100 µL per tissue strip) was injected into a custom PDMS force-sensing bioreactor device and placed in an incubator (37 °C and 5% CO.sub.2). Formed hvCTSs were fed DMEM with 10% newborn calf serum, 1% penicillin-streptomycin and 0.1% amphotericin B. The PDMS device contains two flexible vertical end-posts to which the tissue anchors, causing the posts to deflect as the tissue beats. Contractile force measurements were captured with a high-speed (100 fps) CCD camera while custom LabVIEW software tracked the centroid movement of the flexible post tips. Force was calculated from the deflection of the PDMS posts by an elastic beam bending equation (Serrao et al., 2012). A custom MATLAB script was used to calculate 17 parameters that described the overall shape of the force traces for each contractile event (
Example 2
Drug Treatment
[0146] At 7-8 days post tissue formation, hvCTS were exposed to drugs for pharmacodynamic analysis. Flecainide, lisinopril, norepinephrine and ramipril were provided by Pfizer, while all other compounds were purchased from Sigma-Aldrich. Compounds were initially resuspended in DMSO and subsequently diluted in water for final concentrations composed of less than 0.1% (vol/vol) DMSO. The PDMS device containing the hvCTS was placed onto a heated stage (37 °C) under a dissecting microscope. Before either vehicle or drug addition, the media was replaced with DMEM containing high glucose (4.5 g/L) and HEPES without phenol red. Drug doses were added to a tissue in a consecutively increasing manner up to 10 concentrations with 3 minutes in between measurements. Vehicle doses containing only water were applied similarly. A pulse stimulator (AMPI Master-9) connected to platinum wires electrically paced the hvCTS with a monophasic electric field of 5 V/cm with a 10 ms pulse duration.
Example 3
Machine Learning
[0147] To establish the drug class model, we identified individual compounds that respectively represented our defined classes. The compounds and the corresponding tested concentrations are listed in Table 1. To determine which concentration of a chosen compound to add to the model, we first gauged each compound's level of cardioactivity by utilizing binary SVM (
[0148] For multi-class classification, we then selected the compound concentration that met two criteria: 1) a binary SVM accuracy closest to 85% and 2) at least 6 of all screened tissue strips were still responsive to electrical stimulation (see Example 6). As seen in
Example 4
Statistics
[0149] SVM accuracies of strips exposed to a drug condition were compared to those of the non-cardioactive benchmark by using Student's t-test (desired α value of 0.05) with a Bonferroni correction (m, the number of tests or hypotheses, was dependent on the number of drug additions in a screen). If the adjusted p-value was statistically significant, the drug condition was considered to have incited irregular behavior in hvCTSs and was labeled as cardioactive. The Bonferroni correction was also applied when examining changes in specific parameters.
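The Bonferroni step can be sketched in a few lines of Python. This is a minimal illustration that assumes the raw p-values have already been obtained from the t-tests; the function name `bonferroni` and the p-values shown are hypothetical.

```python
def bonferroni(p_values, alpha=0.05):
    """Return (adjusted p-value, is_significant) for each raw p-value.

    The adjustment multiplies each p-value by m, the number of tests,
    and caps the result at 1.0.
    """
    m = len(p_values)
    results = []
    for p in p_values:
        adjusted = min(1.0, p * m)
        results.append((adjusted, adjusted < alpha))
    return results


# Hypothetical raw p-values, one per drug addition in a screen (m = 4).
raw_p = [0.30, 0.04, 0.004, 0.0001]
flags = [is_active for _, is_active in bonferroni(raw_p)]
```

Note that the second condition (raw p = 0.04) would pass an uncorrected 0.05 threshold but is not flagged once the correction for four additions is applied.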
[0150] To analyze the performance of the multi-class models, confusion matrices were generated for each of the 50 runs. In a confusion matrix, M, the precision and recall rates were defined as the following:
[0151] The precision and recall rates were calculated for each of the classifiers. To further summarize these metrics, the F.sub.1 score, harmonic mean of precision and recall, was computed and defined as the following:
[0152] A perfect model would achieve an F.sub.1 score of 1. If a model were composed of s classes and had random classifiers, the expected F.sub.1 score would be
[0153] To assess the model as a whole, accuracy, defined as:
was calculated. In summarizing the 50 runs of each model, all calculated metrics were averaged and a confusion matrix containing the average number of contractile events over all runs was provided. All reported sample sizes (n) refer to independent tissue strips (biological replicates). All descriptive statistics are in the format of mean ± standard deviation.
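The metrics named in this example can be computed from a confusion matrix using their standard textbook definitions, as in the Python sketch below. It assumes that rows index the true class and columns the predicted class; the orientation used by the authors is not stated here, and the matrices are made-up values for illustration.

```python
def per_class_metrics(matrix):
    """Precision, recall and F1 per class.

    Assumed orientation: matrix[i][j] counts events of true class i
    that were predicted as class j.
    """
    n = len(matrix)
    metrics = []
    for k in range(n):
        true_pos = matrix[k][k]
        predicted_k = sum(matrix[i][k] for i in range(n))  # column sum
        actual_k = sum(matrix[k])                          # row sum
        precision = true_pos / predicted_k if predicted_k else 0.0
        recall = true_pos / actual_k if actual_k else 0.0
        denom = precision + recall
        # F1 is the harmonic mean of precision and recall.
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics.append({"precision": precision, "recall": recall, "f1": f1})
    return metrics


def accuracy(matrix):
    """Fraction of all events that fall on the diagonal."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[k][k] for k in range(len(matrix)))
    return correct / total if total else 0.0


# Hypothetical 3-class confusion matrix.
example = [[8, 2, 0],
           [1, 7, 2],
           [1, 1, 8]]
example_metrics = per_class_metrics(example)

# A uniform matrix mimics classifiers that guess at random among 3 balanced classes.
uniform = [[5, 5, 5], [5, 5, 5], [5, 5, 5]]
uniform_f1 = per_class_metrics(uniform)[0]["f1"]
```

For the uniform matrix, each class scores an F1 of one third, which matches the intuition that random guessing among s balanced classes gives roughly 1/s.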
Example 5
Class Relationship Metrics
[0154] The concentration of a library compound that induced the most similar cardioactive effects to those of the compound of interest was determined. This class relationship metric was computed by first selecting the compound of interest at a desired concentration and performing a series of binary SVM classifications across the tested concentration range of a library compound. For each concentration of the library compound, the closer the SVM accuracy was to 50%, the more the defined boundaries of the compounds overlapped and the more similar the cardioactive effects were. This relationship between SVM accuracy and tested concentration range was presumed to behave in a Gaussian manner with the centroid representing the concentration that would elicit the most similar effects. The Gaussian fit was set with 50% as the lower limit and the highest achieved SVM accuracy as the upper limit. If the original SVM accuracies reached the 50% mark and remained around this value for subsequent concentrations, only the first concentration to reach the 50% mark was included in the fit to accurately model one side of the Gaussian curve.
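One way to picture this fit is sketched below in Python. Several choices are assumptions rather than details from the text: the curve is modeled as an inverted Gaussian over log10 concentration that dips from the upper limit to 50% at the centroid, the fit is a plain least-squares grid search, and the plateau tolerance `tol` is arbitrary. The data are synthetic.

```python
import math


def inverted_gaussian(x, mu, sigma, lower, upper):
    """Accuracy model: equals `lower` at x == mu and approaches `upper` far away."""
    return upper - (upper - lower) * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))


def drop_plateau(log_conc, acc, lower=50.0, tol=2.0):
    """If accuracy reaches ~lower and stays there, keep only the first such point."""
    for i, a in enumerate(acc):
        if a <= lower + tol:
            if all(b <= lower + tol for b in acc[i:]):
                return list(log_conc[: i + 1]), list(acc[: i + 1])
            break
    return list(log_conc), list(acc)


def fit_centroid(log_conc, acc, lower=50.0, steps=200):
    """Grid-search least squares for the centroid mu and width sigma."""
    upper = max(acc)  # highest achieved accuracy is the upper limit
    lo, hi = min(log_conc), max(log_conc)
    span = hi - lo
    best = None
    for i in range(steps + 1):
        mu = lo + span * i / steps
        for j in range(1, steps + 1):
            sigma = span * j / steps
            sse = sum((inverted_gaussian(x, mu, sigma, lower, upper) - a) ** 2
                      for x, a in zip(log_conc, acc))
            if best is None or sse < best[0]:
                best = (sse, mu, sigma)
    return best[1], best[2]


# Synthetic accuracies generated from a known centroid of 10**-6.5 M.
log_conc = [-12.0, -9.0, -8.0, -7.0, -6.0, -5.0, -4.0, -1.0]
acc = [inverted_gaussian(x, -6.5, 1.0, 50.0, 95.0) for x in log_conc]
mu, sigma = fit_centroid(log_conc, acc)
```

The recovered `mu` is the log concentration of the library compound whose effects would be judged most similar to the compound of interest under these assumptions.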
Example 6
Optimization of Binary and Multi-Class SVM Classifiers
[0155] To optimize the binary SVM classifiers, a non-linear kernel, the radial basis function, was implemented. The hvCTS data was allocated with one third representing the test set and the remainder serving as the training set. We maintained a balanced number between the vehicle-treated strips and those exposed to a cardioactive compound of the model (n=6, 7, 8, 9, or 10). Since the number of vehicle strips (n=28) always outweighed those treated with drugs, we randomly selected a subset of the vehicle-treated tissue strips that equaled the sample size for each SVM run. We tuned both the box constraint and sigma parameter of each run with a geometric progression approach. To prevent overfitting, we performed a 5-fold cross-validation. It should be noted that if more than half of the tissue strips became unresponsive to the electrical stimulation at a given concentration, the SVM accuracy for that condition was automatically designated as 100% and binary SVM was not performed. A total of 50 SVM runs were performed for each concentration to account for the variation and random selection of data sets.
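The data handling around each binary SVM run can be sketched as follows in Python; the classifier itself is omitted. Splitting at the level of whole strips, holding out one third of each group separately, and the grid start, ratio and length are all assumptions made for illustration, as are the function names.

```python
import random


def balanced_split(vehicle_ids, drug_ids, rng):
    """Subsample vehicle strips to match the drug group, then hold out one third.

    Returns (train, test) lists of (strip_id, label) pairs.
    """
    vehicle_subset = rng.sample(vehicle_ids, len(drug_ids))
    train, test = [], []
    for label, group in (("vehicle", vehicle_subset), ("drug", list(drug_ids))):
        shuffled = group[:]
        rng.shuffle(shuffled)
        n_test = len(shuffled) // 3  # one third of each group goes to the test set
        test += [(s, label) for s in shuffled[:n_test]]
        train += [(s, label) for s in shuffled[n_test:]]
    return train, test


def geometric_grid(start, ratio, count):
    """Candidate hyperparameter values in geometric progression."""
    return [start * ratio ** k for k in range(count)]


def needs_svm(n_responsive, n_total):
    """False when more than half of the strips are unresponsive.

    In that case the text assigns 100% accuracy without running the SVM.
    """
    return (n_total - n_responsive) * 2 <= n_total


rng = random.Random(0)                   # fixed seed so one run is repeatable
vehicle = list(range(28))                # 28 vehicle-treated strips, as in the text
drug = ["d%d" % i for i in range(6)]     # hypothetical drug-treated strips (n=6)
train, test = balanced_split(vehicle, drug, rng)
box_constraints = geometric_grid(1, 2, 4)
```

Repeating the split with a different seed for each of the 50 runs would reproduce the random vehicle subsampling described above.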
[0156] For the multi-class models, a criterion of 85% binary SVM accuracy was used to determine the specific concentration of a compound that would be included in the library. This criterion was chosen because it served as a reference point at which the cardioactive effects of a compound would be prominent, yet generalizable boundaries could still be defined relative to those of other compounds. The value of 85% was specifically chosen as it was approximately the midpoint between the maximum achievable separation (100%) and a minimum bound that would ensure cardioactivity. We defined the minimum bound as the largest sum of mean SVM accuracy and one standard deviation across all vehicle studies, resulting in a bound of 69.34% (mean SVM accuracy of 53.45% and standard deviation of 15.89%). The criterion of at least 6 responsive tissue strips was to ensure that within the test sets there were data from at least two strips for all runs.
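The arithmetic behind the 85% criterion, together with the two selection criteria, can be checked with a short Python sketch. The mean and standard deviation are the values reported above; the screening tuples and the function names are hypothetical.

```python
def minimum_bound(mean_accuracy, std_accuracy):
    """Mean vehicle SVM accuracy plus one standard deviation."""
    return mean_accuracy + std_accuracy


def select_concentration(candidates, target=85.0, min_responsive=6):
    """Pick the concentration whose accuracy is closest to the target.

    candidates: list of (concentration, mean_binary_svm_accuracy, n_responsive_strips).
    Concentrations with fewer than `min_responsive` strips are excluded.
    """
    eligible = [c for c in candidates if c[2] >= min_responsive]
    if not eligible:
        return None
    return min(eligible, key=lambda c: abs(c[1] - target))[0]


bound = minimum_bound(53.45, 15.89)   # 69.34, as reported in the text
midpoint = (100.0 + bound) / 2.0      # 84.67, rounded to 85 in the text

# Hypothetical screen: (concentration in M, accuracy in %, responsive strips).
screen = [(1e-8, 55.0, 8), (1e-7, 72.0, 8), (1e-6, 88.0, 7), (1e-5, 84.0, 4)]
chosen = select_concentration(screen)
```

In this made-up screen the highest concentration is closest to 85% but is excluded because only 4 strips remain responsive, so the next-closest eligible concentration is selected.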
[0157] For the creation and optimization of the multi-class models, a one-vs.-one strategy with binary SVM learners was used. An error-correcting output codes approach was used to summarize results and classify. Binary learners were again tuned with regard to the box constraint and sigma parameter. A 10-fold cross-validation was performed on the entire model. Similarly, this multi-class classification and prediction process was repeated a total of 50 times.
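To make the one-vs.-one strategy concrete, the Python sketch below builds the coding design for a set of classes and decodes a set of binary learner outputs by counting disagreements. The learners themselves are not trained here; the class names, the learner outputs and the disagreement-count decoding rule are assumptions for illustration and may differ from the loss used in the described software.

```python
from itertools import combinations


def one_vs_one_design(class_names):
    """Coding design with one column per pair of classes.

    Entry +1 marks the positive class of a learner, -1 the negative class,
    and 0 a class that the learner ignores.
    """
    pairs = list(combinations(range(len(class_names)), 2))
    design = [[0] * len(pairs) for _ in class_names]
    for column, (positive, negative) in enumerate(pairs):
        design[positive][column] = 1
        design[negative][column] = -1
    return design, pairs


def decode(design, class_names, learner_outputs):
    """Return the class whose code word disagrees least with the learner outputs."""
    losses = []
    for row in design:
        loss = sum(1 for code, out in zip(row, learner_outputs)
                   if code != 0 and code != out)
        losses.append(loss)
    return class_names[losses.index(min(losses))]


classes = ["class_a", "class_b", "class_c"]   # hypothetical drug classes
design, pairs = one_vs_one_design(classes)
# Learner outputs for the pairs (a vs b), (a vs c), (b vs c).
predicted = decode(design, classes, [-1, 1, 1])
```

With k classes this design needs k(k-1)/2 binary learners, so a library of 6 classes would require 15 of them.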
REFERENCES
[0158] Chen, A., Lee, E., Tu, R., Santiago, K., Grosberg, A., Fowlkes, C., and Khine, M. (2014). Integrated platform for functional monitoring of biomimetic heart sheets derived from human pluripotent stem cells. Biomaterials 35, 675-683.
[0159] Dempsey, G. T., Chaudhary, K. W., Atwater, N., Nguyen, C., Brown, B. S., McNeish, J. D., Cohen, A. E., and Kralj, J. M. (2016). Cardiotoxicity screening with simultaneous optogenetic pacing, voltage imaging and calcium imaging. J. Pharmacol. Toxicol. Methods 81, 240-250.
[0160] Dick, E., Rajamohan, D., Ronksley, J., and Denning, C. (2010). Evaluating the utility of cardiomyocytes from human pluripotent stem cells for drug screening. Biochem. Soc. Trans. 38, 1037-1045.
[0161] Dietterich, T. G. (1995). Solving Multiclass Learning Problems via Error-Correcting Output Codes. J. Artif. Intell. Res. 263-286.
[0162] Eng, G., Lee, B. W., Protas, L., Gagliardi, M., Brown, K., Kass, R. S., Keller, G., Robinson, R. B., and Vunjak-Novakovic, G. (2016). Autonomous beating rate adaptation in human stem cell-derived cardiomyocytes. Nat. Commun. 7, 1-10.
[0163] FDA (2005). Guidance for Industry: S7B Nonclinical Evaluation of by Human Pharmaceuticals Guidance for Industry.
[0164] Ferriman, A. (2000). UK licence for cisapride suspended Cancer drug may cause heart failure. Br. Med. J. 321, 2000.
[0165] Guo, L., Qian, J., Abrams, R., Tang, H., Weiser, T., Sanders, M. J., and Kolaja, K. L. (2011). The Electrophysiological Effects of Cardiac Glycosides in Cardiomyocytes and in Guinea Pig Isolated Hearts. Cell. Physiol. Biochem. 27, 453-462.
[0166] Guo, L., Coyle, L., Abrams, R. M. C., Kemper, R., Chiao, E. T., and Kolaja, K. L. (2013). Refining the Human iPSC-Cardiomyocyte Arrhythmic Risk Assessment Model. 136, 581-594.
[0167] Harmer, A. R., Abi-Gerges, N., Morton, M. J., Pullen, G. F., Valentin, J. P., and Pollard, C. E. (2012). Validation of an in vitro contractility assay using canine ventricular myocytes. Toxicol. Appl. Pharmacol. 260, 162-172.
[0168] Harris, G., and Koli, E. (2005). Lucrative Drug, Danger Signals and the F.D.A. New York Times 1-9.
[0169] Harris, K., Aylott, M., Cui, Y., Louttit, J. B., McMahon, N. C., and Sridhar, A. (2013). Comparison of electrophysiological data from human-induced pluripotent stem cell-derived cardiomyocytes to functional preclinical safety assays. Toxicol. Sci. 134, 412-426.
[0170] Huebsch, N., Loskill, P., Deveshwar, N., Spencer, C. I., Judge, L. M., Mandegar, M. A., Fox, C. B., Mohamed, T. M. A., Ma, Z., Mathur, A., et al. (2016). Miniaturized iPS-Cell-Derived Cardiac Muscles for Physiologically Relevant Drug Response Analyses. Sci. Reports 6, 1-12.
[0171] Katz, A., Lifshitz, Y., Bab-Dinitz, E., Kapri-Pardes, E., Goldshleger, R., Tal, D. M., and Karlish, S. J. D. (2010). Selectivity of digitalis glycosides for isoforms of human Na,K-ATPase. J. Biol. Chem. 285, 19582-19592.
[0172] Lee, E. K., Kurokawa, Y. K., Tu, R., George, S. C., and Khine, M. (2015). Machine learning plus optical flow: a simple and sensitive method to detect cardioactive drugs. Sci. Rep. 5.
[0173] Li, X., Zhang, R., Zhao, B., Lossin, C., and Cao, Z. (2016). Cardiotoxicity screening: a review of rapid-throughput in vitro approaches. Arch. Toxicol. 90, 1803-1816.
[0174] Liu, J., Lieu, D. K., Siu, C. W., Fu, J. D., Tse, H. F., and Li, R. A. (2009). Facilitated maturation of Ca2+ handling properties of human embryonic stem cell-derived cardiomyocytes by calsequestrin expression. Am J Physiol Cell Physiol 297, 152-159.
[0175] Lu, H. R., Whittaker, R., Price, J. H., Vega, R., Pfeiffer, E. R., Cerignoli, F., Towart, R., and Gallacher, D. J. (2015). High throughput measurement of Ca++ dynamics in human stem cell-derived cardiomyocytes by kinetic image cytometry: A cardiac risk assessment characterization using a large panel of cardioactive and inactive compounds. Toxicol. Sci. 148, 503-516.
[0176] Luna, J. I., Ciriza, J., Garcia-Ojeda, M. E., Kong, M., Herren, A., Lieu, D. K., Li, R. A., Fowlkes, C. C., Khine, M., and McCloskey, K. E. (2011). Multiscale Biomimetic Topography for the Alignment of Neonatal and Embryonic Stem Cell-Derived Heart Cells. Tissue Eng. Part C 17.
[0177] Lundy, S. D., Zhu, W.-Z., Regnier, M., and Laflamme, M. A. (2013). Structural and Functional Maturation of Cardiomyocytes Derived from Human Pluripotent Stem Cells. Stem Cells Dev. 22, 1991-2002.
[0178] Maddah, M., Heidmann, J. D., Mandegar, M. A., Walker, C. D., Bolouki, S., Conklin, B. R., and Loewke, K. E. (2015). A Non-invasive Platform for Functional Characterization of Stem-Cell-Derived Cardiomyocytes with Applications in Cardiotoxicity Testing. Stem Cell Reports 4, 621-631.
[0179] Martin, R. L., Lee, J., Cribbs, L. L., Perez-Reyes, E., and Hanck, D. A. (2000). Mibefradil Block of Cloned T-Type Calcium Channels. J. Pharmacol. Exp. Ther. 295, 302-308.
[0180] Millard, D. C., Strock, C. J., Carlson, C. B., Aoyama, N., Juhasz, K., Goetze, T. A., Stoelzle-Feix, S., Becker, N., Fertig, N., January, C. T., et al. (2016). Identification of Drug-Drug Interactions In Vitro: A Case Study Evaluating the Effects of Sofosbuvir and Amiodarone on hiPSC-Derived Cardiomyocytes. Toxicol. Sci. 154, 1-9.
[0181] Navarrete, E. G., Liang, P., Lan, F., Sanchez-Freire, V., Simmons, C., Gong, T., Sharma, A., Burridge, P. W., Patlolla, B., Lee, A. S., et al. (2013). Screening Drug-Induced Arrhythmia Events Using Human Induced Pluripotent Stem Cell-Derived Cardiomyocytes and Low-Impedance Microelectrode Arrays. Circulation 128, S3-S13.
[0182] Pillekamp, F., Haustein, M., Khalil, M., Emmelheinz, M., Nazzal, R., Adelmann, R., Nguemo, F., Rubenchyk, O., Pfannkuche, K., Matzkies, M., et al. (2012). Contractile Properties of Early Human Embryonic Beta-Adrenergic Stimulation Induces Positive Chronotropy and Lusitropy but Not Inotropy. Stem Cells Dev. 21, 2111-2121.
[0183] Pointon, A., Pilling, J., Dorval, T., Wang, Y., Archer, C., and Pollard, C. (2016). High-Throughput Imaging of Cardiac Microtissues for the Assessment of Cardiac Contraction during Drug Discovery. Toxicol. Sci. kfw227.
[0184] Ravenscroft, S. M., Pointon, A., Williams, A. W., Cross, M. J., and Sidaway, J. E. (2016). Cardiac Non-myocyte Cells Show Enhanced Pharmacological Function Suggestive of Contractile Maturity in Stem Cell Derived Cardiomyocyte Microtissues. Toxicol. Sci. 152, 99-112.
[0185] Scott, C. W., Zhang, X., Abi-Gerges, N., Lamore, S. D., Abassi, Y. A., and Peters, M. F. (2014). An impedance-based cellular assay using human iPSC-derived cardiomyocytes to quantify modulators of cardiac contractility. Toxicol. Sci. 142, 331-338.
[0186] Serrao, G. W., Turnbull, I. C., Ancukiewicz, D., Kim, D. E., Kao, E., Cashman, T. J., Hadri, L., Hajjar, R. J., and Costa, K. D. (2012). Myocyte-depleted engineered cardiac tissues support therapeutic potential of mesenchymal stem cells. Tissue Eng. Part A 18, 1322-1333.
[0187] Shum, A. M. Y., Che, H., Wong, A. O., Zhang, C., Wu, H., Chan, C. W. Y., Costa, K., Khine, M., Kong, C., and Li, R. A. (2017). A Micropatterned Human Pluripotent Stem Cell-Based Ventricular Cardiac Anisotropic Sheet for Visualizing Drug-induced Arrhythmogenicity. Adv. Mater. 29.
[0188] Sirenko, O., Crittenden, C., Callamaras, N., Hesley, J., Chen, Y.-W., Funes, C., Rusyn, I., Anson, B., and Cromwell, E. F. (2013). Multiparameter in vitro assessment of compound effects on cardiomyocyte physiology using iPSC cells. J. Biomol. Screen. 18, 39-53.
[0189] Steinberg, S. F. (1999). The Molecular Basis for Distinct β-Adrenergic Receptor Subtype Actions in Cardiomyocytes. Circ. Res. 85, 1101-1111.
[0190] Turnbull, I. C., Karakikes, I., Serrao, G. W., Backeris, P., Lee, J., Xie, C., Senyei, G., Gordon, R. E., Li, R. A., Akar, F. G., et al. (2014). Advancing functional engineered cardiac tissues toward a preclinical model of human myocardium. Fed. Am. Soc. Exp. Biol. 28, 644-654.
[0191] US Food and Drug Administration (2007). FDA Announces Discontinued Marketing of GI Drug, Zelnorm, for Safety Reasons. 10-11.
[0192] Wang, J., Chen, A., Lieu, D. K., Karakikes, I., Chen, G., Keung, W., Chan, C. W., Hajjar, R. J., Costa, K. D., Khine, M., et al. (2013). Effect of engineered anisotropy on the susceptibility of human pluripotent stem cell-derived ventricular cardiomyocytes to arrhythmias. Biomaterials 34, 8878-8886.
[0193] Weng, Z., Kong, C.-W., Ren, L., Karakikes, I., Geng, L., He, J., Chow, M. Z. Y., Mok, C. F., Chan, H. Y. S., Webb, S. E., et al. (2014). A Simple, Cost-Effective but Highly Efficient System for Deriving Ventricular Cardiomyocytes from Human Pluripotent Stem Cells. Stem Cells Dev. 23, 1704-1716.
[0194] Williams, G. H. (1988). Converting-enzyme inhibitors in the treatment of hypertension. N. Engl. J. Med. 319, 1517-1525.
[0195] Wong, B. S., Manabe, N., and Camilleri, M. (2010). Role of prucalopride, a serotonin (5-HT4) receptor agonist, for the treatment of chronic constipation. Clin. Exp. Gastroenterol. 3, 49-56.
[0196] Yang, X., Pabon, L., and Murry, C. E. (2014). Engineering Adolescence: Maturation of Human Pluripotent Stem Cell-Derived Cardiomyocytes. Circ. Res. 114, 549-561.
[0197] Zhang, D., Shardin, I., Lam, J., Xian, H.-Q., Snodgrass, R., and Bursac, N. (2014). Tissue-engineered cardiac patch for advanced functional maturation of human ESC-derived cardiomyocytes. Biomaterials 34, 5813-5820.
[0198] Ziupa, D., Beck, J., Franke, G., Feliz, S. P., Hartmann, M., Koren, G., Zehender, M., Bode, C., Brunner, M., and Odening, K. E. (2014). Pronounced Effects of HERG-Blockers E-4031 and Erythromycin on APD, Spatial APD Dispersion and Triangulation in Transgenic Long-QT Type 1 Rabbits. PLoS One 9.