ANALYTICAL METHODS AND ARRAYS FOR USE IN THE SAME

20220026411 · 2022-01-27

    Abstract

    The present invention relates to a method for identifying agents which are capable of inducing respiratory sensitization in a mammal, and arrays and analytical kits for use in such methods.

    Claims

    1. A method for identifying agents capable of inducing respiratory sensitization in a mammal comprising or consisting of the steps of: (a) providing a population of dendritic cells or a population of dendritic-like cells; (b) exposing the cells provided in step (a) to a test agent; and (c) measuring in the cells of step (b) the expression of two or more biomarkers selected from the group defined in Table A; wherein the expression of the two or more biomarkers measured in step (c) is indicative of the respiratory sensitizing effect of the test agent of step (b).

    2. The method according to claim 1 wherein one or more of the biomarkers for which the expression is measured in step (c) is selected from the group defined in Table A(i).

    3. The method according to claim 1 or 2 wherein step (c) comprises or consists of measuring the expression of two or more biomarkers selected from the group defined in Table A(i), for example, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, or 25 of the biomarkers listed in Table A(i).

    4. The method according to any one of the preceding claims wherein step (c) comprises or consists of measuring the expression of all of the biomarkers listed in Table A(i).

    5. The method according to any one of the preceding claims wherein step (c) comprises or consists of measuring the expression of one or more biomarkers selected from the group defined in Table A(ii), for example, 2, or 3 of the biomarkers listed in Table A(ii).

    6. The method according to any one of the preceding claims wherein step (c) comprises or consists of measuring the expression of all of the biomarkers listed in Table A(ii).

    7. The method according to any one of the preceding claims wherein step (c) comprises or consists of measuring the expression of three or more of the biomarkers selected from the group defined in Table A, for example, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, or 28 of the biomarkers listed in Table A.

    8. The method according to any one of the preceding claims wherein step (c) comprises or consists of measuring the expression of all of the biomarkers listed in Table A.

    9. The method according to any previous claim further comprising: d) exposing a separate population of the dendritic cells or dendritic-like cells to one or more negative control agent that is not a respiratory sensitizer in a mammal; and e) measuring in the cells of step (d) the expression of the two or more biomarkers measured in step (c) wherein the test agent is identified as a respiratory sensitizer in the event that the expression of the two or more biomarkers measured in step (e) differs from the expression of the two or more biomarkers measured in step (c).

    10. The method according to any previous claim further comprising: f) exposing a separate population of the dendritic cells or dendritic-like cells to one or more positive control agent that is a respiratory sensitizer in a mammal; and g) measuring in the cells of step (f) the expression of the two or more biomarkers measured in step (c) wherein the test agent is identified as a respiratory sensitizer in the event that the expression of the two or more biomarkers measured in step (f) corresponds to the expression of the two or more biomarkers measured in step (c).

    11. The method according to any one of the preceding claims wherein step (c) comprises measuring the expression of a nucleic acid molecule of one or more of the biomarkers.

    12. The method according to claim 11 wherein the nucleic acid molecule is a cDNA molecule or an mRNA molecule.

    13. The method according to claim 12 wherein the nucleic acid molecule is an mRNA molecule.

    14. The method according to claim 12 wherein the nucleic acid molecule is a cDNA molecule.

    15. The method according to any one of claims 11 to 14 wherein measuring the expression of one or more of the biomarkers in step (c) is performed using a method selected from the group consisting of Southern hybridisation, Northern hybridisation, polymerase chain reaction (PCR), reverse transcriptase PCR (RT-PCR), quantitative real-time PCR (qRT-PCR), nanoarray, microarray, macroarray, autoradiography and in situ hybridisation.

    16. The method according to any one of claims 11 to 15 wherein measuring the expression of one or more of the biomarkers in step (c) is determined using a DNA microarray.

    17. The method according to any one of the preceding claims wherein measuring the expression of one or more of the biomarkers in step (c) is performed using one or more binding moieties, each capable of binding selectively to a nucleic acid molecule encoding one of the biomarkers identified in Table A.

    18. The method according to claim 17 wherein the one or more binding moieties each comprise or consist of a nucleic acid molecule.

    19. The method according to claim 17 wherein the one or more binding moieties each comprise or consist of DNA, RNA, PNA, LNA, GNA, TNA or PMO.

    20. The method according to claim 18 or 19 wherein the one or more binding moieties each comprise or consist of DNA.

    21. The method according to any one of claims 17 to 20 wherein the one or more binding moieties are 5 to 100 nucleotides in length.

    22. The method according to any one of claims 17 to 21 wherein the one or more binding moieties are 15 to 35 nucleotides in length.

    23. The method according to any one of claims 17 to 22 wherein the binding moiety comprises a detectable moiety.

    24. The method according to claim 23 wherein the detectable moiety is selected from the group consisting of: a fluorescent moiety; a luminescent moiety; a chemiluminescent moiety; a radioactive moiety (for example, a radioactive atom); or an enzymatic moiety.

    25. The method according to claim 24 wherein the detectable moiety comprises or consists of a radioactive atom.

    26. The method according to claim 25 wherein the radioactive atom is selected from the group consisting of technetium-99m, iodine-123, iodine-125, iodine-131, indium-111, fluorine-19, carbon-13, nitrogen-15, oxygen-17, phosphorus-32, sulphur-35, deuterium, tritium, rhenium-186, rhenium-188 and yttrium-90.

    27. The method according to claim 24 wherein the detectable moiety of the binding moiety is a fluorescent moiety.

    28. The method according to any one of claims 1 to 10 wherein step (c) comprises or consists of measuring the expression of the protein of one or more of the biomarkers.

    29. The method according to claim 28 wherein measuring the expression of one or more of the biomarkers in step (c) is performed using one or more binding moieties each capable of binding selectively to one of the biomarkers identified in Table A.

    30. The method according to claim 29 wherein the one or more binding moieties comprise or consist of an antibody or an antigen-binding fragment thereof.

    31. The method according to any one of claims 29 to 30 wherein the one or more binding moieties comprise a detectable moiety.

    32. The method according to claim 31 wherein the detectable moiety is selected from the group consisting of a fluorescent moiety, a luminescent moiety, a chemiluminescent moiety, a radioactive moiety and an enzymatic moiety.

    33. The method according to any one of the preceding claims wherein step (c) is performed using an array.

    34. The method according to claim 33 wherein the array is a bead-based array.

    35. The method according to claim 34 wherein the array is a surface-based array.

    36. The method according to any one of claims 33 to 35 wherein the array is selected from the group consisting of: macroarray; microarray; nanoarray.

    37. The method according to any one of the preceding claims wherein the method is performed in vitro, in vivo, ex vivo or in silico.

    38. The method according to claim 37 wherein the method is performed in vitro.

    39. The method according to any one of the preceding claims wherein the population of dendritic cells or population of dendritic-like cells comprises or consists of immortal and/or non-naturally occurring cells.

    40. The method according to any one of the preceding claims wherein the population of dendritic cells or population of dendritic-like cells is a population of dendritic-like cells.

    41. The method according to claim 40 wherein the dendritic-like cells are myeloid dendritic-like cells.

    42. The method according to claim 41 wherein the myeloid dendritic-like cells are derived from myeloid dendritic cells.

    43. The method according to claim 42 wherein the cells derived from myeloid dendritic cells are myeloid leukaemia-derived cells such as those selected from the group consisting of KG-1, THP-1, U-937, HL-60, Monomac-6, AML-193, MUTZ-3, and SenzaCell.

    44. The method according to any one of the preceding claims for identifying agents capable of inducing a respiratory hypersensitivity response.

    45. The method according to any one of the preceding claims wherein the hypersensitivity response is a humoral hypersensitivity response.

    46. The method according to any one of the preceding claims for identifying agents capable of inducing a type I hypersensitivity response in a mammal.

    47. The method according to any one of the preceding claims for identifying agents capable of inducing respiratory allergy.

    48. The method according to any one of the claims 9 to 47 wherein the one or more negative control agent provided in step (d) is selected from the group consisting of: unstimulated cells; cell media; vehicle control; DMSO; 1-Butanol; 2-Aminophenol; 2-Hydroxyethyl acrylate; 2-nitro-1,4-Phenylenediamine; 4-Aminobenzoic acid; Chlorobenzene; Dimethyl formamide; Ethyl vanillin; Formaldehyde; Geraniol; Hexylcinnamic aldehyde; Isopropanol; Kathon CG*; Methyl salicylate; Penicillin G; Propylene glycol; Potassium Dichromate; Potassium permanganate; Tween 80; Zinc sulphate; 2-Mercaptobenzothiazole; 4-Hydroxybenzoic acid; Benzaldehyde; Octanoic acid; Cinnamyl alcohol; Diethyl phthalate; DNCB; Eugenol; Glycerol; Glyoxal; Isoeugenol; Phenol; PPD; Resorcinol; Salicylic acid; SDS; and Chlorobenzene.

    49. The method according to any one of claims 10 to 48 wherein the one or more positive control agent provided in step (f) comprises or consists of one or more agent selected from the group consisting of: ammonium hexachloroplatinate, ammonium persulfate, glutaraldehyde, hexamethylen diisocyanate, maleic anhydride, methylene diphenol diisocyanate, phtalic anhydride, toluendiisocyanate; trimellitic anhydride; Chloramine-T hydrate; Isophorone diisocyanate; Piperazine; Reactive orange 16; Maleic anhydride; Phenyl isocyanate (MDI); Phthalic anhydride; Toluene diisocyanate; and Trimelitic anhydride.

    50. The method according to any one of the preceding claims wherein the method is indicative of the relative sensitizing potency of the sample to be tested.

    51. The method according to any one of the preceding claims wherein the method comprises one or more of the following steps: (i) cultivating dendritic or dendritic-like cells; (ii) seeding cells of (i) in one or more well(s), e.g. wells of one or more multi-well assay plates; (iii) adding to one or more well(s) of (ii) the agent(s) to be tested; (iv) adding to one or more separate well(s) of (ii) one or more positive control(s); (v) adding to one or more separate well(s) of (ii) one or more negative control(s); (vi) incubating cells in wells of (iii)-(v), preferably for about 24 hours; (vii) isolating purified total RNA from cells of (vi) and, optionally, converting mRNA into cDNA; (viii) quantifying expression levels of individual mRNA transcripts from (vii), e.g. using an array, such as an Affymetrix Human Gene 1.0 ST array, and/or a Nanostring code set; (ix) exporting and normalizing expression data from (viii); (x) isolating data from (ix) originating from biomarkers of the GARD Prediction Signature (i.e. the biomarkers of Table A); (xi) applying a prediction model to data from (x), e.g. a frozen SVM model previously established and trained on historical data, e.g. data obtained in Example 1, to predict the respiratory sensitization effect of tested agent(s) and negative/positive control(s).

    52. An array for use in the method according to any one of claims 1-51, the array comprising one or more binding moieties as defined in any one of claims 17-27 and 29-32.

    53. The array according to claim 52 wherein the array comprises one or more binding moiety for each of the biomarkers defined in any one of the preceding claims.

    54. Use of two or more biomarkers selected from the group defined in Table A for identifying respiratory sensitizing agents, preferably wherein one or more of the biomarkers is selected from the group defined in Table A(i).

    55. Use of two or more binding moieties each with specificity for a biomarker selected from the group defined in Table A for identifying respiratory sensitizing agents, preferably wherein one or more of the binding moieties has specificity for a biomarker selected from the group defined in Table A(i).

    56. An analytical kit for use in a method according to any one of claims 1-55 comprising: (a) an array according to any one of claims 52-53; (b) (optionally) one or more control agent; and (c) (optionally) instructions for performing the method as defined in any one of claims 1-51.

    57. A method, use, array or kit substantially as described herein.

    Description

    [0239] Preferred, non-limiting examples which embody certain aspects of the invention will now be described, with reference to the following figures:

    [0240] FIG. 1. PCA of training data set in a compressed space of 28 variables, originating from an optimized biomarker signature.

    [0241] FIG. 2. Visualization of classification results of test set 1, using the finalized GARDair prediction model. [0242] A test substance is classified as a respiratory sensitizer if the mean SVM decision value (n=3) is greater than 0.

    [0243] FIG. 3. Visualization of classification results of test set 2, using the finalized GARDair prediction model. [0244] A test substance is classified as a respiratory sensitizer if the mean SVM decision value (n=3) is greater than 0.

    EXAMPLE 1

    Results

    Prediction Model Rationale

    [0245] GARD™ is a state-of-the-art methodology platform for assessment of chemical sensitizers. It is based on a dendritic cell (DC)-like cell line, thus mimicking the cell type involved in the initiation of the response leading to sensitization. Cultivated DCs are exposed to test substances of interest. Following incubation, exposure-induced transcriptional changes are measured in order to study the activation state of the cells. These changes are associated with the bridging of innate and adaptive immune responses and the decision-making role of DCs in vivo, and consist of, e.g., up-regulation of co-stimulatory molecules, induction of cellular and oxidative stress pathways and an altered phenotype associated with migratory and inter-cell communication functions. By using state-of-the-art gene expression technologies, data with high informational content are generated, giving the user a holistic view of the cellular response induced by the test substance. In simplified terms, the described technology allows the test substance to be assessed as a sensitizer or a non-sensitizer.

    [0246] GARD is considered a testing strategy platform on which a number of applications are based. The term “platform” here indicates that all applications are based on the same experimental strategy and similar experimental protocols. The term “application” here indicates different assays for different biological endpoints.

    [0247] The “GARDair” assay described herein is a novel assay based on the GARD platform that is demonstrated here to have the capacity to accurately classify respiratory sensitizers. Thus, GARDair has the capacity to be the preferred test method that specifically classifies chemicals as respiratory sensitizers, an endpoint for which validated, or even widely accepted and used, prediction models currently do not exist.

    GARDair Biomarker Discovery

    [0248] SenzaCells (ATCC Deposit #PTA-123875) were exposed to a reference panel of chemicals comprising 10 well-characterized respiratory sensitizers and 20 non-respiratory sensitizers, as defined by available literature and expert consensus (Chan-Yeung & Malo, 1994; Dearman et al., 1997; Dearman et al., 2012; Lalko et al., 2012). Of note, the set of non-respiratory sensitizers includes skin sensitizers without any recorded capability to induce respiratory sensitization. This set of reference chemicals, listed in Table 1, was used to create what is typically referred to as a training data set. All exposures were performed in repeated triplicate experiments in a controlled setting, thus generating a coherent dataset with high statistical power optimized for subsequent biomarker discovery.

    [0249] Purified RNA from chemically exposed cell cultures was isolated and gene expression analysis was performed using Affymetrix microarrays, thus generating a whole genome expression data set for information mining, referred to as the training data set. The statistical power of the training data set was further increased by the application of a Surrogate Variable Analysis (SVA) algorithm, which identifies and subsequently eliminates noise signals originating from surrogate variables that are statistically unrelated to the biological endpoint of interest. Next, analysis of variance (ANOVA) was applied to identify differentially expressed genes (DEGs). Using an adjusted p-value (i.e. the q-value, a p-value corrected for multiple hypothesis testing using the Benjamini-Hochberg method) of <0.05 as a definition of statistical significance, 28 DEGs met the selection criteria. The identities of the 28 DEGs, henceforth collectively referred to as the GARD respiratory prediction signature (GRPS), are presented in Table 2. Furthermore, the training data set is visualized using principal component analysis (PCA) in FIG. 1.

    [0250] The respective weightings of the 28 genes in the SVM model are indicated in Table 5. The SVM is an algorithm that defines a prediction model. Once the model is defined (i.e. trained), the actual prediction model can be represented by a linear equation, as follows:

    DV = K1*X1 + K2*X2 + . . . + KN*XN + M

    [0251] Here, DV is the decision value (the output of the model when applied), the Ks are constants, the Xs are independent variables and M is a constant representing an intercept. In this case, N is 28. Expression levels of 28 genes (i.e. the Xs) were measured, and a defined equation with 28 fixed Ks and a fixed M was used to calculate DV.

    [0252] The weights provided are the Ks, i.e. the constant with which each gene expression level is multiplied. Thus, the bigger the K, the more impact the corresponding gene X will have on the DV. As a simplified example, consider the case in which N=1. This will give the commonly known equation for a straight line, i.e. Y=KX+M.
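    As a minimal sketch of the linear form described above, the following Python snippet pairs each weight K with the expression value X of the same gene and adds the intercept M. The three weights are copied from Table A for brevity (the full model uses all 28); the expression values and the intercept of 0.0 are hypothetical, since neither is disclosed in this section.

```python
from typing import Mapping

# Weights (the Ks) copied from Table A for three of the 28 genes.
WEIGHTS = {
    "CRLF2": -1.01835933608703,
    "FSCN1": 1.00203258207129,
    "IL7R": 0.215964226173642,
}


def decision_value(
    expression: Mapping[str, float],
    weights: Mapping[str, float],
    intercept: float,
) -> float:
    """Return DV = K1*X1 + ... + KN*XN + M for the genes listed in `weights`."""
    missing = [gene for gene in weights if gene not in expression]
    if missing:
        raise KeyError(f"missing expression values for: {missing}")
    return sum(weights[gene] * expression[gene] for gene in weights) + intercept


# Hypothetical normalized expression values (the Xs); not measured data.
example_expression = {"CRLF2": 0.5, "FSCN1": 1.0, "IL7R": 2.0}

# The intercept M is not given in this section; 0.0 is a placeholder.
example_dv = decision_value(example_expression, WEIGHTS, intercept=0.0)
```

    A gene with a large absolute weight, such as CRLF2 here, shifts the decision value more per unit of expression than a gene with a small one.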

    Technology Platform Transfer and Prediction Model Definition

    [0253] Following the establishment of the GRPS, hybridizing probes were designed for standardized measurements of the GRPS using the Nanostring nCounter system (Geiss et al., 2008). This work closely followed the technology transfer of GARDskin, which has been published previously (Forreryd et al., 2016). Utilizing the same cellular protocols as the afore-mentioned assay facilitates a robust, simple and resource-effective assay. A prediction model based on a Support Vector Machine (SVM) was trained and frozen, using the samples of the training data set with a binary “function in study” (respiratory sensitizer/non-respiratory sensitizer) as the dependent variable, and the gene expression values of the GRPS as the independent variables (i.e. predictors); see also Table 4.

    Proof of Concept—Classifications of External Test Data.

    [0254] Having established an optimized prediction model and associated protocols, the assay was challenged with two sets of external samples, referred to as test data sets. The chemical identities of included samples in the test sets, their true group belonging (respiratory sensitizers or non-respiratory sensitizers) and the GARDair classification results are listed in Table 3. Graphical representations of classifications, as defined by generated GARDair decision values, are shown in FIGS. 2 and 3, for test sets 1 and 2, respectively.

    [0255] Estimating the predictive performance of GARDair based on the available data, the predictive accuracy was calculated to be 89%, well-balanced between sensitivity and specificity. Furthermore, based on the few repeated exposures available from independent experiments, the reproducibility was 100%, indicative of a robust assay.
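    For orientation, accuracy, sensitivity and specificity can be computed from true and predicted classes as sketched below. The labels in the example are hypothetical and are not the classification results of Table 3; the function name `performance` is likewise illustrative.

```python
from typing import Dict, Sequence


def performance(truth: Sequence[bool], predicted: Sequence[bool]) -> Dict[str, float]:
    """Compute accuracy, sensitivity and specificity.

    True denotes a respiratory sensitizer, False a non-respiratory sensitizer.
    """
    if len(truth) != len(predicted) or not truth:
        raise ValueError("inputs must be non-empty and of equal length")
    tp = sum(t and p for t, p in zip(truth, predicted))
    tn = sum((not t) and (not p) for t, p in zip(truth, predicted))
    fp = sum((not t) and p for t, p in zip(truth, predicted))
    fn = sum(t and (not p) for t, p in zip(truth, predicted))
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in `truth`")
    return {
        "accuracy": (tp + tn) / len(truth),
        "sensitivity": tp / (tp + fn),  # fraction of true sensitizers detected
        "specificity": tn / (tn + fp),  # fraction of non-sensitizers cleared
    }


# Hypothetical example: 5 sensitizers and 5 non-sensitizers, one error in each class.
example_truth = [True] * 5 + [False] * 5
example_predicted = [True, True, True, True, False, False, False, False, False, True]
example_scores = performance(example_truth, example_predicted)
```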

    Discussion

    [0256] Based on the data presented herein, it was concluded that the concept of utilizing the GARD platform, e.g. exposing DC-like cells to test substances and interrogating the induced transcriptional pattern for machine-learning assisted classification, is a functional strategy for assessment of chemical respiratory sensitizers.

    [0257] GARDair is to date a finalized assay, based on a genomic readout, as measured by a state-of-the-art platform, of chemically exposed DC-like cells in vitro. The assay has been demonstrated to be functional and robust. The assay is proposed to monitor transcriptional changes in DCs, as induced specifically by respiratory sensitizers, related to the bridging of innate and adaptive immune functions and skewing towards Th2 type immune responses. Primarily, this is demonstrated by the data-driven identification of the IL7R and CRLF2 genes, which as translated proteins together form the receptor for thymic stromal lymphopoietin (TSLP). TSLP ligand-binding to the TSLP receptor of antigen presenting cells has been previously shown to drive Th2 differentiation (Paul & Zhu, 2010; Soumelis et al., 2002). However, it has previously not been described in relation to induction of respiratory sensitization to chemicals.

    Material & Methods

    Cell Line Maintenance and Seeding of Cells for Stimulation

    [0258] The human myeloid leukemia-derived cell line SenzaCell (available through ATCC), acting as an in vitro model of human Dendritic Cell (DC), is maintained in α-MEM (Thermo Scientific Hyclone, Logan, Utah) supplemented with 20% (volume/volume) fetal calf serum (Life Technologies, Carlsbad, Calif.) and 40 ng/ml recombinant human Granulocyte Macrophage Colony Stimulating Factor (rhGM-CSF) (Miltenyi Biotec, Germany). A media change during expansion is performed every 3-4 days. Working stocks of cultures are grown for a maximum of 16 passages or two months after thawing. For chemical stimulation of cells, exposed cells are incubated for 24 h at 37° C., 5% CO2 and 95% humidity.

    Test Substance Handling and Assessment of Cytotoxicity

    [0259] All Test Substances were stored according to instructions from the supplier, to ensure stability of Test Substances. Test Substances were dissolved in DMSO or water, based on physical properties. As many Test Substances will have a toxic effect on the cells, cytotoxic effects of Test Substances were monitored.

    [0260] Some Test Substances were poorly dissolved in cell media; therefore, the maximum soluble concentration was assessed as well. The Test Substance that was to be tested was titrated to concentrations ranging from 1 μM to the maximum soluble concentration in cell media. For freely soluble Test Substances, 500 μM was set as the upper limit of the titration range. For Test Substances dissolved in DMSO, the in-well concentration of DMSO was 0.1%. After incubation for 24 h at 37° C., 5% CO2 and 95% humidity, harvested cells were stained with the viability marker Propidium Iodide (PI) (BD Bioscience, USA) and analyzed by flow cytometry. PI-negative cells were defined as viable, and the relative viability of cells stimulated with each concentration in the titration range was calculated as


    Relative viability = (fraction of viable stimulated cells)/(fraction of viable unstimulated cells) × 100

    [0261] For toxic Test Substances, the concentration yielding 90% relative viability (Rv90) was used for the GARD assay, the reason being that this concentration demonstrates bioavailability of the Test Substance used for stimulation, while not impairing immunological responses. For non-toxic Test Substances, a concentration of 500 μM was used if possible. For non-toxic Test Substances that were insoluble at 500 μM in cell media, the highest soluble concentration was used. Whichever of these three criteria was met, only one concentration was used for gene expression analysis. The concentration to be used for any given chemical was termed the ‘GARD input concentration’.
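    The selection of the GARD input concentration can be sketched as follows. The source does not state how Rv90 is read off the titration curve; linear interpolation between neighbouring titration points is an assumption made here for illustration, and all function names and example values are hypothetical.

```python
from typing import Optional, Sequence, Tuple

UPPER_LIMIT_UM = 500.0  # upper limit of the titration range stated in the text


def relative_viability(viable_stimulated: float, viable_unstimulated: float) -> float:
    """Relative viability in percent, from the two viable-cell fractions."""
    if viable_unstimulated <= 0:
        raise ValueError("fraction of viable unstimulated cells must be positive")
    return viable_stimulated / viable_unstimulated * 100.0


def rv90(titration: Sequence[Tuple[float, float]]) -> Optional[float]:
    """Estimate the concentration (µM) giving 90% relative viability.

    `titration` holds (concentration in µM, relative viability in %) pairs,
    sorted by increasing concentration. Returns None if viability stays above
    90% over the whole range. Linear interpolation is an assumption.
    """
    if not titration:
        return None
    if titration[0][1] <= 90.0:
        return titration[0][0]
    for (c_lo, v_lo), (c_hi, v_hi) in zip(titration, titration[1:]):
        if v_lo > 90.0 >= v_hi:
            fraction = (v_lo - 90.0) / (v_lo - v_hi)
            return c_lo + fraction * (c_hi - c_lo)
    return None


def gard_input_concentration(
    titration: Sequence[Tuple[float, float]],
    max_soluble_um: float = UPPER_LIMIT_UM,
) -> float:
    """Apply the three criteria: Rv90 if toxic, else 500 µM or the highest soluble concentration."""
    toxic_concentration = rv90(titration)
    if toxic_concentration is not None:
        return toxic_concentration
    return min(UPPER_LIMIT_UM, max_soluble_um)
```

    For example, a hypothetical titration of (1 µM, 100%), (100 µM, 95%) and (500 µM, 85%) crosses 90% halfway between the last two points, so the sketch returns 300 µM.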

    GARD Main Stimulation

    [0262] Once the GARD input concentration for Test Substances to be assayed was established, the cells were stimulated again as described above, this time only using the GARD input concentration. All assessments of Test Substances and Benchmark Controls were assayed in biological triplicates, performed at different time-points and using different cell cultures. After incubation for 24 h at 37° C., 5% CO2 and 95% humidity, cell culture was lysed in TRIzol reagent (Life Technologies) and stored at −20° C. until RNA was extracted. In parallel, a small sample of stimulated cells was taken for PI staining and analysis with flow cytometry, to ensure the expected relative viability of stimulated cells was reached.

    Isolation of RNA

    [0263] RNA isolation from lysed cells was performed using commercially available kits (Direct-Zol RNA MiniPrep, Zymo Research, Irvine, Calif.). Total RNA was quantified and quality controlled using BioAnalyzer equipment (Agilent, Santa Clara, Calif.).

    Gene Expression Analysis Using Microarrays

    [0264] Preparation of cDNA and hybridization to HuGene ST 1.0 microarrays were performed by Swegene Centre for Integrative Biology at Lund University (SCIBLU, Lund, Sweden), according to the manufacturer's recommended protocols, kits and reagents (Affymetrix, Santa Clara, Calif.).

    Microarray Data Acquisition and Normalization

    [0265] Hybridized microarrays were washed and scanned according to recommended protocols. Raw data .cell-files were imported into the R environment for statistical computing project.org). Raw data were normalized and converted to gene expression signals using the R-package SCAN.

    Data Analysis—Feature Selection of GARDair Sensitization Biomarker Signature

    [0266] Normalized data containing biological triplicates of SenzaCell samples stimulated with the panel of chemicals listed in Table 1 were mined for differentially regulated genes, able to discriminate between respiratory sensitizers and respiratory non-sensitizers. Unwanted variation from undefined sources was removed using Surrogate Variable Analysis, available from the R-package SVA. Regulated genes were identified using an ANOVA from the R-package Limma. Genes with a false discovery rate (i.e. the q-value, a p-value corrected for multiple hypothesis testing using the Benjamini-Hochberg method) <0.05 were considered statistically significant. 28 unique genes met the selection criteria and they are presented in Table 2.
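    The multiple-testing step can be illustrated with a minimal Python sketch of the Benjamini-Hochberg adjustment. It covers only the correction and the q < 0.05 cut-off; it does not reproduce the SVA or Limma steps, and the gene labels and p-values below are hypothetical.

```python
from typing import Dict, List, Sequence


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values (q-values) in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


def significant_genes(p_by_gene: Dict[str, float], alpha: float = 0.05) -> List[str]:
    """Return the genes whose q-value is below `alpha`."""
    genes = list(p_by_gene)
    q_values = benjamini_hochberg([p_by_gene[gene] for gene in genes])
    return [gene for gene, q in zip(genes, q_values) if q < alpha]


# Hypothetical raw p-values for four placeholder genes.
example_p = {"geneA": 0.01, "geneB": 0.04, "geneC": 0.03, "geneD": 0.5}
example_hits = significant_genes(example_p)
```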

    Technology Platform Transfer

    [0267] Unique Nanostring nCounter system transcript probes were synthesized by the Nanostring Bioinformatics team (Nanostring, Seattle, Wash.). Following protocols by the supplier (Nanostring), Nanostring gene expression data was generated from the RNA samples produced for biomarker discovery, i.e. a complete reproduction of the training data set (Table 1), covering the 28 genes of interest.

    Prediction Model Establishment and Testing of External Test Chemicals

    [0268] A Support Vector Machine (SVM) was trained on Nanostring expression data generated by the training data set (Table 1), using the “Function in study” as dependent variable (i.e. parameter to be predicted) and the 28 genes of the biomarker signature as independent variables (i.e. predictors), using the R statistical environment (R Core Team) and additional packages (see Table 4). For testing of external test chemicals, gene expression data was generated according to protocols described above. The trained SVM model was applied to classify each sample as respiratory sensitizer or non-respiratory sensitizer, as determined by the mean SVM decision value (n=3). Positive decision values denote a positive classification.
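    The classification rule stated above, a mean decision value over the replicates greater than 0, can be sketched as below. The decision values shown are hypothetical, and the function name `classify` is illustrative rather than part of the described method.

```python
from statistics import mean
from typing import Sequence


def classify(decision_values: Sequence[float]) -> str:
    """Classify a test substance from its replicate SVM decision values.

    A mean decision value greater than 0 denotes a positive classification.
    """
    if not decision_values:
        raise ValueError("at least one decision value is required")
    if mean(decision_values) > 0:
        return "respiratory sensitizer"
    return "non-respiratory sensitizer"


# Hypothetical triplicate decision values (n=3) for two test substances.
example_positive = classify([0.8, -0.1, 0.5])
example_negative = classify([-0.4, 0.1, -0.2])
```

    Because the rule uses the mean, a single replicate on the other side of zero does not by itself change the classification.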

    REFERENCES

    [0269] Chan-Yeung & Malo, 1994. Aetiological agents in occupational asthma. European Respiratory Journal.

    [0270] Dearman et al., 1997. Classification of chemical allergens according to cytokine secretion profiles of murine local lymph node cells. Journal of Applied Toxicology.

    [0271] Dearman et al., 2011. Inter-relationships between different classes of chemical allergens. Journal of Applied Toxicology.

    [0272] Dearman et al., 2012. Inter-relationships between different classes of chemical allergens. Journal of Applied Toxicology.

    [0273] Forreryd et al., 2015. Prediction of chemical respiratory sensitizers using GARD, a novel in vitro assay based on a genomic biomarker signature. PLoS One 10(3).

    [0274] Forreryd et al., 2016. From genome-wide arrays to tailor-made biomarker readout - Progress towards routine analysis of skin sensitizing chemicals with GARD. Toxicology in vitro.

    [0275] Geiss et al., 2008. Direct multiplexed measurement of gene expression with color-coded probe pairs. Nature Biotechnology.

    [0276] Isola et al., 2008. Chemical respiratory allergy and occupational asthma: what are the key areas of uncertainty? Journal of Applied Toxicology.

    [0277] Johansson et al., 2011. A genomic biomarker signature can predict skin sensitizers using a cell-based in vitro alternative to animal tests. BMC Genomics.

    [0278] Kimber et al., 2002. Chemical respiratory allergy: role of IgE antibody and relevance of route of exposure. Toxicology.

    [0279] Kimber et al., 2011. Chemical allergy: translating biology into hazard characterization. Toxicological Sciences.

    [0280] Kimber et al., 2014. Chemical respiratory allergy: reverse engineering an adverse outcome pathway. Toxicology.

    [0281] Lalko et al., 2012. The direct peptide reactivity assay: selectivity of chemical respiratory allergens. Toxicological Sciences.

    [0282] Paul & Zhu, 2010. How are Th2-type immune responses initiated and amplified. Nature Reviews Immunology.

    [0283] Soumelis et al., 2002. Human epithelial cells trigger dendritic cell-mediated allergic inflammation by producing TSLP. Nat Immunol.

    [0284] Sullivan et al., 2017. An Adverse Outcome Pathway for Sensitization of the Respiratory Tract by Low-Molecular-Weight Chemicals: Building Evidence to Support the Utility of In Vitro and In Silico Methods in a Regulatory Context. Applied in vitro Toxicology.

    Tables

    [0285]

    TABLE A

    Table A(i)

    | Gene name | Gene Symbol | Entrez ID | Affymetrix ID | Weight |
    |---|---|---|---|---|
    | cytokine receptor-like factor 2 | CRLF2 | 64109 | 8171105 | −1.01835933608703 |
    | fascin actin-bundling protein 1 | FSCN1 | 6624 | 8131339 | 1.00203258207129 |
    | amino-terminal enhancer of split | AES | 166 | 8032576 | −0.937232228971051 |
    | arachidonate 5-lipoxygenase activating protein | ALOX5AP | 241 | 7968344 | 0.859616973865753 |
    | RAB27B, member RAS oncogene family | RAB27B | 5874 | 8021301 | 0.782688844360711 |
    | ZFP36 ring finger protein like 1 | ZFP36L1 | 677 | 7979813 | −0.719233666771149 |
    | solute carrier family 44 member 2 | SLC44A2 | 57153 | 8025672 | 0.718226173217911 |
    | atlastin GTPase 1 | ATL1 | 51062 | 7974270 | 0.699374841646448 |
    | family with sequence similarity 30 member A | FAM30A | 9834 | 7977440 | 0.683461721920966 |
    | cathepsin H | CTSH | 1512 | 7990757 | −0.65487992465195 |
    | ninjurin 1 | NINJ1 | 4814 | 8162455 | −0.577359642405239 |
    | Ral GTPase activating protein catalytic alpha subunit 2 | RALGAPA2 | 57186 | 8065280 | 0.552163931377946 |
    | ring finger protein 220 | RNF220 | 55182 | 7900979 | −0.551522449893945 |
    | oxysterol binding protein like 3 | OSBPL3 | 26031 | 8138613 | −0.538467358395433 |
    | calcium voltage-gated channel auxiliary subunit alpha2delta 2 | CACNA2D2 | 9254 | 8087691 | −0.51849673058401 |
    | Heterogeneous Nuclear Ribonucleoprotein C (C1/C2) | HNRNPC | 3183 | 7893129 | 0.299399629874934 |
    | phosphatidylinositol 3-kinase catalytic subunit type 3 | PIK3C3 | 5289 | 8021015 | −0.256425970684912 |
    | HOP homeobox | HOPX | 84525 | 8100507 | 0.166534308063369 |
    | versican | VCAN | 1462 | 8106743 | −0.147007737618858 |
    | RUN and FYVE domain containing 1 | RUFY1 | 80230 | 8110499 | 0.0996656054685292 |
    | G protein subunit alpha 15 | GNA15 | 2769 | 8024572 | 0.0794276641913698 |
    | ADAM metallopeptidase domain 8 | ADAM8 | 101 | 7937150 | −0.0746172327492091 |
    | nuclear receptor interacting protein 1 | NRIP1 | 8204 | 8069553 | 0.0715765479932369 |
    | CCCTC-binding factor | CTCF | 10664 | 7996593 | 0.0477003538478608 |
    | phosphatidylinositol specific phospholipase C X domain containing 1 | PLCXD1 | 55344 | 8165711 | 0.0263482446344047 |

    Table A(ii)

    | Gene name | Gene Symbol | Entrez ID | Affymetrix ID | Weight |
    |---|---|---|---|---|
    | MYCN proto-oncogene, bHLH transcription factor | MYCN | 4613 | 8040419 | −0.775008003430203 |
    | interleukin 7 receptor | IL7R | 3575 | 8104901 | 0.215964226173642 |
    | RAS like proto-oncogene A | RALA | 5898 | 8132406 | −0.101979863027782 |

    TABLE-US-00002 TABLE 1 Chemical constituents of the training data set

    Chemical Name | CAS | Function in study
    --- | --- | ---
    ammonium hexachloroplatinate | 16919-58-7 | RS
    ammonium persulfate | 7727-54-0 | RS
    ethylenediamine | 107-15-3 | RS/SS
    glutaraldehyde | 111-30-8 | RS
    hexamethylene diisocyanate | 822-06-0 | RS
    maleic anhydride | 108-31-6 | RS
    methylene diphenyl diisocyanate | 101-68-8 | RS
    phthalic anhydride | 85-44-9 | RS
    toluene diisocyanate | 584-84-9 | RS
    trimellitic anhydride | 552-30-7 | RS
    2-aminophenol | 95-55-6 | SS/NRS
    2-hydroxyethyl acrylate | 818-61-1 | SS/NRS
    2-nitro-1,4-phenylenediamine | 5307-14-2 | SS/NRS
    formaldehyde | 50-00-0 | SS/NRS
    geraniol | 106-24-1 | SS/NRS
    hexylcinnamic aldehyde | 101-86-0 | SS/NRS
    kathon CG | 96118-96-6 | SS/NRS
    penicillin G | 61-33-6 | SS/NRS
    potassium dichromate | 7778-50-9 | SS/NRS
    1-butanol | 71-36-3 | NS
    4-aminobenzoic acid | 150-13-0 | NS
    chlorobenzene | 108-90-7 | NS
    dimethyl formamide | 68-12-2 | NS
    ethyl vanillin | 121-32-4 | NS
    isopropanol | 67-63-0 | NS
    methyl salicylate | 119-36-8 | NS
    potassium permanganate | 7722-64-7 | NS
    propylene glycol | 57-55-6 | NS
    tween 80 | 9005-65-6 | NS
    zinc sulphate | 7733-02-0 | NS

    RS: Respiratory sensitizer; SS: Skin sensitizer; NRS: Non-respiratory sensitizer; NS: Non-sensitizer.

    TABLE-US-00003 TABLE 2 Identities of the 28 genes of the GRPS.

    Gene name | Gene Symbol | Entrez ID | Affymetrix ID
    --- | --- | --- | ---
    amino-terminal enhancer of split | AES | 166 | 8032576
    solute carrier family 44 member 2 | SLC44A2 | 57153 | 8025672
    ring finger protein 220 | RNF220 | 55182 | 7900979
    ADAM metallopeptidase domain 8 | ADAM8 | 101 | 7937150
    RAS like proto-oncogene A | RALA | 5898 | 8132406
    interleukin 7 receptor | IL7R | 3575 | 8104901
    fascin actin-bundling protein 1 | FSCN1 | 6624 | 8131339
    phosphatidylinositol specific phospholipase C X domain containing 1 | PLCXD1 | 55344 | 8165711
    ZFP36 ring finger protein like 1 | ZFP36L1 | 677 | 7979813
    cytokine receptor-like factor 2 | CRLF2 | 64109 | 8171105
    CCCTC-binding factor | CTCF | 10664 | 7996593
    family with sequence similarity 30 member A | FAM30A | 9834 | 7977440
    G protein subunit alpha 15 | GNA15 | 2769 | 8024572
    calcium voltage-gated channel auxiliary subunit alpha2delta 2 | CACNA2D2 | 9254 | 8087691
    MYCN proto-oncogene, bHLH transcription factor | MYCN | 4613 | 8040419
    arachidonate 5-lipoxygenase activating protein | ALOX5AP | 241 | 7968344
    versican | VCAN | 1462 | 8106743
    cathepsin H | CTSH | 1512 | 7990757
    RAB27B, member RAS oncogene family | RAB27B | 5874 | 8021301
    Ral GTPase activating protein catalytic alpha subunit 2 | RALGAPA2 | 57186 | 8065280
    phosphatidylinositol 3-kinase catalytic subunit type 3 | PIK3C3 | 5289 | 8021015
    ninjurin 1 | NINJ1 | 4814 | 8162455
    nuclear receptor interacting protein 1 | NRIP1 | 8204 | 8069553
    Heterogeneous Nuclear Ribonucleoprotein C (C1/C2) | HNRNPC | 3183 | 7893129
    HOP homeobox | HOPX | 84525 | 8100507
    atlastin GTPase 1 | ATL1 | 51062 | 7974270
    oxysterol binding protein like 3 | OSBPL3 | 26031 | 8138613
    RUN and FYVE domain containing 1 | RUFY1 | 80230 | 8110499

    TABLE-US-00004 TABLE 3 Prediction results of external test data sets using the finalized GARDair prediction model.

    Chemical name | True group | Prediction, Test set 1 | Prediction, Test set 2 | Included in Training Set
    --- | --- | --- | --- | ---
    2-Mercaptobenzothiazole | NRS | NRS | — | No
    4-Hydroxybenzoic acid | NRS | NRS | — | No
    Benzaldehyde | NRS | NRS | — | No
    Octanoic acid | NRS | NRS | — | No
    Chloramine-T hydrate | RS | RS | RS | No
    Cinnamyl alcohol | NRS | NRS | — | No
    Diethyl phthalate | NRS | NRS | — | No
    DNCB | NRS | NRS | NRS | No
    Eugenol | NRS | NRS | — | No
    Glycerol | NRS | NRS | — | No
    Glyoxal | NRS | RS* | — | No
    Isoeugenol | NRS | NRS | — | No
    Isophorone diisocyanate | RS | RS | — | No
    Phenol | NRS | NRS | — | No
    Piperazine | RS | RS | RS | No
    PPD | NRS | NRS | NRS | No
    Reactive orange 16 | RS | RS | RS | No
    Resorcinol | NRS | NRS | — | No
    Salicylic acid | NRS | RS* | — | No
    SDS | NRS | NRS | — | No
    Chlorobenzene | NRS | — | NRS | Yes
    DMSO | NRS | — | NRS | Yes
    Maleic anhydride | RS | — | RS | Yes
    Phenyl isocyanate (MDI) | RS | — | RS | Yes
    Phthalic anhydride | RS | — | RS | Yes
    Toluene diisocyanate | RS | — | NRS* | Yes
    Trimellitic anhydride | RS | — | RS | Yes

    RS: Respiratory sensitizer; NRS: Non-respiratory sensitizer. False classifications are marked with an asterisk (*).
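
    As an illustrative tally of Table 3 (it is not part of the script listed in Table 4), the number of correct calls per test set can be recomputed with a few lines of R. The sketch below transcribes only the true group and the two prediction columns from Table 3. The names table3 and tally are hypothetical, "-" stands for a chemical that was not run in that test set, and the counts are per table row, which need not equal accuracies computed per replicate sample.

```r
# Rows transcribed from Table 3: true group, prediction in test set 1,
# prediction in test set 2. "-" is read as NA (chemical not in that test set).
# Text after "#" on each data line is ignored by read.table (default comment.char).
table3 <- read.table(header = TRUE, na.strings = "-", stringsAsFactors = FALSE, text = "
truth test1 test2
NRS   NRS   -     # 2-Mercaptobenzothiazole
NRS   NRS   -     # 4-Hydroxybenzoic acid
NRS   NRS   -     # Benzaldehyde
NRS   NRS   -     # Octanoic acid
RS    RS    RS    # Chloramine-T hydrate
NRS   NRS   -     # Cinnamyl alcohol
NRS   NRS   -     # Diethyl phthalate
NRS   NRS   NRS   # DNCB
NRS   NRS   -     # Eugenol
NRS   NRS   -     # Glycerol
NRS   RS    -     # Glyoxal
NRS   NRS   -     # Isoeugenol
RS    RS    -     # Isophorone diisocyanate
NRS   NRS   -     # Phenol
RS    RS    RS    # Piperazine
NRS   NRS   NRS   # PPD
RS    RS    RS    # Reactive orange 16
NRS   NRS   -     # Resorcinol
NRS   RS    -     # Salicylic acid
NRS   NRS   -     # SDS
NRS   -     NRS   # Chlorobenzene
NRS   -     NRS   # DMSO
RS    -     RS    # Maleic anhydride
RS    -     RS    # Phenyl isocyanate (MDI)
RS    -     RS    # Phthalic anhydride
RS    -     NRS   # Toluene diisocyanate
RS    -     RS    # Trimellitic anhydride
")

# Count correct calls in one prediction column, ignoring untested chemicals.
tally <- function(truth, predicted) {
  tested <- !is.na(predicted)
  c(correct = sum(truth[tested] == predicted[tested]), tested = sum(tested))
}

set1 <- tally(table3$truth, table3$test1)
set2 <- tally(table3$truth, table3$test2)
print(rbind(set1, set2))
```

    With the rows as transcribed, this gives 18 correct calls out of 20 for test set 1 and 11 out of 12 for test set 2; the three misses are Glyoxal, Salicylic acid and Toluene diisocyanate.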

    TABLE-US-00005 TABLE 4 Listed below are details of the algorithm script, written in R code, used to perform the method:

    # This code describes the typical usage of the GRPS in its intended application as constituting
    # predictors in a computational prediction model. Dependencies on standard functions are
    # stored in GARD_GRPS.R.
    #
    # Required files:
    # - GARD_GRPS.R
    # - raw affymetrix files of test samples in subdir: raw_affy/
    # - Annotation of the new data describing the unstimulated samples raw_affy/annotation.rds
    # - Historical data stored in trainingset.rds

    # Load required dependencies
    source('~/GARD_GRPS.R')

    # Load Training Data
    train = readRDS('~/trainingset.rds')

    # Read new data and annotations
    new_data = read_raw_affy('~/raw_affy/*.CEL')
    new_data_ref = readRDS('~/raw_affy/annotation.rds')

    # Normalize the new data
    normalized_data = normalize_train_test(train = train, test = new_data, test_reference = new_data_ref)

    # Train model on historical data
    model = train_svm(normalized_data)

    # Predict New Samples
    predictions = predict_test_samples(model = model, data = normalized_data)

    TABLE-US-00006 TABLE 5 Weightings

    Gene name | Gene Symbol | Entrez ID | Affymetrix ID | Weight
    --- | --- | --- | --- | ---
    cytokine receptor-like factor 2 | CRLF2 | 64109 | 8171105 | −1.01835933608703
    fascin actin-bundling protein 1 | FSCN1 | 6624 | 8131339 | 1.00203258207129
    amino-terminal enhancer of split | AES | 166 | 8032576 | −0.937232228971051
    arachidonate 5-lipoxygenase activating protein | ALOX5AP | 241 | 7968344 | 0.859616973865753
    RAB27B, member RAS oncogene family | RAB27B | 5874 | 8021301 | 0.782688844360711
    MYCN proto-oncogene, bHLH transcription factor | MYCN | 4613 | 8040419 | −0.775008003430203
    ZFP36 ring finger protein like 1 | ZFP36L1 | 677 | 7979813 | −0.719233666771149
    solute carrier family 44 member 2 | SLC44A2 | 57153 | 8025672 | 0.718226173217911
    atlastin GTPase 1 | ATL1 | 51062 | 7974270 | 0.699374841646448
    family with sequence similarity 30 member A | FAM30A | 9834 | 7977440 | 0.683461721920966
    cathepsin H | CTSH | 1512 | 7990757 | −0.65487992465195
    ninjurin 1 | NINJ1 | 4814 | 8162455 | −0.577359642405239
    Ral GTPase activating protein catalytic alpha subunit 2 | RALGAPA2 | 57186 | 8065280 | 0.552163931377946
    ring finger protein 220 | RNF220 | 55182 | 7900979 | −0.551522449893945
    oxysterol binding protein like 3 | OSBPL3 | 26031 | 8138613 | −0.538467358395433
    calcium voltage-gated channel auxiliary subunit alpha2delta 2 | CACNA2D2 | 9254 | 8087691 | −0.51849673058401
    Heterogeneous Nuclear Ribonucleoprotein C (C1/C2) | HNRNPC | 3183 | 7893129 | 0.299399629874934
    phosphatidylinositol 3-kinase catalytic subunit type 3 | PIK3C3 | 5289 | 8021015 | −0.256425970684912
    interleukin 7 receptor | IL7R | 3575 | 8104901 | 0.215964226173642
    HOP homeobox | HOPX | 84525 | 8100507 | 0.166534308063369
    versican | VCAN | 1462 | 8106743 | −0.147007737618858
    RAS like proto-oncogene A | RALA | 5898 | 8132406 | −0.101979863027782
    RUN and FYVE domain containing 1 | RUFY1 | 80230 | 8110499 | 0.0996656054685292
    G protein subunit alpha 15 | GNA15 | 2769 | 8024572 | 0.0794276641913698
    ADAM metallopeptidase domain 8 | ADAM8 | 101 | 7937150 | −0.0746172327492091
    nuclear receptor interacting protein 1 | NRIP1 | 8204 | 8069553 | 0.0715765479932369
    CCCTC-binding factor | CTCF | 10664 | 7996593 | 0.0477003538478608
    phosphatidylinositol specific phospholipase C X domain containing 1 | PLCXD1 | 55344 | 8165711 | 0.0263482446344047