Biomarker Panel For Diagnosing Cancer

20200033352 · 2020-01-30

    Abstract

    The present invention pertains to a new method for the diagnosis, prognosis, stratification and/or monitoring of a therapy, of cancer in a patient. The method is based on the determination of the level of a panel of biomarkers selected from CEA, AREG, IL-6, GDF-15, HGF-receptor, CXCL9, ErbB4-Her4, CXCL10, Flt3L, VEGFR-2, CD69, CXCL5, PSA, EMMPRIN, Cathepsin-D, Caspase-3, TNF-alpha, and IFN-gamma. The new biomarker panel of the invention allows diagnosing and even stratifying various cancer diseases. Also provided are diagnostic kits for performing the non-invasive methods of the invention.

    Claims

    1. A non-invasive method for the diagnosis, prognosis, stratification and/or monitoring of a therapy, of a cancer disease in a subject, comprising the steps of: (a) Providing a biological sample from the subject, (b) Determining the level of at least two or more biomarkers selected from the group consisting of Flt3L, AREG, CEA, IL-6, GDF-15, HGF-receptor, CXCL9, ErbB4-Her4, CXCL10, VEGFR-2, CD69, CXCL5, PSA, EMMPRIN, Cathepsin-D, Caspase-3, TNF-alpha, and IFN-gamma, in the biological sample, wherein a differential level of the at least two or more biomarkers in the biological sample from the subject as determined in step (b) compared to a healthy control or a reference value is indicative for the presence of a cancer disease in the subject.

    2. The method according to claim 1, wherein step (b) comprises determining the level of at least Flt3L or AREG in the biological sample.

    3. The non-invasive method according to claim 1, wherein step (b) comprises determining the level of the Flt3L or AREG, in the biological sample.

    4. The method according to claim 1, wherein the biological sample is a tissue sample or body liquid sample, preferably a blood sample, most preferably a plasma sample.

    5. The method according to claim 1, wherein the biomarker is a protein biomarker.

    6. The method according to claim 1, wherein the method is a screening method for establishing a first diagnosis of cancer in the subject.

    7. The method according to claim 1, wherein the cancer is colorectal cancer, pancreatic cancer, gastric cancer, breast cancer, lung cancer, prostate cancer, hepatocellular cancer, cervical cancer, ovarian cancer, liver cancer, bladder cancer, cancer of the urinary tract, thyroid cancer, renal cancer, carcinoma, melanoma, leukemia or brain cancer.

    8. The method according to claim 7, wherein the cancer is colorectal cancer, gastric cancer or pancreatic cancer.

    9. The method according to claim 1, wherein a differential level of a biomarker selected from CEA, GDF-15, AREG, IL-6, CXCL10, CXCL9, PSA, TNF-alpha, and Cathepsin-D, is a level higher than the healthy control or the reference value.

    10. The method according to claim 1, wherein a differential level of a biomarker selected from HGF-receptor, ErbB4-Her4, CXCL5, Flt3L, EMMPRIN, VEGFR-2, CD69 and Caspase-3, is a level lower than the healthy control or the reference value.

    11. The method according to claim 1, wherein the biomarker is detected using one or more antibodies, preferably wherein the biomarker is detected by western blot, ELISA, Proximity Extension Assay, or mass-spectrometrically.

    12. A diagnostic kit for performing a method according to claim 1.

    13. The diagnostic kit of claim 12, comprising one or more antibodies for the detection of the biomarkers Flt3L and AREG.

    Description

    [0049] The present invention will now be further described in the following examples with reference to the accompanying figures and sequences, nevertheless, without being limited thereto. For the purposes of the present invention, all references as cited herein are incorporated by reference in their entireties. In the Figures and Sequences:

    [0050] FIG. 1: STAndards for the Reporting of Diagnostic accuracy studies (STARD) diagram of the participants in the BliTz study (2005-2012).

    [0051] FIG. 2: Box plots of plasma levels for 17 protein biomarkers: (a) between CRC cases and controls; (b) early stages (I/II) and advanced stage (III/IV) CRC. The bottom and top of the box indicate the first (Q1) and third (Q3) quartiles, and the middle line in the box is the median; the upper-limit equals Q3 plus 1.5 times interquartile range (IQR), and the lower-limit equals Q1 minus 1.5 times IQR.

    [0052] FIG. 3: Comparison of receiver operating characteristic curves for the eight-marker algorithm: (a) between the training set and the independent validation set; (b) between different subgroups in the independent validation set (i.e., all CRC cases, tumor stage I/II and tumor stage III/IV).

    [0053] FIG. 4: Comparison of receiver operating characteristic curves for the eight-marker algorithm between the colorectal cancer training set, the colorectal cancer independent validation set, the gastric cancer set and the pancreatic cancer set.

    EXAMPLES

    Materials and Methods

    1. Study Design and Study Population

    [0054] The analysis was conducted in the context of the BliTz study (Begleitende Evaluierung innovativer Testverfahren zur Darmkrebsfrüherkennung). Briefly, BliTz is an ongoing study among participants of screening colonoscopy conducted in cooperation with 20 gastroenterology practices in South-western Germany since November 2005, which aims to evaluate novel promising biomarkers for early detection of CRC. Participants are recruited, and blood samples are taken in the practices at a preparatory visit, typically about one week prior to the screening colonoscopy.

    [0055] For this analysis, the following exclusion criteria were applied to exclude participants without adequate blood samples, participants who do not represent a true screening setting, and participants with potentially false negative results at screening colonoscopy: blood samples taken after screening colonoscopy or blood samples with unknown date of blood withdrawal, history of CRC or inflammatory bowel disease, previous colonoscopy in the last five years or unknown colonoscopy history, incomplete colonoscopy or insufficient bowel preparation (the latter two criteria only for controls). From the remaining participants of the BliTz study recruited in 2005-2012 (N=4345), all 35 available cases with newly detected CRC were included in the analysis. For comparison, the inventors included a representative sample of 54 controls free of colorectal neoplasms. Because this study was conducted in a true screening population in which patients with CRC are expected to be on average slightly older and to include a somewhat larger proportion of men, the inventors did not match for these factors, as this might lead to biased estimates of specificity in such a setting.

    [0056] For an independent validation, the inventors also included 54 additional CRC cases (recruited at four hospitals in and around the city of Heidelberg after diagnosis but before initiation of treatment) and 38 additional randomly selected controls free of neoplasm from the BliTz study.

    [0057] Colonoscopy and histology reports (BliTz study) and hospital records (54 CRC cases for the independent validation set) were collected from all participants. Relevant information was extracted independently by two research assistants who were blinded to the blood test results. Tumor stages were classified according to the UICC TNM classification.

    2. Laboratory Procedures 2.1. Sample Preparation

    [0058] Blood samples from participants giving informed consent were collected in EDTA tubes before bowel preparation for colonoscopy (BliTz study) or prior to large bowel surgery or neoadjuvant chemotherapy (54 CRC cases from the clinical setting). The blood samples were immediately centrifuged at 2123 g for 10 minutes at 4 °C, and the supernatant was transferred into new tubes and transported in a cool chain to the biobank at DKFZ, where plasma samples were stored at -80 °C until analysis.

    2.2. Laboratory Measurements

    [0059] Protein profiling was performed using Proseek Multiplex Oncology I 96×96 (Olink Bioscience, Uppsala, Sweden), which enables quantification of 92 human tumor-associated protein biomarkers (full marker list in Supplementary Table S1). The panel of 92 protein biomarkers reflects various biological mechanisms involved in carcinogenesis, such as angiogenesis, cell-cell signaling, growth control and inflammation. All laboratory operations were conducted according to the Proseek Multiplex Oncology I 96×96 User Manual in the TATAA Biocenter (Göteborg, Sweden). In short, the Proseek reagents are based on the Proximity Extension Assay (PEA) technology, in which 92 oligonucleotide-labeled antibody probe pairs are allowed to bind to their respective targets present in the sample. A PCR reporter sequence is formed by a proximity-dependent DNA polymerization event and is subsequently detected and quantified using real-time PCR. Four internal controls (two incubation controls, one extension control and one detection control) were included in the assay. In addition, there were three replicates of negative controls, which were used to calculate the lower limit of detection (LOD) for each protein. The laboratory operators were blinded to all information regarding the study population.

    3. Data Normalization and Statistical Analyses

    3.1 Data Normalization

    [0060] Normalization of raw data followed the standard protocol from the manufacturer and was conducted through the Olink Wizard of GenEx software (MultiD, Göteborg, Sweden). For each data point, the raw Cq-value (in log_e scale) was exported from the Fluidigm Real-Time PCR Analysis Software. The first step of normalization is to subtract the raw Cq-value of the extension control for the corresponding sample in order to correct for technical variation. The resulting values (dCq-values) were further normalized against the negative control determined in the measurement, which yielded ddCq-values (hereafter: Cq-value, in log_2 scale) that could be used for further analyses. LOD was defined as the mean value of the three negative controls plus 3 standard deviations. Missing data and data with a value lower than LOD were replaced with LOD in the following statistical analyses.
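
    The normalization and LOD handling described in [0060] can be sketched as below. This is only an illustration under stated assumptions: the function names are hypothetical, the direction of the second subtraction (negative control minus sample) is assumed because the text only says "normalized against", and the sample standard deviation is assumed for the LOD. It is not the GenEx implementation.

```python
from statistics import mean, stdev


def delta_cq(raw_cq, extension_control_cq):
    """Step 1: subtract the extension-control Cq of the same sample."""
    return raw_cq - extension_control_cq


def delta_delta_cq(sample_dcq, negative_control_dcq):
    """Step 2 (assumed direction): negative-control dCq minus sample dCq."""
    return negative_control_dcq - sample_dcq


def limit_of_detection(negative_controls):
    """LOD = mean of the negative controls plus 3 standard deviations.

    The sample standard deviation (n - 1 denominator) is an assumption.
    """
    return mean(negative_controls) + 3 * stdev(negative_controls)


def apply_lod(values, lod):
    """Replace missing values (None) and values below LOD with LOD."""
    return [lod if v is None or v < lod else v for v in values]


if __name__ == "__main__":
    lod = limit_of_detection([1.0, 2.0, 3.0])
    print(lod)
    print(apply_lod([None, 4.0, 6.0], lod))
```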

    3.2 Statistical Analyses

    [0061] The plasma protein levels (Cq-value) were first compared between CRC cases and neoplasm-free controls using the Wilcoxon Rank Sum Test (hereafter: Wilcoxon test), and the Benjamini & Hochberg method was additionally employed to adjust for multiple testing. The following diagnosis-related indicators were used for evaluating the diagnostic performance of each protein biomarker: sensitivity (true positive rate), specificity (true negative rate), receiver operating characteristic (ROC) curve, and area under the ROC curve (AUC). For each individual protein biomarker, a logistic regression model was used to construct the prediction model. Based on the predicted probabilities from the prediction model, the AUCs and their 95% confidence intervals (95% CIs, calculated based on 2000 bootstrap samples) were derived. Moreover, sensitivities of each individual biomarker at cutoffs yielding 80% and 90% specificity were calculated. In addition to direct estimates of the diagnosis-related indicators, the 0.632+ bootstrap method (1000 bootstrap samples with replacement) was applied to adjust for potential overestimation of diagnostic performance. Furthermore, for the biomarkers which were identified to have significantly different plasma levels between CRC cases and controls, stage-specific AUCs (apparent and 0.632+ adjusted AUCs) were also calculated, and the DeLong test was employed to compare the apparent AUCs between early stages (i.e., tumor stage I/II) and advanced stages (i.e., tumor stage III/IV).
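
    Two of the quantities named in [0061] can be illustrated with a short sketch: the Benjamini & Hochberg adjustment of p-values and a rank-based AUC. The analysis itself was run in R; the Python functions below are hypothetical names written for illustration only, and the AUC is the plain pairwise (Mann-Whitney) estimate computed on whatever score is supplied, not the logistic-regression workflow of the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini & Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that adjusted
    # values never increase as the raw p-value decreases.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def auc(case_scores, control_scores):
    """Probability that a case scores higher than a control (ties count 0.5)."""
    wins = 0.0
    for c in case_scores:
        for h in control_scores:
            if c > h:
                wins += 1.0
            elif c == h:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


if __name__ == "__main__":
    print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
    print(auc([3, 4], [1, 2]))
```

    For a marker that is lower in cases than in controls, the raw level gives an AUC below 0.5; using a model-based predicted probability as the score, as the text describes, takes care of the direction.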

    [0062] A multi-marker algorithm was derived by applying the Lasso logistic regression model based on all 92 protein markers. With the purpose of adjusting for potential overfitting of the prediction algorithm, a 0.632+ bootstrap subsampling approach was conducted in the following way: i) generate 1000 bootstrap samples (subsampling method, bootstrap without replacement); ii) for each bootstrap sample set, apply the Lasso logistic regression procedure to select variables and to construct a prediction algorithm; iii) apply this algorithm to those patients not included in the bootstrap sample to obtain bootstrap estimates of prediction errors for each bootstrap sample; iv) further adjust these results using the 0.632+ method to obtain a nearly unbiased estimate of the prognostic AUC of the original algorithm. Construction of the algorithm was done including all CRC cases. Evaluation was likewise performed for all CRC cases and, in addition, separately for CRC cases at early and advanced tumor stages. Finally, the AUC and the sensitivities at cutoffs yielding 80% and 90% specificity, respectively, of the multi-marker algorithm, together with their 95% CIs, were determined in the independent validation sample.
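
    Steps i), iii) and iv) above can be sketched as follows. This is a hedged illustration, not the peperr/c060 code used in the study: the function names are hypothetical, the subsample fraction of 0.632 is an assumption, and the adjustment applies the general 0.632+ weighting of Efron and Tibshirani to the error 1 - AUC with an assumed no-information AUC of 0.5. The model fitting of step ii) is left out.

```python
import random


def subsample_split(n, fraction=0.632, rng=None):
    """Draw a subsample without replacement; the rest is the held-out set.

    The fraction 0.632 is an assumption, not stated in the text.
    """
    rng = rng or random.Random()
    train = set(rng.sample(range(n), round(fraction * n)))
    held_out = [i for i in range(n) if i not in train]
    return sorted(train), held_out


def auc_632_plus(apparent_auc, held_out_auc, no_info_auc=0.5):
    """Combine apparent and held-out AUC with the 0.632+ weighting."""
    apparent_err = 1.0 - apparent_auc
    gamma = 1.0 - no_info_auc                      # no-information error
    held_out_err = min(1.0 - held_out_auc, gamma)
    if held_out_err > apparent_err and gamma > apparent_err:
        # Relative overfitting rate, between 0 and 1.
        r = (held_out_err - apparent_err) / (gamma - apparent_err)
    else:
        r = 0.0
    w = 0.632 / (1.0 - 0.368 * r)
    return 1.0 - ((1.0 - w) * apparent_err + w * held_out_err)


if __name__ == "__main__":
    train, held_out = subsample_split(89, rng=random.Random(0))
    print(len(train), len(held_out))
    print(auc_632_plus(0.9, 0.9))
```

    In a full run, the held-out AUC passed to auc_632_plus would be the average over the 1000 subsamples of the AUC obtained on the patients not drawn.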

    [0063] Statistical analyses were performed with the statistical software R version 3.0.3. The R package Daim was used to conduct 0.632+ bootstrap analyses for single markers, and the R package glmnet was employed to perform the Lasso logistic regression analysis for multi-marker analyses. Additionally, the R packages peperr and c060 were applied to conduct the 0.632+ bootstrap subsampling approach described above. All tests were two-sided, and p-values of 0.05 or less were considered to be statistically significant.

    Example 1: Identification of 17 Biomarkers

    [0064] FIG. 1 provides the STAndards for the Reporting of Diagnostic accuracy studies (STARD) diagram, which shows the selection of study participants from all subjects enrolled in the BliTz study in 2005-2012. The final study sample included 35 CRC patients who were compared to a representative sample of 54 controls free of colorectal neoplasms. The latter included 6 participants with hyperplastic polyps and 48 participants without colorectal polyps.

    [0065] Table 1 presents the distribution of socio-demographic characteristics in the CRC case group and the control group. The controls were on average slightly younger than the cases (mean ± standard deviation: 62.8 ± 7.0 versus 66.9 ± 6.5 years). 71.4% of the patients with CRC were men, compared with 50.0% of those free of colorectal neoplasms. Approximately equal proportions of patients were diagnosed in early (stage I/II) and advanced stage (stage III/IV), and there were equal numbers of patients with colon and rectum cancer.

    [0066] Overall, 17 protein biomarkers showed significantly different plasma levels between CRC cases and controls (Table 2). When using a 25% false discovery rate (FDR) as the cutoff level for multiple testing, all 17 biomarkers remained statistically significant.

    TABLE 1 Characteristics of the study population

    Variable              CRC cases (%)    Controls(a) (%)
    Age (years)
      <60                 5 (14.3)         24 (44.4)
      60-64               9 (25.7)         9 (16.7)
      65-69               8 (22.9)         8 (14.8)
      ≥70                 13 (37.1)        13 (24.1)
      Mean ± SD           66.9 ± 6.5       62.8 ± 7.0
    Sex
      Male                25 (71.4)        27 (50.0)
      Female              10 (28.6)        27 (50.0)
    UICC tumor stage
      I                   13 (37.1)
      II                  4 (11.4)
      III                 16 (45.7)
      IV                  2 (5.7)
    CRC location
      Colon               17 (48.6)
      Rectum              17 (48.6)
      Unknown             1 (2.8)
    Total                 35 (100.0)       54 (100.0)

    (a) Controls included 6 participants with hyperplastic polyps and 48 participants without any finding at colonoscopy.

    TABLE 2 Diagnostic performance of protein biomarkers showing significant differences between CRC cases and controls

                  Median Cq                     Adjusted    Apparent AUC      .632+ AUC         .632+ sens.(c)
    Marker        CRC     Controls  p-value(a)  p-value(b)  [95% CI]          [95% CI]          at 80% spec.  at 90% spec.
    CEA           1.20    0.49      <0.001      0.015       0.73 [0.63-0.84]  0.69 [0.57-0.88]  52%           27%
    GDF-15        5.34    4.68      <0.001      0.016       0.72 [0.62-0.83]  0.69 [0.58-0.87]  43%           18%
    AREG          2.73    2.41      0.001       0.016       0.72 [0.61-0.83]  0.70 [0.57-0.86]  46%           36%
    IL-6          4.23    3.59      0.003       0.063       0.69 [0.58-0.80]  0.65 [0.54-0.84]  42%           16%
    CXCL10        6.84    6.20      0.013       0.184       0.66 [0.54-0.77]  0.60 [0.46-0.80]  27%           12%
    HGF-receptor  7.25    7.32      0.013       0.184       0.66 [0.54-0.77]  0.62 [0.48-0.81]  31%           18%
    CXCL9         5.78    5.23      0.014       0.184       0.66 [0.54-0.77]  0.59 [0.45-0.81]  28%           13%
    ErbB4-Her4    6.67    6.76      0.017       0.198       0.65 [0.54-0.77]  0.60 [0.49-0.79]  32%           16%
    CXCL5         5.74    6.32      0.030       0.244       0.64 [0.52-0.76]  0.59 [0.44-0.79]  35%           22%
    Flt3L         6.95    7.17      0.030       0.244       0.64 [0.52-0.75]  0.59 [0.48-0.78]  30%           14%
    EMMPRIN       7.09    7.19      0.033       0.244       0.63 [0.52-0.75]  0.59 [0.46-0.79]  28%           13%
    PSA           2.24    1.20      0.041       0.244       0.63 [0.50-0.75]  0.59 [0.44-0.79]  33%           18%
    TNF-alpha     0.52    0.78      0.042       0.244       0.63 [0.51-0.75]  0.57 [0.44-0.79]  27%           18%
    VEGFR-2       2.57    2.70      0.043       0.244       0.63 [0.51-0.75]  0.58 [0.43-0.78]  30%           17%
    CD69          6.67    7.19      0.044       0.244       0.63 [0.51-0.75]  0.59 [0.45-0.79]  29%           16%
    Cathepsin-D   2.48    2.31      0.045       0.244       0.63 [0.51-0.74]  0.55 [0.34-0.77]  25%           12%
    Caspase-3     10.28   10.70     0.045       0.244       0.63 [0.51-0.75]  0.57 [0.43-0.78]  28%           15%

    (a) Wilcoxon Rank Sum Test to compare the protein expression differences between CRC cases and controls.
    (b) The p-value was adjusted for multiple testing by the Benjamini & Hochberg method.
    (c) Sensitivities were adjusted by using the .632+ bootstrap method.
    Abbreviations: AUC, area under the receiver operating characteristic curve; CI, confidence interval.

    [0067] Carcinoembryonic antigen (CEA), growth differentiation factor 15 (GDF-15) and amphiregulin (AREG) met an FDR threshold of 5%. Apart from prostate specific antigen (PSA), for which statistically significantly higher plasma levels were found in men than in women, none of the other 16 biomarkers showed a statistically significant relationship with sex or age within the group of controls free of colorectal neoplasms (p-values>0.05). Additionally, sensitivity analyses excluding four participants who reported a previous cancer diagnosis in self-administered questionnaires were conducted and yielded almost identical results.

    TABLE 3 Stage-specific performance of specific protein markers for detection of CRC

                  Tumor stages I and II               Tumor stages III and IV
                  Apparent AUC      .632+ AUC         Apparent AUC      .632+ AUC
    Marker        [95% CI]          [95% CI]          [95% CI]          [95% CI]          p-value(a)
    AREG          0.79 [0.67-0.91]  0.76 [0.61-0.95]  0.65 [0.50-0.80]  0.60 [0.39-0.87]  0.168
    IL-6          0.78 [0.67-0.90]  0.74 [0.62-0.94]  0.60 [0.45-0.75]  0.49 [0.23-0.77]  0.064
    GDF-15        0.78 [0.67-0.89]  0.72 [0.61-0.91]  0.67 [0.52-0.82]  0.61 [0.40-0.87]  0.270
    HGF-receptor  0.70 [0.55-0.85]  0.65 [0.44-0.91]  0.62 [0.48-0.75]  0.54 [0.40-0.78]  0.411
    CXCL9         0.70 [0.55-0.85]  0.64 [0.46-0.89]  0.61 [0.47-0.76]  0.48 [0.24-0.75]  0.421
    ErbB4-Her4    0.70 [0.56-0.83]  0.63 [0.50-0.88]  0.61 [0.46-0.75]  0.51 [0.25-0.78]  0.385
    CXCL10        0.70 [0.55-0.84]  0.62 [0.45-0.88]  0.62 [0.47-0.76]  0.49 [0.23-0.77]  0.445
    Flt3L         0.69 [0.55-0.83]  0.62 [0.45-0.88]  0.59 [0.43-0.74]  0.50 [0.26-0.76]  0.320
    VEGFR-2       0.67 [0.51-0.83]  0.61 [0.37-0.91]  0.59 [0.44-0.75]  0.49 [0.25-0.77]  0.505
    CD69          0.66 [0.50-0.82]  0.60 [0.41-0.90]  0.59 [0.44-0.75]  0.51 [0.25-0.78]  0.546
    CXCL5         0.64 [0.48-0.81]  0.58 [0.29-0.85]  0.63 [0.49-0.78]  0.55 [0.30-0.82]  0.937
    CEA           0.68 [0.54-0.82]  0.58 [0.28-0.87]  0.79 [0.66-0.92]  0.75 [0.60-0.95]  0.252
    PSA           0.63 [0.46-0.80]  0.58 [0.27-0.85]  0.63 [0.47-0.78]  0.56 [0.26-0.81]  0.976
    EMMPRIN       0.64 [0.48-0.80]  0.55 [0.26-0.83]  0.63 [0.48-0.77]  0.55 [0.37-0.81]  0.898
    Cathepsin-D   0.65 [0.50-0.80]  0.54 [0.21-0.83]  0.61 [0.46-0.75]  0.49 [0.24-0.75]  0.688
    Caspase-3     0.62 [0.47-0.78]  0.52 [0.28-0.82]  0.63 [0.48-0.79]  0.55 [0.27-0.85]  0.923
    TNF-alpha     0.59 [0.43-0.74]  0.48 [0.22-0.76]  0.67 [0.51-0.82]  0.60 [0.37-0.88]  0.480

    (a) The DeLong test was employed to test the differences of AUCs between CRC at early stage and advanced stage.
    Abbreviations: AUC, area under the receiver operating characteristic curve; CI, confidence interval.

    [0068] Among these 17 protein markers, 9 were over-expressed and 8 showed lower levels in CRC cases compared with controls (Table 2). The 0.632+ adjusted AUCs of these 17 markers ranged from 0.55 to 0.70. Four markers, AREG, CEA, GDF-15 and interleukin 6 (IL-6), yielded substantially better diagnostic performance than the others, with 0.632+ adjusted AUCs of at least 0.65. When the cutoff values were set to yield 80% specificity, the highest 0.632+ adjusted sensitivity was observed for CEA (52%). With cutoff values set to yield 90% specificity, the highest 0.632+ adjusted sensitivity was observed for AREG (36%).

    [0069] FIG. 2 shows the distribution of plasma levels for the 17 protein markers for CRC patients in early tumor stages and advanced tumor stages. 7 protein markers (IL-6, CXCL9, CXCL10, PSA, cathepsin-D, caspase-3 and AREG) showed higher levels in early tumor stages than in advanced ones. However, only the result for IL-6 was statistically significant (p-value<0.05). Table 3 shows the comparison of ROC analysis for these 17 markers between CRC patients at early and advanced stages. Most markers (13/17) showed higher adjusted AUCs in CRC patients at early tumor stages than at advanced ones. However, none of the differences was statistically significant. For three markers (AREG, IL-6 and GDF-15) the 0.632+ adjusted AUCs for early tumor stage CRC were higher than 0.70 (i.e., 0.76, 0.74, and 0.72, respectively). By contrast, CEA showed the highest 0.632+ adjusted AUC for advanced stage CRC (0.75).

    Example 2: Development of a Colorectal Cancer Diagnostic Panel of 8 Biomarkers

    [0070] The inventors used the Lasso Logistic regression model to construct a multi-marker prediction algorithm based on all 92 protein biomarkers. The following 8 markers were selected for inclusion in the algorithm: IFN-gamma, EMMPRIN, ErbB4-Her4, PSA, CD69, AREG, HGF-receptor and CEA (algorithm is shown in Table 4). The apparent AUC was 0.88 (95% CI, 0.81-0.95). Through the 0.632+ bootstrap subsampling approach, the adjusted AUC of this algorithm was 0.77 (95% CI, 0.59-0.91). Of note, this algorithm showed a similar diagnostic value for early stage CRC and advanced stage CRC (0.632+ adjusted AUC: 0.79 versus 0.75, respectively).

    TABLE 4 Eight-marker algorithm derived through the Lasso logistic regression model: intercept and marker coefficients

    Variable  Intercept  IFN-gamma  EMMPRIN  ErbB4-Her4  PSA     CD69     AREG    HGFR    CEA
    Coeff.    7.57       0.0259     0.0887   0.8138      0.0642  -0.1793  0.9605  0.5173  0.4450
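
    How an intercept and coefficients of this kind turn marker levels into a predicted probability can be sketched as below. The values are copied as printed in Table 4 and mapped onto the markers in the order given in [0070]; minus signs may have been lost in the published text, so the numbers should be read as placeholders. The standard logistic link and the function names are assumptions for illustration, not the inventors' code.

```python
import math

# Intercept and coefficients as printed in Table 4. Signs may be incomplete
# in the published text, so these are placeholders, not validated values.
INTERCEPT = 7.57
COEFFICIENTS = {
    "IFN-gamma": 0.0259,
    "EMMPRIN": 0.0887,
    "ErbB4-Her4": 0.8138,
    "PSA": 0.0642,
    "CD69": -0.1793,
    "AREG": 0.9605,
    "HGF-receptor": 0.5173,
    "CEA": 0.4450,
}


def linear_predictor(levels):
    """Intercept plus the coefficient-weighted sum of normalized marker levels."""
    return INTERCEPT + sum(COEFFICIENTS[m] * levels[m] for m in COEFFICIENTS)


def predicted_probability(levels):
    """Map the linear predictor to a probability with the logistic function."""
    return 1.0 / (1.0 + math.exp(-linear_predictor(levels)))


if __name__ == "__main__":
    # Hypothetical input: all eight normalized levels set to 1.0.
    example = {marker: 1.0 for marker in COEFFICIENTS}
    print(predicted_probability(example))
```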

    [0071] Finally, the inventors also validated this eight-marker algorithm in the independent validation set, which included 54 CRC cases and 38 controls free of colorectal neoplasms. The age distribution of this validation set was similar to the age distribution in the main study from the screening setting, even though both cases and controls included somewhat lower proportions of men. The tumor stage distribution of cases in the independent validation set was similar to the stage distribution of CRC cases detected at screening colonoscopy according to the German screening colonoscopy registry. Table 5 and FIG. 3 show the diagnostic performance of the eight-marker algorithm for CRC prediction in the independent validation set. The AUC was 0.76 (95% CI, 0.65-0.85), and sensitivities at cutoffs yielding 80% and 90% specificities were 65% (95% CI, 41-80%) and 44% (95% CI, 24-72%), respectively. In this independent validation set, diagnostic performance was better for advanced stage than for early stage disease (AUC: 0.84 versus 0.72, respectively).

    TABLE 5 The diagnostic performance of the eight-marker algorithm for CRC detection in an independent validation set

                                            Sensitivity [95% CI]
    CRC group            AUC [95% CI]       at 80% specificity  at 90% specificity
    All CRC cases        0.76 [0.65-0.85]   65% [41-80%]        44% [24-72%]
    CRC at Stage I/II    0.72 [0.60-0.84]   61% [34-79%]        34% [13-68%]
    CRC at Stage III/IV  0.84 [0.68-0.96]   75% [50-94%]        69% [44-94%]

    Abbreviations: AUC, area under the receiver operating characteristic curve; CI, confidence interval.
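
    The "sensitivity at a cutoff yielding 80% or 90% specificity" reported in Table 5 can be illustrated with a small sketch. The rule used here (the lowest cutoff at which at least the target share of controls scores at or below it, with scores above the cutoff called positive) is one common convention and an assumption, as are the function name and the toy scores; the source does not state how ties or cutoffs were handled.

```python
import math


def sensitivity_at_specificity(case_scores, control_scores, target_specificity):
    """Return (cutoff, sensitivity) at a cutoff reaching the target specificity.

    A subject is called positive when the score is strictly above the cutoff.
    """
    ranked = sorted(control_scores)
    # Number of controls that must fall at or below the cutoff; the small
    # epsilon guards against floating-point noise in the product.
    k = math.ceil(target_specificity * len(ranked) - 1e-9)
    cutoff = ranked[k - 1] if k > 0 else float("-inf")
    positives = sum(1 for s in case_scores if s > cutoff)
    return cutoff, positives / len(case_scores)


if __name__ == "__main__":
    # Toy scores, not study data.
    controls = list(range(1, 11))
    cases = [5, 8.5, 9, 9.5, 11]
    print(sensitivity_at_specificity(cases, controls, 0.80))
    print(sensitivity_at_specificity(cases, controls, 0.90))
```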

    Example 3: Validation of the Diagnostic Panel of 8 Biomarkers in the Diagnosis of Gastric Cancer and Pancreatic Cancer

    [0072] The diagnostic value of the 8 biomarker panel of the invention could also be validated for both pancreatic cancer and gastric cancer (FIG. 4), indicating the general applicability of the 8 biomarker panel for the diagnosis of cancers, not only colorectal cancer.