METHOD FOR DIAGNOSING CUTANEOUS T-CELL LYMPHOMA DISEASES

20230028910 · 2023-01-26

    Abstract

    The present invention relates to a method for diagnosing Sézary syndrome or mycosis fungoides in a subject. The present invention further relates to a method for determining the frequency of Sézary signature cells and/or mycosis fungoides cells in a sample. Further, the present invention relates to a computer-implemented method comprising a classifier algorithm to determine the frequency of Sézary signature cells and/or mycosis fungoides cells. In addition, the present invention relates to a panel of biomarkers that can be used for the diagnosis of Sézary syndrome or mycosis fungoides.

    Claims

    1. A method for determining the frequency of Sézary signature cells and/or mycosis fungoides cells in a plurality of cells, the method comprising the steps of: i) determining the levels of expression of two or more biomarkers in a plurality of cells; and ii) determining the frequency of Sézary signature cells and/or mycosis fungoides cells in the plurality of cells based on the levels of expression of the two or more biomarkers; wherein the biomarkers are selected from the group listed in Table 1:

        TABLE 1
        Target
        CD45
        CD196/CCR6
        CD19
        CD307c/FcRL3
        HLA-ABC
        CD4
        CD8/CD8a
        CD7
        CD14
        CD25 (IL-2R)
        CD61
        CD123
        CD95 [Fas]
        CD366 [Tim3]
        TIGIT
        CD279/PD-1
        CD195/CCR5
        CD194/CCR4
        CD197/CCR7
        CD28
        CD26
        CD11c
        CD164
        CCL17/CADM1
        CD16
        CD44
        CD27
        CD127/IL-7Ra
        CD45RA
        CD3
        CD20
        CD38
        KIR3DL2/CD158k
        HLA-DR
        CD223 [LAG3]
        CD56
        CD45RO

    and wherein one of the biomarkers is TIGIT.

    2. The method according to claim 1, wherein the further biomarkers are selected from the group of cell surface markers consisting of: CD28, CD197, CD3, HLA-ABC, CD194, CD26, CD164, CD44, CD45, CD45RA, CD45RO, CD7, CD4, CD27, PD-1, HLA-DR, CD14, CD158K, CD95, CD8/CD8a and CD195.

    3. The method according to claim 1, wherein 8 or more biomarkers are used.

    4. The method according to claim 3, wherein 9 or more biomarkers are used.

    5. The method according to claim 1, wherein a classifier algorithm is used to distinguish between a Sézary signature cell and a non-Sézary signature cell or a mycosis fungoides cell and a non-mycosis fungoides cell, respectively.

    6. The method according to claim 1, wherein a convolutional neural network is used to distinguish between a Sézary signature cell and a non-Sézary signature cell or a mycosis fungoides cell and a non-mycosis fungoides cell, respectively.

    7. The method according to claim 1, wherein the plurality of cells are human cells.

    8. The method according to claim 1, wherein the levels of the biomarkers are determined using an antibody-based assay.

    9. The method according to claim 8, wherein the antibody-based assay is an antibody-based flow cytometry or mass cytometry assay.

    10. The method according to claim 1, wherein the two or more biomarkers are:
        a) a panel comprising or consisting of four different biomarkers, namely (i) TIGIT, (ii) CD45 and (iii) two other biomarkers selected from Table 1;
        b) a panel comprising or consisting of four different biomarkers, namely (i) TIGIT, (ii) CD45, (iii) CD3 and (iv) another biomarker selected from Table 1;
        c) a panel comprising or consisting of five different biomarkers, namely (i) TIGIT and (ii) four other biomarkers selected from Table 1;
        d) a panel comprising or consisting of five different biomarkers, namely (i) TIGIT, (ii) CD45 and (iii) three other biomarkers selected from Table 1;
        e) a panel comprising or consisting of six different biomarkers, namely (i) TIGIT and (ii) five other biomarkers selected from Table 1;
        f) a panel comprising or consisting of six different biomarkers, namely (i) TIGIT, (ii) CD45, (iii) CD3 and (iv) three other biomarkers selected from Table 1;
        g) a panel comprising or consisting of six different biomarkers, namely (i) TIGIT, (ii) CD3, (iii) CD4, (iv) CD7, (v) CD8/CD8a and (vi) CD26;
        h) a panel comprising or consisting of seven different biomarkers, namely (i) TIGIT and (ii) six other biomarkers selected from Table 1;
        i) a panel comprising or consisting of seven different biomarkers, namely (i) TIGIT, (ii) CD28, (iii) CD3, (iv) one biomarker selected from CD123, CD11c, CD61 and TIM3 and (v) three other biomarkers selected from Table 1;
        j) a panel comprising or consisting of seven different biomarkers, namely (i) TIGIT, (ii) CD28, (iii) CD3, (iv) CD123, (v) CD11c, (vi) CD61 and (vii) TIM3;
        k) a panel comprising or consisting of eight different biomarkers, namely (i) TIGIT, (ii) CD28, (iii) CD3, (iv) one biomarker selected from CD197, HLA-ABC, PD-1, CD27 and CD4 and (v) four other biomarkers selected from Table 1;
        l) a panel comprising or consisting of eight different biomarkers, namely (i) TIGIT, (ii) CD28, (iii) CD3, (iv) two biomarkers selected from CD197, HLA-ABC, PD-1, CD27 and CD4 and (v) three other biomarkers selected from Table 1;
        m) a panel comprising or consisting of eight different biomarkers, namely (i) TIGIT, (ii) CD28, (iii) CD3, (iv) three biomarkers selected from CD197, HLA-ABC, PD-1, CD27 and CD4 and (v) two biomarkers selected from Table 1;
        n) a panel comprising or consisting of eight different biomarkers, namely (i) TIGIT, (ii) CD28, (iii) CD3, (iv) four biomarkers selected from CD197, HLA-ABC, PD-1, CD27 and CD4 and (v) another biomarker selected from Table 1;
        o) a panel comprising or consisting of eight different biomarkers, namely (i) TIGIT, (ii) CD28, (iii) CD3, (iv) CD197, (v) HLA-ABC, (vi) PD-1, (vii) CD27 and (viii) CD4;
        p) a panel comprising or consisting of nine different biomarkers, namely (i) CD28, (ii) CD197, (iii) CD4, (iv) CD3, (v) HLA-ABC, (vi) CD27, (vii) TIGIT, (viii) PD-1 and (ix) another biomarker selected from Table 1;
        q) a panel comprising or consisting of nine different biomarkers, namely (i) CD28, (ii) CD197, (iii) CD4, (iv) CD3, (v) HLA-ABC, (vi) CD27, (vii) TIGIT, (viii) PD-1 and (ix) CD26, CD164 or CD158K;
        r) a panel comprising or consisting of nine different biomarkers, namely (i) CD3, (ii) CD4, (iii) CD7, (iv) CD8/CD8a, (v) CD26, (vi) CD27, (vii) CD45, (viii) CD45RA and (ix) TIGIT;
        s) a panel comprising or consisting of ten different biomarkers, namely (i) TIGIT and (ii) nine other biomarkers selected from Table 1;
        t) a panel comprising or consisting of ten different biomarkers, namely (i) TIGIT, (ii) CD28 and (iii) eight other biomarkers selected from Table 1;
        u) a panel comprising or consisting of ten different biomarkers, namely (i) CD3, (ii) CD4, (iii) CD7, (iv) CD8/CD8a, (v) CD26, (vi) CD45RO, (vii) TIGIT, (viii) CD194, (ix) CD197 and (x) PD-1;
        v) a panel comprising or consisting of ten different biomarkers, namely (i) CD28, (ii) CD197, (iii) CD4, (iv) CD3, (v) HLA-ABC, (vi) CD27, (vii) TIGIT, (viii) PD-1 and (ix) two other biomarkers selected from Table 1;
        w) a panel comprising or consisting of ten different biomarkers, namely (i) CD28, (ii) CD197, (iii) CD4, (iv) CD3, (v) HLA-ABC, (vi) CD27, (vii) TIGIT, (viii) PD-1, (ix) HLA-DR and (x) CD14; or
        x) a panel comprising or consisting of ten different biomarkers, namely (i) CD28, (ii) CD197, (iii) CD4, (iv) CD3, (v) HLA-ABC, (vi) CD27, (vii) TIGIT, (viii) PD-1, (ix) CD164 and (x) CD158K.

    11. The method according to claim 1, wherein the two or more biomarkers are a panel comprising or consisting of nine different biomarkers, namely (i) CD3, (ii) CD4, (iii) CD7, (iv) CD8/CD8a, (v) CD26, (vi) CD27, (vii) CD45, (viii) CD45RA and (ix) TIGIT.

    12. A computer-implemented method for determining the frequency of Sézary signature cells or mycosis fungoides cells, the method comprising the steps of: i) executing a classifier algorithm on a set of data comprising the levels of expression of two or more biomarkers selected from the group consisting of the biomarkers listed in Table 1 in a plurality of cells;

        TABLE 1
        Target
        CD45
        CD196/CCR6
        CD19
        CD307c/FcRL3
        HLA-ABC
        CD4
        CD8/CD8a
        CD7
        CD14
        CD25 (IL-2R)
        CD61
        CD123
        CD95 [Fas]
        CD366 [Tim3]
        TIGIT
        CD279/PD-1
        CD195/CCR5
        CD194/CCR4
        CD197/CCR7
        CD28
        CD26
        CD11c
        CD164
        CCL17/CADM1
        CD16
        CD44
        CD27
        CD127/IL-7Ra
        CD45RA
        CD3
        CD20
        CD38
        KIR3DL2/CD158k
        HLA-DR
        CD223 [LAG3]
        CD56
        CD45RO

    ii) determining the frequency of Sézary signature cells or mycosis fungoides cells, respectively, in the plurality of cells underlying the set of data based on the levels of expression of the two or more biomarkers, wherein one of the two or more biomarkers is TIGIT.

    13. The method of claim 12, wherein the classifier algorithm comprises one or a combination of a support vector algorithm, a convolutional neural network, a tree-based method and logistic regression.

    14. A computer program product containing instructions for performing the computer-implemented method according to claim 12.

    15. A method for diagnosing a subject as having Sézary syndrome or as having mycosis fungoides, the method comprising the steps of: i) determining the frequency of Sézary signature cells or mycosis fungoides cells, respectively, in a sample obtained from the subject using the method according to claim 1; ii) comparing the frequency of Sézary signature cells or mycosis fungoides cells, respectively, determined in step (i) to the frequency of Sézary signature cells or mycosis fungoides cells, respectively, in a sample that has been obtained from a subject not suffering from Sézary syndrome or mycosis fungoides, respectively, and/or to the frequency of Sézary signature cells or mycosis fungoides cells, respectively, in a sample that has been obtained from a subject with Sézary syndrome or mycosis fungoides, respectively; and iii) determining a subject as having Sézary syndrome or mycosis fungoides, respectively, if the frequency of Sézary signature cells or mycosis fungoides cells, respectively, determined in step (ii) is higher compared to the frequency of Sézary signature cells or mycosis fungoides cells, respectively, for the subject not suffering from Sézary syndrome or mycosis fungoides, respectively, and/or if the frequency of Sézary signature cells or mycosis fungoides cells, respectively, determined in step (ii) is similar or higher compared to the frequency of Sézary signature cells or mycosis fungoides cells, respectively, in the sample that has been obtained from the subject with Sézary syndrome or mycosis fungoides, respectively.

    16. The method according to claim 15, wherein determining the subject as having Sézary syndrome or mycosis fungoides, respectively, comprises the use of a classifier algorithm.

    17. The method according to claim 16, wherein the classifier algorithm comprises a convolutional neural network and/or logistic regression.

    18. The method according to claim 15, wherein the subject without Sézary syndrome is a healthy subject or a subject suffering from atopic dermatitis, non-specific dermatitis, erythroderma and/or mycosis fungoides.

    19. The method according to claim 15 further comprising one or more other diagnostic tests and/or additional information about the subject.

    20. The method according to claim 19, wherein the one or more other diagnostic tests is/are selected from the group consisting of: whole-body imaging tests, skin lesion biopsies, histopathology tests, CD4/CD8 ratio determination, cell count analysis and PCR analysis of T cell receptor clonality.

    21. The method according to claim 19, wherein the additional information about the subject comprises age and/or clinical information.

    22. A method for determining the susceptibility to treatment of Sézary syndrome or mycosis fungoides in a subject, the method comprising the steps of: i) determining the frequency of Sézary signature cells or mycosis fungoides cells, respectively, in a first sample that has been obtained from said subject; ii) determining the frequency of Sézary signature cells or mycosis fungoides cells, respectively, in a second sample that has been obtained from the same subject, wherein the second sample has been obtained at a later time point than the first sample; iii) determining the change in frequency of Sézary signature cells or mycosis fungoides cells, respectively, between the first and second sample; and iv) determining the subject as being susceptible to treatment of Sézary syndrome or mycosis fungoides, respectively, if the frequency of Sézary signature cells or mycosis fungoides cells, respectively, is lower in the second sample than in the first sample; wherein the frequency of Sézary signature cells or mycosis fungoides cells, respectively, is determined using the method according to claim 1.

    23. The method according to claim 15, wherein the sample is a blood sample, in particular, wherein the sample comprises peripheral blood mononuclear cells.

    24. A composition comprising reagents for the detection of biomarkers for the diagnosis of Sézary syndrome or mycosis fungoides, the biomarkers comprising or consisting of:
        a) at least seven different biomarkers, namely: (i) CD28, (ii) CD3, (iii) TIGIT and (iv) at least four other biomarkers selected from a group consisting of: CD197, CD11c, CD26, CD164, TIM3, CD61, CD4, PD-1, HLA-ABC, HLA-DR, CD14, CD158K, CD197, CD8/CD8a and CD195;
        b) seven different biomarkers, namely (i) CD28, (ii) CD3, (iii) HLA-ABC, (iv) CD197, (v) CD7, (vi) CD27 and (vii) TIGIT;
        c) eight different biomarkers, namely (i) CD28, (ii) CD197, (iii) CD4, (iv) CD3, (v) HLA-ABC, (vi) CD27, (vii) TIGIT and (viii) PD-1;
        d) nine different biomarkers, namely (i) CD28, (ii) CD197, (iii) CD4, (iv) CD3, (v) HLA-ABC, (vi) CD27, (vii) TIGIT, (viii) PD-1 and (ix) CD26, CD164 or CD158K;
        e) nine different biomarkers, namely (i) CD3, (ii) CD4, (iii) CD7, (iv) CD8/CD8a, (v) CD26, (vi) CD27, (vii) CD45, (viii) CD45RA and (ix) TIGIT;
        f) ten different biomarkers, namely (i) CD28, (ii) CD197, (iii) CD4, (iv) CD3, (v) HLA-ABC, (vi) CD27, (vii) TIGIT, (viii) PD-1, (ix) HLA-DR and (x) CD14;
        g) ten different biomarkers, namely (i) CD3, (ii) CD4, (iii) CD7, (iv) CD8/CD8a, (v) CD26, (vi) CD45RO, (vii) TIGIT, (viii) CD194, (ix) CD197 and (x) PD-1; or
        h) ten different biomarkers, namely (i) CD28, (ii) CD197, (iii) CD4, (iv) CD3, (v) HLA-ABC, (vi) CD27, (vii) TIGIT, (viii) PD-1, (ix) CD164 and (x) CD158K.

    25. The composition according to claim 24, wherein the biomarkers are comprising or consisting of: a) CD3, CD4, CD7, CD8/CD8a, CD26, CD27, CD45, CD45RA and TIGIT; or b) CD3, CD4, CD7, CD8/CD8a, CD26, CD45RO, TIGIT, CD194, CD197 and PD-1.

    26. A method for determining the frequency of Sézary signature cells and/or mycosis fungoides cells in a plurality of cells, the method comprising the steps of: i) determining the levels of expression of three or more biomarkers in a plurality of cells; and ii) determining the frequency of Sézary signature cells and/or mycosis fungoides cells in the plurality of cells based on the levels of expression of the three or more biomarkers; wherein the three or more biomarkers are a panel comprising or consisting of at least three biomarkers selected from the group of CD27, CD26, CD7, CD45RA and TIGIT.

    27-51. (canceled)

    Description

    BRIEF DESCRIPTION OF FIGURES

    [0304] FIG. 1: Accuracies of predictions from CellCnn models trained on all 36 markers on held-out data for 10-fold cross-validation. Median accuracy is 1.0.

    [0305] FIG. 2: Confusion matrix for training data (left) and validation data (right), calculated based on predictions of a CellCnn model trained using all 36 markers and all samples from the discovery cohort. 11 samples were set aside as validation data during the training process.

    [0306] FIG. 3: Boxplots comparing the relative frequency of selected cells using a filter response cutoff of 0.3 between non-SS and SS samples (p=1.6e-13, Wilcoxon rank-sum test).

    [0307] FIG. 4: t-SNE maps of cells from all samples in the discovery cohort (5000 cells randomly sub-sampled from each sample), showing in black either cells selected as above the filter response cutoff (left) or cells manually annotated as CD4 T cells (right).

    [0308] FIG. 5: Learned filter weights for each marker from the trained CellCnn model. Positive scores correspond to markers that contribute positively towards a prediction of Sézary syndrome, while negative scores correspond to markers that contribute negatively towards a prediction of Sézary syndrome.

    [0309] FIG. 6: Smoothed densities of normalized, arcsinh-transformed marker expression for the top 9 most informative markers in selected cells (solid line) vs. all measured cells (dotted line). K-S indicates the Kolmogorov-Smirnov test statistic for each comparison. Markers are shown in decreasing order of separation between the selected cells distribution and the background distribution.

    [0310] FIG. 7: Display of workflow according to analysis strategy for Panel A as disclosed herein.

    EXAMPLES

    [0311] Aspects of the present invention are additionally described by way of the following illustrative non-limiting examples that provide a better understanding of embodiments of the present invention and of its many advantages. The following examples are included to demonstrate preferred embodiments of the invention. It should be appreciated by those of skill in the art that the techniques disclosed in the examples which follow represent techniques used in the present invention to function well in the practice of the invention, and thus can be considered to constitute preferred modes for its practice. However, those of skill in the art should appreciate, in light of the present disclosure that many changes can be made in the specific embodiments which are disclosed and still obtain a like or similar result without departing from the spirit and scope of the invention.

    [0312] Identification of a Sézary-Specific Biomarker from Discovery Data

    [0313] A discovery cohort of 60 PBMC samples was collected from 60 individuals, comprising 20 samples from patients with clinically diagnosed Sézary syndrome (SS), 20 samples from patients with clinically diagnosed atopic dermatitis (AD), and 20 healthy donor (HD) samples. CyTOF data were acquired from these samples using a panel of 37 antibodies (Table 1a), which were selected for the identification of the major human immune cell subsets in blood as well as known or suspected cancer-specific markers.

    [0314] PBMCs were isolated from whole blood via standard Ficoll isolation and 3×10⁶ cells per sample were resuspended in Maxpar Cell Staining Buffer (CSB, Fluidigm). To stain for viable cells, samples were incubated for 10 min at room temperature (RT) with cisplatin in PBS (1:20,000 dilution, Cell-ID Cisplatin-194Pt/198Pt, Fluidigm). For fixation and permeabilisation, the samples were resuspended in Maxpar Fix I Buffer (Fluidigm) and incubated for 10 min at RT, followed by two washing steps with the Barcode Perm Buffer (Fluidigm). For barcoding, the samples were mixed with the barcodes (Cell-ID 20-Plex Pd Barcoding Kit, Fluidigm), which had been resuspended in 100 μl of Barcode Perm Buffer in advance. After 30 min of incubation at RT, samples were centrifuged and washed with Maxpar Staining Buffer. Finally, all barcoded samples were pooled together. The pooled barcoded sample was stained with the antibody panel for 20 min at RT and washed twice with CSB. For the intercalator staining, 200 μl of freshly prepared 4% PFA in PBS was added to the stained cells and samples were incubated for 24 h at 4° C., followed by iridium staining by adding 200 μl of freshly prepared intercalator solution (1:5000 dilution in Maxpar Fix and Perm Buffer, Cell-ID Intercalator-Ir, Fluidigm) for 1 h at RT. Before CyTOF acquisition, the samples were mixed with 15% of EQ™ Four Element Calibration Beads (¹⁴⁰Ce, ¹⁵¹Eu, ¹⁵³Eu, ¹⁶⁵Ho, ¹⁷⁵Lu, Fluidigm).

    [0315] The data were pre-processed by applying bead normalization to the entire dataset (Chevrier, S. et al. Cell Syst. 6, 612-620.e5 (2018) and Finck, R. et al. Cytometry A 83A, 483-494 (2013)) and then applying an arcsinh transformation with a cofactor of 5 (Nowicka, M. et al. F1000Research 6, 748 (2019)). Samples were de-barcoded using the CATALYST package in R (Chevrier, S. et al. Cell Syst. 6, 612-620.e5 (2018) and Zunder, E. R. et al. Nat. Protoc. 10, 316-333 (2015)). After de-barcoding, the inventors observed that 7 samples (5 SS, 1 AD and 1 HD) contained fewer than 10,000 events. The inventors reasoned that these samples, which had previously been observed to be of low quality during data acquisition, might not include all relevant rare cell types or, in the worst case, might introduce biases into the results; therefore, the inventors excluded them from further analysis.
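
    The transformation and the event-count filter described above can be sketched as follows. This is a minimal illustration only: the function names and the example intensities are hypothetical, and bead normalization and de-barcoding (performed with dedicated tools in the text) are not reproduced here.

```python
import math

COFACTOR = 5.0        # cofactor stated in the text
MIN_EVENTS = 10_000   # samples with fewer events were excluded

def arcsinh_transform(intensities, cofactor=COFACTOR):
    """Apply arcsinh(x / cofactor) to each raw intensity value."""
    return [math.asinh(x / cofactor) for x in intensities]

def passes_event_filter(n_events, min_events=MIN_EVENTS):
    """Keep a sample only if it contains at least `min_events` events."""
    return n_events >= min_events

# Hypothetical raw intensities for a single marker channel.
raw = [0.0, 5.0, 50.0, 500.0]
transformed = arcsinh_transform(raw)
```

    The arcsinh transform is close to linear near zero and close to logarithmic for large values, which is why it is commonly applied to mass cytometry intensities.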

    [0316] The remaining 53 samples were used to train a CellCnn neural network to distinguish between SS and non-SS samples. For this task, AD and HD samples were combined into a “non-SS” class. To first evaluate the robustness of the CellCnn model on this dataset, the inventors conducted 10-fold cross-validation. The median accuracy on held-out test data across 10 runs for each two-way comparison of SS vs. non-SS samples was 1.0 (FIG. 1). From these results, the inventors concluded that CellCnn was able to robustly identify a Sézary-specific biomarker from the discovery cohort. The inventors then proceeded to train a CellCnn model using the full discovery dataset.
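
    The 10-fold split can be illustrated with a short sketch. The text does not state how folds were assigned, so the stratified round-robin assignment below is an assumption, and `stratified_folds` is a hypothetical helper. The class sizes follow from the exclusions described above (15 SS and 38 non-SS samples remain).

```python
import random
from collections import defaultdict

def stratified_folds(labels, n_folds=10, seed=0):
    """Assign each sample index to one of `n_folds` folds, class by class,
    so that every fold receives a similar share of each class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(n_folds)]
    offset = 0
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Continue the round-robin where the previous class stopped,
        # which keeps the overall fold sizes balanced.
        for i, idx in enumerate(indices):
            folds[(offset + i) % n_folds].append(idx)
        offset += len(indices)
    return folds

# 1 = SS, 0 = non-SS (AD and HD combined), as in the text.
labels = [1] * 15 + [0] * 38
folds = stratified_folds(labels)
```

    In each cross-validation run, one fold would serve as held-out test data and the remaining nine as training data.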

    [0317] Input data for CellCnn consisted of 1000 batches of 3000 cells each, chosen randomly from each sample independently. For CyTOF data, all measured protein markers were included as parameters to train the network. For FACS data, all measured protein markers plus forward- and side-scatter parameters were included to train the network. Each sample was given a label of either 0, corresponding to atopic dermatitis and healthy donor samples, or 1, corresponding to Sézary syndrome samples. CellCnn was run in classification mode. During training, 20% of the samples were set aside for validation, chosen in a stratified manner to maintain the relative proportions of each class. Fifteen independent networks were generated using different hyperparameters randomly chosen from the following options:

    [0318] Number of filters: [3, 4, 5, 6, 7, 8, 9, 10]

    [0319] MaxPool percentage: [0.01, 1, 5, 20, 100]

    [0320] Learning rate: range (0.001, 0.01)

    [0321] All other hyperparameters were fixed at their default values. Each network was trained for a maximum of 50 epochs, or until the validation loss had not decreased for 5 consecutive epochs.
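
    A minimal sketch of the random hyperparameter search and the stopping rule is given below. It does not call CellCnn itself; `sample_configs` and `stopping_epoch` are hypothetical helpers, and drawing the learning rate uniformly from the stated range is an assumption (the text gives only the range).

```python
import random

N_FILTERS = [3, 4, 5, 6, 7, 8, 9, 10]
MAXPOOL_PERCENTAGES = [0.01, 1, 5, 20, 100]
LEARNING_RATE_RANGE = (0.001, 0.01)

def sample_configs(n_networks=15, seed=0):
    """Draw one random hyperparameter set per network."""
    rng = random.Random(seed)
    return [
        {
            "n_filters": rng.choice(N_FILTERS),
            "maxpool_percentage": rng.choice(MAXPOOL_PERCENTAGES),
            # Assumption: uniform sampling within the stated range.
            "learning_rate": rng.uniform(*LEARNING_RATE_RANGE),
        }
        for _ in range(n_networks)
    ]

def stopping_epoch(val_losses, max_epochs=50, patience=5):
    """Return the epoch at which training stops: either when the validation
    loss has not improved for `patience` consecutive epochs, or when
    `max_epochs` is reached."""
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses[:max_epochs], start=1):
        if loss < best:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return min(len(val_losses), max_epochs)

configs = sample_configs()
```

    For example, a validation loss that improves for three epochs and then plateaus would stop training at epoch 8 under this rule.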

    [0322] The resulting model showed 100% accuracy on both training and validation data (FIG. 2). Using the learned weights from the CellCnn filter most strongly positively associated with SS samples, the inventors calculated a filter response score for every cell in the dataset. These scores can be used to set a threshold that defines the Sézary-specific signature cells. A range of thresholds was tested, and the threshold that best separated SS and non-SS samples by relative frequency of Sézary signature cells was chosen as 0.3 (FIG. 3). The median frequency of signature cells in SS samples was 12.9% of all live PBMCs, while the median frequency in non-SS samples was 0.033% (p=1.6e-13, Wilcoxon rank-sum test).
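
    The thresholding step can be sketched as follows, assuming the per-cell filter response scores have already been computed. The text does not specify the criterion used to judge which threshold "best separated" the classes; the gap between the lowest SS frequency and the highest non-SS frequency used here is an assumption, and all names and example scores are hypothetical.

```python
CUTOFF = 0.3  # filter response cutoff reported in the text

def signature_frequency(filter_responses, cutoff=CUTOFF):
    """Fraction of cells in a sample whose filter response exceeds the cutoff."""
    if not filter_responses:
        raise ValueError("sample contains no cells")
    selected = sum(1 for score in filter_responses if score > cutoff)
    return selected / len(filter_responses)

def separation_gap(ss_samples, non_ss_samples, cutoff):
    """Assumed separation criterion: lowest SS frequency minus highest
    non-SS frequency at the given cutoff."""
    ss = [signature_frequency(s, cutoff) for s in ss_samples]
    non_ss = [signature_frequency(s, cutoff) for s in non_ss_samples]
    return min(ss) - max(non_ss)

def best_cutoff(ss_samples, non_ss_samples, candidates):
    """Return the candidate cutoff with the largest separation gap."""
    return max(candidates,
               key=lambda c: separation_gap(ss_samples, non_ss_samples, c))

# Hypothetical per-cell scores for two SS and two non-SS samples.
ss_samples = [[0.9, 0.8, 0.4, 0.05], [0.7, 0.6, 0.4, 0.0]]
non_ss_samples = [[0.2, 0.1, 0.0, 0.0], [0.25, 0.1, 0.05, 0.0]]
chosen = best_cutoff(ss_samples, non_ss_samples, [0.1, 0.3, 0.5])
```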

    [0323] Validation of the Discovered Signature in an Independent Patient Cohort

    [0324] In order to validate the Sézary-specific biomarker, PBMCs were collected from an independent cohort of 33 individuals, comprising 11 patients with Sézary syndrome, 11 patients with Atopic Dermatitis, and 11 healthy donors. Sample labels were blinded during the data acquisition and analysis phases. The validation samples were pre-processed in the same way as the discovery samples, with bead normalization performed on all samples together to reduce the potential batch effects between the two cohorts. The pre-trained CellCnn network was used to predict the status (SS or non-SS) of each sample, which was then compared against the corresponding clinically-diagnosed disease status. The predictions showed a sensitivity of 0.81 and a specificity of 0.95.

    [0325] Performance metrics for the sample classification task were calculated as follows. True positives (TP) were defined as SS samples that were predicted as SS samples. False positives (FP) were defined as AD or HD samples that were predicted as SS samples. True negatives (TN) were defined as AD or HD samples that were predicted as AD or HD samples. False negatives (FN) were defined as SS samples that were predicted as AD or HD. Sensitivity was calculated as TP/(TP+FN); specificity as TN/(FP+TN); accuracy as (TP+TN)/(TP+FP+TN+FN); positive predictive value (PPV) as TP/(TP+FP); and negative predictive value (NPV) as TN/(TN+FN). The F1-score was calculated as the harmonic mean of sensitivity and specificity, using the formula 2*(sensitivity*specificity)/(sensitivity+specificity).
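
    The formulas above translate directly into code. The sketch below follows the definitions exactly as stated in the text, including the F1-score computed from sensitivity and specificity (the more common definition of F1 uses PPV and sensitivity instead). The confusion counts in the example are hypothetical values for a 33-sample cohort, not reported results.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute the sample-level performance metrics defined in the text."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (fp + tn)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        # F1 as defined in the text: harmonic mean of sensitivity
        # and specificity.
        "f1": 2 * (sensitivity * specificity) / (sensitivity + specificity),
    }

# Hypothetical counts: 11 SS samples and 22 non-SS samples.
metrics = classification_metrics(tp=9, fp=1, tn=21, fn=2)
```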

    [0326] Phenotypic Characterization of Sézary Signature Cells

    [0327] In parallel to the CellCnn analysis, the inventors performed clustering on the CyTOF data using FlowSOM (Van Gassen, S. et al., Cytometry A 87, 636-645 (2015)), followed by manual annotation of the resulting 100 nodes into major immune cell lineages based on marker expression profiles (CD4 T cells, CD8/CD8a T cells, other T cells, B cells, NK cells, myeloid cells and pDCs). T cells, and more specifically CD4+ T cells, showed an enrichment in SS vs. non-SS samples, while all other identified cell types showed a relative depletion. These results are consistent with the expected immune profiles from Sézary syndrome patients, as it is a disease originating from a CD4+ T cell expansion. Plotting the CellCnn-derived Sézary signature cells on a t-SNE map revealed that they largely overlap with the CD4+ T cell cluster (FIG. 4); however, not all CD4+ T cells were included in the Sézary signature cells. In order to better phenotypically characterize the Sézary signature cells, the inventors examined both the learned filter weights from CellCnn and the distribution of marker expression in those cells.

    [0328] The filters learned by CellCnn correspond to a set of weights for each measured protein marker, which can be either positive or negative (FIG. 5). Taken as a whole, these weights may be considered to define the molecular profile of a theoretical group of cells that is strongly predictive of a given class label. Both positive and negative weights may be informative, as positive weights describe markers that have high values (are relatively enriched) in these idealized cells, while negative weights describe markers that have low values (are relatively depleted) in the same cells as compared to the entire dataset. As the Sézary signature cells were selected based on their filter response scores, which indicate how similar their individual expression profiles are to the learned filter, it is expected that signature cells should show enrichment for markers with high filter weights and depletion for markers with low filter weights. Nonetheless, the expression levels of markers in individual cells are heterogeneous, making it appropriate to examine their overall distributions.

    [0329] For each marker examined, the inventors compared the expression distribution in Sézary signature cells vs. the expression distribution in the entire dataset using a Kolmogorov-Smirnov 2-sample test (FIG. 6). The inventors performed one set of tests with the null hypothesis that the expression distribution in signature cells is stochastically greater than the background distribution, in order to identify markers most enriched in signature cells, and another set of tests with the null hypothesis that the distribution in signature cells is stochastically smaller than the background distribution, in order to identify markers most depleted in signature cells. The inventors then compared the lists of markers ranked by the Kolmogorov-Smirnov test statistic with the list of markers ranked by filter weights (Table 2). Among the top 10 enriched markers in signature cells, 7 (CD28, HLA-ABC, CD3, CD4, CD194, CD197, and CD127/CD25) were also in the top 10 markers with the most positive filter weights. Among the top 10 depleted markers in signature cells, 7 (CD44, CD38, CD7, CD11b, CD45RA, HLA-DR, and CD26) were also in the top 10 markers with the most negative filter weights.
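
    The one-sided Kolmogorov-Smirnov statistics used for this ranking can be sketched in plain Python. The example computes only the test statistics from the two empirical distribution functions, not p-values; `ks_one_sided` is a hypothetical helper and the expression values are invented for illustration.

```python
from bisect import bisect_right

def ecdf(sorted_values, x):
    """Empirical CDF of a sorted sample, evaluated at x."""
    return bisect_right(sorted_values, x) / len(sorted_values)

def ks_one_sided(signature, background):
    """Return (enrichment, depletion) one-sided K-S statistics.

    enrichment: largest amount by which the background ECDF exceeds the
        signature ECDF (signature cells shifted towards higher expression).
    depletion: largest amount by which the signature ECDF exceeds the
        background ECDF (signature cells shifted towards lower expression).
    """
    sig = sorted(signature)
    bg = sorted(background)
    enrichment = 0.0
    depletion = 0.0
    # Both suprema are attained at observed data points.
    for x in sig + bg:
        diff = ecdf(sig, x) - ecdf(bg, x)
        enrichment = max(enrichment, -diff)
        depletion = max(depletion, diff)
    return enrichment, depletion

# Hypothetical transformed expression values for one marker.
enrichment, depletion = ks_one_sided([3.0, 4.0, 5.0], [1.0, 2.0, 3.0])
```

    Ranking markers by the enrichment statistic (or by the depletion statistic) in decreasing order would give orderings of the kind compared against the filter weights in Table 2.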

    [0330] Definition of Reduced Antibody Panels

    [0331] Based on these results, as well as on expert knowledge of expected lineage markers present in T cells and other immune cell types, the inventors designed several panels comprising a reduced number of antibodies. These panels were tested in silico by retraining CellCnn using only the data corresponding to the markers in each reduced panel. As with the initial full antibody panel, the inventors conducted 10-fold cross-validation and calculated the accuracies of predictions on the held-out test data for each run. Median accuracies for the different reduced panels ranged from 0.96 to 1.00 (Table 3). The inventors also trained a single CellCnn model using all of the samples from the discovery cohort for each reduced panel, and used these models to predict the labels of the validation cohort samples. The predictions showed sensitivities ranging from 0.64 to 1.00 and specificities ranging from 0.32 to 1.00.
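
    Restricting the data to a reduced panel before retraining amounts to selecting the corresponding marker columns. The sketch below assumes a simple row-per-cell layout; `subset_to_panel`, the marker order and the values are hypothetical, and the retraining step itself is not shown.

```python
def subset_to_panel(cells, marker_names, panel):
    """Keep only the columns of `cells` that belong to `panel`,
    returned in panel order."""
    column_of = {name: i for i, name in enumerate(marker_names)}
    missing = [m for m in panel if m not in column_of]
    if missing:
        raise KeyError(f"markers not measured: {missing}")
    columns = [column_of[m] for m in panel]
    return [[row[c] for c in columns] for row in cells]

# Hypothetical measurements: one row per cell, one column per marker.
marker_names = ["CD3", "CD19", "CD4", "TIGIT"]
cells = [
    [1.0, 0.0, 2.0, 3.0],
    [0.5, 4.0, 0.1, 0.0],
]
reduced = subset_to_panel(cells, marker_names, ["TIGIT", "CD3", "CD4"])
```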

    [0332] Preferred Panels

    [0333] Preferred panels of biomarkers to be measured consist of at least 6 biomarkers from Table 1 and consist of a combination of both biomarkers whose levels are increased in cells specific to patients with Sézary syndrome (“positive biomarkers”) and biomarkers whose levels are decreased in cells specific to patients with Sézary syndrome (“negative biomarkers”). Panels of particular diagnostic utility comprise the panels listed in Table 3. The preferred panels may be expanded by adding one or more additional biomarkers from Table 1. This may result in increased sensitivity and/or specificity for diagnosis.

    [0334] Comparative Sensitivity of Marker Panels

    [0335] Several example panels of antibodies were tested. For each, thresholds for every marker individually were identified and then the frequencies of the resulting selected cells in SS and non-SS samples were calculated, before the p-value comparing those frequencies using a Wilcoxon test was determined. The same was done using CellCnn by defining a threshold on a CellCnn filter score. As can be seen in Table 4, all panels were able to separate the classes of samples.
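
    The per-marker gating and the subsequent comparison of frequencies can be sketched as follows. The gate directions, cutoffs and values are illustrative assumptions, not thresholds from the text. The Wilcoxon rank-sum test is equivalent to the Mann-Whitney U test; the p-value here uses the normal approximation without tie correction, which is only a rough guide for small samples.

```python
import math

def gated_frequency(cells, thresholds):
    """Fraction of cells passing every per-marker gate.

    `thresholds` maps marker -> (cutoff, direction); direction "+" keeps
    cells at or above the cutoff, "-" keeps cells below it.
    """
    def passes(cell):
        for marker, (cutoff, direction) in thresholds.items():
            is_positive = cell[marker] >= cutoff
            if is_positive != (direction == "+"):
                return False
        return True
    return sum(passes(c) for c in cells) / len(cells)

def rank_sum_u(x, y):
    """Mann-Whitney U statistic for x vs. y (ties count one half)."""
    return sum((a > b) + 0.5 * (a == b) for a in x for b in y)

def rank_sum_p_value(x, y):
    """Two-sided p-value from the normal approximation (no tie correction)."""
    n1, n2 = len(x), len(y)
    u = rank_sum_u(x, y)
    mean = n1 * n2 / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean) / sd
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical gates and cells for a two-marker illustration.
gates = {"TIGIT": (1.0, "+"), "CD26": (1.0, "-")}
sample = [
    {"TIGIT": 2.0, "CD26": 0.1},
    {"TIGIT": 2.5, "CD26": 1.5},
    {"TIGIT": 0.2, "CD26": 0.1},
    {"TIGIT": 0.1, "CD26": 2.0},
]
frequency = gated_frequency(sample, gates)

# Hypothetical per-sample frequencies for SS and non-SS groups.
ss_freqs = [0.2, 0.3, 0.4]
non_ss_freqs = [0.0, 0.01, 0.02]
u_statistic = rank_sum_u(ss_freqs, non_ss_freqs)
p_value = rank_sum_p_value(ss_freqs, non_ss_freqs)
```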

    [0336] In addition, marker panels 1 to 19 were used for a comparison of the sensitivity of the various panels vs. four existing methods from the prior art. There were 15 patients in this cohort. As can be seen in Table 5, the tested panels were able to determine samples as comprising SS cells which were not detected using one of the methods of the prior art.

    [0337] While aspects of the invention are illustrated and described in detail in the Figures and in the foregoing tables and description, such Figures, tables and description are to be considered illustrative or exemplary and not restrictive. Also reference signs in the claims should not be construed as limiting the scope.

    [0338] It will also be understood that changes and modifications may be made by those of ordinary skill within the scope and spirit of the claims. In particular, the present invention covers further embodiments with any combination of features from different embodiments described above. It is also to be noted in this context that the invention covers all further features shown in the figures individually, although they may not have been described in the previous or following description. Also, single alternatives of the embodiments described in the figures and the description and single alternatives of features thereof can be disclaimed from the subject matter according to aspects of the invention.

    [0339] Whenever the word “comprising” is used in the claims, it should not be construed to exclude other elements or steps. It should also be understood that the terms “essentially”, “substantially”, “about”, “approximately” and the like used in connection with an attribute or a value may define the attribute or the value in an exact manner in the context of the present disclosure. The terms “essentially”, “substantially”, “about”, “approximately” and the like could thus also be omitted when referring to the respective attribute or value. The terms “essentially”, “substantially”, “about”, “approximately” when used with a value may mean the value ±10%, preferably ±5%.

    [0340] A number of documents including patent applications, manufacturer's manuals and scientific publications are cited herein. The disclosure of these documents, while not considered relevant for the patentability of this invention, is herewith incorporated by reference in its entirety. More specifically, all referenced documents are incorporated by reference to the same extent as if each individual document was specifically and individually indicated to be incorporated by reference.

    TABLE-US-00002 TABLE 1a

    Target            Clone       Mass  Element
    CD45              HI30          89  Y
    CD196/CCR6        G034E3       141  Pr
    CD19              HIB19        142  Nd
    CD307c/FcRL3      H5/FcRL3     143  Nd
    HLA-ABC           W6/32        144  Nd
    CD4               RPA-T4       145  Nd
    CD8/CD8a          RPA-T8       146  Nd
    CD7               CD7-6B7      147  Sm
    CD14              RM052        148  Nd
    CD25 (IL-2R)      2A3          149  Sm
    CD61              VI-PL2       150  Nd
    CD123             6H6          151  Eu
    CD95 [Fas]        DX2          152  Sm
    CD366 [Tim3]      F38-2E2      153  Eu
    TIGIT             MBSA43       154  Sm
    CD279/PD-1        EH12.2H7     155  Gd
    CD195/CCR5        NP-6G4       156  Gd
    CD194/CCR4        205410       158  Gd
    CD197/CCR7        G043H7       159  Tb
    CD28              CD28.2       160  Gd
    CD26              BA5b         161  Dy
    CD11c             Bu15         162  Dy
    CD164             67D2         163  Dy
    CCL17/CADM1       3E1          164  Dy
    CD16              3G8          165  Ho
    CD44              BJ18         166  Er
    CD27              0323         167  Er
    CD127/IL-7Ra      AO19D5       168  Er
    CD45RA            HI100        169  Tm
    CD3               UCHT1        170  Er
    CD20              2H7          171  Yb
    CD38              HIT2         172  Yb
    KIR3DL2/CD158k    539304       173  Yb
    HLA-DR            L243         174  Yb
    CD223 [LAG3]      11C3C65      175  Lu
    CD56              NCAM16.2     176  Yb

    TABLE-US-00003 TABLE 2 Comparison of marker rankings

    Marker      Rank by         Rank by K-S       Rank by K-S
                filter weight   test statistic    test statistic
                                (enrichment)      (depletion)
    CD28         1               1                34
    CD194        2               5                31
    CD197        3               8                36
    CD164        4              10                23
    CD3          5               4                25
    HLA-ABC      6               2                32
    CD11c        7              19                10
    CD127CD25    8               9                24
    CD61         9              33                 2
    CD158K      10              12                29
    CD45        11              14                18
    TIGIT       12               6                28
    CD196       13              23                11
    CD20        14              17                17
    PD-1        15               7                35
    CD123       16              24                19
    TIM3        17              22                22
    CD16        18              20                21
    CD14        19              27                 4
    CD223       20              15                26
    CD4         21               3                33
    FcRL3       22              16                27
    CD56        23              21                16
    CD8/CD8a    24              26                12
    CD195       25              36                14
    CD19        26              25                13
    HLA-DR      27              28                 8
    CD27        28              11                30
    CD11b       29              29                 6
    CD45RA      30              35                 7
    CD38        31              30                 3
    CD7         32              32                 5
    CD95        33              18                20
    CD26        34              34                 9
    CCL17       35              13                15
    CD44        36              31                 1

    TABLE-US-00004 TABLE 3 Preferred biomarker panels

    Median accuracy  S.D. accuracy  Sensitivity  Specificity  Panel
    (10-fold CV,     (10-fold CV,   (validation  (validation
    discovery        discovery      cohort)      cohort)
    cohort)          cohort)
    0.83             0.12           0.91         0.86         CD28, CD197, CD3, HLA-ABC, CD194, CD26
    1.00             0.12           0.91         0.91         CD28, CD197, CD194, CD3, HLA-ABC, CD164, CD44, CD45RA, CD26, CD7
    0.79             0.30           1.00         0.32         CD28, CD197, CD4, CD3, HLA-ABC, CD27, TIGIT, PD-1, HLA-DR, CD14
    0.82             0.18           0.64         1.00         CD28, CD197, CD4, CD3, HLA-ABC, CD27, TIGIT, PD-1
    1.00             0.14           0.82         0.73         CD28, CD197, CD4, CD3, HLA-ABC, CD27, TIGIT, PD-1, CD26
    0.92             0.19           0.73         1.00         CD28, CD197, CD4, CD3, HLA-ABC, CD27, TIGIT, PD-1, CD164
    0.75             0.23           0.64         1.00         CD28, CD197, CD4, CD3, HLA-ABC, CD27, TIGIT, PD-1, CD158K
    0.79             0.23           0.80         1.00         CD28, CD197, CD4, CD3, HLA-ABC, CD27, TIGIT, PD-1, CD164, CD158K
    1.00             0.12           0.82         0.91         CD28, CD197, CD3, HLA-ABC, CD194, CD26, CD7
    0.92             0.10           1.00         0.86         CD28, CD197, CD3, HLA-ABC, CD194, CD26, CD44
    1.00             0.19           1.00         0.96         CD28, CD3, HLA-ABC, CD194, CD26, CD44, CD95, CD8/CD8a, CD164, CD195

    TABLE-US-00005 TABLE 4 Threshold values for biomarker panels

    Per-marker thresholds (marker, comparison, threshold):

    Panel  Marker thresholds
     1     CD28 >= 1.15; HLA-ABC >= 2.81
     2     HLA-ABC >= 3; CD3 >= 2.03
     3     CD28 >= 0.971; HLA-ABC >= 2.76; CD3 >= 2.01
     4     CD28 >= 0.995; HLA-ABC >= 2.66; CD3 >= 2.03; CD26 <= 1.72
     5     CD28 >= 0.971; HLA-ABC >= 2.76; CD4 >= 0.575; HLA-DR <= 1.13
     6     CD28 >= 0.971; CD44 <= 1.6; HLA-ABC >= 2.42; CD3 >= 1.73; CD26 <= 1.91
     7     CD28 >= 0.995; HLA-ABC >= 2.66; CD3 >= 1.72; CD26 <= 2.03; CD194 >= 0.674
     8     CD28 >= 0.971; CD44 <= 1.6; HLA-ABC >= 2.42; CD3 >= 1.73; CD26 <= 1.91; CD195 <= 1.41
     9     CD28 >= 0.8; HLA-ABC >= 2.76; CD3 >= 1.75; CD4 >= 0.575; CD164 >= 3.86; PD-1 <= 4.41
    10     CD28 >= 0.995; HLA-ABC >= 2.66; CD26 <= 1.72; CD3 >= 2.03; CD194 >= 0.674; CD7 <= 3.71
    11     CD28 >= 1.15; HLA-ABC >= 2.82; CCL17 <= 3.27; CD127 <= 3.51; CD16 <= 1.31; CD20 <= 1.26
    12     HLA-ABC >= 2.91; CD3 >= 2.03; CD11b <= 0.644; CD38 <= 0.971; CD56 <= 1.61; CD223 <= 2.01
    13     CD28 >= 1.15; HLA-ABC >= 2.78; CD45 >= 0.704; CD45 <= 4.4; CD196 <= 2.64; CD19 <= 1.23;
           FcRL3 <= 2.09; CD25 <= 3.88
    14     CD28 >= 1.26; TIGIT <= 3.08; CD61 <= 2.07; CD3 >= 1.31; TIM3 <= 1.32; CD11c <= 3.28;
           CD123 <= 1.45
    15     CD28 >= 0.971; HLA-ABC >= 2.76; HLA-DR <= 1.2; CD4 >= 0.0959; CD14 <= 1.72; CD3 >= 1.21;
           PD-1 <= 4.41
    16     CD28 >= 0.971; CD44 <= 1.59; HLA-ABC >= 2.42; CD3 >= 1.39; CD26 <= 1.91; CD194 >= 0.674;
           CD197 >= 0.0569
    17     CD28 >= 0.943; CD44 <= 1.59; HLA-ABC >= 2.21; CD3 >= 1.39; CD26 <= 1.91; CD164 >= 3.86;
           CD194 >= 0.674; CD45RA <= 4.39; CD7 >= 3.71
    18     CD28 >= 0.73; CD44 <= 1.62; HLA-ABC >= 2.34; CD3 >= 1.39; CD26 <= 1.91; CD195 <= 1.44;
           CD164 >= 4.39; CD8/CD8a <= 1.62; CD194 >= 0.546; CD95 <= 1.71
    19     CD28 >= 0.73; CD44 <= 1.62; HLA-ABC >= 2.09; CD26 <= 1.91; CD3 >= 1.39; CD195 <= 1.44;
           CD164 >= 4.39; CD8/CD8a <= 1.62; HLA-DR <= 1.22; CD194 >= 0.546; CD11b <= 0.716;
           CD38 <= 2.63; CD45 >= 0.478; CD95 <= 1.71; CD223 <= 2.01; CD45RA <= 4.64; CD14 <= 2.02;
           CD7 <= 3.71; CD123 <= 1.45; TIM3 <= 1.45; TIGIT <= 3.08; CD19 <= 1.37; FcRL3 <= 2.09

    Panel statistics:

           Per-marker thresholds                CellCnn
    Panel  Mean freq.  Mean freq.  P-value      Threshold  Mean freq.  Mean freq.  P-value
           non-SS      SS          Wilcox                  non-SS      SS          Wilcox
     1     1.12        10.61       0.00381      0.0286     0.108        2.47       1.49E-04
     2     0.975       11.57       0.0031       0.00514    0.00716      0.0145     0.0118
     3     1.05        11.56       0.0016       0.217      0.027        0.286      8.59E-09
     4     0.526       11.21       0.000149     0.0836     0.303        7.27       1.05E-05
     5     0.74        10.94       0.00332      0.345      0.345        7.1        8.56E-04
     6     0.346       11.51       5.87E-08     0.0762     0.844       16.44       4.45E-11
     7     0.499       10.79       0.000149     0.248      0.829       13.04       3.10E-07
     8     0.258       11.12       3.62E-09     0.115      0.0724       6.24       6.63E-10
     9     0.811       11.45       0.002879     0.0171     0.0958       2.63       0.0001349
    10     0.479       10.81       0.0001226    0.0986     0.348       11.6        6.63E-10
    11     0.885       10.19       0.00216      0.0567     0.0555       1.08       5.87E-08
    12     0.752       11.71       0.000856     0.102      0.0346       0.312      5.09E-10
    13     0.862        9.847      0.00609      0.415      0.03         0.568      4.52E-09
    14     1.93         9.69       0.00288      0.194      3.26        13.8        1.33E-05
    15     0.672       10.47       0.002495     0.075      0.039        0.928      2.65E-06
    16     0.301       11.09       1.58E-08     0.681      0.394       13.7        3.87E-10
    17     0.367       12.1        1.11E-09     0.338      0.226       13.8        6.24E-11
    18     0.185       11.53       4.45E-11     0.548      0.592       20.5        8.60E-10
    19     0.146       11.58       1.81E-09     0.205      0.273       17.8        6.40E-13

    TABLE-US-00006 TABLE 5 Detection frequency and sensitivity of panels of Table 4

    Method             # Sézary patients detected    Sensitivity
                       out of 15-patient cohort
    CD4/CD8 ratio      10                            0.666666667
    % CD4 CD7-          8                            0.533333333
    % CD4 CD26-        11                            0.733333333
    TCR clone (flow)   11                            0.733333333
    Panel 1             8                            0.533333333
    Panel 2             9                            0.6
    Panel 3             8                            0.533333333
    Panel 4            13                            0.866666667
    Panel 5             9                            0.6
    Panel 6            14                            0.933333333
    Panel 7            13                            0.866666667
    Panel 8            14                            0.933333333
    Panel 9            13                            0.866666667
    Panel 10           14                            0.933333333
    Panel 11           15                            1
    Panel 12           14                            0.933333333
    Panel 13           12                            0.8
    Panel 14           11                            0.733333333
    Panel 15           12                            0.8
    Panel 16           14                            0.933333333
    Panel 17           15                            1
    Panel 18           14                            0.933333333
    Panel 19           15                            1

    [0341] Using the algorithm as described herein, 15 markers were shortlisted (Table 6) based on their contribution to detecting Sézary cells in Sézary syndrome (SS) patients. Expression of these markers in SS patients as well as in control samples was measured by flow cytometry (FACS). Subsequently, the most preferred markers of a reduced set of markers were identified based on the FACS results. The analysis resulted in preferred panels of 9 and 10 markers, as shown in Tables 7 and 9, respectively.

    TABLE-US-00007 TABLE 6

    CD28, PD-1, CD4, CD3, HLA-ABC, TIGIT, CD197, CD27, CD26, CD44, CD194, CD7, CD45RA, CD158k, CD164

    TABLE-US-00008 TABLE 7

    Panel A: CD3, CD4, CD7, CD8/CD8a, CD26, CD27, CD45, CD45RA, TIGIT

    [0342] Three data analysis strategies for manual gating of the data were identified, wherein the strategies are used sequentially:

    [0343] Strategy 1: CD45+/CD3+ or CD3low/CD4+ broad/CD26−TIGIT+|Freq. of CD3+ or CD45+

    [0344] Positive SS, if frequency of selected cells is >=XX %

    [0345] If negative, continue with Strategy 2

    [0346] Strategy 2: CD45+/CD3+ or CD3low/CD4+ broad/CD7low|Freq. of CD3+ or CD45+

    [0347] Positive SS, if frequency of selected cells is >=XX %

    [0348] If negative, continue with Strategy 3

    [0349] Strategy 3: CD45+/CD3+ or CD3low/CD4+ broad/CD27+CD45RA+/TIGIT+|Freq. of CD27+CD45RA+

    [0350] Positive SS, if frequency of selected cells is >=XX %

    [0351] If negative, patient is negative for SS
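    The sequential use of the three gating strategies can be sketched as a simple cascade. This is an illustrative sketch only: the function name `classify_sequentially` and the data layout are hypothetical, and the cut-off values are left as parameters because the text gives them only as "XX %". The frequencies are assumed to have been obtained beforehand by manual gating.

```python
from typing import Dict, Optional, Sequence, Tuple

# Each step is (strategy name, cut-off in percent). The cut-offs must be
# supplied by the user; the source text does not disclose them ("XX %").
Step = Tuple[str, float]


def classify_sequentially(
    frequencies: Dict[str, float], steps: Sequence[Step]
) -> Optional[str]:
    """Apply the strategies in order and stop at the first positive one.

    `frequencies` maps a strategy name to the frequency (in percent) of the
    cells selected by that strategy's gate. Returns the name of the first
    strategy whose frequency reaches its cut-off, or None if every strategy
    is negative (sample called negative for SS).
    """
    for name, cutoff in steps:
        if frequencies[name] >= cutoff:
            return name
    return None


# Hypothetical usage with placeholder cut-offs (NOT values from the source).
example_steps = [("Strategy 1", 1.0), ("Strategy 2", 1.0), ("Strategy 3", 1.0)]
example_sample = {"Strategy 1": 0.2, "Strategy 2": 3.5, "Strategy 3": 0.0}
first_positive = classify_sequentially(example_sample, example_steps)
```

    Because a later strategy is only consulted when all earlier ones are negative, adding strategies can only raise sensitivity; whether specificity is preserved depends on the cut-offs chosen.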

    [0352] By applying these manual gating strategies, Panel A demonstrated the accuracy indicated in Table 8. For comparison, the specificity and sensitivity achieved by the CellCnn algorithm on the same dataset are also indicated.

    TABLE-US-00009 TABLE 8

    Strategy                                Specificity, %   Sensitivity, %
    Strategy 1                              100              76
    Strategy 1 + Strategy 2                 100              86
    Strategy 1 + Strategy 2 + Strategy 3    100              95
    CellCnn algorithm                       100              71

    [0353] Panel A performance was determined by analysing FACS data of 21 samples of SS patients and 52 control samples.
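    For orientation, the percentages in Table 8 follow the usual definitions of sensitivity and specificity, sketched below. The cohort sizes (21 SS samples, 52 control samples) are taken from the text; the count of 16 detected SS samples in the example is an assumption chosen because it rounds to the 76% reported for Strategy 1, since the source reports only percentages and not the underlying counts. The function names are hypothetical.

```python
def sensitivity_pct(true_positives: int, false_negatives: int) -> float:
    """Percentage of SS samples that were called positive."""
    return 100.0 * true_positives / (true_positives + false_negatives)


def specificity_pct(true_negatives: int, false_positives: int) -> float:
    """Percentage of control samples that were called negative."""
    return 100.0 * true_negatives / (true_negatives + false_positives)


# Cohort sizes stated in the text.
N_SS = 21
N_CONTROL = 52

# ASSUMPTION: 16 of 21 SS samples detected (illustrative count, not from the source).
detected = 16
strategy_1_sensitivity = sensitivity_pct(detected, N_SS - detected)

# A specificity of 100% on 52 controls corresponds to no false positives.
strategy_1_specificity = specificity_pct(N_CONTROL, 0)
```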

    TABLE-US-00010 TABLE 9

    Panel B: CD3, CD4, CD7, CD8/CD8a, CD26, CD45RO, TIGIT, CD194/CCR4, CD197/CCR7, CD279/PD-1