Identification of cognate pairs of ligands and receptors

Abstract

A method for identifying cognate pairs of a ligand species and a receptor species includes: co-compartmentalizing ligand species and receptor species to form a set of microreactors, each microreactor including a ligand species and preferably a receptor species; assaying the recognition between ligands and receptors in each microreactor and, based on this assay, classifying each microreactor as positive when a ligand species and a receptor species in the microreactor recognize one with the other, or as negative when no ligand species and receptor species recognize one with the other in the microreactor; identifying the ligand species and receptor species contained in each positive microreactor; establishing a subset of positive microreactors containing the same receptor species; determining the probability that the ligand species recognized by the receptor species corresponds to the most frequent co-compartmentalized ligand species; and, if the determined probability exceeds a threshold, identifying as a cognate pair the receptor species and the most frequent co-compartmentalized ligand species.

Claims

1. A method for identifying cognate pairs of a ligand species and a receptor species, comprising the following steps: providing a set of ligands comprising a plurality of ligand species, in which each ligand species is present more than one time; providing a set of receptors comprising at least one receptor species; compartmentalizing ligands of the set of ligands and receptors of the set of receptors to form a set of microreactors, wherein a plurality of the microreactors comprise at least one ligand species and at least one receptor species; assaying recognition between ligands and receptors within microreactors of the set of microreactors and, based on an assay readout within the microreactors, classifying microreactors of the set of microreactors as positive or negative, wherein a microreactor is classified as positive when at least one ligand and receptor in the microreactor recognize one with the other or as negative when no ligand and receptor recognize one with the other in the microreactor; identifying ligand species and receptor species contained in one or more positive microreactors; establishing a subset of positive microreactors each containing a first receptor species; determining a probability that in the subset of positive microreactors containing the first receptor species, the ligand species recognized by the first receptor species corresponds to the most frequent co-compartmentalized ligand species in the subset of positive microreactors; if the determined probability is greater than a predetermined threshold, identifying as a cognate pair the first receptor species and the most frequent co-compartmentalized ligand species in the subset of positive microreactors.

2. The method according to claim 1, wherein the probability is determined as a function of the diversity of ligand species, the average number of true positive microreactors for the subset of positive microreactors containing the first receptor species, and the average number of non-cognate ligand species contained in positive microreactors containing the first receptor species.

3. The method according to claim 2, wherein the average number of true positive microreactors for a given subset of positive microreactors containing the same receptor species, is determined according to the following expression:
l=fn, wherein: l is the average number of true positive microreactors in the given subset of positive microreactors containing the same receptor species; n is the number of microreactors containing said receptor species; and f is the average frequency of each ligand species per microreactor which is determined as the ratio between a number of ligand species per microreactor and the total diversity of ligands species.

4. The method according to claim 3, wherein the average number of non-cognate ligand species contained in positive microreactors is determined according to the following expression:
b=f(l+e), wherein: b is the average number of non-cognate ligand species contained in positive microreactors; and e is an average number of measurement errors within the subset of positive microreactors which is determined as a product of a rate of technical false positives due to the assaying and the number of microreactors containing said receptor species.

5. The method according to claim 1, wherein said probability is determined according to the following expression:
$P = \sum_{k=1}^{+\infty} \left( \int_{b}^{+\infty} \frac{t^{k-1} e^{-t}}{(k-1)!} \, dt \right)^{d-1} \frac{e^{-l} \, l^{k}}{k!}$, wherein: P is said probability for a given subset of positive microreactors containing the same receptor species; d is the diversity of ligand species; l is the average number of true positive microreactors in the given subset of positive microreactors containing the same receptor; and b is the average number of non-cognate ligand species contained in positive microreactors containing the same receptor species.

6. The method according to claim 1, wherein (a) receptors of the set of receptors are expressed by one or more cells, displayed on the surface of one or more cells or one or more beads, or are in vitro encoded; or (b) ligands of the set of ligands are expressed by one or more cells or displayed on the surface of one or more cells or one or more beads, or are in vitro encoded; or both (a) and (b).

7. The method according to claim 6, wherein the recognition between ligands and receptors in each microreactor is assayed by determining if a cellular response is induced in said microreactor, wherein a microreactor is classified as positive when an induced cellular response is determined in said microreactor or as negative when no induced cellular response is determined in said microreactor.

8. The method according to claim 1, wherein the set of receptors is a set of T cell receptors (TCR) and the set of ligands is a set of T cell antigens.

9. The method according to claim 6, wherein additional reagents are added to the positive microreactors, said additional reagents comprising one or more of a reverse transcriptase (RT), a cell lysis buffer, deoxynucleoside triphosphates (dNTPs), a plurality of barcoded primers specific for a nucleic acid sequence encoding ligands of the set of ligands, and a plurality of barcoded primers specific for a nucleic acid sequence encoding receptors of the set of receptors, wherein the barcoded primers specific for the ligand-encoding nucleic acid sequence comprise a primer sequence specific for the ligand-encoding nucleic acid sequence and a barcode sequence or barcode set of sequences, wherein the barcoded primers specific for the receptor-encoding nucleic acid sequence comprise a primer sequence specific for the receptor-encoding nucleic acid sequence and a barcode sequence or barcode set of sequences, and wherein the barcode sequence or barcode set of sequences contained in a microreactor is distinguishable from the barcode sequence or barcode set of sequences contained in other microreactors, but the barcoded primers specific for the ligand-encoding nucleic acid sequence and for the receptor-encoding nucleic acid sequence contained in a given microreactor carry a common barcode sequence or barcode set of sequences.

10. The method according to claim 9, wherein said barcoded primers are delivered on particles, wherein each particle carries a barcode sequence or barcode set of sequences distinguishable from barcode sequences or barcode sets of sequences carried by other particles, and each microreactor contains a single particle or between 2 and 10 particles.

11. The method according to claim 9, wherein in the positive microreactors, barcoded cDNAs are prepared by: lysing the cells expressing or displaying receptors and/or the cells expressing or displaying ligands, to release mRNA from the cells, hybridizing at least some of the released mRNA coding for the receptor to the receptor-encoding nucleic acid sequence specific barcoded primer, and at least some of the released mRNA coding for the ligand to the ligand-encoding nucleic acid sequence specific barcoded primer, in at least some of the microreactors, reverse transcribing the released mRNA hybridized to the barcoded primers, thereby obtaining barcoded cDNAs.

12. The method according to claim 8, wherein the set of TCR and the set of T cell antigens are from a subject of interest suffering from cancer, autoimmune disease, inflammatory disease, infectious disease, or metabolic disease.

13. The method of claim 1, wherein each microreactor comprises no more than one receptor species.

14. The method according to claim 2, wherein said probability is determined according to the following expression:
$P = \sum_{k=1}^{+\infty} \left( \int_{b}^{+\infty} \frac{t^{k-1} e^{-t}}{(k-1)!} \, dt \right)^{d-1} \frac{e^{-l} \, l^{k}}{k!}$, wherein: P is said probability for the given subset of positive microreactors containing the same receptor species; d is the diversity of ligand species; l is the average number of true positive microreactors in the given subset of positive microreactors containing the same receptor; and b is the average number of each non-cognate ligand species contained in positive microreactors containing the same receptor species.

15. The method according to claim 3, wherein said probability is determined according to the following expression:
$P = \sum_{k=1}^{+\infty} \left( \int_{b}^{+\infty} \frac{t^{k-1} e^{-t}}{(k-1)!} \, dt \right)^{d-1} \frac{e^{-l} \, l^{k}}{k!}$, wherein: P is said probability for the given subset of positive microreactors containing the same receptor species; d is the diversity of ligand species; l is the average number of true positive microreactors in the given subset of positive microreactors containing the same receptor; and b is the average number of each non-cognate ligand species contained in positive microreactors containing the same receptor species.

16. The method according to claim 4, wherein said probability is determined according to the following expression:
$P = \sum_{k=1}^{+\infty} \left( \int_{b}^{+\infty} \frac{t^{k-1} e^{-t}}{(k-1)!} \, dt \right)^{d-1} \frac{e^{-l} \, l^{k}}{k!}$, wherein: P is said probability for the given subset of positive microreactors containing the same receptor species; d is the diversity of ligand species; l is the average number of true positive microreactors in the given subset of positive microreactors containing the same receptor; and b is the average number of each non-cognate ligand species contained in positive microreactors containing the same receptor species.

17. The method according to claim 1, wherein the set of receptors is a set of T cell receptors and the set of ligands is a set of T cell antigens bound to major histocompatibility complex (MHC) displayed on the surface of antigen-presenting cells (APCs).

18. The method of claim 17, further comprising obtaining the APCs by introducing a library of nucleic acids encoding T cell antigens into APCs.

19. The method of claim 17, further comprising obtaining the APCs by introducing into APCs a library of synthetic mRNAs encoding antigens, optionally wherein said mRNAs are identified by sequencing the genome, exome or transcriptome of a tumor.

20. The method according to claim 17, wherein the set of T cell receptors is displayed on the surface of T cells.

21. A method for treating cancer, inflammatory disease, autoimmune disease, infectious disease, or metabolic disease in a subject in need thereof comprising the steps of: A) providing a set of ligands comprising a plurality of ligand species, in which each ligand species is present more than one time; B) providing a set of receptors comprising at least one receptor species; C) compartmentalizing ligands and receptors to form a set of microreactors, wherein a plurality of the microreactors comprise at least one ligand species and at least one receptor species; D) assaying recognition between ligands and receptors within microreactors of the set of microreactors and, based on an assay readout within the microreactors, classifying microreactors of the set of microreactors as positive or negative, wherein a microreactor is classified as positive when at least one ligand and receptor in the microreactor recognize one with the other or as negative when no ligand and receptor recognize one with the other in the microreactor; E) identifying ligand species and receptor species contained in one or more positive microreactors; F) establishing a subset of positive microreactors containing a first receptor species; G) determining a probability that in the subset of positive microreactors containing the first receptor species, the ligand species recognized by the first receptor species corresponds to the most frequent co-compartmentalized ligand species; H) if the determined probability is greater than a predetermined threshold, identifying as a cognate pair the first receptor species and the most frequent co-compartmentalized ligand species, wherein: (i) an antigen is identified as being part of a TCR/antigen pair, and/or (ii) a T cell expressing a TCR is identified as being part of a TCR/antigen pair, and/or (iii) a TCR is identified as being part of a TCR/antigen pair, and/or (iv) an ex-vivo engineered immune cell expressing either an antigen and/or a TCR is identified as being part of a TCR/antigen pair; and administering to the subject a therapeutically effective amount of a composition comprising said identified antigen, T cell expressing a TCR, TCR, and/or said ex-vivo engineered immune cell.

Description

BRIEF DESCRIPTION OF THE FIGURES

(1) FIG. 1: Example of workflow

(2) FIG. 2: Statistical analysis of the number of measurements per TCR clone required to unequivocally identify the TCR recognizing a specific antigen. The number of measurements depends on the number of antigens per APC (the MOI) and the total number of antigens (the antigen diversity).

(3) FIG. 3: Error rate of pair mis-assignment, calculated for typical values of the experimental parameters. From left to right: detection error, l and b computed as a function of the MOI, for a TCR present in 1% of a population of 1 million cells, a technical error rate of 1% and an antigen diversity of 50,000.
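
The quantities named in FIG. 2 and FIG. 3 can be explored numerically. The Python sketch below is an illustration only, not the procedure used to produce the figures. It assumes the expressions of the claims: l = f·n, b = f·(l + e) with e equal to the technical false-positive rate times n, and a probability obtained by summing, over k ≥ 1, the Poisson(l) weight of k multiplied by the probability that each of the d − 1 non-cognate ligand species, each taken as Poisson(b), occurs fewer than k times (for integer k, the integral in the expression equals this Poisson cumulative probability). The function names, the reading of f as MOI divided by diversity (one APC per microreactor) and the reading of the detection error as 1 − P are assumptions made for the example.

```python
import math


def model_parameters(moi, diversity, n, false_positive_rate):
    """Return (f, l, e, b); assumes `moi` ligand species per microreactor."""
    f = moi / diversity            # average frequency of each ligand species
    l = f * n                      # average number of true positive microreactors
    e = false_positive_rate * n    # average number of measurement errors
    b = f * (l + e)                # average count of each non-cognate species
    return f, l, e, b


def poisson_cdf(k_max, mean):
    """P(X <= k_max) for X ~ Poisson(mean), by direct summation."""
    term = math.exp(-mean)
    total = term
    for j in range(1, k_max + 1):
        term *= mean / j
        total += term
    return min(total, 1.0)


def cognate_probability(d, l, b, tol=1e-12):
    """Probability that the cognate ligand is the most frequent one.

    The infinite sum is truncated once the Poisson(l) weight is below `tol`.
    Very large l (several hundred) would underflow exp(-l) in this sketch.
    """
    pmf = math.exp(-l)  # Poisson(l) weight at k = 0
    p = 0.0
    k = 0
    while True:
        k += 1
        pmf *= l / k
        # Upper incomplete gamma ratio for integer k = P(Poisson(b) <= k - 1)
        p += poisson_cdf(k - 1, b) ** (d - 1) * pmf
        if k > l and pmf < tol:
            break
    return p


if __name__ == "__main__":
    # Values quoted for FIG. 3: 1% of 1 million cells, 1% error, diversity 50,000
    for moi in (1, 5, 25, 100):
        f, l, e, b = model_parameters(moi, 50_000, 10_000, 0.01)
        p = cognate_probability(50_000, l, b)
        print(f"MOI={moi:>3}  l={l:.3g}  b={b:.3g}  detection error={1 - p:.3g}")
```

Scanning n for a fixed MOI and diversity until the probability exceeds a chosen threshold gives the kind of "number of measurements required" estimate that FIG. 2 refers to.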

EXAMPLES

(4) Multiple variations for identifying cognate antigen/receptor pairs are possible and are described in Table 1.

(5) Table 1 shows examples of strategies for linking antigen and TCR sequence recovery. The table exemplifies the source of material, the method (if any) for mRNA isolation from tissue, the method for generating antigen cDNA (if any), the method of cDNA normalization (if any), the source of APCs, the method for antigen expression, the method for antigen barcoding, the method for inducing antigen expression by APCs, the methods for co-encapsulating T cells and APCs, the methods for detecting T cell activation, the methods for enriching positive hits, the method for barcoding TCR and/or antigen and/or gene of interest, the method for recovering cDNA, the method for amplifying cDNA, and the methods for sequencing.

(6) TABLE 1: Variations for identifying cognate antigen/receptor pairs (each step is followed by its options, separated by "|")
Tissue sample: Solid tumor | Liquid tumor | Circulating tumor cells | Circulating tumor DNA | Draining lymph nodes | Ascites | Other effusion | Normal sample for library normalization | Class I and/or II expression
mRNA isolation: RNA extraction | None (RT from cell lysate) | None (no RT)
Reverse transcription: Total mRNA | Specific mRNA | Identification of potential genes encoding for T cell antigens by using sequencing to determine the genome, exome or transcriptome of a tumor | Template switch for full length cDNA | Incl. UMI (molecular count) | None (DNA sequencing)
Normalization: Equalization of all cDNA species using duplex specific nuclease | None
APC: Artificial (beads of any type) | Engineered (K562, immortalized cells . . . ) | EBV transformed autologous B cells | Primary APC | Barcoded tetramer multimer
Antigen expression system: Minigenes in vectors | Multiple vectors | Split pool synthesis mediated vector construction (with bead support +/ release of beads at the end of the process) | Transcribed RNA from examples above | Synthetic RNA | Number of antigens per APC: from 1 to 50 genes, ideally 25 | Antigen length (from 20 nt to 100 nt)
Antigen barcoding: Barcode per antigen (i.e. tag from 1 nt to several nt) | Barcode per combination of antigen | None (antigen sequence sequenced in part or fully) | None (case 1 antigen)
Transduction of APC: Transfection in solution | Transduction in solution | Transfection in droplet | Transduction in droplet | MOI of 0.1 to 1000 | Penetratin synthetic DNA released in droplet from beads used for the split pool synthesis
T cell/APC co-encapsulation: Double Poisson statistic | Complex of T cells and APC pre-formed before being encapsulated in droplet | One follows Poisson statistics, the other does not follow a Poisson distribution | None of them | Droplet from 40 to 1 nL volume | Fusion of 2 droplets
Detection of activation: Secretion of cytokine (IFNg, TNFa), detection using beadline or cell surface | Degranulation detection (perforin, granzyme) using fluorescent antibodies | Activation marker detection (CD137, CD69, HLA-DR, . . . ) using fluorescent antibodies | None (based on sequencing: gene specific primer, whole transcriptome)
Enrichment of positive hits: Droplet sorting using DEP | Droplet sorting using acoustic wave | Magnetophoresis | Pneumatic controllers | None (use of sequencing readout)
Barcoding: One to one electrocoalescence | Droplet fusion by alternating current (AC) field electrocoalescence in microchannels | Gene linkage (TCR and antigen barcode/sequence), RT (+/PCR) | Hydrogel beads, cDNA synthesis (+/PCR) | Solid beads, cDNA synthesis (+/PCR) | Hydrogel beads, mRNA capture and cDNA synthesis out of droplet (+/PCR) | Solid beads, RNA capture and cDNA synthesis out of droplet (+/PCR)
cDNA recovery: Emulsion breaking with PFO | Emulsion breaking with HFE/PFO | Emulsion breaking with electrocoalescence | Coalescence
Amplification: Nested PCR | PCR | Unbiased amplification (IVT)
Sequencing: Illumina | PacBio | Ion Torrent | Nanopore | . . .
Analysis: TCR chains | Gene panel | Antigen | Whole transcriptome
Pair recovery: See example 10

(7) Below are specific examples.

Example 1 (Sorting and Sequencing)

(8) Cognate pairs of tumor T cell antigens and T cell receptors are identified using the following method.
1. Isolation of T cells and tissue from patients (in particular from tumor tissue using protease and DNase in CO2-independent medium for 45 min at room temperature or 37 °C, or from lymph node, ascites or other effusions as well as blood), isolation of T cells and isolation of tissue cells expressing MHC class II and/or class I molecules.
2. Isolation of mRNA from the tissue and reverse transcription to produce cDNA. Addition of a sense and antisense universal primer sequence to allow the specific amplification of cDNA during subsequent RT and PCR, i.e. allowing specific recovery of cDNA encoding antigen in steps 8 and 9.
3. Equalisation of the concentrations of all cDNA species. The cDNA library is normalized using duplex-specific nuclease (DSN; Evrogen) according to the manufacturer's instructions (Zhulidov et al. (2004). Nucleic Acids Res. 32 (3): e37; Bogdanova et al. (2010). Curr. Protoc. Mol. Biol. Chapter 5: Unit 5.12.1-27).
4. Preparation of EBV-transformed autologous B cells.
5. Transduction of the EBV-transformed autologous B cells with the cDNA library (using for example a lentiviral system) at a multiplicity of infection (m.o.i.) of 10-1000 to generate antigen-presenting cells (APCs) presenting multiple antigens. A selective marker or reporter gene is integrated into the lentiviral vector sequence in order to select/sort transduced APCs.
6. Co-compartmentalisation of 1-10 transfected B cells (APCs) with single T cells in aqueous droplets using a microfluidic system. The transfected B cells are mixed with the T cells in X-VIVO 15 (Lonza) and human serum so as to give an average (λ) of 1-10 B cells and 0.1-1 T cells per droplet after compartmentalization in droplets. Droplets are created by hydrodynamic flow-focusing (Anna et al. (2003) Applied Physics Letters, 82 (3), 364-366) on a microfluidic device with a nozzle 25 µm wide, 40 µm deep, and 40 µm long (Eyer et al. Nat Biotechnol (2017)), fabricated using soft lithography in poly(dimethylsiloxane) (PDMS) (Duffy et al. (1998) Anal. Chem., 70:4974-4984) as described in Mazutis et al. (2013) Nat. Protocols 8:870-891. The continuous phase comprises 2% (w/w) 008-FluoroSurfactant (RAN Biotechnologies) in Novec HFE7500 (3M) fluorinated oil. The aqueous cell suspension and an aqueous solution comprising paramagnetic colloidal nanoparticles and other detection reagents (see below) are co-flowed on-chip. The flow rates (around 800 µL/h for oil, and 100 µL/h for each aqueous solution, supplied using a neMESYS syringe pump, Cetoni) are adjusted to create droplets of 40 ± 3 pL.
7. Detection of T cell activation in the droplets and sorting of the droplets containing activated T cells. To detect T cells secreting TNF-α and IFN-γ, a fluorescent sandwich immunoassay is used, in which secreted TNF-α and/or IFN-γ are captured onto paramagnetic colloidal nanoparticles coated with anti-TNF-α and/or anti-IFN-γ capture antibodies in each droplet (Eyer et al. Nat Biotechnol (2017)). Upon application of a magnetic field, the 1,300 nanoparticles in each droplet form an elongated aggregate (termed a beadline). Each droplet also contains red fluorescent anti-TNF-α antibodies, whose epitopes do not overlap with the capture reagent and which relocate onto the beadline if TNF-α is secreted, and/or green fluorescent anti-IFN-γ antibodies, whose epitopes do not overlap with the capture reagent and which relocate onto the beadline if IFN-γ is secreted. The distribution of fluorescence in the droplets is analyzed by re-injecting them into a second microfluidic chip, where each droplet is excited with superimposed laser lines (for example, 405 nm and/or 488 nm, and/or 561 nm, and/or 638 nm) and epifluorescence is detected using photomultiplier tubes (PMTs). Secretion of TNF-α and IFN-γ is determined from red- and green-fluorescence localization to the beadline, respectively. The bioassay readout is monitored, and fluorescence-activated dielectrophoretic sorting (Baret et al. (2009). Lab Chip, 9:1850-1858) of droplets containing activated T cells and co-compartmentalized B cells is controlled by dedicated software.
8. Addition of unique (droplet-specific) barcoded cDNA primers to each droplet by one-to-one electrocoalescence (Chabert et al. (2005). Electrophoresis, 26 (19), 3706-3715) with 1 nL droplets containing single hydrogel beads (Abate et al. (2009). Lab on a Chip, 9 (18), 2628), cell lysis reagent and reverse transcription reagents, produced as described (Klein et al. (2015). Cell, 161 (5), 1187-1201; Zilionis et al. (2016). Nature Protocols, 12 (1), 44-73). Each bead carries multiple copies of cDNA primers able to prime cDNA synthesis on the mRNA encoding the antigens and the mRNA encoding the TCR α and β chains, and primers on the same bead carry a bead-specific barcode. The beads are produced by split-and-pool synthesis as described (Klein et al. (2015). Cell, 161 (5), 1187-1201; Zilionis et al. (2016). Nature Protocols, 12 (1), 44-73). After droplet fusion, the cells are lysed, the primers are released from the hydrogel beads by UV photocleavage and cDNA synthesis is performed as described (Klein et al. (2015). Cell, 161 (5), 1187-1201; Zilionis et al. (2016). Nature Protocols, 12 (1), 44-73).
9. Breaking of the emulsion and pooling of the barcoded cDNAs (encoding TCR and antigen). The emulsion containing the barcoded cDNA is broken by adding one volume of 1H,1H,2H,2H-Perfluoro-1-octanol (370533, Sigma). The cDNAs from each droplet all carry the same barcode. The barcoded cDNAs are specifically amplified by a nested PCR approach using forward primers specific for the barcoded cDNA primers and reverse primers specific for each TCR α/β V gene and for the constant region flanking the antigens, as described in Han et al. (2014). Nature Biotechnology, 32(7), 684-692, for TCR pair recovery and gene amplification.
10. The barcoded cDNAs are sequenced with the Illumina system, using 2×150 bp paired-end reads for sequencing TCR α/β and recovering V(D)J sequences (i.e. capturing at least the CDR3 of both α and β sequences), and 250 bp paired-end reads for antigen identification.
11. Analysis of the sequencing data to identify the pools of antigens recognised by single TCR α and β chains from activated T cells. Bioinformatic data processing is used for sequence read trimming, merging, barcode extraction and clustering, and sequence characterization and filtering. Antigen consensus reads passing threshold from B cells co-compartmentalized with activated T cells and the TCR α and β chain consensus reads passing threshold from the T cells carry the same barcode.
12. Identification of cognate antigen-TCR pairs as in Example 10.
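
Step 6 fixes the mean number of B cells and T cells per droplet. If, as an assumption, the two cell types load independently and each follows Poisson statistics, the fraction of droplets that contain exactly one T cell together with at least one APC follows directly. The Python sketch below illustrates this; the function name and the scanned values are illustrative, and real loading can deviate from the Poisson model.

```python
import math


def usable_droplet_fraction(mean_t_cells, mean_b_cells):
    """Fraction of droplets with exactly one T cell and at least one B cell.

    Assumes independent Poisson loading of the two cell types.
    """
    p_one_t = mean_t_cells * math.exp(-mean_t_cells)   # P(exactly 1 T cell)
    p_some_b = 1.0 - math.exp(-mean_b_cells)           # P(at least 1 B cell)
    return p_one_t * p_some_b


if __name__ == "__main__":
    # Scan the ranges given in step 6 (0.1-1 T cells, 1-10 B cells per droplet)
    for lam_t in (0.1, 0.5, 1.0):
        for lam_b in (1, 5, 10):
            frac = usable_droplet_fraction(lam_t, lam_b)
            print(f"mean T={lam_t:<4} mean B={lam_b:<3} usable fraction={frac:.3f}")
```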

Example 2 (Sorting Only + Barcoded cDNA)

(9) Cognate pairs of tumor T cell antigens and T cell receptors are identified using the method described in Example 1, except that step 2 is replaced by the following step:
2. Isolation of mRNA from the tissue and reverse transcription using barcoded primers to produce cDNA, which allows subsequent identification of the cDNA by sequencing in steps 10 and 11. Addition of a sense and antisense universal sequence to allow the specific amplification of cDNA during subsequent RT and PCR, i.e. allowing specific recovery of cDNA encoding antigen in steps 8 and 9.

Example 3 (without Droplet Sorting)

(10) Cognate pairs of tumor T cell antigens and T cell receptors are identified using the method described in examples 1 and 2, except that step 7 is deleted and steps 8 to 11 are replaced by the following steps.
8. Addition of unique (droplet-specific) barcoded cDNA primers to each droplet by one-to-one electrocoalescence (Chabert et al. (2005). Electrophoresis, 26 (19), 3706-3715) with 1 nL droplets containing single hydrogel beads (Abate et al. (2009). Lab on a Chip, 9 (18), 2628), cell lysis reagent and reverse transcription reagents, produced as described (Klein et al. (2015). Cell, 161 (5), 1187-1201; Zilionis et al. (2016). Nature Protocols, 12 (1), 44-73). Each bead carries multiple copies of cDNA primers able to prime cDNA synthesis on the mRNA encoding the antigens, on the mRNA encoding the TCR α and β chains and on a panel of mRNAs encoding activation markers (IFNg, TNFa, CD69, HLA-DR, CD137, GRZM, PRF, CD25, OX40, CD38), and primers on the same bead carry a bead-specific barcode. The beads are produced by split-and-pool synthesis as described (Klein et al. (2015). Cell, 161 (5), 1187-1201; Zilionis et al. (2016). Nature Protocols, 12 (1), 44-73). After droplet fusion, the cells are lysed, the primers are released from the hydrogel beads by UV photocleavage and cDNA synthesis is performed as described (Klein et al. (2015). Cell, 161 (5), 1187-1201; Zilionis et al. (2016). Nature Protocols, 12 (1), 44-73).
9. Breaking of the emulsion and pooling of the barcoded cDNAs (encoding TCR, antigen and activation markers). The emulsion containing the barcoded cDNA is broken by adding one volume of 1H,1H,2H,2H-Perfluoro-1-octanol (370533, Sigma). The cDNAs from each droplet all carry the same barcode. The barcoded cDNAs are specifically amplified by a nested PCR approach using forward primers specific for the barcoded cDNA primers and reverse primers specific for each TCR α/β V gene and for the constant region flanking the antigens, as described in Han et al. (2014). Nature Biotechnology, 32(7), 684-692, for TCR pair recovery and gene amplification.
10. The barcoded cDNAs are sequenced with the Illumina system, using 2×150 bp paired-end reads for sequencing TCR α/β and recovering V(D)J sequences (i.e. capturing at least the CDR3 of both α and β sequences), and 250 bp paired-end reads for antigen identification and sequencing of the activation marker mRNAs.
11. Analysis of the sequencing data to identify the pools of antigens recognised by single TCR α and β chains from activated T cells. Bioinformatic data processing is used for sequence read trimming, merging, barcode extraction and clustering, and sequence characterization and filtering. Antigen consensus reads passing threshold from B cells co-compartmentalized with activated T cells, the TCR α and β chain consensus reads passing threshold and the activation marker reads from the T cells carry the same barcode.
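
Step 11 relies on all reads from one droplet sharing a barcode, with activation called from the marker reads instead of from sorting. A minimal Python sketch of that grouping is given below. It is an illustration under stated assumptions: the read layout `(barcode, kind, name)`, the `min_marker_reads` cutoff, the treatment of a TCR α/β pair as a single clonotype name and all identifiers are hypothetical, and a real pipeline would start from trimmed, clustered consensus reads.

```python
from collections import Counter, defaultdict


def group_by_barcode(reads):
    """Group reads into droplets keyed by barcode.

    reads: iterable of (barcode, kind, name), kind in {"tcr", "antigen", "marker"}.
    """
    droplets = defaultdict(lambda: {"tcr": set(), "antigen": set(), "marker": 0})
    for barcode, kind, name in reads:
        if kind == "marker":
            droplets[barcode]["marker"] += 1   # count activation marker reads
        else:
            droplets[barcode][kind].add(name)
    return droplets


def antigen_counts_per_tcr(droplets, min_marker_reads=1):
    """For each TCR, count how often each antigen shares a positive droplet."""
    counts = defaultdict(Counter)
    for content in droplets.values():
        if content["marker"] < min_marker_reads:
            continue  # no activation signal: treated as a negative droplet
        for tcr in content["tcr"]:
            counts[tcr].update(content["antigen"])
    return counts


if __name__ == "__main__":
    example_reads = [
        ("bc1", "tcr", "TCR1"), ("bc1", "antigen", "agA"),
        ("bc1", "antigen", "agB"), ("bc1", "marker", "IFNG"),
        ("bc2", "tcr", "TCR1"), ("bc2", "antigen", "agA"),
        ("bc2", "antigen", "agC"), ("bc2", "marker", "CD69"),
        ("bc3", "tcr", "TCR1"), ("bc3", "antigen", "agD"),  # no marker read
    ]
    counts = antigen_counts_per_tcr(group_by_barcode(example_reads))
    print(counts["TCR1"].most_common())
```

The most frequent antigen per TCR obtained this way is only a candidate; whether it is accepted as the cognate antigen depends on the probability test of the method.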

Example 4: Sorting and Sequencing of the Activation Markers

(11) Cognate pairs of tumor T cell antigens and T cell receptors are identified as in example 3, except that step 7 from examples 1 and 2 is not deleted.

Example 5 (Synthetic mRNA Coding for Identified Candidate Antigens)

(12) Cognate pairs of tumor T cell antigens and T cell receptors are identified as in examples 1 to 4, except that steps 2 to 5 are replaced by the following steps.
2. Identification of potential genes encoding T cell antigens by using sequencing to determine the genome, exome or transcriptome of a tumor.
3. Transfection of antigen-presenting cells (APCs) with synthetic mRNAs (either as tandem genes or as a single gene) encoding the antigens identified by sequencing in step 2. The synthetic RNAs are generated by in vitro transcription and are subsequently electroporated into the APCs/B cells (see Sahin et al. (2017) Nature 547:222-226).

(13) The synthetic RNA may optionally contain an antigen barcode and/or a universal sequence used to amplify the RNA, in order to retrieve the antigen-specific information during the functional readout (either by combining phenotypic screening and sequencing, or by sequencing only).

Example 6 (Split and Pool with the Individual mRNA)

(14) Cognate pairs of tumor T cell antigens and T cell receptors are identified as in examples 1-4, except that steps 2 to 5 are replaced by the following steps.
2. Identification of potential genes encoding T cell antigens by using sequencing to determine the genome, exome or transcriptome of a tumor.
3. Making 384 synthetic photo-cleavable 5′-biotinylated RNAs, corresponding to fragments of the genes coding for candidate T cell antigens.
4. Distributing the synthetic RNAs into a 384-well plate, at a concentration of 0.33 µM in washing and binding buffer (100 mM Tris pH 7.4, 0.1% v/v Tween 20) and in a volume of 50 µL per well (to occupy less than 1/24th of the streptavidin on the beads in step 5).
5. Hydrogel beads carrying photo-cleavable 5′-biotinylated RNAs are produced by split-and-mix synthesis using a method adapted from that previously described (Zilionis et al. (2017) Nat Protoc 12, 44-73; Klein et al. (2015) Cell 161, 1187-1201). 60 µm diameter polyethylene glycol diacrylate (PEG-DA) hydrogel beads containing streptavidin acrylamide are produced using a microfluidic device essentially as described in Zilionis et al. (2017) Nat Protoc 12, 44-73. The 160 pL droplets are produced at a frequency of 4.5 kHz and exposed at 200 mW/cm² to a 365 nm UV light source (OmniCure ac475-365) to trigger gel bead polymerization. Recovered gel beads are washed 10 times with washing and binding buffer (100 mM Tris pH 7.4, 0.1% v/v Tween 20) and resuspended in the same buffer. Each bead has a binding capacity of 10⁷ biotinylated RNAs. One million PEG-DA-streptavidin beads are added, in a volume of 50 µL, to each well in the first column of the 384-well plate containing the synthetic photo-cleavable 5′-biotinylated RNAs and incubated for 60 min at room temperature to allow binding. At this point less than 1/24th of the available biotin binding sites on the beads are occupied.
6. Recovering the contents of each well and washing the beads three times with washing and binding buffer (100 mM Tris pH 7.4, 0.1% v/v Tween 20).
7. Pooling the washed beads in 500 µL of washing and binding buffer (100 mM Tris pH 7.4, 0.1% v/v Tween 20).
8. Redistributing the beads into each well of the second row of the 384-well plate.
9. Repeating steps 4 to 7, after each step re-distributing the beads into each well of the next row of the plate.
10. Recovering the contents of each well from the last row of the plate and washing the beads three times with 500 µL of washing and binding buffer (100 mM Tris pH 7.4, 0.1% v/v Tween 20).
11. Pooling the washed beads in 200 µL of nuclease-free water. At this point each bead carries 24 different RNAs and each RNA is present on one bead in 16, but as there are 8²⁴ = 4.7 × 10²¹ possible permutations of the 24 RNAs on different beads, every bead has a different permutation of the 24 RNAs.
12. Single B cells are co-compartmentalized in 1 nL volume droplets with single hydrogel beads carrying RNAs and transfection reagents (Lipofectamine MessengerMAX; ThermoFisher Scientific) using a microfluidic device as described (Zilionis et al. (2017) Nat Protoc 12, 44-73). The droplets are collected in a 1.5 mL tube containing HFE-7500 and 0.1% surfactant, UV photo-cleaved for 90 seconds (OmniCure ac475-365) and incubated at room temperature for 5 min to transfect the cells with the RNA.
13. Transfected cells are recovered by addition of 100 µL of X-VIVO 15 supplemented with 5% human serum, followed by 100 µL of 1H,1H,2H,2H-Perfluoro-1-octanol (370533, Sigma) and gentle mixing.
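
The diversity figure quoted in step 11 can be checked, and the split-and-pool procedure simulated at small scale, with the Python sketch below. The simulation parameters (bead count, wells per round, number of rounds, random seed) are illustrative assumptions and not values prescribed by the protocol; only the 8²⁴ figure is taken from the text. Each simulated bead receives one RNA per round, chosen by the well it lands in.

```python
import random
from collections import Counter


def split_and_pool(n_beads, wells_per_round, rounds, seed=0):
    """Simulate split-and-pool loading.

    In each round every bead is assigned at random to one well and picks up
    the RNA of that well; beads are then pooled and split again.
    Returns one tuple of (round, well) pairs per bead.
    """
    rng = random.Random(seed)
    beads = [[] for _ in range(n_beads)]
    for round_index in range(rounds):
        for bead in beads:
            well = rng.randrange(wells_per_round)
            bead.append((round_index, well))
    return [tuple(bead) for bead in beads]


if __name__ == "__main__":
    # Number of possible combinations quoted in the text
    print(f"8**24 = {8**24:.2e}")

    beads = split_and_pool(n_beads=10_000, wells_per_round=8, rounds=24)
    shared = sum(count - 1 for count in Counter(beads).values())
    print(f"RNAs per bead: {len(beads[0])}")
    print(f"beads sharing a combination with another bead: {shared}")
```

With so many possible combinations relative to the number of beads, two simulated beads carrying the same combination is extremely unlikely, which is the point made in step 11.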

Example 7. Penetratin Based Delivery of DNA Allowing mRNA Translation

(15) Cognate pairs of tumor T cell antigens and T cell receptors are identified as in example 6, except that steps 3 and 12 are replaced by the following steps.
3. Making 384 synthetic photo-cleavable 5′-biotinylated RNAs, corresponding to fragments of the genes coding for candidate T cell antigens, and coupling these to the cell-penetrating peptide penetratin.
12. Single B cells in X-VIVO 15 are co-compartmentalized in 1 nL volume droplets with single hydrogel beads carrying RNAs using a microfluidic device as described (Zilionis et al. (2017) Nat Protoc 12, 44-73). The droplets are collected in a 1.5 mL tube containing HFE-7500 and 0.1% surfactant, UV photo-cleaved for 90 seconds (OmniCure ac475-365) and incubated at room temperature for 5 min to transfect the cells with the RNA.

Example 8 (Individual DNA Made on Beads and Transcribed into mRNA and Transfected in Drop)

(16) Cognate pairs of tumor T cell antigens and T cell receptors are identified as in examples 1-4, except that steps 2-5 are replaced by the following steps.
2. Identification of potential genes encoding T cell antigens by using sequencing to determine the genome, exome or transcriptome of a tumor.
3. Making 384 synthetic 5′-biotinylated DNAs, comprising fragments of the genes coding (plus strand) for T cell antigens with an upstream T7 RNA polymerase promoter.
4. Making 384 synthetic DNAs, complementary to the oligonucleotides from step 2, and annealing them to the oligonucleotides from step 2.
5. Distributing the double-stranded synthetic DNAs from step 3 into a 384-well plate, at a concentration of 0.33 µM in washing and binding buffer (100 mM Tris pH 7.4, 0.1% v/v Tween 20) and in a volume of 50 µL per well (to occupy less than 1/24th of streptavidin on beads in step 4).
6. Hydrogel beads carrying double-stranded synthetic DNAs are produced by split-and-mix synthesis using a method adapted from that previously described (Zilionis et al. (2017) Nat Protoc 12, 44-73; Klein et al. (2015) Cell 161, 1187-1201). 60 µm diameter polyethylene glycol diacrylate (PEG-DA) hydrogel beads containing streptavidin acrylamide are produced using a microfluidic device essentially as described in Zilionis et al. (2017) Nat Protoc 12, 44-73. The 160 pL droplets were produced at a frequency of 4.5 kHz and were exposed at 200 mW/cm² to a 365 nm UV light source (OmniCure ac475-365) to trigger gel bead polymerization. Recovered gel beads are washed 10 times with washing and binding buffer (100 mM Tris pH 7.4, 0.1% v/v Tween 20) and resuspended in the same buffer. Each bead has a binding capacity of $10^{7}$ biotinylated DNAs. One million PEG-DA-streptavidin beads are added, in a volume of 50 µL, to each well in the first column of the 384-well plate containing the double-stranded synthetic DNAs and incubated for 60 min at room temperature to allow binding. At this point less than 1/24th of the available biotin binding sites on the beads are occupied.
7. Recovering the contents of each well and washing the beads three times with washing and binding buffer (100 mM Tris pH 7.4, 0.1% v/v Tween 20).
8. Pooling the washed beads in 500 µL of washing and binding buffer (100 mM Tris pH 7.4, 0.1% v/v Tween 20).
9. Redistributing the beads into each well of the second column of the 384-well plate.
10. Repeating steps 5 to 7, after each step re-distributing the beads into each well of the next column of the plate.
11. Recovering the contents of each well from the last column of the plate and washing the beads three times with 500 µL of washing and binding buffer (100 mM Tris pH 7.4, 0.1% v/v Tween 20).
12. Pooling the washed beads in 200 µL of nuclease-free water. At this point each bead carries 24 different DNAs; each DNA is present on one bead in 16, but as there are $8^{24} = 4.7 \times 10^{21}$ possible permutations, every bead has a different permutation of the 24 DNAs.
13. Single B cells are co-compartmentalized in 1 nL volume droplets with single hydrogel beads carrying the DNAs and reagents for in vitro transcription (HiScribe T7 ARCA mRNA kit, with tailing) and transfection (Lipofectamine MessengerMAX; ThermoFisher Scientific) using a microfluidic device as described (Zilionis et al. (2017) Nat Protoc 12, 44-73). The droplets are collected in a 1.5 mL tube containing HFE-7500 and 0.1% surfactant, UV photo-cleaved for 90 seconds (OmniCure ac475-365) and incubated at room temperature for 30 min to transcribe the RNA in vitro and transfect the cells with the RNA.
14. Transfected cells are recovered by addition of 100 µL of X-VIVO medium supplemented with 5% human serum, followed by 100 µL of 1H,1H,2H,2H-perfluoro-1-octanol (370533, Sigma) and gently mixed.

Example 9 (Minigene Made on Beads and Transcribed into mRNA and Transfected in Drop)

(17) Cognate pairs of tumor T cell antigens and T cell receptors are identified as in examples 1-4, except that steps 2-5 are replaced by the following steps.
2. Identification of potential genes encoding T cell antigens by using sequencing to determine the genome, exome or transcriptome of a tumor.
3. Making 384 synthetic DNA oligonucleotides in which the mutated amino acid residues and the flanking 12 amino acids on both sides are encoded by a minigene of 75 nucleotides that can be transcribed in vitro using T7, T3 or SP6 RNA polymerase, where each of the genes carries a T7, SP6 or T3 RNA polymerase promoter, optionally a ribosome entry site (Kozak sequence), an initiation codon and a termination signal. The RNA synthesis can include cap nucleotide analogs to further stabilize the RNA, or other sequences or features, as known by people skilled in the art (NEB, Thermo Fisher websites).
4. Making 384 synthetic DNAs, complementary to the oligonucleotides from step 2, and annealing them to the oligonucleotides from step 2. Annealing results in double-stranded DNA with a different 4-nucleotide 5′-overhang on each end. The first set of 16 DNAs have a first overhang complementary to the 5′-overhang present on the DNA on the hydrogel bead (see step 3) and a second overhang that is different from the first, but identical in all 16 DNAs. The second set of 16 DNAs have a first overhang complementary to the 5′-overhang present on the first set of 16 DNAs, and not identical to other 5′-overhangs, and a second overhang that is different from the first, but identical in all 16 DNAs. The third and subsequent sets, up to the 24th set, are designed in the same manner, with a first overhang complementary to the 5′-overhang present on the previous set of 16 DNAs and a second overhang that is different from the first, and not identical to other 5′-overhangs, but identical in all 16 DNAs.
5. Distributing the double-stranded synthetic DNAs from step 3 into a 384-well plate in T7 DNA ligase buffer (NEB), the first set of 16 DNAs being distributed into the first column, the second set of 16 DNAs being distributed into the second column, and so on, until column 24. Each well contains 5 µL of 5 µM double-stranded DNA and T7 DNA ligase (NEB, #M0318) as per the manufacturer's instructions.
6. Hydrogel beads carrying double-stranded synthetic DNAs are produced by split-and-mix synthesis using a method adapted from that previously described. 60 µm diameter polyethylene glycol diacrylate (PEG-DA) hydrogel beads containing streptavidin acrylamide are produced using a microfluidic device essentially as described in Zilionis et al. (2017) Nat Protoc 12, 44-73. The 160 pL droplets were produced at a frequency of 4.5 kHz and were exposed at 200 mW/cm² to a 365 nm UV light source (OmniCure ac475-365) to trigger gel bead polymerization. Recovered gel beads are washed 10 times with washing and binding buffer (100 mM Tris pH 7.4, 0.1% v/v Tween 20) and resuspended in the same buffer. Twenty-four million PEG-DA beads are incubated in 1 mL final volume for 1 h at room temperature with 50 µM of a photo-cleavable biotinylated dsDNA oligonucleotide with a 5′-overhang complementary to the 5′-overhang of the dsDNA oligonucleotides in the first column of the microtitre plate in step 4. One million PEG-DA-streptavidin beads coupled to the dsDNA are added to each well in the first column of the 384-well plate containing the double-stranded synthetic DNAs and incubated for 60 min at 20 °C to allow ligation.
7. Recovering the contents of each well and washing the beads as described (Zilionis et al. (2017) Nat Protoc 12, 44-73).
8. Pooling the washed beads and redistributing them into each well of the second column of the 384-well plate.
9. Repeating steps 5 to 7, after each step re-distributing the beads into each well of the next column of the plate.
10. Recovering the contents of each well from the last column of the plate and washing the beads three times with 500 µL of washing and binding buffer (100 mM Tris pH 7.4, 0.1% v/v Tween 20).
11. Pooling the washed beads in 200 µL of nuclease-free water. At this point each bead carries $5 \times 10^{7}$ copies of a tandem minigene (TMG) construct with 24 different minigenes; each minigene is present on one bead in 16, but as there are $8^{24} = 4.7 \times 10^{21}$ possible permutations of the minigenes, every bead has a different permutation of the 24 DNAs.
12. Single B cells are co-compartmentalized in 1 nL volume droplets with single hydrogel beads carrying the TMGs and reagents for in vitro transcription (HiScribe T7 ARCA mRNA kit, with tailing) and transfection (Lipofectamine MessengerMAX; ThermoFisher Scientific) using a microfluidic device as described (Zilionis et al. (2017) Nat Protoc 12, 44-73). The droplets are collected in a 1.5 mL tube containing HFE-7500 and 0.1% surfactant, UV photo-cleaved for 90 seconds (OmniCure ac475-365) and incubated at room temperature for 30 min to transcribe the RNA in vitro and transfect the cells with the RNA.
13. Transfected cells are recovered by addition of 100 µL of X-VIVO 15 supplemented with 5% human serum, followed by 100 µL of 1H,1H,2H,2H-perfluoro-1-octanol (370533, Sigma) and gently mixed.

Example 10 (Same as Above with a Different Coupling Chemistry)

(18) As examples 8 and 9, except that the hydrogel beads in step 5 are made from polyacrylamide and the photo-cleavable biotinylated dsDNA oligonucleotide carries a 5′ acrydite group and is covalently coupled to the polyacrylamide during polymerisation, as described (Zilionis et al. (2017) Nat Protoc 12, 44-73).

Example 11 (Same as Above with a Different Reverse Transcription Reaction Chemistry)

(19) In all examples above, the reverse transcription reaction performed in step 8 and the cDNA amplification described in step 9 can each, or both, be replaced by a template switch reverse transcription reaction and specific amplification.

(20) Such a reaction is performed in droplets with an enzyme compatible with priming on non-templated nucleotides (such as SuperScript II, MultiScribe RT, SMARTScribe RT or Maxima H Minus RT), deoxynucleotide triphosphates (dNTPs), and a template switch oligo, as well as a plurality of primers.

(21) The plurality of primers can serve as reverse transcription primers with a gene-specific sequence (or poly-dT or random sequence) at the 3′ end and a universal primer sequence followed by a cell-specific barcode at the 5′ end.

(22) The template switch primer comprises a universal sequence and a sequence known to people skilled in the art to associate with the non-templated nucleotides, typically three cytosines, added by the reverse transcriptase at the 3′ end of the cDNA during the RT.

(23) Alternatively, the plurality of primers will serve as free-floating reverse transcription primers with a gene-specific sequence (or poly-dT or random sequence) at their 3′ end. The template switch primer would comprise a single-cell barcode sequence and a primer sequence known to people skilled in the art to associate with non-templated nucleotides generated during the RT.

(24) The amplification of the generated cDNA includes a PCR reaction, with primers priming on the universal sequences present on both the template switch oligos and the plurality of primers. A second round of PCR may be used to specifically amplify only a subset of the generated cDNA.

(25) A typical reaction includes template switch primer at 1 or 1.5 µM concentration, 0.9 M betaine, 0.5% Igepal CA630, 1× enzyme buffer, 2800 units of RT enzyme, 0.4 µM RT primer, 0.7 mM each dNTP, 2.3 mM DTT, 6.3 mM MgCl2 and 24 units of RNase inhibitor. The reaction is performed for 1 hour at 50 °C. The emulsion is then broken using perfluoro-octanol, and the cDNAs are purified using AMPure beads and processed for PCR using the universal primers present on the RT primers and on the template switch oligos.

Example 12

(26) This example statistically confirms that it is possible to identify a repertoire of TCR-antigen binding interactions by screening the binding activation between a library of cells presenting a high diversity of antigens and a highly diverse but biased population of TCR variants from a pool of primary T cells from a patient tumor.

(27) Objective

(28) The objective is to identify a repertoire of TCR-antigen binding interactions by screening the binding activation between a library of cells presenting a high diversity of antigens ($(1\text{-}5) \times 10^{4}$ variants, potentially including mutated epitopes) and a highly diverse but biased population of TCR variants from a pool of primary T cells ($10^{6}$ cells comprising some variants representing up to a few %), for example coming from a patient tumor.

(29) Strategy

(30) Each T cell is compartmentalized in a droplet together with an Antigen Presenting Cell (APC). The latter may harbour a diversity of antigens (or epitopes), for example after lentiviral transduction at high Multiplicity Of Infection (MOI). Activation interactions inside droplets are measured by a fluorescence assay. Sorting based on fluorescence yields a population of positive T cell/APC pairs. TCRs and epitopes from the co-encapsulated APC are co-identified with a single-droplet-level sequencing strategy. The inventors have optimized the different parameters, notably the MOI, to obtain sufficiently low error rates (e.g. 0.1%-1%) in the identification of antigen-TCR pairs.

(31) A workflow example is presented on FIG. 1.

(32) Experimental Parameters

(33) N, total number of cells ($10^{6}$)
n, number of cells in the total population possessing a given TCR ($10^{3}$ to $10^{5}$, corresponding to 0.1% to 10% of the total population)
d, diversity of antigens ($(1\text{-}5) \times 10^{4}$)
m, number of distinct antigens displayed per APC (1 to $10^{3}$)
a, rate of technical false positives due to the assay (for example sorting errors, $10^{-3}$ to $10^{-2}$)

Derived Parameters

s = n/N, sensitivity = fraction of the total population represented by a given TCR clone which we want to associate with an antigen. For example, identifying the target of a TCR displayed by $10^{3}$ cells among a total population of $10^{6}$ T cells would correspond to a sensitivity of $s = 10^{3}/10^{6} = 0.1\%$.
f = m/d, average frequency of each antigen variant per APC (or per droplet)
l = f*n, average number of true positives in the clonal TCR population
e = a*n, average number of technical false positives (measurement errors) within a TCR clonal population
b = f*(l+e), average number of occurrences of each negative antigen in the population measured as positive and associated with a given TCR

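As an illustration, the derived parameters can be computed from the experimental ones in a few lines of Python. This is a minimal sketch: the names `ScreenParameters` and `derived_parameters` are hypothetical, and the sensitivity is computed as n/N, following the worked example in the text.

```python
from dataclasses import dataclass


@dataclass
class ScreenParameters:
    N: int    # total number of T cells
    n: int    # number of cells carrying the TCR of interest
    d: int    # antigen diversity
    m: int    # number of distinct antigens displayed per APC
    a: float  # rate of technical false positives


def derived_parameters(p: ScreenParameters) -> dict:
    """Return s, f, l, e and b as defined in the text."""
    s = p.n / p.N    # sensitivity (fraction of the population)
    f = p.m / p.d    # average frequency of each antigen per APC
    l = f * p.n      # average number of true positives
    e = p.a * p.n    # average number of technical false positives
    b = f * (l + e)  # average count of each negative antigen
    return {"s": s, "f": f, "l": l, "e": e, "b": b}


if __name__ == "__main__":
    params = ScreenParameters(N=10**6, n=10**4, d=5 * 10**4, m=100, a=0.01)
    for name, value in derived_parameters(params).items():
        print(f"{name} = {value:.4g}")
```

With these illustrative inputs (a TCR carried by 1% of one million cells, 50,000 antigens, 100 antigens per APC and a 1% technical error rate), the sketch gives s = 1%, f = 0.002, l ≈ 20, e ≈ 100 and b ≈ 0.24.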
Measurement and Error Estimation

(34) Quantifying the successful identification means estimating the probability that the following assertion is correct: considering all T cells with the same TCR in the positively sorted population, the antigen recognized by the TCR corresponds to the most frequent antigen found in the co-compartmentalized APCs.

(35) The number of true positives in the sorted population follows a Poisson distribution of parameter l:

(36) $$P(k,l) = \frac{e^{-l}\, l^{k}}{k!}$$

(37) Now, consider the true epitope to be present k times. Identification errors come from the possibility that, by chance, another (negative) epitope is represented more than k times. This is in particular due to technical false positive droplets. Although each occurrence of the true epitope should be highly enriched after sorting, there is initially a very large number (typically $d > 10^{4}$) of potentially false epitopes co-compartmentalized with each given TCR.

(38) Occurrences of each negative epitope follow a Poisson distribution of parameter b. The probability that a given negative epitope is present strictly fewer than k times is given by the cumulative probability of the Poisson distribution of parameter b:

(39) $$H(k,b) = \int_{b}^{+\infty} \frac{t^{k-1}\, e^{-t}}{(k-1)!}\, dt$$
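For integer k ≥ 1, this integral is the regularized upper incomplete gamma function. By a standard identity (repeated integration by parts), it equals the finite Poisson sum, which is the form most convenient for numerical evaluation:

$$H(k,b) = \int_{b}^{+\infty} \frac{t^{k-1}\, e^{-t}}{(k-1)!}\, dt = \sum_{j=0}^{k-1} \frac{e^{-b}\, b^{j}}{j!}, \qquad k \ge 1.$$

For example, $H(1,b) = e^{-b}$ is the probability that a given negative epitope does not occur at all.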

(40) The probability that all the $d-1$ negative epitopes are present fewer than k times is:
$$H(k,b)^{d-1}$$
The probability $\varepsilon_k$ that there exists a negative epitope which is represented k times or more in the sorted population associated with the TCR is then:
$$\varepsilon_k = 1 - H(k,b)^{d-1}$$

(41) The probability of successful identification is finally:

(42) $$P_{\mathrm{success}}(l,b,d) = 1 - \sum_{k=0}^{+\infty} \varepsilon_k\, P(k,l) = \sum_{k=0}^{+\infty} \left( \int_{b}^{+\infty} \frac{t^{k-1}\, e^{-t}}{(k-1)!}\, dt \right)^{d-1} \frac{e^{-l}\, l^{k}}{k!}$$

(43) It can be seen that the probability of success depends on the parameters (l,b,d).

(44) The probability of successful identification has been calculated for a TCR present in 1% of a population of 1 million cells, for a technical error rate of 1% and an antigen diversity of 50,000. The detection error, l and b were computed as a function of m, the number of distinct antigens displayed per APC.
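The calculation described above can be sketched numerically as follows. This is a minimal illustration under stated assumptions, not the inventors' implementation: the function names are hypothetical, the cumulative term H(k, b) is evaluated as a finite Poisson sum, the case k = 0 is treated as a failure (H = 0), and the infinite sum over k is truncated far into the upper tail of the Poisson distribution of parameter l. Only the Python standard library is used.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Poisson probability of exactly k events for mean lam."""
    if lam == 0.0:
        return 1.0 if k == 0 else 0.0
    # Work in log space so that large k or lam do not overflow.
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))


def prob_fewer_than(k: int, lam: float) -> float:
    """H(k, lam): probability of strictly fewer than k events."""
    return min(1.0, sum(poisson_pmf(j, lam) for j in range(k)))


def success_probability(l: float, b: float, d: int) -> float:
    """Sum over k of H(k, b)^(d-1) * Poisson(k; l)."""
    # Truncation point of the infinite sum (assumption of this sketch).
    k_max = int(l + 12.0 * math.sqrt(l) + 30.0)
    total = 0.0
    for k in range(k_max + 1):
        total += prob_fewer_than(k, b) ** (d - 1) * poisson_pmf(k, l)
    return total


def success_for_moi(n: int, d: int, m: int, a: float) -> float:
    """Success probability from the experimental parameters n, d, m, a."""
    f = m / d        # average frequency of each antigen per APC
    l = f * n        # average number of true positives
    e = a * n        # average number of technical false positives
    b = f * (l + e)  # average count of each negative antigen
    return success_probability(l, b, d)


if __name__ == "__main__":
    # Setting quoted in the text: TCR at 1% of 10^6 cells, a = 1%, d = 50,000.
    for m in (10, 100, 300, 1000):
        p = success_for_moi(n=10_000, d=50_000, m=m, a=0.01)
        print(f"m = {m:4d}: P(success) = {p:.4f}, error = {1.0 - p:.4f}")
```

Scanning m in this way shows the trade-off the text describes: a larger m increases the expected number of true positives l, but it also increases b, the background count of each negative antigen.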

(45) FIG. 2 shows the statistical analysis of the number of measurements per TCR clone required to unequivocally identify the TCR recognizing a specific antigen. The number of measurements depends on the number of antigens per APC (the MOI) and the total number of antigens (the antigen diversity).

(46) FIG. 3 shows the error rate of pair mis-assignment calculated for typical values of the experimental parameters indicated on the figures. From left to right: detection error, l and b computed as a function of the MOI, for a TCR present in 1% of a population of 1 million cells, for a technical error rate of 1% and an antigen diversity of 50,000.

(47) It was thus demonstrated that it is possible to detect TCR frequencies down to 0.1% if the epitope diversity is typically below 10,000, using an m of 300. For higher epitope diversity (50,000), a high m (1000) is needed to detect 0.1% TCR frequencies, while a moderate m of 100 seems sufficient to detect 1% TCR frequencies.