Methods for Identifying Multiple Epitopes in Selected Sub-Populations of Cells
20230049314 · 2023-02-16
Inventors
CPC classification
G01N2458/10
PHYSICS
C12Q2537/143
CHEMISTRY; METALLURGY
C12N15/1065
CHEMISTRY; METALLURGY
C12Q1/6806
CHEMISTRY; METALLURGY
International classification
C12N15/10
CHEMISTRY; METALLURGY
Abstract
A method for identifying a sub-population within a mixed population of cells is disclosed. The method involves contacting the mixed population of cells with at least one unique binding agent, wherein the at least one unique binding agent is designed to bind to a target molecule present in the sub-population, and wherein the at least one unique binding agent is attached to an epitope specific barcode that represents the identity of the target molecule. The method further involves sequentially attaching two or more assayable polymer subunits to the epitope specific barcode to create unique cell origination barcodes that represent the identities of individual cells to which the at least one unique binding agent has bound; and decoding the epitope specific barcode and cell origination barcodes, thereby identifying the sub-population within the mixed population of cells.
Claims
1-20. (canceled)
21. A method for generating barcoded cDNA, comprising: (a) creating a mixture of cells that are fixed and permeabilized; (b) hybridizing a poly-dT oligonucleotide to mRNA in the cells; (c) reverse transcribing the mRNA in the cells to produce cDNA; and (d) adding a barcode to the cDNA by a method that comprises stepwise ligation of at least two assayable polymer subunits to the 5′ end of the cDNA via successive rounds of split pool synthesis, wherein each round comprises: (i) splitting the cells into a plurality of aliquots, (ii) incubating each aliquot with a different assayable polymer subunit, (iii) ligating the assayable polymer subunit onto the cDNA, (iv) optionally rinsing the cells, and (v) pooling of the aliquots; wherein the barcode is made up of distinct combinations of the different assayable polymer subunits.
22. The method of claim 21, wherein the ligation in each round is enabled by annealing of a splint oligonucleotide.
23. The method of claim 21, wherein at least one of the assayable polymer subunits comprises a random sequence.
24. The method of claim 21, wherein the barcode comprises a biotin moiety.
25. The method of claim 24, wherein the method comprises isolating the barcoded cDNA on a support.
26. The method of claim 25, further comprising sequencing the barcoded cDNA, or an amplification product thereof.
27. The method of claim 21, wherein the barcode identifies an individual cell.
28. The method of claim 21, wherein step (d) comprises (iv) rinsing the cells.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0036] The novel features of the invention are set forth with particularity in the appended claims. A better understanding of the features and advantages of the present invention will be obtained by reference to the following detailed description that sets forth illustrative embodiments, in which the principles of the invention are utilized, and the accompanying drawings of which:
[0037]-[0067] (Descriptions of the individual drawings are not reproduced in this text.)
DETAILED DESCRIPTION
[0068] The present disclosure is an extension of the methods, compositions, and kits described previously in published patent applications PCT/US2012/023411 and PCT/US2013/054190, which are incorporated herein by reference. In particular, the present disclosure describes methods, compositions, and kits for the detection of a plurality of target nucleic acid sequences in single cells, and more specifically, detection of a plurality of target mRNA sequences in single cells, using proximity probes designed to minimize non-specific hybridization and amplification of background signal, thereby improving detection sensitivity and specificity. In some embodiments, the disclosed methods are applied to the detection of a plurality of target mRNA sequences within selected sub-populations of cells in biological samples comprising complex mixtures of cells. In particular, the present disclosure describes methods, compositions, and kits for the detection of a plurality of target molecules in single cells within selected sub-populations of cells in biological samples comprising complex mixtures of cells.
[0069] The present disclosure provides improvements over the previously disclosed techniques in that the disclosed methods provide means for (i) identifying dead cells, cell fragments, or cell clusters within a cell population and eliminating them from further analysis, thereby improving the quality of cell-specific data collected for a complex mixture of cells, and (ii) identifying rare cells or selected subpopulations of cells within the complex mixture of cells, based on the presence of specific intracellular or extracellular markers, and restricting the subsequent analysis to the selected set of cells, thereby improving the specificity of the data collected.
[0070] Multiplexed testing at the single cell level is a key advantage of the disclosed methods, and provides a number of potential benefits including improved understanding of the physiological processes within individual cells, reduced sample quantity requirements (proportional to the number of multiplexed measurements), improved testing accuracy (through the elimination of sample handling and measurement errors associated with replicate testing), and significant savings in terms of labor and cost.
I. Definitions
[0071] As used herein, the phrase “unique binding agent” (UBA) refers to one of a variety of detection reagents for use in the disclosed methods. Each UBA is capable of binding or hybridizing to a single species of target molecule. It is this specificity of binding or hybridization that enables detection of the target molecule in a given individual cell.
[0072] As used herein, the term “epitope” is used in a more general sense to refer to the target molecule (including, but not limited to, proteins, peptides, DNA, RNA, mRNA, oligonucleotides, lipids, carbohydrates, and small molecules) or portion of a target molecule that is recognized by a unique binding agent. In common with the published patent applications referenced above, the terms “epitope” and “target molecule” are used interchangeably herein to refer to the molecule of interest (or a portion thereof) that is being detected and/or quantified by the methods described herein.
[0073] As used herein, the phrase “epitope specific barcode” (ESB) refers to a unique code that is associated with a specific epitope or target molecule. In some embodiments, the ESB is a molecule or assembly of molecules capable of encoding the identity of a target molecule. Examples of suitable ESB molecules or molecular assemblies include, but are not limited to, peptide sequences, oligonucleotide sequences, strings of covalently or non-covalently linked but distinguishable nanoparticles, and the like.
[0074] As used herein, the phrase “assayable polymer subunit” (APS) refers to a molecular building block comprising a distinct packet of information, wherein the molecular building blocks are capable of being linked in an ordered fashion to create cell origination barcodes. Examples of suitable assayable polymer subunits include, but are not limited to, amino acids, peptides, oligonucleotides, nanoparticles, and the like.
[0075] As used herein, the phrase “cell origination barcode” (COB) refers to an ordered assembly of assayable polymer subunits that creates a molecule or molecular assembly that encodes the identity of an individual cell. Examples of suitable COB molecules or molecular assemblies include, but are not limited to, peptide sequences, oligonucleotide sequences, strings of covalently or non-covalently linked but distinguishable nanoparticles, and the like.
[0076] As used herein, the phrase “common linker” (CL) refers to a linker moiety that may be directly or indirectly attached to UBA, ESB, or APS subunits for use in assembling molecular barcodes.
[0077] As used herein, the term “splint” refers to a template molecule used in the assembly of APS to form cell origination barcodes. In some embodiments, splint (or template, or annealing primer) molecules are oligonucleotides.
[0078] As used herein, the term “sub-code” (SC) refers to a unique coding region and/or a detectable molecule contained within an APS, wherein the serial combination of two or more APS creates an individual COB having a detectable code or signal that distinguishes it from all other COB.
[0079] As used herein, the phrase “stop code” refers to a segment of a splint or template molecule that is designed to prevent replication or amplification of the splint or template molecule.
[0080] As used herein, the phrase “combinatorial synthesis” refers to synthetic methods that make it possible to synthesize large numbers of compounds (tens to thousands to hundreds of thousands, or more) in a single process comprising a minimal number of chemical coupling steps.
[0081] As used herein, the phrase “split-pool synthesis” refers to one example of a combinatorial synthesis process in which a reaction mixture is divided into several different aliquots prior to performing a coupling reaction, and wherein each aliquot receives a different monomer or component to be coupled. Following the coupling reaction, the aliquots are combined (pooled), mixed, and divided (split) into a new set of aliquots prior to performing the next round of coupling.
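By way of non-limiting illustration, the split-pool process defined above can be sketched as a short simulation. The sketch assumes that each cell is assigned to an aliquot uniformly at random in every round and that each aliquot carries exactly one subunit; the function names, the single-letter subunit labels, and the fixed random seed are hypothetical and are not taken from the disclosure.

```python
import random


def split_pool(cell_ids, subunits, rounds, seed=0):
    """Simulate split-pool synthesis (illustrative assumption: uniform random splitting).

    In each round every cell lands in one aliquot and receives that
    aliquot's subunit; the cells are then pooled before the next round.
    """
    rng = random.Random(seed)
    barcodes = {cell: [] for cell in cell_ids}
    for _ in range(rounds):
        for cell in cell_ids:
            # Split: pick the aliquot (one subunit per aliquot) for this cell.
            aliquot = rng.randrange(len(subunits))
            # Couple: extend the cell's barcode with that aliquot's subunit.
            barcodes[cell].append(subunits[aliquot])
        # Pool: cells are recombined here, so the next split is independent.
    return {cell: tuple(code) for cell, code in barcodes.items()}


def barcode_space(n_subunits, rounds):
    """Number of distinct ordered subunit combinations that can arise."""
    return n_subunits ** rounds


subunits = ["A", "B", "C", "D"]  # hypothetical subunit labels
codes = split_pool(range(10), subunits, rounds=3)
space = barcode_space(len(subunits), 3)
```

With four subunits and three rounds the sketch can produce at most 4 × 4 × 4 = 64 distinct ordered combinations, which illustrates why a small number of coupling steps yields a large number of products.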
[0082] As used herein, the phrase “proximity probe” refers to each of a pair of probe molecules that are capable of hybridizing to different segments of the same target molecule. In some embodiments, proximity probes may be pairs of oligonucleotide probes capable of hybridizing to different segments of the same target oligonucleotide molecule. In some embodiments, the different segments of the target oligonucleotide recognized by the probes are in close proximity to each other.
[0083] As used herein, the phrase “bridge molecule” (or “bridge”) refers to a connector molecule that is capable of binding or hybridizing to two corresponding proximity probes only when the latter are bound to, or hybridized with, their respective target molecule. In some embodiments, the bridge molecule is an oligonucleotide that is capable of simultaneously hybridizing to each of a pair of oligonucleotide proximity probes.
II. Overview of Assay Methodology
[0084] The methods, compositions, and kits of the present disclosure provide means for the detection of a plurality of target molecules in single cells (including single cells of selected sub-populations of cells) using a set of novel detection and barcoding reagents. In some embodiments of the disclosed methods, detection of a plurality of target molecules in single cells from selected sub-populations of cells is enabled. In general, the approach comprises the use of unique binding agents (UBA) to detect target molecules of interest, epitope specific barcodes (ESB) to encode the identities of the target molecules recognized by the UBA, and assayable polymer subunits (APS) to create unique cell origination barcodes (COB) that identify individual cells, thereby allowing one to define a selected sub-population of cells within a sample comprising a complex mixture of cells on the basis of a specified set of biomarkers, and subsequently correlate the detection of one or more target molecules with individual cells in the selected sub-population of cells.
[0085] Unique binding agents comprise the detection reagents for use in the disclosed methods. Each UBA is specific for a single target molecule species, and provides the binding or hybridization specificity that enables detection of the target molecule in a given individual cell. In many embodiments of the disclosed methods, cell samples are incubated with one or more UBA (either prior to or following fixation and/or permeabilization of the cells), and non-bound UBA are subsequently rinsed away. Those UBA bound to target molecules on or within the cells of the sample may then subsequently be identified using epitope specific barcodes (ESB). Each ESB comprises a unique code that is associated with the UBA for a specific target molecule (
[0086] In addition to the ESB used to identify specific target molecules, the disclosed methods, compositions, and kits provide components for creating cell origination barcodes that provide a means for assigning detected target molecules to specific individual cells. Each individual COB comprises a unique code that is associated with a specific cell of origin. Thus the collection of UBA for an individual cell, as identified by their associated ESB, will share a common COB that is different from the COB for all other cells in the sample.
[0087] In some embodiments, the COB are composed of two or more assayable polymer subunits attached to the bound UBA-ESB complex (
[0088] Decoding of the ESB-COB complexes to identify the target molecules present in individual cells of the sample can be performed using a variety of techniques, as described in the published patent applications referenced above. In some embodiments, the ESB-COB complexes are decoded by amplification and sequencing. Accordingly, certain aspects of the present disclosure provide methods for barcoding cells using a plurality of UBA-ESB complexes and a set of APS, wherein each APS comprises a unique SC, and wherein the COB for each UBA-ESB-COB complex is the same for a given cell and distinct from those for all other cells, and wherein the amplification and sequencing of the complete set of ESB-COB complexes allows one to catalogue the complete set of target molecules associated with each individual cell in the sample or in a selected sub-population of cells. In some embodiments of the disclosed methods, compositions, and kits, selective amplification of UBA-ESB-COB complexes of interest is enabled through the design and use of target-specific or semi-random amplification primers that produce amplified product comprising only those sequences of interest and of an appropriate length to optimize the efficient use of the sequencing capacity of modern high-throughput sequencing systems.
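As a non-limiting sketch of the cataloguing step described above, the following example groups decoded reads by COB and tallies the targets identified by their ESB. It assumes that sequencing reads have already been parsed into (COB, ESB code) pairs; the function name, the sub-code labels, the ESB codes, and the target names are all hypothetical.

```python
from collections import defaultdict


def catalogue_targets(decoded_reads, esb_to_target):
    """Group (cob, esb) pairs by cell barcode and count reads per target."""
    per_cell = defaultdict(lambda: defaultdict(int))
    for cob, esb in decoded_reads:
        target = esb_to_target.get(esb)
        if target is None:
            continue  # ESB code not in the lookup table; skip the read
        per_cell[cob][target] += 1
    return {cob: dict(counts) for cob, counts in per_cell.items()}


# Hypothetical lookup table from ESB code to target identity.
esb_to_target = {"ACGTACGTA": "target_1", "TTGCAGGCA": "target_2"}

# Hypothetical decoded reads: (ordered sub-codes forming the COB, ESB code).
decoded_reads = [
    (("SC1", "SC4"), "ACGTACGTA"),
    (("SC1", "SC4"), "TTGCAGGCA"),
    (("SC2", "SC3"), "ACGTACGTA"),
    (("SC2", "SC3"), "ACGTACGTA"),
    (("SC2", "SC3"), "GGGGGGGGG"),  # unknown code, ignored
]

catalogue = catalogue_targets(decoded_reads, esb_to_target)
```

The values in this sketch are read counts per target per COB. They are not corrected for amplification bias, so they should not be read as molecule counts.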
III. Compositions
A. Unique Binding Agents (UBA)
[0089] UBAs are molecules or molecular assemblies that are designed to bind to or hybridize with at least one target molecule or portions thereof, and can, under appropriate conditions, form a molecular complex comprising the UBA and the target molecule. Examples of target molecules include, but are not limited to, proteins, peptides, nucleic acids, DNA, RNA, mRNA, lipids, carbohydrates, small organic molecules, drug molecules, organic monomers, and ions. For convenience, most of the methods, compositions, and kits described herein are explained within the context of UBA that bind to a target protein or a target mRNA. However, these methods, compositions, and kits can also be applied to other target molecules.
[0090] In some embodiments, UBA comprise at least one recognition element that allows them to bind to or interact with at least one target molecule, at least one part of at least one target molecule, at least one target molecule surrogate, at least part of a target molecule surrogate, or combinations thereof. UBA typically bind to or interact with target molecules in a sequence-specific manner, a conformation-specific manner, or a combination of both. Examples of suitable molecular recognition interactions include, but are not limited to, antibody-antigen binding, receptor-ligand binding, aptamer-target binding, enzyme-substrate recognition, oligonucleotide probe-target sequence hybridization, and the like. Accordingly, suitable recognition elements for use in constructing UBA include, but are not limited to, antibodies, receptors, enzymes, peptoids, aptamers, peptide aptamers, nucleic acid aptamers, oligonucleotide probe sequences, and the like.
[0091] In some embodiments, UBA comprise at least one common linker (CL) element that allows them to attach to or hybridize with an ESB that encodes for the identity of the target molecule and/or a COB that encodes for the identity of a specific individual cell. The common linker may be directly or indirectly attached to the UBA. In some embodiments, the common linker element may be an oligonucleotide molecule. In some embodiments, the common linker element may be an oligonucleotide sequence that is covalently attached to the UBA, while in some embodiments it may be non-covalently attached to the UBA.
[0092] In some embodiments, UBA further comprise a capture region which may be used for isolation of the UBA and/or immobilization of the UBA on a surface. In some embodiments, the capture region may be an affinity tag, a bead, a slide, or an array. In some embodiments, the capture region is the associated ESB, for example, the ESB can be a detectable bead such as a bead with a unique spectral signature (e.g. a bead that incorporates specific fluorophores emitting in the visible, near-infrared, or infrared).
[0093] In some embodiments, the UBA comprise antibodies as recognition elements (
[0094] Those skilled in the art will appreciate that antibodies can be obtained from a variety of sources, including but not limited to polyclonal antibodies, monoclonal antibodies, monospecific antibodies, recombinantly expressed antibodies, humanized antibodies, plantibodies, and the like; and can be obtained from a variety of animal species, including rabbit, mouse, goat, rat, human, horse, bovine, guinea pig, chicken, sheep, donkey, and the like. A wide variety of antibodies are commercially available from a variety of vendors, and custom-made antibodies can be obtained from a number of contract labs. Detailed descriptions of antibodies, including relevant protocols for production and use, can be found in, among other places, Current Protocols in Immunology, Coligan et al., eds., John Wiley & Sons (1999, including updates through August 2003); The Electronic Notebook: Basic Methods in Antibody Production and Characterization, G. Howard and D. Bethel, eds., CRC Press (2000); Monoclonal Antibodies: Principles and Practice, 3d Ed., J. Goding, Academic Press (1996); Using Antibodies, E. Harlow and D. Lane, Cold Spring Harbor Lab Press (1999); and Monoclonal Antibodies: A Practical Approach, P. Shepherd and C. Dean, Oxford University Press (2000).
[0095] In some embodiments, the antibodies described herein are attached to a nucleic acid, e.g., a common linker oligonucleotide or an ESB comprising an oligonucleotide sequence. One non-limiting example of an oligonucleotide sequence that comprises both a linker and an ESB is:
(SEQ ID NO: 1)
5′-GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT NNNNNNNNN CGTCAGACAGGGAGC-3′
where the NNNNNNNNN sequence is a 9 nucleotide code that is specific for the attached antibody. Methods for attaching nucleic acids to antibodies are well known in the art, and any suitable approach is encompassed within the presently disclosed methods, compositions, and kits. For example, in some embodiments antibodies may be attached to nucleic acid molecules using the methods described in Gullberg, et al. (2004), PNAS 101(22):8420-8424, and Boozer, et al. (2004), Analytical Chemistry 76(23):6967-6972, both of which are incorporated herein by reference. In some embodiments, antibodies may be attached to nucleic acid molecules by random coupling to free amines. In some embodiments, the antibodies may be attached to nucleic acid molecules by random coupling to free amines using a 10-to-1 ratio of nucleic acid to antibody. In some embodiments, antibodies may be attached to nucleic acid molecules using the methods described in Kozlov, et al. (2004), Biopolymers 73(5):621-630, which is incorporated herein by reference. In some embodiments, antibodies may be attached to nucleic acid molecules using hydrazine chemistry. In some embodiments, antibodies may be attached to nucleic acid molecules using “tadpoles” as described in Nolan (2005), Nature Methods 2:11-12, which is incorporated herein by reference. In general, antibodies may be attached to nucleic acid molecules using any suitable method known in the art for generating engineered antibodies, including the methods described herein.
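As a non-limiting sketch, the 9 nucleotide code in a sequence laid out like SEQ ID NO: 1 could be recovered from a sequencing read by matching the two constant flanking segments. The flanks below are the constant portions of SEQ ID NO: 1 with the printed spaces removed; the function name, the exact-match requirement, and the example read are assumptions made for illustration only.

```python
import re

# Constant segments flanking the 9 nucleotide code in SEQ ID NO: 1.
UPSTREAM = "GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT"
DOWNSTREAM = "CGTCAGACAGGGAGC"

# Assumption: flanks must match exactly and the code contains only A, C, G, T.
ESB_PATTERN = re.compile(UPSTREAM + r"([ACGT]{9})" + DOWNSTREAM)


def extract_esb_code(read):
    """Return the 9 nucleotide code found between the flanks, or None."""
    match = ESB_PATTERN.search(read.upper())
    return match.group(1) if match else None


# Hypothetical read containing a made-up code between the two flanks.
example_read = UPSTREAM + "ACGTACGTA" + DOWNSTREAM
code = extract_esb_code(example_read)
```

Exact matching is the simplest choice for a sketch. A practical decoder would likely need to tolerate sequencing errors in the flanks and in the code itself, which this example does not attempt.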
[0096] In some embodiments of the disclosed methods, compositions, and kits, the UBA comprise nucleic acid sequences as recognition elements. Nucleic acid recognition elements may include target-specific recognition sequences, or generic target recognition sequences. Examples of suitable target recognition sequences include, but are not limited to, a poly-dT probe sequence for use in hybridization with mRNA molecules in general, an antisense DNA probe sequence for hybridization with a specific target mRNA, an oligonucleotide sequence designed to hybridize to an HIV viral sequence, and the like. The nucleic acid sequence is preferably at least 15 nucleotides in length, and more preferably is at least 20 nucleotides in length. In some embodiments, the target-specific recognition sequence is about 10 to 500, 20 to 400, 30 to 300, 40 to 200, or 50 to 100 nucleotides in length. In other embodiments, the target-specific sequence is about 30 to 70, 40 to 80, 50 to 90, 60 to 100, 30 to 120, 40 to 140, or 50 to 150 nucleotides in length.
[0097] In some embodiments of the disclosed methods, compositions, and kits, the UBA comprise sets of oligonucleotide probes, e.g. a pair of proximity probes along with a bridge oligonucleotide sequence, which are designed to hybridize to a target nucleic acid molecule of interest, e.g. an mRNA molecule, with higher specificity than can be achieved using a single oligonucleotide recognition sequence. Examples of proximity oligonucleotide probe sets of the present disclosure that use a bridge molecule (e.g. a bridge oligonucleotide molecule) are illustrated in
[0098] Referring to
[0099] In some embodiments, the UBA may further comprise nucleic acid sequences comprising one or more primers, wherein the primers are used for amplification and/or sequencing of specific UBA probe sequences, ESB code sequences, COB sequences, or combinations thereof. Any suitable primer sequence may be used for amplification and/or sequencing, for example, the Illumina primers may be used for sequencing UBA-ESB-COB assemblies or conjugates, or portions thereof.
[0100] In some embodiments, the UBA may comprise a non-specific binding agent for recognition and binding to genomic DNA or chromosomal DNA structures, including but not limited to, for example, an antibody that binds DNA or histones, or a DNA intercalating molecule such as berberine, ethidium bromide, proflavine, daunomycin, dactinomycin, doxorubicin, daunorubicin, or thalidomide, to which an ESB may be attached.
[0101] In some embodiments, the UBA may comprise a non-specific binding agent for protein, including but not limited to, for example, an amine-reactive probe selected from the group consisting of succinimidyl esters, sulfosuccinimidyl esters, tetrafluorophenyl esters, sulfodichlorophenol esters, isothiocyanates, and sulfonyl chlorides, to which an ESB may be attached.
B. Epitope Specific Barcodes (ESB)
[0102] The epitope specific barcodes of the present disclosure provide a unique code that is associated with a specific target molecule. ESB are molecules or molecular assemblies that are designed to attach to or bind to a UBA or portions thereof, and can, under appropriate conditions, form a molecular complex comprising the ESB, the UBA, and the target molecule.
[0103] In some embodiments, ESB comprise at least one common linker region that allows them to bind to or interact with at least one UBA and/or at least one APS, typically in a sequence-specific manner, a conformation-specific manner, or a combination of both. Examples of suitable molecular binding interactions between the ESB and their associated UBA and/or APS include, but are not limited to, antibody-antigen binding, receptor-ligand binding, aptamer-target binding, enzyme-substrate interactions, oligonucleotide probe-target sequence hybridization, and the like. The interactions between the ESB and their associated UBA and/or APS are typically driven by ionic bonding, hydrogen bonding, or van der Waals forces. In some embodiments, the attachments between ESB and associated UBA and/or APS may be covalent. In some embodiments, the attachments are non-covalent. In some embodiments, the ESB are attached (either directly or indirectly) to the UBA prior to performing the assay. In other embodiments, the ESB bind to or are attached to the UBA following incubation of the sample with the UBA, i.e. as part of the assay procedure.
[0104] In some embodiments of the disclosed methods and compositions, the ESB comprise at least one coding region that encodes the identity of the attached UBA. In some embodiments, the ESB are oligonucleotide sequences, and the coding region comprises an oligonucleotide sequence that is between 5 and 15 nucleotides in length. In some embodiments, the coding region is an oligonucleotide sequence that is 9 nucleotides in length (
[0105] In many embodiments, the ESB are oligonucleotide sequences that further comprise one or more primers, and all or part of the ESB nucleic acid sequence and/or associated COB may be amplified using any nucleic acid amplification method, including, but not limited to, polymerase chain reaction (PCR), branched chain reaction, or rolling circle amplification approaches, as are well known in the art.
[0106]
[0107] In some embodiments, the ESB further comprise a capture region which may be used for isolation of UBA-ESB complexes and/or immobilization of the UBA-ESB complexes on a solid surface. In some embodiments, the capture region may be an affinity tag, a bead, a slide, or an array. In some embodiments, the capture region is the ESB, for example, the ESB can be a detectable bead such as a bead with a unique spectral signature (e.g. a bead that incorporates specific fluorophores emitting in the visible, near-infrared, or infrared). In some embodiments, the UBA is directly or indirectly attached to the capture region of the ESB.
C. Cell Origination Barcodes (COB)
[0108] The presently disclosed methods, compositions, and kits further provide means for creating cell origination barcodes, wherein each COB provides a unique code that can be associated with a specific cell of origin. In some embodiments, attachment of a COB to one or more bound UBA-ESB complexes (e.g. using common linker oligonucleotides) identifies the cell of origin for the target molecule(s) to which UBA-ESB complexes have bound. In some embodiments, the COB of the present disclosure are molecular entities (or assemblies, complexes, or conjugates) that may comprise (i) a common linker sequence that is capable of hybridizing to a common linker oligonucleotide associated with a UBA-ESB complex, (ii) a unique code that is associated with a specific cell of origin, and (iii) one or more primer sequences, or combinations thereof.
[0109] In some embodiments, COB are modular structures comprised of two or more APS. In some embodiments, COB are modular structures comprised of two or more APS attached in linear combination. In some embodiments, the COB comprise a plurality of APS attached in linear combination, wherein the APS comprise small molecules of deterministic weight. In some embodiments, the COB comprise 2, 3, 4, 5, 6, 7, 8, 9, 10, or more unique APS attached in a linear combination. In some embodiments, COBs comprise linear combinations of several APS, for example linear combinations of four APS, which are assembled using a split-pool combinatorial synthesis approach (
[0110] In some embodiments, the set of APS recognition sequences of the template or splint molecule are each separated by a linker comprising 1, 2, 3, or more carbon atoms, which acts as a “stop” signal or stop code for polymerase activity thereby preventing unwanted amplification of the full template molecule during nucleic acid amplification steps.
[0111] In some embodiments, the plurality of APS may comprise a set of uniquely designed nucleic acid sequences comprising one or more sub-code (SC) regions, wherein the sub-code sequence is unique for each individual APS molecule in the plurality of APS. In some embodiments, the SC regions or sequences are about 3, 4, 5, 6, 7, 8, 9, 10, or more than 10 nucleotides in length. In some embodiments, the sub-codes comprise a unique set of nucleic acid sequences of defined length, e.g. 7 nucleotides (
[0112] In some embodiments, the APS further comprise one or more common linker (CL) regions or sequences (e.g. common linker oligonucleotides) for the purpose of facilitating attachment of the APS to each other, or to the ESB, or to a template molecule used for assembly of the COB. Thus, in some embodiments, the common linker regions comprise annealing regions designed to hybridize to complementary sequences on a template molecule. The common linker can be directly or indirectly attached to the rest of the APS molecule, and facilitates either covalent or non-covalent assembly of the APS into a COB. In some embodiments, the common linker sequence may include oligonucleotide sequences or tandem-repeat sequences of about 10 to about 25 nucleotides in length. In some embodiments, the APS comprises two common linker sequences that flank the SC region. In some embodiments, common linker sequences can also be attached at either the 5′ end or the 3′ end of a COB, and may be utilized for capture and immobilization of a COB on a surface for detection or imaging purposes, e.g. by attaching a sequence that is complementary to the common linker sequence to a solid support or substrate.
[0113] In some embodiments, the APS or CL further comprises a random tag region allowing for subsequent quantitation of the number of detected COB. Methods for making and using such random tag regions are known in the art, e.g. see Casbon et al. (2011), Nucleic Acids Research 39(12):e81. The random tag region may function as a molecular counter to estimate the number of template molecules associated with each sequence variant. In some embodiments, a molecular counter is incorporated into an ESB, APS, CL, or an assembled COB prior to performing an amplification reaction, e.g. PCR amplification. A library of molecular counters comprising degenerate base regions (DBR) may be incorporated into the ESB, APS, CL, or assembled COB. The number of unique DBR in a library is generally limited by the length and base composition of the DBR. For example, a DBR comprising a single nucleotide would allow for four different possible counters, one for each base. Larger libraries of unique counter sequences can be achieved by using longer DBR, e.g. an eight nucleotide DBR corresponds to 4^8 = 65,536 unique sequences. Molecular counters can be used to determine whether a sequence variant is associated with a single template molecule, or alternatively, with multiple template molecules. The number of different DBR sequences associated with one sequence variant can thus serve as a direct measure of the number of initial template molecules. This information can supplement or replace the information provided by read numbers of each sequence variant, including, for example, read numbers obtained after performing a PCR amplification reaction. DBRs can also be used to determine the probability that a sequence variant derives from a polymerase error during an amplification reaction or is a true original variant that was present prior to performing a PCR amplification reaction.
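By way of non-limiting illustration, the use of DBR as molecular counters can be sketched as follows. The example assumes that each read has already been reduced to a (sequence variant, DBR) pair and that DBR sequences are error-free; the function names and the example data are hypothetical.

```python
from collections import defaultdict


def dbr_library_size(length):
    """Number of possible DBR sequences of a given length over A, C, G, T."""
    return 4 ** length


def count_templates(reads):
    """Estimate initial template molecules per variant as the number of
    distinct DBR sequences observed with that variant."""
    seen = defaultdict(set)
    for variant, dbr in reads:
        seen[variant].add(dbr)
    return {variant: len(dbrs) for variant, dbrs in seen.items()}


# Hypothetical reads after amplification: (sequence variant, DBR).
reads = [
    ("variant_1", "ACGTACGT"),
    ("variant_1", "ACGTACGT"),  # amplification duplicate of the same template
    ("variant_1", "TTGCAAGC"),
    ("variant_2", "GGATCCAA"),
]

templates = count_templates(reads)
size_1 = dbr_library_size(1)
size_8 = dbr_library_size(8)
```

In this sketch variant_1 has three reads but only two distinct DBR, so it is attributed to two initial template molecules rather than three.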
[0114] In some embodiments, the elements of a COB can be found in a single molecular entity (a singular COB), or two distinct molecular entities (a dual COB). Each molecular entity may be composed of one molecule, or more than one molecule attached to one another by covalent or non-covalent means. In some embodiments, each component of a dual COB has a target molecule-specific UBA-ESB complex that binds to a different site on the same target molecule. When using a dual COB system, one of the COB may be either labeled as described below or unlabeled. In some embodiments, the unlabeled COB may comprise a capture region.
[0115] In some embodiments, complementary oligonucleotide sequences designed to hybridize to an SC serve to attach detectable molecules, e.g. labels or label monomers, to each SC of the COB. The complementary oligonucleotide sequences may be directly labeled, for example, by covalent incorporation of one or more detectable label molecules into the complementary oligonucleotide sequence. Alternatively, the complementary oligonucleotide sequences may be indirectly labeled, for example, by incorporation of biotin or other molecule capable of providing a specific, high affinity ligand interaction, into the complementary oligonucleotide sequence. In such instances, the ligand (e.g. avidin or streptavidin in the case of biotin incorporation) may be covalently attached to the detectable molecule. In cases where the detectable molecules attached to an SC are not directly incorporated into the complementary oligonucleotide sequence, the complementary sequence serves as a bridge between the detectable molecule and the SC, and may be referred to as a bridging molecule, e.g., a bridging nucleic acid.
[0116] The COB of the present disclosure, and the APS molecules of which they are composed, can be labeled with any of a variety of labels or label monomers, e.g. radioisotopes, fluorophores, dyes, enzymes, nanoparticles, mass tags, chemiluminescent markers, biotin, or other labels or label monomers known in the art that can be detected directly (e.g. by light emission) or indirectly (e.g. by binding of a fluorescently-labeled antibody). In some embodiments, one or more of the SC in the COB is labeled with one or more label monomers, and the signals provided by the label monomers attached to the SC of a COB constitute a detectable code that identifies the target (or cell) to which the UBA (or the UBA-ESB-COB) binds. In some embodiments, the lack of a given signal from the SC (e.g. a dark spot) may also constitute part of the detectable code. Other examples of label monomers that can be used with the COB described herein, and methods to incorporate the label monomers into the COB are described in U.S. Pat. No. 7,473,767; and U.S. application Ser. Nos. 10/542,458, 12/324,357, 11/645,270, and 12/541,131, which are incorporated herein by reference in their entirety.
D. Primers
[0117] In some embodiments of the disclosed methods, compositions, and kits, target-specific primers, generic primers, semi-random primers, or combinations thereof, are used to selectively amplify UBA-ESB-COB complexes for targets of interest in order to optimize the cost efficiency and throughput of the sequencing reactions used for detection and quantitation of target molecules in individual cells.
[0118] An example of a target specific primer of the disclosed methods, compositions, and kits is illustrated schematically in
[0119] An example of a generic primer of the disclosed methods, compositions and kits is illustrated schematically in
[0120] An example of a semi-random primer of the disclosed methods, compositions and kits is illustrated schematically in
IV. Methods
[0121] A. Incubation of Cells with UBA-ESB Complexes
[0122] In many embodiments of the disclosed methods, cell suspensions or samples are incubated with one or more UBA-ESB complexes under conditions suitable for binding or hybridization with specific molecular targets on the surfaces of or within the individual cells. In some embodiments, one or more of the targets of interest may be intracellular targets, and the cells may be fixed and permeabilized using any of the methods known in the art, e.g. by adding cold methanol to the cell sample and incubating for a short period of time, followed by aspiration of the methanol, rinsing, and blocking with a bovine serum albumin or casein solution prior to incubation with the UBA-ESB.
B. Assembly of Cell Origination Barcodes (COB)
[0123] Methods for barcoding single cells and assembling the associated cell origination barcodes have been described previously in published patent applications PCT/US2012/023411 and PCT/US2013/054190, which are incorporated herein by reference. COB assembly or synthesis can be performed by any suitable method known in the art, including the ones described briefly herein. In some embodiments, the COB may be assembled by stepwise addition of assayable polymer subunits (APS) comprising oligonucleotides. In some embodiments, a COB is attached to the UBA-ESB complex via a common linker (CL) that may itself be an oligonucleotide, and which may be part of the APS itself or a separate molecular component. In some embodiments, the ESB, APS, and CL may all comprise oligonucleotide sequences. Accordingly, following assembly by means of hybridization between complementary, or substantially complementary, annealing regions on the ESB, APS and CL, the assembled oligonucleotides may be ligated to form covalent bonds between ESB-APS, adjacent APS, or adjacent APS-CL units. Annealing regions may be provided on either or both ends of an oligonucleotide ESB, APS, or CL.
[0124] In some embodiments, the APS are added to the bound UBA-ESB by performing one or more rounds of split pool synthesis, wherein each round comprises splitting the cell sample into a plurality of aliquots, incubating each aliquot with a different APS (comprising a different SC) to allow annealing of complementary sequences between the APS and the growing UBA-ESB-APS chain, ligation (in the case of oligonucleotide ESB and APS), rinsing, and pooling of the aliquots. If the APS do not include incorporated CL regions, the cycle may also include an incubation step wherein a CL is allowed to anneal to the growing UBA-ESB-APS chain. In some embodiments, an annealing region that is specific to each step of the stepwise synthesis may be incorporated into the oligonucleotide components of the reaction. In this case, the use of a step-specific annealing region may stall further assembly of the COB for any cell wherein the previous addition step failed.
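The split, incubate, and pool cycle described above can be modeled in silico. The following Python sketch is a simplified simulation only: it assumes that every cell is assigned to an aliquot uniformly at random and that every ligation succeeds, and the function name and parameters are hypothetical.

```python
import random


def split_pool_barcode(cell_ids, aps_per_round, rounds, seed=0):
    """Simulate split pool assembly and return {cell_id: tuple of APS indices}.

    Each round, every cell is placed in one of `aps_per_round` aliquots.
    Each aliquot contains a different APS, so the aliquot index stands in
    for the APS (and its SC) that is ligated to the cell's growing chain.
    """
    rng = random.Random(seed)  # seeded so that the simulation is reproducible
    cells = list(cell_ids)
    chains = {cell: [] for cell in cells}
    for _ in range(rounds):
        # Split: each cell lands in one randomly chosen aliquot.
        for cell in cells:
            aliquot = rng.randrange(aps_per_round)
            # Incubate and ligate: the APS of that aliquot is appended.
            chains[cell].append(aliquot)
        # Pool: all cells are recombined before the next round,
        # which is implicit here because `cells` is shared across rounds.
    return {cell: tuple(chain) for cell, chain in chains.items()}


# Example: 100 cells, 10 different APS per round, 4 rounds.
cob_by_cell = split_pool_barcode(range(100), aps_per_round=10, rounds=4)
```

Each resulting tuple lists the APS added in rounds one through four, and corresponds to the ordered set of SC that would be read out by sequencing the assembled COB.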
[0125] The diversity of the COB library (i.e. the number of unique COB that are theoretically possible) that can be achieved by means of performing stepwise split-pool assembly and synthesis is dependent on the number of unique APS available for use in each round, and the total number of rounds used to assemble the COB. Specifically, the number of unique COB is the number of unique APS available per round raised to the power of the number of rounds. For example, for a COB created using two rounds of assembly/synthesis (i.e. for a COB having two APS positions) and 10 unique APS, the total number of unique COB sequences that are possible is 10^2 = 100. Alternatively, for a COB created using four rounds of assembly/synthesis (i.e. for a COB having four APS positions) and 10 unique APS, the total number of unique COB sequences that are possible is 10^4 = 10,000. In general, it is desirable to design the COB library such that the total number of unique barcodes available is significantly larger than the number of individual cells to be labeled, thereby ensuring that the probability that any two cells are labeled with the same cell origination barcode is extremely low.
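The relationship between library diversity and the chance that two cells share a barcode can be checked numerically. The Python sketch below assumes that each cell independently and uniformly receives one APS per round, so that the number of possible COB equals the number of APS per round raised to the power of the number of rounds. The function names are hypothetical, and the collision estimates follow the standard birthday-problem calculation rather than any formula stated in the disclosure.

```python
import math


def cob_diversity(aps_per_round: int, rounds: int) -> int:
    """Number of distinct COB: one of `aps_per_round` choices at each round."""
    return aps_per_round ** rounds


def collision_probability(n_cells: int, n_barcodes: int) -> float:
    """Probability that at least two of `n_cells` cells share the same barcode."""
    if n_cells > n_barcodes:
        return 1.0  # pigeonhole: more cells than barcodes forces a repeat
    # P(all unique) = product over i of (1 - i / n_barcodes), summed in log space.
    log_p_unique = sum(math.log1p(-i / n_barcodes) for i in range(n_cells))
    return 1.0 - math.exp(log_p_unique)


def cell_sharing_probability(n_cells: int, n_barcodes: int) -> float:
    """Probability that one given cell shares its barcode with any other cell."""
    return 1.0 - (1.0 - 1.0 / n_barcodes) ** (n_cells - 1)


two_round_library = cob_diversity(10, 2)    # 10 APS, 2 rounds
four_round_library = cob_diversity(10, 4)   # 10 APS, 4 rounds
```

The two estimates answer different questions. `collision_probability` approaches one quickly as the number of cells grows, whereas `cell_sharing_probability` stays small as long as the library is much larger than the number of cells, which is usually the more relevant design criterion.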
[0126] In some embodiments, the APS are stitched together and/or to a CL using an annealing primer (i.e. a template molecule or “splint”). The annealing primer may comprise a first complementary region to the CL or an APS added during the previous round of stepwise synthesis. The annealing primer may also comprise a second complementary region to the APS being added during a current round. Thus, the annealing primer can hybridize to two oligonucleotide subunits of successive rounds, thereby stitching them together. In some embodiments, the first complementary regions of annealing primers of each round are different from the first complementary regions of annealing primers of other rounds. In some embodiments, the second complementary regions of annealing primers of each round are different from the second complementary regions of annealing primers of other rounds. In some embodiments, the first or second complementary regions of annealing primers of different rounds are shared between rounds. In some embodiments, a template or “splint” (i.e. an extended CL molecule) is used for assembly of APS, wherein the splint includes multiple sets of annealing regions designed to permit the stepwise hybridization and ligation of individual APS to create the completed COB.
[0127] In some embodiments, a CL or “splint” oligonucleotide comprises one or more pairs of loop annealing regions. Accordingly, the APS can be designed to hybridize to the CL or splint to create loop geometries, i.e. by hybridizing to the loop annealing regions at each end of a CL. In some embodiments, the loop annealing regions may be designed to be specific to the round of split-pool synthesis such that successive rounds of addition and hybridization populate the APS positions along the splint. The APS can then be linked together using any of the methods known in the art, for example, by ligation. In some embodiments, the APS may be designed to ensure that they do not hybridize efficiently to the splint at the loop annealing regions specific to other synthesis rounds. Consequently, if an APS from a particular round is missing for some reason, APS that are added in subsequent rounds are less likely to be ligated properly, thus reducing the likelihood of downstream analysis errors. Alternatively, COB may occasionally be synthesized even with a missing APS, the location of which is flanked by a pair of loop annealing regions. The resulting COB can then be analyzed accordingly, and can either be discarded or the retrieved information can be alternatively processed.
[0128]
[0129]
[0130]
[0131]
C. Methods for Detection of Barcodes
[0132] Methods for amplification and detection of epitope specific barcodes and cell origination barcodes have been described more fully in published patent applications PCT/US2012/023411 and PCT/US2013/054190, which are incorporated herein by reference. In some embodiments, the assembled UBA-ESB-COB or ESB-COB products are amplified and, optionally, the results are compared with amplification of similar target nucleic acids from a reference sample. Nucleic acid amplification can be performed by any means known in the art. In some cases, the ligated products are amplified by polymerase chain reaction (PCR). Examples of PCR techniques that can be used include, but are not limited to, quantitative PCR, quantitative fluorescent PCR (QF-PCR), multiplex fluorescent PCR (MF-PCR), real time PCR (RT-PCR), single cell PCR, restriction fragment length polymorphism PCR (PCR-RFLP), real-time restriction fragment length polymorphism PCR (RT-PCR-RFLP), hot start PCR, nested PCR, in situ polony PCR, in situ rolling circle amplification (RCA), bridge PCR, picotiter PCR and emulsion PCR. Other suitable amplification methods include the ligase chain reaction (LCR), transcription amplification, self-sustained sequence replication, selective amplification of target polynucleotide sequences, consensus sequence primed polymerase chain reaction (CP-PCR), arbitrarily primed polymerase chain reaction (AP-PCR), degenerate oligonucleotide-primed PCR (DOP-PCR) and nucleic acid based sequence amplification (NABSA). Other amplification methods that can be used herein include those described in U.S. Pat. Nos. 5,242,794; 5,494,810; 4,988,617; and 6,582,938. In some embodiments, the amplification is performed inside a cell.
[0133] In some embodiments of the disclosed methods, compositions, and kits, target-specific or semi-random primers are used to selectively amplify UBA-ESB-COB complexes for targets of interest in order to optimize throughput and minimize costs for performing the sequencing reactions used for detection and quantitation of target molecules in individual cells.
[0134] In some embodiments, a target-specific primer, as illustrated schematically in
[0135] In some embodiments, a generic primer, as illustrated schematically in
[0136] In some embodiments, a semi-random primer, as illustrated schematically in
[0137] In any of the embodiments, the detection or quantitative analysis of the UBA-ESB-COB, ESB-COB, or COB library can be accomplished by sequencing. The APS subunits or entire COB can be detected via full sequencing of all oligonucleotide tags by any suitable methods or systems known in the art, e.g. by using the Illumina HiSeq 2000 sequencing system. Sequencing can be accomplished through classic Sanger sequencing methods which are well known in the art. Sequencing can also be accomplished using high-throughput and/or next-generation sequencing systems, some of which allow detection of a sequenced nucleotide immediately after or upon its incorporation into a growing strand, i.e., detection of sequence in real time or substantially real time. In some cases, high throughput sequencing generates at least 1,000, at least 5,000, at least 10,000, at least 20,000, at least 30,000, at least 40,000, at least 50,000, at least 100,000 or at least 500,000 sequence reads per hour; with each read being at least 50, at least 60, at least 70, at least 80, at least 90, at least 100, at least 120, at least 150, at least 200, or at least 250 bases per read.
D. Multiplexed Testing
[0138] In certain embodiments, the methods of detection are performed in multiplex assays, wherein a plurality of target molecules is detected in the same assay (i.e. in a single reaction mixture). In some embodiments, the assay is a hybridization assay or an affinity binding assay in which the plurality of target molecules is detected simultaneously. In some embodiments, the assay is a hybridization assay or an affinity binding assay in which the plurality of target molecules is detected simultaneously in single cells. In certain embodiments, the plurality of target molecules detected in the same assay is at least 2 different target molecules, at least 5 different target molecules, at least 10 different target molecules, at least 20 different target molecules, at least 50 different target molecules, at least 75 different target molecules, at least 100 different target molecules, at least 200 different target molecules, at least 500 different target molecules, at least 750 different target molecules, or at least 1,000 different target molecules. In other embodiments, the plurality of target molecules detected in the same assay is up to 5 different target molecules, up to 10 different target molecules, up to 20 different target molecules, up to 50 different target molecules, up to 100 different target molecules, up to 150 different target molecules, up to 200 different target molecules, up to 500 different target molecules, up to 750 different target molecules, up to 1,000 different target molecules, up to 2,000 target molecules, or up to 5,000 target molecules. In yet other embodiments, the plurality of target molecules detected is any range in between the foregoing numbers of different target molecules, such as, but not limited to, from 20 to 50 different target molecules, from 50 to 200 different target molecules, from 100 to 1000 different target molecules, from 500 to 5000 different target molecules, and so on.
E. Quantitative Detection
[0139] In addition to the qualitative analytical capabilities provided by the UBA-ESB-COB complexes of the present disclosure and analytical techniques based thereon, in some embodiments the UBA-ESB-COB can be uniquely suitable for conducting quantitative analyses. By providing a one-to-one binding stoichiometry between the UBA-ESB-COB and their associated target molecules, e.g. in embodiments in which the UBA-ESB complex comprises a short random sequence (
F. Detection of mRNA Target Molecules
[0140] A non-limiting example of the process used to detect specific mRNA target molecules and label each occurrence with a unique cell origination barcode is illustrated in
[0141]
[0142]
TABLE-US-00002 (SEQ ID NO: 6) GCTCCCTGTCTGACG XXXXXXXXXXX
Following addition of the sequence-specific primer to the cell sample, a reverse transcription reaction is performed, after which a “splint” oligonucleotide is annealed and used for assembly of APS comprising coding regions SC1-SC3 into a unique cell origination barcode that may be amplified and sequenced using Illumina primers. In some embodiments, one or more rounds of nested PCR amplification may be performed using an internal primer, prior to amplification and sequencing using the Illumina primers. In some embodiments, a hexamer primer, e.g.
TABLE-US-00003 (SEQ ID NO: 7) GCTCCCTGTCTGACG NNNNNN
is used to hybridize with target mRNA molecules.
[0143] In some embodiments, target mRNA molecules are detected using a proximity probe set, the compositions for which are described above. The use of a pair of proximity oligonucleotide probes, each comprising a target recognition sequence that is complementary to non-overlapping but closely spaced sequence regions of the same target mRNA, provides for reduced non-specific probe hybridization and increased target detection specificity by creating a requirement that two sequence recognition events occur simultaneously and in close proximity to one another.
[0144]
[0145]
[0146]
[0147]
[0148]
[0149]
[0150]
[0151] In some embodiments, the proximity probe sets disclosed herein may be used for detection of specific mRNA sequences in the absence of performing additional cell origination barcoding steps. For example, in some embodiments, a cell sample may be lysed to release mRNA, following which the sample is contacted with a plurality of beads, wherein a bead comprises a plurality of tethered oligonucleotide sequences capable of hybridizing to the released mRNA molecules, e.g. through the use of a poly-T sequence recognition region. Following hybridization of the released mRNA from the cell sample, a first oligonucleotide proximity probe is annealed with the hybridized mRNA molecules on the plurality of beads, wherein the first oligonucleotide proximity probe comprises an epitope specific barcode sequence and a first target recognition sequence that is capable of hybridizing to a first segment of the target nucleic acid sequence. Simultaneously, or subsequently, a second oligonucleotide proximity probe is annealed with the hybridized mRNA molecules on the beads, wherein the second oligonucleotide proximity probe comprises a second target recognition sequence that is capable of hybridizing to a second segment of the target nucleic acid sequence, and wherein the first and second segments of the target nucleic acid sequence are different and are separated from each other by a specified number of nucleotides, N. 
A bridge oligonucleotide is then, simultaneously or subsequently, annealed with the hybridized oligonucleotide proximity probes on the plurality of beads, wherein the bridge oligonucleotide comprises two probe recognition sequences, wherein the first probe recognition sequence is capable of hybridizing to a segment of the first oligonucleotide proximity probe, and the second probe recognition sequence is capable of hybridizing to a segment of the second oligonucleotide proximity probe, thereby creating a target specific probe complex that includes the epitope specific barcode. In some embodiments, the annealed components (i.e. the pair of oligonucleotide proximity probes and the bridge oligonucleotide) are ligated to create covalently joined target specific probe complexes. In many embodiments, the plurality of tethered oligonucleotide sequences further comprise one or more primer sequences, e.g. amplification primers or sequencing primers. In some embodiments, the target specific probe complexes are amplified using a PCR reaction and one or more target specific primers. In some embodiments, the PCR amplification products are sequenced to detect or quantify the presence of one or more mRNA sequences in the sample.
F. Discrimination between Whole Cells and Dead Cells, Cell Fragments, or Cell Clusters
[0152] When performing assays to identify a plurality of target molecules in individual cells in a sample comprising a complex mixture of cells, it may be desirable to discriminate between whole cells and dead cells, cell fragments, or clusters of cells so that data for the latter may be excluded from subsequent analysis, thereby improving the quality of the data. In studies involving samples comprising millions of cells, where each cell is individually barcoded, the presence of cell fragments, cell doublets, or larger clusters of cells can contribute “noise” in the form of erroneous data indicating the presence of cells that have markers that they shouldn't have. Accordingly, the methods, compositions, and kits of the present disclosure provide means for discriminating between the single cells of interest and dead cells, cell fragments, or clusters of cells present in samples.
[0153] In some embodiments, discrimination between the single cells of interest and dead cells, cell fragments, or clusters of cells present in samples is achieved by analyzing the ratio of DNA to protein for each “cell”. In some embodiments, discrimination between the single cells of interest and dead cells, cell fragments, or clusters of cells is achieved by analyzing the amount of DNA detected per “cell”. In yet other embodiments, discrimination is achieved by analyzing the amount of protein detected per “cell”.
[0154] In some embodiments, the amount of DNA per “cell” may be determined by choosing to include one or more UBA that are directed towards genomic DNA or chromosomal DNA structures, for example, binding agents including, but not limited to, antibodies that bind DNA or histones, or DNA intercalating molecules (such as berberine, ethidium bromide, proflavine, daunomycin, dactinomycin, doxorubicin, daunorubicin, or thalidomide) in the set of UBAs chosen to identify a specific set of target molecules. Following completion of the assay, the amount of DNA per “cell” is determined from the number of DNA-specific UBA-ESB complexes detected for each cell, as identified by the cell origination barcode (COB). In some embodiments, it may be useful to compare the number of DNA-specific UBA-ESB complexes recovered to a calibration curve generated using the same set of DNA-directed UBAs and known concentrations of genomic or chromosomal DNA, under similar incubation conditions to correct for binding stoichiometry in cases where the binding stoichiometry between the DNA-specific UBA and genomic DNA or chromosomal DNA structures is not 1-to-1. In some embodiments, the same approach is used to discriminate between whole cells and “dead” cells by performing the incubation with one or more UBAs directed towards genomic DNA or chromosomal DNA structures prior to fixing and permeabilizing the cell sample.
[0155] In some embodiments, the amount of protein per “cell” may be determined by choosing to include one or more UBA that are directed non-specifically towards protein, for example, including but not limited to amine-reactive moieties such as succinimidyl esters, sulfosuccinimidyl esters, tetrafluorophenyl esters, sulfodichlorophenol esters, isothiocyanates, or sulfonyl chlorides, or one or more UBA that are directed specifically towards a common protein, e.g. antibodies directed towards actin or other housekeeping proteins, in the set of UBA chosen to identify a specific set of target molecules. Following completion of the assay, the amount of protein per “cell” is determined from the number of non-specific protein UBA-ESB complexes detected for each cell, as identified by the cell origination barcode (COB). In some embodiments, it may be useful to compare the number of non-specific protein UBA-ESB complexes recovered (or specific protein UBA-ESB complexes in the case that antibodies to actin or other housekeeping proteins are used) to a calibration curve generated using the same set of protein-directed UBA and known concentrations of protein, under similar incubation conditions to correct for binding stoichiometry in cases where the binding stoichiometry between the non-specific protein UBA and protein is not 1-to-1. Alternatively, in some embodiments, the average number of accessible amine groups on the surface of a given protein or set of proteins is calculated on the basis of protein structural data, and is subsequently used to determine the amount of protein per cell based on the number of non-specific protein UBA-ESB complexes recovered for each cell.
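The discrimination criteria described above (amount of DNA per "cell", amount of protein per "cell", and the ratio of the two) can be applied to per-barcode counts after sequencing. The following Python sketch is illustrative only: the thresholds, the category labels, and the function name are hypothetical placeholders, and in practice the acceptance ranges would be derived from calibration data for the particular UBA set used.

```python
def classify_barcode(dna_count, protein_count,
                     dna_range=(50, 150), ratio_range=(0.5, 2.0)):
    """Classify one cell origination barcode from its UBA-ESB counts.

    dna_count:     number of DNA-directed UBA-ESB complexes for this COB
    protein_count: number of protein-directed UBA-ESB complexes for this COB
    dna_range and ratio_range are hypothetical acceptance windows.
    """
    low_dna, high_dna = dna_range
    if dna_count < low_dna:
        return "fragment"      # too little DNA for a whole cell
    if dna_count > high_dna:
        return "cluster"       # too much DNA for a single cell
    if protein_count == 0:
        return "atypical"      # DNA present but no protein signal
    low_ratio, high_ratio = ratio_range
    ratio = dna_count / protein_count
    if ratio < low_ratio or ratio > high_ratio:
        return "atypical"      # DNA-to-protein ratio outside the window
    return "single_cell"


# Hypothetical (DNA count, protein count) pairs keyed by COB.
counts_by_cob = {
    "COB-1": (100, 90),
    "COB-2": (12, 15),
    "COB-3": (210, 180),
    "COB-4": (100, 10),
}
labels = {cob: classify_barcode(dna, protein)
          for cob, (dna, protein) in counts_by_cob.items()}
```

Only barcodes labeled "single_cell" would be carried forward; data associated with the remaining barcodes could be excluded from subsequent analysis.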
G. Methods for Identification of Rare Cells
[0156] When performing assays to identify a plurality of target molecules in individual cells in a sample comprising a complex mixture of cells, it is often desirable to identify a specific sub-population of cells within the complex mixture and focus the subsequent analysis on that sub-population, thereby improving the specificity of the data. In studies involving samples comprising millions of cells, each individually barcoded, the rare cells of interest may constitute as little as 0.01% of the total cell population. Accordingly, the methods, compositions, and kits of the present disclosure provide means for discriminating between the subset of cells of interest and the majority of cells present in the sample.
[0157] In some embodiments, a specific subset of cells may be identified by including one or more UBA that are directed towards specific intracellular or cell surface markers, for example, including, but not limited to, oligonucleotide probe sequences that are designed to hybridize to viral genomic sequences, e.g. HIV viral sequences, or antibodies directed against CD1, CD3, CD8, or CD4, in the set of UBA chosen to identify a specific set of target molecules. Subsequent analysis is restricted to the selected sub-population of cells by selectively amplifying and sequencing those COB that are attached to the UBA-ESB complexes used to identify the sub-population of cells, thereby generating a list of all cells (as identified by their respective COB) which meet the selection criteria used to define the sub-population.
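The bookkeeping that follows sequencing of the marker-associated COB can be sketched in Python. The sketch assumes that decoded reads are available as pairs of a COB identifier and an ESB identifier; it is a computational analogue of the list-building step only and does not model the selective amplification itself. The function names, the `min_reads` threshold, and the example data are hypothetical.

```python
from collections import Counter, defaultdict


def select_cobs(reads, marker_esbs, min_reads=1):
    """Return the set of COB seen with any marker ESB at least `min_reads` times.

    `reads` is an iterable of (cob, esb) pairs, one pair per decoded read.
    """
    marker_counts = Counter(cob for cob, esb in reads if esb in marker_esbs)
    return {cob for cob, n in marker_counts.items() if n >= min_reads}


def esb_profile(reads, selected_cobs):
    """Return {cob: {esb: read count}} restricted to the selected COB."""
    profile = defaultdict(Counter)
    for cob, esb in reads:
        if cob in selected_cobs:
            profile[cob][esb] += 1
    return {cob: dict(counts) for cob, counts in profile.items()}


# Hypothetical decoded reads: (cell origination barcode, epitope specific barcode).
example_reads = [
    ("COB-1", "CD4"), ("COB-1", "CD3"), ("COB-2", "CD19"),
    ("COB-3", "CD4"), ("COB-3", "CD8"), ("COB-1", "CD4"),
]
cd4_cells = select_cobs(example_reads, marker_esbs={"CD4"})
cd4_profiles = esb_profile(example_reads, cd4_cells)
```

Here `cd4_cells` plays the role of the list of cells meeting the selection criteria, and `esb_profile` gathers every ESB detected for those cells.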
[0158] A complete listing of additional UBA-ESB associated with the selected sub-population of cells may be determined using the list of COB for the sub-population. In some embodiments, the list of COB is used to design a set of primers, for example 4 sets of primers in the case that 4 APS (each comprising an SC) are used to construct the COB (see
H. Methods for Filtering Out Selected Cell Sub-Populations from Further Analysis
[0159] When performing assays to identify a plurality of target molecules in individual cells in a sample comprising a complex mixture of cells, it is often desirable to filter out specific sub-populations of cells within the complex mixture, and focus the subsequent analysis on the remaining cells, thereby improving the specificity of the data. For example, in some applications it may be desirable to identify mature B cells in a population of cells (using antibodies directed towards cell surface markers such as CD19, CD38, BCMA, and the like) and eliminate them from further consideration, so that subsequent analysis may be focused on any stem cells that are present. Accordingly, the methods, compositions, and kits of the present disclosure also provide means for eliminating specified populations of cells from further analysis. In some embodiments this is accomplished by labeling multiple UBA (e.g. a set of antibodies) with the same ESB, so that following the binding step of the assay, selective amplification and sequencing of ESB-COB conjugates for the specified set of UBA provides a list of cells to be excluded from further analysis. Selective amplification and sequencing may be performed as described above.
I. Resampling to Detect Additional Target Molecules in Selected Sub-Populations of Cells
[0160] When performing assays to identify a plurality of target molecules in individual cells in a sample comprising a complex mixture of cells, it is often desirable to resample a barcoded cell suspension to determine if additional target molecules are also present in a selected sub-population. Accordingly, the methods, compositions, and kits of the present disclosure also provide means for resampling to detect one or more target molecules of interest at a point in time that is subsequent to that at which the initial cell barcoding procedure was performed. In some embodiments, detection of one or more target molecules of interest in individual cells of the barcoded cell suspension is enabled by including one or more UBA that are directed non-specifically towards protein, for example, including but not limited to amine-reactive moieties such as succinimidyl esters, sulfosuccinimidyl esters, tetrafluorophenyl esters, sulfodichlorophenol esters, isothiocyanates, or sulfonyl chlorides, in the original set of UBA used to perform the initial cell barcoding. Following the selective amplification and sequencing performed as described above to obtain a list of cell origination barcodes associated with cells of a selected sub-population, an aliquot of the barcoded cell suspension is lysed and incubated with beads comprising, for example, an immobilized antibody directed against one of the additional target molecules of interest and a tethered secondary primer (
[0161] In some embodiments, a similar approach is utilized to detect mRNA molecules of interest in a selected sub-population of cells by using a non-specific UBA directed towards mRNA molecules in general, e.g. a UBA comprising a poly-T (or poly-dT) sequence, in the cell barcoding step, and a set of beads comprising immobilized oligonucleotide probes that are specific for the mRNA molecules of interest, along with immobilized secondary primers.
V. Kits
[0162] The present disclosure also describes kits for barcoding molecules and cells, wherein the kits comprise one or more of the compositions described above. In some embodiments, the kits may comprise one or more target specific UBA-ESB complexes, or reagents for attaching pre-synthesized ESB to user-supplied UBA. In some embodiments, the UBA of the presently disclosed kits comprise one or more antibodies, which may further comprise attached ESB that encode the identity of the associated antibody. In some embodiments, the UBA of the presently disclosed kits comprise one or more oligonucleotide probes that are designed to hybridize to selected nucleic acid targets, and which may further comprise attached ESB that encode the identity of the associated target probe. In some embodiments, the disclosed kits may comprise, additionally or as a stand-alone product, sets of APS and any additional enzymes or reagents that may be required for their assembly into cell origination barcodes. In some embodiments, the sets of APS comprise sets of sub-code regions that are designed to provide error detection and correction capability at the sequencing step of the analysis. In some embodiments, the disclosed kits may comprise, additionally or as a stand-alone product, sets of primers for the selective amplification of epitope specific barcodes for a selected sub-population of cells.
VI. Applications
[0163] The compositions, methods, and kits disclosed herein can be used for diagnostic, prognostic, therapeutic, patient stratification, drug development, treatment selection, and screening purposes. The disclosed compositions, methods, and kits provide the advantage that many different target molecules can be analyzed at one time, at the single cell level, from a single biological sample. This enables, for example, several diagnostic tests to be performed on one sample.
[0164] Examples of applications include, but are not limited to, biomarker discovery, target validation for drug discovery, gene expression profiling, protein expression profiling, proteome analyses, metabolomic studies, post-translation modification studies (e.g. for monitoring glycosylation, phosphorylation, acetylation, and other amino acid modifications), pharmacokinetic studies (e.g. drug metabolism, ADME profiling, and toxicity studies), analyses of specific serum or mucosal antibody levels, evaluation of non-nucleic acid diagnostic indicators, pathogen detection, foreign antigen detection, and the like.
VII. Computer Software
[0165] Also disclosed herein are computer software packages stored on non-transitory computer readable media that provide analysis capabilities for decoding and grouping the sequencing data obtained for sets of epitope specific barcode-cell origination barcode conjugates. Examples of the capabilities provided by such software packages include sequence alignment and comparison tools, hierarchical clustering tools, amplification and/or sequencing error detection and correction tools, data visualization tools, and the like.
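One of the capabilities listed above, sequencing error detection and correction, can be illustrated with a short Python sketch. It assumes that the set of valid sub-code sequences is known in advance and that the sub-codes differ from one another at enough positions for a single substitution to be corrected unambiguously. The nearest-match approach, the function names, and the example sequences are hypothetical and are not taken from the disclosure.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def correct_subcode(observed, valid_subcodes, max_mismatches=1):
    """Map an observed sub-code to a valid one, or return None if ambiguous.

    An exact match is returned unchanged. Otherwise the observed sequence is
    corrected only when exactly one valid sub-code lies within `max_mismatches`.
    """
    if observed in valid_subcodes:
        return observed
    candidates = [
        code for code in valid_subcodes
        if len(code) == len(observed) and hamming(observed, code) <= max_mismatches
    ]
    return candidates[0] if len(candidates) == 1 else None


# Hypothetical set of valid sub-codes; every pair differs at several positions.
VALID_SUBCODES = {"ACGTAC", "TGCATG", "GATCGA"}

exact = correct_subcode("ACGTAC", VALID_SUBCODES)      # already valid
repaired = correct_subcode("ACGTAA", VALID_SUBCODES)   # one substitution
rejected = correct_subcode("AAAAAA", VALID_SUBCODES)   # too far from any sub-code
```

Reads whose sub-codes cannot be corrected are returned as `None` so that they can be discarded or processed separately rather than being assigned to the wrong cell.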
[0166] While preferred embodiments of the present invention have been shown and described herein, it will be obvious to those skilled in the art that such embodiments are provided by way of example only. Numerous variations, changes, and substitutions will now occur to those skilled in the art without departing from the invention. It should be understood that various alternatives to the embodiments of the invention described herein may be employed in practicing the invention. It is intended that the following claims define the scope of the invention and that methods and structures within the scope of these claims and their equivalents be covered thereby.