A NON-INVASIVE PRENATAL TEST WITH ACCURATE FETAL FRACTION MEASUREMENT
20210164048 · 2021-06-03
Assignee
Inventors
- Yuan GAO (La Jolla, CA, US)
- Rui LIU (La Jolla, CA, US)
- Christopher HARTL (La Jolla, CA, US)
- Bin XIE (La Jolla, CA, US)
- Jingyi LU (La Jolla, CA, US)
Cpc classification
C12Q1/6881
CHEMISTRY; METALLURGY
International classification
Abstract
The present disclosure relates to methods for non-invasive prenatal testing (NIPT) using serum from a maternal blood sample taken during pregnancy. The methods provide efficient access to genetic information about the fetus, including gender, fetal DNA fraction, paternity, and possible genetic abnormalities. This approach is referred to herein as Afisawa, and makes NIPT genetic testing more efficient and cost-effective than previous methods.
Claims
1. A plurality of polynucleotides, wherein each polynucleotide comprises: a first target-specific domain and a second target-specific domain configured to bind to a first target sequence and a second target sequence, respectively, of a nucleic acid target, and a unique molecule identifier (UMI) and a linker between the first and second target-specific domains, wherein the first and second target-specific domains are configured to be connected to each other such that the polynucleotide forms a circle, optionally after a polymerase-mediated extension of the first or second target-specific domain, and wherein the nucleic acid target comprises a polymorphic nucleotide within the first target sequence and/or the second target sequence, or between the first and second target sequences.
2. The plurality of polynucleotides of claim 1, which are single-stranded polynucleotides.
3. (canceled)
4. The plurality of polynucleotides of claim 1, which are between about 50 nucleotides and about 200 nucleotides in length, e.g., between about 90 nucleotides and about 100 nucleotides in length.
5-7. (canceled)
8. The plurality of polynucleotides of claim 1, wherein the linker comprises one or more common nucleotides for subsequent PCR annealing.
9-11. (canceled)
12. The plurality of polynucleotides of claim 1, wherein the nucleic acid target is from a sex chromosome, such as a chromosome X or chromosome Y, or from an autosome.
13. The plurality of polynucleotides of claim 1, wherein the nucleic acid target is from a mammalian chromosome, such as a human chromosome.
14. The plurality of polynucleotides of claim 1, wherein the plurality of polynucleotides are configured to bind to a target sequence on human chromosome 1, human chromosome 2, human chromosome 3, human chromosome 4, human chromosome 9, human chromosome 13, human chromosome 15, human chromosome 18, human chromosome 19, human chromosome 21, human chromosome 22, human chromosome X, or human chromosome Y, or any combination thereof.
15-16. (canceled)
17. The plurality of polynucleotides of claim 1, wherein the polymorphic nucleotide is at a single nucleotide polymorphism (SNP) site.
18. The plurality of polynucleotides of claim 1, wherein the polymorphic nucleotide comprises a plurality of polymorphic nucleotides, for example, nucleotides at a plurality of single nucleotide polymorphism (SNP) sites.
19. The plurality of polynucleotides of claim 1, comprising between about 50 and about 150 polynucleotides (e.g., about 120 polynucleotides) configured to bind to a target sequence on human chromosome 1, e.g., any of target sequences 1-117 as set forth in Table 1 (the Table in
20. The plurality of polynucleotides of claim 1, comprising between about 10 and about 50 polynucleotides (e.g., about 40 polynucleotides) configured to bind to a target sequence on human chromosome 2, e.g., any of target sequences 2747-2784 as set forth in Table 1 (the Table in
21. The plurality of polynucleotides of claim 1, comprising between about 10 and about 80 polynucleotides (e.g., about 60 polynucleotides) configured to bind to a target sequence on human chromosome 3, e.g., any of target sequences 4072-4126 as set forth in Table 1 (the Table in
22. The plurality of polynucleotides of claim 1, comprising between about 10 and about 80 polynucleotides (e.g., about 50 polynucleotides) configured to bind to a target sequence on human chromosome 4, e.g., any of target sequences 4127-4171 as set forth in Table 1 (the Table in
23. The plurality of polynucleotides of claim 1, comprising between about 10 and about 80 polynucleotides (e.g., about 40 polynucleotides) configured to bind to a target sequence on human chromosome 9, e.g., any of target sequences 4172-4212 as set forth in Table 1 (the Table in
24. The plurality of polynucleotides of claim 1, comprising between about 100 and about 1,500 polynucleotides (e.g., about 1,200 polynucleotides) configured to bind to a target sequence on human chromosome 13, e.g., any of target sequences 118-1337 as set forth in Table 1 (the Table in
25. The plurality of polynucleotides of claim 1, comprising between about 10 and about 150 polynucleotides (e.g., about 100 polynucleotides) configured to bind to a target sequence on human chromosome 15, e.g., any of target sequences 1338-1444 as set forth in Table 1 (the Table in
26. The plurality of polynucleotides of claim 1, comprising between about 100 and about 1,500 polynucleotides (e.g., about 1,200 polynucleotides) configured to bind to a target sequence on human chromosome 18, e.g., any of target sequences 1445-2681 as set forth in Table 1 (the Table in
27. The plurality of polynucleotides of claim 1, comprising between about 10 and about 100 polynucleotides (e.g., about 60 polynucleotides) configured to bind to a target sequence on human chromosome 19, e.g., any of target sequences 2682-2746 as set forth in Table 1 (the Table in
28. The plurality of polynucleotides of claim 1, comprising between about 100 and about 1,500 polynucleotides (e.g., about 1,200 polynucleotides) configured to bind to a target sequence on human chromosome 21, e.g., any of target sequences 2785-3995 as set forth in Table 1 (the Table in
29. The plurality of polynucleotides of claim 1, comprising between about 10 and about 120 polynucleotides (e.g., about 70 polynucleotides) configured to bind to a target sequence on human chromosome 22, e.g., any of target sequences 3996-4071 as set forth in Table 1 (the Table in
30. The plurality of polynucleotides of claim 1, comprising between about 100 and about 500 polynucleotides (e.g., about 300 polynucleotides) configured to bind to a target sequence on human chromosome X, e.g., any of target sequences 4213-4462 as set forth in Table 1 (the Table in
31. The plurality of polynucleotides of claim 1, comprising between about 300 and about 800 polynucleotides (e.g., about 500 polynucleotides) configured to bind to a target sequence on human chromosome Y, e.g., any of target sequences 4463-4962 as set forth in Table 1 (the Table in
32. The plurality of polynucleotides of claim 1, comprising between about 300 and about 4,500 polynucleotides (e.g., about 3,600 polynucleotides) configured to bind to target sequences on human chromosomes 13, 18, and 21.
33-34. (canceled)
35. A method for analyzing a fetal genetic information, e.g., fetal fraction, a chromosome abnormality such as trisomy, sex determination and/or prenatal paternity test, comprising: a) contacting a sample from a female subject with the plurality of polynucleotides of claim 1; and wherein nucleic acid sequence information of the sample is obtained, which indicates a fetal genetic information.
36-40. (canceled)
61. A kit for analyzing a fetal genetic information, e.g., fetal fraction, a chromosome abnormality such as trisomy, sex determination and/or prenatal paternity test, comprising a plurality of polynucleotides of claim 1.
62-73. (canceled)
Description
BRIEF DESCRIPTION OF THE DRAWINGS
DETAILED DESCRIPTION
[0036] Numerous specific details are set forth in the following description in order to provide a thorough understanding of the methods and other aspects of the invention. These details are provided for the purpose of example and the claimed subject matter may be practiced according to the claims without some or all of these specific details. It is to be understood that other embodiments can be used and structural changes can be made without departing from the scope of the claimed subject matter. It should be understood that the various features and functionality described in one or more of the individual embodiments are not limited in their applicability to the particular embodiment with which they are described. They instead can be applied alone, or in some combination, to one or more of the other embodiments of the disclosure, whether or not such embodiments are described, and whether or not such features are presented as being a part of a described embodiment. For the purpose of clarity, technical material that is known in the technical fields related to the claimed subject matter has not been described in detail so that the claimed subject matter is not unnecessarily obscured.
[0037] All publications, including patent documents, scientific articles and databases, referred to in this application are incorporated by reference in their entireties for all purposes to the same extent as if each individual publication were individually incorporated by reference. Citation of the publications or documents is not intended as an admission that any of them is pertinent prior art, nor does it constitute any admission as to the contents or date of these publications or documents.
[0038] All headings are for the convenience of the reader and should not be used to limit the meaning of the text that follows the heading, unless so specified.
[0039] The practice of the provided embodiments will employ, unless otherwise indicated, conventional techniques and descriptions of organic chemistry, polymer technology, molecular biology (including recombinant techniques), cell biology, biochemistry, and sequencing technology, which are within the skill of those who practice in the art. Such conventional techniques include polypeptide and protein synthesis and modification, polynucleotide synthesis and modification, polymer array synthesis, hybridization and ligation of polynucleotides, detection of hybridization, and nucleotide sequencing. Specific illustrations of suitable techniques can be had by reference to the examples herein. However, other equivalent conventional procedures can, of course, also be used. Such conventional techniques and descriptions can be found in standard laboratory manuals such as Green, et al., Eds., Genome Analysis: A Laboratory Manual Series (Vols. I-IV) (1999); Weiner, Gabriel, Stephens, Eds., Genetic Variation: A Laboratory Manual (2007); Dieffenbach, Dveksler, Eds., PCR Primer: A Laboratory Manual (2003); Bowtell and Sambrook, DNA Microarrays: A Molecular Cloning Manual (2003); Mount, Bioinformatics: Sequence and Genome Analysis (2004); Sambrook and Russell, Condensed Protocols from Molecular Cloning: A Laboratory Manual (2006); and Sambrook and Russell, Molecular Cloning: A Laboratory Manual (2002) (all from Cold Spring Harbor Laboratory Press); Ausubel et al. eds., Current Protocols in Molecular Biology (1987); T. Brown ed., Essential Molecular Biology (1991), IRL Press; Goeddel ed., Gene Expression Technology (1991), Academic Press; A. Bothwell et al. eds., Methods for Cloning and Analysis of Eukaryotic Genes (1990), Bartlett Publ.; M. Kriegler, Gene Transfer and Expression (1990), Stockton Press; R. Wu et al. eds., Recombinant DNA Methodology (1989), Academic Press; M. 
McPherson et al., PCR: A Practical Approach (1991), IRL Press at Oxford University Press; Stryer, Biochemistry (4th Ed.) (1995), W. H. Freeman, New York N.Y.; Gait, Oligonucleotide Synthesis: A Practical Approach (2002), IRL Press, London; Nelson and Cox, Lehninger, Principles of Biochemistry (2000) 3rd Ed., W. H. Freeman Pub., New York, N.Y.; Berg, et al., Biochemistry (2002) 5th Ed., W. H. Freeman Pub., New York, N.Y., all of which are herein incorporated in their entireties by reference for all purposes.
A. Definitions
[0040] Unless defined otherwise, all technical and scientific terms used herein have the same meaning as is commonly understood by one of ordinary skill in the art to which the present disclosure belongs. If a definition set forth in this section is contrary to or otherwise inconsistent with a definition set forth in the patents, applications, published applications and other publications that are herein incorporated by reference, the definition set forth in this section prevails over the definition that is incorporated herein by reference.
[0041] As used herein, “a” or “an” means “at least one” or “one or more.” As used herein, the singular forms “a,” “an,” and “the” include the plural reference unless the context clearly dictates otherwise.
[0042] Throughout this disclosure, various aspects of the claimed subject matter are presented in a range format. It should be understood that the description in range format is merely for convenience and brevity and should not be construed as an inflexible limitation on the scope of the claimed subject matter. Accordingly, the description of a range should be considered to have specifically disclosed all the possible sub-ranges as well as individual numerical values within that range. For example, where a range of values is provided, it is understood that each intervening value, between the upper and lower limit of that range and any other stated or intervening value in that stated range is encompassed within the claimed subject matter. The upper and lower limits of these smaller ranges may independently be included in the smaller ranges, and are also encompassed within the claimed subject matter, subject to any specifically excluded limit in the stated range. Where the stated range includes one or both of the limits, ranges excluding either or both of those included limits are also included in the claimed subject matter. This applies regardless of the breadth of the range.
[0043] Reference to “about” a value or parameter herein includes (and describes) variations that are directed to that value or parameter per se. For example, description referring to “about X” includes description of “X”. Additionally, use of “about” preceding any series of numbers includes “about” each of the recited numbers in that series. For example, description referring to “about X, Y, or Z” is intended to describe “about X, about Y, or about Z.”
[0044] The term “average” as used herein refers to either a mean or a median, or any value used to approximate the mean or the median, unless the context clearly indicates otherwise.
[0045] A “subject” as used herein refers to an organism, or a part or component of the organism, to which the provided compositions, methods, kits, devices, and systems can be administered or applied. For example, the subject can be a mammal or a cell, a tissue, an organ, or a part of the mammal. As used herein, “mammal” refers to any of the mammalian class of species, preferably human (including humans, human subjects, or human patients). Mammals include, but are not limited to, farm animals, sport animals, pets, primates, horses, dogs, cats, and rodents such as mice and rats. Typically a subject is a mammal; preferably a subject is a human.
[0046] As used herein the term “sample” refers to anything which may contain a target molecule for which analysis is desired, including a biological sample. As used herein, a “biological sample” can refer to any sample obtained from a living or viral (or prion) source or other source of macromolecules and biomolecules, and includes any cell type or tissue of a subject from which nucleic acid, protein and/or other macromolecule can be obtained. The biological sample can be a sample obtained directly from a biological source or a sample that is processed. For example, isolated nucleic acids that are amplified constitute a biological sample. Biological samples include, but are not limited to, body fluids, such as blood, plasma, serum, cerebrospinal fluid, synovial fluid, urine, sweat, semen, stool, sputum, tears, mucus, amniotic fluid or the like, an effusion, a bone marrow sample, ascitic fluid, pelvic wash fluid, pleural fluid, spinal fluid, lymph, ocular fluid, extract of nasal, throat or genital swab, cell suspension from digested tissue, or extract of fecal material, and tissue and organ samples from animals and plants and processed samples derived therefrom.
[0047] The terms “polynucleotide,” “oligonucleotide,” “nucleic acid” and “nucleic acid molecule” are used interchangeably herein to refer to a polymeric form of nucleotides of any length, and comprise ribonucleotides, deoxyribonucleotides, and analogs or mixtures thereof. The terms include triple-, double- and single-stranded deoxyribonucleic acid (“DNA”), as well as triple-, double- and single-stranded ribonucleic acid (“RNA”). It also includes modified, for example by alkylation, and/or by capping, and unmodified forms of the polynucleotide. More particularly, the terms “polynucleotide,” “oligonucleotide,” “nucleic acid,” and “nucleic acid molecule” include polydeoxyribonucleotides (containing 2-deoxy-D-ribose), polyribonucleotides (containing D-ribose), including tRNA, rRNA, hRNA, and mRNA, whether spliced or unspliced, any other type of polynucleotide which is an N- or C-glycoside of a purine or pyrimidine base, and other polymers containing nonnucleotidic backbones, for example, polyamide (e.g., peptide nucleic acids (“PNAs”)) and polymorpholino (commercially available from the Anti-Virals, Inc., Corvallis, Oreg., as Neugene) polymers, and other synthetic sequence-specific nucleic acid polymers providing that the polymers contain nucleobases in a configuration which allows for base pairing and base stacking, such as is found in DNA and RNA. 
Thus, these terms include, for example, 3′-deoxy-2′,5′-DNA, oligodeoxyribonucleotide N3′ to P5′ phosphoramidates, 2′-O-alkyl-substituted RNA, hybrids between DNA and RNA or between PNAs and DNA or RNA, and also include known types of modifications, for example, labels, alkylation, “caps,” substitution of one or more of the nucleotides with an analog, inter-nucleotide modifications such as, for example, those with uncharged linkages (e.g., methyl phosphonates, phosphotriesters, phosphoramidates, carbamates, etc.), with negatively charged linkages (e.g., phosphorothioates, phosphorodithioates, etc.), and with positively charged linkages (e.g., aminoalkylphosphoramidates, aminoalkylphosphotriesters), those containing pendant moieties, such as, for example, proteins (including enzymes (e.g., nucleases), toxins, antibodies, signal peptides, poly-L-lysine, etc.), those with intercalators (e.g., acridine, psoralen, etc.), those containing chelates (of, e.g., metals, radioactive metals, boron, oxidative metals, etc.), those containing alkylators, those with modified linkages (e.g., alpha anomeric nucleic acids, etc.), as well as unmodified forms of the polynucleotide or oligonucleotide. A nucleic acid generally will contain phosphodiester bonds, although in some cases nucleic acid analogs may be included that have alternative backbones such as phosphoramidite, phosphorodithioate, or methylphosphoroamidite linkages; or peptide nucleic acid backbones and linkages. Other analog nucleic acids include those with bicyclic structures including locked nucleic acids, positive backbones, non-ionic backbones and non-ribose backbones. Modifications of the ribose-phosphate backbone may be done to increase the stability of the molecules; for example, PNA:DNA hybrids can exhibit higher stability in some environments. 
The terms “polynucleotide,” “oligonucleotide,” “nucleic acid” and “nucleic acid molecule” can comprise any suitable length, such as at least 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 100, 200, 300, 400, 500, or 1,000, or more than 1,000 nucleotides.
[0048] It will be appreciated that, as used herein, the terms “nucleoside” and “nucleotide” include those moieties which contain not only the known purine and pyrimidine bases, but also other heterocyclic bases which have been modified. Such modifications include methylated purines or pyrimidines, acylated purines or pyrimidines, or other heterocycles. Modified nucleosides or nucleotides can also include modifications on the sugar moiety, e.g., wherein one or more of the hydroxyl groups are replaced with halogen, aliphatic groups, or are functionalized as ethers, amines, or the like. The term “nucleotidic unit” is intended to encompass nucleosides and nucleotides. In preferred embodiments, the nucleoside or nucleotide is selected from the natural moieties comprised in DNA or RNA.
[0049] The terms “complementary” and “substantially complementary” include the hybridization or base pairing or the formation of a duplex between nucleotides or nucleic acids, for instance, between the two strands of a double-stranded DNA molecule or between an oligonucleotide primer and a primer binding site on a single-stranded nucleic acid. Complementary nucleotides are, generally, A and T (or A and U), or C and G. Two single-stranded RNA or DNA molecules are said to be substantially complementary when the nucleotides of one strand, optimally aligned and compared and with appropriate nucleotide insertions or deletions, pair with at least about 80% of the other strand, usually at least about 90% to about 95%, and even about 98% to about 100%. In one aspect, two complementary sequences of nucleotides are capable of hybridizing, preferably with less than 25%, more preferably with less than 15%, even more preferably with less than 5%, most preferably with no mismatches between opposed nucleotides. Preferably the two molecules will hybridize under conditions of high stringency.
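By way of illustration only, the 80% pairing threshold described above can be sketched in Python. This is a minimal sketch under stated assumptions, not a method of the disclosure: the function names `fraction_paired` and `is_substantially_complementary` are hypothetical, and the comparison is simplified to two equal-length strands read antiparallel without the insertions or deletions that an optimal alignment would allow.

```python
# Watson-Crick pairs named in the text: A with T (or A with U), and C with G.
PAIRS = {("A", "T"), ("T", "A"), ("A", "U"), ("U", "A"), ("C", "G"), ("G", "C")}


def fraction_paired(strand_a: str, strand_b: str) -> float:
    """Fraction of positions that pair when strand_b is read antiparallel to strand_a.

    Assumption: equal-length, gap-free comparison (no insertions or deletions).
    Both strands are given 5'->3'.
    """
    if not strand_a or len(strand_a) != len(strand_b):
        raise ValueError("strands must be non-empty and of equal length")
    a = strand_a.upper()
    b = strand_b.upper()[::-1]  # reverse strand_b so that it faces strand_a
    paired = sum((x, y) in PAIRS for x, y in zip(a, b))
    return paired / len(a)


def is_substantially_complementary(strand_a: str, strand_b: str,
                                   threshold: float = 0.80) -> bool:
    """Apply the 'at least about 80%' criterion as a hard cutoff (an assumption)."""
    return fraction_paired(strand_a, strand_b) >= threshold


if __name__ == "__main__":
    target = "ACGTACGTAC"
    print(fraction_paired(target, "GTACGTACGT"))                 # perfect reverse complement
    print(is_substantially_complementary(target, "GTACGTAAAA"))  # three mismatches
```

A stricter cutoff (for example 0.90 or 0.95) can be passed as `threshold` to reflect the narrower ranges recited in the same paragraph.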
[0050] “Hybridization” as used herein may refer to the process in which two single-stranded polynucleotides bind non-covalently to form a stable double-stranded polynucleotide. In one aspect, the resulting double-stranded polynucleotide can be a “hybrid” or “duplex.” “Hybridization conditions” typically include salt concentrations of approximately less than 1 M, often less than about 500 mM and may be less than about 200 mM. A “hybridization buffer” includes a buffered salt solution such as 5×SSPE, or other such buffers known in the art. Hybridization temperatures can be as low as 5° C., but are typically greater than 22° C., and more typically greater than about 30° C., and typically in excess of 37° C. Hybridizations are often performed under stringent conditions, i.e., conditions under which a sequence will hybridize to its target sequence but will not hybridize to other, non-complementary sequences. Stringent conditions are sequence-dependent and are different in different circumstances. For example, longer fragments may require higher hybridization temperatures for specific hybridization than short fragments. As other factors may affect the stringency of hybridization, including base composition and length of the complementary strands, presence of organic solvents, and the extent of base mismatching, the combination of parameters is more important than the absolute measure of any one parameter alone. Generally stringent conditions are selected to be about 5° C. lower than the Tm for the specific sequence at a defined ionic strength and pH. The melting temperature Tm can be the temperature at which a population of double-stranded nucleic acid molecules becomes half dissociated into single strands. Several equations for calculating the Tm of nucleic acids are well known in the art. 
As indicated by standard references, a simple estimate of the Tm value may be calculated by the equation Tm = 81.5 + 0.41 (% G+C), when a nucleic acid is in aqueous solution at 1 M NaCl (see e.g., Anderson and Young, Quantitative Filter Hybridization, in Nucleic Acid Hybridization (1985)). Other references (e.g., Allawi and SantaLucia, Jr., Biochemistry, 36:10581-94 (1997)) include alternative methods of computation which take structural and environmental, as well as sequence characteristics into account for the calculation of Tm.
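As a worked illustration of the simple estimate above, the following Python sketch computes the melting temperature from G+C content and then applies the "about 5° C. lower" rule for choosing a stringent temperature. It is a sketch only: the function names and the example sequence are hypothetical, the formula is used exactly as written above (no length or salt correction terms), and treating "about 5° C." as a fixed offset is an assumption.

```python
def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in the sequence (0 to 100)."""
    if not seq:
        raise ValueError("sequence must be non-empty")
    s = seq.upper()
    return 100.0 * sum(base in "GC" for base in s) / len(s)


def simple_tm(seq: str) -> float:
    """Simple estimate from the text: 81.5 + 0.41 * (% G+C), in degrees C.

    Stated in the text for a nucleic acid in aqueous solution at 1 M NaCl.
    """
    return 81.5 + 0.41 * gc_percent(seq)


def stringent_temperature(seq: str, offset_c: float = 5.0) -> float:
    """Temperature about `offset_c` degrees below the estimated melting temperature."""
    return simple_tm(seq) - offset_c


if __name__ == "__main__":
    example = "ATGCATGCAT"  # hypothetical sequence, 40% G+C
    print(round(gc_percent(example), 1))
    print(round(simple_tm(example), 1))
    print(round(stringent_temperature(example), 1))
```

Because the estimate depends only on base composition, two sequences with the same G+C percentage receive the same value regardless of length; the alternative computations cited above account for such factors.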
[0051] In general, the stability of a hybrid is a function of the ion concentration and temperature. Typically, a hybridization reaction is performed under conditions of lower stringency, followed by washes of varying, but higher, stringency. Exemplary stringent conditions include a salt concentration of at least 0.01 M to no more than 1 M sodium ion concentration (or other salt) at a pH of about 7.0 to about 8.3 and a temperature of at least 25° C. For example, conditions of 5×SSPE (750 mM NaCl, 50 mM sodium phosphate, 5 mM EDTA at pH 7.4) and a temperature of approximately 30° C. are suitable for allele-specific hybridizations, though a suitable temperature depends on the length and/or GC content of the region hybridized. In one aspect, “stringency of hybridization” in determining percentage mismatch can be as follows: 1) high stringency: 0.1×SSPE, 0.1% SDS, 65° C.; 2) medium stringency: 0.2×SSPE, 0.1% SDS, 50° C. (also referred to as moderate stringency); and 3) low stringency: 1.0×SSPE, 0.1% SDS, 50° C. It is understood that equivalent stringencies may be achieved using alternative buffers, salts and temperatures. For example, moderately stringent hybridization can refer to conditions that permit a nucleic acid molecule such as a probe to bind a complementary nucleic acid molecule. The hybridized nucleic acid molecules generally have at least 60% identity, including for example at least any of 70%, 75%, 80%, 85%, 90%, or 95% identity. Moderately stringent conditions can be conditions equivalent to hybridization in 50% formamide, 5×Denhardt's solution, 5×SSPE, 0.2% SDS at 42° C., followed by washing in 0.2×SSPE, 0.2% SDS, at 42° C. High stringency conditions can be provided, for example, by hybridization in 50% formamide, 5×Denhardt's solution, 5×SSPE, 0.2% SDS at 42° C., followed by washing in 0.1×SSPE, and 0.1% SDS at 65° C. 
Low stringency hybridization can refer to conditions equivalent to hybridization in 10% formamide, 5×Denhardt's solution, 6×SSPE, 0.2% SDS at 22° C., followed by washing in 1×SSPE, 0.2% SDS, at 37° C. Denhardt's solution contains 1% Ficoll, 1% polyvinylpyrrolidone, and 1% bovine serum albumin (BSA). 20×SSPE (sodium chloride, sodium phosphate, EDTA) contains 3 M sodium chloride, 0.2 M sodium phosphate, and 0.025 M EDTA. Other suitable moderate stringency and high stringency hybridization buffers and conditions are well known to those of skill in the art and are described, for example, in Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd ed., Cold Spring Harbor Press, Plainview, N.Y. (1989); and Ausubel et al., Short Protocols in Molecular Biology, 4th ed., John Wiley & Sons (1999).
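The working concentrations implied by the 20×SSPE composition given above follow from simple dilution arithmetic, which may be sketched as below. The dictionary and function names are hypothetical, and the sketch assumes ideal linear dilution of the stated 20× stock.

```python
# Composition of the 20x SSPE stock as stated in the text, in mol/L.
SSPE_20X_MOLAR = {
    "sodium chloride": 3.0,
    "sodium phosphate": 0.2,
    "EDTA": 0.025,
}


def dilute(stock: dict, stock_fold: float, working_fold: float) -> dict:
    """Scale every component of a stock buffer to a lower working strength.

    Assumption: concentrations scale linearly with the fold designation.
    """
    if stock_fold <= 0 or working_fold <= 0 or working_fold > stock_fold:
        raise ValueError("need 0 < working_fold <= stock_fold")
    factor = working_fold / stock_fold
    return {name: molar * factor for name, molar in stock.items()}


def to_millimolar(concentrations: dict) -> dict:
    """Convert mol/L values to mmol/L, rounded for display."""
    return {name: round(molar * 1000.0, 3) for name, molar in concentrations.items()}


if __name__ == "__main__":
    for fold in (5, 1, 0.2, 0.1):
        print(f"{fold}x SSPE:", to_millimolar(dilute(SSPE_20X_MOLAR, 20, fold)))
```

With the 20× composition stated above, the 5× working strength comes to 750 mM sodium chloride, 50 mM sodium phosphate, and 6.25 mM EDTA; the 5 mM EDTA figure quoted for 5×SSPE earlier in this paragraph would instead correspond to a 20× stock containing 0.02 M EDTA.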
[0052] Alternatively, substantial complementarity exists when an RNA or DNA strand will hybridize under selective hybridization conditions to its complement. Typically, selective hybridization will occur when there is at least about 65% complementarity over a stretch of at least 14 to 25 nucleotides, preferably at least about 75%, more preferably at least about 90% complementarity. See M. Kanehisa, Nucleic Acids Res. 12:203 (1984).
[0053] A “primer” used herein can be an oligonucleotide, either natural or synthetic, that is capable, upon forming a duplex with a polynucleotide template, of acting as a point of initiation of nucleic acid synthesis and being extended from its 3′ end along the template so that an extended duplex is formed. The sequence of nucleotides added during the extension process is determined by the sequence of the template polynucleotide. Primers usually are extended by a polymerase, for example, a DNA polymerase.
[0054] “Ligation” may refer to the formation of a covalent bond or linkage between the termini of two or more nucleic acids, e.g., oligonucleotides and/or polynucleotides, in a template-driven reaction. The nature of the bond or linkage may vary widely and the ligation may be carried out enzymatically. As used herein, ligations are usually carried out enzymatically to form a phosphodiester linkage between a 5′ carbon of a terminal nucleotide of one oligonucleotide and a 3′ carbon of another nucleotide.
[0055] “Amplification,” as used herein, generally refers to the process of producing multiple copies of a desired sequence. “Multiple copies” means at least 2 copies. A “copy” does not necessarily mean perfect sequence complementarity or identity to the template sequence. For example, copies can include nucleotide analogs such as deoxyinosine, intentional sequence alterations (such as sequence alterations introduced through a primer comprising a sequence that is hybridizable, but not complementary, to the template), and/or sequence errors that occur during amplification.
[0056] “Sequence determination” and the like include determination of information relating to the nucleotide base sequence of a nucleic acid. Such information may include the identification or determination of partial as well as full sequence information of the nucleic acid. Sequence information may be determined with varying degrees of statistical reliability or confidence. In one aspect, the term includes the determination of the identity and ordering of a plurality of contiguous nucleotides in a nucleic acid.
[0057] The terms “Sequencing,” “High throughput sequencing,” or “next generation sequencing” include sequence determination using methods that determine many (typically thousands to billions) nucleic acid sequences in an intrinsically parallel manner, i.e. where DNA templates are prepared for sequencing not one at a time, but in a bulk process, and where many sequences are read out preferably in parallel, or alternatively using an ultra-high throughput serial process that itself may be parallelized. Such methods include but are not limited to pyrosequencing (for example, as commercialized by 454 Life Sciences, Inc., Branford, CT); sequencing by ligation (for example, as commercialized in the SOLiD™ technology, Life Technologies, Inc., Carlsbad, Calif.); sequencing by synthesis using modified nucleotides (such as commercialized in TruSeq™ and HiSeq™ technology by Illumina, Inc., San Diego, Calif.; HeliScope™ by Helicos Biosciences Corporation, Cambridge, Mass.; and PacBio RS by Pacific Biosciences of California, Inc., Menlo Park, Calif.), sequencing by ion detection technologies (such as Ion Torrent™ technology, Life Technologies, Carlsbad, Calif.); sequencing of DNA nanoballs (Complete Genomics, Inc., Mountain View, Calif.); nanopore-based sequencing technologies (for example, as developed by Oxford Nanopore Technologies, LTD, Oxford, UK), and like highly parallelized sequencing methods.
[0058] “SNP” or “single nucleotide polymorphism” may include a genetic variation between individuals; e.g., a single nitrogenous base position in the DNA of organisms that is variable. SNPs are found across the genome; much of the genetic variation between individuals is due to variation at SNP loci, and often this genetic variation results in phenotypic variation between individuals. SNPs for use in the present disclosure and their respective alleles may be derived from any number of sources, such as public databases (U.C. Santa Cruz Human Genome Browser Gateway (genome.ucsc.edu/cgi-bin/hgGateway) or the NCBI dbSNP website (ncbi.nlm.nih.gov/SNP/)), or may be experimentally determined as described in U.S. Pat. No. 6,969,589; and US Pub. No. 2006/0188875 entitled “Human Genomic Polymorphisms.” Although the use of SNPs is described in some of the embodiments presented herein, it will be understood that other biallelic or multi-allelic genetic markers may also be used. A biallelic genetic marker is one that has two polymorphic forms, or alleles. As mentioned above, for a biallelic genetic marker that is associated with a trait, the allele that is more abundant in the genetic composition of a case group as compared to a control group is termed the “associated allele,” and the other allele may be referred to as the “unassociated allele.” Thus, for each biallelic polymorphism that is associated with a given trait (e.g., a disease or drug response), there is a corresponding associated allele. Other biallelic polymorphisms that may be used with the methods presented herein include, but are not limited to, multinucleotide changes, insertions, deletions, and translocations.
[0059] It will be further appreciated that references to DNA herein may include genomic DNA, mitochondrial DNA, episomal DNA, and/or derivatives of DNA such as amplicons, RNA transcripts, cDNA, DNA analogs, etc. The polymorphic loci that are screened in an association study may be in a diploid or a haploid state and, ideally, would be from sites across the genome. Technologies available for SNP sequencing, such as the BeadArray platform (GOLDENGATE™ assay) (Illumina, Inc., San Diego, Calif.) (see Fan, et al., Cold Spring Symp. Quant. Biol., 68:69-78 (2003)), may be employed.
[0060] “Multiplexing” or “multiplex assay” herein may refer to an assay or other analytical method in which the presence and/or amount of multiple targets, e.g., multiple nucleic acid sequences, can be assayed simultaneously by using more than one marker, each of which has at least one different detection characteristic, e.g., a fluorescence characteristic (for example excitation wavelength, emission wavelength, emission intensity, FWHM (full width at half maximum peak height), or fluorescence lifetime) or a unique nucleic acid or protein sequence characteristic.
[0061] As used herein, “disease or disorder” refers to a pathological condition in an organism resulting from, e.g., infection or genetic defect, and characterized by identifiable symptoms.
B. Overview
[0062] The following enumerated embodiments are representative of certain facets of the invention.
[0063] 1. In a first embodiment, the invention provides a plurality of polynucleotides, wherein each polynucleotide comprises: [0064] a first target-specific domain and a second target-specific domain configured to bind to a first target sequence and a second target sequence, respectively, of a nucleic acid target, and [0065] a unique molecule identifier (UMI) and a linker between the first and second target-specific domains, [0066] wherein the first and second target-specific domains are configured to be connected to each other such that the polynucleotide forms a circle, optionally after a polymerase-mediated extension of the first or second target-specific domain, and [0067] wherein the nucleic acid target comprises a polymorphic nucleotide within the first target sequence and/or the second target sequence, or between the first and second target sequences.
[0068] 2. The plurality of polynucleotides of embodiment 1, which are single-stranded polynucleotides.
[0069] 3. The plurality of polynucleotides of embodiment 1 or 2, which comprise a nucleic acid, an oligonucleotide, a DNA molecule, a DNA with pseudo-complementary bases, a DNA or RNA with one or more protected bases, an RNA molecule, a BNA molecule, an XNA molecule, a LNA molecule, a PNA molecule, or a γPNA molecule, or a combination thereof.
[0070] 4. The plurality of polynucleotides of any one of embodiments 1-3, which are between about 50 nucleotides and about 200 nucleotides in length, e.g., between about 90 nucleotides and about 100 nucleotides in length.
[0071] 5. The plurality of polynucleotides of any one of embodiments 1-4, wherein the first target-specific domain and/or the second target-specific domain are between about 15 nucleotides and about 30 nucleotides in length, e.g., about 20 nucleotides in length.
[0072] 6. The plurality of polynucleotides of any one of embodiments 1-5, wherein the UMI is between about 5 and about 15 nucleotides in length, e.g., between about 6 and about 8 nucleotides in length.
[0073] 7. The plurality of polynucleotides of any one of embodiments 1-6, wherein the linker is between about 10 and about 100 nucleotides in length, e.g., about 50 nucleotides in length.
[0074] 8. The plurality of polynucleotides of any one of embodiments 1-7, wherein the linker comprises one or more common nucleotides for subsequent PCR annealing. Any suitable nucleotide or short nucleotide sequence that will facilitate PCR can be used.
[0075] 9. The plurality of polynucleotides of any one of embodiments 1-8, wherein the distance between the first and the second target sequences is between about 0 and about 100 nucleotides.
[0076] 10. The plurality of polynucleotides of any one of embodiments 1-9, wherein each polynucleotide comprises the first target-specific domain, the UMI, the linker, and the second target-specific domain in the 5′ to 3′ direction.
[0077] 11. The plurality of polynucleotides of any one of embodiments 1-9, wherein each polynucleotide comprises the first target-specific domain, the linker, the UMI, and the second target-specific domain in the 5′ to 3′ direction.
[0078] 12. The plurality of polynucleotides of any one of embodiments 1-11, wherein the nucleic acid target is from a sex chromosome, such as a chromosome X or chromosome Y, or from an autosome.
[0079] 13. The plurality of polynucleotides of any one of embodiments 1-12, wherein the nucleic acid target is from a mammalian chromosome, such as a human chromosome.
[0080] 14. The plurality of polynucleotides of any one of embodiments 1-13, wherein the plurality of polynucleotides are configured to bind to a target sequence on human chromosome 1, human chromosome 2, human chromosome 3, human chromosome 4, human chromosome 9, human chromosome 13, human chromosome 15, human chromosome 18, human chromosome 19, human chromosome 21, human chromosome 22, human chromosome X, or human chromosome Y, or any combination thereof.
[0081] 15. The plurality of polynucleotides of embodiment 14, wherein the plurality of polynucleotides are configured to bind to a target sequence on human chromosome 21, human chromosome 18, human chromosome 13, human chromosome X, human chromosome Y, or at least one other human autosome, e.g., human chromosome 1, human chromosome 2, human chromosome 3, human chromosome 4, human chromosome 9, human chromosome 15, human chromosome 19, human chromosome 21, or any combination thereof.
[0082] 16. The plurality of polynucleotides of embodiment 15, wherein the plurality of polynucleotides are configured to bind to a target sequence on human chromosome 21, human chromosome 18, human chromosome 13, human chromosome X, human chromosome Y, and at least one other human autosome.
[0083] 17. The plurality of polynucleotides of any one of embodiments 1-16, wherein the polymorphic nucleotide is at a single nucleotide polymorphism (SNP) site.
[0084] 18. The plurality of polynucleotides of any one of embodiments 1-17, wherein the polymorphic nucleotide comprises a plurality of polymorphic nucleotides, for example, nucleotides at a plurality of single nucleotide polymorphism (SNP) sites.
[0085] 19. The plurality of polynucleotides of any one of embodiments 1-18, comprising between about 50 and about 150 polynucleotides (e.g., about 120 polynucleotides) configured to bind to a target sequence on human chromosome 1, e.g., any of target sequences 1-117 as set forth in Table 1 (the Table in
[0086] 20. The plurality of polynucleotides of any one of embodiments 1-19, comprising between about 10 and about 50 polynucleotides (e.g., about 40 polynucleotides) configured to bind to a target sequence on human chromosome 2, e.g., any of target sequences 2747-2784 as set forth in Table 1 (the Table in
[0087] 21. The plurality of polynucleotides of any one of embodiments 1-20, comprising between about 10 and about 80 polynucleotides (e.g., about 60 polynucleotides) configured to bind to a target sequence on human chromosome 3, e.g., any of target sequences 4072-4126 as set forth in Table 1 (the Table in
[0088] 22. The plurality of polynucleotides of any one of embodiments 1-21, comprising between about 10 and about 80 polynucleotides (e.g., about 50 polynucleotides) configured to bind to a target sequence on human chromosome 4, e.g., any of target sequences 4127-4171 as set forth in Table 1 (the Table in
[0089] 23. The plurality of polynucleotides of any one of embodiments 1-22, comprising between about 10 and about 80 polynucleotides (e.g., about 40 polynucleotides) configured to bind to a target sequence on human chromosome 9, e.g., any of target sequences 4172-4212 as set forth in Table 1 (the Table in
[0090] 24. The plurality of polynucleotides of any one of embodiments 1-23, comprising between about 100 and about 1,500 polynucleotides (e.g., about 1,200 polynucleotides) configured to bind to a target sequence on human chromosome 13, e.g., any of target sequences 118-1337 as set forth in Table 1 (the Table in
[0091] 25. The plurality of polynucleotides of any one of embodiments 1-24, comprising between about 10 and about 150 polynucleotides (e.g., about 100 polynucleotides) configured to bind to a target sequence on human chromosome 15, e.g., any of target sequences 1338-1444 as set forth in Table 1 (the Table in
[0092] 26. The plurality of polynucleotides of any one of embodiments 1-25, comprising between about 100 and about 1,500 polynucleotides (e.g., about 1,200 polynucleotides) configured to bind to a target sequence on human chromosome 18, e.g., any of target sequences 1445-2681 as set forth in Table 1 (the Table in
[0093] 27. The plurality of polynucleotides of any one of embodiments 1-26, comprising between about 10 and about 100 polynucleotides (e.g., about 60 polynucleotides) configured to bind to a target sequence on human chromosome 19, e.g., any of target sequences 2682-2746 as set forth in Table 1 (the Table in
[0094] 28. The plurality of polynucleotides of any one of embodiments 1-27, comprising between about 100 and about 1,500 polynucleotides (e.g., about 1,200 polynucleotides) configured to bind to a target sequence on human chromosome 21, e.g., any of target sequences 2785-3995 as set forth in Table 1 (the Table in
[0095] 29. The plurality of polynucleotides of any one of embodiments 1-28, comprising between about 10 and about 120 polynucleotides (e.g., about 70 polynucleotides) configured to bind to a target sequence on human chromosome 22, e.g., any of target sequences 3996-4071 as set forth in Table 1 (the Table in
[0096] 30. The plurality of polynucleotides of any one of embodiments 1-29, comprising between about 100 and about 500 polynucleotides (e.g., about 300 polynucleotides) configured to bind to a target sequence on human chromosome X, e.g., any of target sequences 4213-4462 as set forth in Table 1 (the Table in
[0097] 31. The plurality of polynucleotides of any one of embodiments 1-30, comprising between about 300 and about 800 polynucleotides (e.g., about 500 polynucleotides) configured to bind to a target sequence on human chromosome Y, e.g., any of target sequences 4463-4962 as set forth in Table 1 (the Table in
[0098] 32. The plurality of polynucleotides of any one of embodiments 1-31, comprising between about 300 and about 4,500 polynucleotides (e.g., about 3,600 polynucleotides) configured to bind to target sequences on human chromosomes 13, 18, and 21.
[0099] 33. The plurality of polynucleotides of any one of embodiments 1-32, comprising between about 120 and about 800 polynucleotides (e.g., about 540 polynucleotides) configured to bind to target sequences on one or more human autosomes other than chromosomes 13, 18, and 21.
[0100] 34. The plurality of polynucleotides of any one of embodiments 1-33, wherein the nucleic acid target comprises fragmented DNA of between about 100 and about 200 nucleotides in length (e.g., about 150 nucleotides in length).
[0101] 35. A method for analyzing a fetal genetic information, e.g., fetal fraction, a chromosome abnormality such as trisomy, sex determination and/or prenatal paternity test, comprising: [0102] a) contacting a sample from a female subject with the plurality of polynucleotides of any one of embodiments 1-34; and [0103] wherein nucleic acid sequence information of the sample is obtained, which indicates a fetal genetic information.
[0104] 36. The method of embodiment 35, wherein the female subject is known to be pregnant or suspected of being pregnant.
[0105] 37. The method of embodiment 35 or 36, wherein the sample is a blood, serum, plasma, buccal swab, urine, saliva, tear, or body fluid sample.
[0106] 38. The method of any one of embodiments 35-37, wherein the sample is freshly isolated or archived.
[0107] 39. The method of any one of embodiments 35-38, wherein the sample comprises genomic DNA and/or cfDNA.
[0108] 40. The method of any one of embodiments 35-39, wherein the sample comprises both maternal DNA and fetal DNA.
[0109] 41. The method of any one of embodiments 35-40, wherein the sample is obtained by a method comprising a blood collection step, a sample transportation step, a plasma preparation step, and/or a cfDNA extraction step, before the contacting step.
[0110] 42. The method of any one of embodiments 35-41, further comprising: [0111] b) allowing the plurality of polynucleotides to bind to nucleic acid targets in the sample.
[0112] 43. The method of embodiment 42, further comprising: [0113] c) allowing the first and second target-specific domains of each polynucleotide bound to its nucleic acid target sequences to connect with each other such that the polynucleotide forms a circle.
[0114] 44. The method of embodiment 43, wherein the connection is achieved by ligation.
[0115] 45. The method of embodiment 43, wherein the connection is achieved by polymerase-mediated extension of the first or second target-specific domain, followed by ligation of the extended first (or second) target-specific domain to the second (or first) target-specific domain, or by ligation of the extended second (or first) target-specific domain to the first (or second) target-specific domain.
[0116] 46. The method of any one of embodiments 43-45, further comprising: [0117] d) eliminating polynucleotides that are not in circular form, e.g., polynucleotides that are not bound to any nucleic acid target and/or polynucleotides whose first and second target-specific domains are not connected in step c).
[0118] 47. The method of embodiment 46, wherein the polynucleotides to be eliminated are linear, and step d) comprises contacting the sample from step c) with a nuclease, such as an exonuclease, e.g., Exo I and/or III.
[0119] 48. The method of embodiment 46 or 47, further comprising: [0120] e) releasing polynucleotides that are in circular form from their nucleic acid targets.
[0121] 49. The method of embodiment 48, wherein the releasing comprises cleaving the linkers of the polynucleotides that are in circular form.
[0122] 50. The method of embodiment 48 or 49, further comprising: [0123] f) an enrichment step, such as an amplification reaction, of the released polynucleotides.
[0124] 51. The method of embodiment 50, wherein the amplification reaction is a polymerase chain reaction (PCR), e.g., PCR using one or more primers in the linkers, a reverse-transcription PCR amplification, allele-specific PCR (ASPCR), single-base extension (SBE), allele specific primer extension (ASPE), strand displacement amplification (SDA), transcription mediated amplification (TMA), ligase chain reaction (LCR), nucleic acid sequence based amplification (NASBA), primer extension, rolling circle amplification (RCA), self-sustained sequence replication (3SR), the use of Q Beta replicase, nick translation, or loop-mediated isothermal amplification (LAMP), or any combination thereof.
[0125] 52. The method of embodiment 50 or 51, further comprising: [0126] g) obtaining the nucleic acid sequence information of the released polynucleotides, such as by hybridization-based detection and/or sequencing, including the observed UMI counts.
[0127] 53. The method of embodiment 52, further comprising: [0128] h) analyzing the nucleic acid sequence information obtained in step g).
[0129] 54. The method of embodiment 53, which is configured for analyzing fetal fraction, a chromosome abnormality such as trisomy, sex determination and/or prenatal paternity test.
[0130] 55. The method of embodiment 54, wherein the analyzing step comprises analyzing the fetal fractions and the UMI counts, and choosing the fetal fraction that best explains the observed UMI counts.
[0131] 56. The method of embodiment 55, wherein the analyzing step comprises assuming that the genotypes of the fetus and the female subject are known as $g_c$ and $g_m$, respectively, and calculating the frequencies of the polymorphic nucleotides (such as SNP nucleotides) according to the following formula:
[0133] 57. The method of embodiment 54, which is configured for analyzing trisomy using a hybrid of a depth-based and genotype-based approach.
[0134] 58. The method of embodiment 54, which is configured for sex determination using an extension of the trisomy depth model to the sex chromosomes.
[0135] 59. The method of any one of embodiments 54-58, which is conducted without using or referring to a known genotype.
[0136] 60. The method of embodiment 54, which is configured for prenatal paternity test using the following analysis: [0137] 1) using a fully marginalized model to determine fetal fraction and produce a ‘baseline’ likelihood; [0138] 2) conducting a second analysis through the model to perform the same calculation as in step 1) using a putative father's genotype to constrain the genotypes to only those consistent with inheriting a paternal allele to obtain a second likelihood; and [0139] 3) deciding against paternity of the putative father when the ratio P[data|baseline]/P[data|paternal] is more than a threshold, e.g., P[data|baseline]/P[data|paternal] > 10.
[0140] 61. A kit for analyzing a fetal genetic information, e.g., fetal fraction, a chromosome abnormality such as trisomy, sex determination and/or prenatal paternity test, comprising a plurality of polynucleotides of any one of embodiments 1-34.
[0141] 62. The kit of embodiment 61, which further comprises a reagent and/or a container for obtaining, preparing, isolating, enriching, purifying, storing and/or transporting a sample, e.g., a blood, serum, plasma, buccal swab, urine, saliva, tear, or body fluid sample.
[0142] 63. The kit of embodiment 61 or 62, which further comprises a reagent for obtaining, preparing, isolating, enriching, purifying, storing and/or transporting polynucleotides, e.g., genomic DNA and/or cfDNA, from a sample.
[0143] 64. The kit of any one of embodiments 61-63, wherein the polynucleotides comprise both maternal DNA and fetal DNA.
[0144] 65. The kit of any one of embodiments 61-63, which further comprises a ligase.
[0145] 66. The kit of embodiment 65, which further comprises an enzyme, e.g., a polymerase, and/or another reagent for polymerase-mediated extension of the first or second target-specific domain.
[0146] 67. The kit of any one of embodiments 61-66, which further comprises a reagent, e.g., an enzyme, a buffer or a washing solution, for eliminating polynucleotides that are not in circular form, e.g., polynucleotides that are not bound to any nucleic acid target and/or polynucleotides whose first and second target-specific domains are not connected.
[0147] 68. The kit of embodiment 67, wherein the enzyme is a nuclease, such as an exonuclease, e.g., Exo I and/or III.
[0148] 69. The kit of embodiment 67 or 68, which further comprises a reagent, e.g., an enzyme or a polymerase, for enriching or amplifying the released polynucleotides.
[0149] 70. The kit of embodiment 69, wherein the reagent, e.g., an enzyme, is configured to be used in an amplification reaction selected from the group consisting of a polymerase chain reaction (PCR), e.g., PCR using one or more primers in the linkers, a reverse-transcription PCR amplification, allele-specific PCR (ASPCR), single-base extension (SBE), allele specific primer extension (ASPE), strand displacement amplification (SDA), transcription mediated amplification (TMA), ligase chain reaction (LCR), nucleic acid sequence based amplification (NASBA), primer extension, rolling circle amplification (RCA), self-sustained sequence replication (3SR), the use of Q Beta replicase, nick translation, loop-mediated isothermal amplification (LAMP), and any combination thereof.
[0150] 71. The kit of any one of embodiments 61-70, which further comprises a reagent, e.g., an enzyme, for obtaining the nucleic acid sequence information of the released polynucleotides, such as by hybridization-based detection and/or sequencing, including the observed UMI counts.
[0151] 72. The kit of embodiment 71, which further comprises means for analyzing the nucleic acid sequence information.
[0152] 73. The kit of embodiment 72, wherein the means is configured for analyzing fetal fraction, a chromosome abnormality such as trisomy, sex determination and/or prenatal paternity test.
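The probe layout recited in embodiments 4-7 and 10-11 can be illustrated with a short sketch. This is only an illustration under stated assumptions: the class `Probe` and the helpers `random_umi` and `make_probe` are hypothetical names, the arm and linker sequences are placeholders rather than sequences from Table 1, and the "about 50 to about 200 nucleotides" range is treated as a hard bound for simplicity.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Probe:
    """One linear probe: two target-specific arms flanking a UMI and a linker."""
    arm1: str    # first target-specific domain (5' end)
    umi: str     # unique molecule identifier
    linker: str  # common linker, e.g. carrying PCR primer sites
    arm2: str    # second target-specific domain (3' end)

    def sequence(self, umi_first: bool = True) -> str:
        # Embodiment 10 places the UMI before the linker; embodiment 11 after it.
        middle = self.umi + self.linker if umi_first else self.linker + self.umi
        return self.arm1 + middle + self.arm2


def random_umi(length: int = 8, rng=None) -> str:
    """Draw a random UMI of the given length from the four DNA bases."""
    rng = rng or random.Random()
    return "".join(rng.choice("ACGT") for _ in range(length))


def make_probe(arm1: str, arm2: str, linker: str, umi_length: int = 8, rng=None) -> Probe:
    """Assemble a probe and check its total length (hard 50-200 nt bound is an assumption)."""
    probe = Probe(arm1, random_umi(umi_length, rng), linker, arm2)
    total = len(probe.sequence())
    if not 50 <= total <= 200:
        raise ValueError(f"probe length {total} is outside the 50-200 nt range")
    return probe
```

With 20-nucleotide arms, an 8-nucleotide UMI, and a 50-nucleotide linker, the assembled probe is 98 nucleotides long, which falls in the "about 90 to about 100" range given as an example in embodiment 4.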
[0153] The methods of the invention are illustrated by the following description and examples: based on these, the skilled person can apply the methods to a variety of samples and targets. The examples are illustrative and are not to be seen as limiting the scope of the invention.
[0154] A plurality of single stranded DNA connector inversion probes are typically used to simultaneously capture selected human genomic DNA regions which contain one or more SNPs on chromosome 21, 18, 13, X, Y, and some other autosomal chromosomes. The captured regions provide both SNP information and depth coverage information which allow the user to simultaneously measure fetal fraction and better characterize Trisomy, for example.
[0155] Target selection and connector inversion probes design are illustrated in
[0156] Capture method and optimization. The procedures to capture the human genomic targets using single stranded connector inversion probes were described previously [21,22]. As illustrated in
[0157] Input DNA. Genomic DNA can be fragmented to a peak size of ˜150 bp using Covaris, purified using AMPure XP, and the concentration measured by Qubit. DNA from a mother and DNA from her son or daughter can be mixed at a suitable ratio to create a DNA mixture that mimics cfDNA from pregnant women with various fetal fractions and/or a T21 pregnancy. cfDNA can be extracted from the plasma of pregnant women by methods well-known in the art.
[0158] Connector inversion probes selection. Around 5000 probes were selected as the final working probes for the Afisawa assay from an initial set of 14,000 probes designed using licensed software (described in US 20140357497 A1, “Designing padlock probes for targeted genomic sequencing,” Kun Zhang and Athurva Gore), as previously described [21,22].
Example 1
[0159] Using the methods described above, reference samples and test samples were subjected to Afisawa testing, and the results are shown in
[0160] As depicted in
[0161] 6 ng of either cfDNA from a pregnant woman or a genomic DNA mixture was used as input for the Afisawa assay. The FASTQ file from each sample went through the mapping pipeline. Total-UMIs represent the total number of targets being captured (
[0162] The human-mapped reads, on-target reads, total UMIs, and the number of UMIs containing different SNPs are shown in
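The UMI tallies referred to above can be illustrated with a minimal counting sketch. The source does not describe the mapping pipeline, so the input format (one `(probe_id, umi, allele)` tuple per mapped on-target read) and the function name `count_umis` are assumptions made for illustration. Reads sharing a probe and a UMI are treated as copies of one captured molecule, and the first allele seen for that molecule is kept, which is a simplification.

```python
from collections import defaultdict


def count_umis(reads):
    """Collapse reads to unique (probe, UMI) molecules and tally alleles.

    reads: iterable of (probe_id, umi, allele) tuples, one per mapped read.
    Returns (total_umis, per_allele), where per_allele maps
    (probe_id, allele) to the number of distinct UMIs observed.
    """
    seen = set()
    per_allele = defaultdict(int)
    for probe_id, umi, allele in reads:
        key = (probe_id, umi)
        if key in seen:
            continue  # amplification duplicate of a molecule already counted
        seen.add(key)
        per_allele[(probe_id, allele)] += 1
    return len(seen), dict(per_allele)
```

Counting molecules rather than reads keeps amplification bias out of the allele counts used by the statistical models that follow.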
Fetal Fraction Estimate
[0163] The fetal fraction estimate is produced by choosing the fraction of fetal reads which best explains the observed UMI counts. The basis for the model is the observation that SNP allele counts are informative whenever the fetal genotype differs from the maternal genotype. While neither of these genotypes is known, we observe that (i) the parents are unrelated and (ii) each SNP has a known population frequency. These two observations enable one to propagate allele count information through the genotype uncertainty.
[0164] Specifically: assuming the genotypes are known, the SNP allele counts follow a standard binomial distribution, with frequencies of
[0165] where $f$ is the fetal fraction, and $g_c$ and $g_m$ are the genotypes of the child and mother, respectively (0 for AA, 1 for Aa, and 2 for aa).
[0166] The count model is nested within the genotype model. $g_m$ is a binomial draw from the population frequency, $q$, while $g_c$ must share one allele with $g_m$ due to Mendelian inheritance, and the other allele is randomly drawn from the population with frequency $q$. We can then use Bayes' rule to maximize:
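The nested count and genotype model of the two preceding paragraphs can be sketched as a grid search over the fetal fraction. The formulas themselves are not reproduced in the text above, so this sketch rests on assumptions: the allele frequency for known genotypes is taken as f·g_c/2 + (1−f)·g_m/2, which is one reading consistent with the stated genotype coding; a small per-base error rate `eps` is added so that no probability is exactly 0 or 1; and all function names are hypothetical.

```python
import math


def log_binom_pmf(k, n, p):
    """Log probability of k successes in n binomial trials with rate p."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log(1 - p))


def genotype_prior(q):
    """Yield (g_m, g_c, prior) for population allele frequency q.

    g_m is a Binomial(2, q) draw; the child receives one maternal allele
    (Mendelian) and one allele drawn from the population with frequency q.
    """
    for g_m in (0, 1, 2):
        p_gm = math.comb(2, g_m) * q ** g_m * (1 - q) ** (2 - g_m)
        t = g_m / 2  # probability that the mother transmits allele 'a'
        for a_m, p_am in ((0, 1 - t), (1, t)):
            for a_p, p_ap in ((0, 1 - q), (1, q)):
                prior = p_gm * p_am * p_ap
                if prior > 0:
                    yield g_m, a_m + a_p, prior


def snp_log_likelihood(k, n, q, f, eps=1e-3):
    """Log likelihood of k 'a' UMIs out of n, marginalized over genotypes."""
    terms = []
    for g_m, g_c, prior in genotype_prior(q):
        p = f * g_c / 2 + (1 - f) * g_m / 2   # assumed mixture frequency
        p = p * (1 - eps) + (1 - p) * eps     # assumed error model
        terms.append(math.log(prior) + log_binom_pmf(k, n, p))
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))


def estimate_fetal_fraction(snps, grid=None):
    """Return the grid value of f that maximizes the total log likelihood.

    snps: iterable of (k, n, q) tuples, one per SNP.
    """
    snps = list(snps)
    if grid is None:
        grid = [i / 200 for i in range(1, 100)]  # 0.005 to 0.495
    return max(grid, key=lambda f: sum(snp_log_likelihood(k, n, q, f)
                                       for k, n, q in snps))
```

A grid search is used here only because it is easy to inspect; any one-dimensional optimizer would serve the same purpose.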
Trisomy Model
[0167] The trisomy model is a hybrid of a depth-based and genotype-based approach. For the depth model, by knowing the number of probes for each chromosome, we can calculate the expected number of UMI for each chromosome under normal, trisomy, and haploid states. In particular:
where $N$ is the total number of UMIs, and $M_i$ is the number of probes on chromosome $i$. The variances of these estimates are $Nq(1-q)$, and the estimates are approximately normal by the central limit theorem. Then, at a fetal fraction $f$, the observed number of UMIs in the triploid state follows a normal distribution with mean
$$N_{21}^{(\mathrm{obs,trip})} = f\,N_{21}^{(\mathrm{trip})} + (1-f)\,N_{21}^{(\mathrm{dip})}$$
[0168] and variance $f\,N q_{21}^{(\mathrm{trip})}\bigl(1-q_{21}^{(\mathrm{trip})}\bigr) + (1-f)\,N q_{21}^{(\mathrm{dip})}\bigl(1-q_{21}^{(\mathrm{dip})}\bigr)$. We summarize the observed UMIs as both a Z-score under the pure $N_{21}^{(\mathrm{dip})}$ distribution and a Bayes factor for $N_{21}^{(\mathrm{obs,trip})}$ versus $N_{21}^{(\mathrm{dip})}$.
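The depth model can be sketched numerically as follows. The expressions for the per-chromosome rates are not reproduced in the text above, so the sketch assumes that the diploid rate is the chromosome 21 share of probes, M_21/ΣM, and that the trisomy rate weights chromosome 21 probes by 1.5; both choices, and the function names, are assumptions for illustration. The mean and variance of the mixture follow the expressions given in the text.

```python
import math


def normal_logpdf(x, mean, var):
    """Log density of a normal distribution with the given mean and variance."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def depth_trisomy_stats(obs_21, total_umi, probes_21, probes_total, f):
    """Return (z_score, log_bayes_factor) for chromosome 21 depth.

    obs_21: observed UMI count on chromosome 21
    total_umi: total UMI count N
    probes_21, probes_total: probe counts M_21 and the sum over all M_i
    f: fetal fraction
    """
    # Assumed rates: share of probes, with chr21 weighted 1.5x under trisomy.
    q_dip = probes_21 / probes_total
    q_trip = 1.5 * probes_21 / (probes_total + 0.5 * probes_21)

    n_dip = total_umi * q_dip
    n_trip = total_umi * q_trip
    var_dip = total_umi * q_dip * (1 - q_dip)
    var_trip = total_umi * q_trip * (1 - q_trip)

    # Fetal-fraction-weighted mean and variance, as in the text.
    mean_mix = f * n_trip + (1 - f) * n_dip
    var_mix = f * var_trip + (1 - f) * var_dip

    z_score = (obs_21 - n_dip) / math.sqrt(var_dip)
    log_bf = (normal_logpdf(obs_21, mean_mix, var_mix)
              - normal_logpdf(obs_21, n_dip, var_dip))
    return z_score, log_bf
```

A positive log Bayes factor favors the trisomy mixture over the pure diploid distribution; the Z-score alone does not use the fetal fraction.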
[0169] The genotype-based trisomy model follows the same approach as the fetal fraction model, but instead contrasts a trisomy model with a normal model. Briefly:
where S is either ‘trisomy’ or ‘normal’.
[0170] The observed UMI counts are again binomial, with the mean and variance given by the (fetal fraction)-weighted average of the child and mother allele frequencies. The likelihood of the child's genotype state depends on two additional unknown factors: i) which parent contributed the extra chromosome, and ii) which meiotic division resulted in the duplication. For instance, a paternal first-division nondisjunction will contribute both of the father's alleles, while a paternal second-division nondisjunction will contribute one of the father's alleles at copy number 2. Based on epidemiological studies, the probability of paternal origin is set at 8.3%, and the probability of first-division nondisjunction is set at 30%. The evidence of trisomy based on SNP allele counts is summarized as a Bayes factor for P[trisomy|D] vs P[normal|D].
[0171] The final trisomy score is obtained by adding the depth and genotype Bayes factors, with larger scores corresponding to higher confidence in the presence of an extra copy of chromosome 21.
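The two prior probabilities quoted above, and the combination of the two lines of evidence, can be made concrete with a small sketch. It assumes that parental origin and meiotic division are independent, and that "adding" the Bayes factors means adding their base-10 logarithms; neither assumption is stated in the text, and the names used are hypothetical.

```python
import math

P_PATERNAL = 0.083        # probability of paternal origin (from the text)
P_FIRST_DIVISION = 0.30   # probability of first-division nondisjunction (from the text)


def origin_priors():
    """Joint prior over (parent, division), assuming the two are independent."""
    priors = {}
    for parent, p_parent in (("paternal", P_PATERNAL),
                             ("maternal", 1 - P_PATERNAL)):
        for division, p_div in (("first", P_FIRST_DIVISION),
                                ("second", 1 - P_FIRST_DIVISION)):
            priors[(parent, division)] = p_parent * p_div
    return priors


def combined_trisomy_score(bf_depth, bf_genotype):
    """Sum of log10 Bayes factors (the log scale is an assumption)."""
    return math.log10(bf_depth) + math.log10(bf_genotype)
```

Under these assumptions a paternal first-division event has prior probability 0.083 × 0.30 ≈ 0.025, so the genotype likelihood is dominated by the maternal scenarios unless the allele counts strongly indicate otherwise.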
Sex Determination
[0172] Applying this method for sex determination is an extension of the trisomy depth model to the sex chromosomes. In particular
[0173] We model the observed X and Y counts as binomial distributions with rates
$$f\,q_x^{(H)} + (1-f)\,q_x^{(\mathrm{female})}; \qquad f\,q_y^{(H)} + (1-f)\,q_y^{(\mathrm{female})}$$
respectively, with H the hypothesized child sex. Whichever hypothesis maximizes the posterior likelihood (after marginalizing) is selected as the observed sex.
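The sex call can be sketched as a comparison of two binomial likelihoods. The per-chromosome rates are not reproduced in the text above, so this sketch assumes they are copy-number-weighted shares of probes (two X copies and no Y for a female, one X and one Y for a male), adds a small `background` rate so that stray Y counts in a female sample do not have zero probability, and treats the fetal fraction as already known instead of marginalizing over it. All names are hypothetical.

```python
import math


def log_binom_pmf(k, n, p):
    """Log probability of k successes in n binomial trials with rate p."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log(1 - p))


def chromosome_rates(m_auto, m_x, m_y, sex, background=1e-4):
    """Assumed expected UMI shares (q_x, q_y) for a genome of the given sex."""
    x_copies, y_copies = (2, 0) if sex == "female" else (1, 1)
    denom = 2 * m_auto + x_copies * m_x + y_copies * m_y
    q_x = x_copies * m_x / denom
    q_y = max(y_copies * m_y / denom, background)
    return q_x, q_y


def call_fetal_sex(x_count, y_count, total, f, m_auto, m_x, m_y):
    """Return (best_hypothesis, scores) given X, Y and total UMI counts."""
    qx_mat, qy_mat = chromosome_rates(m_auto, m_x, m_y, "female")
    scores = {}
    for hypothesis in ("female", "male"):
        qx_h, qy_h = chromosome_rates(m_auto, m_x, m_y, hypothesis)
        rate_x = f * qx_h + (1 - f) * qx_mat   # mixture rate for X
        rate_y = f * qy_h + (1 - f) * qy_mat   # mixture rate for Y
        scores[hypothesis] = (log_binom_pmf(x_count, total, rate_x)
                              + log_binom_pmf(y_count, total, rate_y))
    return max(scores, key=scores.get), scores
```

With equal prior weight on the two hypotheses, picking the larger log likelihood is the same as picking the larger posterior.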
Prenatal Paternity Test
[0174] Afisawa can potentially be used as a prenatal paternity test if genotyping of the potential father's genomic DNA is available. The genotype model for fetal fraction is used, and the fully marginalized model is used to determine fetal fraction and produce a ‘baseline’ likelihood. A second pass through the model performs the same calculation, but instead of summing over all possible child genotypes, the putative father's genotypes are used to constrain the genotypes to only those consistent with inheriting a paternal allele. This results in a second likelihood. If this second likelihood is more than ten times smaller than the baseline (P[data|baseline]/P[data|paternal]>10), then this is taken as evidence against paternity.
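The two ingredients of the paternity test, the father-constrained child genotypes and the likelihood-ratio decision, can be sketched as follows. The sketch assumes the genotype coding used for the fetal fraction model (0, 1, or 2 copies of allele a), natural-log likelihoods as inputs, and the example threshold of 10; the function names are hypothetical.

```python
import math


def child_genotype_probs(g_m, g_f):
    """Mendelian probabilities of the child genotype given both parents.

    g_m, g_f: maternal and putative paternal genotypes (0, 1 or 2 copies of 'a').
    Genotypes with probability 0 are inconsistent with the putative father.
    """
    probs = {0: 0.0, 1: 0.0, 2: 0.0}
    for a_m, p_m in ((0, 1 - g_m / 2), (1, g_m / 2)):
        for a_p, p_p in ((0, 1 - g_f / 2), (1, g_f / 2)):
            probs[a_m + a_p] += p_m * p_p
    return probs


def paternity_decision(loglik_baseline, loglik_paternal, threshold=10.0):
    """Return (evidence_against_paternity, log_ratio).

    Both inputs are natural-log likelihoods of the same data; the decision
    is P[data|baseline] / P[data|paternal] > threshold.
    """
    log_ratio = loglik_baseline - loglik_paternal
    return log_ratio > math.log(threshold), log_ratio
```

Working with log likelihoods avoids underflow when thousands of SNPs are combined, and the comparison against log(10) is equivalent to the ratio test stated in the text.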
Example 2
Gender Identification
[0175] In this example, 200-500 single stranded connector inversion probes targeting human Y Chromosome were used to detect the male/female pregnancy. In the example of
Example 3
Fetal Fraction
[0176] One major feature of Afisawa assay is its capability to measure the fetal fraction regardless of male or female pregnancy. Fragmented DNA mixtures mimicking either female pregnancy (
[0177]
[0178] In the example of
Example 4
Trisomy Call
[0179] In the example summarized in
[0180] From a single test, one to four applications of Afisawa (fetal fraction estimate, fetal sex determination, trisomy call, and prenatal paternity test) can be achieved by using different combinations of probes for target capture and/or data analysis.
[0181] Based on the foregoing, the skilled person can design suitable polynucleotides to use as probes for the methods of the invention. The probes can be directed to many different targets of interest for NIPT testing. The Table in
REFERENCES
[0182] 1. Gao Y, Xie B, Liu R. Delivering noninvasive prenatal testing in a clinical setting using semiconductor sequencing platform. Sci China Life Sci. 2014 July; 57(7):737-8. doi: 10.1007/s11427-014-4696-0. Epub 2014 Jun. 26. PMID: 24969704
[0183] 2. Lo Y M, Corbetta N, Chamberlain P F, Rai V, Sargent I L, Redman C W, Wainscoat J S. Presence of fetal DNA in maternal plasma and serum. Lancet. 1997 Aug. 16; 350(9076):485-7. PMID: 9274585
[0184] 3. Wong F C, Lo Y M. Prenatal Diagnosis Innovation: Genome Sequencing of Maternal Plasma. Annu Rev Med. 2016; 67:419-32. doi: 10.1146/annurev-med-091014-115715. Epub 2015 Oct. 15.
[0185] 4. Chiu R W, Chan K C, Gao Y, Lau V Y, Zheng W, Leung T Y, Foo C H, Xie B, Tsui N B, Lun F M, Zee B C, Lau T K, Cantor C R, Lo Y M. Noninvasive prenatal diagnosis of fetal chromosomal aneuploidy by massively parallel genomic sequencing of DNA in maternal plasma. Proc Natl Acad Sci U S A. 2008 Dec. 23; 105(51):20458-63. doi: 10.1073/pnas.0810641105. Epub 2008 Dec. 10.
[0186] 5. Fan H C, Blumenfeld Y J, Chitkara U, Hudgins L, Quake S R. Noninvasive diagnosis of fetal aneuploidy by shotgun sequencing DNA from maternal blood. Proc Natl Acad Sci U S A. 2008 Oct. 21; 105(42):16266-71. doi: 10.1073/pnas.0808319105. Epub 2008 Oct. 6. PMID: 18838674
[0187] 6. Chiu R W, Akolekar R, Zheng Y W, Leung T Y, Sun H, Chan K C, Lun F M, Go A T, Lau E T, To W W, Leung W C, Tang R Y, Au-Yeung S K, Lam H, Kung Y Y, Zhang X, van Vugt J M, Minekawa R, Tang M H, Wang J, Oudejans C B, Lau T K, Nicolaides K H, Lo Y M. Non-invasive prenatal assessment of trisomy 21 by multiplexed maternal plasma DNA sequencing: large scale validity study. BMJ. 2011 Jan. 11; 342:c7401. doi: 10.1136/bmj.c7401
[0188] 7. Bianchi D W, Parker R L, Wentworth J, Madankumar R, Saffer C, Das A F, Craig J A, Chudova D I, Devers P L, Jones K W, Oliver K, Rava R P, Sehnert A J; CARE Study Group. DNA sequencing versus standard prenatal aneuploidy screening. N Engl J Med. 2014 Feb. 27; 370(9):799-808. doi: 10.1056/NEJMoa1311037
[0189] 8. Norton M E, Jacobsson B, Swamy G K, Laurent L C, Ranzini A C, Brar H, Tomlinson M W, Pereira L, Spitz J L, Hollemon D, Cuckle H, Musci T J, Wapner R J. Cell-free DNA analysis for noninvasive examination of trisomy. N Engl J Med. 2015 Apr. 23; 372(17):1589-97. doi: 10.1056/NEJMoa1407349. Epub 2015 Apr. 1.
[0190] 9. Norton M E, Brar H, Weiss J, Karimi A, Laurent L C, Caughey A B, Rodriguez M H, Williams J 3rd, Mitchell M E, Adair C D, Lee H, Jacobsson B, Tomlinson M W, Oepkes D, Hollemon D, Sparks A B, Oliphant A, Song K. Non-Invasive Chromosomal Evaluation (NICE) Study: results of a multicenter prospective cohort study for detection of fetal trisomy 21 and trisomy 18. Am J Obstet Gynecol. 2012 August; 207(2):137.e1-8. doi: 10.1016/j.ajog.2012.05.021. Epub 2012 Jun. 1.
[0191] 10. Sparks A B, Struble C A, Wang E T, Song K, Oliphant A. Noninvasive prenatal detection and selective analysis of cell-free DNA obtained from maternal blood: evaluation for trisomy 21 and trisomy 18. Am J Obstet Gynecol. 2012 April; 206(4):319.e1-9. doi: 10.1016/j.ajog.2012.01.030. Epub 2012 Jan. 26. PMID: 22464072
[0192] 11. Sparks A B, Wang E T, Struble C A, Barrett W, Stokowski R, McBride C, Zahn J, Lee K, Shen N, Doshi J, Sun M, Garrison J, Sandler J, Hollemon D, Pattee P, Tomita-Mitchell A, Mitchell M, Stuelpnagel J, Song K, Oliphant A. Selective analysis of cell-free DNA in maternal blood for evaluation of fetal trisomy. Prenat Diagn. 2012 January; 32(1):3-9. doi: 10.1002/pd.2922. Epub 2012 Jan. 6.
[0193] 12. Zimmermann B, Hill M, Gemelos G, Demko Z, Banjevic M, Baner J, Ryan A, Sigurjonsson S, Chopra N, Dodd M, Levy B, Rabinowitz M. Noninvasive prenatal aneuploidy testing of chromosomes 13, 18, 21, X, and Y, using targeted sequencing of polymorphic loci. Prenat Diagn. 2012 December; 32(13):1233-41. doi: 10.1002/pd.3993. Epub 2012 Oct. 30.
[0194] 13. Agarwal A, Sayres L C, Cho M K, Cook-Deegan R, Chandrasekharan S. Commercial landscape of noninvasive prenatal testing in the United States. Prenat Diagn. 2013; 33(6):521-31. doi: 10.1002/pd.4101
[0195] 14. Canick J A, Palomaki G E, Kloza E M, Lambert-Messerlian G M, Haddow J E. The impact of maternal plasma DNA fetal fraction on next generation sequencing tests for common fetal aneuploidies. Prenat Diagn. 2013 July; 33(7):667-74. doi: 10.1002/pd.4126. Epub 2013 May 31. PMID: 23592541
[0196] 15. Gregg A R et al. Noninvasive prenatal screening for fetal aneuploidy, 2016 update: a position statement of the American College of Medical Genetics and Genomics. Genetics in Medicine. 2016.
[0197] 16. Peng X L, Jiang P. Bioinformatics Approaches for Fetal DNA Fraction Estimation in Noninvasive Prenatal Testing. Int J Mol Sci. 2017 Feb. 20; 18(2). pii: E453. doi: 10.3390/ijms18020453. PMID: 28230760
[0198] 17. Nilsson M, Malmgren H, Samiotaki M, Kwiatkowski M, Chowdhary B P, Landegren U. Padlock probes: circularizing oligonucleotides for localized DNA detection. Science. 1994; 265(5181):2085-208
[0199] 18. Akhras M S, Unemo M, Thiyagarajan S, Nyrén P, Davis R W, Fire A Z, Pourmand N. Connector Inversion Probe Technology: A Powerful One-Primer Multiplex DNA Amplification System for Numerous Scientific Applications. PLoS One. 2007; 2(9):e915. doi: 10.1371/journal.pone.0000915. Published online 2007 Sep. 19.
[0200] 19. Porreca G J, Zhang K, Li J B, Xie B, Austin D, Vassallo S L, LeProust E M, Peck B J, Emig C J, Dahl F, Gao Y, Church G M, Shendure J. Multiplex amplification of large sets of human exons. Nat Methods. 2007; 4(11):931-6.
[0201] 20. Li J B, Gao Y, Aach J, Zhang K, Kryukov G V, Xie B, Ahlford A, Yoon J K, Rosenbaum A M, Zaranek A W, LeProust E, Sunyaev S R, Church G M. Multiplex padlock targeted sequencing reveals human hypermutable CpG variations. Genome Res. 2009 September; 19(9):1606-15. doi: 10.1101/gr.092213.109. Epub 2009 Jun. 12.
[0202] 21. Deng J, Shoemaker R, Xie B, Gore A, LeProust E M, Antosiewicz-Bourget J, Egli D, Maherali N, Park I H, Yu J, Daley G Q, Eggan K, Hochedlinger K, Thomson J, Wang W, Gao Y, Zhang K. Targeted bisulfite sequencing reveals changes in DNA methylation associated with nuclear reprogramming. Nat Biotechnol. 2009 April; 27(4):353-60. doi: 10.1038/nbt.1530. Epub 2009 Mar. 29.
[0203] 22. Diep D, Plongthongkum N, Gore A, Fung H L, Shoemaker R, Zhang K. Library-free methylation sequencing with bisulfite padlock probes. Nat Methods. 2012 Feb. 5; 9(3):270-2. doi: 10.1038/nmeth.1871.