Means and methods for non-invasive diagnosis of chromosomal aneuploidy

09784742 · 2017-10-10

Abstract

The invention relates to a prenatal diagnostic method for the determination of a fetal chromosomal aneuploidy in a biological sample obtained from a pregnant woman, which method comprises enrichment and quantification of selected cell-free deoxyribonucleic acid sequences showing consensus nucleosome binding regions.

Claims

1. A method for determining a fetal chromosomal aneuploidy in a biological sample of a pregnant female individual, wherein the biological sample includes nucleic acid molecules, the method comprising: (a) selecting for and isolating from a biological sample of a pregnant female individual one or more target sequences of DNA molecules present in the biological sample using sequence-specific selection of the target sequences, wherein said target sequences comprise DNA sequences including sequence regions located from 500 bp upstream to 1500 bp downstream of a transcription start site (TSS); (b) amplifying said selected target sequences; (c) sequencing said amplified selected target sequences, allotting each target sequence to a chromosome of the genome and identifying the unique allotted target sequences, wherein each identified unique allotted target sequence maps to a single chromosome; (d) determining a first amount for each of one or more first chromosomes identified on the basis of said unique allotted target sequences originating from said one or more first chromosomes; (e) determining a second amount for each of one or more second chromosomes identified on the basis of said unique allotted target sequences originating from said one or more second chromosomes; and (f) determining based on the said first and second amount a fetal chromosomal aneuploidy of one or more of said first chromosomes.

2. The method of claim 1, wherein step (f) further comprises: (i) determining a parameter from said first amount relative to said second amount; (ii) comparing the parameter to a corresponding cut off control value; and based on the comparison, determining whether or not there is a difference allowing for the prediction of a fetal chromosomal aneuploidy of one or more of said first chromosomes.

3. The method of claim 1, wherein said biological sample is a maternal blood sample.

4. The method of claim 3, wherein said blood sample is a sample of maternal blood plasma or maternal blood serum.

5. The method of claim 1, wherein said biological sample is a urine sample or a saliva sample.

6. The method of claim 1, wherein said one or more first chromosomes are selected from the group consisting of chromosome 21, chromosome 18, chromosome 13, chromosome X, and chromosome Y.

7. The method of claim 1, wherein step (a) comprises contacting the biological sample with a nucleic acid probe specific for each target sequence.

8. The method of claim 7, wherein the nucleic acid probe comprises RNA or DNA.

9. The method of claim 7, wherein the nucleic acid probe is biotinylated.

Description

FIGURES

(1) FIG. 1 shows a flowchart delineating the principal workflow of a method for performing prenatal diagnosis for the determination of a fetal chromosomal aneuploidy in a biological sample obtained from a pregnant woman according to the present invention.

(2) FIG. 2 (A) shows a plot of percentage representation of reads uniquely mapped without any mismatch to chromosome 13 in selected cell-free DNA samples according to an embodiment of the present invention; (B) shows a plot of percentage representation of reads uniquely mapped without any mismatch to chromosome 21 in selected cell-free DNA samples according to an embodiment of the present invention.

(3) FIG. 3 (A) shows a plot of percentage representation of sequence reads uniquely mapped to bait regions on chromosome 13 (150 bp upstream-150 bp downstream) without any mismatch in selected cell-free DNA samples according to an embodiment of the present invention; (B) shows a plot of percentage representation of sequence reads uniquely mapped to bait regions on chromosome 21 (150 bp upstream-150 bp downstream) without any mismatch in selected cell-free DNA samples according to an embodiment of the present invention.

(4) FIG. 4 (A) shows a plot of percentage representation of sequence reads uniquely mapped to TSS (500 bp upstream-1500 bp downstream) on chromosome 13 without any mismatch in selected cell-free DNA samples according to an embodiment of the present invention; (B) shows a plot of percentage representation of sequence reads uniquely mapped to TSS (500 bp upstream-1500 bp downstream) on chromosome 21 without any mismatch in selected cell-free DNA samples according to an embodiment of the present invention.

(5) FIG. 5 shows a plot of percentage representation of chromosome 13 sequence reads in selected cell-free DNA samples according to earlier results using shotgun sequencing strategy without enrichment process.

(6) In order that the invention described herein may be more fully understood, the following example is set forth. It is for illustrative purposes only and shall not be construed as limiting this invention in any respect.

(7) It is further understood that the present invention shall also comprise variations of the expressly disclosed embodiments to an extent as would be contemplated by a person of ordinary skill in the art.

EXAMPLES

Example 1

Performance of Prenatal Diagnosis for Detection of Fetal Chromosomal Disorders by Target Enrichment Technology

(8) Firstly, up to 15 ml of peripheral blood are taken from a pregnant woman and collected in tubes containing EDTA. Cell-free plasma is obtained by centrifugation of the blood sample. The cell-free plasma DNA is extracted from the plasma by using the QIAamp DSP DNA Blood Mini Kit (Qiagen) or QIAamp DNA Micro Kit (Qiagen).

(9) After DNA extraction a step of target enrichment is carried out. More specifically, one or more specific DNA sequences comprising consensus nucleosome binding regions around transcriptional start sites of protein-coding genes of the chromosome(s) of interest are selectively enriched by solution-based hybrid selection technique. For this specific enrichment the SureSelect Target Enrichment System (Agilent Technologies) may be used according to the user's manual. For initiating sample preparation up to 10 ng of cell-free plasma DNA are used for prepped library production specific to the sequencing platform utilized downstream such as the Illumina sequencing instrument. The subsequent library preparation is carried out according to the corresponding manufacturer's protocol with the modification that no fragmentation by nebulization or sonication is done on the cell-free plasma DNA sample.

(10) In parallel with library production a specific SureSelect kit containing a mixture of designed SureSelect RNA oligonucleotides is created on Agilent's web-based design tool. Table 1 shows an example of a customized kit design suitable for the detection of chromosomal aneuploidies and sex-linked genetic diseases caused by a mutation on the X chromosome. In detail, for each chromosome of interest including reference chromosome(s) hybridization probes for the regions around the transcriptional start sites (approximately 1.5 kb) of all known protein-coding genes on the respective chromosome are generated resulting in an amount of enriched DNA of about 4 Mb.

(11) TABLE 1 Example of a customized kit design for detection of chromosomal aneuploidies and X-linked genetic diseases

Chromosome | Protein-coding genes (corresponding to number of hybridisation probes) (Ensembl release 55 - July 2009; www.ensembl.org/index.html) | Approximately enriched kb-length per 1.5 kb around transcription start site of protein-coding genes | Syndrome or chromosomal disorder
13 | 359 | 520 | Patau syndrome (trisomy 13)
16 | 1038 | 1250 | trisomy 16
18 | 315 | 470 | Edwards syndrome (trisomy 18)
21 | 265 | 390 | Down syndrome (trisomy 21)
X | 883 | 1250 | Turner syndrome, Triple X syndrome, X-linked genetic diseases
Y | 86 | 120 | XYY syndrome, sex-linked genetic diseases
Total | | 4 Mb |
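As a quick consistency check on Table 1, the per-chromosome enriched lengths can be summed and compared with the stated total of about 4 Mb. The sketch below uses only figures from the table; the variable names are illustrative.

```python
# Approximate enriched length (kb) per chromosome, copied from Table 1.
enriched_kb = {"13": 520, "16": 1250, "18": 470, "21": 390, "X": 1250, "Y": 120}

total_kb = sum(enriched_kb.values())
total_mb = total_kb / 1000  # 1 Mb = 1000 kb

print(f"total enriched target: {total_kb} kb = {total_mb:g} Mb")
```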

(12) To perform the DNA capture, the size-selected library is incubated with the designed SureSelect RNA oligonucleotides, and the resulting RNA-DNA hybrids are incubated with streptavidin-labeled magnetic beads so that the hybrids are captured by binding to the beads. After the loaded beads have been collected on a magnet, they are washed and the RNA oligonucleotides are digested, so that only the enriched DNA of interest is harvested. After final DNA amplification and assessment of the quality of the PCR products using, for example, the Agilent 2100 Bioanalyzer (Agilent Technologies), the enriched pool of target DNA sequences is subjected to massively parallel sequencing using, for example, the 454 platform (Roche) (Margulies, M et al. 2005 Nature 437, 376-380), the Illumina Genome Analyzer or the SOLiD System (Applied Biosystems), which allows many nucleic acid molecules isolated from one human plasma DNA sample to be sequenced in parallel.

(13) The subsequent bioinformatics procedure is then used to locate each of these DNA sequences on the human genome. More specifically, the short reads are collected from the sequencing instrument and aligned to the human reference genome (hg18, NCBI build 36; GenBank accession numbers: NC_000001 to NC_000024) using bioinformatic tools such as ELAND (Efficient Large-Scale Alignment of Nucleotide Databases). To ensure high-quality results, it is preferred that only those reads are considered for further analysis which are located in pre-selected genomic regions comprising consensus nucleosome binding regions of the chromosome(s) of interest and which are uniquely mapped to the human genome with only one or two mismatches against the human genome.
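The read filter described above can be sketched as follows. This is a minimal illustration, not the ELAND output format: the `AlignedRead` record, its fields, the `regions` mapping and the coordinates are all hypothetical, and "only one or two mismatches" is read here as "at most two mismatches".

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AlignedRead:
    chrom: str        # chromosome the read was aligned to
    pos: int          # 1-based leftmost alignment position
    n_hits: int       # number of genomic locations the read aligned to
    mismatches: int   # mismatches against the reference at the best hit

def keep_read(read, regions, max_mismatches=2):
    """Return True if the read is uniquely mapped, within the mismatch
    limit, and starts inside one of the pre-selected regions."""
    if read.n_hits != 1 or read.mismatches > max_mismatches:
        return False
    return any(start <= read.pos <= end
               for start, end in regions.get(read.chrom, []))

# Hypothetical pre-selected regions (chromosome -> inclusive intervals).
regions = {"chr21": [(1000, 3000)], "chr13": [(5000, 7000)]}

reads = [
    AlignedRead("chr21", 1500, 1, 0),   # kept
    AlignedRead("chr21", 1500, 3, 0),   # multi-mapped: dropped
    AlignedRead("chr13", 5500, 1, 3),   # too many mismatches: dropped
    AlignedRead("chr13", 9000, 1, 1),   # outside the regions: dropped
]
kept = [r for r in reads if keep_read(r, regions)]
print(len(kept), "of", len(reads), "reads retained")
```

Setting `max_mismatches=0` gives the stricter criterion used in Example 2.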

(14) The resulting digital readout of nucleic acid molecules is then used for the detection of fetal chromosomal aneuploidies, e.g. trisomy 13, 18 or 21, and can likewise be used for the determination of the gender of the fetus.

(15) An imbalance such as a chromosomal aneuploidy in a given experimental sample is revealed by differences in the number or percentage of sequences aligned to a given chromosomal region of interest as compared to the corresponding number or percentage of such sequences expected or pre-determined for a euploid human genome sample.
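The comparison against a euploid expectation can be sketched in the terms of claim 2: a parameter is formed from the first amount relative to the second amount and compared with a cut-off control value. The function names, the read counts and the cut-off below are hypothetical and serve only to illustrate the decision step; a real cut-off would have to be derived from euploid control samples.

```python
def chromosome_parameter(first_amount, second_amount):
    """Amount for the chromosome of interest relative to the reference amount."""
    if second_amount <= 0:
        raise ValueError("second_amount must be positive")
    return first_amount / second_amount

def exceeds_cutoff(parameter, cutoff):
    """True when the parameter lies above the cut-off control value."""
    return parameter > cutoff

# Hypothetical read counts and cut-off, for illustration only.
param = chromosome_parameter(first_amount=1300, second_amount=100000)
flag = exceeds_cutoff(param, cutoff=0.0125)
print(f"parameter = {param:.4f}, above cut-off: {flag}")
```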

(16) The present method is suitable for the detection of one or more chromosomal aneuploidies in one run, wherein the affected chromosome is typically selected from the group consisting of chromosome 21 (trisomy 21), chromosome 18 (trisomy 18), chromosome 13 (trisomy 13), and chromosome X (Turner syndrome).

(17) The selective target enrichment sequencing method according to the present invention may also be applicable to other diagnostic applications involving qualitative and/or quantitative evaluation of serum or plasma nucleic acid contents, e.g., in oncology and transplantation medicine. For example, the afore described selective target enrichment sequencing technique on cell-free DNA may also be used to detect tumor-specific chromosomal alterations associated with specific cancer.

(18) The principle of the invention is further described in the independent claim hereinafter, the various embodiments of the invention being the subject matter of the dependent claims.

Example 2

Performance of Prenatal Diagnosis for Detection of Trisomy 21 and Trisomy 13 by Target Enrichment Technology and Multiplexed Barcode Sequencing

(19) For the study, maternal blood samples were selected from 4 singleton pregnancies. One of the pregnant women was carrying a euploid male fetus and the other three were carrying a fetus with trisomy 21, trisomy 13 and trisomy 18, respectively (Table 2, below). Up to 15 ml of peripheral venous blood were taken from these pregnant women and collected in EDTA tubes. The plasma was obtained from the blood samples by centrifugation at 1600 g at 4° C. for 10 min. To remove residual cells the plasma was additionally centrifuged at 16000 g at 4° C. for 10 min. The cell-free DNA was extracted from 0.8-1 ml of plasma by using the QIAamp Circulating Nucleic Acid Kit (Qiagen) according to the manufacturer's protocol.

(20) Up to 10 ng of cell-free plasma DNA was then used to construct Illumina sequencing libraries. The DNA library preparation followed the Illumina standard sample preparation protocol for paired-end sequencing with a few modifications. Briefly, no fragmentation by nebulization or sonication was done on the cell-free plasma DNA samples. The library preparation was carried out according to the beta Chromatin Immunoprecipitation Sequencing (ChIP-Seq) sample preparation protocol (Illumina; Part #11257047 Rev. A) using enzymes from Fermentas (T4 DNA Polymerase, Klenow DNA Polymerase, T4 polynucleotide kinase, DNA Ligase) as well as from Finnzymes Oy (Phusion* Polymerase). The products were end-repaired and 3′ non-template A's were added. To enable multiplexed barcode sequencing, the DNA libraries were “tagged” with different identifiers (barcodes) during paired-end adaptor ligation. The first adaptor contained the sequencing primer sites for application read 1 and a 4 bp identifier as well as a ‘T’, which is necessary for adaptor ligation. The second adaptor contained the sequencing primer sites for application read 2 and likewise a ‘T’. The DNA libraries with barcoded samples were then additionally amplified using a 12-cycle PCR and primers containing the attachment sites for the flow cells. The adaptor-ligated DNA fragments were size-selected in the range of 150-300 bp using 2% agarose gel electrophoresis. Quality control and quantification of the libraries were done using a High Sensitivity DNA kit on the Agilent 2100 Bioanalyzer according to the manufacturer's instructions.

(21) To target all human exons and their associated human genomic regions corresponding to the transcription start sites (TSSs), 500 ng of libraries were incubated with the SureSelect Human All Exon Kit (Agilent Technologies) and enriched according to the manufacturer's protocol. After elution of the captured DNA fragments, the libraries were reamplified for 12-14 cycles of PCR with SureSelect Illumina-specific primers. Amplification enables accurate quantification using the Bioanalyzer High Sensitivity chip before sequencing.

(22) The four different barcoded samples were then pooled into a single tube and clonal clusters were generated using the cBOT clonal amplification system with the cBOT Paired-End Cluster Generation Kit. Following Illumina's sequencing workflow, the amplified single-molecule DNA templates were sequenced using massively parallel sequencing by synthesis on the Illumina Genome Analyzer IIx.

(23) The subsequent bioinformatics procedure included image analysis, base calling and alignment by using Illumina's pipeline software. For individual downstream analysis a semi-automated tag sorting strategy identified each uniquely barcoded sample. The first 32-bp of each read of each sample were aligned to the repeat-masked human genomic reference sequence NCBI build 36 (also known as hg18; GenBank accession numbers: NC_000001 to NC_000024) downloaded from UCSC Genome Browser using ELAND alignment software (GAPipeline-1.4.0 software) provided by Illumina.
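The tag sorting step can be sketched as below. This is an illustration under stated assumptions, not the pipeline actually used: the barcode sequences are hypothetical (the text does not list them), the 4 bp identifier and the ligation ‘T’ are assumed to be the first five bases of read 1, and the 32 bp passed to alignment are assumed to be taken after trimming that tag.

```python
from collections import defaultdict

# Hypothetical 4 bp identifiers; the actual barcodes are not given in the text.
BARCODES = {"ACGT": "S_T13", "CATG": "S_T21", "GTCA": "S_euploid"}
TAG_LEN = 4 + 1    # 4 bp identifier plus the ligation 'T' (assumed to lead read 1)
ALIGN_LEN = 32     # number of bases passed on to alignment

def sort_by_tag(reads):
    """Group reads by barcode; return sample -> list of trimmed 32 bp sequences."""
    by_sample = defaultdict(list)
    for seq in reads:
        sample = BARCODES.get(seq[:4])
        if sample is None or seq[4:5] != "T":
            continue                      # unknown tag or missing 'T': discard
        insert = seq[TAG_LEN:TAG_LEN + ALIGN_LEN]
        if len(insert) == ALIGN_LEN:      # skip reads too short to align
            by_sample[sample].append(insert)
    return dict(by_sample)

reads = [
    "ACGT" + "T" + "A" * 40,   # assigned to S_T13
    "CATG" + "T" + "C" * 32,   # assigned to S_T21
    "TTTT" + "T" + "G" * 40,   # unknown barcode
    "GTCA" + "A" + "G" * 40,   # missing 'T' after the barcode
]
sorted_reads = sort_by_tag(reads)
for sample, seqs in sorted_reads.items():
    print(sample, len(seqs))
```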

(24) The resulting digital readouts of nucleic acid molecules were then used for the detection of fetal chromosomal aneuploidies, e.g. trisomy 13 or 21, and can likewise be used for the determination of the gender of the fetus.

(25) Initially, the total number of sequenced reads for each sample was counted. Subsequently, only sorted reads that had mapped uniquely to one location in the repeat-masked human genomic reference sequence without any nucleotide mismatch were used for further analysis (see Table 2, below).

(26) In the first place, an imbalance such as trisomy 21 and trisomy 13 in the given experimental samples was revealed by differences in the number or percentage of repeat-masked uniquely mapped reads without any mismatch of interest (originating from chromosomes 13 and 21, respectively) as compared to the corresponding number or percentage of such sequences determined for the euploid human genome sample. The expected percentage of representation of each chromosome was obtained by dividing the number of repeat-masked uniquely mapped reads without any mismatch per chromosome by the total number of repeat-masked uniquely mapped reads without any mismatch of all chromosomes. As shown in FIG. 2A, the percentage of reads uniquely mapped to chromosome 13 from sample S_T13 was higher than that from sample S_euploid with a euploid fetus as well as that from sample S_T21 carrying a fetus with trisomy 21. The percentage of reads uniquely mapped to chromosome 21 from sample S_T21 was also higher than that from sample S_euploid with a euploid fetus as well as that from sample S_T13 carrying a fetus with trisomy 13 (FIG. 2B).
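The percentage representation described above is a simple ratio, sketched here with the read 2 counts from Table 2 (ii). Only the read 2 columns are used to keep the example short; the dictionary layout and function name are illustrative.

```python
# Read 2 counts from Table 2 (ii): total unique zero-mismatch reads and the
# subsets assigned to chromosomes 13 and 21.
counts = {
    "S_T13":     {"total": 2745340, "chr13": 83057,  "chr21": 31010},
    "S_T21":     {"total": 3761195, "chr13": 105110, "chr21": 44115},
    "S_euploid": {"total": 1895396, "chr13": 54658,  "chr21": 21186},
}

def percent_representation(sample, chrom):
    """Reads on one chromosome as a percentage of all retained reads."""
    c = counts[sample]
    return 100.0 * c[chrom] / c["total"]

for sample in counts:
    p13 = percent_representation(sample, "chr13")
    p21 = percent_representation(sample, "chr21")
    print(f"{sample:10s} chr13 {p13:.3f}%  chr21 {p21:.3f}%")
```

With these counts the chromosome 13 share is largest in S_T13 and the chromosome 21 share is largest in S_T21, in line with the description of FIGS. 2A and 2B.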

(27) In the second place, an imbalance such as trisomy 21 and trisomy 13 in the given experimental samples was revealed by differences in the number or percentage of repeat-masked uniquely mapped reads without any mismatch aligned to a given chromosomal region of interest compared to the corresponding number or percentage of such sequences determined for the euploid human genome sample. The chromosomal region of interest was defined by the predetermined 120 bp bait regions (available on the eArray platform by Agilent Technologies) of the SureSelect Human All Exon Kit (Agilent Technologies) plus flanking 150 bp regions located upstream and downstream of the bait regions. The expected percentage of representation of each chromosome was then obtained as described before. An overrepresentation of uniquely mapped reads was observed for chromosome 13 and chromosome 21 in the T13 and T21 cases, respectively (FIGS. 3A and 3B).
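The region of interest built from baits plus flanks can be sketched as an interval operation. The bait coordinates below are hypothetical, and merging overlapping flanked regions is an assumption of this sketch (it avoids counting a read twice); the text does not state how overlaps were handled.

```python
FLANK = 150  # bp added on each side of a bait, as described above

def flanked_regions(baits, flank=FLANK):
    """Extend each (start, end) bait by `flank` bp and merge overlaps.

    Coordinates are 1-based and inclusive; starts are clamped at 1.
    """
    extended = sorted((max(1, s - flank), e + flank) for s, e in baits)
    merged = []
    for start, end in extended:
        if merged and start <= merged[-1][1] + 1:
            # Overlaps or touches the previous region: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged

# Hypothetical 120 bp baits on one chromosome.
baits = [(1001, 1120), (1201, 1320), (5001, 5120)]
regions = flanked_regions(baits)
print(regions)
```

An isolated 120 bp bait thus yields a 420 bp region (120 bp plus 150 bp on each side).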

(28) In the third place, an imbalance such as trisomy 21 and trisomy 13 in the given experimental samples was revealed by differences in the number or percentage of repeat-masked uniquely mapped reads without any mismatch aligned to a given consensus nucleosome binding region as compared to the corresponding number or percentage of such sequences expected or pre-determined for the euploid human genome sample. The consensus nucleosome binding region as used herein included sequence regions from 500 bp upstream of the TSS to 1500 bp downstream of the TSS. The percentage of reads uniquely mapped to chromosome 13 from sample S_T13 was higher than that from sample S_euploid with a euploid fetus as well as that from sample S_T21 (FIG. 4A). The percentage of reads uniquely mapped to chromosome 21 from sample S_T21 was higher only than that from sample S_T13 and not higher than that from sample S_euploid with a euploid fetus (FIG. 4B).
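The TSS window used as the consensus nucleosome binding region can be sketched as a coordinate test. The strand handling is an assumption of this sketch (for a gene on the minus strand, "upstream" lies at higher coordinates); the text does not say how strand was treated, and the TSS coordinate in the example is hypothetical.

```python
UPSTREAM = 500     # bp upstream of the TSS
DOWNSTREAM = 1500  # bp downstream of the TSS

def tss_window(tss, strand):
    """Return the inclusive (start, end) window around a TSS.

    For '-' strand genes, upstream lies at higher coordinates, so the
    window is mirrored. Strand handling is an assumption of this sketch.
    """
    if strand == "+":
        return max(1, tss - UPSTREAM), tss + DOWNSTREAM
    if strand == "-":
        return max(1, tss - DOWNSTREAM), tss + UPSTREAM
    raise ValueError(f"unknown strand: {strand!r}")

def in_tss_window(pos, tss, strand):
    """True if a read position falls inside the window of the given TSS."""
    start, end = tss_window(tss, strand)
    return start <= pos <= end

# Hypothetical TSS at position 10000.
print(tss_window(10000, "+"), tss_window(10000, "-"))
```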

(29) The present method was suitable for the detection of one or more chromosomal aneuploidies in one run, wherein the affected chromosome is typically selected from the group consisting of chromosome 21 (trisomy 21), chromosome 18 (trisomy 18), chromosome 13 (trisomy 13), and chromosome X (Turner syndrome).

(30) In contrast to previous experiments using shotgun sequencing alone (FIG. 5 and Table 3, below), this method was able to detect even a trisomy 13.
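The contrast with shotgun sequencing can be made concrete by comparing the chromosome 13 read fraction of the trisomy 13 sample with that of the euploid sample under both strategies. The sketch below uses the counts from Table 3 (shotgun) and from Table 2 (iv), read 1 (TSS regions after enrichment); the function name is illustrative, and with only one sample per karyotype the figures are indicative rather than statistically conclusive.

```python
def chr13_ratio(t13, euploid):
    """Ratio of chromosome 13 read fractions, trisomy 13 sample over euploid.

    Each argument is a (chr13_reads, total_reads) tuple.
    """
    frac_t13 = t13[0] / t13[1]
    frac_eu = euploid[0] / euploid[1]
    return frac_t13 / frac_eu

# Shotgun sequencing, Table 3.
shotgun = chr13_ratio(t13=(215666, 5992776), euploid=(285433, 7928946))

# Target enrichment, Table 2 (iv), read 1, TSS window (-500 bp to +1500 bp).
enriched = chr13_ratio(t13=(4254, 202723), euploid=(2665, 135892))

print(f"shotgun  S_T13/S_euploid chr13 ratio: {shotgun:.4f}")
print(f"enriched S_T13/S_euploid chr13 ratio: {enriched:.4f}")
```

With these counts the shotgun ratio is essentially 1, whereas the enriched ratio is clearly above 1, which is consistent with the statement that trisomy 13 became detectable with target enrichment.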

(31) Furthermore, the described method tends to reduce the storage capacity required for raw data as well as for mapping and alignment of the generated sequence reads.

(32) TABLE 2

(i) Summary of clinical data and number of sequence reads of Example 2 “Performance of prenatal diagnosis for detection of Trisomy 21 and Trisomy 13 by target enrichment technology and multiplexed barcode sequencing”

Sample | Karyotype | Gestational Age (weeks + days) | Total no. of sequence reads_read1 | Total no. of sequence reads_read2
S_T13 | 47XY + 13 | 13 + 5 | 5005094 | 5005094
S_T21 | 47XY + 21 | 13 + 0 | 6739231 | 6739231
S_euploid | 46XY | 16 + 0 | 3415786 | 3415786

Column headings for parts (ii) to (iv):

Sample | Total no. of uniquely mapped reads without any mismatch_read1 | Total no. of uniquely mapped reads without any mismatch_read2 | No. of uniquely mapped reads without any mismatch of chromosome 13_read1 | No. of uniquely mapped reads without any mismatch of chromosome 13_read2 | No. of uniquely mapped reads without any mismatch of chromosome 21_read1 | No. of uniquely mapped reads without any mismatch of chromosome 21_read2

(ii) Summary of number of sequence reads of Example 2

S_T13 | 2856635 | 2745340 | 8643 | 83057 | 32290 | 31010
S_T21 | 3913173 | 3761195 | 109842 | 105110 | 45647 | 44115
S_euploid | 1959738 | 1895396 | 56616 | 54658 | 2191 | 21186

(iii) Summary of number of sequence reads of a chromosomal region of interest (characterized by the predetermined 120 bp bait regions of the SureSelect Human All Exon Kit plus flanking 150 bp regions located upstream and downstream of the bait regions) of Example 2

S_T13 | 1599317 | 1530800 | 42125 | 40162 | 17060 | 16256
S_T21 | 2215860 | 2120634 | 54244 | 51598 | 24433 | 23345
S_euploid | 1091265 | 1051800 | 27601 | 26577 | 11679 | 11303

(iv) Summary of number of sequence reads aligned to a given consensus nucleosome binding region including sequence regions from 500 bp upstream of the TSS to 1500 bp downstream of the TSS (Example 2)

S_T13 | 202723 | 196213 | 4254 | 4087 | 2755 | 2652
S_T21 | 282640 | 273834 | 5520 | 5408 | 3951 | 3814
S_euploid | 135892 | 131693 | 2665 | 2567 | 1880 | 1857

(33) TABLE 3 Summary of number of sequence reads of previous experiments using shotgun sequencing method

Sample | Total no. of sequence reads | Total no. of uniquely mapped reads without any mismatch | No. of uniquely mapped reads without any mismatch of chromosome 13 | No. of uniquely mapped reads without any mismatch of chromosome 21
S_T13 | 16611762 | 5992776 | 215666 | 73885
S_T21 | 26137898 | 10752119 | 402319 | 141130
S_euploid | 20289419 | 7928946 | 285433 | 99306