Method and kit for detecting a wild-type and/or a mutated target DNA sequence
09938574 · 2018-04-10
Assignee
Inventors
Cpc classification
C12Q2525/161
CHEMISTRY; METALLURGY
C12Q2535/138
CHEMISTRY; METALLURGY
C12Q1/6806
CHEMISTRY; METALLURGY
International classification
Abstract
The present invention relates to a method for detecting a first and/or a second target DNA sequence from a DNA library, differing in that a mutation generates/eliminates a restriction site for a restriction endonuclease, comprising the steps of: (a) providing the DNA library, in which each of the DNA sequences comprises a first sequence segment, a second sequence segment of genomic DNA as cleaved by the restriction endonuclease, and a third sequence segment reverse complementary to the union of said first sequence segment and the 5′ overhang, if any, of the restriction endonuclease; (b) amplifying the library of DNA sequences by PCR using: a first reverse primer which hybridizes to the 3′ end region of the second sequence segment of the first or second target sequence positive strand; a second forward primer which hybridizes to the 3′ end region of the second sequence segment of the first target sequence antipositive strand; a third forward primer comprising a first portion hybridizing to the 5′ end region of the third sequence segment of the second target sequence antipositive strand and a second portion hybridizing to the 3′ end region of the second sequence segment of the second target sequence antipositive strand, wherein the first portion of the third forward primer has a length from 20% to 80% with respect to the total length of the third forward primer; (c) detecting DNA sequences amplified in step (b).
Claims
1. A method for detecting at least one of at least one first target DNA sequence and at least one second target DNA sequence from a library of DNA sequences, wherein the first target DNA sequence differs from the second target DNA sequence in that a single or multiple nucleotide substitution or deletion or insertion in the second target sequence generates a restriction site for a restriction endonuclease, giving rise, if cleaved by the restriction endonuclease, to a first cleaved second target sequence 3′ of the generated restriction site and a second cleaved second target sequence 5′ of the generated restriction site, comprising the steps of: (a) providing the library of DNA sequences, each of the DNA sequences comprising, respectively from the 5′ end to the 3′ end, a first sequence segment having a length from 15 to 50 nucleotides, a second sequence segment of genomic DNA as cleaved by the restriction endonuclease, and a third sequence segment reverse complementary to the union of the first sequence segment and, if any, the 5′ overhang generated by the restriction endonuclease; (b) amplifying the library of DNA sequences by PCR using: at least one first reverse primer which hybridises to the 3′ end region of the second sequence segment of the at least one first target sequence positive strand or at least first cleaved second target sequence positive strand; at least one second forward primer which hybridises to the 3′ end region of the second sequence segment of the at least one first target sequence antipositive strand; at least one third forward primer comprising a first 5′ portion hybridising to the 5′ end region of the third sequence segment of the at least first cleaved second target sequence antipositive strand and a second 3′ portion hybridising to the 3′ end region of the second sequence segment of the at least one first cleaved second target sequence antipositive strand, wherein the first portion of the at least one third forward primer has a length from 20% to 80% with respect to the total length of the at least one third forward primer; (c) detecting DNA sequences amplified in step (b).
2. The method according to claim 1, wherein step (b) further uses at least one fourth reverse primer which hybridises to the 3′ end region of the second sequence segment of the at least one second target sequence positive strand.
3. The method according to claim 1, wherein the library of DNA sequences is obtained by deterministic restriction site whole genome amplification.
4. The method according to claim 1, wherein step (c) is performed by a DNA sequencing method.
5. The method according to claim 4, wherein the DNA sequencing method is Sanger sequencing or sequencing by synthesis.
6. The method according to claim 1, wherein the first portion of the at least one third forward primer has a length from 40% to 60% with respect to the total length of the at least one third forward primer.
7. The method according to claim 1, wherein said second portion of the at least one third forward primer has a length in bases comprised between a minimum corresponding to the consensus sequence of said restriction endonuclease minus, if any, the 5′ overhang generated by the restriction endonuclease, all divided by two, and a maximum of 30 bases.
8. The method according to claim 1, wherein at least one of said primers further comprises a 5′ end region which does not hybridize to any of said first or second target sequence, positive or antipositive strand.
9. The method according to claim 1, wherein the restriction endonuclease is MseI.
10. A kit comprising a first, a second and a third primer according to claim 1.
11. The kit according to claim 10 for use in the diagnosis of ALK or EGFR or PIK3CA mutations.
Description
DETAILED DESCRIPTION OF THE INVENTION
(23) The method according to the present invention for detecting at least one of at least one first target DNA sequence and at least one second target DNA sequence from a library of DNA sequences comprises steps (a) to (c). The first target DNA sequence differs from the second target DNA sequence in that a single or multiple nucleotide substitution or deletion or insertion in the second sequence generates a restriction site for a restriction endonuclease. With reference to
(24) In step (a), the library of DNA sequences is provided. Each of the DNA sequences of the library comprises, respectively from the 5′ end to the 3′ end, a first sequence segment having a length from 15 to 50 nucleotides, a second sequence segment of genomic DNA as cleaved by the restriction endonuclease, and a third sequence segment reverse complementary to the union of the first sequence segment and, if any, the 5′ overhang generated by the restriction endonuclease (RE). With reference to
(25) The restriction endonuclease is preferably MseI.
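By way of non-limiting illustration, the fragment structure of step (a) may be sketched in Python as follows. The sketch assumes MseI as the restriction endonuclease (recognition site TTAA, cut as T^TAA, leaving a 5′-TA overhang) and uses the DRS-WGA primer sequence given as SEQ ID NO:1 in the Examples as the first sequence segment. The genomic insert and the function names are hypothetical and serve only to show how the third segment is derived from the first.

```python
# Sketch of the library fragment layout: first segment + genomic segment
# (as cleaved) + third segment = reverse complement of (first + 5' overhang).
# The genomic insert below is HYPOTHETICAL, chosen only for illustration.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def build_library_fragment(first: str, genomic_as_cleaved: str, overhang: str) -> str:
    """Assemble the positive strand of one library fragment (5' to 3')."""
    if not 15 <= len(first) <= 50:
        raise ValueError("first segment must be 15 to 50 nucleotides long")
    third = reverse_complement(first + overhang)
    return first + genomic_as_cleaved + third


WGA_PRIMER = "AGTGGGATTCCTGCTGTCAGT"  # SEQ ID NO:1 as quoted in the text
MSEI_OVERHANG = "TA"                  # 5' overhang left by MseI (T^TAA)
GENOMIC = "TAAGCCTGACGGATCT"          # hypothetical MseI fragment, top strand

fragment = build_library_fragment(WGA_PRIMER, GENOMIC, MSEI_OVERHANG)
print(fragment)
# The antipositive strand starts with the same primer-plus-overhang sequence:
print(reverse_complement(fragment)[: len(WGA_PRIMER) + len(MSEI_OVERHANG)])
```

Under these assumptions both strands begin with the first sequence segment, which is why a primer matching only that segment cannot by itself discriminate between fragments of the library.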
(26) In step (b), the library of DNA sequences is amplified by PCR using: at least one first reverse primer which hybridises to the 3′ end region of the second sequence segment of the at least one first or second target sequence positive strand; at least one second forward primer which hybridises to the 3′ end region of the second sequence segment of the at least one first target sequence antipositive strand; at least one third forward primer comprising a first portion hybridising to the 5′ end region of the third sequence segment of the at least second target sequence antipositive strand and a second portion hybridising to the 3′ end region of the second sequence segment of the at least one second target sequence antipositive strand, wherein the first portion of the at least one third forward primer has a length from 20% to 80% with respect to the total length of the at least one third forward primer.
(27) The third forward primer is hereinafter sometimes referred to in short as hybrid primer.
(28) Preferably, at least one fourth reverse primer which hybridises to the 3′ end region of the second sequence segment of the at least one second target sequence positive strand is used in step (b).
(29) Preferably, the first portion of the at least one third forward primer has a length from 40% to 60% with respect to the total length of the at least one third forward primer.
(30) Preferably, the second portion of the at least one third forward primer has a length in bases comprised between a minimum corresponding to the consensus sequence of the restriction endonuclease minus, if any, the 5′ overhang generated by the restriction endonuclease, all divided by two, and a maximum of 30 bases.
(31) With reference to
(32) The same principle applies in
(33) In step (c), the DNA sequences amplified in step (b) are detected. Step (c) may be performed by several detection methods known in the art, for example gel electrophoresis, capillary electrophoresis or DNA sequencing. Preferably, step (c) is performed by a DNA sequencing method. Even more preferably, the DNA sequencing method is Sanger sequencing or sequencing by synthesis.
(34) The method of the present invention may be used with any library of DNA sequences having the structure shown in
(35) According to the present invention there is also provided a kit comprising a first and/or a second and/or a third primer as defined above.
(36) More specifically, the kit for detecting at least one of at least one first target DNA sequence and at least one second target DNA sequence from a library of DNA sequences, wherein the first target DNA sequence differs from the second target DNA sequence in that a single or multiple nucleotide substitution or deletion or insertion in the second sequence generates a restriction site for a restriction endonuclease, and wherein each of the DNA sequences of the library comprises, respectively from the 5′ end to the 3′ end, a first sequence segment having a length from 15 to 50 nucleotides, a second sequence segment of genomic DNA as cleaved by the restriction endonuclease, and a third sequence segment reverse complementary to the union of the first sequence segment and, if any, the 5′ overhang generated by the restriction endonuclease, comprises: at least one first reverse primer which hybridises to the 3′ end region of the second sequence segment of the at least one first or second target sequence positive strand; at least one second forward primer which hybridises to the 3′ end region of the second sequence segment of the at least one first target sequence antipositive strand; at least one third forward primer comprising a first portion hybridising to the 5′ end region of the third sequence segment of the at least second target sequence antipositive strand and a second portion hybridising to the 3′ end region of the second sequence segment of the at least one second target sequence antipositive strand, wherein the first portion of the at least one third forward primer has a length from 20% to 80% with respect to the total length of the at least one third forward primer.
(37) The kit preferably further comprises at least one fourth reverse primer which hybridises to the 3′ end region of the second sequence segment of the at least one second target sequence positive strand.
(38) The kit may be used to detect any kind of mutation generating or eliminating a restriction site for the restriction endonuclease that defines the ends of the second sequence segment of the DNA fragments of the library. The kit is preferably used in the diagnosis of mutations in the ALK (anaplastic lymphoma kinase), EGFR (epidermal growth factor receptor) or PIK3CA (phosphatidylinositol 3-kinase catalytic alpha polypeptide) gene.
EXAMPLES
Example 1
Bivalent Primer Approach
(39) Preliminary tests were carried out on SY5Y cell lines (SH-SY5Y ATCC Catalog No. CRL-2266), which harbour a heterozygous C to A substitution at codon 1174 of the ALK gene, turning a Phenylalanine into a Leucine (F1174L); considering the flanking sequence, the heterozygous substitution introduces one new restriction site (RS) in the mutated allele, whereas the wild type allele does not have any RS.
(41) To detect mutations occurring on the RS, the following approach was tested. The universal primer of the whole genome amplification (DRS-WGA primer, SEQ ID NO:1 having sequence AGTGGGATTCCTGCTGTCAGT) was exploited to design a 5′ primer in a new PCR primer pair where the 3′ primer overlaps a region 3′ of the RS. The strategy consisted in designing a bivalent primer pair comprising: a 5′ primer having 95% homology with the DRS-WGA primer; and a 3′ PCR primer which should provide the specificity required for the PCR, to selectively amplify the target region instead of other DRS-WGA amplicons.
(42) This bivalent primer pair should in theory serve for the amplification of wild-type (WT) sequence and mutant (M) sequence.
(43) Experimental evidence shows that this approach performs poorly and cannot guarantee the detection of the mutation at the RS. As shown in
(44) One factor contributing to this poor result is that the 5′ bivalent primer, which corresponds by 95% to the ligated WGA primer, is present on all DNA fragments of the DRS-WGA library, and the 3′ bivalent primer does not provide the PCR reaction with sufficient specificity.
(45) As an example, the human genome reference (Homo sapiens hg19) comprises 3,095,693,981 bases. If the genome is digested with a restriction endonuclease with a four-base restriction site (e.g. TTAA), the mean length of the DNA fragments generated is 4 (the possible bases) to the power of 4 (the digestion sequence length considered) = 256. The generated DNA library would thus comprise approximately 3,095,693,981/256 ≈ 12.1 million different fragments, with a simplified assumption of a random sequence of the nucleotides in the DNA. All of them would comprise the same 5′ primer (corresponding to the WGA primer from the primary PCR).
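The arithmetic of the preceding paragraph may be reproduced with the following short Python sketch. It relies on the same simplifying assumption stated in the text (a random nucleotide sequence), and the function and constant names are illustrative only.

```python
# Rough estimate of the number of fragments produced by a four-base cutter,
# assuming (as in the text) a random sequence of nucleotides.

GENOME_LENGTH_BP = 3_095_693_981  # hg19 size quoted in the text
ALPHABET_SIZE = 4                 # A, C, G, T
SITE_LENGTH = 4                   # e.g. TTAA


def mean_fragment_length(alphabet_size: int, site_length: int) -> int:
    """Expected spacing between occurrences of one specific site."""
    return alphabet_size ** site_length


def expected_fragment_count(genome_length: int, mean_length: int) -> float:
    """Approximate number of fragments after complete digestion."""
    return genome_length / mean_length


mean_len = mean_fragment_length(ALPHABET_SIZE, SITE_LENGTH)
n_fragments = expected_fragment_count(GENOME_LENGTH_BP, mean_len)

print(mean_len)                                      # 256
print(f"{n_fragments / 1e6:.1f} million fragments")  # 12.1 million fragments
```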
(46) The use of the bivalent primer pair therefore gives unspecific bands.
Example 2
Hybrid Primer Homology Range Limit
(47) Amplification tests were carried out on the same SY5Y cell line as that used in Example 1, but using the method of the present invention.
(48) To test the amplification of both wild-type (WT) and mutated (M) alleles in DRS-WGA products, individual SY5Y cells were isolated with DEPArray, which provides pure single cells.
(49) The approach of using one 5′ PCR primer matching the WGA universal primer over 86% of its length did not provide a solution for the amplification of either the WT or the M allele. As shown in
(50) Primers having different percentages of homology with the WGA universal primer were tested. The results are summarised in the following Table 1.
(51) TABLE 1

Primer      Homology to the WGA-primer   Homology to original DNA   F32 [# of bases]   TEST
Universal   21/22 (95%)                  1/22 (5%)                  0                  KO
Mutant 1    19/22 (86%)                  3/22 (14%)                 1                  KO
Mutant 2    10/20 (50%)                  10/22 (50%)                8                  OK
Mutant 3    14/22 (64%)                  8/22 (36%)                 6                  OK
(52) In Table 1, column F32 reports the length, in number of bases, of the second portion of the third forward primer, i.e. the primer portion which has the same sequence as the original DNA, excluding the restriction endonuclease overhang.
(53) It is clear from the results of Table 1 that a balanced compromise needs to be identified to meet the method requirements. Several tests have shown that the ideal percentage of identity of the hybrid primer with the WGA universal primer is from 20% to 80%, with an even better efficiency in the range from 40% to 60%.
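The ranges stated above can be checked against the figures reported in Table 1 with a minimal Python sketch. The function names and the three-way classification labels are illustrative assumptions; the fractions and the OK/KO outcomes are those reported in Table 1.

```python
# Classify the percentage of identity of a primer with the WGA universal
# primer against the 20%-80% range and the preferred 40%-60% range.


def identity_percent(matching_bases: int, primer_length: int) -> float:
    """Percentage of the primer that is identical to the WGA primer."""
    if primer_length <= 0:
        raise ValueError("primer_length must be positive")
    return 100.0 * matching_bases / primer_length


def classify_identity(percent: float) -> str:
    """Return 'preferred', 'acceptable' or 'outside' for a given identity."""
    if 40.0 <= percent <= 60.0:
        return "preferred"
    if 20.0 <= percent <= 80.0:
        return "acceptable"
    return "outside"


# (matching bases, primer length, reported test outcome) from Table 1
TABLE_1 = {
    "Universal": (21, 22, "KO"),
    "Mutant 1": (19, 22, "KO"),
    "Mutant 2": (10, 20, "OK"),
    "Mutant 3": (14, 22, "OK"),
}

for name, (matching, length, outcome) in TABLE_1.items():
    pct = identity_percent(matching, length)
    print(f"{name}: {pct:.0f}% -> {classify_identity(pct)} (reported {outcome})")
```

With these inputs, the two primers reported as KO fall outside the 20% to 80% range, and the two reported as OK fall inside it.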
Example 3
Introduction of a New RS in the Mutant Allele (ALK Gene): Assay Design
(54) The method according to the invention guarantees amplification (and sequencing) even in the case of incomplete digestion by the restriction endonuclease. In fact, the activity of the restriction endonuclease is not guaranteed for all the RS in the target DNA, and statistically a small percentage of RS remain undigested in the DRS-WGA library; the corresponding fragments are nevertheless successfully amplified by DRS-WGA, albeit with the WGA primer (primary PCR) attached at another RS.
(55) In case of an undigested site, using just one primer pair designed for the mutant sequence in the mutation assay would not allow the amplification and the sequencing of the target.
(56) Amplification tests were again carried out on the SY5Y cell line, which, as previously disclosed, harbours a heterozygous C to A substitution at codon 1174, turning a Phenylalanine into a Leucine (F1174L). The heterozygous substitution thus introduces a new RS in the mutated allele, whereas the wild-type allele does not have any RS.
(57) The PCR primer sequences used for the amplification of WT and M alleles are shown in Table 2. For the mutant allele forward primer, the first portion of the primer sequence homologous to the WGA primer is shown in bold and underlined, while the second portion of the primer, which has the same sequence as the original DNA, excluding the restriction endonuclease overhang (F32 = 8 bases), is shown boxed.
(58) TABLE 2

Primer Name   SEQ ID        Sequence
ALK_WT_F      SEQ ID NO:2   5′ CCTCTCTGCTCTGCAGCAAAT 3′
ALK_WT_R      SEQ ID NO:3   5′ TCTCTCGGAGGAAGGACTTGAG 3′
ALK_M1_F      SEQ ID NO:4
(59) To test the amplification of both WT and M alleles in DRS-WGA products, individual SY5Y cells were isolated with DEPArray, which provides pure single cells.
(60) As a negative control for the mutation detection, individual lymphocytes were also isolated with DEPArray and amplified with DRS-WGA.
(61) The PCR amplification of the WT allele on both WT (lymphocytes) and heterozygous M (SY5Y) cells was achieved by the use of the specifically designed WT 5′ primer, which allows the exclusive amplification of the WT allele.
(62) As can be observed in
(63) The M-specific 5′ primer was tested on the same lymphocytes and SY5Y cells to assess the specificity of the amplification provided by the primer designed straddling the target sequence and the universal DRS-WGA primer.
(64) As may be seen in
(65) To demonstrate that the amplification achieved was specific and allowed sequencing, all the amplification products were sequenced from their 3′ end. The corresponding WT or M status was confirmed for all amplification products, showing the specificity achieved with the described method. An example of sequencing of a WT allele is shown in
(66) Results are summarised in Table 3.
(67) TABLE 3

Single cells   Replicate   Sequence obtained with M-specific 5′ primer   Sequence obtained with WT-specific 5′ primer
WBC            1           No PCR product                                WT
WBC            2           No PCR product                                WT
WBC            3           No PCR product                                WT
SY5Y           1           M                                             WT
SY5Y           2           M                                             WT
SY5Y           3           M                                             WT
(68) In a preferred embodiment, the second portion (F32) of the third forward primer (F3) is shorter than 30 nucleotides so as not to mis-prime on the first target sequence antipositive strand (FTSAS), i.e. the wild-type sequence in this example, thus starting a PCR reaction which may result in a false positive (as per its PCR product length and sequence). More preferably, the length of the second portion (F32) is shorter than 20 nucleotides. Even more preferably, the length of said second portion (F32) of said third forward primer (F3) is shorter than or equal to 10 nucleotides.
(69) The second portion (F32) of the third forward primer (F3) should not be so short that it fails to provide enough specificity (see for example the results in Table 1). In particular, the length of said second portion of the third forward primer should be greater than the restriction site consensus sequence length minus the length of the 5′ overhang of the digested DNA, all divided by two. In order to obtain a greater specificity, the second portion (F32) of the third forward primer (F3) should be at least 3 nucleotides, and even more preferably at least 6 nucleotides, longer than the restriction site consensus sequence length minus the length of the 5′ overhang of the digested DNA, all divided by two.
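The length bounds on the second portion (F32) described in the two preceding paragraphs may be expressed as the following Python sketch. It assumes MseI as the restriction endonuclease (four-base consensus TTAA, two-base 5′ overhang TA) and treats both bounds as strict, following the wording "greater than" and "shorter than 30 nucleotides". The function names are hypothetical.

```python
# Lower and upper bounds on the length of the second portion (F32) of the
# hybrid primer. The values for MseI are an assumption for illustration.


def min_second_portion_length(consensus_len: int, overhang_len: int) -> float:
    """(consensus length - 5' overhang length) / 2, as stated in the text."""
    return (consensus_len - overhang_len) / 2


def second_portion_length_ok(f32: int, consensus_len: int, overhang_len: int,
                             max_len: int = 30) -> bool:
    """True if F32 is above the minimum and shorter than max_len."""
    lower = min_second_portion_length(consensus_len, overhang_len)
    return lower < f32 < max_len


MSEI_CONSENSUS_LEN = 4  # TTAA
MSEI_OVERHANG_LEN = 2   # 5'-TA overhang

print(min_second_portion_length(MSEI_CONSENSUS_LEN, MSEI_OVERHANG_LEN))  # 1.0

# F32 values reported in Table 1
for f32 in (0, 1, 6, 8):
    ok = second_portion_length_ok(f32, MSEI_CONSENSUS_LEN, MSEI_OVERHANG_LEN)
    print(f32, ok)
```

Under these assumptions the bound evaluates to one base, so the F32 values of 0 and 1 in Table 1 do not exceed it, whereas the values of 6 and 8 do.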
Example 4
Introduction of a New RS in the Mutant Allele (ALK Gene): Assay Validation
(70) The method described above has been further validated with 54 single cells: 10 single live, fresh SY5Y; 19 single SY5Y, previously fixed with 2% paraformaldehyde (PFA) for 20 minutes at room temperature, and permeabilised with Inside Perm (Miltenyi Biotec); 19 single SY5Y, previously fixed with CytoChex, and permeabilised with Inside Perm; 2 single fresh, live lymphocytes; 2 single lymphocytes, previously fixed with 2% PFA for 20 minutes at room temperature, and permeabilised with Inside Perm (Miltenyi Biotec).
(71) The method amplified the WT allele in 100% of SY5Y cells and lymphocytes, and the mutant allele in 9/10=90% of live SY5Y cells, 16/19=84% of SY5Y cells fixed and permeabilised with CytoChex/Inside Perm, 17/19=89% of SY5Y cells fixed and permeabilised with 2% PFA (20 minutes at room temperature)/Inside Perm, and 0/4=0% of lymphocytes.
(72) Results are shown in
(73) TABLE 4

ALK F1174L    Treatment               PCR of WT Allele   PCR of M Allele
SY5Y          Live                    100%               90%
SY5Y          CytoChex, Inside Perm   100%               84%
SY5Y          PFA, Inside Perm        100%               89%
Lymphocytes   Live                    100%               0%
Lymphocytes   PFA, Inside Perm        100%               0%
(74) These results show the efficacy and robustness of the method of the present invention on larger numbers of samples.
Example 5
Removal of a RS in the Mutant Allele (EGFR Gene): Assay Design
(75) Amplification tests were carried out on the HCC-827 cell line, harbouring a deletion of 5 codons in the EGFR gene. The deletion removes a restriction site (RS), allowing the detection of the M allele, but not of the WT allele which has the RS, when using a single PCR and primer pairs on the human genome.
(76) Individual HCC-827 cells were isolated with DEPArray, along with lymphocytes as a control of the WT condition.
(77) Two different primer pairs targeted for the M allele (with the deleted RS) and WT allele (still maintaining the RS) were designed and led to the correct identification of both WT and M conditions.
(78) The PCR primer sequences used for the amplification of WT and M alleles are shown in Table 5. For the wild-type allele forward primer, the first portion of the primer sequence homologous to the WGA primer is shown in bold and underlined, while the second portion of the primer, which has the same sequence as the original DNA, excluding the restriction endonuclease overhang (F32 = 16 bases), is shown boxed.
(79) TABLE 5

Primer Name   SEQ ID        Sequence
Ex19_M_F      SEQ ID NO:6   5′ TAAAATTCCCGTCGCTATCAA 3′
Ex19_M_R      SEQ ID NO:7   5′ TGTGGAGATGAGCAGGGTCTAG 3′
Ex19_WT_F     SEQ ID NO:8
Example 6
Removal of a RS in the Mutant Allele (EGFR Gene): Assay Validation
(82) The method described above has been further validated with 60 single cells: 31 single HCC-827, treated according to the Veridex CellSearch enrichment protocol; 11 single lymphocytes, treated according to the Veridex CellSearch enrichment protocol; 17 single fresh, live lymphocytes.
(83) The method amplified the WT allele in 28/31=90% of the single HCC-827 and the M allele in 31/31=100% of the single HCC-827.
(84) Considering the 11 Veridex-treated lymphocytes, 11/11=100% resulted in a positive PCR product for the WT PCR, and 3/11=27% resulted in a positive PCR product for the M-PCR. These products were sequenced and confirmed to be WT. Hence, when the DNA is detected by sequencing, the specificity on Veridex-treated lymphocytes is still 100%, whereas, when relying on PCR positivity alone, the specificity is (in this test) 8/11=73%. Detecting the DNA product length by gel electrophoresis would similarly make it possible to distinguish the length and determine that the product is actually WT; detecting the DNA product by real-time PCR would not distinguish between WT and M products. Considering the 17 fresh lymphocytes, 17/17=100% resulted in a positive PCR product for the WT PCR, and 0/17=0% resulted in a positive PCR product for the M-PCR. These products were sequenced and confirmed to be WT.
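The two specificity figures quoted for the Veridex-treated lymphocytes can be reproduced with the following Python sketch. The counts are those reported in the preceding paragraph; the function and variable names are illustrative assumptions.

```python
# Specificity of the M-allele assay on wild-type cells under two readouts:
# PCR positivity alone versus sequencing of the PCR product.


def specificity(true_negatives: int, total_negatives: int) -> float:
    """Fraction of truly negative samples that are called negative."""
    if total_negatives <= 0:
        raise ValueError("total_negatives must be positive")
    return true_negatives / total_negatives


N_WT_CELLS = 11          # Veridex-treated lymphocytes, all wild-type
M_PCR_POSITIVE = 3       # cells giving a product in the M-PCR
SEQUENCED_AS_MUTANT = 0  # all three products were sequenced as WT

spec_pcr_only = specificity(N_WT_CELLS - M_PCR_POSITIVE, N_WT_CELLS)
spec_sequencing = specificity(N_WT_CELLS - SEQUENCED_AS_MUTANT, N_WT_CELLS)

print(f"PCR positivity only: {spec_pcr_only:.0%}")   # 73%
print(f"With sequencing:     {spec_sequencing:.0%}")  # 100%
```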
(85) As there are 2 WT alleles per lymphocyte, the difference in undigested RS between Veridex-treated (3/22=14%) and fresh lymphocytes (0/34=0%) is statistically significant.
(86) This demonstrates the robustness of the above described method in case of incomplete RE digestion activity.
(87) Results are shown in
(88) TABLE 6

EGFR Exon19 Del. E746_A750   Treatment   n    PCR of WT Allele   PCR of M Allele
HCC-827                      Veridex     31   90%                100%
WBC                          Veridex     11   100%               27% (*)
WBC                          Fresh       17   100%               0%

(*) All sequences WT
(89) The above examples show that the method according to the present invention guarantees the amplification (and the sequencing) even in case of incomplete digestion activity of the restriction endonuclease. The activity of the RE cannot always guarantee the effective digestion of all the RS present in the target DNA, because of the treatment which the cells have been subjected to (as in the previous example), or for other reasons linked to the specific sequence around the restriction site.
(90) Statistically, a small percentage of undigested RS is present in the DRS-WGA library; the corresponding fragments are nevertheless successfully whole-genome amplified, albeit with the universal (primary PCR) primer being connected to another RS.
(91) In case of an undigested site the use of just one PCR for the third target sequence (with the MDRS) would not allow the amplification and the sequencing of said target. In case of incomplete DNA digestion by the restriction enzyme, the method of the invention allows the detection of both WT and M allele when they are present in the DRS-WGA library.
Example 7
Introduction of a New RS in the Mutant Allele (PIK3CA Gene)
(93) As another example, the M1043I mutation in exon 21 of the PIK3CA gene, stemming from the single nucleotide change ATG/TAAT, can be detected by the method according to the present invention.
(94) From an analysis of the features of the method and kit of the present invention, the resulting advantages are apparent.
(95) In particular, by virtue of the particular design of the primers used to amplify the library of DNA sequences by PCR, the method allows differential detection of the first target DNA sequence and the second target DNA sequence (differing in the presence of a restriction site for the restriction endonuclease of the DRS-WGA) with great specificity and robustness.
(96) Further, the use of a fourth reverse primer allows an even more specific and robust detection and an amplicon-size based detection, which is fast, simple and cost-effective.
(97) Further, the method of the present invention may be applied downstream of deterministic restriction site whole genome amplification to detect mutations in a specific and robust manner. These mutations are impossible to otherwise detect with the traditional detection methods available.
(98) Moreover, the use of a DNA sequencing method, in particular Sanger sequencing or pyrosequencing, guarantees the correct identification even of the false positives which could occur in the case of incomplete digestion of the DNA library by the restriction endonuclease.
(99) Furthermore, a percentage of identity from 20% to 80%, better from 40% to 60%, of the third forward primer with the WGA primer yields an optimal result.
(100) Finally, it is clear that modifications and variants to the method and kit disclosed and shown may be made without thereby departing from the scope of protection of the appended claims.
(101) In particular, the method may be multiplexed by using further pairs of primers which do not interfere with the PCR amplification with the first, second, third and possibly fourth primer.
(102) Additionally, one or more of said primers may further include a 5′ end sequence which does not hybridize to any of said first or second target sequence positive or antipositive strand. This feature can advantageously be used for one or more of the following purposes: barcoding the PCR products with a sample tag; introducing in the PCR product an adaptor for next-generation sequencing; preventing spurious priming in multiplexed target PCR reactions.
(103) Furthermore, as the WGA products from the PCR reaction may display some background signal, it may be of advantage to use a different primer for sequencing. This adds an extra layer of specificity, improving the signal-to-noise and readability of the sequence plot.