DETECTION AND SEQUENCING OF FRAGMENTED DNA
20230026775 · 2023-01-26
Cpc classification
C12Q2537/143
CHEMISTRY; METALLURGY
C12N15/1093
CHEMISTRY; METALLURGY
C12N15/1068
CHEMISTRY; METALLURGY
Abstract
The present invention provides modified single primer extension-based methods for generating an amplified library of fragments of a target gene or genome of interest from a sample of fragmented DNA, wherein the library is suitable for use in detecting, quantifying and/or sequencing the target gene or genome of interest. The present invention also provides compositions for use in such methods. In some embodiments the present invention provides methods and compositions specifically for detecting, quantifying and/or sequencing circulating tumor derived HPV DNA.
Claims
1. A method of generating a library of fragments of a target gene or genome of interest from a sample of fragmented DNA, wherein the library is suitable for use in detecting, quantifying and/or sequencing the target gene or genome of interest, the method comprising: (a) Contacting a sample of fragmented DNA with a pool of target-specific forward primers complementary to multiple different primer binding sites located within, and spanning the length of, a target gene or genome of interest, wherein each target-specific forward primer comprises: (i) a sequence that is complementary to a primer binding site within the target gene or genome of interest and (ii) a first next generation sequencing (NGS) based adapter located 5′ to the sequence that is complementary to the primer binding site, (b) Performing a single primer extension reaction to generate first-generation copies of the target gene or genome of interest, (c) Adding a common sequence to the 3′ end of the first-generation copies of the target gene or genome of interest, thereby generating 3′ tagged first generation copies of the fragmented target gene or genome of interest, (d) Performing a first PCR reaction using: a common reverse primer comprising: (i) a sequence that is complementary to the common sequence and (ii) a second next generation sequencing (NGS) based adapter located 5′ to the sequence that is complementary to the common sequence, and (e) Performing a second PCR reaction using: (i) a forward primer complementary to the NGS-based adapter present in the target-specific forward primer, and (ii) a reverse primer complementary to the NGS-based adapter present in the common reverse primer, thereby generating a library of fragments of a target gene or genome of interest from a sample of fragmented DNA, wherein the library is suitable for use in detecting, quantifying and/or sequencing the target gene or genome of interest.
2. The method of claim 1, wherein the single primer extension reaction of step (b) is performed in the presence of biotinylated nucleotides such that the first-generation copies of the target gene or genome of interest generated by the single primer extension reaction are biotinylated, and wherein, prior to step (d), a biotin-based selection step is performed to select for only biotinylated nucleic acid molecules.
3. The method of claim 1 or claim 2, further comprising performing next generation sequencing of the library of fragments of the target gene or genome of interest.
4. The method of claim 1 or claim 2, further comprising performing quantitative PCR of the library of fragments of the target gene or genome of interest.
5. The method of any of the preceding claims, wherein the sample of fragmented DNA is circulating cell free DNA (cfDNA).
6. The method of any of the preceding claims, wherein the sample of fragmented DNA is circulating tumor DNA (ctDNA).
7. The method of any of the preceding claims, wherein the pool of target-specific forward primers comprises primers complementary to approximately 1,000, or 2,000, or 3,000, or 4,000, or 5,000 different primer binding sites within the target gene or genome of interest.
8. The method of any of the preceding claims, wherein the different primer binding sites within the target gene or genome of interest to which the pool of target-specific forward primers is complementary are spaced approximately 25-200 nucleotides apart.
9. The method of any of the preceding claims, wherein in step (b) 1-99 cycles of the single primer extension reaction are performed.
10. The method of any of the preceding claims, wherein in step (d) 1-99 cycles of the first PCR reaction are performed.
11. The method of any of the preceding claims, wherein in step (e) 1-99 cycles of the second PCR reaction are performed.
12. The method of any of the preceding claims, wherein the NGS based adapter is an Illumina adapter.
13. The method of any of the preceding claims, wherein the common sequence is a polyC, polyG, polyA or polyT sequence.
14. The method of claim 2, wherein the biotinylated nucleotides are biotin-dCTP, biotin-dGTP, biotin-dATP, biotin-dTTP, or a combination thereof.
15. The method of any of the preceding claims, wherein the sample of fragmented DNA is HPV circulating tumor DNA (ctDNA) and wherein the target gene or genome of interest is an HPV gene or genome.
16. The method of claim 15, wherein the HPV ctDNA is from an HPV-associated squamous cell carcinoma of the head and neck, oropharynx, cervix, vulva, vagina, anal canal or penis.
17. The method of any of claims 15-16, wherein the HPV ctDNA is from HPV type 16, 18, 31, 33, 35, 39, 45, 51, 52, 56, 58 or 59.
18. The method of any of claims 15-17, wherein the pool of target-specific forward primers comprises primers that bind to a target sequence in the HPV genome that: (i) is within the E6 and/or E7 region of the HPV genome, and/or (ii) is 100% conserved between European and non-European HPV isolates.
19. The method of any of claims 15-18, wherein the pool of HPV-specific forward primers comprises one or more of SEQ ID NO. 1 through SEQ ID NO. 77.
20. The method of any of claims 15-18, wherein the pool of HPV-specific forward primers comprises SEQ ID NO. 1 through SEQ ID NO. 77.
21. A composition comprising a pool of target-specific forward primers suitable for use in a single primer extension reaction, wherein the pool comprises primers complementary to multiple different primer binding sites located within, and spanning the length of, a target gene or genome of interest, wherein each target-specific forward primer comprises: (i) a sequence that is complementary to a primer binding site within the target gene or genome of interest and (ii) a first next generation sequencing (NGS) based adapter located 5′ to the sequence that is complementary to the primer binding site.
22. The composition of claim 21, wherein the target gene or genome of interest is present in circulating cell free DNA (cfDNA).
23. The composition of claim 21, wherein the target gene or genome of interest is in circulating tumor DNA (ctDNA).
24. The composition of claim 21, wherein the pool of target-specific forward primers comprises primers complementary to approximately 75 different primer binding sites within the target gene or genome of interest.
25. The composition of claim 21, wherein the different primer binding sites within the target gene or genome of interest to which the pool of target-specific forward primers is complementary are spaced approximately 100 nucleotides apart.
26. The composition of claim 21, wherein the NGS based adapter is an Illumina adapter.
27. The composition of any of claims 21-26, wherein the target gene or genome of interest is in HPV circulating tumor DNA (ctDNA).
28. The composition of claim 27, wherein the HPV ctDNA is from an HPV-associated squamous cell carcinoma of the head and neck, oropharynx, cervix, vulva, vagina, anal canal or penis.
29. The composition of claim 27 or claim 28, wherein the HPV ctDNA is from HPV type 16, 18, 31, 33, 35, 39, 45, 51, 52, 56, 58 or 59.
30. The composition of any of claims 27-29, wherein the pool of target-specific forward primers comprises primers that bind to target sequences in the HPV genome that are conserved between HPV strains and sub-strains.
31. The composition of any of claims 27-30, wherein the pool of HPV-specific forward primers comprises one or more of SEQ ID NO. 1 to SEQ ID NO. 77.
32. The composition of any of claims 27-30, wherein the pool of HPV-specific forward primers comprises SEQ ID NO. 1 to SEQ ID NO. 77.
33. The method of claim 1 wherein the target gene or genome of interest is an HPV 16 gene or genome and wherein the pool of HPV-specific forward primers comprises one or more of SEQ ID NO. 1 through SEQ ID NO. 77.
34. The method of claim 1 wherein the target gene or genome of interest is an HPV 16 gene or genome and wherein the pool of HPV-specific forward primers comprises SEQ ID NO. 1 through SEQ ID NO. 77.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
DETAILED DESCRIPTION OF THE INVENTION
Definitions & Abbreviations
[0016] As used in this specification and the appended claims, the singular forms “a,” “an,” and “the” include plural referents, unless the context clearly dictates otherwise. The terms “a” (or “an”) as well as the terms “one or more” and “at least one” can be used interchangeably.
[0017] Furthermore, “and/or” is to be taken as specific disclosure of each of the two specified features or components with or without the other. Thus, the term “and/or” as used in a phrase such as “A and/or B” is intended to include A and B, A or B, A (alone), and B (alone). Likewise, the term “and/or” as used in a phrase such as “A, B, and/or C” is intended to include A, B, and C; A, B, or C; A or B; A or C; B or C; A and B; A and C; B and C; A (alone); B (alone); and C (alone).
[0018] Units, prefixes, and symbols are denoted in their Système International d'Unités (SI) accepted form. Numeric ranges provided herein are inclusive of the numbers defining the range.
[0019] Where a numeric term is preceded by “about” or “approximately,” the term includes the stated number and values ±20% of the stated number.
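The ±20% convention can be illustrated with a short calculation. The following is a minimal sketch only; the function name `about_range` is a hypothetical helper introduced here for illustration and is not part of the disclosure.

```python
from typing import Tuple


def about_range(stated: float, tolerance: float = 0.20) -> Tuple[float, float]:
    """Return the (low, high) interval covered by "about <stated>".

    The default tolerance of 0.20 reflects the +/-20% convention stated
    in the text; the helper itself is illustrative only.
    """
    delta = abs(stated) * tolerance
    return (stated - delta, stated + delta)


# "approximately 100 nucleotides apart" spans 80 to 120 nucleotides.
low, high = about_range(100)
print(low, high)
```

Under this reading, for example, "approximately 100 nucleotides" covers spacings from 80 to 120 nucleotides.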
[0020] Numbers in parentheses or superscript following text in this patent disclosure refer to the numbered references provided in the “Reference List” section at the end of this patent disclosure.
[0021] Wherever embodiments are described with the language “comprising,” otherwise analogous embodiments described in terms of “consisting of” and/or “consisting essentially of” are included.
[0022] Other abbreviations and definitions may be provided elsewhere in this patent specification, or may be well known in the art.
Methods
[0023] The present invention provides modified single primer extension-based methods for generating an amplified library of fragments of a target gene or genome of interest from a sample of fragmented DNA. Such libraries are useful for detecting, quantifying and/or sequencing such a target gene or genome of interest. The present invention also provides various compositions for use in such methods. In some embodiments the methods and compositions provided herein are designed specifically for, and are useful for, detecting, quantifying and/or sequencing circulating tumor DNA, such as circulating tumor derived HPV DNA (i.e. from HPV-associated or HPV-driven tumors).
[0024] Accordingly, in one embodiment the present invention provides a method of generating a library of fragments of a target gene or genome of interest from a sample of fragmented DNA, wherein the library is suitable for use in detecting, quantifying and/or sequencing the target gene or genome of interest, the method comprising: (a) contacting a sample of fragmented DNA with a pool of target-specific forward primers complementary to multiple different primer binding sites located within, and spanning the length of, a target gene or genome of interest, wherein each target-specific forward primer comprises: (i) a sequence that is complementary to a primer binding site within the target gene or genome of interest and (ii) a first next generation sequencing (NGS) based adapter located 5′ to the sequence that is complementary to the primer binding site, (b) performing a single primer extension reaction to generate first-generation copies of the target gene or genome of interest, (c) performing a nucleotide tailing reaction or adding a common sequence to the 3′ end of the first-generation copies of the target gene or genome of interest, thereby generating 3′ tagged first generation copies of the fragmented target gene or genome of interest, (d) performing a first PCR reaction using: a common reverse primer comprising: (i) a sequence that is complementary to the common sequence and (ii) a second next generation sequencing (NGS) based adapter located 5′ to the sequence that is complementary to the common sequence, and (e) performing a second PCR reaction using: (i) a forward primer complementary to the NGS-based adapter present in the target-specific forward primer, and (ii) a reverse primer complementary to the NGS-based adapter present in the common reverse primer, thereby generating a library of fragments of the target gene or genome of interest from a sample of fragmented DNA, wherein the library is suitable for use in detecting, quantifying and/or sequencing the target gene or genome of interest.
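The arrangement of sequence elements produced by steps (a) through (d) can be traced in silico for a single fragment. The following is a simplified sketch under stated assumptions: the adapter sequences, the template, the primer binding site coordinates, and all function names are hypothetical placeholders chosen for illustration, the common sequence is taken to be a polyC tail of 10 nucleotides, and strands are treated as plain strings written 5′ to 3′. It models only the order of elements in the resulting molecule, not reaction chemistry.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


# Hypothetical placeholders; these are NOT real NGS adapter sequences.
ADAPTER_1 = "AAGGTTCC"
ADAPTER_2 = "TTGGAACC"
COMMON_SEQUENCE = "C" * 10  # assumed polyC tail


def forward_primer(template: str, start: int, end: int) -> str:
    """Step (a): 5' adapter followed by the complement of the binding site."""
    return ADAPTER_1 + revcomp(template[start:end])


def single_primer_extension(template: str, start: int, end: int) -> str:
    """Step (b): the primer anneals at template[start:end] and is extended
    to the 5' end of the template fragment."""
    return ADAPTER_1 + revcomp(template[:end])


def add_common_sequence(copy: str) -> str:
    """Step (c): append the common sequence to the 3' end of the copy."""
    return copy + COMMON_SEQUENCE


def common_reverse_primer() -> str:
    """Step (d): 5' second adapter followed by the complement of the tail."""
    return ADAPTER_2 + revcomp(COMMON_SEQUENCE)


def first_pcr_product(tagged_copy: str) -> str:
    """Step (d): the reverse primer is extended along the tagged copy; the
    complement of that strand carries both adapter elements."""
    bottom_strand = ADAPTER_2 + revcomp(tagged_copy)
    return revcomp(bottom_strand)


template = "GATTACAGGCTTAACCGGATCGATTGCAAGCT"  # toy fragment
start, end = 12, 24                            # toy primer binding site

copy = single_primer_extension(template, start, end)
tagged = add_common_sequence(copy)
library_molecule = first_pcr_product(tagged)

print("forward primer:  ", forward_primer(template, start, end))
print("library molecule:", library_molecule)
```

In this toy model the resulting strand reads, 5′ to 3′: first adapter, copied target sequence, common sequence, and the complement of the second adapter. Step (e) would then amplify such molecules using primers directed at the two adapter elements.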
[0025] In some embodiments the present invention provides a variation of the above method, in which the single primer extension reaction of step (b) is performed in the presence of biotinylated nucleotides, such that the first-generation copies of the target gene or genome of interest generated by the single primer extension reaction are biotinylated. This enables a biotin-based selection/purification step to be performed to select for only the single primer extension products before proceeding with subsequent steps of the method. Any suitable biotin-based selection method can be used. For example, the products of the single primer extension reaction can be contacted with a streptavidin-coated solid support (e.g. beads or a column) to which the biotinylated products will bind and from which they can be eluted, using methods well known in the art.
[0026] The libraries of fragments of the target gene or genome of interest generated using the methods of the present invention can be analyzed in various ways to facilitate the detection, quantification, and/or sequencing of the target gene or genome of interest. For example, in some embodiments the libraries of fragments of the target gene or genome of interest generated using the methods of the present invention can be analyzed by performing quantitative PCR (qPCR). In some embodiments the libraries of fragments of the target gene or genome of interest generated using the methods of the present invention can be analyzed by performing sequencing, such as next generation sequencing.
[0027] The sample of fragmented DNA used in the methods of the present invention can be any suitable source of fragmented DNA. In one embodiment the fragmented DNA is, or comprises, circulating cell free DNA (cfDNA). In another embodiment the fragmented DNA is, or comprises, circulating tumor DNA (ctDNA).
[0028] In some embodiments the pool of target-specific forward primers used in the methods of the present invention comprises primers complementary to from tens up to thousands of different primer binding sites within the target gene or genome of interest (e.g. approximately 10, or 20, or 30, or 40, or 50, or 60, or 70, or 75, or 80, or 90, or 100, or 200, or 300, or 400, or 500, or 600, or 700, or 800, or 900, or 1000, or 1250, or 1500, or 1750, or 2000, or 3000, or 4000, or 5000 different primer binding sites within the target gene or genome of interest). For example, in one embodiment the pool of target-specific forward primers comprises primers complementary to approximately 10 different primer binding sites within the target gene or genome of interest. In one embodiment the pool of target-specific forward primers comprises primers complementary to approximately 20 different primer binding sites within the target gene or genome of interest. In one embodiment the pool of target-specific forward primers comprises primers complementary to approximately 30 different primer binding sites within the target gene or genome of interest. In one embodiment the pool of target-specific forward primers comprises primers complementary to approximately 40 different primer binding sites within the target gene or genome of interest. In one embodiment the pool of target-specific forward primers comprises primers complementary to approximately 50 different primer binding sites within the target gene or genome of interest. In one embodiment the pool of target-specific forward primers comprises primers complementary to approximately 60 different primer binding sites within the target gene or genome of interest. In one embodiment the pool of target-specific forward primers comprises primers complementary to approximately 70 different primer binding sites within the target gene or genome of interest.
In one embodiment the pool of target-specific forward primers comprises primers complementary to approximately 75 different primer binding sites within the target gene or genome of interest. In one embodiment the pool of target-specific forward primers comprises primers complementary to approximately 80 different primer binding sites within the target gene or genome of interest. In one embodiment the pool of target-specific forward primers comprises primers complementary to approximately 90 different primer binding sites within the target gene or genome of interest. In one embodiment the pool of target-specific forward primers comprises primers complementary to approximately 100 different primer binding sites within the target gene or genome of interest. In one embodiment the pool of target-specific forward primers comprises primers complementary to approximately 200 different primer binding sites within the target gene or genome of interest. In one embodiment the pool of target-specific forward primers comprises primers complementary to approximately 300 different primer binding sites within the target gene or genome of interest. In one embodiment the pool of target-specific forward primers comprises primers complementary to approximately 400 different primer binding sites within the target gene or genome of interest. In one embodiment the pool of target-specific forward primers comprises primers complementary to approximately 500 different primer binding sites within the target gene or genome of interest. In one embodiment the pool of target-specific forward primers comprises primers complementary to approximately 600 different primer binding sites within the target gene or genome of interest. In one embodiment the pool of target-specific forward primers comprises primers complementary to approximately 700 different primer binding sites within the target gene or genome of interest. 
In one embodiment the pool of target-specific forward primers comprises primers complementary to approximately 800 different primer binding sites within the target gene or genome of interest. In one embodiment the pool of target-specific forward primers comprises primers complementary to approximately 900 different primer binding sites within the target gene or genome of interest. In one embodiment the pool of target-specific forward primers comprises primers complementary to approximately 1,000 different primer binding sites within the target gene or genome of interest. In another embodiment the pool of target-specific forward primers comprises primers complementary to approximately 2,000 different primer binding sites within the target gene or genome of interest. In another embodiment the pool of target-specific forward primers comprises primers complementary to approximately 3,000 different primer binding sites within the target gene or genome of interest. In another embodiment the pool of target-specific forward primers comprises primers complementary to approximately 4,000 different primer binding sites within the target gene or genome of interest. In another embodiment the pool of target-specific forward primers comprises primers complementary to approximately 5,000 different primer binding sites within the target gene or genome of interest. In some embodiments the pool of target-specific forward primers comprises primers complementary to from about 50 to about 100 different primer binding sites within the target gene or genome of interest. In some embodiments the pool of target-specific forward primers comprises primers complementary to from about 60 to about 90 different primer binding sites within the target gene or genome of interest. In some embodiments the pool of target-specific forward primers comprises primers complementary to from about 70 to about 80 different primer binding sites within the target gene or genome of interest.
[0029] In some embodiments the different primer binding sites within the target gene or genome of interest to which the pool of target-specific forward primers is complementary are spaced approximately 20 to 200 nucleotides apart (e.g. approximately 20, or 30, or 40, or 50, or 60, or 70, or 80, or 90, or 110, or 120, or 140, or 160, or 180, or 200 nucleotides apart).
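The relationship between target length, primer spacing, and the number of primer binding sites can be sketched with simple arithmetic. The following is an illustrative sketch only: the function `tile_primer_sites`, the 20-nucleotide primer length, and the target length of 7,906 nucleotides (chosen as roughly the size of an HPV genome) are assumptions made for this example and are not taken from the disclosure.

```python
from typing import List


def tile_primer_sites(target_length: int, spacing: int,
                      primer_length: int = 20) -> List[int]:
    """Return 0-based start coordinates of evenly spaced primer binding sites.

    A site is kept only if the whole primer fits inside the target.
    """
    if spacing <= 0 or primer_length <= 0:
        raise ValueError("spacing and primer_length must be positive")
    return list(range(0, target_length - primer_length + 1, spacing))


# Assumed target length of 7,906 nt, tiled at 100 nt spacing.
sites = tile_primer_sites(7906, 100)
print(len(sites))   # 79 sites
print(sites[:3])    # [0, 100, 200]
```

With these assumed values, a spacing of about 100 nucleotides gives a pool in the range of 70 to 80 binding sites; tighter spacing gives proportionally more sites.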
[0030] The number of cycles of the single primer extension reaction that is performed can be selected as desired. For example, in some embodiments from 1 to about 99 cycles of the single primer extension reaction are performed. In some embodiments about 10 cycles of the single primer extension reaction are performed. In some embodiments about 20 cycles of the single primer extension reaction are performed. In some embodiments about 30 cycles of the single primer extension reaction are performed. In some embodiments about 40 cycles of the single primer extension reaction are performed. In some embodiments about 50 cycles of the single primer extension reaction are performed. In some embodiments about 60 cycles of the single primer extension reaction are performed. In some embodiments about 70 cycles of the single primer extension reaction are performed. In some embodiments about 80 cycles of the single primer extension reaction are performed. In some embodiments about 90 cycles of the single primer extension reaction are performed. In some embodiments about 100 cycles of the single primer extension reaction are performed.
[0031] The number of cycles of the first PCR reaction that is performed can be selected as desired. For example, in some embodiments from 1 to about 99 cycles of the first PCR reaction are performed. In some embodiments about 10 cycles of the first PCR reaction are performed. In some embodiments about 20 cycles of the first PCR reaction are performed. In some embodiments about 30 cycles of the first PCR reaction are performed. In some embodiments about 40 cycles of the first PCR reaction are performed. In some embodiments about 50 cycles of the first PCR reaction are performed. In some embodiments about 60 cycles of the first PCR reaction are performed. In some embodiments about 70 cycles of the first PCR reaction are performed. In some embodiments about 80 cycles of the first PCR reaction are performed. In some embodiments about 90 cycles of the first PCR reaction are performed. In some embodiments about 100 cycles of the first PCR reaction are performed.
[0032] The number of cycles of the second PCR reaction that is performed can be selected as desired. For example, in some embodiments from 1 to about 99 cycles of the second PCR reaction are performed. In some embodiments about 10 cycles of the second PCR reaction are performed. In some embodiments about 20 cycles of the second PCR reaction are performed. In some embodiments about 30 cycles of the second PCR reaction are performed. In some embodiments about 40 cycles of the second PCR reaction are performed. In some embodiments about 50 cycles of the second PCR reaction are performed. In some embodiments about 60 cycles of the second PCR reaction are performed. In some embodiments about 70 cycles of the second PCR reaction are performed. In some embodiments about 80 cycles of the second PCR reaction are performed. In some embodiments about 90 cycles of the second PCR reaction are performed. In some embodiments about 100 cycles of the second PCR reaction are performed.
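The choice of cycle number has different consequences for the single primer extension reaction and for the PCR reactions. With a single primer, only the original templates are copied in each cycle, so product accumulates approximately linearly; with a primer pair, products themselves serve as templates, so accumulation is approximately exponential. The following sketch uses an idealized model; the function names and the `efficiency` parameter are assumptions introduced for illustration, and real reactions plateau and deviate from these values.

```python
def spe_copies(templates: float, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized single primer extension: each cycle makes at most one new
    copy per original template, so copies grow linearly with cycle number."""
    return templates * cycles * efficiency


def pcr_copies(templates: float, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized PCR: every molecule is a template in every cycle, so the
    population multiplies by (1 + efficiency) per cycle."""
    return templates * (1.0 + efficiency) ** cycles


# 10 starting templates at perfect efficiency (illustrative numbers only)
print(spe_copies(10, 30))   # 300.0
print(pcr_copies(10, 10))   # 10240.0
```

In this idealized model, 30 cycles of single primer extension on 10 templates give 300 copies, whereas 10 cycles of PCR on the same input give 10,240 molecules.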
[0033] Any suitable next generation sequencing (NGS) based adapters can be used in the methods of the invention. In some embodiments an Illumina NGS based adapter is used.
[0034] In some embodiments the “common sequence” used in the methods of the present invention is any suitable polynucleotide sequence. In some embodiments the “common sequence” is a polyC sequence. In some embodiments the “common sequence” is a polyG sequence. In some embodiments the “common sequence” is a polyA sequence. In some embodiments the “common sequence” is a polyT sequence.
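Because the common reverse primer must include a sequence complementary to the common sequence, the choice of homopolymer determines the 3′ portion of that primer. The following minimal sketch shows the pairing; the function name `reverse_primer_3prime` and the length of 12 nucleotides are hypothetical choices for illustration only.

```python
# Watson-Crick partner of each homopolymer base
TAIL_COMPLEMENT = {"C": "G", "G": "C", "A": "T", "T": "A"}


def reverse_primer_3prime(tail_base: str, length: int = 12) -> str:
    """Return the homopolymer that is complementary to a poly-<tail_base>
    common sequence (the length is an assumed, illustrative value)."""
    base = tail_base.upper()
    if base not in TAIL_COMPLEMENT:
        raise ValueError("tail_base must be one of A, C, G, T")
    return TAIL_COMPLEMENT[base] * length


for tail in "CGAT":
    print(f"poly{tail} common sequence -> poly{TAIL_COMPLEMENT[tail]} primer portion:",
          reverse_primer_3prime(tail))
```

For instance, a polyC common sequence pairs with a polyG stretch in the common reverse primer.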
[0035] In those embodiments of the present invention that utilize biotinylated nucleotides, any suitable biotinylated nucleotides can be used. In some embodiments biotin-dCTP nucleotides are used. In some embodiments biotin-dGTP nucleotides are used. In some embodiments biotin-dATP nucleotides are used. In some embodiments any combination of biotin-dCTP, biotin-dGTP, biotin-dATP and/or biotin-dTTP nucleotides is used.
[0036] In some embodiments the sample of fragmented DNA used in the methods of the present invention is, or comprises, HPV circulating tumor DNA (ctDNA) and the target gene or genome of interest is an HPV gene or genome. In some such embodiments the HPV ctDNA is from an HPV-associated squamous cell carcinoma of the head and neck, oropharynx, cervix, vulva, vagina, anal canal or penis. In some such embodiments the HPV ctDNA is from an HPV-associated squamous cell carcinoma of the head and neck. In some such embodiments the HPV ctDNA is from an HPV-associated squamous cell carcinoma of the oropharynx. In some such embodiments the HPV ctDNA is from an HPV-associated squamous cell carcinoma of the cervix. In some such embodiments the HPV ctDNA is from an HPV-associated squamous cell carcinoma of the vulva. In some such embodiments the HPV ctDNA is from an HPV-associated squamous cell carcinoma of the vagina. In some such embodiments the HPV ctDNA is from an HPV-associated squamous cell carcinoma of the anal canal. In some such embodiments the HPV ctDNA is from an HPV-associated squamous cell carcinoma of the penis. Similarly, in some such embodiments the HPV ctDNA is from HPV type 16, 18, 31, 33, 35, 39, 45, 51, 52, 56, 58 or 59. For example, in some embodiments the HPV ctDNA is from HPV type 16. In some embodiments the HPV ctDNA is from HPV type 18. In some embodiments the HPV ctDNA is from HPV type 31. In some embodiments the HPV ctDNA is from HPV type 33. In some embodiments the HPV ctDNA is from HPV type 35. In some embodiments the HPV ctDNA is from HPV type 39. In some embodiments the HPV ctDNA is from HPV type 45. In some embodiments the HPV ctDNA is from HPV type 51. In some embodiments the HPV ctDNA is from HPV type 52. In some embodiments the HPV ctDNA is from HPV type 56. In some embodiments the HPV ctDNA is from HPV type 58. In some embodiments the HPV ctDNA is from HPV type 59.
[0037] In those embodiments of the present invention where the target gene or genome of interest is an HPV gene or genome, the methods provided herein utilize a pool of target-specific forward primers that are HPV-specific—i.e. a pool of HPV-specific forward primers. Examples of suitable HPV-specific forward primers are provided in Table 1, below—which provides the sequences of 77 different HPV-specific forward primers. For example, in some such embodiments the pool of target-specific forward primers comprises one or more of SEQ ID NO. 1 through SEQ ID NO. 77. In some such embodiments the pool of target-specific forward primers comprises at least 10 primers from among SEQ ID NO. 1 through SEQ ID NO. 77. In some such embodiments the pool of target-specific forward primers comprises at least 20 primers from among SEQ ID NO. 1 through SEQ ID NO. 77. In some such embodiments the pool of target-specific forward primers comprises at least 30 primers from among SEQ ID NO. 1 through SEQ ID NO. 77. In some such embodiments the pool of target-specific forward primers comprises at least 40 primers from among SEQ ID NO. 1 through SEQ ID NO. 77. In some such embodiments the pool of target-specific forward primers comprises at least 50 primers from among SEQ ID NO. 1 through SEQ ID NO. 77. In some such embodiments the pool of target-specific forward primers comprises at least 60 primers from among SEQ ID NO. 1 through SEQ ID NO. 77. In some such embodiments the pool of target-specific forward primers comprises at least 70 primers from among SEQ ID NO. 1 through SEQ ID NO. 77. In some such embodiments the pool of target-specific forward primers comprises each of SEQ ID NO. 1 through SEQ ID NO. 77. In some embodiments the pool of target-specific forward primers comprises primers that bind to a target sequence in the HPV genome that: (i) is within the E6 and/or E7 region of the HPV genome, and/or (ii) is 100% conserved between European and non-European HPV isolates.
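The criterion that a target sequence be 100% conserved between isolates can be checked computationally when isolate sequences are available. The following is a minimal sketch under simplifying assumptions: the isolate sequences are toy strings invented for illustration, they are assumed to be pre-aligned, ungapped, and of equal length, and the function `conserved_sites` is a hypothetical helper rather than a method described in this disclosure.

```python
from typing import List


def conserved_sites(isolates: List[str], k: int) -> List[int]:
    """Return 0-based start positions at which every isolate carries the
    identical k-nucleotide window (candidate primer binding sites)."""
    if not isolates:
        return []
    length = len(isolates[0])
    if any(len(seq) != length for seq in isolates):
        raise ValueError("isolates must be pre-aligned to equal length")
    reference = isolates[0]
    return [
        i
        for i in range(length - k + 1)
        if all(seq[i:i + k] == reference[i:i + k] for seq in isolates)
    ]


# Toy isolates differing only at position 6 (G versus C)
toy_isolates = ["ACGTACGTAC", "ACGTACCTAC", "ACGTACGTAC"]
print(conserved_sites(toy_isolates, k=4))   # [0, 1, 2]
```

Only windows that avoid the variable position are reported, so primers placed at those start positions would match all of the toy isolates exactly.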
Compositions
[0038] The present invention also provides various compositions useful in performing the methods described herein. For example, the present invention provides compositions comprising a pool of target-specific forward primers suitable for use in a single primer extension reaction, wherein the pool comprises primers complementary to multiple different primer binding sites located within, and spanning the length of, a target gene or genome of interest, wherein each target-specific forward primer comprises: (i) a sequence that is complementary to a primer binding site within the target gene or genome of interest and (ii) a first next generation sequencing (NGS) based adapter located 5′ to the sequence that is complementary to the primer binding site.
[0039] In some embodiments such compositions comprise a pool of target-specific forward primers complementary to primer binding sites in circulating cell free DNA (cfDNA). In some embodiments such compositions comprise a pool of target-specific forward primers complementary to primer binding sites in circulating tumor DNA (ctDNA).
[0040] In some embodiments such compositions comprise a pool of target-specific forward primers that are complementary to from tens up to thousands of different primer binding sites within the target gene or genome of interest (e.g. approximately 10, or 20, or 30, or 40, or 50, or 60, or 70, or 75, or 80, or 90, or 100, or 200, or 300, or 400, or 500, or 600, or 700, or 800, or 900, or 1000, or 1250, or 1500, or 1750, or 2000, or 3000, or 4000, or 5000 different primer binding sites within the target gene or genome of interest). For example, in one embodiment the pool of target-specific forward primers comprises primers complementary to approximately 10 different primer binding sites within the target gene or genome of interest. In one embodiment the pool of target-specific forward primers comprises primers complementary to approximately 20 different primer binding sites within the target gene or genome of interest. In one embodiment the pool of target-specific forward primers comprises primers complementary to approximately 30 different primer binding sites within the target gene or genome of interest. In one embodiment the pool of target-specific forward primers comprises primers complementary to approximately 40 different primer binding sites within the target gene or genome of interest. In one embodiment the pool of target-specific forward primers comprises primers complementary to approximately 50 different primer binding sites within the target gene or genome of interest. In one embodiment the pool of target-specific forward primers comprises primers complementary to approximately 60 different primer binding sites within the target gene or genome of interest. In one embodiment the pool of target-specific forward primers comprises primers complementary to approximately 70 different primer binding sites within the target gene or genome of interest.
In one embodiment the pool of target-specific forward primers comprises primers complementary to approximately 75 different primer binding sites within the target gene or genome of interest. In one embodiment the pool of target-specific forward primers comprises primers complementary to approximately 80 different primer binding sites within the target gene or genome of interest. In one embodiment the pool of target-specific forward primers comprises primers complementary to approximately 90 different primer binding sites within the target gene or genome of interest. In one embodiment the pool of target-specific forward primers comprises primers complementary to approximately 100 different primer binding sites within the target gene or genome of interest. In one embodiment the pool of target-specific forward primers comprises primers complementary to approximately 200 different primer binding sites within the target gene or genome of interest. In one embodiment the pool of target-specific forward primers comprises primers complementary to approximately 300 different primer binding sites within the target gene or genome of interest. In one embodiment the pool of target-specific forward primers comprises primers complementary to approximately 400 different primer binding sites within the target gene or genome of interest. In one embodiment the pool of target-specific forward primers comprises primers complementary to approximately 500 different primer binding sites within the target gene or genome of interest. In one embodiment the pool of target-specific forward primers comprises primers complementary to approximately 600 different primer binding sites within the target gene or genome of interest. In one embodiment the pool of target-specific forward primers comprises primers complementary to approximately 700 different primer binding sites within the target gene or genome of interest. 
In one embodiment the pool of target-specific forward primers comprises primers complementary to approximately 800 different primer binding sites within the target gene or genome of interest. In one embodiment the pool of target-specific forward primers comprises primers complementary to approximately 900 different primer binding sites within the target gene or genome of interest. In one embodiment the pool of target-specific forward primers comprises primers complementary to approximately 1,000 different primer binding sites within the target gene or genome of interest. In another embodiment the pool of target-specific forward primers comprises primers complementary to approximately 2,000 different primer binding sites within the target gene or genome of interest. In another embodiment the pool of target-specific forward primers comprises primers complementary to approximately 3,000 different primer binding sites within the target gene or genome of interest. In another embodiment the pool of target-specific forward primers comprises primers complementary to approximately 4,000 different primer binding sites within the target gene or genome of interest. In another embodiment the pool of target-specific forward primers comprises primers complementary to approximately 5,000 different primer binding sites within the target gene or genome of interest. In some embodiments the pool of target-specific forward primers comprises primers complementary to from about 50 to about 100 different primer binding sites within the target gene or genome of interest. In some embodiments the pool of target-specific forward primers comprises primers complementary to from about 60 to about 90 different primer binding sites within the target gene or genome of interest. In some embodiments the pool of target-specific forward primers comprises primers complementary to from about 70 to about 80 different primer binding sites within the target gene or genome of interest.
[0041] In some such embodiments such compositions comprise a pool of target-specific forward primers complementary to primer binding sites within the target gene or genome of interest that are spaced approximately 25 nucleotides apart. In some such embodiments such compositions comprise a pool of target-specific forward primers complementary to primer binding sites within the target gene or genome of interest that are spaced approximately 50 nucleotides apart. In some such embodiments such compositions comprise a pool of target-specific forward primers complementary to primer binding sites within the target gene or genome of interest that are spaced approximately 75 nucleotides apart. In some such embodiments such compositions comprise a pool of target-specific forward primers complementary to primer binding sites within the target gene or genome of interest that are spaced approximately 100 nucleotides apart. In some such embodiments such compositions comprise a pool of target-specific forward primers complementary to primer binding sites within the target gene or genome of interest that are spaced approximately 125 nucleotides apart. In some such embodiments such compositions comprise a pool of target-specific forward primers complementary to primer binding sites within the target gene or genome of interest that are spaced approximately 150 nucleotides apart. In some such embodiments such compositions comprise a pool of target-specific forward primers complementary to primer binding sites within the target gene or genome of interest that are spaced approximately 175 nucleotides apart. In some such embodiments such compositions comprise a pool of target-specific forward primers complementary to primer binding sites within the target gene or genome of interest that are spaced approximately 200 nucleotides apart.
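The relationship between primer spacing and pool size can be illustrated with a short calculation. The sketch below is an illustration only and is not part of the disclosed methods: it assumes evenly spaced binding sites on a linear coordinate system, and both the function name `tile_start_positions` and the 8,000-nucleotide target length (a round figure standing in for an approximately 8 kb genome) are assumptions.

```python
def tile_start_positions(target_length, spacing):
    """Return 0-based start coordinates of evenly spaced primer binding sites.

    Assumes a linear target and a fixed distance between site starts;
    both are simplifying assumptions made for illustration.
    """
    if target_length <= 0 or spacing <= 0:
        raise ValueError("target_length and spacing must be positive")
    return list(range(0, target_length, spacing))


if __name__ == "__main__":
    target_length = 8000  # assumed round figure for an ~8 kb target
    for spacing in (25, 50, 75, 100, 125, 150, 175, 200):
        n_sites = len(tile_start_positions(target_length, spacing))
        print(f"spacing {spacing:>3} nt -> {n_sites} binding sites")
```

Under these assumptions a spacing of approximately 100 nucleotides corresponds to 80 binding sites, which falls within the range of about 70 to about 80 binding sites described above.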
[0042] In some such embodiments the NGS-based adapter in the primers in the pool is an Illumina adapter.
[0043] In some such embodiments the primers in the pool are complementary to primer binding sites located within an HPV gene or genome. In some such embodiments the primers in the pool are complementary to primer binding sites located within an HPV circulating tumor DNA (ctDNA). In some such embodiments the HPV ctDNA is from an HPV-associated squamous cell carcinoma of the head and neck, oropharynx, cervix, vulva, vagina, anal canal or penis. In some such embodiments the HPV ctDNA is from HPV type 16, 18, 31, 33, 35, 39, 45, 51, 52, 56, 58 or 59. In some such embodiments the primers in the pool are complementary to primer binding sites in the HPV genome that are conserved between HPV strains and sub-strains.
[0044] In some such embodiments the pool of target-specific forward primers comprises one or more of SEQ ID NO. 1 to SEQ ID NO. 77. In some such embodiments the pool of target-specific forward primers comprises at least 10 primers from among SEQ ID NO. 1 through SEQ ID NO. 77. In some such embodiments the pool of target-specific forward primers comprises at least 20 primers from among SEQ ID NO. 1 through SEQ ID NO. 77. In some such embodiments the pool of target-specific forward primers comprises at least 30 primers from among SEQ ID NO. 1 through SEQ ID NO. 77. In some such embodiments the pool of target-specific forward primers comprises at least 40 primers from among SEQ ID NO. 1 through SEQ ID NO. 77. In some such embodiments the pool of target-specific forward primers comprises at least 50 primers from among SEQ ID NO. 1 through SEQ ID NO. 77. In some such embodiments the pool of target-specific forward primers comprises at least 60 primers from among SEQ ID NO. 1 through SEQ ID NO. 77. In some such embodiments the pool of target-specific forward primers comprises at least 70 primers from among SEQ ID NO. 1 through SEQ ID NO. 77. In some such embodiments the pool of target-specific forward primers comprises each of SEQ ID NO. 1 through SEQ ID NO. 77. In some embodiments the pool of target-specific forward primers comprises primers that bind to a target sequence in the HPV genome that: (i) is within the E6 and/or E7 region of the HPV genome, and/or (ii) is 100% conserved between European and non-European HPV isolates.
[0045] Each of the compositions described herein can, in some embodiments, comprise one or more additional components compatible with the storage and/or use of the pool of primers, such as suitable salts, buffers, preservatives, nucleotides, enzymes, and the like.
Applications
[0046] The methods and compositions provided herein have a variety of applications. For example, such methods and compositions can be employed to screen for, quantify and/or sequence tumor DNA (such as DNA from an HPV-associated tumor) in the circulation of a subject (e.g. a patient suspected of having cancer, such as an HPV-associated cancer). Similarly, in some embodiments such methods and compositions can be employed to assess tumor burden (for example of an HPV-positive tumor) in a subject, as the quantity of the amplified library products should correlate with tumor burden. In some of such methods, controls and/or standard curves are used to quantify or give an estimate of the tumor burden, e.g. in terms of tumor volume or number of tumor cells, etc.
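The use of a standard curve for quantification can be sketched as follows. This is an illustration only, not a procedure taken from this disclosure: it assumes that the measured signal (for example, a count of library products) is log-linear in the number of input copies, the function names are hypothetical, and the calibration points are made-up placeholders rather than measured data.

```python
import math


def fit_standard_curve(known_copies, signals):
    """Least-squares fit of log10(signal) = slope * log10(copies) + intercept."""
    if len(known_copies) != len(signals) or len(known_copies) < 2:
        raise ValueError("need at least two paired calibration points")
    xs = [math.log10(c) for c in known_copies]
    ys = [math.log10(s) for s in signals]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("calibration inputs must not all be identical")
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def estimate_copies(signal, slope, intercept):
    """Invert the fitted curve to estimate input copies from a measured signal."""
    if slope == 0:
        raise ValueError("a flat standard curve cannot be inverted")
    return 10 ** ((math.log10(signal) - intercept) / slope)


# Placeholder calibration points (made up for illustration, not measured data):
# known input copy numbers and the corresponding signals.
slope, intercept = fit_standard_curve([10, 100, 1000], [50, 500, 5000])
estimated = estimate_copies(1000, slope, intercept)
```

Converting an estimated copy number into tumor volume or a number of tumor cells would require additional calibration that is not modeled in this sketch.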
[0047] Similarly, the methods and compositions of the present invention can be employed to monitor the progression or recurrence of a cancer (such as an HPV-positive cancer) in a subject, or to monitor the response to therapy of a cancer (such as an HPV-positive cancer) in a subject. Such methods involve determining changes in the quantity of the specific amplified library products over time. Typically, such methods entail performing the methods described herein using two or more plasma samples obtained from the subject at different time points. For example, in some embodiments the methods of the present invention are performed using a first plasma sample obtained from a subject at a first time point and a second plasma sample obtained from the subject at a second time point. Using such methods, an increase or decrease in the quantity of the specific amplified library products between the first sample/time point and the second sample/time point can be detected and quantified. For example, an increase in the quantity of the amplified library products between the first sample/time point and the second sample/time point may indicate an increase in tumor burden, for example as a result of tumor progression, or as a result of tumor recurrence following a previous treatment. Similarly, a decrease in the quantity of the specific amplified library products between a first sample/time point prior to treatment (or earlier in treatment) and a second sample/time point subsequent to commencement of treatment (or later in treatment, or after treatment) may indicate that the treatment is effective. Conversely, an increase in the quantity of the specific amplified library products between a first sample/time point prior to treatment (or earlier in treatment) and a second sample/time point subsequent to commencement of treatment (or later in treatment, or after treatment) may indicate that the treatment is ineffective. 
In those methods aimed at monitoring the response to therapy, the methods may be performed using both a “test” sample and a “control” sample. For example, the test sample may be obtained from a subject treated with a new/test therapeutic molecule and the control sample may be obtained from an untreated subject, or a subject treated with a placebo, or a subject treated with a comparator therapeutic molecule. Such methods can be used to monitor the response to any desired type of therapy, including, but not limited to, therapy with chemotherapeutic agents, therapy with other therapeutic molecules, therapy using radiation, and surgical therapy.
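The comparison between two time points described above can be sketched in a few lines. This is an illustration only: the function names, the use of a simple fold change, and the 2-fold threshold are assumptions chosen for the example, not values taken from this disclosure.

```python
def fold_change(first_quantity, second_quantity):
    """Ratio of the second measurement to the first.

    The first quantity must be positive; the second may be zero
    (for example, when no library products are detected).
    """
    if first_quantity <= 0 or second_quantity < 0:
        raise ValueError("first quantity must be > 0 and second must be >= 0")
    return second_quantity / first_quantity


def classify_change(first_quantity, second_quantity, threshold=2.0):
    """Label the change between two time points.

    `threshold` is a hypothetical cut-off: changes smaller than
    threshold-fold in either direction are reported as 'stable'.
    """
    if threshold < 1:
        raise ValueError("threshold must be at least 1")
    ratio = fold_change(first_quantity, second_quantity)
    if ratio >= threshold:
        return "increase"
    if ratio <= 1 / threshold:
        return "decrease"
    return "stable"
```

For example, `classify_change(100, 400)` reports an increase, which in the context above may indicate progression, recurrence, or an ineffective treatment, while `classify_change(400, 100)` reports a decrease. Any real cut-off would need to be established against appropriate controls.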
[0048] One of skill in the art will recognize that the various methods and compositions of the present invention described throughout this patent disclosure can be combined in various different ways, and that such combinations are within the scope of the present invention.
[0049] One of skill in the art will also recognize that the methods and compositions of the present invention described herein are applicable more widely than to only detection of circulating tumor DNA and to only detection in plasma samples, but can also be applied to, and used in conjunction with, detection of various other forms of DNA (i.e. other than ctDNA) and various other tissue samples (i.e. other than plasma), including, but not limited to, blood, urine, cerebrospinal fluid, saliva, and cervical tissue samples. Thus, in each instance in the present specification, and the accompanying claims, in which an embodiment of the invention is described as involving a plasma sample, the present invention also encompasses the analogous embodiment in which another tissue sample (such as blood, urine, cerebrospinal fluid, saliva, or a cervical sample) is used in place of the plasma sample. Similarly, in each instance in the present specification, and the accompanying claims, in which an embodiment of the invention is described as involving ctDNA, the present invention also encompasses the analogous embodiment in which another type or source of DNA (i.e. other than ctDNA) is used or detected.
[0050] The invention is further described by the following non-limiting “Examples,” as well as the Figures referred to therein and the descriptions of such Figures provided above.
EXAMPLES
Example 1
“SPECTRE-Seq” for Improved Detection and Monitoring of HPV Associated Cancers
[0051] The present example demonstrates the development of a modified single primer extension (SPEX) technique termed “SPECTRE-seq” which is useful for the detection and sequencing of ctDNA and other forms of cfDNA. In the present non-limiting example, this technique is used to capture the entire HPV genome from a ctDNA sample for high throughput sequencing. However, the SPECTRE-seq method can also be applied to detection of other target sequences of interest in ctDNA, cfDNA or other sources of fragmented DNA.
[0052] Rationale: cfDNA is fragmented (106-200 bp) and conventional PCR approaches capture only the fraction of cfDNA fragments that contain both opposing primer sequences. Without a multiplex strategy to assay for HPV, PCR-based methods cannot estimate the absolute number of HPV copies, nor capture multiple loci from the same sample.
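The advantage of requiring only one primer binding site per fragment can be made concrete with a simple counting argument. The sketch below is a simplified model, not an analysis from this disclosure: it assumes fragments of a fixed length whose positions are uniformly distributed, counts a fragment as usable if it fully contains the required window, and ignores any minimum extension length needed downstream of the primer. The function name and the example lengths (150 bp fragment, 20 nt primer site, 100 bp two-primer amplicon) are assumptions.

```python
def placements_containing(fragment_length, window_length):
    """Number of fragment positions that fully contain a window.

    A fragment of length L fully contains a window of length w in
    L - w + 1 distinct positions, or in none if the window is longer.
    """
    return max(fragment_length - window_length + 1, 0)


if __name__ == "__main__":
    fragment = 150  # assumed cfDNA fragment length (bp)
    primer = 20     # assumed length of a single primer binding site (nt)
    amplicon = 100  # assumed two-primer PCR amplicon, spanning both sites (bp)

    single = placements_containing(fragment, primer)
    paired = placements_containing(fragment, amplicon)
    print(f"single primer site: {single} placements")
    print(f"two-primer amplicon: {paired} placements")
    print(f"ratio: {single / paired:.2f}")
```

With these assumed lengths, 131 fragment placements contain the single primer site whereas 51 contain the full two-primer amplicon, so roughly 2.6 times as many fragments are candidates for single primer extension under this model.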
[0053] Experimental Design: SPEX was previously developed to generate strand-specific, accurate sequence information from ancient DNA (1). SPEX can overcome the limitations of amplifying fragmented DNA by using only one sequence-specific primer per target sequence. We modified prior SPEX methods to improve their utility, sensitivity and specificity for detection of cfDNA and ctDNA. A schematic representation of one of our improved SPECTRE-seq methods is provided in
[0054] These SPECTRE-seq methods include several modifications over and above prior SPEX techniques. One modification is the addition of next generation sequencing (NGS)-based adapters on the 5′ ends of both the forward (target-specific) and reverse (not target-specific) primers. This enables the generation of an NGS library without downstream ligation or amplification steps. Another modification is the use of a set of multiple different target-specific forward primers (i.e. a “primer set” or “primer pool”) that tiles across the entirety of the target sequence to be detected. In this particular non-limiting example the primer set comprised primers that tile across the entirety of the 8 kb HPV genome.
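The structure of an adapter-tailed forward primer (adapter at the 5′ end, target-specific sequence at the 3′ end) can be sketched as simple string concatenation. This is an illustration only: the adapter sequence below is a made-up placeholder, not an actual NGS adapter and not a sequence from this disclosure, and the function names are hypothetical. The two target-specific sequences are SEQ ID NO. 1 and SEQ ID NO. 2 from Table 1.

```python
PLACEHOLDER_ADAPTER = "ACGTACGTACGT"  # hypothetical placeholder, NOT a real NGS adapter


def _clean(sequence):
    """Upper-case a DNA sequence and check that it contains only A, C, G, T."""
    sequence = sequence.strip().upper()
    if not sequence or set(sequence) - set("ACGT"):
        raise ValueError("sequence must be non-empty and contain only A, C, G, T")
    return sequence


def add_five_prime_adapter(target_specific, adapter=PLACEHOLDER_ADAPTER):
    """Return the primer written 5'->3': adapter first, target-specific part last."""
    return _clean(adapter) + _clean(target_specific)


target_specific_primers = {
    1: "tataaaactaagggcgtaacc",  # SEQ ID NO. 1 (Table 1)
    2: "aatgtttcaggacccaca",     # SEQ ID NO. 2 (Table 1)
}

# Apply the same 5' adapter to every primer in the (partial) pool.
tailed = {
    seq_id: add_five_prime_adapter(seq)
    for seq_id, seq in target_specific_primers.items()
}
```

Because every primer in the pool carries the same 5′ adapter, a single adapter-specific primer can later amplify all extension products regardless of which target-specific primer generated them.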
[0055] In an additional improvement we also developed a version of this technique in which the SPEX extension reaction is supplemented with biotinylated nucleotides (biotin-dCTP) (as shown schematically in
[0056] Preliminary results: We optimized the extension, C-tailing, and biotin-SA purification portions of the protocol using one primer and a single stranded DNA template with an HPV sequence.
[0058] The key advantages of the SPECTRE-seq technique include: (1) the ability to multiplex across a large sequence (such as the HPV genome), (2) in the case of HPV, the ability to utilize relevant high-risk HPV strains, such as 16, 18, 33, and 35, simultaneously, and (3) the potential to incorporate additional genomic regions of interest for detection of cancer mutations.
Example 2
“SPECTRE-Seq” Primers for Detection & Sequencing of the HPV16 Genome
[0059] Table 1, below, provides an exemplary pool of SPECTRE primers designed to tile across the HPV16 genome—for use in the SPECTRE-seq methods described herein.
TABLE 1
Amplicon (SEQ ID NO.)    Forward primer (nucleotide sequence)
1     tataaaactaagggcgtaacc
2     aatgtttcaggacccaca
3     cagttactgcgacgtgag
4     gacattattgttatagtttgtatgga
5     aagcaaagacatctggaca
6     ttgcagatcatcaagaaca
7     tgcaaccagagacaactg
8     ggacagagcccattacaa
9     gggcacactaggaattgt
10    gggatgtaatggatggttt
11    atttaacacaggcagaaaca
12    gcagtacaggttctaaaacga
13    agagctgcaaaaaggaga
14    gactgaaacaccatgtagtca
15    acactatatgccaaacacca
16    ggggtgagtttttcagaat
17    gctgacagtataaaaacactattaca
18    caattgaaaaattgctgtcta
19    gcagcagcattatattggtat
20    catttgaattatcacagatgg
21    aatgcaagtgcctttctaa
22    aaatgagtatgagtcaatggata
23    aagatttttgcaaggcata
24    aatttctgcaagggtctg
25    tagcagatgccaaaatagg
26    caactaaaatgccctcca
27    tggtggtgtttacatttcc
28    tggtccagattaagtttgc
29    agtacagacctacgtgaccata
30    gccaacactggctgtatc
31    aagtggacattacaagacgtt
32    tgcagtttgatggagaca
33    atgttcatgaaggaatacga
34    ctgtgtttagcagcaacg
35    accgaagaaacacagacg
36    acaccactaagttgttgcac
37    agtaacactacacccatagtacatt
38    tggacaggacataatgtaaaa
39    aaataccaaaaactattacagtgtc
40    taatacgtccgctgcttt
41    cctctgcgtttaggtgtt
42    gacacaaacgttctgcaa
43    cctaaggttgaaggcaaa
44    gggttaggaattggaaca
45    agaccccctttaacagtagat
46    ccccagatgtatcaggatt
47    cactttcactgacccatct
48    ttatgaagaaattcctatggatac
49    atagtcgcacaacacaacag
50    gcatatgaaggtatagatgtgg
51    taggccagcattaacctc
52    ttgatcctgcagaagaaat
53    tgatatttatgcagatgactttatt
54    tcaggttatattcctgcaaa
55    atagttccagggtctccac
56    ctctttggctgcctagtg
57    tatgttgcacgcacaaac
58    atcaggattacaatacagggta
59    cctgtgtaggtgttgaggt
60    ggatgacacagaaaatgct
61    ccacctataggggaacact
62    cacagttattcaggatggtg
63    tccagattatattaaaatggtgtc
64    ggtgaaaatgtaccagacg
65    ttttcctacacctagtggttc
66    tgttggggtaaccaactatt
67    acatggggaggaatatga
68    cactattttggaggactgg
69    acacctccagcacctaaa
70    ttcctttaggacgcaaat
71    tacaactgctaaacgcaaa
72    gtgcttgtaaatattaagttgtatgt
73    attgtgtcatgcaacataaata
74    aaacttgtacgtttcctgct
75    gcactatgtgcaactactgaa
76    gcacatatttttggcttgt
77    atttgtaaaactgcacatgg
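Basic properties of the primers in Table 1, such as length and GC content, can be tabulated with a short script. The sketch below is illustrative only: it uses just the first three sequences from Table 1, the function name `gc_fraction` is an assumption, and it does not reproduce any design criteria used by the inventors.

```python
def gc_fraction(sequence):
    """Fraction of bases in `sequence` that are G or C (case-insensitive)."""
    sequence = sequence.strip().upper()
    if not sequence:
        raise ValueError("sequence must not be empty")
    return sum(base in "GC" for base in sequence) / len(sequence)


# First three target-specific sequences from Table 1.
primers = {
    1: "tataaaactaagggcgtaacc",
    2: "aatgtttcaggacccaca",
    3: "cagttactgcgacgtgag",
}

for seq_id, seq in primers.items():
    print(f"SEQ ID NO. {seq_id}: length {len(seq)} nt, GC {gc_fraction(seq):.0%}")
```

The same loop can be extended to all 77 sequences to check that the pool is reasonably uniform in length and base composition before use.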
REFERENCES
[0060] 1. Brotherton, P. et al. Novel high-resolution characterization of ancient DNA reveals C > U-type base modification events as the sole cause of post-mortem miscoding lesions. Nucleic Acids Research 35, 5717-5728, doi:10.1093/nar/gkm588 (2007).
[0061] 2. Brotherton, P., Sanchez, J. J., Cooper, A. & Endicott, P. Preferential access to genetic information from endogenous hominin ancient DNA and accurate quantitative SNP-typing via SPEX. Nucleic Acids Research 38, e7-e7, doi:10.1093/nar/gkp897 (2010).