Methods for Detection of Nucleotide Modification
20200095633 · 2020-03-26
Inventors
Cpc classification
C07H1/00
CHEMISTRY; METALLURGY
C01G55/002
CHEMISTRY; METALLURGY
C12Q1/6876
CHEMISTRY; METALLURGY
C12Q1/6806
CHEMISTRY; METALLURGY
International classification
C12Q1/6876
CHEMISTRY; METALLURGY
C07H1/00
CHEMISTRY; METALLURGY
C12Q1/6806
CHEMISTRY; METALLURGY
Abstract
This invention relates to the identification of modified cytosine residues, such as 5-methylcytosine (5mC), 5-hydroxymethylcytosine (5hmC) and 5-formylcytosine (5fC), so that they can be distinguished from cytosine (C) in a sample nucleotide sequence. Methods may comprise oxidising or reducing a first portion of polynucleotides which comprise the sample nucleotide sequence; treating the oxidised or reduced first portion and a second portion of polynucleotides with bisulfite; sequencing the polynucleotides in the first and second portions of the population following these treatments to produce first and second nucleotide sequences, respectively; and identifying the residue in the first and second nucleotide sequences which corresponds to a cytosine residue in the sample nucleotide sequence. These methods may be useful, for example, in the analysis of genomic DNA and/or of RNA.
Claims
1.-32. (canceled)
33. A method comprising: contacting a sample comprising a 5-hydroxymethylcytosine with a metal oxo complex, wherein said metal oxo complex converts said 5-hydroxymethylcytosine to 5-formylcytosine.
34. The method of claim 33, wherein said sample comprises a nucleotide sequence.
35. The method of claim 33, wherein said metal oxo complex is a perruthenate.
36. The method of claim 35, wherein said perruthenate is KRuO4.
37. The method of claim 34, wherein said nucleotide sequence comprises genomic DNA.
38. The method of claim 34, wherein said nucleotide sequence comprises RNA.
39. The method of claim 34, wherein said nucleotide sequence is immobilized.
40. The method of claim 34, further comprising confirming a presence of said 5-hydroxymethylcytosine.
41. The method of claim 40, wherein said confirming comprises amplifying said nucleotide sequence.
42. The method of claim 40 or 41, wherein said confirming comprises sequencing said nucleotide sequence.
43. The method of claim 34, further comprising reducing said nucleotide sequence.
44. The method of claim 33, wherein said contacting occurs under aqueous conditions.
Description
[0168] Table 1 shows sequencing outcomes for cytosine and modified cytosines subjected to various treatments.
[0169] Table 2 shows the structures of cytosine (1a), 5-methylcytosine (5mC; 1b), 5-hydroxymethylcytosine (5hmC; 1c) and 5-formylcytosine (5fC; 1d).
[0170] Table 3 shows a summary of the efficiencies of oxidation of 5hmC in DNA for some examples of water-soluble oxidants.
[0171] Tables 4 and 5 show the retention times for the peaks in the HPLC traces of DNA (
[0172] Experiments
[0173] 1. Methods
[0174] 1.1 d5hmCTP Oxidation to d5fCTP and d5cCTP with MnO2
[0175] 2.5 μL d5hmCTP (100 mM, Bioline) in 497.5 μL H2O with 51.6 mg MnO2 (for d5fCTP) or 500 mg MnO2 (for d5cCTP) (Alfa Aesar) was shaken at 50 °C for 2 h 30 min. MnO2 was then removed by filtration using Amicon Ultra 0.5 mL 10 kDa columns (Millipore) and the sample was lyophilized. The nucleotide triphosphate was resuspended (5 mM) and dephosphorylated with alkaline phosphatase (New England Biolabs) overnight at 37 °C.
[0176] 1.2 Bisulfite Timecourse with d5fC and d5cC Nucleoside
[0177] 9 μL d5fC or d5cC (5 mM), 0.5 μL dA (0.1 M, Roche) and 2.5 μL H2O were mixed and then 33 μL 4 M NaHSO3 (MP Biochemicals) was added. This was split into three 15 μL reactions and held at 50 °C in the dark. 0.5 μL fractions were taken at various time points and worked up in 2.5 μL H2O and 2 μL NaOH (1 M). After being held for at least 30 min at room temperature they were injected into the HPLC. Peak areas were measured, correlated to a calibration curve of d5fC, d5cC, dC or dU, and standardised to the level of dA in the chromatogram.
[0178] 1.3 DNA Digestion for HPLC Analysis
[0179] DNA was digested according to a literature protocol (30), purified with Amicon Ultra 0.5 mL 10 kDa columns and analysed by HPLC using an Agilent 1100 HPLC with a flow of 1 mL/min over an Eclipse XDB-C18 3.5 μm, 3.0 × 150 mm column. The column temperature was maintained at 45 °C. Eluting buffers were Buffer A (500 mM ammonium acetate (Fisher), pH 5), Buffer B (acetonitrile) and Buffer C (H2O). Buffer A was held at 1% throughout the whole run and the gradient for the remaining buffers was: 0 min, 0.5% B; 2 min, 1% B; 8 min, 4% B; 10 min, 95% B.
[0180] The retention times of 2′-deoxynucleosides are as follows: 2′-deoxy-5-carboxycytidine (1.0 min), 2′-deoxycytidine (1.8 min), 2′-deoxy-5-hydroxymethylcytidine (2.1 min), 2′-deoxyuridine (2.7 min), 2′-deoxy-5-methylcytidine (4.0 min), 2′-deoxyguanosine (4.5 min), 2′-deoxy-5-formylcytidine (5.4 min), 2′-deoxythymidine (5.7 min), 2′-deoxyadenosine (7.4 min).
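The gradient described above can be written out as a small lookup, which may help when reproducing the run on another instrument. This is a minimal sketch only: it assumes linear ramps between the listed time points (the text does not state the ramp shape) and that Buffer C makes up the balance to 100%. The function name `composition` is illustrative and is not part of the original method.

```python
# Gradient points taken from the text: (time in min, % Buffer B).
GRADIENT_B = [(0.0, 0.5), (2.0, 1.0), (8.0, 4.0), (10.0, 95.0)]
BUFFER_A_PERCENT = 1.0  # Buffer A is held at 1% for the whole run


def composition(t_min):
    """Return (%A, %B, %C) at time t_min.

    Assumption: %B ramps linearly between the listed points and
    Buffer C (H2O) supplies whatever remains to reach 100%.
    """
    if t_min <= GRADIENT_B[0][0]:
        b = GRADIENT_B[0][1]
    elif t_min >= GRADIENT_B[-1][0]:
        b = GRADIENT_B[-1][1]
    else:
        for (t0, b0), (t1, b1) in zip(GRADIENT_B, GRADIENT_B[1:]):
            if t0 <= t_min <= t1:
                b = b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
                break
    c = 100.0 - BUFFER_A_PERCENT - b
    return BUFFER_A_PERCENT, b, c
```

For instance, `composition(5.0)` falls halfway along the 2 to 8 min segment, where %B is between 1% and 4%.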
[0181] The same protocol was used to digest RNA for HPLC analysis.
[0182] 1.4 Single and Double Stranded DNA Sequences
[0183] 15mer oligos were purchased from IBA containing either cytosine, 5-methylcytosine, or 5-hydroxymethylcytosine. 122mer and 135mer dsDNA template and primers were purchased from Biomers. All C's in primers are 5-methylcytosine. 5-hydroxymethylcytosine was added to the strand at all other cytosine positions by PCR, using d5hmCTP and Fermentas DreamTaq Polymerase.
[0184] 1.5 General Reduction
[0185] DNA (approx 1-10 μL) was incubated on ice for 5 minutes with 40 μL of NaBH4 (10,000 equivalents per μL). This reaction was then shaken at 25 °C with an open lid in the dark for 1 hour. The reaction was purified with quick spin oligo columns (Roche).
[0186] 1.6 Oxidations
[0187] General Oxidation
[0188] DNA was made up to 24 μL with NaOH (0.05 M final concentration) on ice, then 1 μL of a KRuO4 (Alfa Aesar) solution (15 mM in 0.05 M NaOH) was added and the reaction was held on ice for 1 hour, with occasional vortexing. The reaction was purified with a mini quick spin oligo column (Roche) (after four 600 μL H2O washes).
[0189] These conditions were also used for the oxidation of RNA.
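As a worked check on the general oxidation volumes, the final KRuO4 concentration follows from a simple dilution: 1 μL of 15 mM stock in a 25 μL reaction. The sketch below assumes the volumes are additive; the helper name `final_concentration_mM` is illustrative only.

```python
def final_concentration_mM(stock_mM, stock_uL, total_uL):
    """Concentration after diluting stock_uL of stock into total_uL."""
    if total_uL <= 0:
        raise ValueError("total volume must be positive")
    return stock_mM * stock_uL / total_uL


# 24 uL of DNA in NaOH plus 1 uL of 15 mM KRuO4 stock.
kruo4_final_mM = final_concentration_mM(15.0, 1.0, 24.0 + 1.0)
```

Because the oxidant stock is itself made up in 0.05 M NaOH, adding it leaves the NaOH concentration at 0.05 M.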
[0190] Single Stranded DNA Oxidation
[0191] 1 μg of 15mer synthetic ssDNA was oxidised according to the general oxidation.
[0192] Synthetic Double Stranded DNA Double Oxidation
[0193] The dsDNA was precipitated with ethanol and then filtered through a mini quick spin oligo column (after four 600 μL H2O washes). A double oxidation was required for synthetic dsDNA as NaOH denaturation is not 100% efficient with a solution of a single homologous DNA fragment (unlike genomic DNA).
[0194] 1 μg DNA was denatured in 0.05 M NaOH (total volume 19 μL) for 30 min at 37 °C. The reaction was then snap cooled on ice and left for 5 min. The reaction was then oxidised according to the general oxidation but with a total volume of 20 μL. This DNA was re-denatured in 0.05 M NaOH (total volume 24 μL) for 30 min at 37 °C. The reaction was again snap cooled on ice, left for 5 min and oxidised according to the general oxidation.
[0195] General Oxidation for Genomic DNA
[0196] DNA (1 μg or less) was precipitated with ethanol prior to oxidation then filtered through a mini quick spin oligo column (after four 600 μL H2O washes). DNA was denatured in 0.05 M NaOH (24 or 40 μL total volume) for 30 min at 37 °C. This was then snap cooled on ice, left for 5 min and oxidised according to the general oxidation.
[0197] 1.7 Sanger and Illumina Sequencing of Oxidative Bisulfite Treated dsDNA
[0198] For Sanger sequencing, 1 μg of 122mer DNA containing C, 5mC and 5hmC was oxidised according to the dsDNA double oxidation and bisulfite-treated using the Qiagen Epitect kit, according to the manufacturer's instructions for FFPE samples, except that the thermal cycle was run twice over. These samples were then submitted for Sanger sequencing (Source BioScience).
[0199] For Illumina sequencing, 1 μg of 122mer and 135mer DNA containing 5hmC was digested overnight with DraI (2 μL, New England Biolabs) and SspI (1 μL, New England Biolabs). The digested bands were gel purified with the Fermentas GeneJET gel extraction kit and methylated adaptors (Illumina) were ligated using the NEBNext DNA sample prep master mix set 1. After oxidation and bisulfite treatment as above, ligated fragments were amplified (18 cycles) using Pfu Turbo Cx (Agilent) and adaptor-specific primers (Illumina), followed by purification using AMPure XP beads (Agencourt).
[0200] 1.8 Mass Spectrometry
[0201] Nucleosides were derived from DNA by digestion with DNA Degradase Plus (Zymo Research) according to the manufacturer's instructions and were analysed by LC-MS/MS on an LTQ Orbitrap Velos mass spectrometer (Thermo Scientific) fitted with a nanoelectrospray ion-source (Proxeon). Mass spectral data for 5hmC, 5fC, and where relevant 5mC and T, were acquired in high resolution full scan mode (R>40,000 for the protonated pseudomolecular ions and >50,000 for the accompanying protonated base fragment ions), and also in selected reaction monitoring (SRM) mode, monitoring the transitions 258->142.0611 (5hmC), 256->140.0455 (5fC), 242->126.0662 (5mC) and 243->127.0502 (T). Parent ions were selected for SRM with a 4 mass unit isolation window and fragmented by HCD with a relative collision energy of 20%, with R>14,000 for the fragment ions.
[0202] Peak areas from extracted ion chromatograms of the relevant ions for 5hmC and 5fC were normalised to those from either 5mC (where present) or T, and quantified by external calibration relative to standards obtained by digestion of nucleotide triphosphates or oligonucleotides.
[0203] 1.9 ES Cell Culture and DNA Extraction
[0204] J1 ES cells (129S4/SvJae) were purchased from ATCC (Cat. SCRC-1010) and cultured on a γ-irradiated pMEF feeder layer at 37 °C and 5% CO2 in complete ES medium (DMEM 4500 mg/L glucose, 4 mM L-glutamine and 110 mg/L sodium pyruvate, 15% fetal bovine serum, 100 U of penicillin/100 μg of streptomycin in 100 mL medium, 0.1 mM non-essential amino acids, 50 μM β-mercaptoethanol, 10^3 U LIF ESGRO®). Genomic DNA was prepared from ES cells at passage 14 or 20 using the Qiagen Allprep DNA/RNA mini kit.
[0205] 1.10 oxRRBS
[0206] RRBS libraries from oxidised and non-oxidised DNA were prepared based on a previously published protocol (31). Briefly, 2 μg of genomic DNA were digested with MspI (Fermentas) followed by end repair and A-tailing with Klenow (Fermentas) and ligation of methylated adaptors (Illumina) with T4 DNA ligase (NEB). Adaptor-ligated MspI-digested DNA was run on a 3% agarose gel and size selected (110-380 bp), followed by purification with the Qiagen QIAquick gel purification kit and ethanol precipitation.
[0207] Prior to oxidation, size-selected DNA was filtered through a mini quick spin oligo column (after four 600 μL H2O washes) to remove any remaining buffers/salts and adjusted to a final volume of 25 μL. 5 μL of this solution were kept for generation of the non-oxidised library. The remainder was oxidised according to the general oxidation for genomic DNA.
[0208] Both oxidised and non-oxidised DNA samples were bisulfite-treated using the Qiagen Epitect kit, according to the manufacturer's instructions for FFPE samples, except that the thermal cycle was run twice over. Final library amplification (18 cycles) was done using Pfu Turbo Cx (Agilent) and adaptor-specific primers (Illumina), after which the libraries were purified using AMPure XP beads (Agencourt).
[0209] 1.11 Sequencing and Read Alignment
[0210] Sequencing (single-end, 40 bp reads) was performed on the Illumina GAIIx platform. Bases were called by reprocessing raw images using OLB version 1.8 after applying bareback-processing to the first three base pairs (32). Bisulfite read alignments to the mouse genome (build NCBIM37) were carried out using Bismark v0.6.4 (33), using options -n 1 -l 40 --phred64-quals --vanilla. Bismark alignments to individual LINE1 5′ monomer sequences were performed slightly more stringently (-n 0); published consensus sequences were used for alignment of reads to L1A (34), L1Tf and L1Gf (35) monomer subtypes.
[0211] Bisulfite conversion rates were estimated from the number of unconverted cytosines at Klenow-filled in 3′ MspI sites of sequencing reads that were short enough to read through these sites. Read phred quality remained high at 3′ ends. Estimated bisulfite conversion rates varied between 99.8% and 99.9%.
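The conversion-rate estimate described above reduces to a ratio of counts. The sketch below is illustrative only: it assumes the cytosines filled in by Klenow at the 3′ MspI sites are unmodified, so that any unconverted call at those positions reflects incomplete conversion. The function name and the example counts are hypothetical and are not data from this work.

```python
def bisulfite_conversion_rate(unconverted, total):
    """Fraction of filled-in cytosines that were converted by bisulfite.

    unconverted: calls still read as C at the filled-in 3' MspI position
    total:       all calls observed at that position
    """
    if total <= 0:
        raise ValueError("no calls available at the filled-in sites")
    if not 0 <= unconverted <= total:
        raise ValueError("unconverted must lie between 0 and total")
    return 1.0 - unconverted / total


# Hypothetical counts, chosen only to illustrate the arithmetic.
example_rate = bisulfite_conversion_rate(unconverted=15, total=10000)
```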
[0212] 1.12 oxRRBS Data Processing
[0213] The numbers of converted and unconverted cytosines within CGIs (25) were extracted from each BS and oxBS dataset. For each CpG position, the amount of 5mC was taken as the percentage of unconverted cytosines in each oxBS dataset, and the amount of 5hmC was taken by subtracting this value from the percentage of unconverted cytosines in the corresponding BS dataset. An overall value per CGI was calculated by pooling data from all the CpGs covered within each CGI. CpGs with fewer than 10 reads were excluded, as were CpGs for which the 5mC estimation deviated from the overall CGI 5mC value by more than 20% or the 5hmC estimation deviated from the overall value by more than 10%. After this outlier filtration step, only CGIs with 5 representative CpGs or more were analyzed.
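The per-CpG subtraction and the pooling and filtering steps above can be sketched as follows. Several details are assumptions made for illustration, because the text does not fix them: the 10-read minimum is applied to both datasets, deviations are measured in percentage points, and the overall CGI value used for outlier detection is pooled over all sufficiently covered CpGs before filtering. The names `CpG`, `cpg_levels`, `pooled_levels` and `cgi_levels` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CpG:
    bs_unconv: int   # unconverted calls in the BS dataset
    bs_total: int    # all calls in the BS dataset
    ox_unconv: int   # unconverted calls in the oxBS dataset
    ox_total: int    # all calls in the oxBS dataset


def _pct(unconv, total):
    return 100.0 * unconv / total


def cpg_levels(c):
    """Return (%5mC, %5hmC) for one CpG: oxBS gives 5mC, BS - oxBS gives 5hmC."""
    mc = _pct(c.ox_unconv, c.ox_total)
    return mc, _pct(c.bs_unconv, c.bs_total) - mc


def pooled_levels(cpgs):
    """Return (%5mC, %5hmC) from counts pooled over several CpGs."""
    mc = _pct(sum(c.ox_unconv for c in cpgs), sum(c.ox_total for c in cpgs))
    bs = _pct(sum(c.bs_unconv for c in cpgs), sum(c.bs_total for c in cpgs))
    return mc, bs - mc


def cgi_levels(cpgs, min_reads=10, max_dev_5mc=20.0,
               max_dev_5hmc=10.0, min_cpgs=5):
    """Pooled (%5mC, %5hmC) for a CGI after filtering, or None if too sparse."""
    covered = [c for c in cpgs
               if c.bs_total >= min_reads and c.ox_total >= min_reads]
    if not covered:
        return None
    mc_all, hmc_all = pooled_levels(covered)
    kept = []
    for c in covered:
        mc, hmc = cpg_levels(c)
        if abs(mc - mc_all) <= max_dev_5mc and abs(hmc - hmc_all) <= max_dev_5hmc:
            kept.append(c)
    if len(kept) < min_cpgs:
        return None
    return pooled_levels(kept)
```

Note that the subtraction can yield a small negative 5hmC value when sampling noise makes the oxBS fraction exceed the BS fraction; the sketch returns that value unchanged rather than clipping it.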
[0214] To test for CGIs that contained 5mC levels significantly above the bisulfite conversion error of the oxBS dataset, a binomial test was applied using a Benjamini-Hochberg corrected p-value cutoff of 0.01. Similarly, a binomial test was used to select CGIs with significant amounts of unconverted cytosines in the BS dataset; within these, differences between the BS and oxBS datasets were tested by applying a Fisher's test and using a corrected p-value cutoff of 0.05. CGIs with a significantly lower fraction of unconverted cytosines in the oxBS dataset were taken as hydroxymethylated CGIs. CGIs with the opposite pattern are assumed to be artefacts and were used to estimate a false discovery rate.
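The first of these tests, a binomial test against the conversion error followed by Benjamini-Hochberg correction, can be sketched with the standard library alone. This is an illustration under stated assumptions and not the analysis code used here: the test is taken to be one-sided (upper tail), the error rate is supplied by the caller and must lie strictly between 0 and 1, and all function names are hypothetical. The Fisher's test step is not shown.

```python
from math import exp, lgamma, log


def _log_binom_pmf(i, n, p):
    # Log-space evaluation avoids overflow for the large n seen per CGI.
    return (lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
            + i * log(p) + (n - i) * log(1.0 - p))


def binom_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), with 0 < p < 1."""
    if k <= 0:
        return 1.0
    tail = sum(exp(_log_binom_pmf(i, n, p)) for i in range(k, n + 1))
    return min(1.0, tail)


def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):          # walk from largest p to smallest
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def methylated_cgis(counts, error_rate, cutoff=0.01):
    """counts: (unconverted, total) per CGI from the oxBS dataset.

    Returns one boolean per CGI: True where the unconverted fraction is
    significantly above error_rate after correction.
    """
    pvals = [binom_upper_tail(u, n, error_rate) for u, n in counts]
    return [q < cutoff for q in benjamini_hochberg(pvals)]
```

The same helper functions could serve for the BS dataset selection step, since the text describes it as a similar binomial test.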
[0215] 1.13 GlucMS-qPCR
[0216] Quantification of 5mC and 5hmC levels at MspI sites by glucMS-qPCR was performed as previously described (6).
[0217] 2. Results
[0218] We pursued a strategy that would discriminate 5mC from 5hmC in DNA by exploiting chemical reactivity that is selective for 5hmC, in particular, by chemically removing the hydroxymethyl group and thus transforming 5hmC to C, which could then be readily transformed to U by bisulfite-mediated deamination. During our chemical reactivity studies on 5-formylcytosine (5fC), we observed the decarbonylation and deamination of 5fC to uracil (U) under bisulfite conditions that would leave 5mC unchanged (
[0219] Bisulfite profiles of 2′-deoxy-5-formylcytosine and 2′-deoxy-5-carboxycytosine were determined (
[0220] Therefore, we required specific oxidation of 5hmC to 5fC using an oxidant that was mild, compatible with aqueous media and selective over other bases and the DNA backbone. A range of potentially suitable water-soluble oxidants were tested (Table 3) and we found potassium perruthenate (KRuO4) to possess the properties and conversion efficiency we sought. KRuO4 can, in principle, oxidize both alcohols and carbon-carbon double bonds (23). However, in our reactivity studies on a synthetic 15mer single stranded DNA (ssDNA) containing 5hmC, we established conditions under which KRuO4 reactivity was highly specific for the primary alcohol of 5hmC (quantitative conversion of 5hmC to 5fC by mass spectrometry,
[0221] A 140 bp DNA molecule (SEQ ID NO: 1) was prepared which contained 45 5hmC nucleosides incorporated through PCR using 5-methylcytosine primers and hmCTP. The DNA was oxidised using KRuO4. Before and after oxidation, the DNA was digested to nucleosides with Benzonase, Phosphodiesterase I and Alkaline Phosphatase. This mixture was then injected into the HPLC, to give the traces shown in
[0222] A single stranded 15 nucleotide DNA molecule (SEQ ID NO: 2) containing 3 5fC residues was treated with bisulfite as described above. Before and after bisulfite treatment, the DNA was digested to nucleosides with Benzonase, Phosphodiesterase I and Alkaline Phosphatase. This mixture was then injected into the HPLC, to give the traces shown in
[0223] Following bisulfite treatment, only a very small peak for 5fC remains, and negligible cytosine is present. The uracil peak in
[0224] A 140 bp DNA molecule (SEQ ID NO: 1) was prepared which contained 45 5fC nucleosides incorporated through PCR. The DNA was reduced using NaBH4 as described above. Before and after reduction, samples of the DNA were digested to nucleosides with Benzonase, Phosphodiesterase I and Alkaline Phosphatase. This mixture of nucleosides was then injected into the HPLC, to give the traces shown in
[0225] Oxidised bisulfite conversion of a ClaI site (ATCGAT) in a 122 base pair double stranded DNA (SEQ ID NO: 3) was investigated to test the efficiency and selectivity of the oxidative bisulfite method. A double stranded 122 base pair DNA fragment with a single CpG in the centre (in the context of a ClaI ATCGAT restriction site; SEQ ID NO: 3) was amplified by PCR using 5-methylcytosine primers and either CTP, 5mCTP or 5hmCTP. The amplified product contained 5-methylcytosine in the primer regions and CpG, 5mCpG, or 5hmCpG in the centre CpG.
[0226] As described above, the three synthetic 122mer dsDNAs containing either C, 5mC or 5hmC were each oxidised with KRuO4 and then subjected to a conventional bisulfite conversion protocol. Sanger sequencing was carried out on each of the three strands (
[0227] The C-containing strand completely converted to U (
[0228] To gain an accurate measure of the efficiency of conversion of 5hmC to U, Illumina sequencing was carried out on the synthetic strand containing 5hmC after oxidative bisulfite treatment. An overall 5hmC to U conversion level of 94.5% was observed (
[0229] We then used the oxidative bisulfite principle to quantitatively map 5hmC at high resolution in the genomic DNA of mouse ES cells. We chose to combine oxidative bisulfite with reduced representation bisulfite sequencing (RRBS) (24), which allows for selective sequencing of a portion of the genome that is highly enriched for CpG islands (CGIs), thus ensuring adequate sequencing depth to detect this less abundant mark. We therefore generated RRBS and oxRRBS datasets, achieving an average sequencing depth of 120 reads per CpG, which when pooled yielded an average of 3,300 methylation calls per CGI. After applying depth and breadth cutoffs (see Materials and Methods), 55% (12,660) of all CGIs (25) were covered in our datasets. Our RRBS (i.e., non-oxidised) data correlates well with published RRBS and BS-Seq datasets (24, 26).
[0230] To identify 5hmC-containing CGIs, we tested for differences between the RRBS and oxRRBS datasets using stringent criteria (see Materials and Methods). It was expected that most significant differences would stem from CGIs that had a lower proportion of unconverted cytosines in the oxRRBS set when compared with the RRBS set. CGIs that had the reverse trend were used to estimate a false discovery rate, which was 3.7% (
[0231] To validate our method, we selected 21 CGIs containing MspI restriction sites and quantified 5hmC and 5mC levels at these CpGs by glucMS-qPCR (28) (
[0232] Reduced bisulfite conversion (reBS-Seq) of a DNA strand containing a 5-formylcytosine (5fC) was investigated.
[0233] A synthetic 100mer DNA strand (SEQ ID NO: 8) containing the sequence ACGGA5fCGTA was put through a reduction with NaBH4, and then subjected to a conventional bisulfite conversion protocol.
[0234] Sanger sequencing was then carried out on the strand (
[0236] In summary, we have shown that the oxBS-Seq method reliably maps and quantifies both 5mC and 5hmC at single nucleotide level. Oxidative bisulfite is also compatible with non-sequencing downstream approaches such as Sequenom, as demonstrated here. Therefore, by comparing the sequence of bisulfite treated and oxidised and bisulfite treated genomic DNA, it is possible to determine the presence of 5-methylcytosine and 5-hydroxymethylcytosine, along with the non-modified cytosine.
[0237] For example, uracil residues at the same position in the sequences of both bisulfite treated and oxidised and bisulfite treated genomic DNA indicate the presence of non-modified cytosine. Cytosine residues at the same position in the sequences of both bisulfite treated and oxidised and bisulfite treated genomic DNA indicate the presence of 5-methylcytosine. A cytosine residue in the sequence of the oxidised and bisulfite treated genomic DNA also indicates the presence of 5-methylcytosine. A cytosine residue in the sequence of the bisulfite treated genomic DNA and a uracil residue at the same position in the sequence of the oxidised and bisulfite treated genomic DNA indicates the presence of 5-hydroxymethylcytosine.
[0238] 5-formylcytosine may also be sequenced to single nucleotide resolution. 5fC may be quantitatively reduced to 5hmC in genomic DNA using NaBH4 (as shown by HPLC). By comparing the sequence of untreated, bisulfite treated, oxidised and bisulfite treated and reduced and bisulfite treated genomic DNA, the presence of all three known mammalian cytosine modifications, 5-methylcytosine, 5-hydroxymethylcytosine and 5-formylcytosine, may be determined along with the non-modified cytosine. For example, uracil residues at the same position in the sequences of i) bisulfite treated, ii) oxidised and bisulfite treated and iii) reduced and bisulfite treated genomic DNA (UUU) indicate the presence of non-modified cytosine.
[0239] Cytosine residues at the same position in the sequences of i) bisulfite treated, ii) oxidised and bisulfite treated and iii) reduced and bisulfite treated genomic DNA (CCC) indicate the presence of 5-methylcytosine.
[0240] A cytosine residue in the sequence of the bisulfite treated genomic DNA; a uracil residue at the same position in the sequence of the oxidised and bisulfite treated genomic DNA and, optionally, a cytosine residue at the same position in the sequence of the reduced and bisulfite treated genomic DNA (CUC) indicates the presence of 5-hydroxymethylcytosine.
[0241] A uracil residue in the sequence of the bisulfite treated genomic DNA; a cytosine residue at the same position in the sequence of the reduced and bisulfite treated genomic DNA; and, optionally, a uracil residue at the same position in the sequence of the oxidised and bisulfite treated genomic DNA (UCU) indicates the presence of 5-formylcytosine.
[0242] Both modified and unmodified cytosines are read as cytosine when untreated genomic DNA is sequenced.
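The readout rules above, which correspond to the rows of Table 1, can be expressed as a lookup keyed on the residue observed after each treatment. The sketch below is illustrative: the function name and the choice to return a list of candidates are assumptions, and keys are ordered as (bisulfite, oxidised then bisulfite, reduced then bisulfite), so 5fC appears as U, U, C in this ordering. When the reduced readout is not available, every modification consistent with the two remaining readouts is returned.

```python
# Keys: residue read after (bisulfite, oxidation + bisulfite, reduction + bisulfite).
# Values follow Table 1. "U" is used as in the text; after amplification a
# uracil is read out as thymine.
CALLS = {
    ("U", "U", "U"): "C",
    ("C", "C", "C"): "5mC",
    ("C", "U", "C"): "5hmC",
    ("U", "U", "C"): "5fC",
}


def call_cytosine(bs, ox_bs, red_bs=None):
    """Return the cytosine forms consistent with the observed readouts.

    bs, ox_bs, red_bs: "C" or "U" at the same position in each dataset.
    red_bs may be omitted, in which case more than one form can match.
    An empty list means the combination matches no row of Table 1.
    """
    matches = []
    for (b, o, r), name in CALLS.items():
        if (b, o) == (bs, ox_bs) and (red_bs is None or r == red_bs):
            matches.append(name)
    return matches
```

With only the bisulfite and oxidative bisulfite readouts, `call_cytosine("U", "U")` cannot separate unmodified C from 5fC, which is why the reduced dataset is needed to resolve 5fC.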
[0243] The HPLC chromatograms shown in
REFERENCES
[0244] 1. A. M. Deaton et al. Genes Dev. 25, 1010 (May 15, 2011).
[0245] 2. M. Tahiliani et al. Science 324, 930 (May 15, 2009).
[0246] 3. S. Ito et al. Nature 466, 1129 (Aug. 26, 2010).
[0247] 4. A. Szwagierczak et al. Nucleic Acids Res. (Aug. 4, 2010).
[0248] 5. K. P. Koh et al. Cell Stem Cell 8, 200 (Feb. 4, 2011).
[0249] 6. G. Ficz et al. Nature 473, 398 (May 19, 2011).
[0250] 7. K. Williams et al. Nature 473, 343 (May 19, 2011).
[0251] 8. W. A. Pastor et al. Nature 473, 394 (May 19, 2011).
[0252] 9. Y. Xu et al. Mol. Cell 42, 451 (May 20, 2011).
[0253] 10. M. R. Branco et al. Nat. Rev. Genet. 13, 7 (January, 2012).
[0254] 11. S. Kriaucionis et al. Science 324, 929 (May 15, 2009).
[0255] 12. M. Munzel et al. Angew. Chem. Int. Ed. 49, 5375 (July 2010).
[0256] 13. H. Wu et al. Genes Dev. 25, 679 (Apr. 1, 2011).
[0257] 14. S. G. Jin et al. Nuc. Acids. Res. 39, 5015 (July, 2011).
[0258] 15. C. X. Song et al. Nat. Biotechnol. 29, 68 (January, 2011).
[0259] 16. M. Frommer et al. PNAS U.S.A. 89, 1827 (March 1992).
[0260] 17. Y. Huang et al. PLoS One 5, e8888 (2010).
[0261] 18. C. Nestor et al. Biotechniques 48, 317 (April, 2010).
[0262] 19. C. X. Song et al. Nat. Methods, (Nov. 20, 2011).
[0263] 20. J. Eid et al. Science 323, 133 (Jan. 2, 2009).
[0264] 21. E. V. Wallace et al. Chem. Comm. 46, 8195 (Nov. 21, 2010).
[0265] 22. M. Wanunu et al. J. Am. Chem. Soc., (Dec. 14, 2010).
[0266] 23. G. Green et al. J Chem Soc Perk T 1, 681 (1984).
[0267] 24. A. Meissner et al. Nature 454, 766 (Aug. 7, 2008).
[0268] 25. R. S. Illingworth et al. PLoS genetics 6, (September, 2010).
[0269] 26. M. B. Stadler et al. Nature 480, 490 (Dec. 22, 2011).
[0270] 27. J. Borgel et al. Nat. Genet. 42, 1093 (December, 2010).
[0271] 28. S. M. Kinney et al. J. Biol. Chem. 286, 24685 (Jul. 15, 2011).
[0272] 29. N. Lane et al. Genesis 35, 88 (February, 2003).
[0273] 30. E. P. Quinlivan et al. Anal. Biochem. 373, 383 (February 2008).
[0274] 31. H. Gu et al. Nat. Protoc. 6, 468 (April, 2011).
[0275] 32. F. Krueger et al. PLoS One 6, e16607 (2011).
[0276] 33. F. Krueger et al. Bioinformatics 27, 1571 (Jun. 1, 2011).
[0277] 34. S. A. Schichman et al. Mol. Biol. Evol. 10, 552 (May, 1993).
[0278] 35. J. L. Goodier et al. Genome research 11, 1677 (October, 2001).
[0279] 36. C. Qin et al. Mol. Carcinog. 49, 54 (January, 2010).
[0280] 37. Li et al. Nucleic Acids (2011) Article ID 870726
[0281] 38. Pfaffeneder, T. et al. (2011) Angewandte. 50. 1-6
[0282] 39. Lister, R. et al. (2008) Cell. 133. 523-536
[0283] 40. Wang et al. (1980) Nucleic Acids Research. 8 (20), 4777-4790
[0284] 41. Hayatsu et al. (2004) Nucleic Acids Symposium Series No. 48 (1), 261-262
[0285] 42. Lister et al. (2009) Nature. 462. 315-22
[0286] 43. Sanger, F. et al. PNAS USA, 1977, 74, 5463
[0287] 44. Bentley et al. Nature, 456, 53-59 (2008)
[0288] 45. KJ McKernan et al. Genome Res. (2009) 19: 1527-1541
[0289] 46. M Ronaghi et al. Science (1998) 281 5375 363-365
[0290] 47. Eid et al. Science (2009) 323 5910 133-138
[0291] 48. Korlach et al. Methods in Enzymology 472 (2010) 431-455
[0292] 49. Rothberg et al. (2011) Nature 475 348-352
TABLE-US-00001 Model Sequences
Modified nucleotides are in bold italics

140 base pair double stranded DNA model (SEQ ID NO: 1):
CACATCCCACACTATACACTCATACATACCTGCTCACGACGACGCTGTACACCTACGTACTCGTGCACGCTCGTCACGTGATCGACCATGACTCTGACGCACTGAGGTATGGGAAGTAGTGAGTAGATTGTAGTAAGGAG

15 nucleotide long single stranded DNA model (SEQ ID NO: 2):
GAGACGACGTACAGG

122 base pair double stranded DNA model (SEQ ID NO: 3):
CACATCCCACACTATACACTCATACATACCATTTAAATAAATTAAATAATATTAATATATCGATTAATAATAAATAATAATTAATTAATATTGGGAAGTAGTGAGTAGATTGTAGTAAGGAG

135 base pair double stranded DNA model (SEQ ID NO: 4):
CACATCCCACACTATACACTCATACATACCATTTAACGATAAATTACAATAACGTATCTAATCATATCGATTAACTAATCGAAATAATAATTACGCATTAATATTGGGAAGTAGTGAGTAGATTGTAGTAAGGAG

dsDNA fwd primer (SEQ ID NO: 5):
CACATCCCACACTATACACTCATACATACC

dsDNA rev primer (SEQ ID NO: 6):
CTCCTTACTACAATCTACTCACTACTTCCC

28 nucleotide RNA model sequence (SEQ ID NO: 7):
UGUGGGGAGGGCGGGGCGGGGUCUGGGG

100 nucleotide 5fC containing sequence (SEQ ID NO: 8) [5fC position indicated by bold, italics]:
GACGGACGTACGATCGAGCGAGGTCTTGGGTCAGCAGGTGGCGACTGTTAGCTCAGATGGCTAGCAAGTGGGTATGTATGAGTGTATAGTGTGGGATGTG

TABLE-US-00002 TABLE 1
Base    Regular       Bisulfite     Oxidation then          Reduction then
        Sequencing    Sequencing    Bisulfite Sequencing    Bisulfite Sequencing
C       C             U             U                       U
5mC     C             C             C                       C
5hmC    C             C             U                       C
5fC     C             U             U                       C
TABLE-US-00003 TABLE 2
TABLE-US-00004 TABLE 3
Oxidant    Comment
KRuO4      Complete conversion to aldehyde
CrO3       No oxidation observed
PDC        No oxidation observed
PCC        No oxidation observed
MnO2       Small amount of aldehyde observed but substantial degradation with excess oxidant

TABLE-US-00005 TABLE 4 Retention Times for HPLC Peaks (DNA)
Base    Retention Time/min
C       1.8
5hmC    2.1
U       2.7
G       4.5
5fC     5.3
T       5.7
A       7.3

TABLE-US-00006 TABLE 5 Retention Times for HPLC Peaks (RNA)
Base    Retention Time/min
C       1.3
U       1.8
G       3.7
A       6.7