EPIGENETIC PROFILING METHOD

20220389501 · 2022-12-08

    Abstract

    The present invention relates to a method for analyzing DNA including forming labeled DNA fragments by cleaving genomic DNA into DNA fragments, selectively functionalizing any non-methylated CpG sites present in the DNA with a linker including a hydrolyzable moiety, and attaching a label to the linker. The method further includes the step of separating the labeled DNA fragments from any non-labeled DNA fragments, hydrolyzing the hydrolyzable moiety of the linker of the separated labeled DNA fragments so as to release the DNA fragments from the label, and sequencing the released DNA fragments.

    Claims

    1.-25. (canceled)

    26. A method for analyzing DNA, the method comprising the following steps: forming labeled DNA fragments by: (a) cleaving genomic DNA into DNA fragments; (b) selectively functionalizing any non-methylated CpG sites present in the DNA with a linker comprising a hydrolyzable moiety; and (c) attaching a label to the linker; separating the labeled DNA fragments from any non-labeled DNA fragments; hydrolyzing the hydrolyzable moiety of the linker of separated labeled DNA fragments, so as to release the DNA fragments from the label; and sequencing released DNA fragments.

    27. The method of claim 26, wherein step (c) is carried out before step (b), and/or step (a) is carried out after step (b) or after step (c).

    28. The method of claim 26, wherein selectively functionalizing any non-methylated CpG sites in the DNA with the linker is carried out using a DNA methyltransferase enzyme which is capable of selectively transferring a transferable group from a S-adenosyl-L-methionine cofactor analogue to the non-methylated CpG sites of the DNA, wherein the transferrable group constitutes the linker.

    29. The method of claim 28, wherein the DNA methyltransferase enzyme is a cytosine-5 methyltransferase.

    30. The method of claim 29, wherein the DNA methyltransferase enzyme is a double mutant (Q136A/N374A) of M.MpeI.

    31. The method of claim 26, wherein the hydrolyzable moiety comprises an imine moiety, an oxime moiety, or a hydrazone moiety.

    32. The method of claim 31, wherein the hydrolyzable moiety comprises a Schiff base.

    33. The method of claim 28, wherein the S-adenosyl-L-methionine cofactor analogue has the following general formula: ##STR00012## wherein R represents a transferable group, which constitutes the linker; FG represents a functional group; Z represents a non-reactive group of an aliphatic linkage or an aromatic linkage; A-B-C represent the hydrolyzable moiety; Y represents a non-reactive group of an aliphatic linkage or an aromatic linkage; U represents an unsaturated bond; and k represents an integer of 1 or 2.

    34. The method of claim 33, wherein Z comprises a polyether chain and/or FG is an azide, an alkyne, an isothiocyanate, or an isocyanate moiety.

    35. The method of claim 33, wherein the S-adenosyl-L-methionine cofactor analogue has the following general formula: ##STR00013##

    36. The method of claim 33, wherein the S-adenosyl-L-methionine cofactor analogue has the following general formula: ##STR00014##

    37. The method of claim 33, wherein the S-adenosyl-L-methionine cofactor analogue has the following general formula: ##STR00015## wherein the hydrolyzable moiety is a Schiff base moiety comprising C═N—X—C-Q; p represents an integer of from 1 to 15; Q represents one oxygen atom or two hydrogen atoms independently bonded to the carbon center; X represents an oxygen atom or a nitrogen atom; Z represents a non-reactive group of an aliphatic linkage or an aromatic linkage; U represents an unsaturated bond selected from the group consisting of an alkene, an alkyne, an aryl group, a carbon atom comprising a carbonyl group, and a sulfur atom comprising one or two S═O bonds; k represents an integer of 1 or 2; and FG represents the functional group.

    38. The method of claim 33, wherein the S-adenosyl-L-methionine cofactor analogue has the following general formula: ##STR00016## wherein the hydrolyzable moiety is —C═N—N—C═O; p represents an integer of from 1 to 15; q represents an integer of from 1 to 15; k represents an integer of 1 or 2; and FG represents a second functional group.

    39. The method of claim 33, wherein the S-adenosyl-L-methionine cofactor analogue has the following general formula: ##STR00017## wherein the hydrolyzable moiety is —C═N—O—; p represents an integer of from 1 to 15; q represents an integer of from 1 to 15; k represents an integer of 1 or 2; and FG represents a second functional group.

    40. The method of claim 39, wherein FG is an azide moiety, p is 4, and q is 2 or 3.

    41. The method of claim 26, wherein attaching a label to the linker comprises forming a covalent bond between a reactive center of a functional group of the linker and the label.

    42. The method of claim 26, wherein the label comprises a ligand conjugated to a moiety comprising a second functional group which is capable of reacting with a functional group of the linker to form a covalent bond, and wherein the label optionally comprises biotin conjugated to a moiety comprising an alkyne.

    43. The method of claim 26, wherein separating the labeled DNA fragments from any non-labeled DNA fragments comprises using an immobilized capture agent which selectively binds to the label.

    44. The method of claim 26, further comprising at least one of the steps of ligating the released DNA fragments together and amplifying the DNA, prior to sequencing.

    45. The method of claim 26, wherein at least one of the DNA is sequenced using nanopore sequencing and cleavage of the genomic DNA is carried out using a restriction enzyme.

    Description

    [0122] Embodiments of the invention will now be described by way of example and with reference to the accompanying figures, in which:

    [0123] FIG. 1 is a reaction scheme for the formation of S-adenosyl-L-methionine cofactor analogues;

    [0124] FIG. 2 provides an overview of the method in accordance with an embodiment of the present invention;

    [0125] FIG. 3 shows molecular structures produced by a method in accordance with an embodiment of the present invention;

    [0126] FIG. 4 is a plot showing the efficiency of capture of biotinylated DNA fragments on streptavidin-coated beads, the DNA fragments containing zero, one, two, three or four unmethylated CpG sites;

    [0127] FIG. 5 is a plot showing the efficiency of capture and release of biotinylated DNA fragments using streptavidin-coated beads, the DNA fragments having been generated by cleaving human genomic DNA;

    [0128] FIG. 6 is a graph showing the lengths of DNA sequences generated by ligating and amplifying released DNA fragments;

    [0129] FIG. 7 is a plot of the aligned read lengths against the sequenced read lengths;

    [0130] FIG. 8 is an alignment of sequencing reads of a gene promoter region obtained by the method of the invention (SUURF ID1, 2 and 3) with a sequencing read obtained by MeDIP; and

    [0131] FIG. 9 shows the results of Illumina sequencing.

    [0132] With reference to FIG. 1, there is shown a reaction scheme for the formation of S-adenosyl-L-methionine cofactor (AdoMet) analogues.

    Example 1: Synthesis of S-adenosyl-L-methionine Cofactor Analogues

    Synthesis of Precursor 1

    Synthesis of 8-hydroxyoct-6-ynoic acid 7.

    [0133] A solution of 6-heptynoic acid (2 g, 15.87 mmol) was made in dry THF (42 ml) under argon. To this, HMPA (34.9 mmol, 6.13 ml) was added and the solution was cooled to −78° C. nBuLi (1.6 M in hexanes, 34.9 mmol, 21.8 ml) was then added dropwise whilst maintaining the temperature below −60° C. The solution was then warmed to −40° C. and stirred for 1 hour. After 1 hour, paraformaldehyde (1.47 g, 47.6 mmol) was added via powder funnel under an argon flow. The reaction mixture was then warmed to 45° C. for 4 hours. After reaction, the mixture was quenched with 1 M HCl to pH 4-5 and extracted with EtOAc. The organic layer was then dried and the EtOAc was removed by rotary evaporation, giving the crude product. Purification was completed using flash column chromatography (silica gel, Hex:EtOAc, 6:4): Yield=68%, Rf=0.27 (Hex:EtOAc, 6:4); 1H NMR (300 MHz, DMSO-d6) δ12.03 (s, 1H), 5.03 (s, 1H), 4.02 (d, J=2.6 Hz, 2H), 2.29-2.14 (m, 4H), 1.63-1.50 (m, 2H), 1.50-1.39 (m, 2H); MS: m/z [M-H]=155.46.

    [0134] Synthesis of tert-butyl 2-(8-hydroxyoct-6-ynoyl)hydrazine-1-carboxylate 8

    [0135] 8-hydroxyoct-6-ynoic acid 7 (1.35 g, 8.65 mmol) and tert-butyl carbazate (1.4 g, 10.38 mmol) were dissolved in 2:1 THF:H2O (13.5:6.75 ml). To this EDC.HCl (1.87 g, 9.52 mmol) was added slowly over 15 minutes. The mixture was left to stir for 3 hours and then extracted with EtOAc. The organic layer was washed with 0.1 M HCl, water and brine, then collected and dried over anhydrous sodium sulfate, and the solvent was removed under reduced pressure, yielding the product as a white solid: Yield=63%; 1H NMR (400 MHz, DMSO-d6) δ9.47 (s, 1H), 8.66 (s, 1H), 5.04 (t, J=5.9 Hz, 1H), 4.02 (dt, J=5.9, 2.2 Hz, 2H), 2.19 (tt, J=7.1, 2.2 Hz, 2H), 2.06 (t, J=7.2 Hz, 2H), 1.58 (p, J=7.3 Hz, 2H), 1.50-1.32 (m, 12H); 13C NMR (101 MHz, DMSO) δ172.01, 84.36, 80.94, 79.42, 49.59, 33.06, 28.53, 28.08, 24.70, 18.24; MS: m/z [M+Na]=294.15.

    Synthesis of tert-butyl 2-(8-bromooct-6-ynoyl)hydrazine-1-carboxylate 1

    [0136] A solution of tert-butyl 2-(8-hydroxyoct-6-ynoyl)hydrazine-1-carboxylate 8 (300 mg, 1.11 mmol) was made in dry DCM (3.33 ml) and cooled on ice. Triphenylphosphine (437 mg, 1.67 mmol) was added and left to dissolve. Once it had dissolved, tetrabromomethane (552 mg, 1.67 mmol) was added slowly. The reaction was then brought to room temperature and left to stir for 1 hour. After reaction the solvent was removed under reduced pressure and the crude mixture was purified by flash column chromatography (silica gel, Hex:EtOAc, 7:3): Yield=55%; Rf=0.15 (Hex:EtOAc 7:3); 1H NMR (300 MHz, DMSO-d6) δ9.48 (s, 1H), 8.67 (s, 1H), 4.21 (t, J=2.3 Hz, 2H), 2.27 (tt, J=6.9, 3.4 Hz, 2H), 2.06 (t, J=7.4 Hz, 2H), 1.65-1.31 (m, 13H); 13C NMR (101 MHz, DMSO) δ171.4, 155.2, 87.7, 78.9, 76.3, 54.9, 39.5, 32.5, 28.0, 27.3, 24.1, 17.9, 17.2; MS: m/z [M+Na]=355/357.08.
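
    The reagent amounts quoted for this bromination can be cross-checked with a short calculation. The sketch below is illustrative only: the molar masses are standard values that do not appear in the text (the value for compound 8 is computed here from the formula C13H22N2O4), and the function names are hypothetical.

```python
# Molar masses in g/mol. These are assumptions added for illustration:
# standard values for the reagents, and C13H22N2O4 for compound 8.
MOLAR_MASS = {
    "compound_8": 270.33,
    "triphenylphosphine": 262.29,
    "tetrabromomethane": 331.63,
}


def millimoles(mass_mg: float, molar_mass: float) -> float:
    """Convert a mass in mg to an amount in mmol (mg / (g/mol) = mmol)."""
    return mass_mg / molar_mass


def equivalents(reagent_mmol: float, substrate_mmol: float) -> float:
    """Molar equivalents of a reagent relative to the limiting substrate."""
    return reagent_mmol / substrate_mmol


# Masses as quoted in the procedure.
substrate = millimoles(300, MOLAR_MASS["compound_8"])
pph3 = millimoles(437, MOLAR_MASS["triphenylphosphine"])
cbr4 = millimoles(552, MOLAR_MASS["tetrabromomethane"])

print(f"substrate: {substrate:.2f} mmol")
print(f"PPh3: {pph3:.2f} mmol ({equivalents(pph3, substrate):.2f} equiv)")
print(f"CBr4: {cbr4:.2f} mmol ({equivalents(cbr4, substrate):.2f} equiv)")
```

    Both reagents come out at roughly 1.5 equivalents relative to compound 8, which matches the ratio used in the other brominations in this example.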

    Synthesis of Precursor 4

    Synthesis of 7-Bromo-hept-1-yne 9

    [0137] A solution of 6-heptyn-1-ol (5 g, 44.6 mmol) was made in dry DCM (60 ml) and cooled on ice. To this triphenylphosphine (17.6 g, 67 mmol) was added. Upon complete dissolution, tetrabromomethane (22.2 g, 67 mmol) was added slowly. The reaction mixture was brought to room temperature and stirred for 1 hr. After completion, the solvent was removed under reduced pressure. Hexane was added to the crude, forming a white suspension. The hexane fraction was filtered and collected, and the solvent was then removed. An oily residue remained which was purified by flash column chromatography with hexane: Yield=91%, Rf=0.45 (hexane); νmax(neat)/cm-1 540 (C-Br); 1H NMR (300 MHz, DMSO-d6) δ3.53 (t, J=6.7 Hz, 2H), 2.75 (t, J=2.7 Hz, 1H), 2.23-2.10 (m, 2H), 1.89-1.74 (m, 2H), 1.50-1.43 (m, 4H).

    Synthesis of 8-bromooct-2-yn-1-ol 10

    [0138] A solution of 7-bromohept-1-yne 9 (20.56 mmol, 3600 mg) was made in dry THF (12.3 ml) and cooled to −78° C. under argon. To this a solution of nBuLi in hexanes (1.6 M, 13 ml) was added dropwise, whilst maintaining the temperature below −60° C. The reaction mixture was then warmed to 0° C. in an ice bath, at which point paraformaldehyde (1718 mg, 55.5 mmol) was added under a flow of argon and stirred for 30 minutes. The mixture was then warmed to room temperature and left to stir; the temperature was maintained below 30° C. until the exothermic reaction had stopped. The mixture was then heated to 45° C. for 2 hrs. Once complete, the reaction was extracted with ether and sat. NH4Cl. The organic layer was collected and the solvents were removed under reduced pressure to yield the crude product as an oil. Once dry, purification was completed by flash column chromatography (silica gel, Hexane: Ethyl Acetate, 9:1). The product was then collected as a colourless oil: Yield=55%, Rf=0.15 (Hex: EtOAc 9:1); 1H NMR (300 MHz, DMSO-d6) δ5.04 (t, J=5.7 Hz, 1H), 4.03 (dt, J=5.5, 2.1 Hz, 2H), 3.54 (t, J=6.7 Hz, 2H), 2.20 (m, 2H), 1.88-1.75 (m, 2H), 1.52-1.40 (m, 4H).

    Synthesis of tert-butyl ((8-hydroxyoct-6-yn-1-yl)oxy)carbamate 11

    [0139] To a solution of N-Boc hydroxylamine (890 mg, 6.55 mmol) in DMF (4.3 ml), 8-bromooct-2-yn-1-ol 10 (1200 mg, 5.85 mmol) and 1,8-diazabicyclo[5.4.0]undec-7-ene (1000 mg, 6.55 mmol) were added. The solution was stirred at 50° C. for 20 hrs. Once complete, the reaction was extracted with DCM and 15% citric acid solution. The organic phases were dried and collected and the solvent was removed under reduced pressure. A colourless oil was collected as the crude product. This was further purified by flash column chromatography (silica gel, Hexane: Ethyl Acetate, 8:2). The product was collected as a colourless oil: Yield=73%, Rf=0.27; 1H NMR (300 MHz, DMSO-d6) δ9.91 (s, 1H), 5.03 (t, J=5.9 Hz, 1H), 4.02 (dt, J=5.9, 2.2 Hz, 2H), 3.66 (t, J=6.2 Hz, 2H), 2.17 (tt, J=6.7, 1.7 Hz, 2H), 1.40 (m, 15H); MS: m/z [M+H]=258.2.

    Synthesis of tert-butyl ((8-bromooct-6-yn-1-yl)oxy)carbamate 4

    [0140] A solution of tert-butyl ((8-hydroxyoct-6-yn-1-yl)oxy)carbamate 11 (1 g, 3.89 mmol) was made in dry DCM (5.2 ml) and cooled on ice. To this triphenylphosphine (1.53 g, 5.83 mmol) was added. Upon complete dissolution, tetrabromomethane (1.94 g, 5.85 mmol) was added slowly. The reaction mixture was brought to room temperature and allowed to stir for 1 hr. After completion, the solvent was removed under reduced pressure. Purification was completed using flash column chromatography (silica gel, Hexane: Ethyl Acetate, 8:2): Yield=67%, Rf=0.52 (Hex:EtOAc, 8:2); νmax(neat)/cm-1 1712 (C═O), 607 (C-Br); 1H NMR (300 MHz, DMSO-d6) δ9.90 (s, 1H), 4.21 (t, J=2.4 Hz, 2H), 3.66 (t, J=6.2 Hz, 2H), 2.25 (tt, J=6.9, 2.4 Hz, 2H), 1.40 (m, 15H); 13C NMR (101 MHz, DMSO) δ156.04, 87.85, 79.37, 76.22, 75.05, 39.52, 28.05, 27.64, 27.04, 24.76, 18.06, 17.25; MS: m/z [M+Na]=342.35/344.35, [M-tBuOH]=246.38/248.38.

    General Coupling Procedure

    [0141] Precursors 1 and 4 were reacted with S-adenosyl-L-homocysteine under acidic conditions to give reversible and rewritable Boc-protected AdoMet derivatives.

    [0142] A solution of S-adenosyl-L-homocysteine (15 mg, 0.04 mmol) was made in a 1:1 mixture of formic and acetic acid (300 μl). Precursor 1 or 4 (tert-butyl 2-(8-bromooct-6-ynoyl)hydrazine-1-carboxylate or tert-butyl ((8-bromooct-6-yn-1-yl)oxy)carbamate) (1.2 mmol, 30 equivs) was then added dropwise, on ice. The reaction mixture was warmed to 35° C. and left to stir overnight. The reaction mixture was then extracted with diethyl ether and the aqueous layer was collected and dried by lyophilisation: MS: m/z [M+H]=638 (2), [M+H]=624 (5).

    Cofactor Deprotection

    [0143] The AdoMet analogues were deprotected under acidic conditions to reveal the hydrazone or alkoxyamine moieties. The crude product was dissolved in TFA (400 μl) and left to stir for 2 hrs at room temperature. After reaction the acid was removed under a flow of argon.

    Cofactor Purification

    [0144] Any excess precursor was removed by purification.

    [0145] Both diastereomers of the deprotected cofactors could be separated by HPLC, a separation which was not possible at later stages.

    [0146] The crude reaction mixture was then dissolved in water (2 ml). Purification of AdoMet analogues was performed by preparative reversed-phase HPLC (ACE 5 C-18 25×2.12 cm) eluting with 20 mM Ammonium Acetate pH 5.5 Water (A)/MeOH (B) gradient at a flow rate of 10 ml/min. Gradient system: 30 mins 3-30% B, 30-97% B over 30 mins, hold at 97% B for 5 minutes, stop programme. Retention times: Hydrazide iso. 1=17.51 mins, iso. 2=18.73 mins, hydroxylamine iso. 1=25.47 mins, iso. 2=28.24 mins: MS: m/z [M+H]=538 (2), [M+H]=524 (5).
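
    The gradient programme described above can be written as a piecewise-linear function of time. The following is a minimal sketch, assuming that each ramp is linear between the stated set points and ignoring instrument dwell volume; the function name `percent_b` is hypothetical.

```python
def percent_b(t_min: float) -> float:
    """Return %B (MeOH) at time t_min, assuming linear ramps between set points."""
    # (time in min, %B) set points read from the gradient description:
    # 3-30% B over 30 min, 30-97% B over 30 min, hold at 97% B for 5 min.
    points = [(0.0, 3.0), (30.0, 30.0), (60.0, 97.0), (65.0, 97.0)]
    if t_min < 0 or t_min > points[-1][0]:
        raise ValueError("time outside the 0-65 min programme")
    for (t0, b0), (t1, b1) in zip(points, points[1:]):
        if t_min <= t1:
            # Linear interpolation within the current segment.
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
    return points[-1][1]


# Nominal programmed composition at the quoted retention times.
for label, rt in [("hydrazide iso. 1", 17.51), ("hydroxylamine iso. 2", 28.24)]:
    print(f"{label}: {percent_b(rt):.1f}% B")
```

    All four quoted retention times fall within the first 30 minutes, so under this reading every isomer elutes during the shallow 3-30% B ramp.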

    [0147] The deprotected AdoMet derivatives slowly degrade, in particular following freeze-drying, via multiple pathways, giving additional peaks at higher retention times.

    Aldehyde Coupling

    [0148] To mitigate degradation, the AdoMet derivatives were reacted with a commercially available benzaldehyde immediately after purification by HPLC, in order to minimise side reactions due to the nucleophilic nature of the hydrazone and alkoxyamine moieties.

    [0149] To the collected HPLC fractions Ald-PEG3-N3 (1.2 equivs) was added and rolled for 30 mins at room temperature. The fractions were then dried by lyophilisation. Once dry the solids were dissolved in 100 μl 0.1% Acetic Acid and stored at −20° C. Concentrations were determined by UV absorption analysis with ε260=15,400 dm3 mol-1 cm-1: MS: m/z [M+H]=867 (3), [M+H]=856 (6).
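
    The concentration determination by UV absorption follows the Beer-Lambert law, c = A / (ε · l). The sketch below is illustrative: it reads the stated extinction coefficient as 15,400 dm3 mol-1 cm-1, and the absorbance reading, the 1 cm path length and the function name are assumptions that do not appear in the text.

```python
EPSILON_260 = 15400.0  # dm3 mol-1 cm-1, the extinction coefficient quoted in the text


def concentration_molar(a260: float, path_cm: float = 1.0, dilution: float = 1.0) -> float:
    """Beer-Lambert law: c = A / (epsilon * l), corrected for any dilution factor."""
    if path_cm <= 0:
        raise ValueError("path length must be positive")
    return a260 * dilution / (EPSILON_260 * path_cm)


# Hypothetical reading: A260 = 0.77 in a 1 cm cuvette, measured without dilution.
c = concentration_molar(0.77)
print(f"{c * 1e6:.1f} uM")
```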

    [0150] The resulting AdoMet analogues contain reactive terminal azides that can be readily conjugated to a range of functional groups, while condensation of the aldehyde with the hydrazone or alkoxyamine incorporates a dynamic functionality that can be reversibly functionalised.

    [0151] A slight excess of aldehyde (1.2 equivs) was employed to ensure full functionalisation of the deprotected intermediate.

    [0152] No degradation of the freeze-dried AdoMet analogues was observed.

    [0153] With reference to FIG. 2, a method in accordance with the present invention is used for epigenetic profiling of genomic DNA (10), such as human genomic DNA. In a first step (A), the genomic DNA is digested into DNA fragments using a restriction enzyme such as SaqAI. The DNA fragments produced by the enzymatic digestion include fragments which have not been methylated at CpG sites (12) and fragments which have been methylated at CpG sites (14).

    [0154] Step (B) comprises methyltransferase-directed functionalization of unmethylated CpG sites. In this step, the non-methylated CpG sites present in the DNA fragments (12) are functionalized using a methyltransferase enzyme (such as M.MpeI) and the S-adenosyl-L-methionine (AdoMet) analogue AdoHcy-8-Hy (shown in FIG. 3A). The methyltransferase transfers a transferable group, i.e. a linker (16), from the cofactor to position 5 of the cytosine of non-methylated CpG sites, thereby producing functionalized DNA fragments (18).

    [0155] As shown in FIG. 3A, the linker which is transferred from the cofactor to the DNA fragments comprises a hydrolysable C═N moiety (a Schiff base), and a terminal azide (N3) group.

    [0156] In step (C), the functionalized DNA fragments (18) are reacted with diazo biotin-DBCO, forming labelled DNA fragments (20). A covalent bond is formed between the linker and the biotin label by virtue of a click reaction between the terminal azide of the linker and the alkyne of the DBCO moiety, resulting in the structure shown in FIG. 3B.

    [0157] In step (D), the labelled DNA fragments (20) are captured using streptavidin-coated beads (22). The beads (22) are then washed to remove any non-specifically bound (i.e. non-labelled) DNA. Captured DNA fragments (24) are then released by hydrolyzing the hydrolysable moiety (step (E)), giving the structure shown in FIG. 3C.

    [0158] The released fragments are then re-ligated together in a random fashion using DNA ligase (step (F)). This creates long sequences of DNA (26) comprised of many ligated DNA fragments containing the CpG sites which were not methylated in the original genomic DNA sequence. Optionally, the ligated DNA is amplified by PCR to remove the linkers (step (G)). Finally, the amplified DNA (28) is sequenced.
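
    The selection logic of steps (B) to (E) can be summarised in a few lines of code. This is an idealised sketch rather than a model of the chemistry: it assumes that labelling, capture and release are all complete, and the `Fragment` class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    name: str
    cpg_methylated: list  # one bool per CpG site: True = methylated


def is_captured(fragment: Fragment) -> bool:
    """A fragment is labelled, and hence captured, if any CpG site is unmethylated."""
    return any(not m for m in fragment.cpg_methylated)


def enrich(fragments):
    """Return the fragments expected in the released pool that goes on to sequencing."""
    return [f for f in fragments if is_captured(f)]


fragments = [
    Fragment("no_cpg", []),
    Fragment("fully_methylated", [True, True]),
    Fragment("partly_unmethylated", [True, False]),
    Fragment("unmethylated", [False, False, False]),
]
print([f.name for f in enrich(fragments)])
```

    Fragments without a CpG site and fully methylated fragments carry no linker, so in this idealised picture they are washed away in step (D).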

    Example 2: Capture and Release of Control DNA Fragments that Contain Varying Numbers of Capture Sites

    Methodology

    PCR Amplification of CpG Site Containing Fragments

    [0159] DNA fragments (~150 bp) containing 0, 1, 2, 3 or 4 CpG sites were produced by PCR amplification of sections of the Lambda genome (NEB) using Q5® High-Fidelity DNA Polymerase (NEB) following manufacturer's instructions. The following amplification programme was used: 98° C. for 30 s, 30 cycles 98° C. 10 s, 61° C. 30 s, 72° C. 20 s and a final extension at 72° C. for 2 mins. After amplification the DNA was purified using 2x AMPure XP beads and eluted into 100 μl EB (10 mM Tris-HCl (pH 8.5)). The DNA concentration was quantified using Qubit™ 4 Fluorometer using the dsDNA BR Assay Kit (Thermo Fisher) and sized using the High Sensitivity D5000 ScreenTape on the TapeStation 2200 (Agilent).

    CpG Capture Analysis

    [0160] 750 ng of CpG site (0, 1, 2, 3 or 4) containing PCR fragments were mTAG labelled in 35 μl reactions containing 10x CutSmart buffer (3.5 μl) (NEB), 500 μM AdoHcy-8-Hy cofactor (1.17 μl), M.MpeI enzyme (double mutant (Q136A/N374A), 2.5 μl) (1.7 mg/ml) and water. Samples were incubated at 37° C. for 1 hr. Following this, 1 μL proteinase K (800 units/ml) (NEB) was added and samples were incubated for 1 hr at 50° C. Next, 1 μl of Diazo Biotin-DBCO (Jena Bioscience) was added and the samples incubated at 37° C. for 1 hr at 1000 rpm. Samples were then purified using 2x AMPure XP beads (washed 2× with 500 μl 80% ethanol) and eluted into Tris buffer A (10 mM Tris, 1 mM NaCl, pH 7.5). The DNA concentration was quantified using Qubit™ 4 Fluorometer and dsDNA HS Assay Kit (Thermo Fisher).

    [0161] 5 μl of Dynabeads™ MyOne™ Streptavidin C1 beads (Thermo Fisher) for each sample were washed 2× with an equal volume of Tris buffer A. The wash solution was removed, and 250 ng of biotin labelled PCR DNA fragments in 5 μL Tris buffer A were added to the 5 μl of washed streptavidin beads and incubated at RT, 1000 rpm for 20 mins. Samples were placed onto a magnet and the supernatant removed and stored. The beads were then washed 2× with 5 μl Tris buffer A and the washes were stored. The percentage of DNA captured was calculated using the Qubit™ 4 Fluorometer with the dsDNA HS Assay Kit (Thermo Fisher).
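
    The text does not give the formula used to calculate the percentage of DNA captured. One plausible reading, given that the supernatant and washes were stored, is a depletion calculation in which DNA not recovered in the supernatant or washes is taken to be bound to the beads. The sketch below works on that assumption; the readings and the function name are hypothetical.

```python
def percent_captured(input_ng: float, supernatant_ng: float, wash_ng: list) -> float:
    """Estimate capture by depletion: DNA absent from supernatant and washes is bound."""
    if input_ng <= 0:
        raise ValueError("input mass must be positive")
    unbound = supernatant_ng + sum(wash_ng)
    return 100.0 * (input_ng - unbound) / input_ng


# Hypothetical Qubit readings for a 250 ng input (illustrative values only).
print(round(percent_captured(250.0, 40.0, [6.0, 4.0]), 1))
```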

    Results

    [0162] The results of this experiment are shown in FIG. 4. This demonstrates the ability to capture control fragments of DNA that contain varying numbers of CpG sites. DNA which did not possess a capture site (0 CG) was not captured, whereas DNA that contained one (1 CG), two (2 CG), three (3 CG) or four (4 CG) capture sites were each captured efficiently. This demonstrates the specificity and efficiency of the method.

    Example 3: Epigenetic Study using Human DNA

    Methodology

    Genomic DNA Extraction

    [0163] Human genomic DNA was extracted from cultured GM12878 human cells (Coriell: GM12878). Cell culture was done using Epstein-Barr virus (EBV)-transformed B lymphocyte culture from the GM12878 cell line, grown in RPMI-1640 media, supplemented with 2 mM L-glutamine, 15% FBS and incubated at 37° C. Genomic DNA was extracted using the QIAGEN Genomic-tip 500/G kit (Qiagen) following manufacturer's instructions.

    Digestion of Genomic DNA

    [0164] Human genomic DNA (NA12878) was digested using Anza™ 64 SaqAI (Thermo Fisher), in an 80 μL reaction containing 4 μg DNA, 8 μL buffer, 4 μl SaqAI enzyme and water. The reaction was then incubated at 37° C. for 1 hr. The fragmented DNA was cleaned using the QIAquick PCR Purification Kit (Qiagen) and eluted into 30 μl EB (10 mM Tris-HCl (pH 8.5)). If additional DNA was required, the reaction was repeated. The DNA concentration was quantified using the Qubit™ 4 Fluorometer and dsDNA HS Assay Kit (Thermo Fisher) and sized using High Sensitivity D5000 ScreenTape on the TapeStation 2200 (Agilent).

    mTAG-Directed Functionalisation of Human DNA and Enrichment of Unmethylated CpG Containing Fragments

    [0165] mTAG labelling and biotin tagging of human DNA fragments was done 3×. Each reaction contained 1.5 μg of SaqAI digested human DNA, 7 μl 10x CutSmart buffer, 2.33 μl AdoHcy-8-Hy cofactor (500 μM), 5 μL M.MpeI enzyme (double mutant (Q136A/N374A)) (1.7 mg/ml) and water to a final volume of 70 μl. Samples were incubated at 37° C. for 1 hr. To these reactions 2 μl proteinase K (800 units/ml) (NEB) was added and incubated at 50° C. for 1 hr. Next, 2 μl Diazo Biotin-DBCO (Jena Bioscience) was added and reactions were incubated for a further 1 hr at 37° C., 1000 rpm. Each sample was purified using 2x AMPure XP beads (washed 2× with 1000 μl 80% ethanol) and eluted into 30 μl Tris buffer A (10 mM Tris, 1 mM NaCl, pH 7.5). The DNA concentration was quantified using the Qubit™ 4 Fluorometer and dsDNA HS Assay Kit (Thermo Fisher).

    [0166] 20 μl of Dynabeads™ MyOne™ Streptavidin C1 beads (Thermo Fisher) for each sample were washed 2× with an equal volume of Tris buffer A. 1 μg of labelled and biotin tagged DNA in 30 μl Tris buffer A from each sample was added to 20 μl of washed streptavidin beads and incubated at RT, 1000 rpm for 20 mins. Samples were then placed onto a magnet, the supernatant was removed and the beads were washed 2× in 20 μl Tris buffer A. To release the captured DNA from the beads, 80 μl of release buffer (11.2 mM ammonium acetate (pH 6.5), 1M NaCl) along with 20 μl of 0.85 M hydroxylamine solution (170 mM final) was added and incubated at 50° C., 1000 rpm for 1 hr. Released DNA was then purified using 2x AMPure XP beads and eluted into 14 μl EB (10 mM Tris-HCl (pH 8.5)). The DNA concentration was quantified using the Qubit™ 4 Fluorometer with the dsDNA HS Assay Kit (Thermo Fisher) and sized using the High Sensitivity D5000 ScreenTape on the TapeStation 2200 (Agilent).
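
    The stated final hydroxylamine concentration follows from the dilution relation C1·V1 = C2·V2. The sketch below reproduces that arithmetic, assuming the residual liquid on the washed beads is negligible; the function name is hypothetical.

```python
def final_concentration(stock: float, stock_volume: float, other_volume: float) -> float:
    """Solve C1*V1 = C2*V2 for C2; volumes share one unit, result is in the stock's unit."""
    total = stock_volume + other_volume
    if total <= 0:
        raise ValueError("total volume must be positive")
    return stock * stock_volume / total


# 20 ul of 0.85 M (850 mM) hydroxylamine added to 80 ul of release buffer.
hydroxylamine_mM = final_concentration(850.0, 20.0, 80.0)
print(f"{hydroxylamine_mM:.0f} mM")
```

    The same relation applies to the buffer components, so the 1 M NaCl in the release buffer is diluted to 0.8 M in the final 100 μl.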

    Ligation and PCR Amplification of Released Fragments

    [0167] After quantification and sizing of released human DNA fragments ˜170-190 ng of DNA remained in each 10.5 μl sample. To this an equal volume (10.5 μl) of Anza™ T4 DNA Ligase Master Mix (Thermo Fisher) was added and samples were incubated at RT for 1 hr. Each sample was then purified using 2x AMPure XP beads and eluted into 13 μl EB (10 mM Tris-HCl (pH 8.5)). DNA concentration was quantified using the Qubit™ 4 Fluorometer with the dsDNA HS Assay Kit (Thermo Fisher) and sized using the Genomic DNA ScreenTape on the TapeStation 2200 (Agilent).

    [0168] Following this, the remaining 10 μl of each sample was end-repaired and dA-tailed using NEBNext® Ultra™ II End Repair/dA-Tailing Module (NEB). DNA was then cleaned using 2x AMPure XP beads and eluted into 15 μl EB (10 mM Tris-HCl (pH 8.5)). To the 15 μl of end repaired and dA-tailed DNA, 10 μl PCA (Oxford Nanopore Technologies (ONT)) and 25 μl Blunt/TA Ligase Master Mix were added and samples were incubated at 25° C. for 1 hr. The DNA was then purified using 2x AMPure XP beads and eluted into 12 μl EB. The DNA concentration was quantified using the Qubit™ 4 Fluorometer with the dsDNA HS Assay Kit (Thermo Fisher).

    [0169] Next, 3×50 μl PCR reactions were prepared, containing 4.5 μl (10 ng) PCA ligated DNA from the previous step, 1.5 μl dNTPs (10 mM), 2 μl PRM (Oxford Nanopore Technologies (ONT)), 10 μl LongAmp® Taq reaction buffer (NEB), 2 μl LongAmp® Taq DNA Polymerase (NEB) and 30 μl water. The following amplification programme was used: 94° C. for 2 mins, 21 cycles 94° C. 30 s, 62° C. 15 s, 65° C. 15 mins and a final extension at 65° C. for 15 mins. DNA was purified using 2x AMPure XP beads and the DNA concentration was quantified using the Qubit™ 4 Fluorometer with the dsDNA BR Assay Kit (Thermo Fisher) and sized using the Genomic DNA ScreenTape on the TapeStation 2200 (Agilent).
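
    Because each cycle includes a 15 minute extension, the amplification programme is long. The sketch below totals the programmed hold times as stated in the text; ramping between temperatures is ignored, so the real run time on a thermocycler would be somewhat longer. Variable names are hypothetical.

```python
# Thermocycler programme in seconds (hold times only; ramping is ignored).
initial_denaturation = 2 * 60          # 94 C for 2 mins
cycle_steps = [30, 15, 15 * 60]        # 94 C 30 s, 62 C 15 s, 65 C 15 mins
cycles = 21
final_extension = 15 * 60              # 65 C for 15 mins

total_seconds = initial_denaturation + cycles * sum(cycle_steps) + final_extension
print(f"programmed hold time: {total_seconds} s ({total_seconds / 3600:.1f} h)")
```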

    Library Preparation and MinION Sequencing

    [0170] 1 μg of re-ligated and PCR amplified DNA in 50 μl of nuclease-free water was end-repaired and dA-tailed using the NEBNext® Ultra™ II End Repair/dA-Tailing Module (NEB). Samples were then purified using 2x AMPure XP beads and eluted into nuclease-free water. Sequencing adapters (AMX) were then ligated using NEBNext Quick T4 DNA Ligase (NEB) following the manufacturer's protocol (1D genomic DNA by ligation (SQK-LSK109), Oxford Nanopore Technologies (ONT)).

    [0171] Each library was loaded onto a R9.4.1 flow cell (FLO-MIN106D) following manufacturer's instructions (ONT) and sequenced for 48 hours, using the standard parameters specified for the library preparation protocol. Base-calling was done using Guppy (2.0.10), with parameters based on the library preparation method.

    Read Alignment

    [0172] The sequenced reads were mapped to the human genome reference (hg19) using minimap2 (reference 1) with the “-ax map-ont -K 500M” options.
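
    For illustration, the mapping step can be expressed as a command line built in Python. Only the options quoted above are taken from the text; the file names are hypothetical placeholders and the helper function is not part of any tool.

```python
import shlex


def minimap2_command(reference: str, reads: str) -> list:
    """Build the minimap2 argument list with the options quoted in the text."""
    # -a: output SAM; -x map-ont: Oxford Nanopore preset; -K 500M: mini-batch size.
    return ["minimap2", "-ax", "map-ont", "-K", "500M", reference, reads]


# Hypothetical file names. minimap2 writes SAM to stdout, so redirect it when running.
cmd = minimap2_command("hg19.fa", "reads.fastq")
print(shlex.join(cmd) + " > aligned.sam")
```

    The argument list can be passed to `subprocess.run` with `stdout` set to an open file if the command is to be executed from Python rather than a shell.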

    Results

    [0173] Initial digestion of the human genomic DNA with the SaqAI enzyme was found to yield DNA fragments of about 150 bp in length. Following mTAG functionalization of the DNA fragments using the AdoHcy-8-Hy cofactor, and labelling using Diazo Biotin-DBCO, the labelled DNA fragments were captured on streptavidin-coated beads. Consistent capture of about 24% of DNA across all three samples was observed. In addition, highly efficient recovery of the captured DNA from the streptavidin beads (˜95%) was achieved in a single step, for all samples (FIG. 5).

    [0174] The released fragments of DNA from each sample were then randomly stuck together to form long fragments of DNA. FIG. 6 shows the successful re-ligation and PCR amplification of the captured and released DNA fragments in all three repeats of the experiment.

    [0175] The released and re-ligated human DNA from each sample was then sequenced using a MinION nanopore sequencing device. As each sequencing read consists of many short fragments of DNA ligated/stuck together randomly, each individual fragment in the read was aligned to the genomic location from which it was derived, using publicly available algorithms.

    [0176] Evidence supporting correct alignment of the individual DNA fragments within each read can be seen in FIG. 7. This figure shows the “Sequenced read length” over the “Aligned read length”. The sequenced read length is the length of each sequencing read (each read consists of multiple small fragments stuck together). If the short fragments within each sequencing read were aligned to the correct locations on the genome, the “aligned read lengths” would be expected to be shorter than the “sequenced read lengths”, which is exactly what is shown in FIG. 7. This demonstrates the alignment algorithm is capable of aligning the short fragments within each sequencing read.
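
    The two lengths compared in FIG. 7 can both be derived from the CIGAR string of a SAM alignment record. The sketch below is one way to compute them and is not taken from the analysis described here; the example CIGAR string and the function name are hypothetical.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def read_lengths(cigar: str):
    """Return (sequenced_length, aligned_length) for one SAM CIGAR string.

    Sequenced length counts every base of the read, including clipped bases;
    aligned length counts only the read bases that take part in the alignment.
    """
    sequenced = aligned = 0
    for count, op in CIGAR_OP.findall(cigar):
        n = int(count)
        if op in "MI=X":      # read bases inside the aligned segment
            sequenced += n
            aligned += n
        elif op in "SH":      # clipped read bases outside the aligned segment
            sequenced += n
        # D, N and P consume no read bases, so they are skipped.
    return sequenced, aligned


# Hypothetical record: a 150-base fragment aligned inside a 1000-base chimeric read.
print(read_lengths("400S148M2I450S"))
```

    For a read built from many ligated fragments, each alignment record covers a single fragment and the remainder of the read is clipped, which is why the aligned length is expected to be much shorter than the sequenced length.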

    [0177] To ensure that the correct sites were captured, the sequencing data obtained was compared to MeDIP sequencing data. MeDIP is a method which captures methylated sites of the genome, in contrast to the method of the invention which captures unmethylated sites. Therefore, the method of the invention should not be capturing the same locations of the genome as MeDIP.

    [0178] FIG. 8 shows a comparison of sequencing reads obtained using the method (SUURF ID1, ID2 and ID3) with a sequencing read obtained using MeDIP. The region shown in the box is a gene promoter sequence which is known to be unmethylated. In this region, it can be seen that there is a build-up of sequencing reads from the SUURF samples (capture of non-methylated DNA), and a decrease in sequencing reads from the MeDIP sample (capture of methylated DNA). This demonstrates that the method of the invention can be used to successfully capture and sequence non-methylated regions of the genome.

    Example 4: Human Head and Neck Cancer Capture and Release for Illumina Sequencing

    [0179] Head and neck squamous cell carcinoma (HNSCC) (VU40T) DNA was fragmented to 150 bp in the following reaction: 50 μl DNA (5.6 μg), 6.5 μl 10x Fragmentase Reaction Buffer v2, 3 μl NEBNext dsDNA Fragmentase enzyme (M0348), 3.5 μl 200 mM MgCl2, 2 μl nuclease free water to a final volume of 65 μl. The reaction was incubated at 37° C. for 35 mins. To stop the reaction, 35 μl 400 mM EDTA was added. The DNA was then purified using 2.5x AMPure XP beads (Beckman), and the size checked using a 1% agarose gel, pre-stained with GelRed® (Biotium) (120V for 55 mins).

    [0180] Next, a labelling reaction was prepared containing: 19.4 μl fragmented head and neck cancer DNA (800 ng) (150 bp), 10 μl M.Mpel enzyme (double mutant (Q136A/N374A)) (stock 1.7 mg/ml), 7 μl 10x CutSmart® Buffer (NEB), 2.33 μl AdoHcy-8-Hy cofactor (500 μM final) and 31.27 μl nuclease-free water, to a final volume of 70 μl. The reaction was then incubated at 37° C. for 1 h. Next, 4 μl proteinase K (800 units/ml) (NEB) was added and the sample was incubated for a further 1 h at 50° C. Finally, 4 μl of Sulfo-DBCO-Biotin Conjugate (15 mM stock) (Jena Bioscience) was added and the sample was incubated for 1 h at 37° C. The sample was then purified using 2.5x AMPure XP beads (Beckman) and the DNA concentration was quantified using a Qubit™ 4 Fluorometer and dsDNA HS Assay Kit (Thermo Fisher).
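
    As a consistency check on the labelling reaction, the stated final cofactor concentration can be used to back-calculate the stock concentration it implies. This Python sketch takes the cofactor volume as 2.33 μl, the value that makes the listed components sum to 70 μl; the function name `implied_stock_uM` is hypothetical, and the derived figures are arithmetic consequences of the listed values rather than quantities reported in this example.

```python
def implied_stock_uM(final_uM: float, final_ul: float, added_ul: float) -> float:
    """C1 = C2 * V2 / V1: stock concentration implied by a stated final concentration."""
    if added_ul <= 0:
        raise ValueError("added volume must be positive")
    return final_uM * final_ul / added_ul


# DNA, enzyme, 10x buffer, cofactor, water (microlitres).
volumes_ul = [19.4, 10.0, 7.0, 2.33, 31.27]
total_ul = sum(volumes_ul)

dna_ng_per_ul = 800.0 / total_ul             # 800 ng of fragmented DNA
enzyme_mg_per_ml = 1.7 * 10.0 / total_ul     # 10 ul of 1.7 mg/ml enzyme stock
cofactor_stock_mM = implied_stock_uM(500.0, total_ul, 2.33) / 1000.0

print(f"total volume: {total_ul:.2f} ul")
print(f"DNA: {dna_ng_per_ul:.2f} ng/ul, enzyme: {enzyme_mg_per_ml:.3f} mg/ml")
print(f"implied cofactor stock: {cofactor_stock_mM:.2f} mM")
```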

    [0181] Following this, 600 ng of fragmented and biotinylated DNA in 50 μl Tris buffer A (10 mM Tris, 1 mM NaCl, pH 7.5) was incubated with 60 μl of washed Dynabeads™ MyOne™ Streptavidin C1 beads (Thermo Fisher) at RT for 20 min. The beads were then washed twice with 100 μl Tris buffer A to remove any non-specifically bound DNA. The captured DNA was released from the beads using 90 μl of release buffer (11.2 mM ammonium acetate (pH 6.5), 1 M NaCl) and 10 μl of 0.85 M hydroxylamine solution (170 mM final) at 50° C., 1200 rpm for 1 h. The released DNA was then purified using 2.5x AMPure XP beads (Beckman) and the DNA concentration was quantified using a Qubit™ 4 Fluorometer and dsDNA HS Assay Kit (Thermo Fisher). The sample was then prepared for Illumina sequencing using the KAPA HyperPrep Kit following the manufacturer's instructions.

    [0182] FIG. 9 shows a genome browser comparison of the number of reads mapped in a region containing a CpG island at the start of the MLM1 gene for (top) the present unmethylome chemistry, showing the location of unmethylated CpG sites, and (middle and bottom) MeDIP data (antibody-based capture), showing complementary capture of methylated regions of the genome.

    Example 5: Amplification of Captured DNA

    Labelling of Unmethylated Genomic DNA in Fixed Cells with Methyltransferase Enzymes and AdoMet Analogue

    [0183] 1×10⁶ cells (MCF7 or MCF10A) were seeded on a 10 cm dish and incubated for 24 h. The cells were then fixed with 5 ml of cold MeOH/AcOH (95:5) for 10 minutes at −20° C. and washed twice with PBS. Fixed cells were incubated for 1 h at 37° C. with 5 ml of one of the following solutions in 1x CutSmart buffer:

    [0184] Taq—37.4 μl of M.Taql (WT) (1.1 mg/ml), 4 μM AdoHcy-6-N3

    [0185] M.Mpel—90.9 μl M.Mpel (double mutant (Q136A/N374A)) (6.9 mg/ml), 65 μM AdoHcy-6-N3

    [0186] The cells were washed twice with PBS and incubated overnight at RT with 100 μM sulfo-DBCO-Biotin in PBS. The cells were then washed three times with PBST, 1 ml of PBST was added and the cells were scraped. DNA was purified using a QIAGEN Genomic-tip 20/G. Biotin-labelled DNA was then resuspended in 100 mM Tris-HCl pH 8.5 and sonicated to 150 bp.

    Labelling of Unmethylated Genomic DNA In Vitro with Methyltransferase Enzymes and AdoMet Analogue

    [0187] 2 μg of extracted genomic DNA from MCF7 or MCF10A cells was incubated with 0.5 μl of M.Taql (1.1 mg/ml) or M.Mpel (6.9 mg/ml) and AdoHcy-6-N3 (4 μM and 65 μM respectively) in a total volume of 20 μl of 1x CutSmart buffer for 1 hour at 37° C. 3 μl of Proteinase K was added and samples were incubated for 1 hour at 50° C., followed by incubation with 2 mM DBCO-sulfo-Biotin for 1 hour at 37° C. Biotin-labelled DNA was then purified with the GenElute Bacterial Genomic DNA Kit. DNA was eluted twice in 10 mM Tris-HCl pH 8.5.

    Library Construction (Following Procedure from Ponnaluri, V.K.C., et al. Genome Biol 18, 122 (2017))

    [0188] 1 μg of DNA was end-repaired, dA-tailed, and ligated with NEBNext Ultra™ II DNA Library Prep Kit. Without further purification, the ligation product was mixed with 50 μL of Streptavidin magnetic beads (Invitrogen 65001, blocked using 0.1% cold fish gelatin in 1×PBS overnight at 4° C.) in 1 mL of B&W buffer (10 mM Tris-HCl pH 8.0, 1 mM EDTA, 2 M NaCl). Biotin labelled DNA was captured by streptavidin magnetic beads at 4° C. for 2 h with end-over-end rotation. The beads were washed four times with B&W buffer plus 0.05% of Triton X-100 followed by one wash with TE plus Triton X-100. The beads were resuspended in 40 μL of nuclease-free water and 4 μL was used for library amplification using standard PCR. It was found that the presence of the biotin labels did not affect the amplification process.