PROCESS FOR PRODUCING A CHROMATIN CONFORMATION CAPTURE (3C) LIBRARY

20220098656 · 2022-03-31

    Abstract

    The present invention relates to a process for producing a chromatin conformation capture (3C) library. This may be used for identifying nucleic acid regions within a nucleic acid sample which interact with one another. The process comprises treating nucleic acids in a population of eukaryotic cells, the process comprising the steps: (i) immobilising the nucleic acids within the cells in a population of eukaryotic cells; (ii) permeabilising or removing the cell membranes of the eukaryotic cells; and (iii) fragmenting the immobilised nucleic acids within the cells to produce nucleic acid fragments.

    Claims

    1. (canceled)

    2. A process for treating nucleic acids in a population of eukaryotic cells, the process comprising the steps: (i) immobilising the nucleic acids within the cells in a population of eukaryotic cells; (ii) permeabilising or removing the cell membranes of the eukaryotic cells; and (iii) fragmenting the immobilised nucleic acids within the cells to produce nucleic acid fragments.

    3. The process as claimed in claim 2, wherein the nucleic acids are chromatin comprising nucleosomes which are linked by inter-nucleosomal linkers.

    4. The process as claimed in claim 2, wherein in Step (i), the nucleic acids are immobilised by cross-linking the nucleic acids.

    5. The process as claimed in claim 2, wherein in Step (ii), the outer cell membranes and nuclear membranes are permeabilised.

    6. The process as claimed in claim 2, wherein in Step (iii), the immobilised nucleic acids are fragmented using an endo-exonuclease or using micrococcal nuclease.

    7. The process as claimed in claim 2, wherein in Step (iii), the immobilised nucleic acids are fragmented such that mono-nucleosomes are produced, wherein the inter-nucleosomal linkers are at least partially intact.

    8. The process as claimed in claim 2, wherein the eukaryotic cells are mammalian cells and wherein the number of cells in the population of mammalian cells is 1-10,000, 10,000-1 million, or 1 million to 100 million.

    9. A process for producing a 3C library, the process comprising the steps: (a) treating nucleic acids by a process as defined in claim 2; (b) ligating the nucleic acid fragments to produce ligated nucleic acid fragments; and (c) de-immobilising or de-crosslinking the ligated nucleic acid fragments.

    10. A method of identifying nucleic acid regions within a nucleic acid sample which interact with one another, the method comprising the steps: producing a 3C library by a process as defined in claim 9; (d) fragmenting the 3C library to produce nucleic acid fragments; (e) optionally, adding sequencing adaptors to the ends of the nucleic acid fragments and/or amplifying the nucleic acid fragments; (f) contacting the nucleic acid fragments with a targeting nucleic acid which binds to a subgroup of the nucleic acid fragments, wherein the targeting nucleic acid is labelled with the first half of a binding pair; (g) isolating the subgroup of nucleic acid fragments which have been bound by the targeting nucleic acid using the second half of the binding pair; (h) amplifying the isolated subgroup of nucleic acid fragments; (j) optionally repeating Steps (f), (g) and (h) one or more times; and (k) optionally sequencing the amplified isolated subgroup of nucleic acid fragments in order to identify nucleic acid regions within the nucleic acid sample which interact with one another.

    11. The method as claimed in claim 10, wherein the targeting nucleic acid is a DNA oligonucleotide.

    12. The method as claimed in claim 10, wherein the concentration of the targeting nucleic acid is 5 μM to 1 pM or 2.9 μM to 29 pM or 1 μM to 30 pM or 300 nM to 30 pM.

    13. The method as claimed in claim 11, wherein the targeting nucleic acid is selected such that it is capable of binding within a nucleosome-depleted region of a promoter of a gene or non-coding RNA of interest, or of a regulatory element or an enhancer, repressor or CTCF binding site in the nucleic acids.

    14. The method as claimed in claim 13, wherein the targeting nucleic acid is selected such that it is capable of binding within the central region of the nucleosome-depleted region of a promoter of a gene or non-coding RNA of interest, or of a regulatory element or an enhancer, repressor or CTCF binding site in the nucleic acids.

    15. The method as claimed in claim 10, wherein Step (j) is repeated 1, 2, 3, 4 or 5 times.

    16. A method of identifying allele-specific interaction profiles in SNP-containing regions of nucleic acids, the method comprising a method as defined in claim 10 including sequencing the amplified isolated subgroup of nucleic acid fragments in order to identify allele-specific interaction profiles in SNP-containing regions.

    17. A method of identifying one or more interacting nucleic acid regions that are indicative of a particular disease state or disorder, the method comprising: a) carrying out a method as defined in claim 10 on a nucleic acid sample of mammalian cells obtained from a subject with a particular disease state or disorder; b) quantifying a frequency of interaction between a first nucleic acid region and a second nucleic acid region; and c) comparing the frequency of interaction in the nucleic acid sample from the subject with said disease state or disorder with the frequency of interaction in a control nucleic acid sample from a healthy subject, such that a difference in the frequency of interaction in the nucleic acid samples is indicative of a particular disease state or disorder.

    18. A kit for identifying nucleic acid regions within a nucleic acid sample which interact with one another, the kit comprising buffers and reagents for performing a method as defined in claim 2.

    19. The process as claimed in claim 2, wherein the eukaryotic cells are mammalian cells.

    Description

    BRIEF DESCRIPTION OF THE FIGURES

    [0133] FIG. 1. Overview of the method of the invention for producing a 3C library.

    [0134] FIG. 2. Nucleosome fragmentation profiles. Following DNA extraction the material was assayed using automated gel electrophoresis (Agilent TapeStation with D1000 reagents). Optimal levels of digestion were obtained when the chromatin was digested predominantly to mono-nucleosomes (180-200 bp) but with the inter-nucleosomal linkers attached (FIGS. 2 and 3). Over-digestion to <160 bp removed the inter-nucleosomal linkers and this meant that it was not possible to ligate fragments in close proximity.

    [0135] FIG. 3. Model to explain the rationale behind the optimal digestion. Prior to digestion, chromatin is wrapped around nucleosomes. Approximately 148 bp is wrapped around each nucleosome, with a linker sequence of around 20-80 bp. When the sample is digested to a peak fragment size of 180-200 base pairs, the linkers between the nucleosomes are cut, but not digested. This allows the ligation reaction to proceed between different nucleosomes. If the linkers are fully digested between the nucleosomes, then it is impossible to get the ligation reaction to proceed.

    [0136] FIG. 4a, b and c. Comparison of data generated by different 3C methods. These panels show the increased resolution obtained using the method of the invention in comparison to data from Hsieh et al. [15].

    [0137] FIG. 4a shows a 100 kb section of the alpha globin locus and shows how small changes in the position of the oligonucleotides used for capture change the interaction profile dramatically. In particular, oligonucleotides placed directly over the hypersensitive site at the promoter of the gene reveal highly discrete interactions with the enhancer regulatory elements that control gene expression. Data from NG Capture-C and 4C-seq methods [10, 12, 13] are included to allow comparison with the previously-best available methods.

    [0138] FIG. 4b shows a 20 kb section from FIG. 4a and includes 20 kb data from Hsieh et al. [15], generated from S. cerevisiae to allow comparison.

    [0139] FIG. 4c shows a 1 kb section from FIG. 4b, which highlights the resolution obtainable from the method of the invention. When the ligation junctions are plotted, this gives close to single base pair resolution and this potentially highlights the binding sites of transcription factors within the enhancer region.

    [0140] FIG. 5 shows a comparison of MCC performed with an intact whole cell preparation compared to a nuclear preparation.

    [0141] FIG. 6. Micro-C data. This illustrates the nucleotide sequence resolution obtained using a method of the prior art (taken from FIG. 5 of Hsieh et al. [14]).

    EXAMPLES

    [0142] The present invention is further illustrated by the following Examples, in which parts and percentages are by weight and degrees are Celsius, unless otherwise stated. It should be understood that these Examples, while indicating preferred embodiments of the invention, are given by way of illustration only. From the above discussion and these Examples, one skilled in the art can ascertain the essential characteristics of this invention, and without departing from the spirit and scope thereof, can make various changes and modifications of the invention to adapt it to various usages and conditions. Thus, various modifications of the invention in addition to those shown and described herein will be apparent to those skilled in the art from the foregoing description. Such modifications are also intended to fall within the scope of the appended claims.

    Example 1

    Preparation of Micrococcal Nuclease Chromatin Conformation Capture (MCC) Library

    [0143] An overview of the method is shown in FIG. 1.

    Fixation

    [0144] 1-2×10⁷ cells were fixed in 10 ml of media with a final concentration of 2% formaldehyde for 10 minutes at room temperature. The reaction was quenched by adding cold 1 M glycine (final concentration 130 mM) and the cells were centrifuged for 5 minutes at 300 g/4° C. The supernatant was discarded and the cell pellet was resuspended in phosphate buffered saline, centrifuged (300 g/4° C.) and the supernatant discarded. The cell pellet was then resuspended in phosphate buffered saline and digitonin (Sigma) was added to a final concentration of ˜0.05% (sufficient to permeabilise the cells, depending on the batch of digitonin). The cells can be snap-frozen and stored at −80° C., if desired, at this point.
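
    By way of illustration only, the quenching volume implied by paragraph [0144] can be estimated with a short dilution calculation. The sketch below assumes that the fixation volume is exactly 10 ml at the time of quenching and that volumes are additive; the function name `stock_volume_to_add` is hypothetical and is not part of the described method.

```python
def stock_volume_to_add(start_volume_ml, stock_conc, final_conc):
    """Volume of stock (ml) to add to start_volume_ml to reach final_conc.

    stock_conc and final_conc must share the same units. Because the added
    stock also increases the total volume, the relation is
    V_add = C_final * V_start / (C_stock - C_final).
    """
    if not 0 <= final_conc < stock_conc:
        raise ValueError("final concentration must be below the stock concentration")
    return final_conc * start_volume_ml / (stock_conc - final_conc)


# Assumption: 10 ml of fixed cell suspension, quenched with 1 M (1000 mM)
# glycine to a final concentration of 130 mM.
glycine_ml = stock_volume_to_add(10.0, 1000.0, 130.0)
print(f"1 M glycine to add: {glycine_ml:.2f} ml")  # prints 1.49 ml
```

    The same function could be applied to other additions (for example the digitonin), provided the stock concentration, which is not stated above, is known.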

    Digestion

    [0145] The permeabilised cells were centrifuged for 5 minutes at 300 g, the supernatant discarded, and the cells were resuspended in a reduced calcium content micrococcal nuclease buffer (10 mM Tris HCl pH 7.5; 1 mM CaCl₂). A titration of different concentrations of micrococcal nuclease (NEB or Worthington) was used to digest the chromatin (typically ranging from 0.5-40 Kunitz U for a reaction volume of 800 μl containing 2,000,000 cells). This reaction was incubated for 1 hour at 37° C. on an Eppendorf Thermomixer at 800 rpm. Nucleosome digestion profiles are shown in FIG. 2.

    [0146] The reaction was quenched with EGTA (ethylene glycol-bis(β-aminoethyl ether)-N,N,N′,N′-tetraacetic acid (Sigma)) to a final concentration of 5 mM. 200 μl was removed as a control to measure the digestion efficiency. The reaction was centrifuged (5 minutes at 300 g) and the digestion buffer was discarded. The cells were resuspended in phosphate buffered saline and centrifuged again (5 minutes at 300 g) and the supernatant was discarded.
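
    By way of illustration only, a micrococcal nuclease titration spanning the 0.5-40 Kunitz U range of paragraph [0145] might be laid out as a geometric series. The number of titration points and the geometric spacing are assumptions made for this sketch; the description does not specify how the intermediate amounts were chosen. The function name `mnase_titration` is hypothetical.

```python
def mnase_titration(low_units=0.5, high_units=40.0, points=6):
    """Return `points` enzyme amounts (Kunitz U) spaced geometrically
    between low_units and high_units, inclusive."""
    if points < 2 or low_units <= 0 or high_units <= low_units:
        raise ValueError("need points >= 2 and 0 < low_units < high_units")
    ratio = (high_units / low_units) ** (1.0 / (points - 1))
    return [low_units * ratio ** i for i in range(points)]


# Each reaction: 800 ul containing 2,000,000 cells (from paragraph [0145]).
CELLS_PER_REACTION = 2_000_000

for units in mnase_titration():
    per_million = units / (CELLS_PER_REACTION / 1_000_000)
    print(f"{units:6.2f} Kunitz U per reaction ({per_million:5.2f} U per 10^6 cells)")
```

    A geometric series is used here because it covers the almost two orders of magnitude between 0.5 and 40 Kunitz U with evenly spaced fold-changes; a linear series would place most points at the high end of the range.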

    Ligation

    [0147] End repair and phosphorylation of the DNA were performed prior to ligation. Cells were resuspended in DNA ligase buffer (Thermo Scientific; final concentrations 40 mM Tris HCl pH 7.5, 10 mM MgCl₂, 10 mM DTT, 5 mM ATP) supplemented with dNTPs (final concentration 400 μM each of dATP, dCTP, dGTP and dTTP (Thermo Fisher R0191)) and 5 mM EGTA. T4 Polynucleotide Kinase (PNK; NEB M0201L) and DNA Polymerase I, Large (Klenow) Fragment (NEB M0210L) were added to final concentrations of 200 U/ml and 100 U/ml respectively, and the reaction was incubated at 37° C. for 1 hour. T4 DNA ligase (Thermo Scientific High Concentration Ligase, 30 U/μl, EL0013) was then added to a final concentration of 300 U/ml and the reaction was incubated at 16° C. overnight using an Eppendorf Thermomixer at 800 rpm.
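
    By way of illustration only, the enzyme volume corresponding to a stated final activity can be computed from the stock activity. The sketch below uses the T4 DNA ligase figures from paragraph [0147] (30 U/μl stock, 300 U/ml final); the 1 ml reaction volume is an assumption, since the ligation volume is not stated, and the small dilution caused by the added enzyme is ignored. The function name `enzyme_volume_ul` is hypothetical.

```python
def enzyme_volume_ul(final_units_per_ml, reaction_volume_ml, stock_units_per_ul):
    """Volume of enzyme stock (ul) giving final_units_per_ml in the reaction.

    Ignores the (small) increase in reaction volume from the enzyme itself.
    """
    if stock_units_per_ul <= 0 or reaction_volume_ml <= 0:
        raise ValueError("stock activity and reaction volume must be positive")
    return final_units_per_ml * reaction_volume_ml / stock_units_per_ul


# T4 DNA ligase: 30 U/ul stock, 300 U/ml final (paragraph [0147]).
# Assumption: a 1 ml ligation reaction.
ligase_ul = enzyme_volume_ul(300, 1.0, 30)
print(f"T4 DNA ligase to add: {ligase_ul:.1f} ul")  # prints 10.0 ul
```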

    De-crosslinking

    [0148] The chromatin was de-crosslinked with proteinase K at 65° C. (>2 hours), and the DNA was purified using either phenol-chloroform extraction with RNase treatment (Roche: 1119915) or the Qiagen DNeasy Blood and Tissue kit.

    [0149] Digestion and ligation efficiencies were assessed using either gel electrophoresis or the Agilent TapeStation (D1000 reagents). The digestion control should show >80% mono-nucleosomes, and the 3C ligation product should show a significant increase in fragment size (FIG. 2). Over-digestion of the chromatin removes the inter-nucleosomal linker sequences and, when this occurs, the samples fail to ligate (FIGS. 2 and 3).
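
    By way of illustration only, the digestion criteria given above (a peak at 180-200 bp, >80% mono-nucleosomes, and over-digestion below 160 bp) could be applied to an electrophoresis trace as follows. The input format (a list of fragment-size and signal pairs), the 120-250 bp window used to count mono-nucleosomal signal, and the function name `assess_digestion` are all assumptions made for this sketch and are not taken from the description.

```python
def assess_digestion(trace, mono_window=(120, 250)):
    """Classify a digestion profile.

    trace: iterable of (fragment_size_bp, signal) pairs.
    mono_window: hypothetical size window (bp) counted as mono-nucleosomal.
    Returns (peak_bp, mono_fraction, verdict).
    """
    trace = list(trace)
    total = sum(signal for _, signal in trace)
    if total <= 0:
        raise ValueError("trace contains no signal")
    # Fragment size carrying the strongest signal.
    peak_bp = max(trace, key=lambda point: point[1])[0]
    low, high = mono_window
    mono_fraction = sum(s for bp, s in trace if low <= bp <= high) / total
    if peak_bp < 160:
        verdict = "over-digested"  # linkers likely removed; ligation may fail
    elif 180 <= peak_bp <= 200 and mono_fraction > 0.8:
        verdict = "optimal"
    else:
        verdict = "re-titrate"
    return peak_bp, mono_fraction, verdict


# Made-up example traces, for demonstration only.
good = [(150, 5), (190, 85), (380, 8), (570, 2)]
over = [(140, 90), (300, 10)]
print(assess_digestion(good))
print(assess_digestion(over))
```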

    Sonication

    [0150] The oligonucleotide capture protocol was performed as for conventional Next Generation Capture-C. Briefly, the micrococcal nuclease 3C library was sonicated to a mean fragment size of 200 base pairs using a Covaris S220 Focussed Ultrasonicator.

    Addition of Sequencing Adaptors

    [0151] Sequencing adaptors were added using the NEB Ultra II kit and the libraries were PCR amplified using the Herculase PCR kit (Agilent). These libraries were typically hybridised with 120 base pair biotinylated oligonucleotides (13 pmol to 130 fmol per sample, depending on the number of oligonucleotides used) for 72 hours using the Roche SeqCap reagents.

    Bead Capture

    [0152] The samples were captured with streptavidin beads (Thermo Fisher M270), washed and amplified using the Roche SeqCap reagents and standard protocols. A second round of oligonucleotide capture was performed with the same oligonucleotides and reagents, but with only a 24-hour hybridisation reaction.

    Sequencing

    [0153] The material was sequenced using the Illumina platform with 150 base pair paired-end reads (300 base pairs per read pair).

    Results

    [0154] The data were analysed as illustrated in FIG. 4. FIG. 4 shows data from the Micrococcal nuclease Capture-C (MCC) experiment. In this experiment, data were generated for 35 genes simultaneously. The experimental design included a central capture oligonucleotide designed to capture contacts directly from the middle of the hypersensitive site at the promoter of the gene and two flanking oligos, one ˜1 kb upstream (labelled ‘left’) and one ˜1 kb downstream (labelled ‘right’). The data show that the resolution of MCC is much greater than that achievable by the best methods previously available (NG Capture-C and 4C-seq) for defining interaction profiles at high resolution in mammalian genomes. In addition, the data have a substantially improved resolution compared to the all vs all contact maps in yeast generated using the Micro-C protocol [14], despite the much greater genome size.

    [0155] The position of the oligonucleotides used for capture changes the interaction profile dramatically. When oligonucleotides are placed directly over the hypersensitive site at the promoter of the gene, MCC reveals highly discrete interactions with the enhancer regulatory elements that are known to control gene expression (FIGS. 4a, b, c). However, when the biotinylated oligonucleotides are placed ˜1 kb upstream or downstream from the central oligo position on the DNase site, the profile changes and the interactions are more diffuse. Data from NG Capture-C and 4C-seq are included to allow comparison with the previously-best available methods for defining one vs all interaction profiles in large mammalian genomes. FIG. 4b shows a 20 kb section from FIG. 4a and includes 20 kb data from Hsieh et al. [15] generated from S. cerevisiae.

    [0156] FIG. 4c shows a 1 kb section from FIG. 4b; this highlights the resolution obtainable from the method of the invention. When the ligation junctions are plotted (in contrast to a pile-up of the whole reads shown in the other tracks), this gives close to single-base pair resolution. This highlights the potential transcription factor binding sites within the enhancer region in a manner similar to DNase I hypersensitivity footprinting data. In this experiment another 35 genes were analysed and these data show similar improvements in resolution.
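
    By way of illustration only, the difference between a pile-up of whole reads and a plot of ligation junctions, as discussed in paragraph [0156], can be sketched as follows. The input format (one tuple per reporter fragment giving its aligned start, its aligned end and the coordinate of its ligation junction) and the function names are hypothetical; the actual analysis pipeline is not described here, and the coordinates are made-up values.

```python
from collections import Counter


def pileup(fragments):
    """Per-base read depth: every base of each aligned fragment is counted.

    fragments: iterable of (start, end, junction) with half-open [start, end).
    """
    depth = Counter()
    for start, end, _junction in fragments:
        for pos in range(start, end):
            depth[pos] += 1
    return depth


def junction_profile(fragments):
    """Per-base count of ligation junctions only (one position per fragment)."""
    return Counter(junction for _start, _end, junction in fragments)


# Made-up example: three reporter fragments whose junctions cluster at 100-102.
fragments = [(100, 150, 100), (102, 160, 102), (100, 140, 100)]

depth = pileup(fragments)
junctions = junction_profile(fragments)
print(f"pile-up covers {len(depth)} positions")          # signal spread over whole reads
print(f"junctions fall on {len(junctions)} positions")   # signal at the cut sites only
```

    In this sketch the pile-up spreads the signal across every base of each read, whereas the junction profile concentrates it at the positions where ligation occurred, which is why the junction track can approach single-base pair resolution.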

    Example 2

    Effect of Digitonin on Resolution

    [0157] FIG. 5 shows a comparison of MCC performed with an intact whole-cell preparation compared to a nuclear preparation. The whole-cell preparation shows much more distinct peaks with the enhancer elements in comparison to the data generated from nuclei. NG Capture-C and 4C-seq data were included for comparison (both of these were generated from 3C libraries generated from nuclei rather than from intact cells).

    Comparative Example 3

    Resolution Obtained Using Micro-C Method

    [0158] For comparative purposes only, reference is made to the Micro-C method of the prior art (Hsieh et al., 2015 & 2016 [14, 15]). FIG. 4 shows data from Hsieh et al. [15] (Supplementary FIG. 2) showing a 20 kb region of yeast chromosome IX and FIG. 6 herein (reproduced from FIG. 5C of Hsieh et al. [14]) shows two 20 kb×20 kb matrices showing wild-type and ssu72-2 Micro-C data. These illustrate the lower level of resolution obtained in that Micro-C method.

    REFERENCES

    [0159] 1. Wang, Z., Gerstein, M. & Snyder, M. RNA-Seq: a revolutionary tool for transcriptomics. Nat Rev Genet 10, 57-63 (2009).

    [0160] 2. Mikkelsen, T. S. et al. Genome-wide maps of chromatin state in pluripotent and lineage-committed cells. Nature 448, 553-60 (2007).

    [0161] 3. Robertson, G. et al. Genome-wide profiles of STAT1 DNA association using chromatin immunoprecipitation and massively parallel sequencing. Nat Methods 4, 651-7 (2007).

    [0162] 4. Hesselberth, J. R. et al. Global mapping of protein-DNA interactions in vivo by digital genomic footprinting. Nat Methods 6, 283-9 (2009).

    [0163] 5. Buenrostro, J. D., Giresi, P. G., Zaba, L. C., Chang, H. Y. & Greenleaf, W. J. Transposition of native chromatin for fast and sensitive epigenomic profiling of open chromatin, DNA-binding proteins and nucleosome position. Nat Methods 10, 1213-8 (2013).

    [0164] 6. Dekker, J., Rippe, K., Dekker, M. & Kleckner, N. Capturing chromosome conformation. Science 295, 1306-11 (2002).

    [0165] 7. Tolhuis, B., Palstra, R. J., Splinter, E., Grosveld, F. & de Laat, W. Looping and interaction between hypersensitive sites in the active beta-globin locus. Mol Cell 10, 1453-65 (2002).

    [0166] 8. Noordermeer, D. et al. The dynamic architecture of Hox gene clusters. Science 334, 222-5 (2011).

    [0167] 9. Sanyal, A., Lajoie, B. R., Jain, G. & Dekker, J. The long-range interaction landscape of gene promoters. Nature 489, 109-13 (2012).

    [0168] 10. van de Werken, H. J. et al. Robust 4C-seq data analysis to screen for regulatory DNA interactions. Nat Methods 9, 969-72 (2012).

    [0169] 11. de Laat, W. & Duboule, D. Topology of mammalian developmental enhancers and their regulatory landscapes. Nature 502, 499-506 (2013).

    [0170] 12. Davies, J. O. J., Oudelaar, A. M., Higgs, D. R. & Hughes, J. R. How best to identify chromosomal interactions: a comparison of approaches. Nat Methods 14, 125-134 (2017).

    [0171] 13. Davies, J. O. J., Telenius, J. M., McGowan, S., Roberts, N. A., Taylor, S., Higgs, D. R. & Hughes, J. R. Multiplexed analysis of chromosome conformation at vastly improved sensitivity. Nat Methods 13, 74-80 (2016).

    [0172] 14. Hsieh, T. H., Weiner, A., Lajoie, B., Dekker, J., Friedman, N. & Rando, O. J. Mapping nucleosome resolution chromosome folding in yeast by Micro-C. Cell 162, 108-19 (2015).

    [0173] 15. Hsieh, T. S., Fudenberg, G., Goloborodko, A. & Rando, O. J. Micro-C XL: assaying chromosome conformation from the nucleosome to the entire genome. Nat Methods 13, 1009-1011 (2016).