CHEMICAL PLATFORM ASSISTED PROXIMITY CAPTURE (CAP-C)
20210317506 · 2021-10-14
Assignee
Inventors
Cpc classification
C12N15/11
CHEMISTRY; METALLURGY
C12Q2563/155
CHEMISTRY; METALLURGY
C12N2830/46
CHEMISTRY; METALLURGY
C07C69/003
CHEMISTRY; METALLURGY
C12Q1/6806
CHEMISTRY; METALLURGY
International classification
Abstract
Certain embodiments are directed to compositions and methods for capture of elements in physical proximity. In certain aspects the methods comprise (a) contacting a target with a functionalized scaffold or capture agent that comprises activatable cross-linking moieties to form a target/scaffold mixture; (b) exposing the target/scaffold mixture to an activator to activate the cross-linking moieties of the scaffold and form a cross-linked target/scaffold complex; (c) isolating the target/scaffold complexes; and (d) identifying portions of the target or targets that are cross-linked with the scaffold.
Claims
1. A method comprising: (a) contacting a target biomolecule(s) with a functionalized scaffold comprising activatable cross-linking moieties to form a target/scaffold mixture; (b) exposing the target/scaffold mixture to an activator to activate the cross-linking moieties of the functionalized scaffold and form a cross-linked target/scaffold complex; (c) isolating the target/scaffold complexes; and (d) identifying portions of the target biomolecule or target biomolecules that are cross-linked with the scaffold.
2. The method of claim 1, wherein the scaffold is a dendrimer or a nanoparticle.
3. The method of claim 2, wherein the nanoparticle is a silicon nanoparticle, a silicon dioxide nanoparticle, gold nanoparticle, or quantum dot.
4. The method of claim 1, wherein the scaffold has a diameter of about 1 to 100 nm.
5. The method of claim 1, wherein the target biomolecule(s) is contacted with at least a second scaffold having a diameter that is different than the first scaffold.
6. The method of claim 5, wherein a biological sample comprising the target biomolecule(s) is first divided into a first and second portion and the first and second portions are each contacted separately with a first and second scaffolds having differing diameters.
7. The method of claim 5, wherein the first and second scaffolds are coupled to distinct labels or reagents.
8. The method of claim 5, wherein the target biomolecule(s) is contacted with at least three distinct scaffolds with each scaffold having a different diameter as compared to the other scaffolds.
9. The method of claim 5, wherein a biological sample comprising the target biomolecule(s) is first divided into at least three portions and is separately contacted with a distinct scaffold with each scaffold having a different diameter as compared to the other scaffolds.
10. The method of claim 1, wherein the target biomolecule(s) are in a molecular complex comprising one or more nucleic acids, one or more polypeptides, or one or more nucleic acids and one or more polypeptides.
11. The method of claim 1, wherein the activator is light.
12. The method of claim 11, wherein the light is ultraviolet light.
13. The method of claim 11, wherein the light has a wavelength of about 350 to 375 nm.
14. The method of claim 11, wherein the light comprises a wavelength of 365 nm.
15. The method of claim 1, wherein the activator is temperature change.
16. The method of claim 1, wherein the activator is pH change.
17. The method of claim 1, wherein the functionalized scaffold further comprises at least a second or third functional group.
18. The method of claim 17, wherein the at least a second functional group is a label, a tag, or a second crosslinking moiety.
19. The method of claim 18, wherein the label is an imaging agent.
20. The method of claim 18, wherein the second crosslinking moiety is a protein cross-linking moiety, or an RNA cross-linking moiety, or a DNA cross-linking moiety, or combinations thereof.
21. The method of claim 1, wherein the functionalized scaffold is coupled to one or more CRISPR sequences or antibodies.
22. The method of claim 1, wherein isolating the target/scaffold complexes further comprises exposing the target/scaffold complex to a proteinase, a nuclease, a biotin-protein ligase enzyme, or other enzyme or condition forming a treated target/scaffold complex.
23. The method of claim 22, further comprising precipitating or isolating the treated target/scaffold complexes forming an isolated target/scaffold complex.
24. The method of claim 23, wherein precipitating or isolating further comprises contacting the treated target/scaffold complex with an affinity agent or an affinity agent ligand.
25. The method of claim 24, wherein the affinity agent or affinity agent ligand is an antigen, an antibody, an oligonucleotide probe, or an oligonucleotide primer.
26. The method of claim 22, wherein isolating the target/scaffold complexes further comprises fragmenting the target and ligating or modifying the resulting fragments.
27. The method of claim 22, wherein fragmenting a target comprises nuclease digestion.
28. The method of claim 27, wherein the nuclease is an exonuclease.
29. The method of claim 28, wherein the exonuclease is a micrococcal nuclease (MNase).
30. The method of claim 27, wherein the nuclease is an endonuclease.
31. The method of claim 30, wherein the endonuclease is a restriction endonuclease.
32. The method of claim 31, wherein the endonuclease is selected from MboI, Sau3AI, DpnII, BfuCI, MluCI, HpyCH4IV, AluI, FatI, NlaIII, CviAII, AciI, HpaII, MspI, MnlI, or BstUI.
33. The method of claim 26, wherein modifying the resulting fragments comprises conjugating a fragment to a probe or primer.
34. The method of claim 1, wherein isolating the target/scaffold complexes further comprises fragmenting the target and ligating a bivalent linker or an affinity tag to the target fragment crosslinked to the scaffold.
35. The method of claim 1, wherein isolating the target/scaffold complexes further comprises contacting the target/scaffold complex with an affinity agent that specifically binds a component or portion of the target.
36. The method of claim 1, wherein the target comprises DNA.
37. The method of claim 36, wherein the target biomolecule(s) are part of or associated with chromatin.
38. The method of claim 37, wherein the chromatin is in situ.
39. The method of claim 37, wherein the chromatin is in a cell.
40. The method of claim 37, wherein the cell is a diseased or pathologic cell.
41. The method of claim 37, wherein the cell is a cancer cell.
42. The method of claim 37, further comprising fixing the cell prior to contacting the cell with functionalized scaffolds.
43. The method of claim 37, wherein the cell is fixed with formaldehyde.
44. The method of claim 42, further comprising unfixing the cell after formation of target/scaffold complexes.
45. The method of claim 36, wherein identifying a DNA target comprises sequencing DNA targets isolated from target/scaffold complexes.
46. The method of claim 1, wherein the target biomolecule(s) comprise RNA.
47. The method of claim 46, wherein the RNA is labeled with a nucleotide specific agent.
48. The method of claim 47, wherein the nucleotide specific agent is a modified kethoxal bearing a functional tag.
49. The method of claim 47 wherein the nucleotide specific agent is further modified with a crosslinking moiety.
50. The method of claim 49, wherein the crosslinking moiety is azide.
51. The method of claim 46, wherein the functionalized scaffold comprises photoactivatable crosslinking groups that crosslink to azide-modified kethoxal upon photoactivation.
52. The method of claim 47, wherein the cross-linking moiety is coupled to the scaffold.
53. The method of claim 46, wherein the target further comprises DNA and/or protein.
54. The method of claim 46, wherein the target is an RNA interactome.
55. The method of claim 1, wherein the target comprises a polypeptide.
56. The method of claim 55, wherein identifying a polypeptide target comprises immunoblotting the target biomolecule(s) or fragments thereof from the isolated target/scaffold complexes.
57. A chromatin mapping method comprising: (a) contacting a chromatin target with a functionalized scaffold to form a chromatin/scaffold mixture; (b) exposing the chromatin/scaffold mixture to an activator to activate the cross-linking moieties of the functionalized scaffold and form a cross-linked chromatin/scaffold complex; (c) isolating the chromatin/scaffold complexes; (d) identifying chromatin loci from the isolated chromatin/scaffold complexes.
58. The method of claim 57, wherein the scaffold is a dendrimer or nanoparticle.
59. The method of claim 57, wherein the scaffold has a diameter of 1 to 100 nm.
60. The method of claim 57, wherein the target is a molecular complex comprising one or more nucleic acids, one or more polypeptides, or one or more nucleic acids and one or more polypeptides.
61. The method of claim 57, wherein the chromatin is in situ.
62. The method of claim 57, wherein the chromatin is in a cell.
63. The method of claim 62, further comprising fixing the cell prior to contacting the cell with functionalized scaffold.
64. The method of claim 63, wherein the cell is fixed with formaldehyde.
65. The method of claim 63, further comprising unfixing the cell after formation of target/scaffold complexes.
66. The method of claim 57, wherein the activator is light.
67. The method of claim 66, wherein the light is ultraviolet light.
68. The method of claim 66, wherein the light has a wavelength of about 350 to 375 nm.
69. The method of claim 66, wherein the light comprises a wavelength of 365 nm.
70. The method of claim 57, wherein isolating the target/scaffold complexes further comprises exposing the target/scaffold complex to a proteinase, a nuclease, or other enzyme forming a treated target/scaffold complex.
71. The method of claim 70, further comprising precipitating or isolating the treated target/scaffold complexes forming an isolated target/scaffold complex.
72. The method of claim 71, wherein isolating the target/scaffold complexes further comprises fragmenting the target and conducting proximal ligation of the resulting fragment.
73. The method of claim 72, wherein fragmenting a DNA target comprises endonuclease or exonuclease digestion.
74. The method of claim 57, wherein isolating the target/scaffold complexes further comprises fragmenting the target and ligating a bivalent linker to the target fragment crosslinked to the scaffold.
75. The method of claim 57, wherein isolating the target/scaffold complexes further comprises contacting the target/scaffold complex with an affinity agent that specifically binds a component or portion of the target.
76. The method of claim 75, wherein the affinity agent is a nucleic acid probe.
77. The method of claim 57, wherein identifying a DNA target comprises sequencing the targets from the isolated target/scaffold complexes.
78. The method of claim 57, wherein identifying a polypeptide in the target comprises immunoblotting the targets from the isolated target/scaffold complexes.
79. A method comprising: (a) contacting a chromatin target with a functionalized scaffold comprising activatable cross-linking moieties to form a chromatin target/scaffold mixture, wherein the scaffold is also coupled to an avidity tag; (b) exposing the chromatin target/scaffold mixture to an activator to activate the cross-linking moieties of the functionalized scaffold and form a cross-linked target/scaffold complex; (c) contacting the chromatin target with an affinity agent that binds a chromatin associated protein, wherein the affinity agent is coupled to an avidity tag modification agent, wherein the avidity tag modification agent when brought into proximity to an avidity tag modifies the avidity tag, forming an isolatable chromatin/scaffold complex; (d) isolating the chromatin target/scaffold complexes via the avidity tag; and (e) identifying portions of nucleic acid associated or linked with the isolatable chromatin/scaffold complex.
80. The method of claim 79, wherein the avidity tag is a biotinylation substrate.
81. The method of claim 79, wherein the avidity tag modification agent is a biotin-protein ligase.
82. The method of claim 79, wherein a DNA crosslinking reagent is used.
83. A method comprising: (a) contacting an RNA-binding target or protein target in proximity to RNA with a functionalized scaffold comprising activatable cross-linking moieties to form a target/scaffold mixture, wherein the scaffold is also coupled to an avidity tag; (b) exposing the target/scaffold mixture to an activator to activate the cross-linking moieties of the functionalized scaffold and form a cross-linked target/scaffold complex; (c) contacting the target with an affinity agent that binds an RNA-binding protein or protein target in proximity to RNA, wherein the affinity agent is coupled to an avidity tag modification agent, wherein the avidity tag modification agent when brought into proximity to an avidity tag modifies the avidity tag, forming an isolatable target/scaffold complex; (d) isolating the target/scaffold complexes via the avidity tag; and (e) identifying portions of the RNAs that are associated or linked with the isolatable target/scaffold complex.
84. The method of claim 83, wherein the avidity tag is a biotinylation substrate.
85. The method of claim 83, wherein the avidity tag modification agent is a biotin-protein ligase.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0038] The following drawings form part of the present specification and are included to further demonstrate certain aspects of the compositions and methods. Certain embodiments may be better understood by reference to one or more of these drawings in combination with the detailed description of the specification.
DETAILED DESCRIPTION OF THE INVENTION
[0064] CAP-C represents a new method for studying chromatin architecture, as well as other molecular complexes. CAP-C utilizes a multifunctional capture agent or scaffold (e.g., dendrimer) platform instead of DNA-bound proteins to crosslink DNA, achieving informative spatial chromatin organization at higher resolution than in situ Hi-C. The high resolution achieved with CAP-C is not completely dependent on the sequencing depth but stems from its ability to preserve abundant informative short-range (1-20 Kb) chromatin contacts.
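As an illustrative sketch only, the proportion of short-range contacts in a contact list could be tallied as follows. The function name, the input format (pairs of base-pair coordinates on one chromosome), and the example coordinates are assumptions made for illustration and are not data from the present disclosure.

```python
from typing import Iterable, Tuple


def short_range_fraction(
    contacts: Iterable[Tuple[int, int]],
    min_bp: int = 1_000,
    max_bp: int = 20_000,
) -> float:
    """Return the fraction of contacts whose separation lies in [min_bp, max_bp]."""
    total = 0
    short = 0
    for pos_a, pos_b in contacts:
        total += 1
        if min_bp <= abs(pos_b - pos_a) <= max_bp:
            short += 1
    return short / total if total else 0.0


# Hypothetical intrachromosomal contact pairs (bp coordinates).
example_contacts = [
    (10_000, 12_500),    # 2.5 Kb apart: short-range
    (50_000, 65_000),    # 15 Kb apart: short-range
    (100_000, 900_000),  # 800 Kb apart: long-range
    (5_000, 5_400),      # 0.4 Kb apart: below the 1 Kb lower bound
]
fraction = short_range_fraction(example_contacts)
```

The default bounds follow the 1-20 Kb range stated above; other bounds can be passed to compare libraries at different length scales.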
[0065] CAP-C offers several distinct advantages over conventional 3C-based methods. For chromatin packed in a highly crowded environment, DNA-bound proteins block access to DNA motifs, impeding efficient restriction digestion and subsequent ligation in conventional 3C. In CAP-C these proteins are stripped away before restriction enzyme digestion, exposing all potential restriction sites and favoring ligation of proximal contacts at all length scales. Unlike conventional 3C, CAP-C can also reveal DNA-DNA interactions that are not mediated by protein complexes. The association of proximal DNA contacts within the same capture agent can facilitate derivation of locus-specific interactomes by enrichment of DNA bait without ligation.
[0066] The CAP-C strategy is not limited to studying chromatin structure via proximity ligation and high-throughput sequencing. Crosslinked DNA-capture agent complexes, which preserve intact chromatin structure, could be purified and coupled with other downstream methods such as electron microscopy or fluorescence microscopy to directly visualize native chromatin structure at high resolution. In addition, the surface-exposed amines can be functionalized with crosslinking groups for RNA and protein, allowing broad application of the strategy to study all potential interactions among large biomolecules.
[0067] A. Capture Agents
[0068] Capture agents or functionalized scaffolds are reagents that have a plurality of extensions or arms that can be independently functionalized with a functional group. The extensions or arms can be linkers (e.g., a polymeric chain). The capture agents have a particular size, reach, or distance between two functional groups. The functional groups have a chemical or physical characteristic for binding or capturing a target that is in the physical proximity of the capture agent. The physical distance between two targets determines whether both can interact with the same capture agent. The smaller the physical distance, the smaller the capture agent that is needed.
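The relationship between target spacing and capture agent reach can be sketched as a simple distance test. This is a deliberately simplified geometric model: the function name, the coordinates, and the reach values are hypothetical, and being within reach is treated as necessary but not sufficient for actual crosslinking, which also depends on accessibility and chemistry.

```python
from itertools import combinations
from math import dist


def co_captured_pairs(targets, reach_nm):
    """Return name pairs whose 3D separation (nm) is at most reach_nm."""
    pairs = []
    for (name_a, xyz_a), (name_b, xyz_b) in combinations(sorted(targets.items()), 2):
        if dist(xyz_a, xyz_b) <= reach_nm:
            pairs.append((name_a, name_b))
    return pairs


# Hypothetical target positions in nanometres.
targets = {
    "locus_A": (0.0, 0.0, 0.0),
    "locus_B": (3.0, 0.0, 0.0),
    "locus_C": (0.0, 8.0, 0.0),
}

small = co_captured_pairs(targets, reach_nm=4.0)   # only the closest pair
large = co_captured_pairs(targets, reach_nm=10.0)  # all three pairs
```

In this toy model a scaffold with a 4 nm reach bridges only locus_A and locus_B, while a 10 nm reach also bridges the more distant pairs.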
[0069] In certain aspects a capture agent can have at least 2, 3, 4, 5, 6, 7, 8, 9, 10 or more functional groups. The functional groups can have various properties that can be utilized to capture 2, 3, 4, 5, 6, 7, 8, 9, 10 or more targets, as long as the targets are compatible with the capture agent and in physical proximity. The functional groups can be or terminate in an activatable cross-linking moiety (cross-linking moiety). A cross-linking moiety can be coupled to an extension or arm of the capture agent, with different arms being coupled to the same or different crosslinking moiety or functional group. Activatable cross-linking moieties can be activated by a variety of treatments and environmental conditions. In particular aspects the cross-linking moiety can be activated by light, temperature, or pH. In a capture agent, 30, 40, or 50 to 60, 70, or 80% of the termini of the arms or linkers can be functionalized with a cross-linking moiety or other functional group. In certain aspects a capture agent can have 5 to 125 crosslinking moieties or functional groups. In particular aspects the capture agent has 10 to 50 crosslinking moieties.
[0070] One cross-linking moiety can be psoralen. Psoralen (7H-furo[3,2-g]chromen-7-one) is the parent compound in a family of natural products known as furocoumarins. It is structurally related to coumarin by the addition of a fused furan ring.
##STR00001##
[0071] In particular aspects the scaffold or capture agent is a dendrimer or a nanoparticle. In certain aspects the scaffold or capture agent has an effective diameter of about, at least about, or at most about 1, 50, 100, 150, 200, 250, 300, 350, 400, 450 to 500 nm. An effective diameter is the length between functional groups that is indicative of the physical distance between two targets that are capable of reacting or interacting with the capture agent. The capture agent need not be spherical or circular.
[0072] In certain aspects a functional group can be a protein cross-linker, such as diazirine. In other aspects a functional group can be a nucleic acid cross-linker, such as psoralen. A capture agent can have 1, 2, 3, 4, or more cross-linking functional groups. In certain aspects one cross-linking agent can be a nucleic acid crosslinking agent and a second cross-linking agent can be a polypeptide cross-linking agent. In another aspect one cross-linking agent can be a polypeptide crosslinking agent and a second cross-linking agent is a different polypeptide cross-linking agent. In still another aspect one cross-linking agent can be a nucleic acid crosslinking agent and a second cross-linking agent can be a different nucleic acid cross-linking agent. In particular embodiments the cross-linking moiety is psoralen.
[0073] Capture agents can further comprise a label or labeling moiety. The labeling moiety can be biotin, AVI tag, V5 tag, Myc tag, HA tag, NE tag, hexa histidine tag, calmodulin tag, polyglutamate tag, E tag, or Flag tag. In certain aspects the labeling moiety is biotin or AVI tag.
[0074] The term “dendrimer” is derived from its tree-like branching structure and refers to a hyper-branched polymer. A dendrimer for proximity capture comprises a core and a plurality of repeating units, wherein at least one activatable cross-linking moiety is coupled to a subpopulation of the repeating units. In certain aspects at least one photoactivatable cross-linking moiety is coupled to a subpopulation of the repeating units. In certain embodiments, dendrimers, such as PAMAM dendrimers, allow precise control of the spherical polymer size, with different sized dendrimers serving as “molecular rulers” that fit chromatin conformations of various densities and potentially “measure” the physical distances between two genomic loci. Different sized dendrimers offer an opportunity to discern open and closed chromatin at high resolution. Small dendrimers such as G3 favor tightly compacted, closed chromatin regions, whereas open chromatin regions are packed loosely and enrich for large dendrimers. Larger dendrimer platforms can be used to probe interactions at large scale to investigate potential communications between chromosome territories.
[0075] A dendrimer comprises a dendrimer core. In certain embodiments a dendrimer core can be propargylamine, ethylenediamine, triethanolamine, pentaerythritol, azido-propyl(alkyl)amine, hydroxyethyl(alkyl)amine, tetraphenyl methane, trimesoylchloride, diamino hexane, diaminobutane, cystamine, or propylenediamine. In particular aspects the dendrimer core is ethylenediamine. The dendrimer further comprises repeating units or arms. In certain aspects the repeating unit is propargylamine, ethylenediamine, triethanolamine, pentaerythritol, propylamine, propyleneimine, azido-propyl(alkyl)amine, hydroxyethyl(alkyl)amine, tetraphenyl methane, trimesoylchloride, diamino hexane, diaminobutane, cystamine, propylenediamine, or lysine. In particular aspects the repeating unit is amidoamine. A dendrimer can have 1, 3, 7, 15, 31, 63, 127, 255, 511 or more repeating units. In certain aspects the dendrimer has a diameter of about 2 to about 10 nm. It is specifically contemplated that one or more of the aspects discussed herein may be excluded from an embodiment described.
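The repeating-unit counts listed above match the series 2^k - 1, which is the node count of a complete binary tree of depth k. Reading the list this way is an interpretation, not a statement from the disclosure. The sketch below reproduces that series and, for an idealized dendrimer with a four-armed core (as for ethylenediamine) and binary branching, estimates the number of surface groups per generation and how many would be functionalized at the 30 to 80% range given in paragraph [0069]. All function names are hypothetical, and real dendrimers have structural defects that this idealization ignores.

```python
from typing import List, Tuple


def repeating_unit_series(n_terms: int) -> List[int]:
    """Return 2**k - 1 for k = 1..n_terms, i.e. 1, 3, 7, 15, ..."""
    return [2**k - 1 for k in range(1, n_terms + 1)]


def surface_groups(generation: int, core_multiplicity: int = 4) -> int:
    """Terminal groups of an idealized dendrimer with binary branching."""
    return core_multiplicity * 2**generation


def functionalized_range(
    n_surface: int, low: float = 0.30, high: float = 0.80
) -> Tuple[int, int]:
    """Rounded count of termini functionalized at the low and high fractions."""
    return (round(n_surface * low), round(n_surface * high))


series = repeating_unit_series(9)
g3_surface = surface_groups(3)
g3_functionalized = functionalized_range(g3_surface)
```

Under these assumptions a generation 3 dendrimer presents 32 surface groups, of which roughly 10 to 26 would carry a cross-linking moiety or other functional group.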
[0076] In certain aspects the capture agent is a nanoparticle. A nanoparticle can be a silicon nanoparticle, a silicon dioxide nanoparticle, metallic (e.g., gold) nanoparticle, or quantum dot having a predetermined size distribution.
[0077] B. Proximity Capture of Nucleic Acids
[0078] Chemical Platform Assisted Proximity Capture (CAP-C) is described herein. In certain aspects the methods use, for example, a psoralen-functionalized capture agent (e.g., a dendrimer), or a capture agent bearing any other chemical functional group for DNA, RNA, or protein crosslinking for various applications, to crosslink chromatin that is in proximity. Dendrimers are repetitively branched polymers with multiple amines on their surface. They serve as a substitute for protein to covalently crosslink DNA that is in proximity, forming a stable dendrimer-DNA complex through photo-induced cycloaddition between thymine on the DNA strand and psoralen on the dendrimer. In certain aspects a nanoparticle or similar agent presenting a plurality of arms for functionalization can be substituted for the dendrimer. The crosslinked DNA can be purified, making the subsequent restriction digestion and re-ligation much more efficient. The capture agent-DNA complexes are then purified and sheared by sonication. The ligated chimeric DNA fragments are pulled down and subjected to high-throughput sequencing. It is also contemplated that the methods can be coupled with Cryo-EM (Li et al., Nature methods 10(6):584-90, 2013) so that the native chromatin structure preserved by capture agent crosslinking can be observed at high resolution. Dendrimers are “grown” off a central core in an iterative manufacturing process, with each subsequent step representing a new “generation” of dendrimer. Increasing generations produce larger molecular diameters, and each generation of PAMAM dendrimer has a defined size. In this way, dendrimers of different sizes can serve as a “molecular probe” to measure the physical distance between certain genomic loci, making them a powerful tool to study how chromosomes fold in nature and to reconstruct a 3D model of chromatin structure.
Moreover, packing and folding of the chromatin fiber can lead to co-localization of a given pair of loci, which can be determined by other (nearby) specific long-range interactions or other constraints, or can be due to random (nonspecific) collisions in the crowded nucleus. Such “DNA-DNA” interactions present difficulties in 3C-type experiments, which rely on “protein-DNA” crosslinking. In addition, the motifs recognized by restriction enzymes are randomly shielded by histones or other DNA-binding proteins. CAP-C, however, bridges two DNA elements directly, preserving co-localizations that are mediated only indirectly through protein binding, and bypasses the incomplete digestion that results from protein occupancy, allowing chromatin interactions to be mapped at higher resolution.
[0079] The inventors validated this method by performing proximity ligation without addition of a capture agent (e.g., dendrimer) or without UV crosslinking. Under either condition, the dendrimer is not crosslinked to the proximal DNA strand. The inventors then performed ligation after protein digestion and subjected the ligated nucleic acids to high-throughput sequencing. The results show that without crosslinking, the long-range DNA interactions were diminished as compared to Hi-C, whereas the dendrimer crosslinked samples showed a pattern similar to that of Hi-C. The inventors then mapped the contacts to the MboI digested fragments and found that less than 7% mapped to different fragments in the “no UV” and “no dendrimer” controls, with a large proportion of contacts (>50%) mapped to the same fragment. By contrast, the dendrimer crosslinked library showed more than 50% distinct-fragment ligation. This result demonstrates that only the chemical crosslinking preserves the native chromatin interactions, and thus validates the feasibility of this method for studying chromatin conformation.
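The fragment-level comparison described above can be sketched as follows. This is a minimal illustration, not the inventors' pipeline: the function names, the toy sequence, and the contact pairs are hypothetical. It relies only on the fact that MboI recognizes GATC and cuts immediately 5' of the G, so each GATC start position begins a new fragment.

```python
from bisect import bisect_right

MBOI_SITE = "GATC"  # MboI recognition sequence; cleavage occurs just before the G


def cut_positions(sequence, site=MBOI_SITE):
    """Return the 0-based start index of every occurrence of the site."""
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start)
        start = sequence.find(site, start + 1)
    return positions


def fragment_index(position, cuts):
    """Index of the restriction fragment that contains a 0-based position."""
    return bisect_right(cuts, position)


def distinct_fragment_fraction(pairs, cuts):
    """Fraction of contact pairs whose two ends fall in different fragments."""
    if not pairs:
        return 0.0
    distinct = sum(
        1 for a, b in pairs if fragment_index(a, cuts) != fragment_index(b, cuts)
    )
    return distinct / len(pairs)


# Toy reference with GATC sites at positions 2, 10, and 18.
toy_sequence = "AAGATCTTTTGATCCCCCGATCAA"
cuts = cut_positions(toy_sequence)

# Hypothetical contact pairs given as 0-based positions of the two read ends.
toy_pairs = [(3, 8), (4, 12), (0, 20), (11, 15)]
fraction = distinct_fragment_fraction(toy_pairs, cuts)
```

A low fraction would correspond to the same-fragment (self-ligation) pattern reported for the controls, and a high fraction to the distinct-fragment ligation reported for the crosslinked library.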
[0080] Moreover, psoralen can interact with structured double-stranded RNA, allowing this crosslinking strategy to be expanded to investigate all possible interactions among nucleic acids. By functionalizing the capture agent with a biotin handle in conjunction with crosslinking, the crosslinked RNA-dendrimer complexes can be purified, isolated, and subjected to high-throughput sequencing. In this way, all possible RNA species that are spatially in proximity can be identified. Previous methods such as SPLASH and PARIS are limited by their use of “zero length” crosslinkers such as AMT, a derivative of psoralen. Such a strategy can only crosslink regions of RNA strands that are reverse complements of each other. Here, by modifying the dendrimer surface with psoralen and using dendrimers of different sizes, one is able to probe more inter-RNA interactions over longer distances, as well as previously identified intramolecular RNA structures. In addition, with the help of 3D RNA-FISH, the inventors could validate some of these interactions in vivo. By combining information collected using different sizes of dendrimers, it is possible to map a spatially dependent RNA interactome.
[0081] Recent studies have revealed that transcription is more prevalent than previously expected. Apart from protein-coding mRNAs, a number of long non-coding RNAs (lncRNAs) and enhancer RNAs are known to be transcribed and to play vital roles in gene regulation as well as in shaping chromatin higher-order structure. Previous methods, including GRID-seq, ChIRP and CHART, relied heavily on small molecule crosslinkers, providing resourceful yet limited information regarding the chromatin-RNA interactome. Substituting the chemical platform crosslinking strategy described herein allows comprehensive localization of all or most potential chromatin-interacting RNAs in an unbiased fashion. This method includes first crosslinking DNA and RNA that are in proximity with an appropriately functionalized dendrimer (e.g., a psoralen functionalized dendrimer), and removing the associated proteins. The purified complex can be further fragmented by a restriction enzyme, after which a bivalent linker is ligated to the RNA and reverse transcription is initiated. After removal of excess linker, in situ DNA ligation can be performed. The ligation product can be subjected to paired-end deep sequencing. The read pairs can then be aligned to different regions of the genome to investigate the chromatin-RNA interactome.
[0082] CAP-C with formaldehyde crosslinking. Cells are grown under appropriate culture conditions. Adherent cells can be detached by centrifugation and resuspended. The cells can be treated with formaldehyde. Cells can be isolated, lysed, and contacted with a dendrimer followed by photo crosslinking the nuclei. Photo crosslinked nuclei can be treated with an appropriate proteinase.
[0083] CAP-C without formaldehyde crosslinking. Cells are grown under appropriate conditions. Adherent cells can be detached by centrifugation and resuspended. Cells can be isolated, lysed, and contacted with a dendrimer followed by photo crosslinking the nuclei. Photo crosslinked nuclei can be treated with an appropriate proteinase.
[0084] After crosslinking with or without formaldehyde, DNA can be extracted and isolated. Isolated DNA can be treated with an endonuclease or an endo-exonuclease, e.g., MNase, and the nuclease-treated DNA isolated. DNA ends are repaired. Repaired DNA is treated with a biotin linker and excess biotin linker is removed. Biotinylated complexes are isolated and treated with a DNA ligase. To repair the ends of sheared DNA and remove biotin from unligated ends, the beads are resuspended and treated with DNA ligase and DNA polymerase (e.g., T4 DNA polymerase and/or DNA polymerase I, Large (Klenow) Fragment). Treated DNA is further treated in a ligation reaction, forming a library.
[0085] In certain aspects, nucleic acid cross-linkers include, but are not limited to psoralens, trioxsalen, methoxypsoralen, hydroxymethyl-4,5′,8-trimethylpsoralen, alkylating agents such as nitrogen mustards, cis-platin, chloroethyl nitroso urea, mitomycin C, bifunctional aldehydes, and bifunctional quinone methides.
[0086] C. Proximity Capture of Polypeptides
[0087] Instead of designing a capture agent for nucleic acid, one could also modify capture agents with protein cross-linkers, such as diazirine. Most protein-protein interactions are mediated by hydrophobic interactions and weak molecular bonds, including ionic bonds, van der Waals interactions, and hydrogen bonds. Typical protein immunoprecipitation methods suffer from loss of the binding target during the procedure because of these weak protein-protein interactions. Bivalent crosslinkers such as DSG or its derivatives have been developed to covalently crosslink a protein with its binding partners. However, such reactions require both substrates to contain a free amino group on the binding surface. Moreover, DSG is a highly reactive molecule and gradually degrades in aqueous solution, restricting its application to studying protein interactions ubiquitously. Diazirine, by contrast, forms a reactive carbene upon UV irradiation. It can capture a proximal carbon nearby and form a covalent bond, making it an appropriate reagent for protein crosslinking. Attaching it to the surface of a capture agent enables stabilization of weak protein interactions. In this way, one can fish out the pool of interacting or proximally located proteins via pulldown with antibodies specific to proteins of interest. The inventors could also incorporate other cross-linkers such as DSG in the dendrimers to make them multivalent, which would be much more efficient.
[0088] Beyond modification with a single type of cross-linker, the dendrimer has multiple branches available for functionalization with multiple cross-linkers. Here, the inventors could synthesize dendrimers bearing both psoralen and diazirine on the same dendrimer, which can crosslink with both nucleic acid and protein. This stabilizes dynamic nucleic acid-interacting proteins, because the dendrimer is covalently linked to both the protein and its interacting DNA or RNA. After pulldown with specific antibodies, the nucleic acids are then purified and subjected to library construction or other means of detection. These methods could serve as an improved version of current ChIP and CLIP with a high signal-to-noise ratio. In addition, the use of different sizes of dendrimers makes it possible to investigate proteins that have poor binding affinity for the nucleic acid. Conserved binding motifs are expected to be shared among different sizes of dendrimers, and the confidence decreases as the dendrimer grows bigger. This allows identification not only of a specific protein binding region but also of the distal loci that are looped together with it.
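One way to picture the comparison across dendrimer sizes is a set operation over the regions recovered with each size. The sketch below is an assumption-laden illustration: the function names, the region labels, and the nominal sizes (3, 5, and 8 nm) are hypothetical, and treating "detected with the smallest size" as a proxy for higher confidence is one possible reading of the paragraph above, not a stated scoring rule.

```python
def shared_regions(regions_by_size):
    """Regions detected with every scaffold size (candidate conserved sites)."""
    sets = [set(regions) for regions in regions_by_size.values()]
    return set.intersection(*sets) if sets else set()


def smallest_supporting_size(regions_by_size):
    """Map each region to the smallest scaffold size at which it was detected."""
    best = {}
    for size in sorted(regions_by_size):
        for region in regions_by_size[size]:
            best.setdefault(region, size)
    return best


# Hypothetical regions recovered with three dendrimer sizes (nominal nm).
regions_by_size = {
    3: {"chr1:1000-1200", "chr1:5000-5200"},
    5: {"chr1:1000-1200", "chr1:5000-5200", "chr1:90000-90200"},
    8: {"chr1:1000-1200", "chr1:90000-90200", "chr2:300-500"},
}

conserved = shared_regions(regions_by_size)
first_seen = smallest_supporting_size(regions_by_size)
```

Regions that first appear only with larger dendrimers are candidates for distal loci looped toward the binding site rather than direct binding regions.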
[0089] Protein cross-linkers include, but are not limited to, disuccinimidyl glutarate, disuccinimidyl suberate, disuccinimidyl tartrate, dimethyl adipimidate, dimethyl pimelimidate, dimethyl suberimidate, 1,5-difluoro-2,4-dinitrobenzene, N-maleimidopropionic acid hydrazide, 3-(2-pyridyldithio)propionyl hydrazide, bismaleimidoethane, diazirine, succinimidyl iodoacetate, N-maleimidoacetoxysuccinimide ester, and succinimidyl 3-(2-pyridyldithio)propionate.
[0090] D. Chromatin Immunoprecipitation and Crosslinking Sequencing (ChIP-Seq and CLIP-Seq)
[0091] One limitation of chromatin immunoprecipitation sequencing (ChIP-seq) is that it requires large amounts of input material and yields ‘averaged’ profiles that are insensitive to cellular heterogeneity. This is a major shortcoming given that cell-to-cell variability is inherent to most tissues and cell populations. Several methods have attempted to improve the current ChIP protocol and adapt it to small amounts of starting material. The inventors take advantage of the BirA enzyme, a biotin ligase that specifically transfers biotin to an AVI tag, and fuse this protein in vitro to a selected antibody. The capture agents can be modified with psoralen or other functional groups in combination with an AVI tag. The cells are fixed in situ and the capture agent (e.g., dendrimer) is introduced into the nucleus, followed by activation of a cross-linker. A BirA/antibody fusion is then used to target a protein of interest. After thoroughly washing away unbound antibody, biotin is supplied to initiate the transfer reaction. Since BirA only transfers biotin to a proximal AVI tag, those capture agents that bind next to the target protein will be labeled with biotin. The protein is then digested and the DNA is sheared. Fragments attached to biotin-labeled capture agents are then enriched and sequenced. This method can be sensitive enough to handle low numbers of cells because of the high sensitivity and affinity of the streptavidin-biotin interaction compared to regular antibody-antigen binding. This approach also offers a strategy to perform single-cell ChIP-seq, allowing barcoding of DNA from each cell along with pooling of hundreds to thousands of cells for pulldown and sequencing. The method can be adapted for single-cell ChIP-seq of a number of histone markers as well as other genomic features. With the same idea and effective crosslinking to RNA, similar methods can be performed for CLIP-seq or PAR-CLIP to study protein-RNA interactions.
[0092] In certain embodiments dendrimers carrying DNA (or RNA) crosslinkers are coupled to a specific antibody that recognizes a protein or histone marker of interest. The dendrimer can also be labeled or coupled to a tag or label, such as biotin. The cell is formaldehyde crosslinked, crosslinking DNA or RNA with proteins. The crosslinked cell lysate is decorated with the antibodies coupled to the functional dendrimers and is then subjected to a crosslinking activator. The antibody will recognize the protein in vitro and the dendrimers coupled to the antibody will crosslink to DNA or RNA bound by the protein (fixed by formaldehyde).
[0093] E. RNA Labeling Reagents
[0094] Certain embodiments are directed to RNA labeling reagents that can be used in conjunction with the methods described herein, particularly as a functional group attached to a capture agent described herein. An RNA reagent can include azido-kethoxal and related kethoxal derivatives. In certain aspects the azido-kethoxal and its derivatives are coupled to a functional tag. In certain aspects the RNA labeling reagent(s) have the chemical structure of
##STR00002##
[0095] The azide group can crosslink to the functional group of formula II present on the surface of a capture agent. In certain instances capture agents decorated with compounds of formula II are added at 4° C. or room temperature to cells with RNA labeled by formula I. The mixture is then incubated at higher temperature (37° C.) to initiate crosslinking. A compound of Formula II can also be generated upon photoactivation of formula III.
[0096] Kethoxal is known to efficiently label guanines in single-stranded RNA and DNA. Azido-kethoxal was designed for efficient labeling of ssRNA and ssDNA with an azido tag that can be crosslinked to formula II; formula II can be directly used, or can be generated through photo-activation of formula III.
[0097] Provided below is scheme 1 for the synthesis of azido-kethoxal.
##STR00003##
[0098] A compound of Formula I can be synthesized by adding 6 g sodium hydride and 50 mL THF to a 250 mL flask and keeping the reaction at 0° C. for 15 min. 8.7 g 2-azidoethanol was dissolved in 20 mL THF and subsequently added to the reaction dropwise. The reaction mixture was stirred at 0° C. for 15 min and then warmed to room temperature for 20 min. 27.15 g compound A was added to the reaction dropwise at 0° C., after which the reaction was warmed to room temperature and stirred overnight before it was quenched with H.sub.2O. The product was extracted from the mixture with diethyl ether and purified by column chromatography to yield compound B as a colorless liquid.
[0099] A compound of Formula II can be synthesized by adding 80 mL 1 M LiOH solution to 7.4 g compound B dissolved in 100 mL acetone. The reaction mixture was stirred at room temperature overnight and was subsequently quenched by adding HCl. The product was extracted from the mixture by diethyl ether and was purified by column chromatography to yield compound C as a colorless liquid.
[0100] A compound of Formula III can be synthesized by adding 1.59 g of a compound of Formula II to 20 mL dichloromethane, followed by dropwise addition of 1.90 g oxalyl chloride. The reaction was then stirred at room temperature for 2 h before the solvent was removed under vacuum. The residue was then dissolved in 50 mL acetonitrile and cooled to 0° C., and trimethylsilyldiazomethane was added slowly. The reaction was continued at 0° C. for 1 h, then slowly warmed to room temperature and stirred overnight. Solvent was then removed under vacuum and the product was isolated by column chromatography to yield compound D as a yellow oil.
[0101] Compound D was dissolved in acetone and 1.1 N fresh dimethyldioxirane was added in portions. The reaction was stirred at room temperature for 30 min and the solvent was removed under vacuum to yield azido-kethoxal as a yellow oil.
EXAMPLES
[0102] The following examples as well as the figures are included to demonstrate preferred embodiments of the invention. It should be appreciated by those of skill in the art that the techniques disclosed in the examples or figures represent techniques discovered by the inventors to function well in the practice of the invention, and thus can be considered to constitute preferred modes for its practice. However, those of skill in the art should, in light of the present disclosure, appreciate that many changes can be made in the specific embodiments which are disclosed and still obtain a like or similar result without departing from the spirit and scope of the invention.
Example 1
Chemical Platform Assisted Proximity Capture (Cap-C)
[0103] The inventors used CAP-C to analyze the mouse embryonic stem cell genome and uncovered two classes of chromatin domains, with one class anchored on Ctcf and cohesin binding sites while another class displayed plectoneme-like features previously only reported in prokaryotes and yeast. Further analyses revealed that chromatin domains could arise from writhe-like structures generated through transcription-induced supercoiling. The discovery of enrichment of condensin II at the boundaries of non-loop domains suggests that condensin loop extrusion contributes to generating non-loop domains. Beyond the enrichment of such architectural proteins, it was shown that transcription factors like YY1 could also be responsible for local enhancer-promoter contacts (Young et al., Cell 171(7), 1573-1588, 2017). These transcription activators or repressors could induce formation of local domains. Thus, CAP-C revealed previously unappreciated chromatin domains at high resolution in mammalian cells, and can be modified to illuminate interactions among other biomolecules, including RNA and proteins.
[0104] A. Results
[0105] CAP-C: a crosslinking strategy to study chromatin architecture. To establish an approach that captures proximal chromatin contacts at all length scales, the inventors utilized multifunctional PAMAM dendrimers that bear tens of crosslinking groups on the surface of polymer spheres with diameters ranging from 3-9 nm. PAMAM dendrimers are iteratively “grown” off a central core, with a new “generation” of dendrimer being synthesized at each subsequent step. Each generation of PAMAM dendrimer has a characteristic size and can be precisely tuned to control the number of surface amine groups, ranging from 16-256 amines (Astruc et al., Chemical Reviews 110:1857-1959, 2010). The inventors used psoralen, which crosslinks to double-stranded DNA (dsDNA) upon UV irradiation, to functionalize approximately half of the surface amine branches on generation G3, G5 and G7 PAMAM dendrimers, with diameters of 3.6 nm, 5.4 nm, and 8.1 nm, respectively. The remaining amine branches were masked with acetyl groups, making them inert to cellular interactions.
[0106] To investigate chromatin architecture using CAP-C, the inventors fix cells with formaldehyde to ensure that the subsequent application of dendrimers does not perturb native chromosome conformation. The inventors then diffuse dendrimers into the cell nucleus and expose these cells to UV irradiation.
[0107] The inventors conducted CAP-C using 10 μM G3 dendrimer with mouse embryonic stem cells (mESCs) and observed long-range chromatin contacts that required both the addition of dendrimer and UV irradiation.
[0108] Next, the inventors compared CAP-C with in-situ Hi-C using mESCs. The inventors sequenced a total of 4.24 billion paired reads from six CAP-C libraries, consisting of primary and replicate libraries for each of the G3 (1.44 billion total reads), G5 (1.40 billion total reads) and G7 dendrimers (1.40 billion total reads), as well as a primary and replicate library for in-situ Hi-C (2.59 billion total reads). CAP-C datasets were processed with a pipeline similar to that used for processing in-situ Hi-C libraries, followed by removal of PCR duplicates, uninformative reads, and reads with a low mapping quality that strongly indicates non-unique mapping (Table 1). The inventors also performed strand orientation analysis and removed interactions below 1 Kb, the distance above which the read orientations are roughly equal to within +/−1%.
TABLE-US-00001 TABLE 1 CAP-C Statistics
No. listed as contact pairs | G3 | G5 | G7 | CAP-C | in-situ Hi-C
Raw | 1,445,625,597 | 1,404,085,797 | 1,404,787,973 | 4,254,499,367 | 2,591,791,329
Aligned and Paired | 1,163,067,989 | 1,168,988,122 | 1,126,969,417 | 3,459,025,528 | 2,351,984,846
Removal of PCR duplicates | 933,818,854 | 824,770,123 | 944,984,475 | 2,703,573,452 | 2,188,090,255
Valid Pairs | 50.7% | 44.8% | 57.2% | 50.9% | 80.8%
Removal of unligated pairs | 732,893,486 | 628,770,646 | 804,592,946 | 2,166,257,078 | 2,093,692,590
QC Step
Removal of pairs <1 Kb | 638,816,625 | 543,095,316 | 726,331,665 | 1,908,243,606 | 1,977,866,798
Statistics (Valid as Denom.)
Intra | 445,964,733 | 376,168,868 | 522,183,654 | 1,344,317,255 | 1,342,445,880
Inter | 286,928,753 | 252,601,778 | 282,409,292 | 821,939,823 | 751,246,710
Intra/Inter (%) | 60.8%/39.2% | 59.8%/40.1% | 64.9%/35.0% | 62.0%/37.9% | 64.1%/35.8%
Short (<20 Kb) | 513,132,022 | 420,611,868 | 487,804,616 | 1,421,548,506 | 1,089,305,644
Long (>=20 Kb) | 219,761,464 | 208,158,778 | 316,788,330 | 744,708,572 | 1,004,386,946
Short/Long (%) | 70.0%/30.0% | 66.8%/33.1% | 60.6%/39.4% | 65.6%/34.4% | 52.0%/48.0%
MAPQ > 1 | 611,768,043 | 509,589,456 | 702,275,395 | 1,823,632,894 | 1,956,977,596
MAPQ > 30 | 566,582,697 | 468,055,862 | 661,863,785 | 1,696,502,344 | 1,888,920,169
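The percentage rows of Table 1 can be recomputed from the count rows. The short sketch below does this for the G3, G5 and G7 columns, assuming that "Valid Pairs" is the count remaining after removal of unligated pairs divided by the raw count, and that the intra and short percentages use that valid count as the denominator (as the "Valid as Denom." label indicates). The variable names are illustrative. The recomputed values agree with the table to within 0.1 percentage point; the table's rounding convention is not stated.

```python
# Counts copied from Table 1 (G3, G5 and G7 columns).
raw = {"G3": 1_445_625_597, "G5": 1_404_085_797, "G7": 1_404_787_973}
valid = {"G3": 732_893_486, "G5": 628_770_646, "G7": 804_592_946}
intra = {"G3": 445_964_733, "G5": 376_168_868, "G7": 522_183_654}
inter = {"G3": 286_928_753, "G5": 252_601_778, "G7": 282_409_292}
short = {"G3": 513_132_022, "G5": 420_611_868, "G7": 487_804_616}
long_ = {"G3": 219_761_464, "G5": 208_158_778, "G7": 316_788_330}


def pct(numerator, denominator):
    """Percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)


# Intra + inter and short + long should each partition the valid pairs.
partitions_ok = all(
    intra[g] + inter[g] == valid[g] and short[g] + long_[g] == valid[g]
    for g in raw
)

summary = {
    g: {
        "valid_pct": pct(valid[g], raw[g]),
        "intra_pct": pct(intra[g], valid[g]),
        "short_pct": pct(short[g], valid[g]),
    }
    for g in raw
}

if __name__ == "__main__":
    print("partitions consistent:", partitions_ok)
    for g, row in summary.items():
        print(g, row)
```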
TABLE-US-00002 TABLE 2 Mitochondria reads
chrM | CAP-C | in-situ Hi-C
Intra | 494,110 | 47,924
Inter | 2,103,932 | 1,765,589
Total | 2,598,042 | 1,813,513
TABLE-US-00003 TABLE 3 Primers designed to test MboI restriction sites
Primer Name | Sequence
1F | GATTTGCTCAGCAGATGGC (SEQ ID NO: 1)
1R | GCAAATGCCCAGAGGTTC (SEQ ID NO: 2)
2F | CTACCCAGAAACAGCAAGTG (SEQ ID NO: 3)
2R | TTTCTGTGTTGCTATTCGGTA (SEQ ID NO: 4)
3F | CATCAGATTAAGGGCGCCA (SEQ ID NO: 5)
3R | ACGCAGTAGGAGACCGG (SEQ ID NO: 6)
4F | GCTTTCCTCATGGAAATGC (SEQ ID NO: 7)
4R | CAGGCACAGCCTCGT (SEQ ID NO: 8)
5F | ACGTGGCTGAGGCTGA (SEQ ID NO: 9)
5R | TCTCTGGCTCACTCACC (SEQ ID NO: 10)
6F | TTCTCTCATCTGCACCGG (SEQ ID NO: 11)
6R | CAGGCGGAAGTGACGT (SEQ ID NO: 12)
7F | AGGACCATCTGTGCACGGAG (SEQ ID NO: 13)
7R | GGTTACGCATGCAGAGCC (SEQ ID NO: 14)
8F | CACCCCAAGGGCTTAGA (SEQ ID NO: 15)
8R | AAGGATGCTCCACCACC (SEQ ID NO: 16)
9F | CATAGACGAGTCATTGTTTCG (SEQ ID NO: 17)
9R | GCCCTCTGGTGGAGACAT (SEQ ID NO: 18)
10F | CCAGAGGCTGTGGCTTC (SEQ ID NO: 19)
10R | CAAGAGACAGCTAAATCAGGGT (SEQ ID NO: 20)
11F | CTTATCGACTGTTGCCATGG (SEQ ID NO: 21)
11R | CTTAGCCTTGGTATCAACTGG (SEQ ID NO: 22)
12F | GGGAGTAGAAAGAAGGCCC (SEQ ID NO: 23)
12R | GGCATTTCACCTCACTGCA (SEQ ID NO: 24)
13F | TCTATAAAGATGCCTCTGAGGT (SEQ ID NO: 25)
13R | TCTGCTTCATTGAGAATTTACAG (SEQ ID NO: 26)
14F | TATAGGTTGGCTCCAAGCTCT (SEQ ID NO: 27)
14R | GATACCTGATTCAGATGGTGCA (SEQ ID NO: 28)
15F | GACATCCTGCCTTCCCTG (SEQ ID NO: 29)
15R | GTGGGTCAAGTTCTCAATGG (SEQ ID NO: 30)
16F | TACTGACTCTGACACCAGATG (SEQ ID NO: 31)
16R | CATGAGACTGACTTAAGCATCT (SEQ ID NO: 32)
TABLE-US-00004 TABLE 4 Published Datasets
Description | Reference | Availability
H3K4me3 | Mouse ENCODE | ENCODE consortium
H3K9me3 | Mouse ENCODE | ENCODE consortium
H3K27ac | Mouse ENCODE | ENCODE consortium
H3K27me3 | Mouse ENCODE | ENCODE consortium
H3K36me3 | Mouse ENCODE | ENCODE consortium
Pol2ra | Mouse ENCODE | ENCODE consortium
H3K9me2 | Liu et al. | GSE54412
Ctcf | Hansen et al. | GSE90994
Rad21 | Hansen et al. | GSE90994
Smc1 | Kagey et al. | GSE22557
Smc3 | Kagey et al. | GSE22557
Med1 | Kagey et al. | GSE22557
Med12 | Kagey et al. | GSE22557
PRO-seq | Engreitz et al. | GSE85798
Smc3 ChIA-PET | Dowen et al. | GSE57913
Smc3 HiChIP | Mumbach et al. | GSE80820
ATAC-seq | Wu et al. | GSE66390
TADs annotations | Dixon et al. | chromosome.sdsc.edu/mouse/hi-c/mESC.doma.tar.gz
ChromHMM states | Bogu et al. | github.com/gireeshkbogu/chromatin-states_chromHMM_mm9
Replication Timing | Hiratani et al. | replicationdomain.com/data.php
[0109] CAP-C revealed finer local chromatin structures than in-situ Hi-C. The inventors hypothesized that dendrimer crosslinkers of different sizes would capture distinct spatial relationships at different length scales. Indeed, the smallest dendrimer, G3, strongly crosslinked loci 1 to 5 Kb apart, whereas G5 and G7 dendrimers preferentially crosslinked loci 5 to 20 Kb apart. The total chromatin contacts between 1-20 Kb captured by merging all dendrimer data were 2- to 3-fold greater than for in-situ Hi-C.
[0110] In contrast to higher-order chromatin structures that have been studied extensively by Hi-C, enrichment of short-range CAP-C contacts allowed the inventors to better resolve new features of the genome at shorter length-scales. For comparison, contact maps of merged CAP-C and in-situ Hi-C datasets with similar depths (1.90 billion vs 1.98 billion) were plotted over a 70 Kb region (chr4:129.58-129.65 Mb) encompassing 6 different genes at 1 Kb resolution.
[0111] At similar sequencing depths, high-resolution peak calling using HiCCUPS yielded more peaks (2.5-fold) with merged CAP-C contact maps than with in-situ Hi-C libraries. Proportionally, peaks smaller than 100 Kb were 1.4-fold enriched in CAP-C relative to in-situ Hi-C (Fisher's Exact Test, P<0.0001).
[0112] Different sized dendrimers probe different chromatin compartments. Different sized dendrimers might also access and probe distinct regions of chromatin compaction. This would be revealed by dendrimer size-dependent enrichment of interactions in distinct regions. Using principal component analysis on the pixel values of the G3, G5 and G7 contact maps, the inventors determined the eigenvector with the highest eigenvalue and plotted a 2D map, which the inventors named a “dendrimer map”, based on the eigenvector values of the 1.sup.st principal component. At multiple resolutions (500 Kb, 100 Kb, 10 Kb and 5 Kb), the 1.sup.st principal component tended to explain 90-95% of the variance, compared with 50% for a random contact map. Most importantly, these “dendrimer maps” showed bifurcation similar to that of compartment intervals identified previously (Lieberman-Aiden et al., Science 326, 289-93, 2009).
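One way to read the "dendrimer map" construction is sketched below. It assumes that each pixel of the contact map is treated as one observation with three features (its G3, G5 and G7 values), that the first principal component is found from the 3×3 covariance matrix, and that each pixel's score on that component is plotted back at its map position. This reading, the function names, and the use of power iteration instead of a library eigen-solver are all assumptions for illustration, not the inventors' stated implementation.

```python
import math


def first_principal_component(rows, iters=100):
    """Return (unit PC1 vector, column means, fraction of variance explained).
    Assumes at least two rows and that the data are not constant."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[k] for r in rows) / n for k in range(d)]
    centered = [[r[k] - means[k] for k in range(d)] for r in rows]
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    # Power iteration converges to the eigenvector with the largest eigenvalue.
    v = [1.0 / (k + 1) for k in range(d)]
    for _ in range(iters):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    top = sum(v[a] * sum(cov[a][b] * v[b] for b in range(d)) for a in range(d))
    total = sum(cov[a][a] for a in range(d))
    return v, means, top / total


def dendrimer_map(g3, g5, g7):
    """Project every pixel's (G3, G5, G7) triple onto PC1 and return the
    scores as a square grid, plus the fraction of variance PC1 explains."""
    size = len(g3)
    pixels = [(g3[i][j], g5[i][j], g7[i][j])
              for i in range(size) for j in range(size)]
    pc1, means, explained = first_principal_component(pixels)
    scores = [sum((p[k] - means[k]) * pc1[k] for k in range(3)) for p in pixels]
    grid = [scores[i * size:(i + 1) * size] for i in range(size)]
    return grid, explained
```

The sign of a principal component is arbitrary, so any downstream comparison with compartments would need a sign convention, such as the gene-density rule the inventors apply to the "CAP-C eigenvector".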
[0113] To validate the above hypothesis, the inventors produced a “CAP-C eigenvector”, similar to the eigenvector constructed previously for determining compartments, by performing principal component analysis on the row sums, instead of the pixels, of all three dendrimer contact maps, and arbitrarily assigned positive values to regions that are gene-rich. Indeed, these “CAP-C eigenvectors” showed good positive correlations with compartment intervals derived from the eigenvector analysis on in-situ Hi-C at 500 Kb resolution (Pearson's R=0.861) and with replication timing data from RepliSeq experiments in mESC (Pearson's R=0.850) (Hiratani et al., PLoS Biol. 6, e245, 2008), as well as a moderately negative correlation with H3K9me2 ChIP-Seq (Pearson's R=−0.329) (Liu et al., Genes Dev. 29, 379-93, 2015), a histone modification mark for constitutive heterochromatin in mESC.
[0114] The inventors next inspected the “dendrimer maps” at 5 Kb resolution to reveal additional compartment details that were missed in previous low-resolution Hi-C experiments.
[0115] In summary, the above analyses confirmed that smaller G3 dendrimers preferentially crosslink tightly packed heterochromatin in B compartments, whereas the larger G5 and G7 dendrimers tend to capture chromatin contacts in the open and gene-rich compartments.
[0116] Two types of chromatin domains with different boundary properties. Given that our dendrimer maps showed high correlation between transcription and genome segregation, the inventors investigated how transcription affects the formation of the contact domains the inventors discovered. Recent studies using biophysical models have proposed different mechanisms to explain the self-associating and insulating properties of chromosomal domains in prokaryotes as well as in mammals. In model organisms such as C. crescentus and S. pombe, which lack Ctcf, polymer models attribute transcription-induced supercoiling as the force responsible for conformational changes in the form of writhes termed plectonemes. Boundaries of these domains, generically termed chromosomal interacting domains (CIDs), span the transcriptional start sites of active genes. On the contrary, the detection of TADs enriched with Ctcf at its boundaries in low-resolution maps, followed by the identification of Ctcf-cohesin-mediated loops and loop-domains in high-resolution maps, suggested that loop extrusion might be responsible for chromatin organization in mammals. However, the loop extrusion model may not explain the self-association property in large TADs unless supercoiling is taken into account. Hence, it is not entirely clear whether chromatin loop domains form in mammals exclusively via the loop-extrusion model, or whether multiple mechanisms underlie loop domain formation. To further complicate matters, only 30% of our high-resolution contact domains show loops at the corners of loop-domains and 65% of the same contact domains overlap Ctcf (+/−10 kB), implying that not all Ctcf-enriched boundaries form loops. 
In the high-resolution maps, the inventors noticed that a substantial proportion of contact domains called at high resolution had boundaries starting close to the promoters of short active protein-coding genes, which terminate either at their own transcription end sites (TES) or half-way through the gene body of another gene.
[0117] The boundaries of domains starting at active promoter regions have been previously characterized in S. cerevisiae and recently observed in mESCs. The associations of CAP-C loops with histone modifications and transcription factor features around their anchor points suggest that the additional loops captured in CAP-C are not artifacts but are functionally similar to loops identified in in-situ Hi-C.
[0118] To study the possible mechanisms separating the two types of domains, the inventors next overlapped domain boundaries and domain bodies with a series of histone modification marks. To account for the long-tailed size distribution of some of these domains, and the relatively smaller peaks generally associated with histone modification marks and transcription factors, the inventors extracted only signals +/−2 Kb around the boundary, and signals from 5-95% of the domain body. As expected, loop domains showed stronger Ctcf and cohesin signals than non-loop domains at their boundaries. However, some of the non-loop domain boundaries are also enriched with Ctcf and cohesin binding, suggesting that not all Ctcf- and cohesin-enriched domain boundaries form loops. Conversely, non-loop domains exhibit stronger H3K4me3, H3K27ac, PolII and Top2b signals than loop domains at their boundaries.
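The windowing described above (signals within +/−2 Kb of each boundary, and within the central 5-95% of the domain body) can be sketched as follows. The function names, the half-open interval convention, and the representation of a signal track as `(position, value)` tuples are assumptions made for this illustration.

```python
def boundary_windows(start, end, flank=2000):
    """Half-open windows of +/- flank bases around each domain boundary."""
    return [
        (max(0, start - flank), start + flank),
        (max(0, end - flank), end + flank),
    ]


def body_window(start, end, lower_pct=5, upper_pct=95):
    """Half-open window covering lower_pct..upper_pct of the domain body."""
    length = end - start
    return (start + length * lower_pct // 100, start + length * upper_pct // 100)


def signal_in_window(peaks, window):
    """Sum the values of (position, value) peaks falling inside the window."""
    lo, hi = window
    return sum(value for pos, value in peaks if lo <= pos < hi)


if __name__ == "__main__":
    # Hypothetical 100 Kb domain and a toy signal track.
    domain = (100_000, 200_000)
    track = [(99_000, 2.0), (103_000, 1.0), (150_000, 5.0), (201_000, 3.0)]
    for w in boundary_windows(*domain):
        print("boundary", w, signal_in_window(track, w))
    print("body", body_window(*domain), signal_in_window(track, body_window(*domain)))
```

Expressing the body window as a percentage of domain length keeps large and small domains comparable, which is the stated reason for the 5-95% rule.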
[0119] As loop domains were proposed to form via Ctcf-cohesin loop extrusion, the above observation led the inventors to hypothesize that non-loop domains might be established through transcription-induced supercoiling, similar to the formation of CIDs in S. cerevisiae and C. crescentus. The twin-supercoiling domain model can predict how waves of supercoiling that propagate through diffusional pathways behave when they encounter each other; they either reinforce or cancel each other depending on the propagation direction. Consistent with this model, the mouse CAP-C maps showed domain formation based on the orientation of gene pairs similar to that previously shown in S. cerevisiae.
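The dependence on gene-pair orientation can be made concrete with a small classifier. The sketch below assumes both genes of a pair are actively transcribed and uses the standard statement of the twin-supercoiled-domain model (positive supercoils ahead of the polymerase, negative supercoils behind it); the function names and the tuple layout for genes are hypothetical.

```python
# Expected supercoiling between two adjacent active genes under the
# twin-supercoiled-domain model (assumption: both genes are transcribed).
INTERGENIC_EXPECTATION = {
    "convergent": "positive supercoils from both genes meet and reinforce",
    "divergent": "negative supercoils from both genes meet and reinforce",
    "tandem": "positive and negative supercoils meet and tend to cancel",
}


def classify_gene_pair(upstream_strand, downstream_strand):
    """Classify two adjacent genes, ordered by genomic coordinate."""
    for s in (upstream_strand, downstream_strand):
        if s not in ("+", "-"):
            raise ValueError(f"strand must be '+' or '-', got {s!r}")
    if upstream_strand == downstream_strand:
        return "tandem"
    if upstream_strand == "+":
        return "convergent"   # + then -: genes transcribe toward each other
    return "divergent"        # - then +: genes transcribe away from each other


def adjacent_pairs(genes):
    """genes: iterable of (name, start, strand). Returns the orientation
    class of every pair of neighbouring genes after sorting by start."""
    ordered = sorted(genes, key=lambda g: g[1])
    return [
        (a[0], b[0], classify_gene_pair(a[2], b[2]))
        for a, b in zip(ordered, ordered[1:])
    ]
```

For example, `adjacent_pairs([("a", 100, "+"), ("b", 500, "-")])` classifies the pair as convergent, for which the model expects positive supercoils to accumulate between the two genes.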
[0120] Effects of supercoiling on the structure of genes with multiple active promoters. Alternative promoter usage is a common mechanism for generating transcript complexity. Unlike alternative splicing, alternative promoter usage generates diversity across multiple cell-types by selectively positioning the pre-initiation complex at different transcription start sites (TSS) before elongation. As distances between alternative promoters can range from only tens to thousands of base pairs, these features are now discernible in the high-resolution contact maps with enriched short-range interactions. Because multiple active promoters that occur in a single gene are in the tandem direction, the inventors predict from the twin-domain-supercoiling model an attenuation of boundaries as positive and negative supercoils cancel each other at the active downstream promoter; this is analogous to the mean O/E contact map of gene pairs that are arranged in a tandem fashion.
[0121] Inhibition of transcription reduces supercoiling and leads to global loss of chromatin contacts. Chromatin topology highly associates with supercoiling, and supercoiling domains have been proposed and identified. These supercoiling domains were shown to partially overlap with TADs. Motivated by a relationship between transcription-induced supercoiling and domain organization, the inventors next examined whether transcription inhibition affects chromatin architecture. The inventors explored two different transcription elongation inhibitors, flavopiridol and α-amanitin. Reduced levels and rates of supercoiling have been observed upon transcription inhibition. Thus, the inventors performed time-series CAP-C experiments using G5 dendrimers to crosslink mESC samples treated with 2 μM flavopiridol for 1 h and 6 h, as well as samples treated with 4 μg/ml of α-amanitin for 6 h and 12 h, respectively.
[0122] No significant differences were observed between the compartments of G5-control and inhibitor-treated G5 samples, indicating that transcription is not required to maintain compartments, and that compartmentalization may have been established much earlier, during early development.
[0123] Therefore, the inventors conclude that domain formations are dependent on transcription-induced supercoiling. Blocking transcription elongation abrogated both loop and non-loop domains; however, loops were attenuated but largely retained. These observations support the critical role of transcription-induced supercoiling in the formation of non-loop domains, but also suggest that transcription-induced supercoiling and loop extrusion likely work synergistically to shape the overall chromatin architecture as the formation of loop domains also appear to be dependent on transcription. Taken together, the inventors propose that positive and negative supercoiling generated during transcription elongation are responsible for the intra-domain contact interactions observed in our experiments.
[0124] Probing the openness of transcription start sites (TSS) by biotinylated, psoralen-functionalized dendrimers. Different sized dendrimers were functionalized with biotin and psoralen. Each capture experiment was conducted by crosslinking chromatin with a dendrimer of one size; proteins were then removed by proteinase K and the dendrimer-DNA complex was purified by streptavidin pulldown. Illumina adapters were added to the enriched DNA fragments, which were then subjected to high-throughput sequencing. Transcription start sites (TSS) of wild type cells were first classified by their transcription strength, using PRO-seq data, into 10 percentile classes. The 90.sup.th percentile shows the highest nascent gene expression while the 0.sup.th percentile exhibits the lowest. Then, the counts were normalized by sequencing depth (FPM) and plotted +/−2 Kb around each type of TSS.
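The two bookkeeping steps in this paragraph, binning TSS into ten percentile classes by nascent expression and normalizing counts to fragments per million (FPM), can be sketched as below. The rank-based binning rule, the tie handling, and the function names are assumptions; the text does not specify how the classes were computed.

```python
def fpm(count, total_mapped_fragments):
    """Normalize a raw count to fragments per million mapped fragments."""
    return count * 1_000_000 / total_mapped_fragments


def percentile_classes(tss_expression):
    """Map each TSS id to a class label 0, 10, ..., 90 by the rank of its
    nascent expression (0 = lowest tenth, 90 = highest tenth).
    Ties keep their input order, which is an arbitrary choice."""
    ranked = sorted(tss_expression, key=tss_expression.get)
    n = len(ranked)
    return {tss: (rank * 10 // n) * 10 for rank, tss in enumerate(ranked)}


if __name__ == "__main__":
    # Hypothetical PRO-seq signal for 20 TSS.
    expression = {f"tss{i}": float(i) for i in range(20)}
    classes = percentile_classes(expression)
    print(classes["tss0"], classes["tss10"], classes["tss19"])
    print(fpm(50, 2_000_000))
```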
Example 2
Modified CAP-C
[0125] To test the feasibility of a modified embodiment of CAP-C, mouse embryonic stem cells (mESCs) were fixed with formaldehyde. The azide- and psoralen-functionalized dendrimers were then diffused into the cell nucleus, and the cells were exposed to 365 nm UV irradiation for 30 min. The formaldehyde fixation was then reversed, DNA-bound proteins were digested with protease to expose all DNA motifs, and the dendrimer-DNA complexes were purified by ethanol precipitation. The purified dendrimer-DNA complexes were then subjected to MNase digestion followed by end polishing and A-tailing. Excess enzymes were removed by phenol-chloroform extraction. Bi-functional linkers containing DBCO and biotin were then attached to the dendrimer through “Click chemistry”. Excess bridge linkers were removed by size selection with Ampure-XP beads. The DNA-dendrimer complex was then ultra-diluted in ligation buffer, and proximal ends were joined together via the bridge linker by overnight ligation. The ligated products were then pulled down with streptavidin beads, followed by library construction and next-generation sequencing. A fixation-free version of CAP-C was also developed that does not require crosslinking cells with formaldehyde. In this version, the azide- and psoralen-functionalized dendrimers were crosslinked with native chromatin under 365 nm UV irradiation for 30 min, with the rest of the procedure remaining the same.
[0126] Some Advantages of CAP-C over in-situ Hi-C: First, the use of micrococcal nuclease (MNase) or a similar enzyme in CAP-C fragments the genome into smaller and more evenly sized pieces than restriction enzymes do. The relative frequency of short-range chromatin contacts (below 10 Kb) captured by CAP-C showed a 30% increase compared to in situ Hi-C. Enrichment of short-range CAP-C contacts allowed better resolution of new features of the genome at shorter length-scales. Compared to the highest-resolution mESC chromatin contact matrix, the CAP-C map at high resolution is clearer and sharper. Many small triangles with enhanced contact frequency close to the diagonal were observed in CAP-C and were called as domains using Arrowhead at 500 bp; these were not distinguishable as domains in in-situ Hi-C maps with a similar sequencing depth.
[0127] Secondly, with the help of the bridge linker, CAP-C is able to filter out genomic contacts that are randomly joined together, achieving a lower background on the contact matrix than in-situ Hi-C. Meta-analyses performed on short (100-200 Kb) and long (300-500 Kb) concordant peaks around loop anchors between CAP-C and in-situ Hi-C suggested that even though depth-normalized signal values (FPM) at the foci were similar between maps, a faster decay in mean long-range contacts between the two anchors decreases the mean lower-left background values in CAP-C.
[0128] Third, dendrimer crosslinkers of different sizes used in CAP-C are able to access and probe distinct regions of chromatin compaction, as shown by dendrimer size-dependent enrichment of interactions in different regions. Using principal component analysis on the pixel values of the G3, G5 and G7 contact maps, the eigenvector with the highest eigenvalue was determined and a 2D map, named a “dendrimer map”, was plotted based on the eigenvector values of the 1st principal component. Most importantly, these “dendrimer maps” showed a plaid-like pattern similar to that of A/B compartment intervals identified previously in Hi-C: regions enriched by the small dendrimer G3 showed high correlation with the B compartment, while regions favored by the large dendrimers G5 and G7 correlated better with the A compartment. Compartment B is highly associated with heterochromatin and showed high correlation with the inactive histone mark H3K27me3, while compartment A is positively related to open chromatin and the active histone mark H3K36me3. A reasonable explanation for this observation is that small dendrimers can access closed chromatin conformations while large dendrimers are better suited to open chromatin conformations. Moreover, “CAP-C eigenvectors” obtained at a series of resolutions for different species revealed smaller compartment intervals that are kilobases in length, suggesting that genomes are partitioned into A/B compartments at an ultra-small scale and that such folding principles are shared among species.
Detailed Examples of CAP-C Protocols:
[0129] CAP-C with formaldehyde crosslinking. Grow five million cells under recommended culture conditions. Detach adherent cells by centrifugation at 300×G for 5 min. Resuspend cells in fresh medium at 1 million cells per 1 ml medium. Add 16% formaldehyde solution to a final concentration of 1%, v/v. Incubate at r.t. for 5 min on rotating rocker. Add 2.5 M glycine solution to a final concentration of 0.2 M to quench the reaction. Incubate at r.t. for 5 min on rotating rocker. Centrifuge for 5 min at 300×G at 4° C. Discard supernatant. Resuspend cells in 1 ml of cold 1× PBS and spin for 5 min at 300×G at 4° C. Discard supernatant and flash-freeze cell pellets in liquid nitrogen (can be stored in −80° C. for up to a year). Combine 250 μl of ice-cold lysis buffer (10 mM Tris-HCl, pH 8.0, 10 mM NaCl, 0.2% Igepal CA630) with 50 μl of protease inhibitors (Sigma, P8340). Add to formaldehyde fixed pellet of cells. Incubate cell suspension on ice for 20 min. Centrifuge at 2500×G for 5 min. Discard the supernatant. Wash pelleted nuclei once with 500 μl of ice-cold Hi-C lysis buffer. Centrifuge and discard the supernatant. Resuspend the cell pellet in 1 ml 50 μM dendrimer in methanol. Incubate at 4° C. on a rocker with rotation. Photo crosslink the nuclei by irradiating under 365 nm UV for 30 min. Centrifuge for 5 min at 2500×G at 4° C. Discard supernatant. Wash pelleted nuclei twice with 500 μl of ice-cold Hi-C lysis buffer. Centrifuge and discard the supernatant. Resuspend the pellet in proteinase K buffer (420 μl Hi-C lysis buffer, 50 μl 10% SDS, 30 μl 20 mg/ml proteinase K) Incubate at 65° C. for O/N on a thermomixer at 800 rpm.
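The fixation and quench steps above specify stock and final concentrations but not the volumes to add. A minimal worked calculation follows, assuming a 5 ml suspension (five million cells at 1 million cells per ml), additive volumes, and that the suspension initially contains none of the reagent being added. The function name is illustrative.

```python
def stock_volume_to_add(sample_volume, stock_conc, final_conc):
    """Volume of stock to add so the mixture reaches final_conc.

    Derived from C_stock * V_add = C_final * (V_sample + V_add),
    assuming the sample contains none of the solute beforehand."""
    if not 0 < final_conc < stock_conc:
        raise ValueError("final_conc must be positive and below stock_conc")
    return sample_volume * final_conc / (stock_conc - final_conc)


# Assumed starting volume: five million cells at 1 million cells per ml.
cell_suspension_ml = 5.0

# 16% formaldehyde stock to a final concentration of 1% (v/v).
formaldehyde_ml = stock_volume_to_add(cell_suspension_ml, 16.0, 1.0)

# 2.5 M glycine stock to a final concentration of 0.2 M, added to the
# suspension after the formaldehyde has already increased its volume.
glycine_ml = stock_volume_to_add(cell_suspension_ml + formaldehyde_ml, 2.5, 0.2)

if __name__ == "__main__":
    print(f"16% formaldehyde to add: {formaldehyde_ml:.3f} ml")
    print(f"2.5 M glycine to add:    {glycine_ml:.3f} ml")
```

Under these assumptions the calculation gives roughly 0.33 ml of formaldehyde stock and 0.46 ml of glycine stock; different starting volumes scale both values proportionally.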
[0130] CAP-C without formaldehyde crosslinking. Grow five million cells under recommended culture conditions. Detach adherent cells by centrifugation at 300×G for 5 min. Combine 250 μl of ice-cold nucleus lysis buffer (10 mM Tris, pH 7.5, 10 mM NaCl, 3 mM MgCl.sub.2, 0.5% NP-40, 0.15 mM spermine, 0.5 mM spermidine) with 50 μl of protease inhibitors (Sigma, P8340). Add to pellet of cells. Incubate cell suspension on ice for 5 min. Centrifuge at 500×G for 5 min. Discard the supernatant. Wash pelleted nuclei once with 500 μl of resuspension buffer (10 mM Tris-HCl pH 7.4, 15 mM NaCl, 60 mM KCl, 0.15 mM spermine, 0.5 mM spermidine). Centrifuge at 500×G for 5 min and discard the supernatant. Resuspend the cell pellet in 1 ml 50 μM dendrimer in methanol. Incubate at 4° C. on a rocker with rotation for 10 min. Photo crosslink the nuclei by irradiating under 365 nm UV for 30 min. Centrifuge for 5 min at 2500×G at 4° C. Discard supernatant. Wash pelleted nuclei twice with 500 μl of resuspension buffer. Centrifuge and discard the supernatant. Resuspend the pellet in proteinase K buffer (420 μl Hi-C resuspension buffer, 50 μl 10% SDS, 30 μl 20 mg/ml proteinase K) Incubate at 65° C. for O/N on a thermomixer at 800 rpm.
[0131] Extract the DNA with 500 μl phenol:chloroform. Centrifuge at max for 10 min at r.t. Transfer the upper layer to a new tube. Add 800 μl EtOH and 50 μl 3 M NaOAc (pH 5.5). Incubate at −80° C. for 1 h. Centrifuge at max for 15 min at 4° C. Discard the supernatant. Wash the pellet twice with 500 μl 70% EtOH. Centrifuge at max for 5 min at 4° C. Discard the supernatant.
[0132] Resuspend the DNA pellet in 100 μl MNase digestion buffer (10 mM Tris-HCl pH 7.4, 15 mM NaCl, 60 mM KCl, 1 mM CaCl.sub.2, 0.15 mM spermine, 0.5 mM spermidine). Add 1 unit of MNase and incubate at 37° C. for 5 min, then stop the reaction by adding 150 μl of Stop Buffer (20 mM EDTA, 20 mM EGTA, 0.4% SDS). Incubate the mixture at 65° C. for 30 min. Purify DNA with ethanol precipitation by adding 800 μl EtOH and 50 μl 3 M NaOAc (pH 5.5). Incubate at −80° C. for 1 h. Centrifuge at max for 15 min at 4° C. Discard the supernatant. Wash the pellet twice with 500 μl 70% EtOH. Centrifuge at max for 5 min at 4° C. Discard the supernatant. Resuspend the DNA pellet in 100 μl H.sub.2O.
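The ethanol precipitation after MNase digestion can be sanity-checked by computing the final ethanol fraction and sodium acetate concentration. The sketch below is illustrative only. It assumes the "EtOH" added is absolute ethanol and that volumes are additive, neither of which is stated in the protocol. The function name `precipitation_summary` is hypothetical.

```python
def precipitation_summary(sample_ul, ethanol_ul, salt_ul, salt_stock_molar):
    """Summarize an ethanol precipitation mix, assuming additive volumes."""
    total_ul = sample_ul + ethanol_ul + salt_ul
    return {
        "total_ul": total_ul,
        # Ethanol expressed as multiples of the aqueous volume (sample + salt).
        "ethanol_volumes": ethanol_ul / (sample_ul + salt_ul),
        # Assumption: the ethanol added is 100% (absolute) ethanol.
        "ethanol_percent": 100.0 * ethanol_ul / total_ul,
        "salt_molar": salt_stock_molar * salt_ul / total_ul,
    }


# 100 ul MNase digest plus 150 ul Stop Buffer, then 800 ul EtOH and 50 ul 3 M NaOAc.
summary = precipitation_summary(
    sample_ul=100 + 150, ethanol_ul=800, salt_ul=50, salt_stock_molar=3.0
)

for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

With these inputs the mix totals 1100 μl, which corresponds to about 2.7 volumes of ethanol (about 73% final) and about 0.14 M sodium acetate.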
[0133] Repair DNA ends and add “A” using the KAPA Hyper plus kit by adding the following mix: 100 μl of above DNA-Dendrimer complex; 28 μl of ER&AT buffer mix; 12 μl of ER&AT enzyme mix.
[0134] Incubate at 20° C. for 30 min, then at 65° C. for 30 min. Purify DNA with ethanol precipitation by adding 500 μl EtOH and 20 μl 3 M NaOAc (pH 5.5). Incubate at −80° C. for 1 h. Centrifuge at max for 15 min at 4° C. Discard the supernatant. Wash the pellet twice with 500 μl 70% EtOH. Centrifuge at max for 5 min at 4° C. Discard the supernatant. Resuspend the DNA pellet in 100 μl H.sub.2O. Add 2 μl of 100 μM biotin linker and incubate at 37° C. for 2 h on a thermomixer at 800 rpm. Remove excess biotin linker by XP bead size selection. Elute the DNA with 100 μl of H.sub.2O.
[0135] Prepare for biotin pull-down by washing 20 μl of 10 mg/ml Dynabeads MyOne Streptavidin C1 beads (Life Technologies) with 400 μl of 1× Tween Washing Buffer (1× TWB: 5 mM Tris-HCl (pH 7.5); 0.5 mM EDTA; 1 M NaCl; 0.05% Tween 20). Separate on a magnet and discard the solution. Resuspend the beads in 100 μl of 2× Binding Buffer (2× BB: 10 mM Tris-HCl (pH 7.5); 1 mM EDTA; 2 M NaCl) and add to the reaction. Incubate at room temperature for 15 min with rotation to bind biotinylated DNA to the streptavidin beads. Separate on a magnet and discard the solution. Wash the beads by adding 600 μl of 1× TWB and transferring the mixture to a new tube. Heat the tubes on a Thermomixer at 55° C. for 2 min with mixing. Reclaim the beads using a magnet. Discard supernatant. Repeat wash. Ligate the proximal DNA on the same dendrimer by adding the following mix: 4 ml of water; 500 μl of 10× NEB T4 DNA ligase buffer (NEB, B0202); 1 ml of above DNA-Dendrimer complexes; and 20 μl of 400 U/μl T4 DNA Ligase (NEB, M0202). Incubate at 16° C. overnight on a rotating rocker. Separate on a magnet and discard the solution.
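The proximity ligation mix is deliberately dilute, and its effective buffer strength and ligase loading can be checked from the listed volumes. The following Python sketch only does that arithmetic. The dictionary keys and variable names are illustrative, and the calculation assumes additive volumes.

```python
# Component volumes (ul) of the proximity ligation mix as listed in the protocol.
ligation_mix_ul = {
    "water": 4000.0,
    "10x T4 DNA ligase buffer": 500.0,
    "DNA-dendrimer complexes": 1000.0,
    "T4 DNA ligase (400 U/ul)": 20.0,
}

total_ul = sum(ligation_mix_ul.values())

# Final buffer strength as a multiple of 1x, given a 10x stock.
buffer_fold = 10.0 * ligation_mix_ul["10x T4 DNA ligase buffer"] / total_ul

# Total ligase units and the resulting concentration in the reaction.
ligase_units = 400.0 * ligation_mix_ul["T4 DNA ligase (400 U/ul)"]
ligase_units_per_ul = ligase_units / total_ul

print(f"total volume: {total_ul:.0f} ul")
print(f"buffer strength: {buffer_fold:.2f}x")
print(f"ligase: {ligase_units:.0f} U total, {ligase_units_per_ul:.2f} U/ul")
```

The listed volumes sum to 5520 μl, so the buffer ends up slightly below 1× (about 0.91×) with 8000 U of ligase in total.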
[0136] Wash the Streptavidin C1 beads by adding 600 μl of 1× TWB and transferring the mixture to a new tube. Heat the tubes on a Thermomixer at 55° C. for 2 min with mixing. Reclaim the beads using a magnet. Discard supernatant. Repeat wash. Perform all the following steps in low-bind tubes. Resuspend beads in 100 μl 1× NEB T4 DNA ligase buffer (NEB, B0202) and transfer to a new tube. Reclaim beads and discard the buffer. To repair ends of sheared DNA and remove biotin from unligated ends, resuspend beads in 100 μl of master mix: 88 μl of 1× NEB T4 DNA ligase buffer with 10 mM ATP, 2 μl of 25 mM dNTP mix, 5 μl of 10 U/μl NEB T4 PNK (NEB, M0201), 4 μl of 3 U/μl NEB T4 DNA polymerase I (NEB, M0203), 1 μl of 5 U/μl NEB DNA polymerase I, Large (Klenow) Fragment (NEB, M0210). Incubate at room temperature for 30 min. Separate on a magnet and discard the solution. Wash the beads by adding 600 μl of 1× TWB and transferring the mixture to a new tube. Heat the tubes on a Thermomixer at 55° C. for 2 min with mixing. Reclaim the beads using a magnet. Discard supernatant. Repeat wash. Resuspend beads in 100 μl 1× NEBuffer 2 and transfer to a new tube. Reclaim beads and discard the buffer. Resuspend beads in 100 μl of dATP attachment master mix: 90 μl of 1× NEBuffer 2, 5 μl of 10 mM dATP, 5 μl of 5 U/μl NEB Klenow exo minus (NEB, M0212). Incubate at 37° C. for 30 min. Separate on a magnet and discard the solution. Wash the beads by adding 600 μl of 1× TWB and transferring the mixture to a new tube. Heat the tubes on a Thermomixer at 55° C. for 2 min with mixing. Reclaim the beads using a magnet. Discard supernatant. Repeat wash. Resuspend beads in 100 μl 1× Quick ligation reaction buffer (NEB, B6058) and transfer to a new tube. Reclaim beads and discard the buffer. Resuspend in 50 μl of 1× NEB Quick ligation reaction buffer. Add 2 μl of NEB DNA Quick ligase (NEB, M2200). Add 3 μl of Illumina indexed adapter (Nextflex). Record the sample-index combination. Mix thoroughly. Incubate at room temperature for 15 min. Separate on a magnet and discard the solution. Wash the beads by adding 600 μl of 1× TWB and transferring the mixture to a new tube. Heat the tubes on a Thermomixer at 55° C. for 2 min with mixing. Reclaim the beads using a magnet. Remove supernatant. Repeat wash. Wash 3 times with 100 μl water. Resuspend the beads in 50 μl water. Incubate at 98° C. for 10 min to elute the DNA from the beads. Transfer the supernatant to an 8-well PCR strip tube. PCR amplify for 7-12 cycles with the following conditions: 98° C. 30 s; 7-12 cycles of 98° C. 15 s, 60° C. 30 s, 72° C. 30 s; 72° C. 1 min.
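The amplification conditions can be written down as a small data structure so that the cycle number is an explicit parameter. This is a sketch only: the `Step` class and the helpers `build_program` and `hold_time_seconds` are hypothetical names, the 7-12 cycle range is taken from the text, and the time estimate ignores ramping between temperatures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: float
    seconds: int


def build_program(cycles):
    """Expand the PCR conditions into a flat list of hold steps."""
    if not 7 <= cycles <= 12:
        raise ValueError("the protocol calls for 7-12 cycles")
    initial = [Step(98, 30)]                           # initial denaturation
    cycle = [Step(98, 15), Step(60, 30), Step(72, 30)] # denature, anneal, extend
    final = [Step(72, 60)]                             # final extension
    return initial + cycle * cycles + final


def hold_time_seconds(program):
    """Sum of hold times only; instrument ramp times are not included."""
    return sum(step.seconds for step in program)


for n in (7, 12):
    program = build_program(n)
    print(f"{n} cycles: {len(program)} steps, {hold_time_seconds(program) / 60:.2f} min of holds")
```

Keeping the cycle count as an argument makes it easy to record which value within the stated range was actually used for a given library.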
[0137] Purify the libraries with 0.9× Ampure beads. Elute with 30 μl water. Check the ligation efficiency by taking an 8 μl aliquot of the DNA library and adding 1 μl 10× CutSmart buffer and 1 μl BspDI. Incubate at 37° C. for 1 h. Run the digested and undigested libraries side by side on a 2% agarose gel. A clear shift down to smaller sizes should be observed for the digested libraries.
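The bead ratio and the test digest above reduce to two short calculations. The sketch below is hedged accordingly: the PCR reaction volume is not stated in the protocol, so the 50 μl used in the example is a placeholder, and the function name `bead_volume_ul` is hypothetical.

```python
def bead_volume_ul(sample_ul, ratio=0.9):
    """Bead suspension volume for a given bead-to-sample ratio (0.9x here)."""
    if sample_ul <= 0:
        raise ValueError("sample volume must be positive")
    return ratio * sample_ul


# Placeholder: the protocol does not state the PCR reaction volume.
example_pcr_ul = 50.0
beads_ul = bead_volume_ul(example_pcr_ul)

# Test digest as listed: 8 ul library, 1 ul 10x buffer, 1 ul restriction enzyme.
digest_ul = {"library": 8.0, "10x CutSmart buffer": 1.0, "restriction enzyme": 1.0}
digest_total_ul = sum(digest_ul.values())
digest_buffer_fold = 10.0 * digest_ul["10x CutSmart buffer"] / digest_total_ul

print(f"beads for {example_pcr_ul:.0f} ul sample at 0.9x: {beads_ul:.1f} ul")
print(f"digest volume: {digest_total_ul:.0f} ul, buffer at {digest_buffer_fold:.1f}x")
```

The digest comes to 10 μl with the buffer at exactly 1×, which confirms that the listed volumes are internally consistent.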