Fluorescent protein composition for DNA sequence analysis and method for DNA sequence analysis using same

Abstract

The present invention relates to a composition for DNA sequence analysis and a method for DNA sequence analysis, the method comprising treating a sample with the composition. The composition of the present invention achieves efficient optical identification at the single-DNA-molecule level by binding both an A/T-specific DNA-binding agent and a complementary A/T-non-specific DNA-binding agent to DNA, and thus can be usefully applied in studying the chromosomal organization of genomes, protein immunolocalization, and the like.

Claims

1. A method for DNA sequence analysis, the method comprising treating a sample with a composition comprising: an adenine/thymine (A/T)-specific DNA binding protein linked with a first fluorescent protein; an A/T-non-specific DNA binding protein linked with a second fluorescent protein; wherein the A/T-specific DNA binding protein is a histone-like nucleoid-structuring (H-NS) protein or a high mobility group (HMG); and comparing the A/T frequency of the entire genome of an analysis target and the A/T frequency of the sample treated with the composition; wherein the A/T-non-specific DNA binding protein is a breast cancer 1 (BRCA1) protein or a protein having a structure of Chemical Formula 1 below:
(XY).sub.n, [Chemical Formula 1] wherein X and Y each are independently any amino acid independently selected from lysine (K), tryptophan (W), or derivatives thereof; wherein n is an integer of 1 to 5; and wherein the concentration ratio of the A/T-specific DNA binding protein linked with the first fluorescent protein and the A/T-non-specific DNA binding protein linked with the second fluorescent protein is 1:2-5.

2. The method of claim 1, wherein the sample is a single DNA molecule.

3. The method of claim 1, wherein the first fluorescent protein and the second fluorescent protein exhibit different colors.

4. The method of claim 3, wherein the first fluorescent protein is mCherry.

5. The method of claim 3, wherein the second fluorescent protein is enhanced green fluorescent protein (eGFP).

6. The method of claim 1, wherein the first fluorescent protein is mCherry.

7. The method of claim 1, wherein the second fluorescent protein is enhanced green fluorescent protein (eGFP).

Description

BRIEF DESCRIPTION OF THE DRAWINGS

(1) FIG. 1A schematically shows stained DNA molecules according to an embodiment of the present invention, and specifically, λ DNA molecules stained with H-NS-mCherry and BRCA1-eGFP.

(2) FIG. 1B shows stained DNA molecules according to an embodiment of the present invention. The arrows indicate molecular orientations of 5′ to 3′; the white profile indicates the A/T frequency of λ DNA (scale bar: 10 μm), and respective numerical values indicate cross-correlation values (cc).

(3) FIG. 2 shows λ DNA molecules stained with H-NS-mCherry and BRCA1-eGFP at various concentrations according to an embodiment of the present invention. Respective numerical values indicate cross-correlation values (cc).

(4) FIG. 3A shows λ DNA molecules stained with various combinations of fluorescent protein-DNA binding protein according to an embodiment of the present invention. The respective combinations are shown below (scale bar: 10 μm):

(5) i) H-NS-mCherry and BRCA1-eGFP, ii) H-NS-mCherry and 2(KW).sub.2-eGFP, iii) 2HMG-mCherry and BRCA1-eGFP, iv) 2HMG-mCherry and 2(KW).sub.2-eGFP, v) H-NS-mCherry and 2HMG-eGFP, and vi) 2(KW).sub.2-mCherry and BRCA1-eGFP.

(6) FIG. 3B shows λ DNA molecules stained with various combinations of fluorescent protein-DNA binding protein according to an embodiment of the present invention. The respective combinations are shown below (scale bar: 10 μm):

(7) i) H-NS-mCherry and BRCA1-eGFP, ii) H-NS-mCherry and 2(KW).sub.2-eGFP, iii) 2HMG-mCherry and BRCA1-eGFP, and iv) 2HMG-mCherry and 2(KW).sub.2-eGFP.

(8) FIG. 3C is a graph showing cross-correlation values of λ DNA molecules stained with various combinations of fluorescent protein-DNA binding protein according to an embodiment of the present invention (Random: 0.55±0.14, i: 0.84±0.10, ii: 0.87±0.05, iii: 0.84±0.04, iv: 0.86±0.03, v: 0.61±0.15, and vi: 0.59±0.11) (*p<0.02, **p<0.005).

(9) FIG. 4 schematically shows viral genomic DNA molecules stained with H-NS-mCherry and BRCA1-eGFP according to an embodiment of the present invention (scale bar: 5 μm).

DETAILED DESCRIPTION

(10) Hereinafter, the present invention will be described in more detail with reference to examples. These examples are only for illustrating the present invention more specifically, and it will be apparent to those skilled in the art that the scope of the present invention is not limited by these examples.

(11) Test Materials and Reagents

(12) DNA primers were purchased from Cosmogenetech (Korea). Biotin-labeled DNA oligomers were purchased from Bioneer (Korea). E. coli BL21 (DE3) strain was purchased from Yeastern (Taiwan). λ DNA (NC_001416.1, 48,502 bp) and single-stranded M13mp18 (7,249 bp) were purchased from New England Biolabs (NEB, US). Epoxy was purchased from Devcon (US). N-trimethoxymethyl silyl propyl-N,N,N-trimethyl ammonium chloride (50% methanol) was purchased from Gelest (Morrisville, US). Ni-NTA agarose resin and column were purchased from Qiagen (Venlo, Netherlands). Unless otherwise noted, all enzymes were purchased from NEB, and all reagents were purchased from Sigma-Aldrich.

(13) Fluorescence Microscopy and DNA Visualization

(14) An inverted optical microscope (Olympus IX70, Japan) was equipped with 60× and 100× Olympus UPlanSApo oil immersion objectives, and a LED light source (SOLA SM II light engine, Lumencor, US) was used. The light was condensed through corresponding filter sets (Semrock, US) to set excitation and emission wavelengths.

(15) Fluorescence microscopic images were acquired with an electron-multiplying charge-coupled device (EMCCD) digital camera (Evolve EMCCD, Roper Scientific, US) using the Micro-Manager software and were stored in 16-bit TIFF format. For image processing and analysis, ImageJ software was used together with a Java plug-in and Python programs developed by the present inventors.

(16) Python programs:
imageCompare.py: a library of functions.
seq2map.py: converts a FASTA sequence file into an in silico map image, with high-frequency portions in white and low-frequency portions in black.
insilicoMapFolder.py: for all images in a folder, scans and compares the in silico image file with the DNA images obtained from experiments, and returns the position and value of the point with the highest cross-correlation coefficient, which are then stored in a new record file.
sortView.py: reads the record file produced by insilicoMapFolder.py and displays the signal comparison, cross-correlation coefficient search, and image comparison in a new window.
randomtiff.py: creates TIFF images having random brightness values.
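As a rough illustration of what a seq2map-style conversion might involve, the sketch below computes a windowed A/T frequency along a sequence and rescales it to 8-bit gray levels, with the most A/T-rich window in white. It is not the inventors' seq2map.py: the function names, the non-overlapping window scheme, and the linear scaling are assumptions made for illustration only.

```python
def at_frequency_profile(seq, window):
    """Return the A/T fraction of each non-overlapping window along seq."""
    seq = seq.upper()
    profile = []
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window]
        at_count = chunk.count("A") + chunk.count("T")
        profile.append(at_count / window)
    return profile


def to_gray_levels(profile):
    """Linearly rescale a profile to 0-255 so the highest A/T window is white."""
    lo, hi = min(profile), max(profile)
    if hi == lo:  # flat profile: avoid division by zero
        return [0 for _ in profile]
    return [round(255 * (p - lo) / (hi - lo)) for p in profile]


# Toy sequence (hypothetical), not lambda DNA.
toy_sequence = "ATATATATGCGCGCGCATATGCGC"
toy_profile = at_frequency_profile(toy_sequence, window=4)
print(toy_profile)
print(to_gray_levels(toy_profile))
```

In practice the window size would presumably be chosen to match the number of base pairs covered by one camera pixel, which depends on the magnification and on how far the DNA is stretched.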

(17) Preparation of Fluorescent Protein-DNA Binding Protein

(18) Plasmids necessary for protein production were constructed by overlap extension polymerase chain reaction (OE-PCR), which links a fluorescent protein to the C-terminus of the DNA binding protein. A GGSGG linker containing glycine and serine was used, and the respective primer sequences are shown in Table 1.

(19) HNS-mCherry:

(20) H-NS DNA was amplified using the forward primer P1-HNS and the reverse primer P2-HNS, with a DNA plug of E. coli MG1655 strain as the template. mCherry DNA was amplified using the forward primer P3-mCherry and the reverse primer P4-mCherry. Then, H-NS and mCherry were linked with each other by overlap polymerase chain reaction.
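The overlap that makes this joining step possible can be checked directly from the primer sequences. The sketch below, using P2-HNS and P3-mCherry as listed in Table 1 (with line-wrap spaces removed), finds the stretch shared by the two PCR products; the helper names are hypothetical. The shared stretch ends in GGT GGT TCT GGT GGC, which are codons for Gly-Gly-Ser-Gly-Gly, consistent with the GGSGG linker described above.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def overlap_length(upstream_reverse_primer, downstream_forward_primer):
    """Longest suffix of revcomp(reverse primer) that is a prefix of the
    forward primer, i.e. the region shared by the two PCR products that
    lets them anneal to each other in the overlap-extension step."""
    top_strand = reverse_complement(upstream_reverse_primer)
    max_len = min(len(top_strand), len(downstream_forward_primer))
    for k in range(max_len, 0, -1):
        if top_strand[-k:] == downstream_forward_primer[:k]:
            return k
    return 0


# Primer sequences copied from Table 1.
P2_HNS = "GCCACCAGAACCACCTTGCTTGATCAGGAAATCGTCG"
P3_MCHERRY = "CAAGCAAGGTGGTTCTGGTGGCATGGTGAGCAAGGGCGAGGAG"

shared_len = overlap_length(P2_HNS, P3_MCHERRY)
shared_region = P3_MCHERRY[:shared_len]
print(shared_len, shared_region)
```

The same check could be applied to P6-BRCA and P7-eGFP for the BRCA1-eGFP construct.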

(21) BRCA1-eGFP:

(22) BRCA1-DNA binding domain was amplified using the forward primer P5-BRCA and the reverse primer P6-BRCA while partial BRCA1 (Addgene plasmid #71116) including 452-1079 residues was used as a template. eGFP DNA was amplified using the forward primer P7-eGFP and the reverse primer P8-eGFP. Then, BRCA1 DNA binding domain and eGFP were linked with each other by overlap polymerase chain reaction.

(23) 2HMG-mCherry:

(24) 2HMG-mCherry was constructed by tagging DNA binding sites to each terminal of mCherry while the forward primer P9-HMG-mCherry and the reverse primer P10-HMG-mCherry were used.

(25) 2(KW).sub.2-mCherry:

(26) 2(KW).sub.2-mCherry was constructed by tagging DNA binding sites to each terminal of mCherry while the forward primer P11-(KW).sub.2-mCherry and the reverse primer P12-(KW).sub.2-mCherry were used.

(27) 2(KW).sub.2-eGFP:

(28) 2(KW).sub.2-eGFP was constructed by tagging DNA binding sites to each terminal of eGFP while the forward primer P13-(KW).sub.2-eGFP and the reverse primer P14-(KW).sub.2-eGFP were used.

(29) 2HMG-eGFP:

(30) 2HMG-eGFP was constructed by tagging DNA binding sites to each terminal of eGFP while the forward primer P15-HMG-eGFP and the reverse primer P16-HMG-eGFP were used.

(31) TABLE 1

SEQ ID NO | Name | Sequence (5′-3′)
1 | P1-HNS | ACTTCACATATGATGAGCGAAGCACTTAAAATTCTG
2 | P2-HNS | GCCACCAGAACCACCTTGCTTGATCAGGAAATCGTCG
3 | P3-mCherry | CAAGCAAGGTGGTTCTGGTGGCATGGTGAGCAAGGGCGAGGAG
4 | P4-mCherry | ATTTCAGGATCCCTACTTGTACAGCTCGTCCATGCC
5 | P5-BRCA | TATGCACATATGGTAGAGAGTAATATTGAAGACAAAATATTTGGG
6 | P6-BRCA | GCTCACCATACCGCCGCTGCCACCTTTTGGCCCTCTGTTTCTACCTAG
7 | P7-eGFP | GGTGGCAGCGGCGGTATGGTGAGCAAGGGCGAGGAG
8 | P8-eGFP | TATGCAGGATCCTTACGCCTTGTACAGCTCGTCCATG
9 | P9-HMG-mCherry | ATATTGCATATGACCCCGAAACGCCCGCGCGGCCGCCCGAAAAAAGGCGGCAGCGGCGGC/ATGGTGAGCAAGGGCGAGGAG
10 | P10-HMG-mCherry | ATATTGGGATCCTTAGCCGCCGCTGCCGCCTTTTTTCGGGCGGCCGCGCGGGCGTTTCGGGGT/CTTGTACAGCTCGTCCATGCC
11 | P11-(KW).sub.2-mCherry | ATGTTGCATATGAAATGGAAATGGAAAAAAGCGATGGTGAGCAAGGGCGAGGAG
12 | P12-(KW).sub.2-mCherry | ATGTTGGGATCCTTATTTCCATTTCCATTTTTTCGCCTTGTACAGCTCGTCCATGCC
13 | P13-(KW).sub.2-eGFP | ATGTTGCATATGAAATGGAAATGGAAAAAAGCGATGCGTGAGCAAGGGCGAGGAGC
14 | P14-(KW).sub.2-eGFP | ATGTTGGGATCCTTATTTCCATTTCCATTTTTTCGCCTTGTACAGCTCGTCCATGCC
15 | P15-HMG-eGFP | ATATTGCATATGACCCCGAAACGCCCGCGCGGCCGCCCGAAAAAAGGCGGCAGCGGCGGCATGCGTGAGCAAGGGCGAGGAGC
16 | P16-HMG-eGFP | ATATTGGGATCCTTAGCCGCCGCTGCCGCCTTTTTTCGGGCGGCCGCGCGGGCGTTTCGGGGTCTTGTACAGCTCGTCCATGCC

(32) Molecular Cloning

(33) Using standard subcloning procedures, the fluorescent protein-DNA binding protein sequences were inserted into the pET-15b vector using the NdeI and BamHI restriction sites, and the constructs were transformed into the E. coli BL21 (DE3) strain. A single colony of the transformed cells was inoculated in fresh LB media containing ampicillin and incubated for 1 h.

(34) After the transformed cells reached saturation, the culture was grown at 37° C. with the corresponding antibiotics to an optical density of about 0.8. The fluorescent protein-tagged proteins were then overexpressed overnight with IPTG at a final concentration of 1 mM on a shaker at 20° C. and 250 rpm.

(35) Cells for protein purification were harvested by centrifugation at 12,000×g for 10 min (subsequent centrifugations were all performed under similar conditions), and the residual media was washed away with the cell lysis buffer (50 mM Na.sub.2HPO.sub.4, 300 mM NaCl, 10 mM imidazole, pH 8.0). The cells were lysed by ultrasonication for 30 min, and cell debris was pelleted by centrifugation at 13,000 rpm for 10 min at 4° C. His-tagged FP-DNA binding proteins were purified using affinity chromatography with Ni-NTA agarose resin.

(36) The mixture of cell protein and resin was kept on a shaking platform at 4° C. for 1 h. The lysate containing proteins bound to the Ni-NTA agarose resin was loaded onto the column for gravity chromatography and was washed several times using the protein washing buffer (50 mM Na.sub.2HPO.sub.4, 300 mM NaCl, 20 mM imidazole, pH 8.0). For H-NS-mCherry specifically, a washing buffer containing 35 mM imidazole (50 mM Na.sub.2HPO.sub.4, 300 mM NaCl, 35 mM imidazole, pH 8.0) was used.

(37) Finally, the bound proteins were eluted using a protein elution buffer (50 mM Na.sub.2HPO.sub.4, 300 mM NaCl, 250 mM imidazole, pH 8.0). All proteins were diluted (10 μg/mL) using 50% w/w glycerol/1×TE buffer (Tris 10 mM, EDTA 1 mM, pH 8.0).

(38) Preparation of Coverslips and Modified Surfaces

(39) Glass coverslips were placed in a Teflon rack, soaked in piranha etching solution (30:70 v/v H.sub.2O.sub.2/H.sub.2SO.sub.4) for 2 h, and then washed with deionized water until the pH reached neutral (pH 7).

(40) For positively-charged glass surfaces, 350 μL of N-trimethoxymethylsilylpropyl-N,N,N-trimethyl ammonium chloride dissolved in 50% methanol was mixed with 200 mL of deionized water.

(41) To prepare the glass surface for DNA tethering, 2 mL of N-[3-(trimethoxysilyl)propyl] ethylenediamine was added to 200 mL of methanol and 10 mL of glacial acetic acid in order to add primary amino groups. The glass coverslips were incubated in the solution for 30 min, sonicated for 15 min, and then incubated for 12 h at room temperature. Then, the coverslips were washed with methanol and ethanol.

(42) Preparation of Microfluidic Devices

(43) To investigate DNA elongation and deposition on positively charged surfaces, polydimethylsiloxane (PDMS) microfluidic devices were manufactured employing a standard rapid prototyping method.

(44) More specifically, the patterns on a silicon wafer for microchannels (4 μm high and 100 μm wide) were fabricated using SU-8 2005 photoresist (Microchem, US). The PDMS pre-polymer mixed with a curing agent (10:1 weight ratio) was cast on the patterned wafer and cured at 65° C. for 4 h or longer. The cured PDMS was peeled off from the patterned wafer, and the PDMS devices were treated in an air plasma generator (Femto Science Cute Basic, Korea) for 1 min at 100 W to make the PDMS surface hydrophilic. The PDMS devices were stored in water and air-dried before use.

Experimental Example 1: Confirmation of DNA Staining at Single-Molecule Level

(45) A composition for DNA sequence analysis was prepared by mixing 50 μL of 8 nM H-NS-mCherry and 50 μL of 40 nM BRCA1-eGFP at a ratio of 1:1.
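The concentrations that result from this 1:1 mixing follow from simple dilution arithmetic, sketched below. The helper name is hypothetical, and the calculation assumes additive volumes. Under that assumption the mixture contains 4 nM H-NS-mCherry and 20 nM BRCA1-eGFP, a 1:5 ratio that falls within the 1:2-5 range recited in claim 1. Whether the concentrations quoted in Experimental Example 2 refer to this same stage of preparation is not stated in the text and is left open here.

```python
def mix_concentrations(stocks):
    """Final concentration of each component after pooling the given stocks.

    stocks: dict mapping name -> (volume_uL, concentration_nM).
    Assumes volumes are additive and each component is in only one stock.
    """
    total_volume = sum(volume for volume, _ in stocks.values())
    return {
        name: concentration * volume / total_volume
        for name, (volume, concentration) in stocks.items()
    }


final = mix_concentrations({
    "H-NS-mCherry": (50, 8),   # 50 uL of 8 nM
    "BRCA1-eGFP": (50, 40),    # 50 uL of 40 nM
})
ratio = final["BRCA1-eGFP"] / final["H-NS-mCherry"]
print(final, ratio)
```

Any later dilution of the mixture (for example, with the DNA solution or 1×TE) lowers both concentrations by the same factor, so the ratio between the two proteins is unchanged.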

(46) First, 10 μL of a solution obtained by diluting λ DNA to 15 ng/μL with 1×TE (10 mM Tris, 1 mM EDTA, pH 8) was mixed with 10 μL of the composition for DNA sequence analysis. After incubation at room temperature for a while, the mixture was diluted with 1×TE solution to 1/10-1/20, and then loaded at the entrance of a structure with a positively charged glass surface and a PDMS microchannel (100 μm×4 μm). Thereafter, DNA molecules were imaged on the microscope.

(47) As can be confirmed in FIG. 1A, as a result of staining of λ DNA using H-NS-mCherry and BRCA1-eGFP, λ DNA molecules could be aligned based on three distinct red spots on a green DNA backbone.

(48) Next, 3 μL of a solution obtained by diluting λ DNA to 1/10-1/20 with the 1×TE solution was dropped on the positively charged glass surface, which was then brought in contact with a slide glass to spread the solution, and imaged by a fluorescence microscope.

(49) As can be confirmed in FIG. 1B, λ DNA molecules were deposited on the positively charged surface. The orientations of the randomly aligned DNA molecules can be determined, and even for DNA molecules that are broken in the middle rather than full-length, the position of the corresponding fragments within the genome can be obtained.

Experimental Example 2: Confirmation of Optimal Concentration of Fluorescent Protein-DNA Binding Protein

(50) H-NS-mCherry at various concentrations (1, 2, 4, 8, or 16 nM) was mixed with BRCA1-eGFP (0, 10, or 20 nM) to prepare compositions for DNA sequence analysis, and λ DNA was visualized using PDMS microchannels by the method in Experimental Example 1.

(51) As can be confirmed in FIG. 2, the stained DNA color pattern was varied according to the concentration of the fluorescent protein-DNA binding protein.

(52) More specifically, the use of 1 nM H-NS-mCherry and 20 nM BRCA1-eGFP generated a full green DNA molecule, but in contrast, the use of 16 nM H-NS-mCherry and 10 nM BRCA1-eGFP generated a full red DNA molecule. The optimal concentration was shown at the ratio of 4 nM H-NS-mCherry and 20 nM BRCA1-eGFP (the cc value is 0.91: the cross-correlation coefficient (hereinafter, cc) was evaluated by using the following equation).

(53) $cc = \frac{\sum_{i=1}^{n} (x_{i} - \overline{x})(y_{i} - \overline{y})}{\sqrt{\sum_{i=1}^{n} (x_{i} - \overline{x})^{2}} \sqrt{\sum_{i=1}^{n} (y_{i} - \overline{y})^{2}}}$

(54) (n=number of samples; $x_{i}$, $y_{i}$=value at each point; and $\overline{x}$, $\overline{y}$=average of samples)
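A minimal sketch of how this coefficient could be computed, and used to place an imaged fragment on a reference A/T-frequency profile in either orientation, is given below. It illustrates the equation above rather than reproducing the inventors' insilicoMapFolder.py. The function names, the toy profile values, and the convention of returning 0 for a flat profile are assumptions.

```python
import math


def cross_correlation(x, y):
    """cc between two equal-length intensity profiles, per the equation above."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    numerator = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        return 0.0  # convention chosen here for a flat profile (assumption)
    return numerator / (spread_x * spread_y)


def best_alignment(reference, fragment):
    """Slide fragment along reference in both orientations.

    Returns (cc, offset, orientation) for the placement with the highest cc.
    """
    best = (-2.0, None, None)  # cc is never below -1, so any placement wins
    for orientation, probe in (("forward", fragment), ("reverse", fragment[::-1])):
        for offset in range(len(reference) - len(probe) + 1):
            window = reference[offset:offset + len(probe)]
            cc = cross_correlation(window, probe)
            if cc > best[0]:
                best = (cc, offset, orientation)
    return best


# Hypothetical A/T-frequency profile and a piece of it imaged back to front.
reference = [0.40, 0.45, 0.60, 0.70, 0.50, 0.42, 0.65, 0.48, 0.44, 0.41]
fragment = reference[3:8][::-1]
print(best_alignment(reference, fragment))
```

Because the mean is subtracted and the result is normalized by the spread of each profile, cc is unaffected by a uniform gain or offset in fluorescence intensity, which is convenient when comparing camera intensities with a sequence-derived A/T frequency.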

(55) Meanwhile, the use of only H-NS-mCherry generated nucleotide sequence (A/T)-specific color patterns at 4 nM or lower. These results indicate that H-NS-mCherry stains A/T-rich regions and that BRCA1-eGFP complementarily stains the parts of the DNA that were not stained by H-NS-mCherry.

Experimental Example 3: Confirmation of Various Combinations of Fluorescent Protein-DNA Binding Protein

(56) On the basis of mCherry and eGFP fluorescent proteins, various combinations of fluorescent protein-DNA binding protein (H-NS-mCherry, BRCA1-eGFP, 2HMG-mCherry, 2(KW).sub.2-mCherry, 2(KW).sub.2-eGFP, 2HMG-eGFP) were used to produce compositions for DNA sequence analysis. λ DNA was visualized using positively charged glass surface by the method in Experimental Example 1.

(57) As can be confirmed in FIGS. 3A and 3B, four (i, ii, iii, and iv) out of six combinations generated A/T-specific λ genome patterns. In contrast, random patterns (cc=0.61) were generated when both of the DNA binding proteins were A/T-specific DNA binding proteins (H-NS and 2HMG) (v), and random patterns (cc=0.59) were also generated when both of the DNA binding proteins were A/T-non-specific DNA binding proteins (BRCA1 and 2(KW).sub.2) (vi).

Experimental Example 4. Confirmation of DNA Staining at Short-DNA Fragment Level

(58) It was further investigated with reference to the above example results whether the composition of the present invention was applicable in short DNA fragments, such as M13, a bacteriophage that infects bacteria, and murine leukemia virus (MLV), a retrovirus that infects mice.

(59) More specifically, as for the M13 phage genome, the double-stranded M13mp18 was synthesized from single-stranded circular DNA by Top polymerase reaction with a primer (GGAAACCGAGGAAACGCAATAATAACGGAATACCC) (SEQ ID NO: 20). After the polymerase reaction, the double-stranded circular DNA was linearized by PstI. The double-stranded retroviral genomic DNA was synthesized from the murine leukemia virus genome. After the reaction, the circular dsDNA was linearized by BmtI.
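For the in silico comparison, the reference sequence presumably has to start at the same point where the enzyme opened the circle. The sketch below shows one way to rotate a circular sequence at a single restriction site, using the PstI recognition sequence CTGCAG (cut after CTGCA on the top strand). The function name and the toy sequence are hypothetical, only the top strand is modelled, and the same approach would need the BmtI recognition site supplied for the retroviral DNA.

```python
def linearize(circular_seq, site, cut_offset):
    """Open a circular sequence at the single occurrence of a restriction site.

    cut_offset is the number of bases of the site that stay on the 3' end of
    the linear top strand (5 for CTGCA^G). Overhangs are ignored.
    Raises ValueError unless the site occurs exactly once.
    """
    seq = circular_seq.upper()
    # Extend past the origin so a site spanning the start/end is also found.
    extended = seq + seq[:len(site) - 1]
    hits = [i for i in range(len(seq)) if extended.startswith(site, i)]
    if len(hits) != 1:
        raise ValueError(f"expected exactly one {site} site, found {len(hits)}")
    cut = (hits[0] + cut_offset) % len(seq)
    return seq[cut:] + seq[:cut]


# Toy circular sequence (hypothetical), not M13mp18.
toy_circle = "AACTGCAGTT"
linear = linearize(toy_circle, site="CTGCAG", cut_offset=5)
print(linear)
```

The resulting linear sequence could then be converted to an A/T-frequency profile and compared with the image in both orientations, since the orientation of a deposited molecule is not known in advance.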

(60) Each viral DNA was visualized, by the method in Experimental Example 1, using the composition for DNA sequence analysis in which 50 μL of 8 nM H-NS-mCherry and 50 μL of 40 nM BRCA1-eGFP were mixed at a ratio of 1:1.

(61) As can be confirmed in FIG. 4, the linearized DNA molecules were two-color stained to generate genome-specific patterns. It was confirmed that M13 DNA and MLV DNA could be differentiated from each other based on such image patterns.

CONCLUSION

(62) Taken together, the results above confirm that, by using a combination of two fusion proteins with complementary colors, the composition of the present invention specifically stains A/T-rich regions of DNA and shows an A/T-rich sequence-specific fluorescence intensity pattern on the DNA backbone when bound to DNA. When the full reference sequence is available, such sequence-specific patterns allow DNA sequences to be identified from microscopic images of stained DNA, and therefore the composition of the present invention can be usefully applied in the high-rate and high-efficiency analysis of giant single DNA molecules.

(63) This application contains references to amino acid sequences and/or nucleic acid sequences which have been submitted herewith as the sequence listing text file. The aforementioned sequence listing is hereby incorporated by reference in its entirety pursuant to 37 C.F.R. § 1.52(e).

(64) This application contains references to amino acid sequences and/or nucleic acid sequences which have been submitted concurrently herewith as the sequence listing text file entitled “62406982_1.TXT”, file size 11 KiloBytes (KB), created on 7 Jun. 2022. The aforementioned sequence listing is hereby incorporated by reference in its entirety pursuant to 37 C.F.R. § 1.52(e)(5).

CPC classification

C12Q2565/107, C12Q1/6869, C12Q2565/531, C12Q1/6876, C12Q1/6886 (CHEMISTRY; METALLURGY)

International classification

C12Q1/6876 (CHEMISTRY; METALLURGY)