Nucleic acid probe and nucleic acid sequencing method

11993813 · 2024-05-28

Abstract

A nucleic acid probe and a nucleic acid sequencing method for performing sequencing while ligating nucleic acids. The nucleic acid probe is a DNA sequencing probe comprising a first moiety, a second moiety, a linker, and a detectable label. The base of the first moiety is A, T, U, C, or G; the bases of the second moiety are random bases and/or universal bases, and 3 or more bases are present in the second moiety. The first moiety and the second moiety are ligated via the linker, the connection between the first moiety and the linker can be cleaved, and the detectable label is ligated to the second moiety or the linker. The above probe, a combination formed therewith, or a sequencing method using the same can reduce the number or types of probes in nucleic acid sequencing, thereby reducing cost.

Claims

1. A nucleic acid probe combination, which comprises 4 groups of nucleic acid probes, wherein each group of nucleic acid probes contains a nucleic acid probe, wherein the nucleic acid probe comprises a first moiety, a second moiety, a linker and a detectable label, wherein: the first moiety has a base of A, T, U, C or G, the second moiety has random bases and/or universal bases, and the number of the bases is 3 or more, the first moiety is ligated to the second moiety via the linker, and the ligation between the first moiety and the linker can be cleaved, the detectable label is ligated to the second moiety or the linker; and the linker does not contain a sulfur atom; wherein: the first group of nucleic acid probes: comprising the nucleic acid probe of which the base of the first moiety is A; the second group of nucleic acid probes: comprising the nucleic acid probe of which the base of the first moiety is T or U; the third group of nucleic acid probes: comprising the nucleic acid probes of which the base of the first moiety is C; the fourth group of nucleic acid probes: comprising the nucleic acid probes of which the base of the first moiety is G; and the detectable labels in the 4 groups of nucleic acid probes are different from each other; the 4 groups of nucleic acid probes are mixed or not mixed; the mole number of the first group of nucleic acid probes and that of the fourth group of nucleic acid probes are equal; the mole number of the second group of nucleic acid probes and that of the third group of nucleic acid probes are equal; the sum of the mole number of the first group of nucleic acid probes and that of the fourth group of nucleic acid probes is less than or equal to the sum of the mole number of the second group of nucleic acid probes and that of the third group of nucleic acid probes; and the molar ratio of the first group of nucleic acid probes:the second group of nucleic acid probes:the third group of nucleic acid probes:the fourth group of nucleic acid probes is (0.5-2):(3-5):(3-5):(0.5-2); more preferably 1:4:4:1.

2. A kit, which comprises the nucleic acid probe combination according to claim 1; preferably, the kit further comprises one or more selected from the group consisting of a reagent capable of cleaving the ligation between the first moiety and the linker, a buffer for dissolving the nucleic acid probe, and a sequencing primer; preferably, the reagents in the kit are free of silver ion.

3. The kit according to claim 2, wherein the reagent capable of cleaving the ligation between the first moiety and the linker is an endonuclease, an organic phosphide, or a complex of PdCl.sub.2 and sulfonated triphenylphosphine.

4. A ligation solution, which comprises the nucleic acid probe combination according to claim 1, and a DNA ligase.

5. The ligation solution according to claim 4, characterized in any one or more of the following (1) to (5): (1) the DNA ligase is one or more selected from the group consisting of T4 DNA ligase, T7 DNA ligase, and T3 DNA ligase; (2) the concentration of the nucleic acid probe is 0.1 µM to 5 µM, preferably 1 µM; (3) the concentration of the DNA ligase is 0.01 µM to 2 µM, preferably 0.5 µM; (4) further comprising the following components: 50 mM CH.sub.3COOK, 20 mM Tris, 10 mM Mg(CH.sub.3COO).sub.2, 100 µg/ml BSA, 1 mM ATP, 10% PEG6000; (5) the rest of the ligation liquid is water.

6. A method for sequencing nucleic acid, comprising the following steps: (1) hybridizing a sequencing primer to a nucleic acid molecule to be tested; (2) ligating the nucleic acid probe combination according to claim 1 to the sequencing primer; (3) eluting the nucleic acid probe that has not bound to the nucleic acid molecule to be tested; (4) detecting the detectable label of the nucleic acid probe binding to the nucleic acid molecule to be tested, and determining the base information of the first moiety; (5) cleaving the ligation between the first moiety of the nucleic acid probe and the linker, and eluting the rest of the nucleic acid probe except the first moiety; preferably, further comprising the following steps: (6) repeating the above steps (2) to (4) or (2) to (5).

7. The kit according to claim 3, wherein the endonuclease is endonuclease IV or endonuclease V.

8. The kit according to claim 3, wherein the organic phosphide is THPP or TCEP.

9. The nucleic acid probe combination according to claim 1, wherein the first moiety is located at the 5'-terminal or the 3'-terminal.

10. The nucleic acid probe combination according to claim 1, wherein the bases of the second moiety are 3 to 15 bases, preferably 5 to 12 bases, and more preferably 5 to 10 bases, particularly preferably 6 to 9 bases.

11. The nucleic acid probe combination according to claim 1, wherein the detectable label is a fluorophore; preferably one or more selected from the group consisting of cy3, cy5, Texas Red, 6-FAM™, AF532, AF647 and AF688; preferably, the detectable label is ligated to the second moiety; preferably, the detectable label is ligated to 3'-OH at the end of the second moiety; preferably, the detectable label is ligated to 3'-OH at the end of the second moiety via a phosphoester bond.

12. The nucleic acid probe combination according to claim 1, wherein the linker is selected from the group represented by the following Formula IV to Formula IX: ##STR00035## wherein, in Formula IV, R.sup.1 is selected from a group consisting of H, OH, C.sub.1-C.sub.6 alkyl, C.sub.2-C.sub.6 alkenyl, and C.sub.2-C.sub.6 alkynyl; R.sup.2 is selected from a group consisting of H, OH, F, Cl, and Br.

13. The nucleic acid probe combination according to claim 1, wherein the molar ratio of the first group of nucleic acid probes:the second group of nucleic acid probes:the third group of nucleic acid probes:the fourth group of nucleic acid probes is 1:4:4:1.

Description

BRIEF DESCRIPTION OF THE DRAWINGS

(1) FIG. 1 shows a schematic diagram of excising AP site-containing probes under the action of endonuclease IV.

SPECIFIC MODES FOR CARRYING OUT THE INVENTION

(2) The embodiments of the present invention will be described in detail below with reference to examples, but those skilled in the art will understand that the following examples are only used to illustrate the present invention and should not be considered as limiting the scope of the present invention. If the specific conditions are not indicated in the examples, the conventional conditions or the conditions recommended by the manufacturer are used. If the reagents or instruments used are not specified by the manufacturer, they are all conventional products that are commercially available.

(3) The nucleic acid probes used in the following examples can be synthesized according to the methods known in the art, and unless otherwise specified, they were synthesized by a commissioned commercial company, such as Heya Medical Technology (Shanghai) Co., Ltd. or Biotech Biotechnology (Shanghai) Co., Ltd.

Example 1: Sequencing Application of AP Site Reversible Ligation Probe (6 Random Bases)

(4) 1. Instruments and Reagents

(5) The instrument was based on a BGISEQ-500 platform. In principle, other sequencing platforms (such as Illumina's HiSeq platform) could be appropriately adapted to perform experiments identical or similar to those in this example.

(6) In addition, in order to enable the application on the BGISEQ-500 platform, the selected modified dye had absorption and emission wavelengths similar to those of the dye used by the BGISEQ-500 reagent, so that it could be well detected by the BGISEQ-500 optical system.

(7) Some of the reagents used in this experiment were completely the same as those of BGISEQ-500, including the photographic buffer reagent and elution buffer 2 used in this experiment.

(8) Other reagents used in this experiment differed from those of BGISEQ-500: a ligation solution containing the probes, enzyme and buffer of this example replaced the probe polymerization reaction solution of BGISEQ-500, and the excision buffer of this experiment replaced the excision buffer of BGISEQ-500.

(9) The experimental sample was the genomic DNA of E. coli, which was a standard sample carried by BGISEQ-500.

(10) According to the manufacturer's instructions, MGIEasy™ DNA library preparation kit (Shenzhen Huada Zhizao Technology Co., Ltd.) was used to extract DNA from E. coli standard strains as raw materials for preparing a library for sequencing, and the library was loaded on a sequencing chip.

(11) 2. Design and Synthesis of Probes

(12) Four groups of AP site reversible ligation probes were as follows (x=6):

(13) The first group (A probes): the first moiety, i.e., the sequencing base was A, and the second moiety was 6 random bases.

(14) ##STR00011##

(15) The second group (T probes): the first moiety, i.e., the sequencing base was T, and the second moiety was 6 random bases.

(16) ##STR00012##

(17) The third group (C probes): the first moiety, i.e., the sequencing base was C, and the second moiety was 6 random bases.

(18) ##STR00013##

(19) The fourth group (G probes): the first moiety, i.e., the sequencing base was G, and the second moiety was 6 random bases.

(20) ##STR00014##

(21) Biosynthetic Engineering (Shanghai) Co., Ltd. was commissioned to synthesize the above probes.
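The size of each probe group described above follows from simple counting: one fixed sequencing base plus x random positions gives 4^x distinct sequences per group. The following is a minimal illustrative sketch, not part of the original disclosure. It assumes that the sequencing base is written first, ignores the linker and the label, and uses a hypothetical function name `probe_pool`.

```python
from itertools import product

BASES = "ACGT"


def probe_pool(sequencing_base: str, n_random: int) -> list[str]:
    """Enumerate every base sequence in one probe group.

    Assumption for illustration: the sequencing base (first moiety) is
    written first, followed by n_random random bases (second moiety).
    The linker and the detectable label are not modeled.
    """
    if sequencing_base not in BASES:
        raise ValueError(f"unexpected sequencing base: {sequencing_base!r}")
    return [
        sequencing_base + "".join(tail)
        for tail in product(BASES, repeat=n_random)
    ]


pool_a = probe_pool("A", 6)
print(len(pool_a))  # 4**6 = 4096 sequences in the A group for x=6
```

Because the random positions are typically produced as a degenerate mixture during synthesis, each group can be ordered as a single oligonucleotide pool rather than as 4096 separate probes.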

(22) The above 4 groups of probes and T4 DNA ligase were dissolved in the following buffer: 50 mM CH.sub.3COOK, 20 mM Tris, 10 mM Mg(CH.sub.3COO).sub.2, 100 µg/ml BSA, 1 mM ATP, 10% PEG6000.

(23) A ligation solution was obtained.

(24) In the ligation solution, the concentration of the probes was 1 µM, in which the molar ratio of A probes:T probes:C probes:G probes was approximately 1:4:4:1.

(25) In the ligation solution, the concentration of the DNA ligase was 0.5 µM.
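The probe concentrations stated above can be broken down per group with a short calculation. The sketch below is illustrative only and assumes that the stated 1 µM is the total concentration of all four probe groups together; the function names `split_by_ratio` and `within_claimed_ranges` are hypothetical. The second function checks a ratio against the ranges recited in claim 1.

```python
def split_by_ratio(total_um: float, ratio: dict[str, float]) -> dict[str, float]:
    """Split a total probe concentration (in µM) according to a molar ratio."""
    parts = sum(ratio.values())
    return {group: total_um * share / parts for group, share in ratio.items()}


def within_claimed_ranges(ratio: dict[str, float]) -> bool:
    """Check an A:T:C:G molar ratio against the conditions recited in claim 1."""
    a, t, c, g = (ratio[key] for key in "ATCG")
    return (
        0.5 <= a <= 2 and 0.5 <= g <= 2   # first and fourth groups: 0.5-2
        and 3 <= t <= 5 and 3 <= c <= 5   # second and third groups: 3-5
        and a == g and t == c             # equal mole numbers within each pair
        and a + g <= t + c                # A+G does not exceed T+C
    )


ratio = {"A": 1, "T": 4, "C": 4, "G": 1}
# Assumption: 1 µM is the combined concentration of the four groups.
per_group = split_by_ratio(1.0, ratio)
print(per_group)
print(within_claimed_ranges(ratio))
```

Under this assumption the 1:4:4:1 ratio corresponds to 0.1 µM each of the A and G probes and 0.4 µM each of the T and C probes.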

(26) 3. Sequencing Steps

(27) Referring to the instruction manual of BGISEQ-500, the following preliminary preparations were performed: library construction, a DNA single-stranded loop was amplified into DNA nanospheres, the DNA nanospheres were loaded on the chip carried by BGISEQ-500, and the sequencing primer was loaded on the DNA nanospheres.

(28) A ligation solution containing the above four kinds of probes, a T4 DNA ligase and a buffer was added by using an instrument, and the ligation reaction was performed at 25 °C for 30 minutes;

(29) The elution reagent 2 was used to elute the probes that were not ligated;

(30) Then a photographic buffer was added for image acquisition (photographing), and software was used to analyze the base information of each DNA nanosphere site;

(31) After taking the picture, endonuclease IV (New England Biolabs, article number M0304L) and buffer thereof were added, and the reaction was performed at 37 °C for 5 minutes to excise the AP site, and then the elution reagent 2 was added to elute the excised moiety of the probe;

(32) The 4 groups of probes could be added repeatedly to perform the next cycle of sequencing.
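The ligate, wash, image and cleave cycle described in the steps above can be summarized as a toy simulation. This is an idealized sketch rather than the instrument's actual logic: it assumes error-free ligation, reads the template one position per cycle in string order without modeling strand orientation, and uses hypothetical label names because the assignment of dyes to probe groups is not given here.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

# Hypothetical label names: one distinct label per probe group.
LABEL_OF_PROBE = {"A": "label_1", "T": "label_2", "C": "label_3", "G": "label_4"}
BASE_OF_LABEL = {label: base for base, label in LABEL_OF_PROBE.items()}


def sequence_by_ligation(template: str, cycles: int) -> str:
    """Return the bases called over the given number of cycles (toy model)."""
    called = []
    for position in range(min(cycles, len(template))):
        # Ligation: only the probe whose sequencing base pairs with the
        # template base next to the extended primer is joined.
        probe_base = COMPLEMENT[template[position]]
        # Elution removes unligated probes; imaging reads the remaining label.
        signal = LABEL_OF_PROBE[probe_base]
        called.append(BASE_OF_LABEL[signal])
        # Cleavage removes the second moiety and the label, leaving the
        # sequencing base on the primer so the next cycle reads the next base.
    return "".join(called)


print(sequence_by_ligation("GATTACA", cycles=7))  # prints CTAATGT
```

In this model each cycle extends the primer by exactly one base, which is why no replacement with a new primer is needed between cycles.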

(33) 4. Experimental Results

(34) The result was completely consistent with the sequence of the standard sample carried by BGISEQ-500, indicating that the sequencing method of the present invention was accurate.

Example 2: Sequencing Application of AP Site Reversible Ligation Probe (3 Random Bases+3 Universal Bases)

(35) This example was performed by substantially the same method as that in Example 1, except that the following 4 groups of probes were used:

(36) The first group (A probes): the first moiety, i.e., the sequencing base was A, and the second moiety was 3 random bases+3 universal bases.

(37) ##STR00015##

(38) The second group (T probes): the first moiety, i.e., the sequencing base was T, and the second moiety was 3 random bases+3 universal bases.

(39) ##STR00016##

(40) The third group (C probes): the first moiety, i.e., the sequencing base was C, and the second moiety was 3 random bases+3 universal bases.

(41) ##STR00017##

(42) The fourth group (G probes): the first moiety, i.e., the sequencing base was G, and the second moiety was 3 random bases+3 universal bases.

(43) ##STR00018##

(44) The result was completely consistent with the sequence of the standard sample carried by BGISEQ-500, indicating that the sequencing method of the present invention was accurate.
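One consequence of replacing random bases with universal bases is a smaller number of distinct sequences per probe group, since only the random positions multiply the pool size. The sketch below illustrates the count under stated assumptions: "N" is the IUPAC code for any of the four bases, "X" is a hypothetical placeholder for a universal base, and the order of random and universal positions within the second moiety is chosen arbitrarily.

```python
RANDOM = "N"     # IUPAC code: any of A, C, G, T
UNIVERSAL = "X"  # hypothetical placeholder for a universal base


def second_moiety_pattern(n_random: int, n_universal: int = 0) -> str:
    """Write the second moiety as a degenerate pattern (position order assumed)."""
    return RANDOM * n_random + UNIVERSAL * n_universal


def distinct_sequences(pattern: str) -> int:
    """Count the distinct sequences: each random position contributes a factor of 4."""
    return 4 ** pattern.count(RANDOM)


for n_random, n_universal in [(6, 0), (3, 3)]:
    pattern = second_moiety_pattern(n_random, n_universal)
    print(pattern, distinct_sequences(pattern))
```

Both second moieties are 6 positions long, but the 6-random-base design spans 4096 sequences per group while the 3+3 design spans 64.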

Example 3: Sequencing Application of AP Site Reversible Ligation Probe (7 Random Bases)

(45) This example was performed by substantially the same method as that in Example 1, except that the 4 groups of AP site reversible ligation probes with x=7 were used.

(46) The result was completely consistent with the sequence of the standard sample carried by BGISEQ-500, indicating that the sequencing method of the present invention was accurate.

Example 4: Sequencing Application of AP Site Reversible Ligation Probe (8 Random Bases)

(47) This example was performed by substantially the same method as that in Example 1, except that the 4 groups of AP site reversible ligation probes with x=8 were used.

(48) The result was completely consistent with the sequence of the standard sample carried by BGISEQ-500, indicating that the sequencing method of the present invention was accurate.

Example 5: Sequencing Application of AP Site Reversible Ligation Probe (9 Random Bases)

(49) This example was performed by substantially the same method as that in Example 1, except that the 4 groups of AP site reversible ligation probes with x=9 were used.

(50) The result was completely consistent with the sequence of the standard sample carried by BGISEQ-500, indicating that the sequencing method of the present invention was accurate.

Example 6: Sequencing Application of Chemical Group (Azide) Reversible Ligation Probe (6 Random Bases)

(51) This example was performed by substantially the same method as that in Example 1, except for the following differences:

(52) 1. Instruments and Reagents

(53) The same as those in Example 1.

(54) 2. Design and Synthesis of Probes

(55) The following 4 groups of probes were used, where x=6:

(56) The first group (A probes): the first moiety, i.e., the sequencing base was A, and the second moiety was 6 random bases.

(57) ##STR00019##

(58) The second group (T probes): the first moiety, i.e., the sequencing base was T, and the second moiety was 6 random bases.

(59) ##STR00020##

(60) The third group (C probes): the first moiety, i.e., the sequencing base was C, and the second moiety was 6 random bases.

(61) ##STR00021##

(62) The fourth group (G probes): the first moiety, i.e., the sequencing base was G, and the second moiety was 6 random bases.

(63) ##STR00022##

(64) The above 4 groups of probes and T4 DNA ligase were dissolved in the following buffer: 50 mM CH.sub.3COOK, 20 mM Tris, 10 mM Mg(CH.sub.3COO).sub.2, 100 µg/ml BSA, 1 mM ATP, 10% PEG6000.

(65) A ligation solution was obtained.

(66) In the ligation solution, the concentration of the probes was 1 µM, in which the molar ratio of A probes:T probes:C probes:G probes was approximately 1:4:4:1.

(67) The concentration of T4 DNA ligase in the ligation solution was 0.5 µM.

(68) 3. Sequencing Steps

(69) Referring to the instruction manual of BGISEQ-500, the following preliminary preparations were completed: library construction, a DNA single-stranded loop was amplified into DNA nanospheres, the DNA nanospheres were loaded on the chip carried by BGISEQ-500, and sequencing primers were loaded on the DNA nanospheres.

(70) A ligation solution containing the above 4 kinds of probes, a T4 DNA ligase and a buffer was added by using an instrument, and the ligation reaction was performed at 25 °C for 30 minutes;

(71) The elution reagent 2 was used to elute the probes that were not ligated;

(72) Then a photographic buffer was added for image acquisition (photographing), and software was used to analyze the base information of each DNA nanosphere site;

(73) After taking the picture, an excision reagent (whose composition was: 10 mM THPP, 200 mM Tris buffer at pH 9, and 0.5 M sodium chloride) was used to perform the excision reaction at 60 °C for 3 minutes, and then the elution reagent 2 was added to elute the excised moiety of the probe;

(74) The 4 groups of probes could be added repeatedly to perform the next cycle of sequencing.

(75) 4. Experimental Results

(76) Due to the small genome of E. coli, the inventors herein performed 30 cycles of sequencing, and the results were analyzed with the sequencing analysis software of BGISEQ-500. The results were shown in Table 1 below.

(77) TABLE 1

Parameter                               Value
Reference genome (Reference)            E. coli
Number of cycles (CycleNumber)          30
Photographing area (number of areas)    1632
Total reads (TotalReads)                352.89M
Mapped reads (MappedReads)              293.01M
Q30 (a)                                 77.5%
Lagging phase (Lag)                     0.78%
Leading phase (Runon)                   0.39%
Effective reads ratio (ESR)             80.73%
Mapping rate (MappingRate)              83.3%
Error rate (b)                          1.83%

(a) Q30 indicates that the probability that a base is miscalled is 0.1%, that is, the accuracy is 99.9%; a Q30 of 77.5% means that 77.5% of the base calls reach an accuracy of 99.9%.
(b) Average error rate.
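The relationship in footnote a of Table 1 follows from the standard Phred definition Q = -10·log10(p), where p is the probability that a base call is wrong. The sketch below shows the conversion and how a Q30 percentage can be computed from a list of per-base quality scores. The example scores are made up for illustration, and counting calls with Q of at least 30 is an assumed convention; it is not stated how the BGISEQ-500 software derives the figure.

```python
import math


def phred_to_error_probability(q: float) -> float:
    """Convert a Phred quality score to the probability of a wrong base call."""
    return 10 ** (-q / 10)


def error_probability_to_phred(p: float) -> float:
    """Convert an error probability to a Phred quality score."""
    return -10 * math.log10(p)


def fraction_at_least(qualities: list[int], threshold: int = 30) -> float:
    """Fraction of base calls whose quality score is at least the threshold."""
    if not qualities:
        raise ValueError("no quality scores given")
    return sum(q >= threshold for q in qualities) / len(qualities)


print(phred_to_error_probability(30))  # about 0.001, i.e. 99.9% accuracy

# Made-up scores for illustration only, not data from this example.
example_scores = [35, 30, 12, 40]
print(fraction_at_least(example_scores))  # 3 of 4 calls are at least Q30
```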

(78) The results showed that the method of the present invention had a Q30 of 77.5% and an error rate of only 1.83%, and that the read length could reach at least 30 bases. In addition, the cost of the present invention was significantly lower than that of the existing sequencing-by-ligation methods, because the number of probes was reduced and replacement with new primers was not necessary.

Example 7: Sequencing Application of Chemical Group (Allyl) Reversible Ligation Probe (6 Random Bases)

(79) This example was performed by substantially the same method as that in Example 1, except that the following 4 groups of chemical group (allyl) reversible ligation probes were used:

(80) The first group (A probes): the first moiety, i.e., the sequencing base was A, and the second moiety was 6 random bases.

(81) ##STR00023##

(82) The second group (T probes): the first moiety, i.e., the sequencing base was T, and the second moiety was 6 random bases.

(83) ##STR00024##

(84) The third group (C probes): the first moiety, i.e., the sequencing base was C, and the second moiety was 6 random bases.

(85) ##STR00025##

(86) The fourth group (G probes): the first moiety, i.e., the sequencing base was G, and the second moiety was 6 random bases.

(87) ##STR00026##

(88) The result was completely consistent with the sequence of the standard sample carried by BGISEQ-500, indicating that the sequencing method of the present invention was accurate.

Example 8: Sequencing Application of Chemical Group (Cyanovinyl) Reversible Ligation Probe (6 Random Bases)

(89) This example was performed by substantially the same method as that in Example 1, except that the following 4 groups of chemical group (cyanovinyl) reversible ligation probes were used:

(90) The first group (A probes): the first moiety, i.e., the sequencing base was A, and the second moiety was 6 random bases.

(91) ##STR00027##

(92) The second group (T probes): the first moiety, i.e., the sequencing base was T, and the second moiety was 6 random bases.

(93) ##STR00028##

(94) The third group (C probes): the first moiety, i.e., the sequencing base was C, and the second moiety was 6 random bases.

(95) ##STR00029##

(96) The fourth group (G probes): the first moiety, i.e., the sequencing base was G, and the second moiety was 6 random bases.

(97) ##STR00030##

(98) The result was completely consistent with the sequence of the standard sample carried by BGISEQ-500, indicating that the sequencing method of the present invention was accurate.

Example 9: Sequencing Application of Chemical Group (Inosine) Reversible Ligation Probe (6 Random Bases)

(99) This example was performed by substantially the same method as that in Example 1, except that the following 4 groups of chemical group (inosine) reversible ligation probes were used:

(100) The first group (A probes): the first moiety, i.e., the sequencing base was A, and the second moiety was 6 random bases.

(101) ##STR00031##

(102) The second group (T probes): the first moiety, i.e., the sequencing base was T, and the second moiety was 6 random bases.

(103) ##STR00032##

(104) The third group (C probes): the first moiety, i.e., the sequencing base was C, and the second moiety was 6 random bases.

(105) ##STR00033##

(106) The fourth group (G probes): the first moiety, i.e., the sequencing base was G, and the second moiety was 6 random bases.

(107) ##STR00034##

(108) The result was completely consistent with the sequence of the standard sample carried by BGISEQ-500, indicating that the sequencing method of the present invention was accurate.

(109) Although the specific embodiments of the present invention have been described in detail, those skilled in the art will understand that according to all the teachings that have been disclosed, various modifications and substitutions can be made to those details, and these changes are all within the protection scope of the present invention. The full scope of the invention is given by the appended claims and any equivalents thereof.