A METHOD FOR GENERATING RANDOM OLIGONUCLEOTIDES AND DETERMINING THEIR SEQUENCE
20220010306 · 2022-01-13
Inventors
Cpc classification
C12N15/111
CHEMISTRY; METALLURGY
C07H21/00
CHEMISTRY; METALLURGY
C12Q2525/179
CHEMISTRY; METALLURGY
C12Q1/6806
CHEMISTRY; METALLURGY
C12P19/34
CHEMISTRY; METALLURGY
International classification
C12N15/11
CHEMISTRY; METALLURGY
C07H21/00
CHEMISTRY; METALLURGY
Abstract
Random oligonucleotides are generated with incomplete information about the sequence of the nucleic acid bases present in the newly generated molecules. The sequences of the oligonucleotides are subsequently determined and then these oligonucleotides can be processed for various potential uses.
Claims
1. A method of generating an oligonucleotide, the method comprising: a. generating at least one molecule comprising nucleotides by adding at least one nucleotide at random to the molecule, wherein the molecule generated is a random oligonucleotide; b. determining the sequence of the random oligonucleotide; and c. selecting random oligonucleotides using certain characteristics of the random oligonucleotides.
2. The method of claim 1 wherein the random oligonucleotides are generated using phosphoramidite chemistry.
3. The method of claim 1, wherein the random oligonucleotides are generated using an enzymatic process.
4. The method of claim 1, wherein the random oligonucleotides are generated within a microwell.
5. The method of claim 1, wherein the random oligonucleotides are generated on a microarray.
6. The method of claim 1 wherein the characteristic used to select the random oligonucleotide is a specific sequence of nucleotides.
7. The method of claim 1 wherein the characteristic to select the random oligonucleotide is a size of the random oligonucleotide.
8. The method of claim 1 wherein the random oligonucleotides are generated on oligonucleotides having a sequence that is at least partially known.
9. The method of claim 3, wherein an indicator molecule becomes reactive after a nucleotide is added to the molecule.
10. The method of claim 1 wherein adding a nucleic acid base to the molecules is partially directed.
11. The method of claim 1, wherein microfluids are used to control reaction conditions.
12. The method of claim 1 wherein directed energy is used to control the reaction conditions.
13. The method of claim 1 where the properties of the random oligonucleotide are measured using a nanopore.
14. The method of claim 1 wherein the selected oligonucleotides are prepared for a specific use.
15. The method of claim 1 wherein the random oligonucleotide is measured so the random oligonucleotide can be identified using a similar or identical measuring technique.
16. A method of generating oligonucleotides, the method comprising: a. generating a random oligonucleotide by adding one nucleotide to another nucleotide at random; b. combining a first random oligonucleotide with another random oligonucleotide, wherein the first random oligonucleotide is combined to the other random oligonucleotide at random; c. screening the random oligonucleotides for certain characteristics; d. removing random oligonucleotides with certain characteristics; and e. combining the remaining random oligonucleotides at random.
17. The method of claim 16 further comprising repeating any of steps c-e to create random oligonucleotides with a desired characteristic.
18. The method of claim 16 wherein the characteristics used to screen the random oligonucleotides include the size of the random oligonucleotide.
19. The method of claim 16 wherein the characteristics used to screen the random oligonucleotides include the nucleotide sequence of the random oligonucleotide.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0024]
[0025]
[0026]
[0027]
DETAILED DESCRIPTION
[0028] Various embodiments are described herein, and those skilled in the art will recognize the embodiments described herein are provided only as examples. Those skilled in the art can change, substitute and/or vary certain aspects of the invention without departing from the present invention. The present invention described herein is not limited to specific materials, reagents nor a specific process. The terminology used herein is used to describe aspects of the invention required for its implementation, and is not intended to be limiting.
[0029] The technology described herein references specific instances, often using the singular form “a”, “an” or “the”, but reference to these instances in the singular form does not limit the invention to applications in which these instances occur in isolation or alone. Those skilled in the art can determine how often these instances must occur for the application of the invention, and whether these instances occur in parallel or in tandem.
[0030] As used herein, “random” or “randomly-generated” when referring to a molecule, refers to a molecule, such as an oligonucleotide and/or polynucleotide, which has a random nucleotide sequence since the molecule was generated by adding a nucleotide to the molecule at random.
[0031] As used herein, the term “oligonucleotide” refers to a molecule comprising a sequence of nucleotides, or nucleotide bases, which are linked together by some form of sugar phosphate backbone. The number of nucleotides linked together can be any number, as small as two and as large as one thousand or more. As used herein, the term “oligonucleotide” is interchangeable with the term “polynucleotide”. The exact length of the nucleic acid polymer to which the term “oligonucleotide” or “polynucleotide” refers can be determined by those skilled in the art.
[0032] As used herein, the term “signature” refers to any chemical, biological or physical measurement that can be made of a molecule, which can then be used to measure that molecule in the future, with or without perfect identification.
[0033] Current methods of nucleotide synthesis have at least two qualities. First, a known sequence of nucleic acids is used to generate the desired oligonucleotide. The newly generated oligonucleotide is typically desired to be created with a very high accuracy, e.g. above 99% accuracy. Second, the generation of the oligonucleotide is generally desired to occur en masse, that is, many identical oligonucleotides are generated at the same time.
[0034] According to the present invention described herein, the sequence of a random oligonucleotide does not necessarily need to be known prior to generating the random oligonucleotide, provided that its sequence can be determined prior to its use. Furthermore, for some applications, such as, but not limited to, molecular cryptography as described in PCT/US17/058076, PCT Application Publication No. WO 2018/081113 to Sawaya, filed 24 Oct. 2017, which is incorporated herein by reference, the exact sequence of the oligonucleotide need not be known perfectly, nor in its entirety. For some applications, only sufficient knowledge of the sequence is required to differentiate it from other random oligonucleotides. In fact, the nucleotide sequence of the oligonucleotide need not be known, as long as a “signature” of the oligonucleotide is determined prior to its use, and the signature is sufficiently distinct to differentiate the oligonucleotide from other random oligonucleotides.
[0035] Uniqueness of the random oligonucleotide is also required for some applications. In these applications, such as molecular cryptography (PCT/US17/058076, PCT application Publication No. WO 2018/081113), having more than one oligonucleotide with the exact same sequence is less than ideal. Although some of these applications can tolerate the presence of non-unique oligonucleotides, there is little use in having many (e.g. thousands, tens-of-thousands, or millions or more) oligonucleotides with the exact same sequence. Hence, the present invention takes a unique approach, and resolves unique challenges, in comparison to contemporary methods of nucleotide synthesis.
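As a rough illustration of why uniqueness is attainable with random generation, the sketch below estimates the chance that a pool contains at least one duplicated sequence. It assumes, purely for illustration, that every base is drawn uniformly and independently from four bases and uses the standard birthday-problem approximation; the function name `duplicate_probability` is hypothetical and not part of the described method.

```python
import math

def duplicate_probability(n_oligos, length, alphabet_size=4):
    """Approximate chance that at least two of n_oligos random oligonucleotides
    of the given length share the same sequence (birthday approximation).
    Assumes uniform, independent bases, which is an illustrative assumption."""
    space = alphabet_size ** length          # number of possible sequences
    if n_oligos > space:
        return 1.0                           # pigeonhole: a duplicate is certain
    exponent = n_oligos * (n_oligos - 1) / (2 * space)
    return -math.expm1(-exponent)            # 1 - exp(-x), stable for tiny x

# One million random 10-mers almost surely collide; one million 40-mers almost never do.
p_short = duplicate_probability(10**6, 10)
p_long = duplicate_probability(10**6, 40)
```

Under these assumptions, lengthening the random region is a simple lever for making duplicates negligible at a given pool size.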
[0036] The present invention generates unique, random oligonucleotides by: a) generating at least one molecule comprising nucleic acids by adding at least one nucleotide to the molecule at random, wherein the molecule generated is a random oligonucleotide; b) determining the nucleotide sequence of the random oligonucleotide; and c) selecting random oligonucleotides using certain characteristics of the random oligonucleotides. In certain embodiments, selecting random oligonucleotides is part of processing the oligonucleotides to prepare them for various uses. In certain embodiments, the molecules are measured. In certain embodiments, the random oligonucleotide can be identified using a similar or identical measuring technique used to measure the oligonucleotide.
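Steps (a) through (c) can be pictured with a minimal simulation. This is only a sketch: the four-letter alphabet, the perfect read in `determine_sequence`, and the example selection rule (no "CCCC" run) are assumptions made for illustration, and all function names are hypothetical.

```python
import random

BASES = "ACGT"

def generate_random_oligo(length, rng):
    """Step (a): extend a molecule one randomly chosen nucleotide at a time."""
    molecule = ""
    for _ in range(length):
        molecule += rng.choice(BASES)
    return molecule

def determine_sequence(molecule):
    """Step (b): stand-in for a sequencing read (assumed perfect here)."""
    return molecule

def select(oligos, has_wanted_characteristic):
    """Step (c): keep only oligonucleotides whose read passes the predicate."""
    return [o for o in oligos if has_wanted_characteristic(determine_sequence(o))]

rng = random.Random(0)  # seeded only so the sketch is reproducible
pool = [generate_random_oligo(20, rng) for _ in range(1000)]
selected = select(pool, lambda seq: "CCCC" not in seq)
```

Any other characteristic, such as length or the presence of a particular motif, could be substituted for the predicate.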
[0037] The synthesis of molecules and oligonucleotides to generate the molecules as described herein can be achieved by a range of methods known to those skilled in the art. These methods can include, but are not limited to, phosphoramidite chemistry and/or enzymatic-based synthesis.
[0038] A wide array of methods are available for oligonucleotide synthesis as described in Hughes and Ellington (2017) “Synthetic DNA synthesis and assembly: putting the synthetic in synthetic biology”. Cold Spring Harb. Perspect. Biol.; 9: a023812; and described in Kosuri and Church (2014) “Large-scale de novo DNA synthesis: technologies and applications.” Nature Methods; 11:499-507, each incorporated herein by reference. These methods can be utilized on their own, or combined, to generate random oligonucleotides. The methods for generating random oligonucleotides described herein are examples that can be utilized in this invention, but the present invention is not limited to these specific oligonucleotide synthesis methods. As the state of the art of oligonucleotide synthesis develops, alternative methods of oligonucleotide synthesis can be utilized in the generation of random oligonucleotides in the invention.
[0039] Currently, large-scale, low-cost nucleotide synthesis occurs using phosphoramidite chemistry on microarray synthesizers. Synthesis with this method allows multiple oligonucleotides to be generated in parallel with high sequence accuracy. Details about these methods are described in Heller (2002) DNA microarray technology: devices, systems, and applications. Annu Rev Biomed Eng., 4:129-53; and LeProust et al., (2010) Synthesis of high-quality libraries of long (150mer) oligonucleotides by a novel depurination controlled process. Nucleic Acids Res. 38(8): 2522-40, each incorporated herein by reference.
[0040] In certain embodiments, oligonucleotides having a specific sequence are synthesized in parallel on a microarray or in microwells using phosphoramidite synthesis (with current techniques such as but not limited to LeProust above; and U.S. Pat. No. 6,458,583 to Bruhn et al.; U.S. Pat. No. 7,544,793 to Gao et al.; U.S. Pat. No. 9,555,388 to Banyai et al., each of which is incorporated herein by reference), and these identical oligonucleotides serve as the substrate on which random oligonucleotides are then generated.
[0041] Referring to
[0042] The substrate oligonucleotides 102 can, in some embodiments, have properties that facilitate future use of the random oligonucleotides. For example, the substrate oligonucleotides 102 can comprise an appropriate sequence that can act as a primer for nucleotide amplification in a polymerase chain reaction (PCR) protocol. These substrate oligonucleotides 102 can also, in some embodiments, serve as indexes to help identify the oligonucleotides being generated. Furthermore, these substrate oligonucleotides 102 can also, in some embodiments, act as adapters, or serve other functions not limited to the functions discussed here. The generated random oligonucleotides 106 that are connected to the oligonucleotides of a known sequence can, in some embodiments, then act as substrates for the generation of oligonucleotides with a specific, known sequence 108. Following this, in some embodiments, more random oligonucleotides 106 can also be generated on this molecule, and in some embodiments, then more oligonucleotides of known sequence can be generated, and so forth in this manner, alternating between the generation of known and unknown, random sequence moieties until a desired molecule is generated.
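The alternation between known and random moieties can be sketched as follows. The primer-like sequence "ACACTGAC", the internal known segment "GATC", and the helper `build_molecule` are hypothetical placeholders chosen for illustration, not sequences or procedures taken from this disclosure.

```python
import random

BASES = "ACGT"

def build_molecule(segments, rng):
    """Assemble a molecule from alternating segments.
    A string is a known-sequence moiety; an int is the length of a random moiety."""
    parts = []
    for seg in segments:
        if isinstance(seg, int):
            random_region = "".join(rng.choice(BASES) for _ in range(seg))
            parts.append(("random", random_region))
        else:
            parts.append(("known", seg))
    return parts

rng = random.Random(5)
# Known substrate, random region, known index, random region (all illustrative).
parts = build_molecule(["ACACTGAC", 12, "GATC", 12], rng)
full_sequence = "".join(seq for _, seq in parts)
```

Keeping the segment labels alongside the sequence makes it easy to locate the known regions later, for example when they act as primers or indexes.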
[0043] After a group of identical oligonucleotides are synthesized in tandem on a microarray, to serve as substrates for synthesis of random oligonucleotides, in some embodiments, random oligonucleotide synthesis occurs upon these substrate oligonucleotides 102 using phosphoramidite synthesis. In contrast to traditional synthesis on microarrays, the random synthesis used to generate random oligonucleotides 106 is undirected, or in some embodiments, partially directed such that the exact sequence being generated is completely unknown, or when partially directed, partially unknown. As used herein, “partially directed”, refers to adding a selected nucleotide or oligonucleotide to an oligonucleotide in a manner that is not entirely random.
[0044] In some embodiments, random oligonucleotides are synthesized using a protein that enzymatically synthesizes DNA, such as a terminal deoxynucleotidyl transferase (TdT). In some embodiments, these protein enzymes are specifically designed to synthesize random oligonucleotides, so that they do not synthesize specific oligonucleotides dependent on any sequence moiety of the random oligonucleotide that has previously been generated by the enzyme.
[0045] In some embodiments, a random oligonucleotide is synthesized within a microwell in a system similar to, but not limited to, U.S. Pat. No. 9,845,501 to Williams; and U.S. Pat. No. 7,302,146 to Turner et al., each of which is incorporated herein by reference. In such embodiments, the synthesis of the random oligonucleotide includes the use of labelling molecules, such as the labelling techniques discussed within U.S. Pat. No. 9,845,501 to Williams; and U.S. Pat. No. 8,580,539 to Korlach, each of which is incorporated by reference. The labelling of the molecules serves to determine the sequence of the random oligonucleotide being synthesized immediately after or shortly after a nucleotide is added to the random oligonucleotide. In these embodiments, the random oligonucleotide being generated by this process is only of an unknown sequence for a short period of time. The newly added nucleotide becomes known and the sequence of the oligonucleotide is thus determined. In these embodiments, the newly generated oligonucleotide becomes the substrate oligonucleotide on which another random oligonucleotide or nucleotide is chemically bonded. The sequence is then determined after a random oligonucleotide or a nucleotide is added. This continues in this manner until a random oligonucleotide of approximately-known length and approximately-known or fully-known sequence has been generated. In such embodiments, the methods alternate between (a) randomly generating an oligonucleotide, and (b) determining the sequence of that oligonucleotide. The method for determining the sequence of the oligonucleotide in such embodiments is further detailed below discussing methods for determining sequence composition of the randomly-generated oligonucleotides.
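The alternation between (a) random addition and (b) immediate determination can be sketched as a loop. The `read_error_rate` parameter is an assumption introduced here to model an imperfect label read, which is one way an "approximately-known" sequence could arise; it is not a figure from this disclosure.

```python
import random

BASES = "ACGT"

def synthesize_and_read(length, read_error_rate, rng):
    """Alternate between adding a random nucleotide and reading its label.
    Returns (true_sequence, observed_sequence)."""
    true_seq, observed = [], []
    for _ in range(length):
        base = rng.choice(BASES)                 # (a) random addition
        true_seq.append(base)
        if rng.random() < read_error_rate:       # (b) label read, sometimes wrong
            observed.append(rng.choice(BASES.replace(base, "")))
        else:
            observed.append(base)
    return "".join(true_seq), "".join(observed)

rng = random.Random(3)
true_seq, observed_seq = synthesize_and_read(100, 0.02, rng)
mismatches = sum(a != b for a, b in zip(true_seq, observed_seq))
```

With a zero error rate the observed sequence equals the true sequence; with a small nonzero rate the record is approximately known.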
[0046] An entirely undirected synthesis of oligonucleotides can lead to some random oligonucleotides being generated that may be problematic for the future use of these oligonucleotides. For example, homopolymer runs, such as oligonucleotides with a “CCCC” sequence of nucleotides, can lead to non-specific annealing and/or problems with sequencing on some sequencing platforms (see Xu et al. (2009) Design of 240,000 orthogonal 25mer DNA barcode probes, PNAS, Vol. 106 No. 7 pp. 2289-2294). To avoid these and other less-than-ideal random oligonucleotides prior to their synthesis, reagents used in the reactions can be partially directed in certain embodiments, using ink-jet technology, for example as described in U.S. Pat. No. 6,221,653 to Caren et al.; U.S. Pat. No. 6,476,215 to Okamoto et al.; U.S. Pat. No. 6,077,674 to Schleifer et al.; and U.S. Pat. No. 7,572,907 to Dellinger et al., each incorporated herein by reference.
[0047] In certain embodiments, the reactions can also be partially directed using micromirrors to regulate light-controlled reactions, such as those described in U.S. Pat. No. 6,545,758 to Sandstrom and U.S. Pat. No. 7,157,229 to Cerrina et al. each incorporated herein by reference. An example of partial control over random synthesis of oligonucleotides would be to limit the quantity of a given nucleotide base. For example, if a run of homopolymer cytosines were less-than-ideal for a given application, then nucleotide bases other than cytosine could be favorably directed, using, for example, ink-jet technology, in the synthesis of the random oligonucleotide. Furthermore, light controlled reactions can be modified with micromirrors. Using light controlled reactions, specific nucleic acid bases can be favored or disfavored depending on the nucleotide bases that are estimated or known to be present in the solution in proximity to the reaction.
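One simple way to picture partial direction is to withhold a base whenever adding it would extend a homopolymer run past a chosen limit. The sketch below is an idealized model: the `max_run` limit and the assumption that a base can be excluded completely at a given step are illustrative choices, not properties of any particular ink-jet or micromirror system.

```python
import random

BASES = "ACGT"

def partially_directed_oligo(length, max_run, rng):
    """Add bases at random, but disfavor (here: exclude) the base that would
    extend a homopolymer run beyond max_run."""
    seq = []
    for _ in range(length):
        allowed = BASES
        if len(seq) >= max_run and len(set(seq[-max_run:])) == 1:
            allowed = BASES.replace(seq[-1], "")   # withhold the repeated base
        seq.append(rng.choice(allowed))
    return "".join(seq)

def longest_run(seq):
    """Length of the longest homopolymer run in seq."""
    best = run = 0
    prev = None
    for base in seq:
        run = run + 1 if base == prev else 1
        prev = base
        best = max(best, run)
    return best

rng = random.Random(1)
oligo = partially_directed_oligo(50, 3, rng)
```

The resulting sequence is still random, but no longer uniformly so, which matches the description of a partially unknown sequence.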
[0048] In certain embodiments, after a random oligonucleotide is generated, a non-random oligonucleotide or nucleotide is attached to the random oligonucleotide, either by directly generating on the random oligonucleotide, and/or attaching a non-random oligonucleotide to the random oligonucleotide by ligation.
[0049] In certain embodiments, random oligonucleotides are generated in a multi-step method. The method comprises generating at least one shorter random oligonucleotide. The shorter random oligonucleotides are then combined to generate longer random oligonucleotides. The longer random oligonucleotides can then be combined to generate even longer oligonucleotides. The random oligonucleotides are combined, for example, through ligation using an enzyme or molecular bonding directed by phosphoramidite chemistry, to generate longer oligonucleotides. In certain embodiments, a filtration step is used to remove unwanted random oligonucleotides at one or more stages of the random oligonucleotide generation. In certain embodiments, the multi-step method comprises: generating a random oligonucleotide by adding one nucleotide to another nucleotide at random; combining a first random oligonucleotide to another random oligonucleotide at random; screening the random oligonucleotides for certain characteristics; removing random oligonucleotides with certain characteristics; and combining the remaining random oligonucleotides at random. The screening, removing, and combining steps can be repeated to create random oligonucleotides with desired characteristics.
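The multi-step method can be sketched as a small pipeline: generate dimers, combine them at random, screen and remove unwanted products, then combine the remainder again. Treating ligation as simple end-to-end string joining, and using homopolymers as the unwanted characteristic, are illustrative assumptions; the helper names are hypothetical.

```python
import random

BASES = "ACGT"

def random_dimers(n, rng):
    """Generate n dimers by adding one random nucleotide to another."""
    return [rng.choice(BASES) + rng.choice(BASES) for _ in range(n)]

def combine_at_random(pool, rng):
    """Pair pool members in random order and join each pair end to end."""
    shuffled = pool[:]
    rng.shuffle(shuffled)
    return [shuffled[i] + shuffled[i + 1] for i in range(0, len(shuffled) - 1, 2)]

def is_homopolymer(seq):
    """Screen: True if the oligonucleotide consists of a single base type."""
    return len(set(seq)) == 1

def remove_unwanted(pool, unwanted):
    """Remove every oligonucleotide flagged by the screen."""
    return [s for s in pool if not unwanted(s)]

rng = random.Random(42)
dimers = random_dimers(4000, rng)
four_mers = remove_unwanted(combine_at_random(dimers, rng), is_homopolymer)
eight_mers = combine_at_random(four_mers, rng)
```

Repeating the screen, remove, and combine steps on `eight_mers` would extend the pool to longer products in the same way.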
[0050] Referring to
[0051] A four-mer 203 is an oligonucleotide comprising four nucleotides. A four-mer oligonucleotide has 16² (that is, 4⁴ = 256) combinations of nucleic acid bases. If four-mer oligonucleotides are generated by combining dimer oligonucleotides, the four-mer oligonucleotides can, for example, be screened for homopolymer oligonucleotides in certain embodiments. A homopolymer oligonucleotide comprises a single type of nucleotide base, e.g. AAAA, TTTT, CCCC, and GGGG. This screening can occur by a process such as, but not limited to, washing the four-mer oligonucleotides over a microarray of oligonucleotides with terminal sequences that are themselves homopolymer oligonucleotides, thus complementary to the less-than-ideal, randomly-generated homopolymer oligonucleotides. In certain embodiments, the dimers are random dimers and the four-mers are random four-mers.
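The count of four-mer combinations can be checked by direct enumeration. The sketch below assumes the four standard bases A, C, G, and T and builds every four-mer as a pair of dimers.

```python
from itertools import product

BASES = "ACGT"

# All 16 possible dimers, then every ordered pair of dimers (16 x 16 four-mers).
dimers = ["".join(pair) for pair in product(BASES, repeat=2)]
four_mers = [first + second for first in dimers for second in dimers]

# Homopolymers are the four-mers made of a single base type.
homopolymers = sorted(s for s in four_mers if len(set(s)) == 1)
fraction_removed = len(homopolymers) / len(four_mers)
```

Only 4 of the 256 four-mers are homopolymers, so a homopolymer screen at this stage removes a small fraction of the pool.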
[0052] In certain embodiments, screening for the unwanted oligonucleotides occurs during the generation of the random oligonucleotides. Screening occurs by using oligonucleotides that have sequences complementary to the unwanted oligonucleotides as filters that bind to the unwanted oligonucleotides to remove them from a solution of random oligonucleotides.
[0053] An exemplary embodiment, as seen in
[0054] In certain embodiments, random oligonucleotides are chemically bonded with oligonucleotides of a known sequence. These known oligonucleotides can, in some embodiments, be linked to the 3′ and/or 5′ end of the random oligonucleotides. In some embodiments, this process occurs multiple times, resulting in oligonucleotides that contain a combination of random oligonucleotides and oligonucleotides of a known sequence.
[0055] In some embodiments, after random oligonucleotides have been generated, they can be filtered for unwanted oligonucleotides. The filtering for unwanted oligonucleotides can occur before or after the random oligonucleotides have been linked physically with other oligonucleotides. The other oligonucleotides can have a known sequence, a partially random sequence, or entirely unknown sequence. This filtration can occur by removing unwanted oligonucleotides. Unwanted oligonucleotides can be removed, for example, by filtering the generated oligonucleotides based on size, using for example, column-based size separation techniques.
[0056] In certain embodiments, removing unwanted oligonucleotides can occur in combination with other filtration processes, such as filtering for unwanted sequences. Filtering for unwanted sequences can be achieved by any of the following methods or combination thereof: binding of unwanted oligonucleotides to complementary oligonucleotides bound to a surface or bead, and subsequently washing the desired oligonucleotides away from the unwanted oligonucleotides; binding of unwanted oligonucleotides to complementary oligonucleotides that are bound to proteins which can facilitate degradation of unwanted oligonucleotides; using restriction endonucleases to cleave unwanted oligonucleotides, thus reducing their size and allowing for subsequent filtration by size to remove the unwanted oligonucleotides; and/or using any other method that is known by those skilled in the art to filter out oligonucleotides having unwanted sequences.
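The restriction-endonuclease route can be sketched as a digest followed by a size cut-off. The model below is deliberately simplified: it treats each oligonucleotide as a single strand, uses the EcoRI recognition site GAATTC (cut after the first G) purely as an example, and assumes a hypothetical `min_length` threshold for the size filter.

```python
def digest(seq, site, cut_offset):
    """Cleave seq at every occurrence of site, cut_offset bases into the site.
    Assumes 0 <= cut_offset <= len(site)."""
    fragments = []
    start = 0
    pos = seq.find(site)
    while pos != -1:
        fragments.append(seq[start:pos + cut_offset])
        start = pos + cut_offset
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])
    return fragments

def digest_then_size_select(pool, site, cut_offset, min_length):
    """Cleave unwanted oligonucleotides, then keep only fragments long enough
    to pass the size filter."""
    kept = []
    for oligo in pool:
        kept.extend(f for f in digest(oligo, site, cut_offset) if len(f) >= min_length)
    return kept

pool = ["AAAGAATTCTTT", "ACGTACGTACGT"]
survivors = digest_then_size_select(pool, "GAATTC", 1, 10)
```

In this example the first oligonucleotide carries the site, is cut into fragments of 4 and 8 bases, and both fragments fall below the cut-off, while the second passes intact.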
[0057] To determine the nucleotide sequence of the generated random oligonucleotides, a range of sequencing techniques can be used, individually or in combination. Those skilled in the art can determine the best, preferred sequencing technique(s), and as sequencing technology advances, those techniques may be utilized for determining sequences according to the present invention. Importantly, sequencing techniques that are used to determine the nucleotide sequences of the random oligonucleotides must either allow the random oligonucleotides to be collected for use after they have been sequenced, or alternatively, generate a copy or complementary oligonucleotide that can be collected after the random oligonucleotide has been sequenced. This requirement allows the generated random oligonucleotides to be used after their sequence has been determined.
[0058] In certain embodiments, the random oligonucleotide must be processed prior to the determination of its sequence. This processing can include, but is not limited to, addition of nucleotides to the random oligonucleotide, the removal of specific oligonucleotides based on their size or sequence, and/or changing the solution in which the random oligonucleotides exist, as those skilled in the art can appreciate. Specific methods for processing random oligonucleotides to prepare them for sequencing depend on the sequencing method to be used. As methods for sequencing chains of nucleic acids change, processing the random oligonucleotides to prepare them for sequencing may also change, and those skilled in the art can determine the appropriate methods for processing the random oligonucleotides to prepare them for sequencing.
[0059] In certain embodiments, random oligonucleotides need to be collected after sequencing. One potential option among sequencers that allow for the collection of nucleotides after sequencing is the Pacific Biosciences single-molecule real-time sequencer, for example as discussed in Eid et al. (2009) “Real-Time DNA Sequencing from Single Polymerase Molecules”, Science 323(5910):133-38. The sequencers described therein observe a polymerase replicating a polynucleotide in real-time using indicator molecules that can uniquely identify nucleotides as they are incorporated into a polynucleotide as it is being replicated.
[0060] The result is less-than-ideal for some situations for which the invention can be used, as extraction of these polynucleotides after they have been sequenced will result in more than one copy of the polynucleotides that were originally sequenced. For example, if the randomly-generated oligonucleotides are to be used in molecular cryptography, such as described in PCT Application No. PCT/US17/058076 (Publication No. WO 2018/081113), then duplicate oligonucleotides are less than ideal. Although the synthesized polynucleotide could, in some embodiments, be separated from the original polynucleotides using an attachment of a bead to the polynucleotide or oligonucleotide being sequenced, other approaches may provide a better yield.
[0061] In some embodiments, an approach to separate the synthesized polynucleotide from the original polynucleotide would be to randomly generate a single-stranded polynucleotide or oligonucleotide, or a partially single-stranded polynucleotide. In such embodiments, the copy of the molecule generated during sequencing would be the complement of the randomly-generated oligonucleotide. Assuming that the molecule being sequenced does not contain a hair-pin loop (such as is currently used in Pacific Biosciences sequence preparation, U.S. Pat. No. 9,404,146 to Travers et al., incorporated herein by reference), the newly synthesized polynucleotide and the original, complementary molecule would together form a randomly-generated double-stranded oligonucleotide with a known sequence that can be extracted from the sequencer.
[0062] In certain embodiments, sequencing technology can be utilized to directly generate a random oligonucleotide and then immediately determine its sequence. Sequencing technology observes the reaction between a polymerase and polynucleotide by observing indicator molecules, which indicate that a given nucleotide base has been incorporated into the polynucleotide to be sequenced. In certain embodiments, the reaction is ideally directed by an enzyme, such as a TdT, that extends the oligonucleotide that is being generated randomly.
[0063] Referring to
[0064] Referring to
[0065] Referring to
[0066] In some embodiments, a new nucleotide is added to the oligonucleotide at random because the solution in the proximity of the enzyme contains an approximately equal quantity of various nucleotide bases. In other embodiments, the solution surrounding the enzyme that is synthesizing the random oligonucleotide does not contain an equal proportion of nucleotide bases and thus the composition of the oligonucleotide, while still random, is not expected to have equal proportions of each nucleotide base. In some embodiments, the solution surrounding the enzyme is directed to contain specific ratios of nucleotide bases in order to direct the generation of the random oligonucleotide. This direction can, for example, favor the incorporation of certain nucleotide bases over others to control the composition of the random oligonucleotide.
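The effect of unequal base ratios in the solution can be sketched with weighted random sampling. The simplifying assumption here is that the chance of incorporating a base is proportional to its relative abundance; real enzymes may show additional base preferences. The 3:1:3:3 ratio is an arbitrary illustrative value.

```python
import random
from collections import Counter

def biased_oligo(length, ratios, rng):
    """Generate a random oligonucleotide whose base composition follows the
    relative abundances given in ratios (a dict of base -> weight)."""
    bases = list(ratios)
    weights = [ratios[b] for b in bases]
    return "".join(rng.choices(bases, weights=weights, k=length))

rng = random.Random(7)
# Cytosine is supplied at one third the abundance of each other base.
oligo = biased_oligo(10000, {"A": 3, "C": 1, "G": 3, "T": 3}, rng)
composition = Counter(oligo)
```

The sequence remains unpredictable at each position, but its overall composition shifts toward the favored bases.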
[0067] Another sequencing technology that has the ability to recover an oligonucleotide after its sequence has been determined is nanopore sequencing, using sequencers such as those described in Loman and Watson (2015) “Successful test launch for nanopore sequencing” Nature Methods volume 12, pages 303-304, incorporated herein by reference. In certain embodiments, the random oligonucleotide is generated in its entirety, and then the sequence is determined using a nanopore sequencer that directly determines the random oligonucleotide's sequence after which the oligonucleotide is recovered.
[0068] In other embodiments, nanopore technology is used to observe the incorporation of a random nucleotide to an oligonucleotide by using an indicator molecule, similar but not identical to Fuller et al. (2016), “Real-time single-molecule electronic DNA sequencing by synthesis using polymer-tagged nucleotides on a nanopore array,” PNAS, May 10, 2016, Vol. 113 No. 19, pp. 5233-5238 (available at www.pnas.org/cgi/doi/10.1073/pnas.1601782113) incorporated by reference herein.
[0069]
[0070] Referring to
[0071] Referring to
[0072] In some embodiments, the nucleotide bases in the reaction solution need to be regulated to ensure that the sequence of oligonucleotide being generated is entirely random. With an entirely random sequence, the presence of one nucleotide base at any given position in the random oligonucleotide cannot be used to predict the presence or absence of a nucleotide base at another position in the random oligonucleotide.
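One way to check the independence property described above on a sequenced pool is to estimate the mutual information between two positions, which is near zero when one position carries no information about the other. This is offered only as an illustrative diagnostic under the assumption that a large sample of sequences is available; the function name is hypothetical.

```python
import math
import random
from collections import Counter

BASES = "ACGT"

def mutual_information(pool, i, j):
    """Empirical mutual information (in bits) between positions i and j."""
    n = len(pool)
    pair = Counter((s[i], s[j]) for s in pool)
    left = Counter(s[i] for s in pool)
    right = Counter(s[j] for s in pool)
    mi = 0.0
    for (a, b), count in pair.items():
        mi += (count / n) * math.log2(count * n / (left[a] * right[b]))
    return mi

rng = random.Random(0)
independent = ["".join(rng.choice(BASES) for _ in range(6)) for _ in range(20000)]
# A deliberately non-random pool: position 1 always copies position 0.
coupled = [s[0] + s[0] + s[2:] for s in independent]

mi_independent = mutual_information(independent, 0, 1)
mi_coupled = mutual_information(coupled, 0, 1)
```

For the independent pool the estimate is close to zero, while for the coupled pool it approaches two bits, the full information content of one base.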
[0073] In embodiments where the sequence of the oligonucleotide is determined immediately after or shortly after the oligonucleotide has been generated, or determined when the oligonucleotide is currently being generated, the solution around this generation may, in some embodiments, be controlled. In some embodiments, a microfluidic control mechanism is used, in which microchannels regulate the reactive solution, allowing the solution to flow across the reactive area or microwell. In some embodiments, ink-jet technology is utilized to regulate the reactive solution in real-time. In some embodiments, a flow of solution containing nucleotide bases in a specific concentration is washed across the reactive surface or microwell. In some embodiments, reactions driven by directed energy, such as heat or light, are controlled using micromirrors and/or other forms of control over the direction of energy that influences the reaction.
[0074] In some embodiments, the exact sequence of the random oligonucleotide is not determined with certainty. Knowledge of the exact sequence of the random oligonucleotide may not be necessary for some uses of the random oligonucleotide. For some uses of the random oligonucleotide, imperfect information about the sequence of a given random oligonucleotide is sufficient to differentiate that oligonucleotide from other randomly-generated oligonucleotides. Therefore, in some embodiments, partial sequence information, or partially inaccurate sequence information is obtained from the randomly-generated oligonucleotides. In some embodiments, a “signature” of the random oligonucleotide is obtained from a sequencer, which can then be used to identify a randomly-generated oligonucleotide and/or differentiate it from other random oligonucleotides. This signature can be, for example but not limited to, the electrical signal obtained from passage of the random oligonucleotide through a pore embedded in a membrane, or the kinetic signature of the molecule obtained from its interaction with another molecule or enzyme. The signature obtained from a given sequencing method can then be used to identify the random oligonucleotide. For example, the random oligonucleotide can be identified in a protocol where the random oligonucleotide is used as an identifier for another oligonucleotide that has been ligated to the random oligonucleotide, when sequencing the combined oligonucleotide on the same or similar sequencing platform.
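Identification from imperfect information can be sketched as a nearest-match lookup. Here the "signature" is modelled as a noisy base call and compared by Hamming distance; this is an assumption for illustration only, since a real signature could equally be an electrical or kinetic trace. The sequences and the `max_mismatches` tolerance are hypothetical.

```python
def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))

def identify(read, known, max_mismatches):
    """Return the single known oligonucleotide within max_mismatches of the
    read, or None if there is no match or the match is ambiguous."""
    hits = [k for k in known
            if len(k) == len(read) and hamming(read, k) <= max_mismatches]
    return hits[0] if len(hits) == 1 else None

known = ["ACGTACGTAC", "TTGACCATGA", "GGCATCGATT"]
match = identify("ACGTACGAAC", known, 2)      # one miscalled base
no_match = identify("CCCCCCCCCC", known, 2)   # far from every known oligo
```

Because random oligonucleotides of useful length are typically far apart from one another, a tolerance of a few mismatches can still single out the right one.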
[0075] Methods for Processing the Randomly-Generated Molecules.
[0076] The random oligonucleotides generated by the method of the present invention can be used in various technologies, such as molecular cryptography (PCT Application No. PCT/US17/058076), or other methods that require molecular level identification, such as methods described in U.S. Patent Pub. Nos. 2015/0211050 and 2015/0211061, each of which is incorporated herein by reference. The method of processing the randomly-generated molecules depends on the technology using the randomly-generated molecules.
[0077] In some embodiments, the random oligonucleotides are processed prior to the determination of their sequence by selecting certain oligonucleotides having certain characteristics. The processing allows efficient sequencing, and the nucleotide sequences generated by the sequencer are accurate and generated only for oligonucleotides that are of use. However, in some embodiments, it can also desired to also process the oligonucleotides after their sequence has been determined. This would be required when, for example, the sequence of the oligonucleotide being generated is determined immediately after or shortly after its generation. This may also be required when generating a large random oligonucleotide and its sequence is subsequently determined, and this large random oligonucleotide must be processed into smaller oligonucleotides prior to use.
[0078] In some embodiments, random oligonucleotides are processed by selecting for size, using a selection technique such as, but not limited to, column separation. Molecules of an unwanted size can be separated from molecules of a wanted size, choosing molecules that contain a specific number of nucleotide bases when desired.
[0079] In some embodiments, random oligonucleotides are screened for specific sequences by removing oligonucleotides that have less-than-ideal and/or unwanted sequences. In these embodiments, random oligonucleotides with nucleotide sequences complementary to the unwanted oligonucleotides can be attached to a surface such as a microarray, or can be attached to magnetic beads. The random oligonucleotides can then be introduced to this surface by washing them over the microarray, or introduced to a solution containing the magnetic beads. The desired random oligonucleotides remain in solution after the solution has been washed over the microarray, or remain in solution when the magnetic beads are removed. Those skilled in the art can determine the best techniques to screen for specific oligonucleotide sequences to avoid the presence of these unwanted molecules in a final mixture.
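The screening step can be sketched computationally as follows. This is an idealized model that assumes perfect hybridization: an oligonucleotide is treated as captured if it contains a stretch that is exactly complementary to a surface-bound or bead-bound probe. The function names are hypothetical, and real capture efficiency depends on conditions that this sketch does not model.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def screen(pool, unwanted_motifs):
    """Split a pool into oligonucleotides left in solution and those captured.

    Probes are the reverse complements of the unwanted motifs. An oligonucleotide
    is captured when it contains the sequence that a probe would hybridize to.
    """
    probes = [reverse_complement(motif) for motif in unwanted_motifs]
    in_solution, captured = [], []
    for oligo in pool:
        binds = any(reverse_complement(probe) in oligo for probe in probes)
        (captured if binds else in_solution).append(oligo)
    return in_solution, captured


pool = ["ACGTGGGGTA", "TTACGATCCA", "GGGGACGT"]
kept, removed = screen(pool, ["GGGG"])
print(kept)     # oligonucleotides that remain in solution
print(removed)  # oligonucleotides retained on the surface or beads
```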
[0080] In some embodiments, restriction enzymes are used to digest random oligonucleotides that have a specific sequence. This digestion reduces the size of the oligonucleotides, breaking them into smaller oligonucleotides. These smaller molecules can be size selected and separated from the larger oligonucleotides.
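The digestion and subsequent size selection can be illustrated with a short sketch. It models each oligonucleotide as a single string and uses the EcoRI recognition site GAATTC, cut after the first G, purely as an example; the choice of enzyme, the function names, and the length thresholds are assumptions made for illustration, and the sketch ignores the fact that most restriction enzymes act on double-stranded DNA.

```python
def digest(seq, site, cut_offset):
    """Cut `seq` at every occurrence of `site`, `cut_offset` bases into the site.

    Returns the resulting fragments in 5' to 3' order.
    """
    fragments = []
    start = 0
    pos = seq.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(seq[start:cut])
        start = cut
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])
    return [f for f in fragments if f]  # drop empty fragments at the ends


def size_select(fragments, min_len, max_len):
    """Keep only fragments whose length lies within [min_len, max_len]."""
    return [f for f in fragments if min_len <= len(f) <= max_len]


fragments = digest("AAGAATTCCCGAATTCTT", site="GAATTC", cut_offset=1)
print(fragments)
print(size_select(fragments, min_len=5, max_len=10))
```

An oligonucleotide that lacks the recognition site is returned as a single full-length fragment, so the same size filter separates digested from undigested molecules.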
[0081] In some embodiments, the random oligonucleotides are ligated and/or otherwise attached onto other oligonucleotides, using, for example, T-A ligation or any other method used by those skilled in the art. This ligation can be used to incorporate the random oligonucleotides into technology for other purposes, for example purposes disclosed in U.S. Patent Pub. Nos. 2015/0211050 and 2015/0211061.
[0082] In some embodiments, the generated random oligonucleotides are single-stranded. In some embodiments, the process by which the sequence of the random oligonucleotide is determined results in single-stranded random oligonucleotides. If the sequenced random oligonucleotides are entirely single-stranded and their desired use requires them to be double-stranded, then in some embodiments, a technique is used to generate a strand with a complementary sequence. One such technique may be the use of a DNA polymerase along with random, degenerate primers to generate matching strands. Random primers are not always necessary. In some embodiments, the randomly-generated single-stranded oligonucleotides are attached to a double-stranded DNA molecule prior to or during the determination of their sequence. In certain embodiments, the randomly-generated single-stranded oligonucleotides are ligated to a double-stranded DNA molecule after their sequence has been determined. In some embodiments, the random oligonucleotides are attached to molecules with a known sequence moiety, for which specific primers can be designed to generate complementary strands using a polymerase.
[0083] In some embodiments, after the random oligonucleotides are processed into their desired form, they can be packaged and this package can be labelled with the sequences present in the solution. In some embodiments, when the randomly-generated molecules are to be used in molecular cryptography (see PCT/US17/058076), the exact sequences and/or signatures of the molecules in the solution must be kept secure. In these embodiments, the information about the sequences present is not packaged with the solution of the molecules and can instead be sent in a separate package, and/or the data file containing the sequence information can be securely delivered to the appropriate parties.
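The design of a specific primer against a known sequence moiety can be sketched as follows. The sketch assumes that the known moiety sits at the 3' end of the single-stranded template, that the primer anneals there with perfect complementarity, and that the polymerase extends to the end of the template. The function names and the example sequences are hypothetical.

```python
_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq))


def design_primer(known_moiety):
    """Return a primer (5' to 3') that anneals to the known moiety."""
    return reverse_complement(known_moiety)


def extend(template, primer):
    """Model polymerase extension of `primer` along `template`.

    Returns the newly synthesized strand (5' to 3'), or None if the primer
    does not anneal to the 3' end of the template.
    """
    annealing_site = reverse_complement(primer)
    if not template.endswith(annealing_site):
        return None
    return reverse_complement(template)


random_part = "ACGGTCA"    # randomly generated region (example value)
known_moiety = "GATTACA"   # attached region of known sequence (example value)
template = random_part + known_moiety

primer = design_primer(known_moiety)
second_strand = extend(template, primer)
print(primer)
print(second_strand)
```

Because the primer depends only on the known moiety, the same primer can be used for every random oligonucleotide in the pool without knowledge of the random region.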
[0084] Although the invention has been described in detail with reference to certain preferred embodiments, variations and modifications exist within the scope and spirit of one or more independent aspects of the invention as described.