In vitro selection with expanded genetic alphabets

09725713 · 2017-08-08

    Abstract

    This invention provides for products and processes for binding to a preselected target, where the process involves contacting this target to an oligonucleotide molecule that contains one or more “non-standard” nucleotides, which are nucleotide analogs that, when incorporated into oligonucleotides (DNA or RNA, collectively xNA), present to a complementary strand in a Watson-Crick pairing geometry a pattern of hydrogen bonds that is different from the pattern presented by adenine, guanine, cytosine, and uracil. This disclosure provides an example where such an oligonucleotide molecule contains a single 2-amino-8-(1′-β-D-2′-deoxyribofuranosyl)-imidazo[1,2-a]-1,3,5-triazin-4(8H)one and a single 6-amino-5-nitro-3-(1′-β-D-2′-deoxyribofuranosyl)-2(1H)-pyridone, and where the target is a cell, and is obtained by a process of in vitro selection.

    Claims

    1. A process for identifying one or more oligonucleotides that bind to a target, wherein at least one nucleobase of each of said one or more oligonucleotides has a structure selected from the group consisting of ##STR00001## wherein R is the point of attachment of said nucleobase to said nucleotide, from a candidate mixture comprising single stranded oligonucleotides each having a region of randomized sequence, said process comprising: a) contacting the candidate mixture with the target in aqueous solution; b) separating the oligonucleotides having higher affinity for the target from oligonucleotides having lower affinity for the target; c) amplifying the oligonucleotides having higher affinity for the target to yield a mixture enriched in said oligonucleotides having higher affinity; and d) determining the sequence of one or more of said oligonucleotides, wherein said target is not a Watson-Crick complementary oligonucleotide.

    2. The process of claim 1, wherein steps a), b), and c) are repeated multiple times on successively enriched mixtures.

    3. The process of claim 1, wherein said nucleobase is ##STR00002## where R is the point of attachment of said nucleobase to the oligonucleotides at position 1′ of the ribose or 2′-deoxyribose of said oligonucleotide.

    4. The process of claim 1, wherein said nucleobase is ##STR00003## where R is the point of attachment of said nucleobase to the oligonucleotide at position 1′ of the ribose or 2′-deoxyribose of said oligonucleotide.

    5. The process of claim 1, wherein said nucleobase is ##STR00004## where R is the point of attachment of said nucleobase to the oligonucleotide at position 1′ of the ribose or 2′-deoxyribose of said oligonucleotide.

    6. The process of claim 1, wherein said nucleobase is ##STR00005## where R is the point of attachment of said nucleobase to the oligonucleotide at position 1′ of the ribose or 2′-deoxyribose of said oligonucleotide.

    7. The process of claim 1, wherein said nucleobase is ##STR00006## where R is the point of attachment of said nucleobase to the oligonucleotide at position 1′ of the ribose or 2′-deoxyribose of said oligonucleotide.

    8. The process of claim 1, wherein said nucleobase is ##STR00007## where R is the point of attachment of said nucleobase to the oligonucleotide at position 1′ of the ribose or 2′-deoxyribose of said oligonucleotide.

    9. The process of claim 1, wherein said nucleobase is ##STR00008## where R is the point of attachment of said nucleobase to the oligonucleotide at position 1′ of the ribose or 2′-deoxyribose of said oligonucleotide.

    10. The process of claim 1, wherein said target is on the surface of a cell.

    11. The process of claim 1, wherein said target is a protein.

    12. The process of claim 1, wherein said target is a small molecule.

    Description

    BRIEF DESCRIPTION OF THE DRAWINGS

    (1) FIG. 1 Watson-Crick pairing rules follow two rules of complementarity: (a) size complementarity (large purines pair with small pyrimidines) and (b) hydrogen bonding complementarity (hydrogen bond donors, D, pair with hydrogen bond acceptors A). Rearranging D and A groups on various heterocycles supports an artificially expanded genetic information system (AEGIS). AEGIS nucleobases can also be functionalized at the position indicated by the “R” in these structures. Thus, AEGIS offers a solution to the limitations of aptamers by increasing the number of building blocks, and functionalizing an expanded set of building blocks.
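
    By way of illustration only, the two complementarity rules of FIG. 1 may be sketched in Python as follows. The function name is hypothetical and is not part of this disclosure; the sketch assumes only that a pattern is written as "py" or "pu" followed by three D/A letters, as in the FIG. 1 nomenclature.

```python
def complementary_pattern(pattern: str) -> str:
    """Return the hydrogen-bonding pattern that pairs with `pattern`.

    A pattern is 'py' (small) or 'pu' (large) followed by three letters,
    each 'D' (hydrogen bond donor) or 'A' (acceptor), e.g. 'pyDDA'.
    """
    size, groups = pattern[:2], pattern[2:]
    if size not in ("py", "pu") or len(groups) != 3 or set(groups) - {"D", "A"}:
        raise ValueError(f"not a valid pattern: {pattern!r}")
    # Size complementarity: small pairs with large.
    other_size = "pu" if size == "py" else "py"
    # Hydrogen-bonding complementarity: each donor faces an acceptor.
    swapped = "".join("A" if g == "D" else "D" for g in groups)
    return other_size + swapped


for p in ("pyDDA", "pyAAD", "pyADA", "pyDAA"):
    print(p, "pairs with", complementary_pattern(p))
```

    Under this rule, pyDDA pairs with puAAD and pyAAD pairs with puDDA, which are the pairings named elsewhere in this description.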

    DESCRIPTION OF INVENTION

    (2) Definition of Non-Standard Components of an Artificially Expanded Genetic Information System

    (3) This application teaches a distinction between the hydrogen-bonding pattern (in FIG. 1 nomenclature, pyDAD, for example) and the heterocycle that implements it. Thus, the pyADA hydrogen-bonding pattern is implemented by thymidine, uridine, and pseudouridine. The puDDA hydrogen-bonding pattern is implemented by both the heterocycle isoguanosine and 7-deaza-isoguanosine. Heterocycles to implement any particular pre-selected hydrogen-bonding pattern are preferred depending on their chemical properties, for example, high chemical stability or low tautomeric ambiguity. The pyADA, pyDAA, puADD, and puDAD hydrogen-bonding patterns are said to be “standard” hydrogen-bonding patterns, and to form with their appropriate

    (4) U.S. Pat. No. 6,569,620: Method for the automated generation of nucleic acid ligands. This invention covers a method and device for performing automated SELEX.

    (5) Each of these is incorporated in its entirety herein by reference. Each of these could also be applied to IVS based on AEGIS components, if only the steps not enabled in the art were to be enabled.

    (6) Processes that are Absent in the Prior Art

    (7) Missing from the art for standard IVS and all of its variants, and not obvious to those of ordinary skill in the art, are the remaining steps in the IVS process. Specifically:

    (8) (c) PCR amplification using AEGIS components was absent from the art prior to the priority date of this application. For the instant invention, AEGIS PCR amplification with respect to pyDDA:puAAD pairs is made available in U.S. patent application Ser. No. 12/999138, having the title “Polymerase incorporation of non-standard nucleotides”, which is herein incorporated in its entirety by reference. This application provides for processes that PCR amplify DNA containing G, A, C, T, Z, and P nucleotides. With respect to pyAAD:puDDA pairs, AEGIS PCR amplification is made available in U.S. patent application Ser. No. 12/800826, which describes variants of isoguanine with lower amounts of minor tautomeric forms, and which is herein incorporated in its entirety by reference.

    (9) (d) Also absent in the prior art are processes to do the repeated cycling needed to obtain useful aptamers and DNAzymes for selection survivors containing AEGIS components, as this requires PCR amplification of nucleic acid analogues containing AEGIS components.

    (10) (e) Also absent in the prior art are procedures to efficiently sequence DNA containing AEGIS components. While methods in the art, including dideoxy sequencing, might be applied to DNA containing AEGIS components, the challenges associated with this application have to date prevented any successful AEGIS IVS. A workable method of sequencing is disclosed here, and is based on U.S. patent application Ser. No. 12/999138, which is herein incorporated in its entirety by reference. This application provides for processes that convert Z:P pairs in DNA into A:T pairs and/or C:G pairs, or isoC:7-deazaisoG pairs into T:A pairs, enabling a process for efficiently sequencing aptamers/xNAzymes built from G, A, C, T, Z, and P nucleotides. This method for sequencing employs the following steps:

    (11) (a) Perform amplification under conditions that convert Z:P pairs sometimes to C:G pairs and sometimes to T:A pairs;

    (12) (b) Shotgun clone the products of that amplification, now built entirely from standard nucleotides;

    (13) (c) Sequence the cloned material using high throughput DNA sequencing; and

    (14) (d) Align and compare the sequences recovered.

    (15) In this “converting nucleosides” strategy (U.S. Ser. No. 13/493172), two populations of standard DNA are generated from one precursor of GACTZP DNA. Sites that originally held Z in the precursor hold either C or T in the converted sequence. This generates a “C” call half of the time, and a “T” call the other half of the time. Similarly, sites that originally held P will generate either a “G” call or an “A” call. Sites that originally held G, A, C, and T will give uniform calls in all of the sequences returned. Thus, the sequence of the precursor and the positions of Z and P in that sequence can be inferred.
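
    The logic of the “converting nucleosides” strategy can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the software used in the Example: the function names and the six-letter toy precursor are hypothetical, the converted sequences are assumed to be already aligned and of equal length, and every conversion outcome is assumed to be observed at least once.

```python
from itertools import product

# Conversion outcomes described in the text: Z reads as C or T, P reads as G or A.
CONVERSIONS = {"Z": "CT", "P": "GA"}


def converted_variants(precursor: str) -> list[str]:
    """List every standard (GACT) sequence that a GACTZP precursor can convert to."""
    choices = [CONVERSIONS.get(base, base) for base in precursor]
    return ["".join(combo) for combo in product(*choices)]


def infer_precursor(converted: list[str]) -> str:
    """Rebuild the precursor from aligned converted sequences, site by site."""
    inferred = []
    for calls in zip(*converted):
        seen = set(calls)
        if len(seen) == 1:
            inferred.append(calls[0])   # uniform call: a standard nucleotide
        elif seen == {"C", "T"}:
            inferred.append("Z")        # mixed C/T calls mark an original Z
        elif seen == {"G", "A"}:
            inferred.append("P")        # mixed G/A calls mark an original P
        else:
            inferred.append("N")        # not explained by Z or P conversion
    return "".join(inferred)


toy_precursor = "GAZTPC"  # hypothetical toy sequence, for illustration only
variants = converted_variants(toy_precursor)
print(variants)
print(infer_precursor(variants))
```

    A site is marked "N" in this sketch when its calls fit neither rule; real data would additionally require a tolerance for sequencing error.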

    (16) Description of the Preferred Embodiments

    (17) The presently preferred AEGIS components to support IVS are nucleosides that implement the pyDDA and puAAD hydrogen bonding patterns, as follows. For DNA, the presently preferred implementation of the pyDDA hydrogen bonding pattern is the nucleoside analog 6-amino-5-nitro-3-(1′-β-D-2′-deoxyribofuranosyl)-2(1H)-pyridone. The presently preferred implementation of the puAAD hydrogen bonding pattern is the nucleoside analog 2-amino-8-(1′-β-D-2′-deoxyribofuranosyl)-imidazo[1,2-a]-1,3,5-triazin-4(8H)one. These are trivially named dZ and dP, respectively; their ribonucleoside analogues are preferred to implement IVS based on an RNA-like scaffold.

    (18) For the pyAAD hydrogen-bonding pattern, the presently preferred nucleobase embodiments are isocytosine and pseudocytosine disclosed in U.S. Pat. No. 7,741,294, which is incorporated in its entirety herein by reference. For the puDDA hydrogen-bonding pattern, the presently preferred nucleobase embodiment is 7-deazaisoguanine.

    EXAMPLE

    Example 1

    (19) Selection of an Aptamer that Binds to a Line of Human Breast Cancer Cells

    (20) This in vitro selection (or AEGIS-SELEX) example exploited two additional nucleotides (2-amino-8-(1′-β-D-2′-deoxyribofuranosyl)-imidazo[1,2-a]-1,3,5-triazin-4(8H)one, trivially called P, and 6-amino-5-nitro-3-(1′-β-D-2′-deoxyribofuranosyl)-2(1H)-pyridone, trivially called Z).

    (21) Synthesis and Purification of GACTZP Libraries Containing Four Natural Nucleotides (G, A, C, and T) and AEGIS Nucleotides (Z and P) to Support AEGIS-SELEX.

    (22) All dZ and dP containing oligonucleotides (Tables S1 and S4) were synthesized using standard phosphoramidite chemistry on controlled pore glass (CPG) supports on an ABI 394 DNA Synthesizer. Protected dZ and dP phosphoramidites were purchased from Firebird Biomolecular Sciences LLC (Alachua, Fla., www.firebirdbio.com, Cat. # DZPhosphor-101, Cat. # DPPhosphor-102). Standard phosphoramidites (Bz-dA, Ac-dC, dmf-dG, and dT) were purchased from Glen Research (Sterling, Va.). The oligonucleotides were designed to have forward and reverse primer binding sites (each 16 nucleotides in length) with a random region (20 nts) containing GACTZP (six nucleotides) at each site in equimolar concentrations. Coupling times were 60 seconds. The CPG-bound DMT-off oligonucleotides were incubated with triethylamine-acetonitrile (1:1 v/v, 1.5 mL) for 1 hour at 25° C. After removal of the supernatant, the CPG-bound oligonucleotides were treated with another 1.5 mL of triethylamine-acetonitrile (1:1 v/v) overnight at 25° C. After removal of the supernatant, the CPG-bound oligonucleotides were incubated with 1.0 mL of DBU in anhydrous CH.sub.3CN (1 M) at room temperature for ˜18 hours to remove the protecting groups on dZ. After removal of CH.sub.3CN, the dZ and dP containing oligonucleotides were further treated with NH.sub.4OH (55° C., overnight). The product mixture was resolved by denaturing PAGE (7 M urea), and extracted with TEAA buffer (0.2 M, pH=7.0). The product was desalted using Sep-Pak® Plus C18 cartridges (Waters). All 5′-biotinylated dZ and dP containing potential aptamers were synthesized, deprotected, and purified in house based on the above methods. All standard 5′-biotinylated oligonucleotides were purchased from IDT and purified by HPLC.

    (23) Cell Lines.

    (24) Triple negative breast cancer cells (MDA-MB-231, ATCC® HTB-26™) were cultured using ATCC recommended media and reagents (Incubate cultures at 37° C. without CO.sub.2. http://atcc.org/Products/All/HTB-26.aspx#7301B7F956944F8382B6192957C08A3B).

    (25) Experimental Procedure of AEGIS-SELEX.

    (26) To begin the AEGIS-SELEX experiment, MDA-MB-231 cells were seeded in culture flasks (25 mL). These cells adhere to the walls of the flask and were grown to about 97% coverage of the culture flask. Cells were washed with washing buffer (4.5 g/liter glucose, 5 mM MgCl.sub.2 in Dulbecco's PBS). Five nanomoles of GACTZP DNA library were dissolved in 700 μl of binding buffer (4.5 g/liter glucose, 5 mM MgCl.sub.2, 0.1 mg/ml tRNA and 1 mg/ml BSA, all in Dulbecco's PBS).

    (27) The GACTZP DNA library was denatured by heating at 95° C. for 3 min, and then “snap cooled” on ice for 10 min. The library was then incubated with the cells, still adhering to the walls of the flask, at 4° C. on a rocker for 1 hour. Cells were gently washed three times with washing buffer to remove unbound sequences. Binding buffer (0.5 mL) was added and the cells were scraped off the flask using a cell scraper to recover cell/DNA complexes.

    (28) Once the cells were scraped from the walls of the flask into a suspension in PBS buffer, they were heated (95° C. for 15 min). The resulting mixture was centrifuged at 14000 rpm to pellet the cell debris. The supernatant containing the ssDNA survivors was recovered.

    (29) The recovered survivors were then amplified by six-nucleotide PCR using FITC- and biotin-labeled primers (Table S1) with a six-nucleotide triphosphate mixture (dZTP, dPTP, dGTP, dATP, dCTP, and dTTP). Different numbers of PCR cycles (from 8 cycles to 25 cycles) were tested to determine the optimum number of cycles for preparative PCR, that is, the number producing the maximal amount of amplicon with the fewest PCR artifacts. Reagents and conditions are listed in Table S2.

    (30) TABLE S1. GACTZP DNA library, 6-nucleotide PCR primers, and barcoded primers for deep sequencing

    Name                     SEQ ID NO.   Sequence
    GACTZP DNA Library       1            5′-TCCCGAGTGACGCAGC-N.sub.20-GGACACGGTGGCTGAC-3′; N = equimolar A, G, C, T, Z, and P phosphoramidites
    FITC-Primer              2            5′-FITC-TCCCGAGTGACGCAGC-3′
    Biotin-Primer            3            3′-CCTGTGCCACCGACTG-Biotin-5′
    A_Code2_Forward_56mer    4            5′-CCATCTCATCCCTGCGTGTCTCCGACTCAG-GATGATTGCC-TCCCGAGTGACGCAGC-3′ (segments: Adaptor A, Key, Barcode2, Forward Primer)
    A_Code6_Forward_56mer    5            5′-CCATCTCATCCCTGCGTGTCTCCGACTCAG-GACATTACTT-TCCCGAGTGACGCAGC-3′ (segments: Adaptor A, Key, Barcode6, Forward Primer)
    trP_Reverse_39mer        6            5′-CCTCTCTATGGGCAGTCGGTGAT-GTCAGCCACCGTGTCC-3′ (segments: Adaptor trP1, Reverse Primer)
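
    As an illustrative consistency check on the sequences of Table S1, the following Python sketch confirms that the Biotin-Primer (written 3′ to 5′ in the table) is the reverse complement of the reverse primer binding site of the library, and that the barcoded primers have the lengths given in their names. The variable names are hypothetical, and the grouping of “Adaptor A” and “Key” into a single 30-nucleotide segment is an assumption based on the hyphenation of the sequences.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a standard (ACGT) DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


FORWARD_SITE = "TCCCGAGTGACGCAGC"           # 5' primer binding site of the library
REVERSE_SITE = "GGACACGGTGGCTGAC"           # 3' primer binding site of the library
BIOTIN_PRIMER_3_TO_5 = "CCTGTGCCACCGACTG"   # as written (3'->5') in Table S1
ADAPTOR_A_KEY = "CCATCTCATCCCTGCGTGTCTCCGACTCAG"
BARCODE2 = "GATGATTGCC"
BARCODE6 = "GACATTACTT"
ADAPTOR_TRP1 = "CCTCTCTATGGGCAGTCGGTGAT"

# Re-write the biotinylated primer 5'->3' so it can be compared directly.
biotin_primer = BIOTIN_PRIMER_3_TO_5[::-1]

code2_forward = ADAPTOR_A_KEY + BARCODE2 + FORWARD_SITE
code6_forward = ADAPTOR_A_KEY + BARCODE6 + FORWARD_SITE
trp_reverse = ADAPTOR_TRP1 + reverse_complement(REVERSE_SITE)

print(biotin_primer == reverse_complement(REVERSE_SITE))
print(len(code2_forward), len(code6_forward), len(trp_reverse))
```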

    (31) TABLE S2. Typical six-nucleotide PCR amplification of GACTZP DNA library

    Reagents                                           Volume (μL)   Final conc.
    H.sub.2O                                           30.5          -
    FITC-Primer + Biotin-Primer mixture (each 10 μM)   2.5           0.5 μM
    10× Six-Nucleotide Mix:                            5.0
      dA, T, G/TPs (1 mM of each)                                    0.1 mM of each
      dCTP (2 mM)                                                    0.2 mM
      dZTP (1 mM)                                                    0.1 mM
      dPTP (6 mM)                                                    0.6 mM
    10× ThermoPol Buffer (pH = 8.0)                    5.0           1×
    GACTZP DNA library (survivors)                     5.0           (10% of reaction volume)
    HS Takara Taq DNA polymerase (2.5 units/μL)        2             0.10 U/μL
    Total volume (μL)                                  50.0

    Note: 1× ThermoPol Reaction Buffer (20 mM Tris-HCl, 10 mM (NH.sub.4).sub.2SO.sub.4, 10 mM KCl, 2 mM MgSO.sub.4, 0.1% Triton X-100, pH 8.0 at 25° C.); PCR cycling conditions: one cycle of 94° C. for 1 min; 8 to 25 cycles of (94° C. for 20 s, 55° C. for 30 s, 72° C. for 5 min); 72° C. for 10 min; 4° C. forever.
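
    The final concentrations in Table S2 follow from simple dilution (stock concentration times volume added, divided by total volume). A short Python sketch of that arithmetic is given below for illustration; the helper name and dictionary keys are hypothetical.

```python
import math


def final_concentration(stock: float, volume_added: float, total_volume: float) -> float:
    """Concentration after dilution, in the same unit as `stock`."""
    return stock * volume_added / total_volume


# Volumes (in microliters) taken from Table S2.
volumes_ul = {
    "water": 30.5,
    "primer mixture": 2.5,
    "10x six-nucleotide mix": 5.0,
    "10x buffer": 5.0,
    "library (survivors)": 5.0,
    "polymerase": 2.0,
}
total_ul = sum(volumes_ul.values())

primer_um = final_concentration(10.0, 2.5, total_ul)       # each primer, in uM
dptp_mm = final_concentration(6.0, 5.0, total_ul)           # dPTP, in mM
polymerase_u_per_ul = final_concentration(2.5, 2.0, total_ul)

print(total_ul, primer_um, dptp_mm, polymerase_u_per_ul)
assert math.isclose(total_ul, 50.0)
```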

    (32) Upon completion of six-nucleotide PCR, the FITC-labeled DNA strands were separated from the biotinylated strands by affinity purification with streptavidin-coated Sepharose beads (GE Healthcare Bio-Sciences Corp., Piscataway), followed by alkaline denaturation (with NaOH, 50 mM) and neutralization. The surviving ssDNA was desalted and resuspended in binding buffer to a final concentration of 0.5 μM.

    (33) The survivors were denatured at 95° C., snap cooled, and used to perform the second round of selection using the same procedure as described for the first round of selection. As a proof of concept, no counter selection was included in the course of the selection. The entire selection process was repeated until a sustained significant enrichment was obtained at the 11.sup.th and 12.sup.th rounds. During the selection, the stringency of selection was increased by increasing the volume of washing buffer and the number of washes.

    (34) Deep Sequencing of GACTZP DNA Survivors Using Next Generation Sequencing Technology.

    (35) Sequencing was done following the “conversion” strategy. Solutions containing enriched GACTZP DNA survivors after 12 rounds of AEGIS-SELEX were divided into two equal parts. These were separately converted into standard GACT DNA under two conversion conditions (Table S3) using primers that carried barcodes for the Ion Torrent instrument (Table S1):

    (36) TABLE S3. Converting Z:P to C:G (barcode6) or converting Z:P to T:A and C:G (barcode2)

    Components                             Z:P to C:G    Z:P to T:A and C:G   Final conc.
                                           conversion    conversion
    ddH.sub.2O                             33 μl         33 μl                50 μl (total volume)
    A_Code6_For_56mer (10 μM)              2 μl          -                    0.4 μM
    trP_Rev_39mer (10 μM)                  2 μl          -                    0.4 μM
    A_Code2_For_56mer (10 μM)              -             2 μl                 0.4 μM
    trP_Rev_39mer (10 μM)                  -             2 μl                 0.4 μM
    12th-Round Survivors                   1 μl          1 μl
    10× Five-Nucleotide Mix:               5 μl          -
      dZTP (0.1 mM)                                                           0.01 mM
      dC, G/TPs (4 mM of each)                                                0.4 mM of each
      dT, A/TPs (0.4 mM)                                                      0.04 mM of each
    10× Five-Nucleotide Mix:               -             5 μl
      dPTP (2 mM)                                                             0.2 mM
      dC, G/TPs (1 mM of each)                                                0.1 mM of each
      dT, A/TPs (1 mM)                                                        0.1 mM of each
    10× ThermoPol Buffer (pH 8.8)          5 μl          5 μl                 1×
    JumpStart Taq (2.5 units/μl, Sigma)    2 μl          2 μl                 0.1 U/μl

    Notes: 1. 1× ThermoPol Reaction Buffer (20 mM Tris-HCl, 10 mM (NH.sub.4).sub.2SO.sub.4, 10 mM KCl, 2 mM MgSO.sub.4, 0.1% Triton X-100, pH 8.8 at 25° C.); 2. PCR conditions: one cycle of 94° C. for 1 min; 11 cycles of (94° C. for 20 s, 57° C. for 30 s, 72° C. for 90 s); 72° C. for 10 min; 4° C. forever.

    (37) Following conversion, the samples were combined, purified by native agarose gel, and submitted for Ion Torrent “NextGen” sequencing (University of Florida, ICBR sequencing core facility). The products were aligned to identify sequences derived from a single common aptamer “ancestor”, and the ancestral sequence was inferred (see below).

    (38) Inference of GACTZP Aptamer Sequences.

    (39) Ion Torrent sequencing produced 2,975,012 reads, delivered in FASTQ format. Reads that did not contain exact matches to the barcode, forward priming, and reverse priming sequences were discarded, leaving 1,586,297 reads. To minimize miscalling, any read present in fewer than 80 copies was removed from the analysis. Remaining reads were then clustered using custom software (Bradley, FfAME), which ignored differing barcodes while clustering and accepted single-step changes within sequence reads. Clustered sequences were then separated by barcode, with variable sites being compared between the barcodes (differentiating the two conversion conditions). Sites where the variation resembled the previously documented base percentages for each conversion protocol (the first condition, with barcode6, generates predominantly Z:P to C:G conversion; the second condition, with barcode2, produces Z:P to C:G and T:A conversion, ˜50% of each) were marked as likely locations of conversion and were assigned as dZ and dP in the common “ancestor”.
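
    The read filtering described above can be sketched in Python as follows. This is an illustration only and is not the custom software referred to in the text. It assumes, hypothetically, that each read has already been trimmed so that it consists of a barcode, the forward priming sequence, a 20-nucleotide insert, and the reverse priming site; the function names are likewise hypothetical.

```python
from collections import Counter

FORWARD = "TCCCGAGTGACGCAGC"
REVERSE_SITE = "GGACACGGTGGCTGAC"
BARCODES = {"GATGATTGCC": "barcode2", "GACATTACTT": "barcode6"}


def parse_read(read: str, insert_len: int = 20):
    """Return (barcode name, insert) for an exactly matching read, else None."""
    for barcode, name in BARCODES.items():
        prefix = barcode + FORWARD
        expected_len = len(prefix) + insert_len + len(REVERSE_SITE)
        if (len(read) == expected_len
                and read.startswith(prefix)
                and read.endswith(REVERSE_SITE)):
            return name, read[len(prefix):len(prefix) + insert_len]
    return None


def filter_reads(reads, min_copies: int = 80):
    """Discard non-matching reads, then keep sequences seen at least `min_copies` times."""
    parsed = (parse_read(r) for r in reads)
    counts = Counter(p for p in parsed if p is not None)
    return {key: n for key, n in counts.items() if n >= min_copies}
```

    The default threshold of 80 copies mirrors the cutoff stated in the text; a smaller value may be passed when trying the sketch on a handful of synthetic reads.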

    (40) TABLE S4. Re-synthesis and purification of the dominant aptamer ZAP-2012 and variants.

    Name                  SEQ ID NO.   Sequence
    ZAP-2012 (Z23-P30)    7            5′-Biotin-TCCCGAGTGACGCAGC-CCCCGGZGGGATTPATCGGT-GGACACGGTGGCTGAC-3′
    Z23-G30               8            5′-Biotin-TCCCGAGTGACGCAGC-CCCCGGZGGGATTGATCGGT-GGACACGGTGGCTGAC-3′
    Z23-A30               9            5′-Biotin-TCCCGAGTGACGCAGC-CCCCGGZGGGATTAATCGGT-GGACACGGTGGCTGAC-3′
    C23-P30               10           5′-Biotin-TCCCGAGTGACGCAGC-CCCCGGCGGGATTPATCGGT-GGACACGGTGGCTGAC-3′
    T23-P30               11           5′-Biotin-TCCCGAGTGACGCAGC-CCCCGGTGGGATTPATCGGT-GGACACGGTGGCTGAC-3′
    C23-G30               12           5′-Biotin-TCCCGAGTGACGCAGC-CCCCGGCGGGATTGATCGGT-GGACACGGTGGCTGAC-3′
    T23-A30               13           5′-Biotin-TCCCGAGTGACGCAGC-CCCCGGTGGGATTAATCGGT-GGACACGGTGGCTGAC-3′
    C23-A30               14           5′-Biotin-TCCCGAGTGACGCAGC-CCCCGGCGGGATTAATCGGT-GGACACGGTGGCTGAC-3′
    T23-G30               15           5′-Biotin-TCCCGAGTGACGCAGC-CCCCGGTGGGATTGATCGGT-GGACACGGTGGCTGAC-3′

    Screening of Potential Aptamer Candidates.

    (41) Analysis of the Ion Torrent sequencing identified several other candidate aptamers with different arrangements of dZ and dP as well as those containing only normal bases. Each sequence was chemically synthesized and labeled with biotin at the 5′ end. They were purified by HPLC (for IDT-derived oligos) or PAGE (for oligos prepared in house). These were quantified (UV 260/280) and diluted to standard concentrations.

    (42) Flow cytometry binding assays were then done using the target MDA-MB-231 cells. To obtain suspended cells for flow cytometry, culture medium was removed from the cells and non-enzymatic dissociation buffer was added to cover the surface of the entire flask. This was placed in an incubator at 37° C.

    (43) After incubation (5 min), the cells were aspirated using a transfer pipette to remove them from the flask. The cells were washed twice by centrifugation, and approximately 5.0×10.sup.5 cells were incubated separately with each of the aptamer candidates at a final concentration of 250 nM. After incubation, cells were washed. Streptavidin-PE-Cy5.5 conjugate (100 μL of 1:400 dilution, optimized) was then added, and the mixture was incubated at 4° C. for 10 min. Excess dye conjugate was removed by washing twice and the cell-DNA complexes were resuspended in 150 μL binding buffer. The aptamer binding signal was detected using flow cytometry (BD). Unselected library was used as a control to set the fluorescence background.

    (44) Determination of Binding Affinity.

    (45) The binding affinity of the aptamer ZAP-2012 was determined by flow cytometry using biotin-labeled aptamer, with the signal likewise detected with streptavidin-PE-Cy5.5 conjugate. MDA-MB-231 cells were dissociated using non-enzymatic dissociation buffer. Cells were washed and incubated with varying concentrations (0.1 nM-500 nM final concentration) of biotin-labeled aptamer in a 200 μL volume of binding buffer containing 10% FBS. After 20 min of incubation, cells were washed twice with washing buffer and then incubated with 100 μL of the conjugate dye (1:400 dilution). This was incubated for 10 min and then washed twice, each time with 1300 μL of washing buffer. The cell pellets were resuspended in 200 μL washing buffer and analyzed by flow cytometry. The biotin-labeled unselected library was used as a negative control to determine the background binding. All binding assays were done in duplicate. The mean fluorescence intensity of the unselected library was subtracted from that of the corresponding aptamer with the target cells to determine the specific binding of the labeled aptamer.

    (46) Results

    (47) GACTZP Aptamer Selection Scheme and AEGIS-SELEX Progression.

    (48) This AEGIS-SELEX began with the solid-phase synthesis of a GACTZP DNA library having two primer binding sequences (16 nts each) flanking a 20 nt random region (N.sub.1N.sub.2. . . N.sub.20). Each of the 20 randomized sites was synthesized to have all six (GACTZP) phosphoramidites in equal amounts. In subsequent analysis, the library was digested and the nucleotide fragments were quantitated by HPLC to show that Z and P were present in the random region. The ratio of all six nucleotides was T/G/A/C/Z/P≈1.5/1.2/1.0/1.0/1.0/0.5.

    (49) A sample (5 nmol) of the GACTZP library was then subjected to sequential binding and elution from the line of breast cancer cells, MDA-MB-231. The pool of DNA survivors was collected after each round of selection and amplified by six-letter GACTZP PCR with a mixture of nucleotide triphosphates (dGTP, dATP, dCTP, dTTP, dZTP, and dPTP, Table S2) using Hot Start Taq DNA polymerase (TaKaRa). The product was recovered by binding to solid-phase streptavidin, followed by elution with NaOH. The resulting single stranded DNA was subjected to the next round of selection. No negative selection was used.
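
    For orientation, the size of the sequence space of a 20-nucleotide six-letter random region can be compared with the number of molecules in a 5 nmol sample. The Python arithmetic below is a back-of-the-envelope sketch; it assumes, for simplicity, that every molecule in the sample is a full-length library member.

```python
AVOGADRO = 6.02214076e23  # molecules per mole

possible_sequences = 6 ** 20            # six letters at each of 20 random sites
molecules_in_sample = 5e-9 * AVOGADRO   # 5 nmol of library
ratio = molecules_in_sample / possible_sequences

print(f"{possible_sequences:.3e} possible sequences")
print(f"{molecules_in_sample:.3e} molecules in 5 nmol")
print(f"ratio = {ratio:.2f}")
```

    On these assumptions the sample holds somewhat fewer molecules than there are possible sequences, so the starting library cannot contain every six-letter sequence.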

    (50) To increase the selection pressure in later rounds of AEGIS-SELEX, the number of cells and incubation times were gradually reduced, and the volume of washing buffer and the number of washes were increased. Starting after nine rounds of selection, the progress of the AEGIS-SELEX experiment was monitored by flow cytometry to measure the binding of the total library to the target cells. The amount of surviving GACTZP total DNA bound to MDA-MB-231 increased from the 9.sup.th round to the 11.sup.th round, but not further after the 11.sup.th round. Therefore, the AEGIS-SELEX was stopped at the 12.sup.th round, and the survivors were prepared for deep sequencing.

    (51) Deep Sequencing GACTZP DNA Survivors Using Next Generation Sequencing Technology.

    (52) Deep sequencing was done following the “conversion” strategy previously reported. The enriched GACTZP DNA survivors recovered after 12 rounds of AEGIS-SELEX were divided into two equal portions. These were separately converted by barcoded copying into standard DNA using two conversion protocols (Table S3). In the first protocol, sites holding Z and P nucleotides in the GACTZP survivors were converted predominantly into sites holding C and G nucleotides, respectively; less than 15% were other nucleotides. Under the second conversion protocol, sites holding Z were converted to sites holding a mixture of C and T, with their ratio lying between 60:40 and 40:60, depending on the sequence surrounding that site. Sites holding P were converted to a mixture of G and A with roughly the same range of ratios, again depending on the sequence context surrounding that site.

    (53) Following conversion, two barcoded samples were combined and submitted for Ion Torrent “next generation” sequencing at the University of Florida DNA sequencing Core facility (ICBR/UF). Reads that did not contain exactly matched barcodes and/or forward and reverse priming sequences were discarded. To minimize miscalling, any sequence present as fewer than 80 copies in the whole library was removed from the analysis. The remaining reads (357,574 in total) were then clustered using software custom designed at the FfAME, which ignored differing barcodes during the clustering and accepted single-step changes within sequence reads. Clustered sequences were then separated by barcode, with variable sites being compared between each barcode. The clustered sequences obtained under the first conversion conditions (Z to C and P to G) serve as reference for the clustered sequences obtained under the second conversion conditions. Sites where C and T were found in approximately equal amounts after conversion under the second conditions were assigned as Z in their “parent”. Sites where G and A were found in approximately equal amounts after conversion under the second conditions were assigned as P in their “parent”.
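
    The assignment rule described above can be sketched in Python as follows. This is not the custom FfAME software. It is a minimal illustration that assumes aligned reads of equal length within one cluster, takes the consensus obtained under the first conversion conditions as the reference, and interprets “approximately equal amounts” as a fraction between 0.4 and 0.6, borrowing the 60:40 to 40:60 range stated for the second protocol. The function names and the example reads are hypothetical.

```python
from collections import Counter


def site_fractions(reads: list[str]) -> list[dict[str, float]]:
    """Per-position base fractions for aligned reads of equal length."""
    fractions = []
    for column in zip(*reads):
        counts = Counter(column)
        fractions.append({base: n / len(column) for base, n in counts.items()})
    return fractions


def infer_parent(reference: str, second_condition_reads: list[str],
                 low: float = 0.4, high: float = 0.6) -> str:
    """Assign Z or P where the second condition shows a balanced C/T or G/A mixture."""
    parent = []
    for ref_base, frac in zip(reference, site_fractions(second_condition_reads)):
        c, t = frac.get("C", 0.0), frac.get("T", 0.0)
        g, a = frac.get("G", 0.0), frac.get("A", 0.0)
        if ref_base == "C" and low <= c <= high and low <= t <= high:
            parent.append("Z")
        elif ref_base == "G" and low <= g <= high and low <= a <= high:
            parent.append("P")
        else:
            parent.append(ref_base)
    return "".join(parent)


# Synthetic reads for illustration only (not data from the Example).
reference = "CCCCGGCGGGATTGATCGGT"
second = ["CCCCGGCGGGATTGATCGGT", "CCCCGGTGGGATTAATCGGT",
          "CCCCGGCGGGATTAATCGGT", "CCCCGGTGGGATTGATCGGT"]
print(infer_parent(reference, second))
```

    Requiring the reference base to be C (or G) before calling Z (or P) reflects the use of the first-condition sequences as the reference for the second-condition sequences.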

    (54) The inferred ancestral sequences were aligned to identify dominant candidate aptamers. One dominant aptamer sequence was represented by 101,224 independent reads, and constituted approximately 30% of the survivors. This aptamer was named ZAP-2012 (Z And P, 20 nucleotide random region, 12 cycles of selection), and its sequence was inferred to be: 5′-TCCCGAGTGACGCAGC-CCCCGGZGGGATTPATCGGT-GGACACGGTGGCTGAC-3′. SEQ ID NO. 16

    (55) ZAP-2012 contains a single Z at position 23 and a single P at position 30 (Z23-P30). The second and third most abundant sequences constituted about 5% and 3% of the population, respectively.

    (56) Determine the Binding Affinity of the Dominant Aptamer ZAP-2012 and Variants.

    (57) The ZAP-2012 aptamer was then re-synthesized in a form carrying a 5′-biotin by solid phase synthesis (ABI 394 DNA Synthesizer) from standard phosphoramidites (Glen Research) and dZ and dP phosphoramidites (Firebird Biomolecular Sciences LLC). Analogous molecules lacking Z (C23-P30 and T23-P30), lacking P (Z23-G30 and Z23-A30), or lacking both Z and P (C23-G30, T23-G30, T23-A30, and C23-A30) were also synthesized with a 5′-biotin (Table S4).
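
    The naming of these analogs follows directly from the inferred ZAP-2012 sequence. As an illustration, the Python sketch below locates Z and P in the sequence given above and builds each analog by substitution; the function names are hypothetical, and the sketch relies on the sequence containing exactly one Z and one P.

```python
ZAP_2012 = ("TCCCGAGTGACGCAGC"          # forward primer binding site
            "CCCCGGZGGGATTPATCGGT"      # 20-nt region inferred from sequencing
            "GGACACGGTGGCTGAC")         # reverse primer binding site


def positions(seq: str, base: str) -> list[int]:
    """1-based positions of `base` in `seq`."""
    return [i + 1 for i, b in enumerate(seq) if b == base]


def make_variants(seq: str) -> dict[str, str]:
    """Replace the single Z by C or T and the single P by G or A, in all combinations."""
    z_pos = positions(seq, "Z")[0]
    p_pos = positions(seq, "P")[0]
    variants = {}
    for z_sub in "ZCT":
        for p_sub in "PGA":
            name = f"{z_sub}{z_pos}-{p_sub}{p_pos}"
            variants[name] = seq.replace("Z", z_sub).replace("P", p_sub)
    return variants


variants = make_variants(ZAP_2012)
for name, sequence in variants.items():
    print(name, sequence)
```

    The nine entries comprise ZAP-2012 itself (Z23-P30) and the eight analogs named in the text.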

    (58) These were each used at 250 nM in a flow cytometry assay (labeling with streptavidin-PE-Cy5.5) to test their binding to the MDA-MB-231 target cells. The original non-binding library was used as a negative control. The ZAP-2012 sequence bound strongly; its mutant forms gave either reduced binding (C23-P30, Z23-G30, and Z23-A30) or no binding at all (T23-P30, and all sequences lacking both Z and P). These studies illustrated that the Z and P nucleotides in ZAP-2012 significantly contribute to the binding affinity.

    (59) We also re-synthesized and tested the binding affinity of secondary survivors in the population, each contributing from 5.0% to 0.4% of the total population of survivors. Among these twelve sequences, three had no Z and no P; the remainder had either a single Z (and no P), a single P (and no Z), or one of each. Compared with the binding signal of ZAP-2012, all twelve secondary candidates gave negligible binding signals. To rule out mis-assignment of Z and P in these candidates, we replaced Z by either C or T, and replaced P by either G or A, to produce all possible fully standard sequence analogs, and thereby to exclude the possibility that these might be the “true” aptamers arising from the selection. Again, all these candidate sequences gave no binding signals at 250 nM. This shows that both Z and P are required for ZAP-2012 to bind.

    (60) Determination of Dissociation Constant of Aptamer ZAP-2012.

    (61) The dissociation constant (K.sub.diss) of the ZAP-2012 aptamer against breast cancer cells (MDA-MB-231) was then estimated using serial dilutions (0.1 nM-500 nM final concentration). The biotin-labeled unselected library was used as a negative control to assess background binding. All binding assays were done in triplicate. The mean fluorescence intensity of the unselected library was subtracted from that of the corresponding aptamer with the target cells to determine the specific binding of the labeled aptamer. The apparent K.sub.diss was obtained by fitting the intensity of binding versus the concentration of the aptamer to the equation Y=B.sub.max X/(K.sub.diss+X) (Y is the mean fluorescence intensity at aptamer concentration X, in nanomolar; B.sub.max is the maximal binding), using Sigma Plot (Jandel, San Rafael, Calif.). From these data, ZAP-2012 (Z23-P30) bound to the cells (MDA-MB-231) with an apparent K.sub.diss=30±1 nM. If the Z in Z23-P30 is replaced by C to give C23-P30, the K.sub.diss increases to 160±31 nM. If the P in Z23-P30 is replaced by G to give Z23-G30, the K.sub.diss increases to 442±130 nM. All other mutant forms lacking Z (T23-P30), lacking P (Z23-A30), or lacking both Z and P (C23-G30, T23-A30, C23-A30, and T23-G30) gave almost no binding (the K.sub.diss increases to >1 μM).
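
    For illustration, a fit to the one-site binding equation Y=B.sub.max X/(K.sub.diss+X) can be sketched in plain Python as shown below. This does not reproduce the Sigma Plot procedure used in the Example. It is a least-squares sketch in which B.sub.max is solved in closed form for each trial K.sub.diss, and K.sub.diss is found by a golden-section search on a logarithmic scale, which assumes a single minimum within the search range. The function names and the data points are synthetic and hypothetical; the data are generated without noise from K.sub.diss=30 nM and B.sub.max=1000 solely to show that the routine recovers known parameters.

```python
import math


def best_bmax(kd: float, xs: list[float], ys: list[float]) -> float:
    """Closed-form least-squares B_max for a fixed K_diss."""
    f = [x / (kd + x) for x in xs]
    return sum(y * fi for y, fi in zip(ys, f)) / sum(fi * fi for fi in f)


def sse(kd: float, xs: list[float], ys: list[float]) -> float:
    """Sum of squared residuals at the best B_max for this K_diss."""
    bmax = best_bmax(kd, xs, ys)
    return sum((y - bmax * x / (kd + x)) ** 2 for x, y in zip(xs, ys))


def fit_binding(xs, ys, kd_low=1e-3, kd_high=1e5, iterations=100):
    """Golden-section search over log10(K_diss); returns (K_diss, B_max)."""
    ratio = (math.sqrt(5) - 1) / 2
    lo, hi = math.log10(kd_low), math.log10(kd_high)
    for _ in range(iterations):
        a = hi - ratio * (hi - lo)
        b = lo + ratio * (hi - lo)
        if sse(10 ** a, xs, ys) < sse(10 ** b, xs, ys):
            hi = b
        else:
            lo = a
    kd = 10 ** ((lo + hi) / 2)
    return kd, best_bmax(kd, xs, ys)


# Synthetic, noise-free, background-subtracted intensities (illustration only).
concentrations_nm = [0.1, 1, 5, 10, 25, 50, 100, 250, 500]
intensities = [1000 * x / (30 + x) for x in concentrations_nm]

kd_fit, bmax_fit = fit_binding(concentrations_nm, intensities)
print(f"K_diss = {kd_fit:.2f} nM, B_max = {bmax_fit:.1f}")
```

    With real measurements, the intensities passed to the fit would be the background-subtracted mean fluorescence values described above, and replicate scatter would determine the uncertainty of the estimate.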