PROFILING RNA-SMALL MOLECULE BINDING SITES WITH OLIGONUCLEOTIDES
20220119868 · 2022-04-21
Inventors
Cpc classification
C12Q1/6811
CHEMISTRY; METALLURGY
International classification
Abstract
Many RNAs cause disease; however, RNA is rarely exploited as a small molecule drug target. Disclosed herein are methods for identifying privileged RNA motif-small molecule interactions to enable the rational design of compounds that modulate RNA biology starting from only sequence. A massive, library-versus-library screen was completed that probed over 50 million binding events between RNA motifs and small molecules. The resulting data provide a rich encyclopedia of small molecule-RNA recognition patterns, defining chemotypes and RNA motifs that confer selective, avid binding. The resulting interaction maps were mined against the entire viral genome of hepatitis C virus (HCV). A small molecule was identified that avidly bound RNA motifs present in the HCV 3′ untranslated region and inhibited viral replication while having no effect on host cells. Collectively, this investigation represents the first whole genome pattern recognition between small molecules and RNA folds.
Claims
1. A method comprising contacting a library of RNA sequences, a complementary antisense oligonucleotide, RNase H, and a small molecule candidate RNA-binding compound and determining cleavage of the RNA sequences in the presence of the compound (“presence cleavage”); and contacting the library of RNA sequences, the complementary antisense oligonucleotide, and RNase H in the absence of the small molecule candidate RNA-binding compound and determining cleavage of the RNA sequences in the absence of the compound (“absence cleavage”); wherein when cleavage is inhibited (e.g., presence cleavage is lower than absence cleavage), the small molecule candidate RNA-binding compound binds to the RNA sequence.
2. The method of claim 1 wherein the RNA sequence library comprises a transcriptome.
3. The method of claim 2 wherein the transcriptome is viral.
4. The method of claim 2 wherein the transcriptome is mammalian.
5. The method of claim 2 wherein the transcriptome is bacterial.
6. The method of claim 1 wherein the RNA sequence library comprises one or more of synthetic, semi-synthetic, or natural RNA.
7. The method of claim 1 wherein the RNA sequence library comprises the genome of an RNA virus.
8. The method of claim 1 carried out in vitro.
9. The method of claim 1 carried out in living cells.
10. The method of claim 9 wherein the cells are virally- or bacterially-infected cells.
11. The method of claim 1 wherein a set of complementary antisense oligonucleotides and a set of small molecule candidate RNA-binding compounds are assayed in a 2-dimensional parallel array.
Description
BRIEF DESCRIPTION OF THE FIGURES
[0011]
[0012]
[0013]
[0014]
[0015]
[0016]
[0017]
[0018]
DETAILED DESCRIPTION
[0019] Beyond identifying small molecule RNA binders, one major challenge in the discovery of small molecules directed at RNA is their “drug-likeness”. Aminoglycosides, the most commonly studied small molecules that target RNA, are highly charged, polar compounds and considered very non-drug-like; ironically, they are important drugs used clinically. Herein, a method, termed 2DCS, is used to probe a vast landscape of heterocyclic drug-like small molecule-RNA interactions to identify new chemotypes in small molecules that confer avid binding to RNA and elucidate their RNA motif binding preferences (
Small Molecule Libraries. Over 30,000 compounds from The Scripps Research Institute (TSRI) and the National Cancer Institute (NCI) small molecule libraries were inspected to identify members that contain an amine for site-specific conjugation onto aldehyde-functionalized microarrays. To reduce the number of compounds to a manageable number for screening, three small molecules known to bind toxic RNAs in cellulis and improve disease-associated defects (D6, 1a, and H1.sup.18-21;
[0020] The average Tanimoto scores of selected compounds were 0.28±0.05, 0.37±0.08, and 0.37±0.13 as compared to D6, 1a, and H1, respectively. The two refinements afforded 1,987 compounds that were commercially available. Notably, both the library of 1,987 compounds and the >30,000 from which they were selected are chemically diverse, as determined by using a Tanimoto analysis.sup.23. Further, the starting library contains both N- and O-containing heterocycles and functional groups (25% of the compounds contain an oxygen as part of a heterocycle or alcohol; 55% of the compounds have at least one oxygen as an aldehyde, ketone, ester, or amide).
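As an illustration of how a Tanimoto score compares two compounds, the following minimal sketch computes the coefficient for binary fingerprints represented as sets of on-bit indices. This bit-set form is an assumption made for illustration; the scores reported here were computed with commercial software, and the fingerprints below are hypothetical.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient between two fingerprints given as sets of on-bit indices."""
    if not fp_a and not fp_b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(fp_a & fp_b)
    return shared / (len(fp_a) + len(fp_b) - shared)


# Hypothetical on-bit indices; real fingerprints would come from a cheminformatics toolkit.
lead = {1, 4, 7, 9, 12, 15}
candidate = {1, 4, 8, 12, 20}

score = tanimoto(lead, candidate)
print(f"Tanimoto = {score:.3f}")
```

A score near 1 indicates nearly identical fingerprints, while low scores such as those reported above indicate that the selected compounds are structurally distinct from the lead molecules.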
[0021] Further, the 1,987 small molecules screened were verified to have drug-like properties by comparing them to the compounds in DrugBank, a publicly available repository containing the properties of FDA-approved therapeutics.sup.24-26. The compounds were scored for lipophilicity.sup.27-30 using Log P and Log D values.sup.31-33. Both values report partition coefficients for the ratio of unionized species in n-octanol to unionized species in water; Log D utilizes an additional algorithm to account for ionized species. The distribution pattern of the Log P and Log D values correlates well between the chemical library and DrugBank compounds, with most compounds having values between 1 and 4. Additionally, the small molecules studied herein and FDA-approved drugs followed similar distribution trends for diversity, molecular weight, lipophilicity, and rotatable bonds (
Identification of RNA-binding Small Molecules by 2DCS. To test the compounds for their ability to bind RNA, they were conjugated to a microarray surface that displays aldehydes and incubated with radioactively labeled RNA motif libraries. The secondary structures displayed by the RNA libraries (
[0022] To remove compounds that non-selectively bind RNA motifs, the 239 small molecules with affinity for RNA were probed for binding to the RNA libraries in the presence of 1,000-fold excess bulk tRNA, affording 91 unique compounds (4.6%; Table 3). A final screen was then completed in which the 91 array-immobilized compounds were incubated with all five RNA motif libraries separately in the presence of oligonucleotide competitors F-J, d(AT).sub.11 and d(GC).sub.11. Oligonucleotides F-J mimic regions common to all library members, restricting binding to the randomized regions. (Note: oligonucleotide J was not used for hairpin selections.) After rigorous washing to remove unbound RNAs, bound RNAs were harvested and identified by RNA-seq. By completing selections under conditions of high oligonucleotide stringency (by use of excess competitor oligonucleotides), these studies identified small molecules that bound RNA motifs avidly. A challenge in the small molecule RNA-targeting area has been the development and identification of selective interactions between small molecules and RNAs. In myriad studies, 2DCS has defined selective RNA motif-small molecule interactions with varied affinities.sup.11, 12, 35, 36.
Identification of Privileged RNA Space by RNA-Seq. Using RNA-seq, a large sequencing dataset was obtained for each pool of RNAs that were specifically bound to each small molecule. RNA libraries A, B, C, D, and E had at least 12.1-fold (average: 133.1±111.8), 6.7-fold (average: 80.8±78.7), 9.9-fold (average: 103.5±110.6), 7.2-fold (average: 75.7±65.6), and 6.0-fold (average: 36.3±31.6) coverage for each small molecule selection, respectively, as compared to the number of unique sequences within the corresponding RNA library. It was previously shown that at least 6-fold coverage is required to generate binding landscape maps.sup.12. RNA-seq data were then analyzed by High Throughput Structure-Activity Relationships Through Sequencing (HiT-StARTS).sup.12 to identify the privileged RNA motifs that bind each small molecule. Briefly, the frequency of occurrence of a selected RNA was compared to the frequency of occurrence of the same RNA from RNA-seq analysis of the starting RNA library to account for biases arising during transcription and RT-PCR. This pooled population comparison affords the parameter Z.sub.obs, a metric of statistical confidence. A large, positive Z.sub.obs indicates a strong preference for binding the motif while a negative Z.sub.obs indicates a strong preference against binding the motif. A Fitness Score is assigned by normalizing each Z.sub.obs value to that of the most statistically significant small molecule-RNA interaction, which is set to 100. The RNA binding landscape of a series of substituted benzimidazoles that do not bind DNA was previously studied. From these studies, it was determined that a Z.sub.obs>8 defined selective interactions; that is, high affinity binding was observed to selected RNA motifs with Z.sub.obs>8.sup.12.
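The Fitness Score normalization can be sketched as follows. The Z.sub.obs values are those listed for compound 7 in Table 1, and the cutoff of 8 is the selectivity threshold stated above; the function and variable names are hypothetical and chosen only for illustration.

```python
SELECTIVE_Z_CUTOFF = 8.0  # Z_obs threshold for selective interactions, as stated in the text


def fitness_scores(z_obs_by_motif):
    """Scale each Z_obs so that the largest value in the selection maps to 100."""
    z_max = max(z_obs_by_motif.values())
    if z_max <= 0:
        raise ValueError("normalization needs at least one positive Z_obs")
    return {motif: 100.0 * z / z_max for motif, z in z_obs_by_motif.items()}


# Z_obs values for compound 7 taken from Table 1 (RNA library C selection).
compound_7_z = {
    "5'UUA3'/3'A_U5'": 25.2,
    "5'GAG3'/3'C_C5'": 18.3,
    "5'UGU3'/3'G_A5'": 11.8,
    "5'AAC3'/3'U_G5'": 8.2,
}

scores = fitness_scores(compound_7_z)
for motif, z in compound_7_z.items():
    selective = z > SELECTIVE_Z_CUTOFF
    print(f"{motif}: Z_obs={z}, fitness={round(scores[motif])}, selective={selective}")
```

Rounded to whole numbers, the scores for these four motifs agree with the Fitness Score column of Table 1 (100, 73, 47, and 33).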
Structure-Activity Relationships (SAR). Next, the most privileged RNA motif binders and the most discriminated against RNA motifs (non-binders) for each compound selection were analyzed by generating LOGOS from the highest and lowest 0.5% of Z.sub.obs scores.sup.37. By comparing the LOGOS for related compounds, or DiffLogos, SAR can be defined. Indeed, various hit compounds differ by a single functional group, including compounds A and B (
##STR00001##
[0023] Four compounds that share an indolylpyrimidine-2,4-diamine core bound members of A (5-nucleotide hairpin library). LOGOS analysis shows both similarities and differences, driven by substitution of the diamine (
##STR00002##
[0024] Of the 26,624 possible sequences in the five RNA libraries, 1,215 sequences (4.6%) are unique for a single compound; that is, the sequence only appears in the highest 0.5% of Z.sub.obs values for one compound. The most selective RNA motifs were evaluated by searching for their sequences in the highest 0.5% and lowest 0.5% of Z.sub.obs values. A selective RNA motif will be enriched for only a single or a few compounds and discriminated against by many compounds. The most selective motifs for libraries B-E are (
[0025] The most promiscuous binding sequence for RNA library A is 5′GGUGU3′ (n=33 out of 38 total compounds) (
[0026] There are also 6-nucleotide hairpins (derived from RNA library B) that accommodate binding to many small molecules (
[0027] Interestingly, the most promiscuous sequences from RNA library C are all predicted to fold into single nucleotide U bulges (
[0028] As observed for the other RNA libraries, there are internal loops derived from D that bind a wide range of small molecules (
[0029] The loops derived from E that bind the largest number of compounds are 5′GCUU(U)3′/3′CGC(A)5′ (n=12 out of 30 compounds; where invariant nucleotides from the cassette are in parentheses), 5′GGUC3′/3′CCU5′ (n=12), and 5′GUGG(U)3′/3′CGC_(A)5′ (n=12) (
[0030] Affinity of Selected RNA Motif-Small Molecule Interactions. The affinity of exemplar compounds that have inherent fluorescence was measured. As shown in Table 1, compounds 7 and 8 (
##STR00003## ##STR00004##
Scaffold and Chemoinformatics Analyses of Hit Compounds. To aid in the design of small molecules that bind RNA and to assess rigorously their drug-likeness, scaffold and chemoinformatics analyses of hit compounds were carried out. The 91 compounds that bind the RNA libraries vary in similarity to the starting lead small molecules: H1, 0.30±0.13; range: 0.09-0.75; 1a, 0.26±0.06; range: 0.12-0.48; and D6, 0.21±0.04; range: 0.11-0.30. Comparison of the hit compounds to each other affords Tanimoto scores ranging from 0.09 to 0.97, suggesting that chemical diversity and common scaffolds are found amongst the small molecules.
[0031] To gain general insight into the chemotypes that confer avidity for RNA, sub-features in the 91 hit compounds were identified via the method of Clark and Labute.sup.13 (Table 2). To quantify the significance of the sub-structures, a pooled population comparison was carried out on how frequently the sub-structure appears in the hit compounds and how frequently the same sub-structure appears in the 1,987 compounds in the small molecule library. This statistical analysis calculates Z.sub.obs, and hence p-values, for each sub-structure, akin to the analysis of statistically significant privileged motifs. This analysis afforded 11 privileged chemotypes, with S3 being the most statistically significant sub-structure for conferring avidity for RNA (Table 2).
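The step from a Z.sub.obs value to a p-value can be sketched with the standard normal distribution. The text does not state whether one- or two-tailed p-values were used, so the two-tailed form below is an assumption, and the function name is hypothetical.

```python
import math


def two_sided_p_value(z):
    """Two-tailed p-value for a standard normal test statistic z."""
    # For a standard normal variable Z, P(|Z| >= |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2.0))


# Illustrative Z values only; 8.0 corresponds to the selectivity cutoff used for RNA motifs.
for z in (1.96, 3.0, 8.0):
    print(f"Z = {z}: p = {two_sided_p_value(z):.3g}")
```

A larger Z.sub.obs for a sub-structure therefore corresponds to a smaller p-value, that is, stronger evidence that the sub-structure is enriched among the hit compounds relative to the starting library.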
[0032] Interestingly, the chemotypes identified here are present in previously discovered RNA small molecule binders (Table 2). For example, S6 is found in two molecules used to target trinucleotide repeat expansions that cause myotonic dystrophy type 1 and type 2.sup.38, 39. S3 is found in small molecules that were developed to target the trans-activation response element (TAR RNA) of human immunodeficiency virus 1 (HIV-1).sup.40 while S13 is found in a compound that binds the aminoacyl-tRNA acceptor site (A-site) of Escherichia coli ribosome.sup.41. S5 was previously identified as a privileged chemotype for binding RNA.sup.11.
Drug-Like Properties Analysis of Hit Compounds. Compared to FDA-approved drugs, the hit compounds have similar distribution of Log P values, Log D values, molecular weights and number of rotatable bonds (
[0033] Lipinski parameters.sup.42 were computed for previously published small molecules that were studied by 2DCS including eight benzimidazoles and five 2-aminobenzimidazoles.sup.11, 35, 36. This small set of compounds also has c Log P values (range: 2-3), c Log D values (range: 0-3), and a number of H-bond donors (3) similar to those of the hit compounds (
Biological activity of exemplar compounds. Inforna has informed the design of compounds that target various disease-associated RNAs, including expanded repeating RNAs that cause microsatellite disorders and oncogenic microRNAs. Therefore, the Inforna approach was expanded to other types of RNAs, in particular viral RNAs. Interestingly, related compounds of the phenylimidazoline class were predicted to bind bulges present in the hepatitis C virus (HCV) 3′ untranslated region (UTR) (
[0034] The importance of long range RNA-RNA interactions required for the replication of HCV suggests that stable RNA structures of defined composition exist, and the disclosed methods have identified molecules that bind to some of these structures. The compounds bind their cognate loops in the 3′ UTR with mid-nanomolar to low micromolar affinities (Table 1). It was next determined if the identified compounds could inhibit HCV RNA replication in HCV subgenomic replicon cells, in particular SGR-Neo-RLuc-JFH1-2A replicon cells established in Huh-7.5 (human hepatoma cells). In this system, an autonomously replicating viral RNA replicon is used to efficiently monitor RNA replication of the virus in the absence of spread of viral RNA to adjacent cells. This is achieved by replacing the structural proteins required to assemble progeny virions with a drug selectable marker (Neo) and a Renilla luciferase (RLuc) reporter gene. Once this replicon has been delivered to human hepatoma cells and selection with G418 (resistance to which is conferred by the Neo selectable marker) has occurred, Renilla luciferase expression and activity are directly correlated with the level of HCV replication.sup.57.
[0035] Inforna identified three lead compounds for HCV RNA that bind motifs in SL I and SL II, compounds 7, 8, and 10, with varying affinities (Table 1;
[0036] Interestingly, 7, 8, and 10 have potencies similar to 2′-C-methyladenosine (IC.sub.50=4.5±0.1 μM), which inhibits HCV transcription via competitive inhibition of the HCV RNA polymerase nonstructural protein 5B (NS5B).sup.17. Here, an inhibitor was designed with similar potency that binds the viral RNA rather than a viral enzyme.
Investigating Compound Mode of Action. Many of the compounds identified from the disclosed screen that bind RNA selectively share chemotypes with kinase inhibitors. To further support RNA binding as a mode of action for inhibition of HCV replication, an antisense oligonucleotide-based approach was developed to study compound binding sites in cells (
[0037] Compound 8 binds the desired site in HCV RNA in vitro and in cells. To validate that the lead compounds bind to the desired site in the HCV RNA genome, an approach to profile binding sites by using an antisense oligonucleotide was developed (
[0038] Application of an antisense agent (Table 4) and RNase H cleaves the RNA at the desired site. Addition of compound 8 to these reactions limited the ability of the antisense to mediate cleavage of the target (
[0039] Given these favorable results in vitro, the competitive cleaving approach was used to map cellular ligand binding sites. Indeed, compound 8 inhibited cleavage induced by an antisense oligonucleotide that overlaps with the 8-binding site, as determined by RT-qPCR and supported by readout of viral replication using luciferase (synergistic effect of compound and ASO treatment) (
[0104] All patents and publications referred to herein are incorporated by reference herein to the same extent as if each individual publication was specifically and individually indicated to be incorporated by reference in its entirety.
TABLE-US-00001 TABLE 1 Affinities of selected RNA motif-small molecule interactions derived from selections with RNA library C and small molecule affinities for HCV RNA Motifs.

RNA motif-small molecule interactions derived from selections with RNA library C
RNA Motif | Small Molecule | Z.sub.obs (Rank) | Fitness Score | Affinity (nM)
5′UUA3′/3′A_U5′ | 7 | 25.2 (1) | 100 | 380 ± 26
5′GAG3′/3′C_C5′ | 7 | 18.3 (20) | 73 | 410 ± 26
5′UGU3′/3′G_A5′ | 7 | 11.8 (49) | 47 | 630 ± 36
5′AAC3′/3′U_G5′ | 7 | 8.2 (94) | 33 | 420 ± 41
5′UCCUU3′/3′AUUA5′ (non-binder) | 7 | 1.1 (364) | N/A | 2080 ± 300
5′UCCAU3′/3′AUCA (non-binder) | 7 | −2.8 (436) | N/A | 7200 ± 418
Fully paired RNA | 7 | N/A | N/A | >7000
5′UUA3′/3′A_U5′ | 8 | 13.6 (1) | 100 | 690 ± 47
5′GAU3′/3′C_A5′ | 8 | 11.8 (7) | 87 | 600 ± 61
5′GCU3′/3′C_A5′ | 8 | 10.8 (14) | 79 | 450 ± 47
5′UUA3′/3′A_U5′ | 8 | 9.5 (24) | 70 | 530 ± 45
5′UCUAU3′/3′ACCA5′ | 8 | 6.1 (396) | 45 | 5580 ± 440
Fully paired RNA | 8 | N/A | N/A | >5000

HCV RNA Motifs
RNA Motif | Small Molecule | Affinity (nM)
5′GUG3′/3′C_C5′ (SL I and SL III) | 7 | 470 ± 32
5′GUG3′/3′C_C5′ (SL I and SL III) | 8 | 1400 ± 160
5′GUG3′/3′C_C5′ (SL I and SL III) | 9 | N.B.
5′GUG3′/3′C_C5′ (SL I and SL III) | 10 | 3350 ± 770
5′CUG3′/3′G_C5′ (SL II) | 7 | 820 ± 82
5′CUG3′/3′G_C5′ (SL II) | 8 | 1400 ± 211
5′CUG3′/3′G_C5′ (SL II) | 9 | N.B.
5′CUG3′/3′G_C5′ (SL II) | 10 | 7100 ± 4200
5′UAG3′/3′A_C5′ (SL I) | 7 | 460 ± 70
5′UAG3′/3′A_C5′ (SL I) | 8 | 849 ± 150
5′UAG3′/3′A_C5′ (SL I) | 9 | 1300 ± 500
5′UAG3′/3′A_C5′ (SL I) | 10 | 460 ± 140

N.B. = no binding observed
TABLE-US-00002 TABLE 2 Chemotypes that confer avidity for RNA.
Scaffold | Number of hit compounds / starting library with chemotype | Z.sub.obs | p-value | % cpds in DrugBank with chemotype
TABLE-US-00003 TABLE 3
TABLE-US-00004 TABLE 4 List of primers used for RT-qPCR and associated experiments
Primer Name | Sequence 5′ - 3′ | SEQ ID NO
FGFR1 | GCACATCCAGTGGCTAAAGCAC | 10
FGFR1 (reverse) | AGCACCTCATCTCTTTGTCGG | 11
JAK3 | AGTGACCCTCACTTCCTGT | 12
JAK3 (reverse) | GGCTGAACCAAGGATGATGTGG | 13
PAK1 | GTGAAGGCTGTGTCTGAGACTC | 14
PAK1 (reverse) | GGAAGTGGTTCAATCACAGACCG | 15
IRAK4 | ATGCCACCTGACTCCTCAAGT | 16
IRAK4 (reverse) | CCACCAACAGAAATGGGTCGTTC | 17
P38α (MAPK) | GAGCGTTACCAAACCTGTCTC | 18
P38α (reverse) | AGTAACCGCAGTTCTCTGTAGGT | 19
AMPKA1 | AGGAAGAATCCTGTGACAAGCAC | 20
AMPKA1 (reverse) | CCGATCTCTGTGGAGTAGCAGT | 21
AMPKB1 | CTCCAGGTCATCCTGAACAAGG | 22
AMPKB1 (reverse) | ACAGCGCGTATAGGTGGTTCAG | 23
AMPKG1 | AGAGGTTCTGCCCGTCTTCCTT | 24
AMPKG1 (reverse) | TTCGTCCTCGAACTCCAGCTT | 25
CAMK4 | GTTCTTCTTCGCCTCTCACATCC | 26
CAMK4 (reverse) | CTGTGACGAGTTCTAGGACCAG | 27
CHK1 | GTGTCAGAGTCTCCCAGTGGAT | 28
CHK1 (reverse) | GTTCTGGCTGAGAACTGGAGTAC | 29
AKT | TGGACTACCTGCACTCGGAGAA | 30
AKT (reverse) | GTGCCGCAAAAGGTCTTCATGG | 31
ROCK1 | GAAACAGTGTTCCATGCTAGACG | 32
ROCK1 (reverse) | GCCGCTTATTTGATTCCTGCTCC | 33
β-Actin | TGTGATGGTGGGAATGGGTCAGAA | 34
β-Actin (reverse) | TGTGGTGCCAGATCTTCTCCATGT | 35
HCV-3′-UTR-SL1, II, III | TGGTGGCTCCATCTTAGCCC | 36
HCV-3′-UTR-SL1, II, III (reverse) | CCTGCAGGTCGACTCTAG | 37
HCV-3′UTR Ctrl2 | CATCGTGGTGTCACGC | 38
HCV-3′UTR Ctrl2 (reverse) | ATCATGTAACTCGCCTT | 39
SL I ASO | mA*mC*mG*G*C*A*C*T*C*T*C*T*G*mC*mA*mG | 40
Control ASO | mG*mG*mG*A*A*C*C*G*G*A*G*C*T*mG*mA* | 41
(reverse) = Reverse primer; m = 2′-O-methyl modified deoxynucleotide; * = Phosphorothioate backbone. All oligonucleotides were used without further purification.
Examples
Methods:
[0105] Oligonucleotides. All DNA oligonucleotides were purchased from Integrated DNA Technologies, Inc. (IDT) and used without further purification. RNA oligonucleotide competitors were purchased from Dharmacon and de-protected according to the manufacturer's standard procedure. All oligonucleotide solutions were prepared with NANOpure water. The RNA motif libraries were synthesized by in vitro transcription from the corresponding DNA template that was custom-mixed at the randomized positions to ensure equivalent representation of all four nucleotides.
Compounds. All compounds were obtained from the National Cancer Institute (NCI), The Scripps Research Institute (TSRI), or the National Institutes of Health (NIH).
Calculation of Lipinski's Parameters and Chemoinformatic Analysis. JChem for Excel software was used to calculate c Log P, c Log D, molecular weight, H-bond donors, H-bond acceptors, rotatable bonds and shape Tanimoto values (JChem for Excel 5.11.4.886, 2012, ChemAxon (http://www.Chemaxon.com)). Chemoinformatic analyses of chemical substructures were generated by JChem (JChem 5.8.0, 2012, ChemAxon, http://www.chemaxon.com) and by NCGC Automatic R-group analysis program (Tripod Development; http://tripod.nih.gov/?p=46).sup.13.
Construction of Small Molecule Microarrays. Microarrays were constructed as previously described.sup.11.
RNA Selection. RNA libraries were radioactively labeled at the 5′-end as previously described using 1 μL of [γ-.sup.32P] ATP (3000 Ci/mmol; PerkinElmer).sup.10.
Reverse Transcription-Polymerase Chain Reaction (RT-PCR) Amplification. Bound RNAs were excised from agarose microarrays and incubated in 20 μL of 1×RQ DNase I Buffer containing 2 units RQ1 RNase-free DNase (Promega) at 37° C. for 2 h. The sample was supplemented with 2 μL of 10×DNase Stop Solution (Promega) and incubated at 65° C. for 10 min to inactivate the DNase. This solution was used for reverse transcription-polymerase chain reaction (RT-PCR) amplification as previously described.sup.14. Aliquots of the RT-PCR reactions were checked every five cycles starting at cycle 25 on a 15% polyacrylamide gel stained with ethidium bromide or SYBR Gold (Life Technologies).
Reverse Transcription and PCR Amplification to Install Barcodes for RNA-seq. Samples were prepared for RNA-seq analysis as previously described.sup.12. After purification using a native 8% or 12.5% polyacrylamide gel, the desired product was re-amplified by using manufacturer provided primer sets and the following cycling conditions: 94° C. for 30 s, 60° C. for 30 s and 72° C. for 30 s.
RNA-seq. RNA-seq was completed at Next Generation Sequencing Core at The Scripps Research Institute (La Jolla, Calif.) or the Scripps Florida Genomics Core.
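Identification of Privileged RNA Motifs. Privileged RNA motifs were identified by High Throughput Structure-Activity Relationships Through Sequencing (HiT-StARTS).sup.12. HiT-StARTS uses a pooled population comparison (Z.sub.obs) to account for biases in transcription and RT-PCR. Z.sub.obs is calculated for each sequence from the pooled proportion φ per equations 1 and 2.sup.15:
\[ \phi = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2} \tag{1} \]
\[ Z_{obs} = \frac{p_1 - p_2}{\sqrt{\phi \,(1 - \phi)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} \tag{2} \]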
where n.sub.1 is the size of Population 1 (number of sequencing reads for a given RNA from RNA-seq analysis of the selected library), n.sub.2 is the size of Population 2 (number of sequencing reads for the same RNA from RNA-seq analysis of the starting library), p.sub.1 is the observed proportion of Population 1 (the number of reads for a given RNA from selected library/total number of reads), and p.sub.2 is the observed proportion for Population 2 (the number of reads for the same RNA in the starting library/total number of reads).
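A pooled population comparison of this kind can be sketched as follows, assuming the standard pooled two-proportion form in which a pooled proportion is computed from both populations and used to scale the difference p.sub.1 − p.sub.2. The example treats n.sub.1 and n.sub.2 as the total read counts of the selected and starting libraries, which is an assumption made for illustration; the read counts themselves are hypothetical.

```python
import math


def z_obs(p1, n1, p2, n2):
    """Pooled two-proportion Z statistic comparing population 1 with population 2."""
    phi = (n1 * p1 + n2 * p2) / (n1 + n2)  # pooled proportion
    std_err = math.sqrt(phi * (1.0 - phi) * (1.0 / n1 + 1.0 / n2))
    if std_err == 0.0:
        raise ValueError("pooled proportion of 0 or 1 gives an undefined Z")
    return (p1 - p2) / std_err


# Hypothetical counts: one RNA motif seen 600 times in 100,000 reads of the
# selected library and 100 times in 100,000 reads of the starting library.
selected_reads, selected_total = 600, 100_000
starting_reads, starting_total = 100, 100_000

z = z_obs(selected_reads / selected_total, selected_total,
          starting_reads / starting_total, starting_total)
print(f"Z_obs = {z:.2f}")
```

A motif that is depleted after selection gives a negative value from the same function, which corresponds to the discriminated-against sequences described above.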
Generation of Logos and DiffLogos: Z.sub.obs corresponding to the highest (enriched) 0.5% and lowest (discriminated against) 0.5% scores were analyzed to generate consensus sequences, or Logos. The resulting lists of sequences were converted to position weight matrix (PWM) lists for each compound. “Enriched” and “discriminated” sequences were kept separate using JMP®, Version 13.2.1 (SAS Institute Inc., Cary, N.C., 1989-2007). The R package DiffLogo.sup.16, which is part of Bioconductor, was utilized to create the sequence logos from PWM lists for each compound and visually compare the differences between them.
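The conversion of a list of equal-length sequences into a position weight matrix can be sketched as below. This is a minimal frequency-based PWM written for illustration and is not the workflow used with the software named above; apart from 5′GGUGU3′, which appears in the description of RNA library A, the input sequences are hypothetical.

```python
from collections import Counter


def position_weight_matrix(sequences, alphabet="ACGU"):
    """Return one dict of nucleotide frequencies per position of the aligned sequences."""
    if not sequences:
        raise ValueError("at least one sequence is required")
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("all sequences must have the same length")
    pwm = []
    for position in range(length):
        counts = Counter(seq[position] for seq in sequences)
        pwm.append({nt: counts[nt] / len(sequences) for nt in alphabet})
    return pwm


# Hypothetical set of enriched 5-nucleotide hairpin loop sequences.
enriched = ["GGUGU", "GGUGA", "GCUGU", "GGAGU"]

pwm = position_weight_matrix(enriched)
for position, column in enumerate(pwm, start=1):
    print(position, column)
```

Building one such matrix from the enriched sequences and another from the discriminated-against sequences keeps the two populations separate, so that logos for related compounds can then be compared position by position.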
Binding Affinity Measurements. Dissociation constants were determined using an in solution, fluorescence-based assay. Briefly, 100 nM of RNA labeled with 5′ fluorescein was folded in 1× Assay Buffer (20 mM HEPES, pH 7.5, 150 mM NaCl, 5 mM KCl, and 40 μg/mL BSA) by heating at 60° C. for 5 min and slowly cooling to room temperature, after which MgCl.sub.2 was added to a final concentration of 1 mM. Serial dilutions (1:2) of the small molecule were then completed in 1× Assay Buffer supplemented with 1 mM MgCl.sub.2. The solutions were incubated for 30 min at room temperature, transferred to a 96-well plate, and fluorescence intensity was measured on a BioTek-FLx800 plate reader. The change in fluorescence intensity as a function of small molecule concentration was fit to a one site binding model per equation 3:
\[ Fl = \frac{B_{max}\, X^{H}}{K_d^{H} + X^{H}} \tag{3} \]
where Fl is the fluorescence intensity, B.sub.max is the maximum specific binding, X is the concentration of the small molecule, H is the Hill slope, and K.sub.d is the dissociation constant.
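The behavior of a one site binding model with a Hill slope can be illustrated with the short sketch below, which assumes the common form Fl = B.sub.max·X.sup.H/(K.sub.d.sup.H + X.sup.H) built from the parameters defined above. The K.sub.d of 380 nM is the value reported for compound 7 in Table 1; B.sub.max, the Hill slope, and the top concentration of the dilution series are hypothetical.

```python
def one_site_binding(x, b_max, k_d, hill=1.0):
    """Signal predicted by a one site binding model with Hill slope `hill`."""
    return b_max * x**hill / (k_d**hill + x**hill)


K_D_NM = 380.0        # compound 7 with 5'UUA3'/3'A_U5' (Table 1)
B_MAX = 1.0           # hypothetical: signal normalized to a maximum of 1
TOP_CONC_NM = 10_000  # hypothetical top concentration of the dilution series

# 1:2 serial dilution, as described for the small molecule titration.
concentrations = [TOP_CONC_NM / 2**i for i in range(10)]
for conc in concentrations:
    signal = one_site_binding(conc, B_MAX, K_D_NM)
    print(f"{conc:10.1f} nM -> {signal:.3f}")
```

In this model the signal reaches half of B.sub.max when the small molecule concentration equals K.sub.d, regardless of the Hill slope, which is why the dilution series should bracket the expected K.sub.d.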
Effect of compounds on HCV replication. Hepatitis C virus stable sub-genomic replicon cells [SGR-Rluc-Neo-(NS3-5B)-JFH1-2a with luciferase reporter] were plated in 24-well plates (5×10.sup.4 cells/well). Approximately 24 h post cell plating, the compound of interest was added. DMSO (vehicle) or 2′-C-methyladenosine triphosphate.sup.12 (10 mM concentration) were included as controls. After 72 h, cells were washed twice with 1×PBS followed by addition of 400 μL of 1× luciferase lysis buffer. A 20 μL aliquot of cell lysate was used to measure luciferase activity per the manufacturer's instructions (Promega).
Cytotoxicity assays. Huh 7.5 cells (same cells used for the development of HCV stable replicon cells) were plated in 24-well plates (5×10.sup.4 cells/well). After 24 h, the compound of interest was added and incubated with the cells for 72 h. Cell viability was measured using CellTiter-Glo (Promega) or WST-1 per the manufacturer's instructions.
Oligonucleotide profiling of target engagement in vitro. WT-SLI and BP-SLI RNAs were 5′-end labeled as previously described. The RNA of interest (100 nM) was folded in 1× RNase H Reaction Buffer (New England BioLabs) by heating at 60° C. for 5 min and slowly cooling to room temperature. The RNA was incubated with compound (0.1, 1, 5 and 10 μM) for 15 min at room temperature. Then, the corresponding antisense oligonucleotide (Table 4) was added to a final concentration of 1 μM and incubated for another 15 min before addition of RNase H to a final concentration of 0.05 U/μL. The samples were incubated at 37° C. for 30 min and the resulting fragments were separated on a denaturing 15% polyacrylamide gel.
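One way to quantify the outcome of such a competitive cleavage experiment is to compare the fraction of RNA cleaved in the presence and in the absence of compound, in line with the "presence cleavage" and "absence cleavage" terms used in the claims. The sketch below is an illustration only: the function names and the band intensities are hypothetical, and no particular quantification procedure is specified in the text.

```python
def fraction_cleaved(full_length, cleaved):
    """Fraction of RNA cleaved, from full-length and cleaved-fragment band intensities."""
    total = full_length + cleaved
    if total <= 0:
        raise ValueError("band intensities must sum to a positive value")
    return cleaved / total


def percent_protection(absence_cleavage, presence_cleavage):
    """Percent reduction of RNase H cleavage caused by the compound."""
    if absence_cleavage <= 0:
        raise ValueError("no cleavage without compound; protection is undefined")
    return 100.0 * (1.0 - presence_cleavage / absence_cleavage)


# Hypothetical gel band intensities (arbitrary units).
absence = fraction_cleaved(full_length=200.0, cleaved=800.0)   # ASO + RNase H only
presence = fraction_cleaved(full_length=600.0, cleaved=400.0)  # ASO + RNase H + compound

protection = percent_protection(absence, presence)
print(f"absence cleavage = {absence:.2f}, presence cleavage = {presence:.2f}")
print(f"protection = {protection:.1f}%")
```

A positive protection value means that presence cleavage is lower than absence cleavage, which is the readout taken to indicate that the compound occupies the site targeted by the antisense oligonucleotide.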
Oligonucleotide profiling of target engagement by RT-qPCR. Hepatitis C virus stable sub-genomic replicon cells were plated in 100 mm dishes and grown to ˜80% confluency. The cells were batch transfected with antisense oligonucleotides (50 nM) targeting the SLI site or a control site using Lipofectamine 2000 (Life Technologies) per the manufacturer's protocol. After removing the transfection cocktail, the cells were allowed to recover in growth medium for 4 h before seeding into 24-well plates (300,000 cells/well). Cells were allowed to adhere for 2 h before treatment with 8 for 24 h. Total RNA was then extracted using Zymo Quick-RNA mini-prep kit per the manufacturer's protocol. RT was carried out with qScript reverse transcriptase (Quantabio). qPCR was completed using Power SYBR PCR Master Mix and an Applied Biosystems 7900 HT cycler with the following cycling conditions: 95° C., 30 s; 55° C., 30 s; 72° C., 30 s. Primers can be found in Table 4.
Inhibition of kinases. Compounds 7-10 and Staurosporine, a pan kinase inhibitor, were tested for in vitro inhibition of a panel of 22 kinases (20 μM, 24 and 72 h). As expected, Staurosporine is a global kinase inhibitor, on average inhibiting 84±3% of kinase activity. Compounds 7 and 8 each affected seven kinases modestly (average percent inhibition of 16±3% and 14±3%, respectively), with CAMK4 the most significantly inhibited. No kinases were significantly inhibited by 9 or 10. To determine if anti-HCV activity might be traced to inhibition of kinases, we measured the mRNA expression levels of the seven affected kinases in the Huh-7.5 replicon cell line by RT-qPCR. Interestingly, CAMK4, which is the most significantly inhibited by 7 and 8 in vitro, is not expressed in Huh-7.5 cells, making it an unlikely cellular target of our lead compounds.
To gain insight into whether the kinases inhibited by 7 and 8 in vitro might give rise to anti-HCV activity, the activity of known kinase inhibitors was studied in the replicon assay (20 μM; Figure S7C). Inhibitors of JAK3, CAMK, p38α (MAPK), and ROCK1 have no activity in the assay while inhibitors of AKT and CHK1 inhibit ˜90% and ˜50% of HCV replication, respectively. Notably, CHK1 is inhibited by 8 but not 7 in vitro. AKT is a serine/threonine kinase that controls cell proliferation. Thus, an argument could be made that the anti-HCV activity of 7 and 8 in the replicon assay could be traced to inhibition of AKT and hence to reduced proliferation and decreased replication. However, neither 7 nor 8 significantly affected cell viability. In addition, 96 known kinase inhibitors (200 nM) were profiled for anti-HCV activity. After 24 h (incubation time for our lead compounds), 12 compounds showed >20% inhibition of HCV replication. Collectively, these results suggest that the modes of action for 7 and 8 are not due to general inhibition of cellular kinases.
[0106] Thus, several approaches including kinase profiling, structure-activity relationship for RNA binding, and competitive oligonucleotide profiling collectively support that the compounds modulate HCV by directly targeting the viral RNA.
Summary & Outlook. Herein, diverse, drug-like small molecules were identified with preferences for particular RNA motifs. It was previously shown that a fundamental understanding of the motifs preferred by small molecules can inform design of selective modulators of RNA (dys)function. Indeed, this “bottom-up” approach has afforded chemical probes and preclinical modalities against expanded repeating RNAs that cause neurological and neuromuscular disease.sup.59, 61-64 and oncogenic microRNAs.sup.8, 65. These studies have further established an encyclopedia of binding landscapes, in particular for drug-like small molecules, that will aid the emerging field of RNA drug design and discovery.
[0107] Further, an inhibitor was designed with similar potency that binds a viral RNA, rather than an enzyme. Indeed, these studies are the first example of using whole genome-based rational design to deliver small molecules that target a virus. As viral populations and their threats to human health ebb and flow with each season, the ability to rapidly identify antivirals by using sequence-based design could enable a paradigm shift in how viruses are targeted including drug-resistant populations. Small molecules that target RNA motifs conserved across related viruses might have broad spectrum anti-viral activity, i.e., activity against multiple viruses. Such approaches should also be applicable to DNA viruses by targeting viral RNA intermediates/transcripts.