Enhanced expression of RNA vectors

Abstract

The present invention relates to methods and compositions for enhancing expression from RNA expression vectors. The invention is based upon the observation that reducing the frequency of the dinucleotides CpG and UpA has a significant effect on expression from such vectors. Aspects of the invention include, amongst others, synthetic RNA vectors, virions, cells, methods of producing vaccines and methods of treatment or immunisation.

Claims

1. A method of producing a synthetic RNA expression vector, the method comprising: modifying at least one region of a primary nucleotide sequence which reduces the frequency of at least one of CpG and UpA dinucleotides in said at least one region, thereby producing a modified primary nucleotide sequence; and producing a synthetic RNA expression vector comprising said modified primary nucleotide sequence which has a reduced frequency of at least one of CpG and UpA dinucleotides compared to a synthetic RNA expression vector which comprises an unmodified primary nucleotide sequence.

2. The method of claim 1 which comprises a step of preparing a DNA polynucleotide which encodes a synthetic polynucleotide having a reduced CpG and/or UpA frequency.

3. The method of claim 2 which comprises a step of transcribing said DNA polynucleotide to form a synthetic RNA polynucleotide having a reduced CpG and/or UpA frequency.

4. The method of claim 1, wherein the synthetic RNA expression vector is a recombinant RNA viral vector.

5. The method of claim 4, wherein the synthetic RNA expression vector is a recombinant virus genome, optionally wherein the synthetic RNA expression vector is a recombinant single stranded RNA (ssRNA) virus genome.

6. The method of claim 5, wherein the at least one region of said primary nucleotide sequence which is modified to reduce the frequency of at least one of CpG and UpA dinucleotides is the entire recombinant virus genome.

7. The method of claim 1, wherein the frequency of both CpG and UpA dinucleotides is reduced in the synthetic RNA expression vector comprising said modified sequence, as compared to the synthetic RNA expression vector which comprises an unmodified primary nucleotide sequence.

8. The method of claim 1, wherein the at least one region of said primary nucleotide sequence which is modified to reduce the frequency of at least one of CpG and UpA dinucleotides is of at least 30 nucleotides in length, optionally the at least one region is of at least 100 nucleotides in length, optionally the at least one region is of at least 200 nucleotides in length, optionally the at least one region is of at least 500 nucleotides in length, optionally the at least one region is of at least 1000 nucleotides in length.

9. The method of claim 1, wherein the synthetic RNA expression vector comprising said modified primary nucleotide sequence has a reduced frequency of at least one of CpG and UpA dinucleotides compared to the synthetic RNA expression vector which comprises an unmodified primary nucleotide sequence, and exhibits increased open reading frame (ORF) expression as compared to the synthetic RNA expression vector which comprises an unmodified primary nucleotide sequence.

10. The method of claim 1, wherein the frequency of at least one of CpG and UpA dinucleotides in the at least one region of said primary nucleotide sequence which is modified is reduced by at least 50%, optionally by at least 60%, optionally by at least 70%, optionally by at least 80%, optionally by at least 90%, optionally by at least 95%, optionally by 100%, as compared to the unmodified primary nucleotide sequence.

11. The method of claim 1, wherein the frequency of CpG dinucleotides and the frequency of UpA dinucleotides in the at least one region of said primary nucleotide sequence which is modified is reduced by at least 50%, optionally by at least 60%, optionally by at least 70%, optionally by at least 80%, optionally by at least 90%, optionally by at least 95%, optionally by 100%, as compared to the unmodified primary nucleotide sequence.

12. The method of claim 1, wherein the frequency of CpG dinucleotides and/or the frequency of UpA dinucleotides in the at least one region of said primary nucleotide sequence which is modified is reduced via introduction of synonymous substitutions into coding regions of said primary nucleotide sequence, as compared to the unmodified primary nucleotide sequence.

13. The method of claim 1, wherein the frequency ratio of the at least one of CpG and UpA dinucleotides is 0.4 or lower, optionally 0.3 or lower, optionally 0.2 or lower, optionally 0.1 or lower in the synthetic RNA expression vector as a whole.

14. The method of claim 1, wherein the frequency of CpG dinucleotides and/or the frequency of UpA dinucleotides is reduced in open reading frames (ORFs) and/or coding regions of the synthetic RNA expression vector, as compared to ORFs and/or coding regions of the synthetic RNA expression vector which comprises an unmodified primary nucleotide sequence.

15. The method of claim 1, wherein regions totaling at least 50% of the synthetic RNA expression vector are modified, optionally wherein regions totaling at least 60% of the synthetic RNA expression vector are modified, optionally wherein regions totaling at least 70% of the synthetic RNA expression vector are modified.

16. The method of claim 1, wherein the at least one region of said primary nucleotide sequence which is modified to reduce the frequency of at least one of CpG and UpA dinucleotides is a viral open reading frame (ORF), optionally a viral ORF derived from a viral genome.

17. The method of claim 1, wherein the synthetic RNA expression vector comprises a RNA virus adapted for expression in an expression system for the production of a virus vaccine, optionally wherein the virus vaccine expresses heterologous pathogen antigens.

18. The method of claim 1, wherein the synthetic RNA expression vector is present in a viral replicon.

19. The method of claim 1, wherein the synthetic RNA expression vector is present in a viral replicon, optionally wherein replication of the viral replicon in a mammalian cell is enhanced relative to a viral replicon containing the synthetic RNA expression vector which comprises an unmodified primary nucleotide sequence.

20. A method of producing a synthetic recombinant ssRNA virus genome, the method comprising: modifying at least one region of a primary ssRNA virus genome to reduce the frequency of CpG and UpA dinucleotides in said at least one region of the primary ssRNA virus genome, thereby producing one or more modified regions of the primary ssRNA virus genome; and producing a synthetic recombinant ssRNA virus genome comprising said one or more modified regions which have a reduced frequency of CpG and UpA dinucleotides compared to a synthetic ssRNA virus genome which comprises an unmodified primary ssRNA virus genome.

Description

BRIEF DESCRIPTION OF THE DRAWINGS

(1) FIG. 1. RNA to infectivity ratios of WT and viruses with modified CpG/UpA frequencies. WT and mutant viruses were recovered from RD cells and titred by TCID50. The number of viral genome copies was determined through qRT-PCR and compared with the infectivity titre. Results are the mean and standard error from three separate extractions.

(2) FIG. 2A-C. Replication kinetics of WT and modified viruses infected at a low MOI. RD cells were infected with E7 WT, permuted, CpG/UpA-high mutant (A) and CpG/UpA-low mutant (B) virus at an MOI of 0.01. The inoculum was removed and cells washed after 1 hour. The infectious titre of cell supernatants was then analysed at a range of time points by TCID50. Results are the mean of three biological replicates. The mean titre and standard error of all the viruses are shown at 24 hours post infection (C).

(3) FIG. 3A-B. Plaque morphology of E7 WT and modified viruses. RD cell monolayers in 10 cm plates were infected with a similar infectious titre of virus and incubated for 96 hours at 37° C. (A). Plaque area was determined using ImageJ software (B). Results are the mean of an equal number of plaques selected randomly from one plate per virus. Asterisks show a significant difference from the WT value as determined by t test (*p<0.05, **p<0.01).

(4) FIG. 4. Synchronised infection with equal viral genome copies. Cells were synchronously infected with 1000 genome copies of WT, R1/R2 CpG-high or R1/R2 UpA-high virus, as calculated using qRT-PCR. Cells were trypsinised and washed 1 or 4 hours post infection and the intracellular viral load determined by qRT-PCR. Results are the mean and standard error of three biological replicates.

(5) FIG. 5. Analysis of luciferase expression driven by E7 replicons with reduced CpG/UpA frequencies. Replicons were generated with reduced CpG/UpA frequencies, based on the backbone pRiboE7luc replicon, in which the structural genes of E7 are replaced by an insect luciferase gene. In the pRiboE7(CpG/UpA low)luc replicon the luciferase gene itself was modified to minimise both CpG and UpA frequency; in the pRiboE7(CpG/UpA low)luc R2 CpG-low and pRiboE7(CpG/UpA low)luc R2 UpA-low replicons Region 2 was additionally modified to further reduce either CpG or UpA frequency. RNA was generated from replicons and 50 ng transfected into RD cells.

(6) Luminescence was measured relative to the mock-transfected control. Results are the mean and standard error of three biological replicates.

(7) FIG. 6. Fitness determination by competition assays between WT and modified viruses. Cells were infected with an equal MOI of WT and modified virus, and the supernatant serially passaged through cells. RNA was isolated and the composition of each virus determined through selective restriction digest (enzymes used are given in Table 3). Images show the virus composition in the starting inoculum and in three biological replicates following passage.

(8) FIG. 7A-B. Pairwise fitness comparison between CpG-low and UpA-low viruses. Cells were infected with an equal MOI of two viruses and the supernatant serially passaged. The composition of each virus was determined through selective digest, and is displayed by differential shading (A). The more rapidly the virus on the left out-competed the virus shown above, the darker the shading. A fitness ranking was then determined (B).

(9) FIG. 8. Genome organisation of E7 and positions of mutated insert regions. Insert positions are compared to the genome diagram and to a plot of sequence variability within species B at synonymous sites (dotted line) and folding energies indicative of RNA secondary structure (solid line). Variability at synonymous sites (left y-axis) was computed at each codon position in alignments, plotted with a window size of 41 codons. MFED values (right y-axis) for sense and antisense RNA sequences were calculated for 200 base fragments, incrementing by 48 bases; values plotted represent mean values of 5 consecutive fragments. Nucleotide positions were calculated relative to the pT7:E7 clone sequence.

(10) FIG. 9. Effect of CpG and UpA frequency changes on the replication of TMEV in mouse RAW cells. Removal of CpG and UpA dinucleotides in non-structural gene region of the genome led to enhancement of replication as determined by quantitative PCR of TMEV RNA sequences at 24 hours post-infection (y-axis scale). Conversely, addition of CpG and UpA dinucleotides in this genome region suppressed replication. The degree of replication enhancement and attenuation was comparable to that observed in mutants of echovirus 7 with similar extents of sequence replacement (single genome regions; Atkinson et al. 2014, Nucleic acids research, gku075).

(11) FIG. 10. Effect of CpG and UpA reduction in the luciferase gene on gene expression and replication of the HCV replicon. Removal of CpGs and UpAs enhanced luciferase expression immediately post-transfection and accelerated replication relative to the unmodified Con-1 replicon for at least 96 hours. Pol- is the non-replicating control RNA (mutated GDD→GND motif in RNA polymerase).

(12) FIG. 11. Quantitation of capsid protein synthesis by Western blot of protein extracted from cells and cell free supernatant at 18 hours after infection with wild type (WT) or CpG/UpA-low mutants of E7 (moderate cytopathic effect). Viral proteins were detected using a VP1-specific monoclonal antibody (DAKO, Clone 5-D8/1) and levels compared by densitometry (values shown above panels, standardised to wild-type levels).

DETAILED DESCRIPTION

Materials and Methods

Cell Culture and Cell Lines

(13) E7 was propagated in rhabdomyosarcoma (RD) cells using Dulbecco's modified Eagle medium (DMEM) with 10% foetal calf serum (FCS), penicillin (100 U/ml) and streptomycin (100 μg/ml). All cells were maintained at 37° C. with 5% CO2.

(14) In Silico Design of CpG and UpA Modified Viruses.

(15) Two regions of the full length E7 cDNA clone pT7:E7 were selected for mutagenesis. Each lay in a region of the genome bounded by unique restriction sites: SalI (genome position 1878) and HpaI (genome position 3119) for Region 1, and EcoRI (genome position 5403) and BglII (genome position 6462) for Region 2. To generate CpG-low mutants, all CpG dinucleotides were eliminated by replacing either the C or the G base with a randomly selected alternative base that preserved coding of the underlying sequence. A similar strategy was used to generate UpA-low mutants, with the restriction that UpAp(C or U) codons encoding tyrosine precluded elimination of all UpA dinucleotides. CpG-high and UpA-high sequences were generated by introducing as many CpG or UpA dinucleotides as possible while preserving coding. The sequence changes and base compositions of the resulting insert sequences are shown in Table 1.
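
The design step can be illustrated with a short sketch. It is not the procedure used for the constructs described here, which replaced bases at random subject to preserving coding. Instead it uses dynamic programming to choose a synonymous codon at each position so that the total number of CpG and UpA dinucleotides, including those spanning codon boundaries, is as small as possible. The function names, the use of the standard genetic code, and the assumption that the input is an in-frame RNA sequence (A, C, G, U only) made up of whole codons are all illustrative assumptions.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order U, C, A, G.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
SYNONYMS = {}
for codon, aa in CODON_TO_AA.items():
    SYNONYMS.setdefault(aa, []).append(codon)


def count_dinucs(seq, targets=("CG", "UA")):
    """Count occurrences of the target dinucleotides in seq."""
    return sum(seq[i:i + 2] in targets for i in range(len(seq) - 1))


def translate(seq):
    """Translate an in-frame RNA sequence into one-letter amino acids."""
    return "".join(CODON_TO_AA[seq[i:i + 3]] for i in range(0, len(seq), 3))


def minimise_dinucs(seq, targets=("CG", "UA")):
    """Return a synonymous recoding of seq with the fewest target dinucleotides."""
    if not seq or len(seq) % 3:
        raise ValueError("sequence must be a non-empty whole number of codons")
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    # best[c] = (cost, path) of the cheapest recoding so far that ends in codon c
    best = {c: (count_dinucs(c, targets), [c])
            for c in SYNONYMS[CODON_TO_AA[codons[0]]]}
    for original in codons[1:]:
        step = {}
        for cand in SYNONYMS[CODON_TO_AA[original]]:
            # prev[-1] + cand covers the junction plus both dinucleotides inside cand
            cost, path = min(
                ((prev_cost + count_dinucs(prev[-1] + cand, targets), prev_path)
                 for prev, (prev_cost, prev_path) in best.items()),
                key=lambda item: item[0],
            )
            step[cand] = (cost, path + [cand])
        best = step
    _, path = min(best.values(), key=lambda item: item[0])
    return "".join(path)
```

Passing `targets=("CG",)` or `targets=("UA",)` restricts the minimisation to a single dinucleotide. As noted in the text, tyrosine codons (UAU, UAC) always retain one UpA, so the UpA count cannot reach zero in a sequence that encodes tyrosine.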

(16) TABLE 1. Composition of region 1 and 2 insert sequences

                       G + C     CpG       CpG        CpG        UpA      UpA      UpA
Region  Sequence       content   freq(a)   change(b)  ratio(c)   freq     change   ratio
1       Native         47.6%     0.041     —          0.730      0.050    —        0.742
1       Permuted       47.6%     0.041     0          0.730      0.050    0        0.742
1       CpG-low        44.3%     0         −51        0          0.057    +8       0.741
1       UpA-low        50.6%     0.045     +5         0.703      0.015    −43      0.256
1       UpA/CpG-low    47.5%     0         −51        0          0.015    −43      0.227
1       CpG-high       56.5%     0.146     +129       1.828      0.042    −10      0.900
1       UpA-high       40.9%     0.032     −12        0.756      0.139    +109     1.593
2       Native         47.6%     0.018     —          0.320      0.047    —        0.695
2       Permuted       47.6%     0.018     0          0.320      0.047    0        0.695
2       CpG-low        44.3%     0         −18        0          0.047    —        0.695
2       UpA-low        50.6%     0.021     +3         0.331      0.014    −34      0.229
2       UpA/CpG-low    47.5%     0         −18        0          0.014    −34      0.215
2       CpG-high       56.5%     0.133     +116       1.667      0.037    −10      0.824
2       UpA-high       40.9%     0.015     −3         0.390      0.149    +103     1.633

(a) Frequency of dinucleotide in insert region.
(b) Change in the number of dinucleotides (CpG or UpA) between mutated and original WT sequence.
(c) Ratio of observed dinucleotide frequency to that expected based on mononucleotide composition, i.e. f(CpG) / (f(C) × f(G)).
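
The quantities tabulated in Table 1 can be computed along the following lines. This is a minimal sketch: the function name is hypothetical, and dividing the dinucleotide count by the number of dinucleotide positions (sequence length minus one) is an assumption about how the frequency was normalised.

```python
from collections import Counter


def dinucleotide_stats(seq, dinuc="CG"):
    """Return (count, frequency, observed/expected ratio) for one dinucleotide.

    The expected frequency is the product of the two mononucleotide
    frequencies, as in footnote (c): f(XpY) / (f(X) * f(Y)).
    """
    seq = seq.upper().replace("T", "U")
    dinuc = dinuc.upper().replace("T", "U")
    n = len(seq)
    if n < 2:
        raise ValueError("sequence must contain at least two bases")
    base_counts = Counter(seq)
    count = sum(seq[i:i + 2] == dinuc for i in range(n - 1))
    freq = count / (n - 1)  # assumed normalisation: per dinucleotide position
    expected = (base_counts[dinuc[0]] / n) * (base_counts[dinuc[1]] / n)
    ratio = freq / expected if expected else float("nan")
    return count, freq, ratio
```

For example, `dinucleotide_stats(region_sequence, "UA")` would give the UpA count, frequency and ratio for an insert sequence held in the hypothetical variable `region_sequence`.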

(17) The specific sequences of the wild type (WT), CpG-low, UpA-low, and CpG & UpA-low inserts for each of Regions 1 and 2 were as follows:

(18) TABLE-US-00002 Echovirus 7 WT Region 1 (SEQ ID NO 1) GUCGACUCCGUGGUGCCCGUCAACAAUAUCAAAGUCAACCUGCAAAGCAU GGAUGCGUAUCAUAUUGAGGUCAAUACCGGGAACCACCAGGGGGAAAAGA UUUUUGCGUUCCAAAUGCAGCCGGGGUUAGAGUCUGUUUUCAAGAGAACC CUUAUGGGGGAGAUUCUUAAUUAUUAUGCACACUGGUCAGGGAGCAUUAA GCUGACAUUCACAUUUUGUGGAUCGGCGAUGGCAACUGGAAAACUCUUGU UAGCGUAUUCACCACCAGGUGCUGAUGUGCCCGCGACCAGGAAACAGGCG AUGUUAGGCACACACAUGAUUUGGGAUAUCGGGCUUCAGUCGAGCUGUGU UUUGUGCAUCCCAUGGAUAAGUCAGACACACUACCGGUUAGUGCAACAAG AUGAAUACACGAGUGCAGGCAAUGUGACGUGUUGGUACCAAACAGGAAUA GUGGUGCCCCCUGGCACUCCAAAUAAGUGUGUAGUGCUUUGUUUUGCAUC AGCUUGUAAUGAUUUCUCAGUUCGAAUGCUUAGGGACACCCCUUUCAUCG GACAAACAGCACUGCUGCAAGGCGACACCGAAACGGCUAUUGACAAUGCA AUCGCCAGGGUAGCAGAUACGGUGGCGAGCGGUCCUAGUAAUUCGACCAG UAUCCCAGCACUCACAGCAGUUGAGACAGGUCACACGUCACAAGUCGAGC CCAGCGAUACAAUGCAGACUAGACAUGUCAAAAACUACCACUCGCGUUCU GAGUCAACCGUGGAAAACUUUCUAAGUCGCUCCGCUUGUGUGUACAUCGA AGAGUACUACACCAAGGACCAAGACAAUGUUAAUAGGUACAUGUCGUGGA CAAUAAAUGCCAGAAGAAUGGUGCAAUUGAGGAGAAAGUUUGAGCUGUUU ACAUACAUGAGAUUUGAUAUGGAAAUCACGUUUGUAAUCACAAGUAGACA ACUACCUGGGACUAGCAUAGCACAAGAUAUGCCGCCACUCACCCACCAGA UCAUGUACAUACCACCAGGUGGCCCGGUACCAAACAGCGUAACAGAUUUU GCGUGGCAGACAUCAACAAACCCCAGCAUUUUCUGGACAGAAGGAAACGC GCCACCUCGCAUGUCUAUUCCAUUCAUCAGUAUUGGCAAUGCAUAUAGCA ACUUCUAUGACGGGUGGUCACACUUUUCCCAAAACGGUGUGUACGGAUAC AACGCCCUGAACAACAUGGGCAAGCUGUACGCACGUCAUGUUAAC Echovirus 7 WT Region 2 (SEQ ID NO 2) GAAUUCGCCGUUGCUAUGAUGAAGAGAAACUCAAGUACAGUGAAGACUGA GUAUGGUGAGUUUACUAUGCUGGGCAUCUAUGACAAGUGGGCCGUUUUGC CACGCCAUGCUAAACCUGGACCAACCAUCCUGAUGAAUGACCAAGAGGUC GGCGUGUUAGACGCCAAGGAACUAGUGGACAAGGAUGGCACUAACCUGGA GCUGACACUACUCAAGUUAAACCGGAAUGAGAAGUUCAGAGACAUCAGAG GCUUCUUGGCUAAGGAGGAAGUGGAAGUCAACGAGGCUGUGCUGGCAAUA AACACUAGCAAGUUUCCUAACAUGUACAUUCCAGUAGGGCAGGUUACAGA UUACGGCUUCCUAAACCUGGGUGGUACACCCACCAAAAGAAUGCUUAUGU AUAACUUCCCCACAAGAGCAGGCCAGUGUGGCGGGGUACUCAUGUCCACU GGCAAAGUUUUGGGAAUCCAUGUUGGUGGAAAUGGCCAUCAAGGCUUCUC AGCAGCACUUCUCAAACACUACUUUAAUGAUGAACAAGGAGAGAUUGAGU UCAUUGAGAGUUCAAAGGAAGCAGGGUUCCCAAUCAUUAACGCACCCAGU 
AAAACCAAGCUGGAGCCAAGUGUCUUCCACCAAGUAUUUGAAGGCAACAA AGAGCCAGCAGUCCUCAGGAACAGUGACCCACGUCUCAAAGCUAAUUUCG AGGAGGCCAUCUUUUCCAAAUACAUUGGGAAUGUCAACACACACAUAGAU GAAUACAUGUUGGAGGCUGUUGACCAUUAUGCCGGACAAUUGGCCACCCU AGAUAUCAGCACUGAACCAAUGAAGUUGGAGGAUGCUGUGUACGGUACUG AAGGCCUUGAAGCUCUUGACUUAACAACAAGUGCAGGCUACCCCUAUGUC GCACUGGGUAUCAAGAAGAGAGACAUCCUCUCGAAGAAGACCAAGGACCU GACCAAGCUGAAAGAGUGCAUGGAUAAGUAUGGCCUGAAUCUACCAAUGG UGACAUACGUGAAAGAUGAACUCAGAUCU CpG-low Region 1 (SEQ ID NO 3) GUCGACUCAGUGGUGCCAGUCAACAAUAUCAAAGUCAACCUGCAAAGCAU GGAUGCUUAUCAUAUUGAGGUCAAUACAGGGAACCACCAGGGGGAAAAGA UUUCUGCUUUCCAAAUGCAGCCUGGGUUAGAGUCUGUUUUCAAGAGAACC CUUAUGGGGGAGAUUCUUAAUUAUUAUGCACACUGGUCAGGGAGCAUUAA GCUGACAUUCACAUUUUGUGGAUCUGCCAUGGCAACUGGAAAACUCUUGU UAGCUUAUUCACCACCAGGUGCUGAUGUGCCUGCAACCAGGAAACAGGCU AUGUUAGGCACACACAUGAUUUGGGAUAUAGGGCUUCAGUCCAGCUGUGU UUUGUGCAUCCCAUGGAUAAGUCAGACACACUACAGGUUAGUGCAACAAG AUGAAUACACAAGUGCAGGCAAUGUGACAUGUUGGUACCAAACAGGAAUA GUGGUGCCCCCUGGCACUCCAAAUAAGUGUGUAGUGCUUUGUUUUGCAUC AGCUUGUAAUGAUUUCUCAGUUAGGAUGCUUAGGGACACCCCUUUCAUAG GACAAACAGCACUGCUGCAAGGAGACACAGAAACAGCUAUUGACAAUGCA AUUGCCAGGGUAGCAGAUACUGUGGCAAGUGGUCCUAGUAAUUCAACCAG UAUCCCAGCACUCACAGCAGUUGAGACAGGUCACACCUCACAAGUGGAGC CCAGUGAUACAAUGCAGACUAGACAUGUCAAAAACUACCACUCUAGGUCU GAGUCAACUGUGGAAAACUUUCUAAGUAGGUCAGCUUGUGUGUACAUAGA AGAGUACUACACCAAGGACCAAGAC AAUGUUAAUAGGUACAUGUCCUGGACAAUAAAUGCCAGAAGAAUGGUGCA AUUGAGGAGAAAGUUUGAGCUGUUUACAUACAUGAGAUUUGAUAUGGAAA UCACCUUUGUAAUCACAAGUAGACAACUACCUGGGACUAGCAUAGCACAA GAUAUGCCACCACUCACCCACCAGAUCAUGUACAUACCACCAGGUGGCCC AGUACCAAACAGUGUAACAGAUUUUGCCUGGCAGACAUCAACAAACCCCA GCAUUUUCUGGACAGAAGGAAAUGCCCCACCUAGGAUGUCUAUUCCAUUC AUCAGUAUUGGCAAUGCAUAUAGCAACUUCUAUGAUGGGUGGUCACACUU UUCCCAAAAUGGUGUGUAUGGAUACAAUGCCCUGAACAACAUGGGCAAGC UGUAUGCAAGACAUGUUAAC CpG-low Region 2 (SEQ ID NO 4) GAAUUCGCUGUUGCUAUGAUGAAGAGAAACUCAAGUACAGUGAAGACUGA GUAUGGUGAGUUUACUAUGCUGGGCAUCUAUGACAAGUGGGCAGUUUUGC CAAGGCAUGCUAAACCUGGACCAACCAUCCUGAUGAAUGACCAAGAGGUU GGGGUGUUAGAUGCCAAGGAACUAGUGGACAAGGAUGGCACUAACCUGGA 
GCUGACACUACUCAAGUUAAACAGAAAUGAGAAGUUCAGAGACAUCAGAG GCUUCUUGGCUAAGGAGGAAGUGGAAGUCAAUGAGGCUGUGCUGGCAAUA AACACUAGCAAGUUUCCUAACAUGUACAUUCCAGUAGGGCAGGUUACAGA UUAUGGCUUCCUAAACCUGGGUGGUACACCCACCAAAAGAAUGCUUAUGU AUAACUUCCCCACAAGAGCAGGCCAGUGUGGAGGGGUACUCAUGUCCACU GGCAAAGUUUUGGGAAUCCAUGUUGGUGGAAAUGGCCAUCAAGGCUUCUC AGCAGCACUUCUCAAACACUACUUUAAUGAUGAACAAGGAGAGAUUGAGU UCAUUGAGAGUUCAAAGGAAGCAGGGUUCCCAAUCAUUAAUGCACCCAGU AAAACCAAGCUGGAGCCAAGUGUCUUCCACCAAGUAUUUGAAGGCAACAA AGAGCCAGCAGUCCUCAGGAACAGUGACCCAAGGCUCAAAGCUAAUUUUG AGGAGGCCAUCUUUUCCAAAUACAUUGGGAAUGUCAACACACACAUAGAU GAAUACAUGUUGGAGGCUGUUGACCAUUAUGCAGGACAAUUGGCCACCCU AGAUAUCAGCACUGAACCAAUGAAGUUGGAGGAUGCUGUGUAUGGUACUG AAGGCCUUGAAGCUCUUGACUUAACAACAAGUGCAGGCUACCCCUAUGUG GCACUGGGUAUCAAGAAGAGAGACAUCCUCUCAAAGAAGACCAAGGACCU GACCAAGCUGAAAGAGUGCAUGGAUAAGUAUGGCCUGAAUCUACCAAUGG UGACAUAUGUGAAAGAUGAACUCAGAUCU UpA-low Region 1 (SEQ ID NO 5) GUCGACUCCGUGGUGCCCGUCAACAACAUCAAAGUCAACCUGCAAAGCAU GGAUGCGUAUCACAUUGAGGUCAACACCGGGAACCACCAGGGGGAAAAGA UUUUUGCGUUCCAAAUGCAGCCGGGGUUGGAGUCUGUUUUCAAGAGAACC CUCAUGGGGGAGAUUCUCAAUUAUUAUGCACACUGGUCAGGGAGCAUCAA GCUGACAUUCACAUUUUGUGGAUCGGCGAUGGCAACUGGAAAACUCUUGU UGGCGUAUUCACCACCAGGUGCUGAUGUGCCCGCGACCAGGAAACAGGCG AUGUUGGGCACACACAUGAUUUGGGACAUCGGGCUUCAGUCGAGCUGUGU UUUGUGCAUCCCAUGGAUCAGUCAGACACACUACCGGUUGGUGCAACAAG AUGAAUACACGAGUGCAGGCAAUGUGACGUGUUGGUACCAAACAGGAAUU GUGGUGCCCCCUGGCACUCCAAACAAGUGUGUCGUGCUUUGUUUUGCAUC AGCUUGCAAUGAUUUCUCAGUUCGAAUGCUGAGGGACACCCCUUUCAUCG GACAAACAGCACUGCUGCAAGGCGACACCGAAACGGCGAUUGACAAUGCA AUCGCCAGGGUUGCAGACACGGUGGCGAGCGGUCCGAGCAAUUCGACCAG CAUCCCAGCACUCACAGCAGUUGAGACAGGUCACACGUCACAAGUCGAGC CCAGCGACACAAUGCAGACCAGACAUGUCAAAAACUACCACUCGCGUUCU GAGUCAACCGUGGAAAACUUUCUCAGUCGCUCCGCUUGUGUGUACAUCGA AGAGUACUACACCAAGGACCAAGACAAUGUCAACAGGUACAUGUCGUGGA CAAUCAAUGCCAGAAGAAUGGUGCAAUUGAGGAGAAAGUUUGAGCUGUUC ACAUACAUGAGAUUUGACAUGGAAAUCACGUUUGUCAUCACAAGCAGACA ACUUCCUGGGACGAGCAUCGCACAAGACAUGCCGCCACUCACCCACCAGA UCAUGUACAUCCCACCAGGUGGCCCGGUCCCAAACAGCGUCACAGAUUUU GCGUGGCAGACAUCAACAAACCCCAGCAUUUUCUGGACAGAAGGAAACGC 
GCCACCUCGCAUGUCCAUUCCAUUC AUCAGCAUUGGCAAUGCAUACAGCAACUUCUAUGACGGGUGGUCACACUU UUCCCAAAACGGUGUGUACGGAUACAACGCCCUGAACAACAUGGGCAAGC UGUACGCACGUCAUGUUAAC UpA-low Region 2 (SEQ ID NO 6) GAAUUCGCCGUUGCCAUGAUGAAGAGAAACUCAAGCACAGUGAAGACUGA GUAUGGUGAGUUCACGAUCCUGGGCAUCUAUGACAAGUGGGCCGUUUUGC CACGCCAUGCCAAACCUGGACCAACCAUCCUGAUGAAUGACCAAGAGGUC GGCGUGUUGGACGCCAAGGAACUGGUGGACAAGGAUGGCACAAACCUGGA GCUGACACUCCUCAAGUUGAACCGGAAUGAGAAGUUCAGAGACAUCAGAG GCUUCUUGGCGAAGGAGGAAGUGGAAGUCAACGAGGCUGUGCUGGCAAUC AACACCAGCAAGUUUCCAAACAUGUACAUUCCAGUUGGGCAGGUCACAGA UUACGGCUUCCUGAACCUGGGUGGGACACCCACCAAAAGAAUGCUCAUGU ACAACUUCCCCACAAGAGCAGGCCAGUGUGGCGGGGUGCUCAUGUCCACU GGCAAAGUUUUGGGAAUCCAUGUUGGUGGAAAUGGCCAUCAAGGCUUCUC AGCAGCACUUCUCAAACACUACUUCAAUGAUGAACAAGGAGAGAUUGAGU UCAUUGAGAGUUCAAAGGAAGCAGGGUUCCCAAUCAUCAACGCACCCAGC AAAACCAAGCUGGAGCCAAGUGUCUUCCACCAAGUGUUUGAAGGCAACAA AGAGCCAGCAGUCCUCAGGAACAGUGACCCACGUCUCAAAGCCAAUUUCG AGGAGGCCAUCUUUUCCAAAUACAUUGGGAAUGUCAACACACACAUCGAU GAAUACAUGUUGGAGGCUGUUGACCAUUAUGCCGGACAAUUGGCCACCCU UGACAUCAGCACUGAACCAAUGAAGUUGGAGGAUGCUGUGUACGGCACUG AAGGCCUUGAAGCUCUUGACUUGACAACAAGUGCAGGCUACCCCUAUGUC GCACUGGGGAUCAAGAAGAGAGACAUCUUCUCGAAGAAGACCAAGGACCU GACCAAGCUGAAAGAGUGCAUGGACAAGUAUGGCCUGAAUCUUCCAAUGG UGACAUACGUGAAAGAUGAACUCAGAUCU CpG & UpA-low Region 1 (SEQ ID NO 7) GUCGACUCAGUGGUGCCAGUCAACAACAUCAAAGUCAACCUGCAAAGCAU GGAUGCUUAUCACAUUGAGGUCAACACAGGGAACCACCAGGGGGAAAAGA UUUUUGCUUUCCAAAUGCAGCCUGGGUUGGAGUCUGUUUUCAAGAGAACC CUGAUGGGGGAGAUUCUGAAUUAUUAUGCACACUGGUCAGGGAGCAUCAA GCUGACAUUCACAUUUUGUGGAUCUGCCAUGGCAACUGGAAAACUCUUGU UGGCUUAUUCACCACCAGGUGCUGAUGUGCCUGCAACCAGGAAACAGGCC AUGUUGGGCACACACAUGAUUUGGGACAUUGGGCUUCAGUCCAGCUGUGU UUUGUGCAUCCCAUGGAUCAGUCAGACACACUACAGGUUGGUGCAACAAG AUGAAUACACAAGUGCAGGCAAUGUGACAUGUUGGUACCAAACAGGAAUU GUGGUGCCCCCUGGCACUCCAAACAAGUGUGUUGUGCUUUGUUUUGCAUC AGCUUGCAAUGAUUUCUCAGUCAGGAUGCUCAGGGACACCCCUUUCAUUG GACAAACAGCACUGCUGCAAGGAGACACAGAAACAGCCAUUGACAAUGCA AUUGCCAGGGUUGCAGACACUGUGGCAAGUGGUCCAAGCAAUUCAACCAG CAUCCCAGCACUCACAGCAGUUGAGACAGGUCACACCUCACAAGUGGAGC 
CCAGUGACACAAUGCAGACAAGACAUGUCAAAAACUACCACUCCAGGUCU GAGUCAACUGUGGAAAACUUUCUCAGCAGGUCAGCUUGUGUGUACAUUGA AGAGUACUACACCAAGGACCAAGACAAUGUCAACAGGUACAUGUCCUGGA CAAUCAAUGCCAGAAGAAUGGUGCAAUUGAGGAGAAAGUUUGAGCUGUUC ACAUACAUGAGAUUUGACAUGGAAAUCACCUUCGUGAUCACAAGCAGACA ACUCCCUGGGACAAGCAUUGCACAAGACAUGCCACCACUCACCCACCAGA UCAUGUACAUUCCACCAGGUGGCCCAGUGCCAAACAGUGUCACAGAUUUU GCCUGGCAGACAUCAACAAACCCCAGCAUUUUCUGGACAGAAGGAAAUGC CCCACCAAGGAUGUCCAUUCCAUUCAUCAGCAUUGGCAAUGCAUACAGCA ACUUCUAUGAUGGGUGGUCACACUUUUCCCAAAAUGGUGUGUAUGGAUAC AAUGCCCUGAACAACAUGGGCAAGCUGUAUGCAAGACAUGUUAAC CpG & UpA-low Region 2 (SEQ ID NO 8) GAAUUCGCUGUUGCCAUGAUGAAGAGAAACUCAAGCACAGUGAAGACUGA GUAUGGUGAGUUCACCAUGCUGGGCAUCUAUGACAAGUGGGCAGUUUUGC CAAGGCAUGCCAAACCUGGACCAACCAUCCUGAUGAAUGACCAAGAGGUU GGGGUGUUGGAUGCCAAGGAACUGGUGGACAAGGAUGGCACCAACCUGGA GCUGACACUUCUCAAGUUGAACAGAAAUGAGAAGUUCAGAGACAUCAGAG GCUUCUUGGCCAAGGAGGAAGUGGAAGUCAAUGAGGCUGUGCUGGCAAUC AACACCAGCAAGUUUCCCAACAUGUACAUUCCAGUGGGGCAGGUGACAGA UUAUGGCUUCCUGAACCUGGGUGGAACACCCACCAAAAGAAUGCUCAUGU ACAACUUCCCCACAAGAGCAGGCCAGUGUGGAGGGGUUCUCAUGUCCACU GGCAAAGUUUUGGGAAUCCAUGUUGGUGGAAAUGGCCAUCAAGGCUUCUC AGCAGCACUUCUCAAACACUACUUCAAUGAUGAACAAGGAGAGAUUGAGU UCAUUGAGAGUUCAAAGGAAGCAGGGUUCCCAAUCAUCAAUGCACCCAGC AAAACCAAGCUGGAGCCAAGUGUCUUCCACCAAGUGUUUGAAGGCAACAA AGAGCCAGCAGUCCUCAGGAACAGUGACCCAAGGCUCAAAGCCAAUUUUG AGGAGGCCAUCUUUUCCAAAUACAUUGGGAAUGUCAACACACACAUUGAU GAAUACAUGUUGGAGGCUGUUGACCAUUAUGCAGGACAAUUGGCCACCCU GGACAUCAGCACUGAACCAAUGAAGUUGGAGGAUGCUGUGUAUGGCACUG AAGGCCUUGAAGCUCUUGACUUGACAACAAGUGCAGGCUACCCCUAUGUG GCACUGGGGAUCAAGAAGAGAGACAUCCUCUCAAAGAAGACCAAGGACCU GACCAAGCUGAAAGAGUGCAUGGACAAGUAUGGCCUGAAUCUCCCAAUGG UGACAUAUGUGAAAGAUGAACUCAGAUCU RNA structure prediction and sequence variability.
RNA Structure Prediction and Sequence Variability.

(19) Prototype sequences of each species B serotype (http://www.picornaviridae.com/) were scanned for RNA secondary structure using the program Folding Energy Scan in the SSE package (Simmonds 2012, BMC research notes 5: 50-50), using 200 base fragments incrementing by 152 bases and 50 sequence order-randomised controls generated with the algorithm NDR, which preserves the dinucleotide frequencies of the native sequence (Simmonds et al. 2004, RNA-Publ. RNA Soc. 10: 1337-1351). Mean MFED values for each fragment were plotted against the mid-point of each fragment to localise areas of sequence-order dependent RNA secondary structure. MFEDs were also calculated in the same way for the reverse complement of each genome sequence. Synonymous sequence variability was determined by measurement of mean pairwise distances using the program Sequence Scan in the SSE package.
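
The fragment layout used for such a scan can be sketched as follows. This is not the SSE implementation. The function names are hypothetical, and the MFED formula (percentage difference between the native minimum folding energy and the mean of the randomised controls) is stated here as an assumption. The folding energies themselves would come from an external RNA folding program and are not computed here.

```python
def scan_windows(length, size=200, step=152):
    """Yield (start, end, midpoint) for each full-size fragment.

    Coordinates are 0-based with an exclusive end; a trailing partial
    fragment is skipped.
    """
    for start in range(0, length - size + 1, step):
        yield start, start + size, start + size / 2


def mfed_percent(native_mfe, control_mfes):
    """Assumed MFED definition: (native MFE / mean control MFE - 1) * 100."""
    mean_control = sum(control_mfes) / len(control_mfes)
    return (native_mfe / mean_control - 1) * 100
```

Under this definition a native fragment that folds more stably (more negative energy) than its order-randomised controls gives a positive MFED.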

(20) Clone Construction and Recovery of Mutant Viruses

(21) The full length E7 cDNA clone pT7:E7, under the control of a T7 promoter, was used for this study. Mutant E7 constructs with altered CpG/UpA content were generated by ordering custom DNA sequences (GeneArt, Life Technologies, Paisley, UK). Sequences were provided in standard antibiotic resistant cloning vectors and were cloned into pT7:E7. All clones were sequenced over the insert regions prior to further applications. To recover the mutant viruses with altered CpG/UpA content, assembled plasmids were linearised using NotI and RNA was transcribed using a MEGAscript T7 in vitro transcription kit (Ambion). 100 ng of RNA was transfected into RD cells using Lipofectamine 2000 (Invitrogen) according to the manufacturer's instructions. The resulting cell lysates were used to generate passage 1 stocks by re-infecting RD cells. Viral titres were determined by TCID50 titration in RD cells.

(22) Replication Phenotype

(23) RD cells were seeded at 5×10^5 cells per well in 6-well plates and subsequently infected with the WT or CpG/UpA mutants at an MOI of 0.01 per cell for 1 hour, before removing the inoculum and washing the cells. Samples were then withdrawn at given time points (12, 18, 24, 30, 42 hours post-infection) and the viral titre determined by TCID50. The assay was performed in triplicate per virus. For plaque assays, confluent RD cells in 100 mm dishes or 6-well plates were inoculated with virus in DMEM and incubated for 1 hour at 37° C. with occasional rocking. The inoculum was removed and replaced with overlay consisting of 2% Methocel MC (Sigma) in DMEM. Plates were incubated for 96 hours at 37° C., fixed with 3.5% formaldehyde and stained with 0.1% crystal violet. Plaque sizes were quantified using ImageJ software.

(24) Quantification of Viral RNA in Infected Cells

(25) The viral RNA load in infected RD cells was analysed using qRT-PCR. RNA was isolated from cells using the RNAspin Mini Kit (GE Healthcare) or from viral supernatant using the QIAamp Viral RNA Mini Kit (Qiagen). Reverse transcription was performed using M-MLV reverse transcriptase (Promega) and random primers. E7 cDNA was then quantified by qRT-PCR using primers annealing to the 5′ UTR region (Sense: TCCGGCCCCTGAATGCGGCTAA (SEQ ID NO 9), Antisense: CACCCAAAGTAGTCGGTTCCGC (SEQ ID NO 10)). Reactions were carried out using a Sensifast SYBR Mi-Rox Kit (Bioline) and a Rotorgene-Q cycler (Qiagen), and cycling conditions were as follows: 95° C. for 2 minutes, then 40 cycles of 95° C. for 5 seconds, 60° C. for 10 seconds and 72° C. for 20 seconds. A standard curve for E7 RNA, prepared from a quantified PCR product, was run in parallel, allowing quantification of viral copy number. The RNA to infectivity ratio was determined by extracting RNA from 5000 TCID50 units per virus and performing quantitative RT-PCR against a standard curve.
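
Quantification against a standard curve can be sketched as below. The sketch assumes the usual log-linear relationship between threshold cycle (Ct) and starting copy number. The function names are hypothetical, and the least-squares fit stands in for whatever the cycler software computes.

```python
import math


def fit_standard_curve(copies, cts):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cts) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cts))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate the starting copy number."""
    return 10 ** ((ct - intercept) / slope)


def rna_to_infectivity_ratio(genome_copies, tcid50_units):
    """Genome copies per infectious unit, e.g. per 5000 TCID50 units extracted."""
    return genome_copies / tcid50_units
```

A sample Ct is converted with `copies_from_ct`, and the result is divided by the number of TCID50 units from which the RNA was extracted to give the RNA to infectivity ratio.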

(26) Replicon Construction and Replication Kinetics

(27) To accurately quantify intracellular viral replication, the pRiboE7luc replicon plasmid was used. This contains a version of the E7 genome in which the structural genes (nucleotides 753 to 3118) are replaced with the 1704 bp firefly luciferase gene. In order to minimise frequencies of CpG and UpA dinucleotides within the luciferase gene, an alternative luciferase gene was designed using the same method as that described for Regions 1 and 2, and ordered as a custom DNA sequence. As before, the amino acid sequence remained unchanged. The custom luciferase gene also contained a CpG- and UpA-low 72 bp linker sequence at the 3′ end to allow cloning into the SanDI restriction site at nucleotide 3191 of the E7 genome. The sequence was cloned into pT7:E7 using the unique restriction sites KasI (genome position 781) and SanDI. To create replicons containing the additional Region 2 CpG or UpA low inserts, a 3235 bp section of the replicon directly 3′ of the luciferase gene was excised using SanDI and BglII restriction enzymes. This was then replaced with the equivalent sections of the previously described R1/R2 CpG low or R1/R2 UpA low constructs, containing the modified Region 2 inserts. Replicon plasmids were linearised using NotI and RNA was created in a T7 in vitro transcription reaction.

(28) Assays were performed by transfecting 50 ng of replicon RNA into RD cells seeded at 3×10^4 cells per well in 96-well plates. At given time points after transfection (1, 4, 6, 8, 12 hours), luciferase assays were carried out using the Luciferase Assay System (Promega), according to the manufacturer's instructions. Cells were lysed using the Passive Lysis Buffer and the cell lysate transferred to opaque 96-well plates for luminescence analysis using the Glomax Multi Detection System (Promega).
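
The luminescence readings are reported relative to the mock-transfected control as the mean and standard error of three biological replicates. A minimal sketch of that summary follows. It assumes "relative" means each reading divided by the mean mock reading, and the function name and input layout are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def relative_luminescence(sample_rlu, mock_rlu):
    """Return (mean, standard error) of sample readings relative to mock.

    Each sample reading is divided by the mean of the mock-transfected
    readings; the standard error is the sample standard deviation of those
    ratios divided by the square root of the number of replicates.
    """
    mock_mean = mean(mock_rlu)
    ratios = [value / mock_mean for value in sample_rlu]
    return mean(ratios), stdev(ratios) / sqrt(len(ratios))
```

The function would be called once per replicon and time point, with three replicate readings in each list.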

(29) Sequencing of Individual Virus Genomes

(30) Viral RNA was isolated from E7 WT, R1/R2 CpG-high, or R1/R2 UpA-high virus stocks generated in RD cells, and cDNA created. Nested primers were designed to amplify a ˜500 bp section of the modified Region 1 (nucleotides 1835-2363) and an unmodified region of E7 (nucleotides 3241-3723). Primer sequences are given in Table 2. The proofreading enzyme PfuTurbo DNA Polymerase (Agilent) was used to amplify the two sections from each cDNA. The products were purified, cloned into a TA vector (pGEM-T easy, Promega), and transformed into competent E. coli, generating a separate colony for each copy of the original viral cDNA. The 500 bp inserts were sequenced using M13 primers.

(31) TABLE 2. Nested primers used in sequencing individual viral genomes

                                                Nucleotide
Region      Virus type       Primer             position    Sequence                      SEQ ID NO
1           All              Outer, sense       1809        CCCAATTTGATGTAACACCACACATGG   11
1           All              Inner, sense       1835        GATATTCCAGGCGAAGTACACAACC     12
1           EV7 WT           Outer, antisense   2343        CAAAGCACTACACACTTATTTGGAG     13
1           R1/R2 CpG-high   Outer, antisense   2382        ATTCGAACGGAGAAATCGTTAC        14
1           R1/R2 UpA-high   Outer, antisense   2388        TCCCTTAGCATACGTACTGAGAAAT     15
1           EV7 WT           Inner, antisense   2313        GCACCACTATTCCTGTTTGGT         16
1           R1/R2 CpG-high   Inner, antisense   2348        AACAAAGCACGACGCACTTATT        17
1           R1/R2 UpA-high   Inner, antisense   2363        CATTACAAGCTGATCCAAAACATAG     18
Unmodified  All              Outer, sense       3210        TGAGCCCGTACATCAAATCA          19
Unmodified  All              Inner, sense       3241        TTTTAACCCCACGAACCTGA          20
Unmodified  All              Outer, antisense   3785        TTGCCGAGTTGTTCGACATA          21
Unmodified  All              Inner, antisense   3723        CAAGTCACGGATGTCTGCAA          22

Competition Assays

(32) Equal titres of wild type (WT) and mutant virus (MOI=0.01) were applied simultaneously to RD cells in 24-well plates. Following CPE, the supernatant was frozen, thawed, and applied to fresh RD cells. This was continued for 10 passages, and was carried out in triplicate for each assay. For the pairwise competition assay, RD cells were inoculated with paired combinations of 7 viruses, giving 21 combinations in total. Each pairwise assay was carried out in a single well and passaged through RD cells 10 times. RNA was isolated from the final supernatants, cDNA was generated and nested PCR was carried out to amplify either Region 1 or Region 2. The primers used were as follows:

(33) TABLE-US-00004
Region 1 sense (outer): (SEQ ID NO 23) CCCAATTTGATGTAACACCACACATGG
Region 1 sense (inner): (SEQ ID NO 24) GATATTCCAGGCGAAGTACACAACC
Region 1 antisense (outer): (SEQ ID NO 25) CCCATACTCGGATGTGCTTGGG
Region 1 antisense (inner): (SEQ ID NO 26) CACTCGGATTGTGCTTGACATCTG
Region 2 sense (outer): (SEQ ID NO 27) CAAGGAGCATACACAGGAATACC
Region 2 sense (inner): (SEQ ID NO 28) GGTACCTACTCTTAGGCAAGCA
Region 2 antisense (outer): (SEQ ID NO 29) GAATGTCTGCCTCATCGCCAACT
Region 2 antisense (inner): (SEQ ID NO 30) AAGCTGGACGCTTCAATGAGCCT

(34) The amplified fragment was then subjected to selective digest to determine the composition of each virus in the final supernatant. The restriction enzymes used for each competition assay are given in Table 3. Relative band intensity was measured using ImageJ software.

(35) TABLE-US-00005 TABLE 3 Enzymes used in selective digests for competition assays
Virus 1 | Virus 2 | Region amplified | Enzyme | Restriction site
Individual competition experiments
WT | R1/R2 Permuted | 2 | HindIII | In R1/R2 Permuted
WT | R1/R2 CpG-high | 1 | BamHI | In R1/R2 CpG-high
WT | R1/R2 UpA-high | 2 | ScaI | In R1/R2 UpA-high
WT | R1/R2 CpG-low | 2 | SphI | In R1/R2 CpG-low
WT | R1/R2 UpA-low | 2 | EcoRV | In WT
Pairwise competition experiments
WT | R1/R2 Permuted | 2 | HindIII | In R1/R2 Permuted
WT | R1 CpG/UpA-low | 1 | EcoRV | In WT
WT | R2 CpG/UpA-low | 2 | EcoRV | In WT
WT | R1/R2 CpG-low | 2 | EcoRV | In R1/R2 CpG-low
WT | R1/R2 UpA-low | 2 | EcoRV | In WT
WT | R1/R2 CpG/UpA-low | 2 | SphI | In WT
R1/R2 Permuted | R1 CpG/UpA-low | 1 | SphI | In R1/R2 Permuted
R1/R2 Permuted | R2 CpG/UpA-low | 2 | HindIII | In R1/R2 Permuted
R1/R2 Permuted | R1/R2 CpG-low | 2 | HindIII | In R1/R2 Permuted
R1/R2 Permuted | R1/R2 UpA-low | 2 | HindIII | In R1/R2 Permuted
R1/R2 Permuted | R1/R2 CpG/UpA-low | 2 | HindIII | In R1/R2 Permuted
R1 CpG/UpA-low | R2 CpG/UpA-low | 1 | EcoRV | In R2 CpG/UpA-low
R1 CpG/UpA-low | R1/R2 CpG-low | 2 | SphI | In R1/R2 CpG-low
R1 CpG/UpA-low | R1/R2 UpA-low | 2 | EcoRV | In R1 CpG/UpA-low
R1 CpG/UpA-low | R1/R2 CpG/UpA-low | 2 | EcoRV | In R1 CpG/UpA-low
R2 CpG/UpA-low | R1/R2 CpG-low | 2 | EcoRV | In R1/R2 CpG-low
R2 CpG/UpA-low | R1/R2 UpA-low | 2 | SphI | In R2 CpG/UpA-low
R2 CpG/UpA-low | R1/R2 CpG/UpA-low | 1 | EcoRV | In R2 CpG/UpA-low
R1/R2 CpG-low | R1/R2 UpA-low | 2 | EcoRV | In R1/R2 CpG-low
R1/R2 CpG-low | R1/R2 CpG/UpA-low | 2 | EcoRV | In R1/R2 CpG-low
R1/R2 UpA-low | R1/R2 CpG/UpA-low | 2 | SphI | In R1/R2 CpG/UpA-low

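The figure of 21 pairwise assays follows from choosing 2 of the 7 viruses without regard to order. As a minimal sketch, the pairings can be enumerated as below; the virus labels are taken from the pairwise section of Table 3, and the variable names are illustrative only.

```python
from itertools import combinations

# The seven viruses that appear in the pairwise section of Table 3.
viruses = [
    "WT",
    "R1/R2 Permuted",
    "R1 CpG/UpA-low",
    "R2 CpG/UpA-low",
    "R1/R2 CpG-low",
    "R1/R2 UpA-low",
    "R1/R2 CpG/UpA-low",
]

# Every unordered pair of distinct viruses: C(7, 2) = 21 assays.
pairs = list(combinations(viruses, 2))

for virus_1, virus_2 in pairs:
    print(f"{virus_1} vs {virus_2}")
print(len(pairs))
```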
Early Intra-Cellular Replication Kinetics

(36) To induce synchronous infection, RD cells in 24-well plates were cold-treated at 4° C. for 5 minutes before inoculation with wild type or mutant virus normalised for genome copy number. A total of 2×10⁸ genome copies (1000 per cell) were applied to each well, and the cells were maintained at 4° C. for a further 30 minutes before being moved to 37° C. Cells were washed twice with PBS and then trypsinised 1 hour or 4 hours post infection. The cells were then pelleted and washed again in PBS before RNA was isolated and viral copy number determined by qRT-PCR. Copy number was normalised against the housekeeping gene GAPDH (qRT-PCR primers: Sense GAAATCCCATCACCATCTTCCAGG (SEQ ID NO 31); Antisense GAGCCCCAGCCTTCTCCATG (SEQ ID NO 32)).

(37) R1 Transfection—Creating the Transcripts

(38) RNA transcripts were made from Region 1 of the E7 WT and mutant viruses by linearising the original cloning plasmid containing the synthetic insert with HpaI, and carrying out a T7 transcription reaction. The integrity of the 1.3 kb RNA transcripts was confirmed using an Agilent Bioanalyser. A549 cells in 24-well plates were transfected with 250 μl RNA using 1.5 μl Lipofectamine 2000 (Invitrogen) per well, and cellular RNA was harvested 6 hours later. Poly I:C (5 pg/well) was transfected as a positive control. Induction of IFNβ was analysed by qRT-PCR (Primers: Sense GACCAACAAGTGTCTCCTCCAAA (SEQ ID NO 33); antisense GAACTGCTGCAGCTGCTTAATC (SEQ ID NO 34)) using cycling conditions of 95° C. for 10 mins, followed by 40 cycles of 95° C. for 15 s and 60° C. for 60 s. Copy number was normalised against GAPDH.

Results

(39) Strategy for Maximising or Minimising CpG/UpA Content in Mutant Viruses

(40) Like other small RNA viruses, the frequency of CpG dinucleotides in the E7 genome was suppressed relative to the expected frequency based on its G+C content, with an observed to expected ratio of CpG dinucleotides in the coding sequence of E7 of 0.58. Frequencies of UpA dinucleotides were also suppressed in the E7 genome (observed to expected ratio of 0.78).
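The observed to expected ratios quoted here can be illustrated with a short calculation. The sketch below assumes the commonly used definition, in which the observed dinucleotide frequency is divided by the product of the two mononucleotide frequencies; the source does not state the exact formula or software settings used, and the function name is illustrative.

```python
from collections import Counter


def observed_expected_ratio(seq: str, dinucleotide: str) -> float:
    """Observed/expected ratio of one dinucleotide in a single sequence.

    Assumed definition: f(XY) / (f(X) * f(Y)). Raises ZeroDivisionError
    if either base of the dinucleotide is absent from the sequence.
    """
    seq = seq.upper().replace("T", "U")  # work in RNA notation
    first, second = dinucleotide.upper().replace("T", "U")
    n = len(seq)
    base_counts = Counter(seq)
    # Observed frequency over the n - 1 overlapping dinucleotide windows.
    hits = sum(seq[i:i + 2] == first + second for i in range(n - 1))
    observed = hits / (n - 1)
    # Expected frequency if the two bases occurred independently.
    expected = (base_counts[first] / n) * (base_counts[second] / n)
    return observed / expected


# Toy sequence only; not derived from the E7 genome.
toy = "ACGU" * 5
print(round(observed_expected_ratio(toy, "CG"), 3))
print(round(observed_expected_ratio(toy, "UA"), 3))
```

A ratio below 1, such as the 0.58 reported for CpG in the E7 coding sequence, indicates that the dinucleotide occurs less often than the base composition alone would predict.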

(41) To investigate whether CpG and UpA dinucleotide frequencies influenced the ability of E7 to replicate in vitro, we created a series of mutated viruses in which frequencies of both dinucleotides were changed from their native levels. This was achieved using the reverse genetics system developed for enteroviruses, in the current study with the pT7:E7 infectious clone. RNA transcripts generated from a linearised plasmid containing the E7 complete genome sequence generate infectious virus for phenotypic characterisation after transfection into a wide range of mammalian cells.

(42) To select sequences for mutagenesis, we sought to avoid regions of the genome that contained RNA elements required for replication or translation functions of the virus, such as the cis-replicating element embedded in the 2C coding sequence (Goodfellow et al. 2000, Journal of Virology 74: 4590-4600).

(43) Although incompletely located and functionally characterised to date, the presence of required non-coding elements can be revealed through analysis of RNA secondary structure formation in these regions and through suppression of synonymous sequence variability that reflects non-coding functional constraints on sequence change in these regions (FIG. 8). By scanning an alignment of complete genome sequences of each of the currently described species B serotypes (including the pT7:E7 sequence of the infectious clone), an area of marked suppression of sequence variability was found to co-localise in the 2C region with the CRE. Calculation of folding energies to detect RNA secondary structure in the genome showed prominent regions of structure in the 5′UTR, 3′UTR and the CRE. The remainder of the genome showed no evidence for consistent RNA structure formation (MFED values around zero).

(44) The combination of unrestricted synonymous variability and an absence of RNA secondary structure over long stretches of the E7 genome provided opportunities for altering dinucleotide frequencies without impairing virus replication for other reasons. Two genome regions (at positions 1878-3119 and 5403-6462) were selected for mutagenesis based on these criteria. Sequences were modified by replacing nucleotides within CpG or UpA dinucleotides with alternative bases that preserved coding. It was possible to remove all CpG dinucleotides from both regions and reduce UpA to frequencies approximately one third of wild type levels (Table 1; CpG-low and UpA-low insert sequences). As an alternative strategy to maximise frequencies of these dinucleotides, every site that could tolerate the creation of these dinucleotides without changing coding was identified and mutated to create sequences with 2.5-3× their naturally occurring frequencies (Table 1; CpG-high, UpA-high). To ensure that sequence disruption did not damage or destroy undetected replication elements within Regions 1 and 2, sequences from these regions were permuted using the algorithm CDLR in the SSE sequence package (E7-permuted in Table 1). This randomises the order of codons within the sequence while maintaining coding and dinucleotide frequencies through swaps between equivalently coding triplets in the same upstream and downstream dinucleotide contexts. All insert sequences were then synthesised and cloned into the pT7:E7 infectious clone using naturally occurring restriction sites. Clones were created with one or both regions replaced by modified insert sequences.
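The idea of replacing bases within CpG or UpA dinucleotides while preserving coding can be sketched computationally. The example below is not the procedure used to design the inserts in Table 1, which the text does not specify at this level of detail. It is an illustrative dynamic programme over synonymous codons, under the assumptions that the standard genetic code applies, that the sequence is an in-frame DNA-notation string (UpA appears as TA), that CpG and UpA are weighted equally, and that flanking sequence context is ignored. All names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
SYNONYMS = {}
for codon, aa in CODON_TO_AA.items():
    SYNONYMS.setdefault(aa, []).append(codon)

TARGETS = ("CG", "TA")  # CpG and UpA (written TpA in DNA notation)


def count_targets(seq):
    """Count CpG plus UpA dinucleotides over all overlapping windows."""
    return sum(seq[i:i + 2] in TARGETS for i in range(len(seq) - 1))


def translate(seq):
    """Translate an in-frame DNA string with the standard genetic code."""
    return "".join(CODON_TO_AA[seq[i:i + 3]] for i in range(0, len(seq), 3))


def minimise_cpg_upa(seq):
    """Return a synonymous recoding with the fewest CpG + UpA dinucleotides."""
    if not seq or len(seq) % 3:
        raise ValueError("sequence must be a non-empty, in-frame coding region")
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    # best[c] = (cost, path) of the cheapest recoding so far that ends in codon c
    best = {c: (count_targets(c), [c]) for c in SYNONYMS[CODON_TO_AA[codons[0]]]}
    for original in codons[1:]:
        step = {}
        for c in SYNONYMS[CODON_TO_AA[original]]:
            # prev[-1] + c covers the codon junction and both in-codon windows.
            cost, path = min(
                (prev_cost + count_targets(prev[-1] + c), prev_path)
                for prev, (prev_cost, prev_path) in best.items()
            )
            step[c] = (cost, path + [c])
        best = step
    return "".join(min(best.values())[1])


# Toy input only; not a sequence from the E7 genome.
toy = "ATGGCGTACGATCGTTAA"
recoded = minimise_cpg_upa(toy)
print(translate(toy), count_targets(toy))
print(translate(recoded), count_targets(recoded))
```

In the toy input one UpA cannot be removed because both tyrosine codons (TAT, TAC) contain it, which mirrors the observation that UpA could be reduced but not eliminated without changing coding.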

(45) Replicative Fitness of Mutants with Modified CpG/UpA Frequencies

(46) Wild type E7 and mutant viruses were recovered in tissue culture by transfecting whole-genome RNA sequences obtained through T7 transcription of pT7:E7. Recovered virus was then titred by TCID50 and used in subsequent experiments.

(47) i) Particle to infectivity ratio. RNA copy to infectivity ratios were determined by extracting viral RNA from a known infectious titre of each virus, and carrying out qRT-PCR. The ratios are shown in FIG. 1. The RNA to infectivity ratio of the permuted double region mutant (247±9.2) was similar to that of the WT E7 virus (354±8.0), indicating that the process of synonymous nucleotide replacement itself does not affect RNA to infectivity ratio where dinucleotide frequencies are kept constant. In contrast, increasing either the CpG or UpA dinucleotide frequency drastically affected RNA to infectivity ratio, with the value for the double region CpG-high mutant being approximately 350 times the WT value (128,840±31698.6) and the UpA-high mutant approximately 20 times higher (6233±883.6). The RNA to infectivity ratio for the double region CpG-low and UpA-low mutants was comparable to the WT.
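The approximate fold differences quoted above can be checked directly from the reported mean ratios. The short sketch below uses only the mean values given in this paragraph and ignores the stated errors; the variable names are illustrative.

```python
# Mean RNA to infectivity ratios as reported in the text (errors omitted).
rna_to_infectivity = {
    "WT": 354,
    "R1/R2 permuted": 247,
    "R1/R2 CpG-high": 128840,
    "R1/R2 UpA-high": 6233,
}

# Express each ratio as a multiple of the wild type value.
wt = rna_to_infectivity["WT"]
fold_vs_wt = {virus: ratio / wt for virus, ratio in rna_to_infectivity.items()}

for virus, fold in fold_vs_wt.items():
    print(f"{virus}: {fold:.1f}x WT")
```

On these means the CpG-high mutant comes out at roughly 364 times the WT value and the UpA-high mutant at roughly 18 times, consistent with the rounded figures of approximately 350 and 20.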

(48) ii) Replication kinetics with low MOI infection. In a low-MOI multi-step infection the growth kinetics of the E7 mutants were compared to those of the WT. Increasing the CpG or UpA dinucleotide frequency caused a severe attenuation of viral replication, resulting in a viral output 6854-fold lower in the R1/R2 CpG-high than the WT after 24 hours, and a 30-fold lower output in the R1/R2 UpA-high mutant (FIG. 2a). Mutant viruses replicated more slowly as well as producing a lower final output of infectious particles. Dose-dependency was demonstrated, as viruses with only one region mutated tended to replicate better than the double region mutants. Increasing dinucleotide frequencies in Region 2 was more detrimental to viral replication than in Region 1, despite its shorter length (1 kb compared to 1.3 kb). R1 CpG-high mutants replicated only 144-fold less than wild type at 24 hours, whilst R2 CpG-high mutants replicated 1487-fold less (FIG. 2c). Amongst the UpA-high mutants, replication was actually improved by modifying R1, giving a 10-fold higher output than wild type, whilst the R2 mutant fared slightly worse than the double mutant. The replication rate of the R1/R2 permuted control was indistinguishable from wild type. Lowering the CpG and UpA dinucleotide frequency compared to the WT level actually had a positive effect on viral replication, albeit more subtle than for the high mutants (FIG. 2b, c). The replication rates and final viral outputs of the CpG-low and UpA-low double mutants were similar to wild type; however, the replication of the R1/R2 CpG/UpA-low double mutant was 10-fold higher than the wild type at both 18 and 24 hours post infection.

(49) iii) Plaque morphology. Increasing CpG and UpA frequency also negatively affected plaque area (FIGS. 3a-b). The size reduction in CpG-high mutants was dose-dependent, with the R1 mutant plaque area 3.5-fold lower than the E7 WT and the R1/R2 mutant 8.8-fold lower. R1/R2 UpA-high plaques were on average 3.2-fold smaller than the wild type, demonstrating again a less severe phenotype than the equivalent double region CpG-high mutant. The area of R1/R2 UpA-low plaques was comparable to WT, whilst the R1/R2 CpG-low mutant produced significantly larger plaques, 1.4-fold greater than the WT.

(50) iv) Replication kinetics of a sub-genomic replicon. The replication kinetics of CpG- and UpA-low mutants were further characterised using a sub-genomic replicon system expressing a luciferase gene, in order to provide a more sensitive measure of viral genome replication. Bioinformatic analysis of the original pRiboE7luc 1.7 kb firefly luciferase gene revealed a strikingly high observed to expected CpG ratio of 1.242. This is characteristic of insect genomes, in which CpG frequency is not suppressed (Burge et al. 1992, Proceedings of the National Academy of Sciences of the United States of America 89: 1358-1362). Despite the widespread use of such reporter systems, the results obtained in the current study and those of Burns et al. (2009) suggested that the high CpG ratio could drastically impede the replication rate of this viral replicon in mammalian cell lines. A replacement luciferase gene was therefore designed in which the CpG ratio was reduced to 0.013 and the UpA ratio to 0.145 (from 0.699) through synonymous substitution, as described previously. Following this, Region 2 of the resulting modified replicon was replaced with the CpG-low or UpA-low inserts used in generating the original double region mutants. Luminescence was then analysed over a 12-hour time-course following transfection of each replicon (FIG. 5). A dramatic increase in replicative ability was conferred by the replacement of the original insect luciferase gene with the synthetic CpG/UpA low gene, giving a 100-fold difference in relative luminescence at 4 hours. Replication rate was heightened further by the addition of the Region 2 CpG- or UpA-low inserts, to a maximum of 6-fold after 6 (CpG-low) or 4 (UpA-low) hours relative to the pRiboE7luc CpG/UpA low replicon. The results demonstrate that by reducing CpG or UpA frequencies to below wild type levels, replicative fitness of E7 can actually be improved in a cell culture environment. Furthermore, the efficiency of transgenic reporter genes may be improved by at least 100-fold by optimising CpG and UpA frequencies according to the genetic system under study.

(51) Investigation of Virus Particle Integrity

(52) In order to determine whether the impaired replication rate observed in CpG and UpA-high mutants was due to a reduction in the ability of virus particles to enter cells, a comparison was made between the number of virus particles used to infect cells and the number of intracellular viral genome copies present immediately post infection. One hour after a synchronous infection with 1000 virus particles per cell (as determined by qRT-PCR), the number of intracellular viral genome copies was found to be similar between viruses, with 42 per cell in wild type E7, 19 per cell in R1/R2 CpG-high, and 36 per cell in R1/R2 UpA-high (see FIG. 4). Four hours post infection, after initiation of viral genome replication, a clear differentiation was observed between viruses. The number of wild type genome copies had increased to 2362 per cell, whilst CpG-high copies remained at 58 per cell and UpA-high at 207 per cell. Increasing CpG or UpA dinucleotide frequencies therefore affects viral genome replication at an early stage post infection.
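To make the divergence between 1 and 4 hours explicit, the per-cell copy numbers quoted above can be converted into fold increases. This is simple arithmetic on the reported values rather than an additional measurement; the variable names are illustrative.

```python
# Intracellular genome copies per cell at (1 h, 4 h) post infection, from the text.
copies_per_cell = {
    "WT": (42, 2362),
    "R1/R2 CpG-high": (19, 58),
    "R1/R2 UpA-high": (36, 207),
}

# Fold increase in copy number between the two time points.
fold_increase = {
    virus: late / early for virus, (early, late) in copies_per_cell.items()
}

for virus, fold in fold_increase.items():
    print(f"{virus}: {fold:.1f}-fold increase between 1 h and 4 h")
```

On these figures the wild type genome increased about 56-fold over three hours, compared with about 3-fold for R1/R2 CpG-high and about 6-fold for R1/R2 UpA-high, although entry at 1 hour differed by only around 2-fold between viruses.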

(53) Fitness Comparison of Modified Viruses using Competition Assays

(54) The relative fitness of high and low mutant viruses compared to E7 WT was confirmed using competition assays. Following infection with an equal MOI of each virus and serial passage in tissue culture, R1/R2 CpG-high and R1/R2 UpA-high each became rapidly out-competed by the WT, being un-detectable by PCR after 5 passages (FIG. 6). Further analysis of CpG-high mutants showed that the R1/R2 mutant was already being out-competed after 1 passage, whereas the R1 and R2 mutants were out-competed more slowly due to their higher relative fitness. Similarly, the individual R1 and R2 UpA-high mutants were still abundant after 5 passages.

(55) Confirming the replicative advantage revealed by the CpG-low and UpA-low replicons, the R1/R2 CpG- and UpA-low mutants demonstrated a higher relative fitness than WT, out-competing it completely after 15 passages, and showing at least 90% prevalence after only 10 passages (FIG. 6). To investigate this phenomenon further, a pairwise competition experiment was carried out whereby combinations of single or double region CpG- and/or UpA-low mutants were competed against one another, allowing a fitness ranking to be determined (FIG. 7a-b). The R1/R2 CpG/UpA-low mutant had the highest fitness, completely out-competing almost all of the other viruses by passage 6. The double region CpG-low ranked second, followed by the single region R1 CpG/UpA-low mutant. Lowering CpG/UpA frequency in Region 1 was demonstrated to have more effect than in R2, as the R2 mutant was rapidly out-competed by R1 CpG/UpA-low as well as the double region UpA-low mutant, an effect that might be expected due to the relative sizes of the modified fragments (Region 1 is 1.3 kb whereas R2 is 1 kb). The reduction of CpG frequency was shown to have a greater effect than that of UpA, whereas increasing the CpG level was more detrimental to viral replication than UpA.

(56) Effect of Dinucleotide Frequency Changes in other RNA Viruses

(57) To investigate whether the replication enhancement observed in E7 extends to other virus systems, the inventor constructed mutants of the murine Theiler's virus (TMEV), a picornavirus in the genus Cardiovirus, and of influenza A virus (IAV), with regions of the genome replaced with modified coding sequences. These were similarly designed to contain elevated or lowered CpG and UpA dinucleotide frequencies while retaining protein coding and avoiding areas of the genome containing known or suspected RNA secondary structures or packaging elements (IAV).

(58) Replication-competent mutants of TMEV were constructed with a region of the genome between positions 5445-6702 replaced with modified sequences (numbering based on the TMEV GD7 clone [accession number X56019]). Mutants with elevated frequencies of CpG and UpA showed substantial impairment of virus replication (FIG. 9) while the CpG/UpA-low mutant showed enhanced replication compared to wild type (WT) virus. CpG- and UpA-high mutants showed elevated RNA/infectivity ratios compared to WT. The degree of replication enhancement/attenuation observed in TMEV was similar in extent to that of E7 mutants with comparable degrees of genome replacement (single region mutants).

(59) Several mutants of IAV have been constructed in which one or more genome segments were replaced with modified insert sequences. As an example of the results obtained, mutants with a segment with increased CpG or UpA showed attenuated replication and an increased RNA/infectivity ratio. These changes in phenotype were comparable in magnitude to those observed in E7 (and TMEV). The replication cycle of IAV is substantially different from those of E7 and TMEV, which indicates that the restrictions imposed by possession of CpG and UpA dinucleotides on replication/gene expression likely represent fundamental aspects of RNA virus replication. Dinucleotide frequencies therefore may influence replication rates of a much wider range of mammalian, avian and plant viruses that show similar suppression of CpG and UpA dinucleotide frequencies.

(60) Influence of Dinucleotide Frequencies on Reporter Gene Expression.

(61) A variety of genes are used as reporters or selectable markers in biotechnology, as components of expression vectors, transgenes and replicons. Reporter genes or selectable markers are frequently derived from prokaryotes (e.g. antibiotic resistance genes) or lower eukaryotes (e.g. luciferase, green fluorescent protein). Most derive from organisms with reduced or absent genome DNA methylation and consequently lack the suppression of CpG dinucleotides observed in vertebrate sequences and in RNA viruses infecting them. The inventor hypothesised that high CpG frequencies in commonly used reporter genes such as firefly luciferase (derived from the insect Photinus pyralis) may have a generic, harmful effect on gene expression and replicative ability of replicons containing them. The inventor has previously observed substantial enhancement in luciferase expression and replication of the E7 replicon through insertion of a zero-CpG, low UpA replacement luciferase sequence. The inventor has now observed the same phenomenon in the HCV replicon.

(62) The Con1 replicon is widely used to study the replication of hepatitis C virus (Lohmann et al. 1999, Science 285: 110-113). A currently widely used Con1-derived construct (Krieger et al. 2001, J. Virol. 75:4614-4624) contains a luciferase reporter gene similar to that used in the E7 replicon and which shows similarly elevated CpG frequencies. The inventor replaced this with a CpG-zero, UpA-low synthetic sequence and compared luciferase expression with the parental sequence.

(63) The degree of replication enhancement of the HCV replicon exceeded even that of E7 (see FIG. 10). Remarkably, in its unmodified form, the Con1 HCV replicon has been used in replication assays in academic research and by the pharmaceutical industry for antiviral development for 12-13 years without any recognition that its replication is fundamentally compromised by inserted reporter genes. This underlines the novelty of the discovery described in the instant disclosure.

(64) Similar modifications can be made to a red fluorescent protein (RFP) expressing HCV replicon construct. In this specific case, RFP sequences commonly used in transgenes and other vectors show CpG frequencies of over 0.6 (observed to expected ratio), which potentially also influence their expression and mediate unintended cellular activation processes.

(65) Not only does luciferase (and likely other high CpG reporter genes) reduce the replication of replicons (e.g. E7 and HCV) but their intracellular expression has a likely substantial effect on the non-physiological activation of cellular defence pathways (Atkinson et al. 2014, Nucleic acids research, gku075). These have potentially compromised studies of effects of innate immune responses to viral replication in cells. Similar concerns about potential toxicity and cellular activation effects naturally arise when considering the use of these and other sequences with high CpG frequencies as selection or reporter genes in wider areas of biotechnology. The instability of many sequences used as transgenes may originate through recruitment of innate and inflammatory responses against cells expressing such reporter genes or selection markers.

(66) CpG and UpA Removal to Enhance Virus Replication in the Manufacture of Inactivated Virus Vaccines.

(67) By quantitative PCR and infectivity assays, accelerated replication of CpG/UpA-low mutants in multistep replication assays has been demonstrated, but to reinforce this it is useful to show further that enhanced replication produces greater yields of viral proteins that represent the protective component of a vaccine.

(68) The inventor infected RD cells with wild type echovirus 7 and the CpG/UpA-low mutant. Cells were harvested at several time points after infection, and expression of viral capsid protein extracted from cells and supernatant was quantified by Western blot using a specific anti-capsid monoclonal antibody (FIG. 11).

(69) The CpG/UpA-low echovirus 7 mutant showed enhanced capsid protein expression throughout the time course of the experiment, quantified at levels 2-fold higher than the WT control at 12 hours and increasing to 14.5-fold at 18 hours. Translated to a poliovirus system, this provides the evidence required for the ability of this mutational process to substantially improve inactivated virus vaccine production yields.

(70) The experimental results depicted in FIG. 11 were obtained from a mutant E7 with approximately 30% of the genome replaced by CpG/UpA-low mutated sequences. Further enhancement of virus replication and viral protein production can almost certainly be achieved through further replacement of sequences in other parts of the coding region of the genome. In E7 and likely in poliovirus, one is typically able to replace up to 80% of the genome with CpG/UpA-low sequences and achieve further enhancement of virus replication. For influenza A virus it is expected that segments 1, 4, 5 and 6 (collectively approximately 43% of the genome) can be replaced with CpG/UpA-low sequences. Segments 4 and 6 encode the haemagglutinin and neuraminidase proteins that represent the principal protective components in the inactivated IAV vaccine.

DISCUSSION

(71) High Mutants

(72) The first part of this study demonstrated that specifically increasing the frequency of CpG or UpA dinucleotides in E7 results in severe viral attenuation. Attenuation was characterised by a dramatic reduction in replication rate, smaller plaque area, an elevated RNA to infectivity ratio and a low competitive fitness relative to WT E7. The results agree with the outcome of previous studies in poliovirus, in which codon replacement or de-optimisation leading to an increase in CpG/UpA frequency correlated negatively with replicative fitness (Burns et al. 2009, J. Virol. 83:9957-9969, Coleman et al. 2008, Science. 320:1784-1787). An elevated RNA to infectivity ratio due to higher CpG and UpA frequencies was also observed in poliovirus (Burns et al. 2009, J. Virol. 83:9957-9969). Increasing CpG and UpA in E7 had a greater effect than in poliovirus, where introducing 105 new CpG dinucleotides in the capsid region led to approximately a 3-fold reduction in infectivity output (Burns et al. 2009, J. Virol. 83:9957-9969). In E7, introducing 129 new CpGs in the capsid region led to a 74-fold reduction in infectivity titre, whilst introducing 116 CpGs into the region of non-structural genes caused a 7500-fold reduction. Similar experiments are currently underway using Theiler's murine encephalomyelitis virus (TMEV) and influenza A virus, in which increased CpG or UpA frequency also results in a decrease in viral replication (data not shown). Our results show definitively that experimental attenuation of viral fitness is specifically related to CpG and UpA frequencies and is irrespective of %G+C content, also dispelling theories that fitness is determined by non-preferred codon replacement itself or by codon pair bias (Coleman et al. 2008, Science. 320:1784-1787, Burns et al. 2009, J. Virol. 83:9957-9969). The permuted control used in this study negates the possibility that attenuation is due to disruption in RNA secondary structure. Furthermore, replication defects are unlikely to result from a decrease in translational efficiency, as previous studies have shown that protein synthesis levels are unaltered even for highly attenuated viruses (Burns et al. 2006, J. Virol. 80:3259-3272, Burns et al. 2009, J. Virol. 83:9957-9969).

(73) Changes in CpG frequency had a greater effect on viral replication than changes in UpA levels, being both more beneficial to replication when lowered, and more detrimental when raised. When competed directly, the double region CpG-low mutant showed a clear selective advantage over its counterpart UpA-low mutant. This could be attributed to the differences between final CpG and UpA frequency in the modified regions; CpGs were eliminated to a greater extent than UpAs in the low mutant, whilst more were introduced in the high mutant. However, this seems unlikely to account for the difference in fitness. In poliovirus, CpG-high mutants also exhibited a more severe attenuation than UpA-high mutants (Burns et al. 2009, J. Virol. 83:9957-9969), and selection against CpG dinucleotides has been shown to be greater than against UpA during serial passage of codon-deoptimised virus (Burns et al. 2006, J. Virol. 80:3259-3272). The dissimilar patterns of CpG and UpA suppression amongst organisms point to different selective pressures acting upon each dinucleotide (Burns et al. 2009, J. Virol. 83:9957-9969). CpG frequency is widely suppressed in higher eukaryotes and the small viruses that infect them (Karlin et al. 1994, J Virol 68, 2889-2897, Burge et al. 1992, Proceedings of the National Academy of Sciences of the United States of America 89: 1358-1362), whilst UpA suppression is almost universal. UpA-rich RNA is degraded in mammalian host cells by the antiviral endonuclease RNase L, which cleaves UpU or UpA dinucleotides in ssRNA (Washenberger et al. 2007, Virus Res 130, 85-95, Duan and Antezana. 2003, J Mol Evol 57, 694-701). Not being subject to methylation, small RNA viruses may have evolved to mimic both the CpG and UpA dinucleotide composition of their hosts, but for different evolutionary reasons (Burns et al. 2009, J. Virol. 83:9957-9969). The difference between CpG-suppressed mammalian genomes and non-suppressed lower eukaryote genomes may account for the results observed by Nougairede and colleagues (Nougairede et al. 2013, PLoS Pathog 9, e1003172), who found that viruses with de-optimised codons had a higher relative fitness in insect cells compared to mammalian cells. These data support the hypothesis that higher eukaryotes can identify non-self RNA by detecting higher CpG and UpA frequencies than are present in their own RNA.

(74) Low Mutants

(75) Surprisingly, viral replication was enhanced by designing mutants with lower CpG and UpA frequencies than WT. Mutants in which CpGs were eliminated entirely from two modified regions (representing 30% of the genome) out-competed WT in serial passage, whilst a replicon with CpGs removed from only 14% of the genome showed a 6-fold higher replication rate than the WT. Similar results were obtained for UpA-low mutants, despite the fact that UpAs could not be completely eliminated from the modified regions. Mutants in which both CpG and UpA frequencies were minimised in both regions showed an even higher level of replicative fitness. These unprecedented findings, confirmed by several different assays, reveal an entirely novel phenomenon that would not have been predicted based on the results obtained from the CpG- and UpA-high mutants. If the host mechanism for detecting CpG and UpA in foreign RNA is based on sensing dinucleotide frequencies higher than in its own RNA, there is no immediate reason why viruses with non-physiologically lowered frequencies should do better than those with frequencies identical to the host. One explanation is that the system for recognising and limiting replication of RNA with high CpG/UpA is optimised at a sensitivity level that prioritises avoiding false negatives. Because of the importance of not letting viral RNA go undetected, RNA with a WT level of CpG/UpA could occasionally be targeted. In this situation, RNA with low CpG/UpA would have an advantage. Whether the CpG/UpA-low mutants could maintain their replicative advantage in a whole organism system is unclear. The heightened replication rates observed in viruses with reduced CpG/UpA ratios could provide opportunities for vaccine production. Where the vaccine involves a killed virus, an improved replication rate in cell culture would allow a higher production rate of a virus with identical antigenicity to the original.

(76) The various molecular biological and other associated techniques to perform the present invention are well known to the skilled person, and there is a plethora of reference material available on the subject which would form part of their common general knowledge. While specific techniques have been described in detail above, it is perfectly within the ability of the skilled person to modify or adapt the techniques described above to work within the scope of the present invention. A suitable reference text in respect of the various techniques discussed in the present application is Green & Sambrook, Molecular Cloning: A Laboratory Manual (Fourth Edition), 2012, Cold Spring Harbor Laboratory Press.
