RNA sequence adaptation
11692002 · 2023-07-04
Assignee
Inventors
- Stefan Heinz (Tübingen, DE)
- Tilmann Roos (Tübingen, DE)
- Dominik Vahrenhorst (Tübingen, DE)
- Markus CONZELMANN (Tübingen, DE)
Cpc classification
A61K31/713
HUMAN NECESSITIES
A61K31/7088
HUMAN NECESSITIES
C12N15/101
CHEMISTRY; METALLURGY
Y02A50/30
GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
International classification
Abstract
The present invention is directed to a method for modifying the retention time of RNA on a chromatographic column. The present invention also concerns a method for purifying RNA from a mixture of at least two RNA species. Furthermore, the present invention relates to a method for co-purifying at least two RNA species from a mixture of at least two RNA species. In particular, the present invention provides a method for harmonizing the numbers of A and/or U nucleotides in at least two RNA species. The present invention is also directed to RNA obtainable by said methods, a composition comprising said RNA or a vaccine comprising said RNA and methods for producing such RNA and compositions. Further, the invention concerns a kit, particularly a kit of parts, comprising the RNA, composition or vaccine. The invention is further directed to a method of treating or preventing a disorder or a disease, first and second medical uses of the RNA, composition and vaccine. Moreover, the present invention concerns a method for providing an adapted RNA sequence or an adapted RNA mixture.
Claims
1. A method for analysis or purification of a mixture comprising at least two harmonized RNA species, the method comprising: a) obtaining the coding sequences for at least two RNA species, said at least two RNA species each having a length of 800 to 20,000 nucleotides wherein the sequence of at least one RNA species is adapted by altering the number of A and/or U nucleotides in the RNA sequence with respect to the number of A and/or U nucleotides in the original RNA sequence, said coding sequences of the at least two RNA species having a harmonized number of encoded A and U nucleotides that is no more than 50 different from each other; b) synthesizing the at least two RNA species to produce at least two harmonized RNA species; and c) analysing and/or purifying a mixture of said at least two harmonized RNA species by chromatography.
2. The method according to claim 1, wherein step b) comprises the separate synthesis of the at least two harmonized RNA species.
3. The method according to claim 2, wherein step b) comprises mixing the at least two harmonized RNA species.
4. The method according to claim 1, wherein step b) comprises the synthesis of the at least two harmonized RNA species in one batch.
5. The method according to claim 1, wherein step b) comprises an in vitro transcription step.
6. The method according to claim 1, wherein at least one RNA species comprises at least 500 nucleotides.
7. The method according to claim 6, wherein the at least two RNA species are mRNAs.
8. The method according to claim 7, wherein the at least two RNA species each comprise a 5′-cap structure.
9. The method according to claim 8, wherein the at least two RNA species each comprise, in 5′ to 3′ direction, the following elements: a) a 5′-cap structure b) optionally, a 5′-UTR element, c) at least one coding region; d) a 3′-UTR element, and e) a poly(A) sequence comprising 10 to 200.
10. The method according to claim 8, wherein the at least two RNA species each encode different Influenza virus hemagglutinin (HA) antigens.
11. The method according to claim 8, wherein the at least two RNA species each comprise a coding region, wherein the coding region has an increased G/C content compared to the G/C content of an original coding sequence, wherein the encoded amino acid sequence is not modified compared to the amino acid sequence encoded by the corresponding original mRNA.
12. The method according to claim 8, wherein the method is applied to at least three RNA species.
13. The method according to claim 12, wherein the at least three RNA species encode different influenza HA antigens.
14. The method according to claim 1, comprising: c) analysing the mixture of said at least two harmonized RNA species by chromatography.
15. The method according to claim 1, wherein the chromatography comprises HPLC.
16. The method according to claim 15, wherein the chromatography comprises reversed phase HPLC.
17. The method according to claim 16, wherein the reversed phase HPLC is with a column that comprises a porous material, selected from the group consisting of polystyrene, a non-alkylated polystyrene, an alkylated polystyrene, a polystyrenedivinylbenzene, a non-alkylated polystyrenedivinylbenzene, an alkylated polystyrenedivinylbenzene, a silica gel, a silica gel modified with non-polar residues, a silica gel modified with alkyl containing residues, selected from butyl-, octyl and/or octadecyl containing residues, a silica gel modified with phenylic residues, and a polymethacrylate.
18. The method according to claim 8, wherein the at least two RNA species each encode different Influenza virus neuraminidase (NA) antigens.
19. The method according to claim 1, wherein the numbers of A and U nucleotides in the sequences of the at least two harmonized RNA species differ from each other by not more than 20.
20. The method according to claim 19, wherein the numbers of A and U nucleotides in the sequences of the at least two harmonized RNA species differ from each other by not more than 10.
21. The method according to claim 13, wherein the numbers of A and U nucleotides in the sequences of the at least two harmonized RNA species differ from each other by not more than 20.
22. The method according to claim 8, wherein the method is applied to at least four RNA species, wherein the at least four RNA species encode different influenza HA antigens.
Description
EXAMPLES
(21) The Examples shown in the following are merely illustrative and shall describe the present invention in a further way. These Examples shall not be construed to limit the present invention thereto.
(22) TABLE 3 Materials used
Material                                            Supplier
U3000 UHPLC-System                                  Thermo Scientific
HPLC column (poly(styrene-divinylbenzene) matrix)   Thermo Scientific
WFI                                                 Fresenius Kabi, Ampuwa
Acetonitrile (MS-grade)                             Fisher Scientific
0.1 M TEAA in WFI (Eluent A)
25% ACN in 0.1 M TEAA (Eluent B)
Example 1: Examination of the Correlation Between Homopolymer Stretches of Nucleotides and HPLC Retention Times
(23) The inventors surprisingly found that it is not the size of an RNA but the total number of adenine nucleotides (A nucleotides) and/or uracil nucleotides (U nucleotides) of an RNA that influences HPLC retention times. Further details are provided in the following.
(24) 1.1. Preparation of DNAs Encoding Firefly Luciferase Including Varying Stretches of Adenines:
(25) The DNA sequence encoding firefly luciferase protein was introduced into a modified pUC19-derived vector backbone to comprise a 5′-UTR derived from the 32L4 ribosomal protein (32L4 TOP 5′-UTR), a 3′-UTR derived from albumin, a histone stem-loop structure, and stretches of varying numbers of adenine nucleotides (also referred to in the following as ‘polyA stretch’ or ‘A homopolymer’) at the 3′-terminal end. The complete RNA sequences are provided in the sequence listing (see Table 4 below).
(26) TABLE 4 Constructs used in the experiment
Encoded protein   Length of polyA stretch   SEQ ID NO:
luciferase        A25                       1
                  A35                       2
                  A50                       3
                  A64                       4
(27) DNA plasmids were linearized using EcoRI and transcribed in vitro using DNA dependent T7 RNA polymerase in the presence of a nucleotide mixture and cap analog under suitable buffer conditions. The obtained individual RNA products were purified using PureMessenger® as described in WO 2008/077592 A1 and subsequently analyzed using HPLC.
(28) 1.2 Determination of HPLC Retention Times:
(29) Individual RNA samples were diluted to 0.1 g/L using water for injection (WFI). 10 µl of the diluted RNA sample was injected into the HPLC column (monolithic poly(styrene-divinylbenzene) matrix). The RP-HPLC analysis was performed using the following conditions:
(30) Gradient 1: Buffer A (0.1 M TEAA (pH 7.0)); Buffer B (0.1 M TEAA (pH 7.0) containing 25% acetonitrile). Starting at 30% buffer B the gradient extended to 32% buffer B in 2 minutes, followed by an extension to 55% buffer B over 15 minutes at a flow rate of 1 ml/min (adapted from WO 2008/077592). Chromatograms were recorded at a wavelength of 260 nm.
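The gradient described above can be written out as a small calculation. The following is a minimal sketch, assuming that both gradient segments are linear and that buffer A contributes no acetonitrile; the names GRADIENT, percent_b and percent_acetonitrile are illustrative and do not appear in the source.

```python
# Gradient 1 as (time in minutes, % buffer B) breakpoints taken from the text:
# 30% B at the start, 32% B after 2 minutes, 55% B after a further 15 minutes.
GRADIENT = [(0.0, 30.0), (2.0, 32.0), (17.0, 55.0)]
ACN_IN_BUFFER_B = 25.0  # % acetonitrile in buffer B (from the text)


def percent_b(t_min):
    """Return % buffer B at time t_min, assuming linear ramps (an assumption)."""
    if t_min <= GRADIENT[0][0]:
        return GRADIENT[0][1]
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
    return GRADIENT[-1][1]  # hold the final composition after the last breakpoint


def percent_acetonitrile(t_min):
    """Effective % acetonitrile in the mobile phase, assuming none in buffer A."""
    return percent_b(t_min) * ACN_IN_BUFFER_B / 100.0


for t in (0, 2, 17):
    print(t, percent_b(t), percent_acetonitrile(t))
```

Under these assumptions the acetonitrile content rises from 7.5% at the start to 13.75% at the end of the gradient, which illustrates how shallow the organic gradient is over the 17-minute run.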
(31) In order to examine a possible correlation between the presence and extent of nucleotide homopolymer stretches on the one hand and the HPLC retention time on the other hand, the chromatograms of the individual HPLC analysis runs were superimposed.
(32) Results
(33) As shown in
(34) Notably, changes in the total number of cytosine nucleotides (C nucleotides) did not have an influence on HPLC retention time (not shown). As only the number of A nucleotides and not the number of C nucleotides influences HPLC retention times, an effect merely caused by elongation of the RNA molecule species can be ruled out.
Example 2: Harmonization of HPLC Retention Times of Different HA RNA Sequences for Co-Purification and/or Co-Analysis by Adaptation of the Adenine Count
(35) The inventors surprisingly found that the adaptation of the total number of A nucleotides in two or more different RNA species (e.g. RNA molecules comprising different sequences encoding influenza HA-B) is suitable to harmonize the HPLC retention times, so that co-purification and/or co-analysis becomes feasible. Further details are provided in the following.
(36) 2.1. Adaptation of the Total Number of A Nucleotides:
(37) As the previous examples show a correlation between the number of A nucleotides in an RNA sequence and the respective HPLC retention time, RNA sequences encoding HA antigens (four different RNA sequences encoding influenza HA) were adapted so that they comprised (essentially) the same number of A nucleotides. The sequence adaptation was performed in such a way that the encoded amino acid sequence was unchanged, either by exploiting the degeneracy of the genetic code (compare with Table 1 and Table 2) or by introducing an adenine stretch into the polyA tail or the UTR of the RNA molecule species.
(38) The goal was to adapt the sequences in a way to facilitate co-purification and/or co-analysis of an RNA mixture comprising different HA RNA molecule species by obtaining a complete overlay of the four chromatograms (harmonization) in HPLC, which is a prerequisite for a cost-effective and fast production of an influenza vaccine based on an mRNA mixture (e.g., for the development of a multivalent/polyvalent influenza RNA vaccine platform, cf.
(39) In order to harmonize the retention times of all RNA molecule species encoding different HA antigens (HA-A and HA-B), GC-optimized DNA sequences encoding different HA proteins of Influenza B were adapted by increasing the number of A nucleotides by adapting the coding sequence (via codon exchange), by elongating the poly A sequence, or by introducing additional A nucleotides into the UTR region (see Table 5 below). The adaptation was performed by increasing the total number of A nucleotides in the HA-B sequences by 9 in order to shift the total number of A nucleotides in the HA-B sequences closer to the number of A nucleotides in the HA-A sequences. DNA constructs and RNA were prepared as explained in Example 1.
(40) TABLE 5 HA-constructs used in the experiment
Encoded antigen   Mode of adaptation                                       A count of RNA*   AU count of RNA**   SEQ ID NO:
HA-B Brisbane     Not adapted                                              467               723                 5
                  9 A nucleotides introduced into cds by codon exchange    476               732                 6
                  9 A stretch introduced into poly A tail                  476               732                 7
                  9 A stretch introduced into the UTR                      476               734                 8
HA-B Phuket       Not adapted                                              458               717                 9
                  9 A nucleotides introduced into cds by codon exchange    467               726                 10
                  9 A stretch in poly A tail                               467               726                 11
                  9 A stretch introduced into the UTR                      467               728                 12
HA-A California   Not adapted                                              476               737                 13
HA-A Hongkong     Not adapted                                              481               729                 14
*A count of RNA: total number of A nucleotides in the respective RNA
**AU count of RNA: total number of A and U nucleotides in the respective RNA
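The effect of the adaptation on the counts in Table 5 can be summarized numerically. The following is a minimal sketch that transcribes the A and AU counts of the non-adapted constructs and of the HA-B variants adapted by codon exchange, and compares the spread (largest minus smallest count) before and after adaptation; the helper name spread is illustrative and not part of the source.

```python
# (A count, AU count) per construct, transcribed from Table 5.
UNADAPTED = {
    "HA-B Brisbane": (467, 723),    # SEQ ID NO: 5
    "HA-B Phuket": (458, 717),      # SEQ ID NO: 9
    "HA-A California": (476, 737),  # SEQ ID NO: 13
    "HA-A Hongkong": (481, 729),    # SEQ ID NO: 14
}

# Same mixture, with both HA-B constructs replaced by their codon-exchange
# variants (SEQ ID NOs: 6 and 10).
ADAPTED = {
    **UNADAPTED,
    "HA-B Brisbane": (476, 732),
    "HA-B Phuket": (467, 726),
}

A_COUNT, AU_COUNT = 0, 1  # tuple positions


def spread(constructs, index):
    """Largest minus smallest count across all constructs of a mixture."""
    values = [counts[index] for counts in constructs.values()]
    return max(values) - min(values)


print("A spread: ", spread(UNADAPTED, A_COUNT), "->", spread(ADAPTED, A_COUNT))
print("AU spread:", spread(UNADAPTED, AU_COUNT), "->", spread(ADAPTED, AU_COUNT))
```

With the values of Table 5, the spread of the A count narrows from 23 to 14 and the spread of the AU count from 20 to 11.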
(41) 2.2. Effect of the Total Number of A Nucleotides on HPLC Retention Time:
(42) HPLC sample preparation and HPLC analysis were performed as described in Example 1. In order to examine the effect of the number of A nucleotides on HPLC retention time, the chromatograms of each RNA species were superimposed and analyzed.
(43)
(44) Results:
(45) The results show that the adaptation of the number of A nucleotides in the RNA sequences (see
(46)
(47) Of note, as analyzed and explained in further detail in Example 5, the surprisingly precise overlap of the HA-A sequences and adapted HA-B sequences as observed in
Example 3: Evaluation of Suitability of HPLC for Co-Analysis of RNA Mixture
(48) The inventors showed that HPLC is a particularly suitable method for co-analysis of an RNA mixture. Further details are provided in the following.
(49) 3.1. Preparation of Test RNA:
(50) RNA for testing the HPLC system was generated according to Example 1.
(51) 3.2. Directed Degradation of RNA and Preparation of RNA Mixtures of Different Integrities:
(52) RNA samples were degraded at 90° C. for 140 minutes. Subsequently, intact RNA and degraded RNA were mixed in different ratios of intact RNA: degraded RNA (90:10, 80:20, 70:30, 60:40, 50:50, 40:60, 30:70, 20:80, and 10:90) and respective RNA mixtures of varying integrities were applied to analytic HPLC. Analytic HPLC was performed as described in Example 1. For analysis, HPLC runs of the different RNA mixtures were superimposed. The results are shown in
(53) Results:
(54) As
Example 4: Harmonization of HPLC Retention Times of NA RNA Sequences for Co-Purification and/or Co-Analysis by Adaptation of the Number of A Nucleotides
(55) The inventors surprisingly found that an RNA mixture (encoding three different NA antigens) comprising RNA species with an adapted number of A nucleotides generates one harmonized HPLC peak, suitable for co-analysis and co-purification. Further details are provided in the following.
(56) 4.1. Adaptation of the Number of A Nucleotides in DNA Encoding NA Proteins of Several Influenza Strains:
(57) The goal was to adapt NA RNA sequences in a way to facilitate co-purification and/or co-analysis of an RNA mixture of different NA RNA molecule species by obtaining a complete overlay of the three chromatograms, which is a prerequisite for a cost-effective and fast production of an RNA-mixture based influenza vaccine (e.g. for the development of a multivalent influenza RNA vaccine, cf.
(58) In order to harmonize the retention time of all RNA molecule species encoding different NA antigens, GC-optimized DNA sequences encoding different NA proteins of Influenza were adapted by decreasing the number of A nucleotides by altering the coding sequence (codon exchange; see Table 6 below). The adaptation was performed in order to decrease the number of A nucleotides in RNA encoding NA H3N2 and mRNA encoding NA H1N1 to essentially match the number of A nucleotides in RNA encoding NA Influenza B.
(59) DNA constructs and RNA were prepared as explained in Example 1.
(60) TABLE 6 NA-constructs used in the experiment
Encoded antigen             Mode of adaptation                         SEQ ID NO:
NA Influenza B (Brisbane)   Not adapted                                15
NA H3N2 (Hongkong)          Not adapted                                16
                            17 A removed from cds by codon exchange    17
NA H1N1 (California)        Not adapted                                18
                            16 A removed from cds by codon exchange    19
(61) 4.2. Effect of the Number of A Nucleotides on HPLC Retention Time:
(62) HPLC sample preparation and HPLC analysis were performed as described in Example 1.
(63) In order to examine the effect of the number of A nucleotides on HPLC retention time, the chromatograms of non-adapted RNA species were superimposed and analyzed. In addition, non-adapted RNA molecule species were mixed (100 ng each), applied as a mixture, and analyzed by HPLC (see
(64) Results
(65) The results show that the adaptation of the number of A nucleotides in the individual RNA sequences of an RNA mixture leads to adaptation of the retention time of the RNA mixture and a discrete HPLC peak (see
(66) Of note, the adaptation (reduction) of the number of A nucleotides in SEQ ID NOs: 17 and 19 was performed by changing serine codon AGC to codon UCC, which led to a decrease in A count and to an increase in U count (AU count was therefore stable; ratio of A:U was decreased), suggesting that the observed slight variation in the HPLC chromatograms of the individually analyzed adapted sequences was caused by a shift in the A:U ratio. Accordingly, adaptation of the A:U ratio can also be used for sequence adaptations according to the invention.
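The bookkeeping behind this observation can be made explicit. The following is a minimal sketch that counts the change in base composition for a single AGC to UCC exchange and scales it to 17 exchanges (the number of A nucleotides removed for SEQ ID NO: 17 according to Table 6); it assumes that every removed A stems from this one exchange type, and the function names are illustrative.

```python
from collections import Counter


def composition_change(old_codon, new_codon):
    """Per-base change in count when old_codon is replaced by new_codon."""
    old, new = Counter(old_codon), Counter(new_codon)
    return {base: new[base] - old[base] for base in "ACGU"}


def total_change(old_codon, new_codon, n_exchanges):
    """Composition change after n_exchanges identical codon exchanges."""
    single = composition_change(old_codon, new_codon)
    return {base: delta * n_exchanges for base, delta in single.items()}


single = composition_change("AGC", "UCC")   # serine codon exchange from the text
seventeen = total_change("AGC", "UCC", 17)  # assumption: all 17 A removed this way

print(single)
print(seventeen, "net AU change:", seventeen["A"] + seventeen["U"])
```

Each exchange removes one A and adds one U, so the AU count is unchanged while the A:U ratio decreases, in line with the explanation given above.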
Example 5: Examination of the Influence of Nucleotides on HPLC Retention Time
(67) As shown in the previous examples, the adaptation of the number of A nucleotides in RNA sequences allows for harmonization of HPLC chromatograms, which is a requirement for co-analysis and/or co-purification. The inventors further found that the number of A and/or U nucleotides correlates with HPLC retention time. That finding provides even more options for adapting an RNA sequence and for harmonizing HPLC chromatograms of RNA mixtures. Further details are provided in the following.
(68) 5.1. Preparation of DNA Encoding HA Proteins of Several Influenza Strains:
(69) DNA sequences encoding different haemagglutinin (HA) and neuraminidase (NA) proteins, two glycoproteins found on the surface of influenza viruses (Influenza A and Influenza B), were generated, and RNA was produced as described in Example 1.
(70) TABLE 7 HA-constructs used in the experiment:
Encoded antigen             SEQ ID NO:
HA-A California             13
HA-A Hongkong               14
HA-B Brisbane               5
HA-B Phuket                 9
NA H1N1 (California)        18
NA H3N2 (Hongkong)          16
NA Influenza B (Brisbane)   15
(71) 5.2. Correlation Between the Total Number of a Nucleotide and/or the Relative Content of a Nucleotide and the HPLC Retention Time:
(72) HPLC sample preparation and HPLC analysis were performed as described in Example 1.
(73) In a first step, the individually produced RNA constructs (RNA species) encoding HA and NA antigens were separately analyzed on HPLC. The superimposed HPLC chromatograms are shown in
(74) For a better understanding of the impact of the nucleotide sequence on HPLC retention time, the correlation between the nucleotide count (A, U, G, and C) and nucleotide content for each RNA molecule species and HPLC retention time was examined.
(75)
(76) Results:
(77)
(78) Notably, the correlation between the number of A nucleotides and the retention time is stronger than the correlation between the number of U nucleotides and the retention time. In line with that, the results of Example 4 also suggested that the effect of A nucleotides on retention time is stronger than the effect of U nucleotides on retention time.
(79) Overall, the number of A and U nucleotides shows the best correlation and will allow for the most precise way for adapting RNA sequences to harmonize RNA mixtures for co-analysis and co-purification.
Example 6: Development of an Automated Nucleotide Adaptation Method (Algorithm)
(80) The inventors developed an automated in silico method (algorithm) to set the number of any nucleotide in an RNA sequence to a certain defined value, without altering the amino acid sequence. In the context of the invention, the automated in silico method was used for sequence adaptation (adaptation of the number of A and/or U nucleotides (AU count)) of RNA sequences to allow harmonization of RNA mixtures for HPLC co-analysis and/or HPLC co-purification. Further details are provided in the following.
(81) 6.1 Sequence Analysis and Definition of Target AU Count:
(82) The objective of the experiment was to generate RNA sequences for an adapted RNA mixture (comprising three different RNA molecule species encoding antibodies) suitable for co-analysis using HPLC. The AU count has to be adapted in all RNA molecule species such that their respective HPLC chromatograms are completely separated (difference in the AU count of at least 70), allowing for co-analysis of their integrity.
(83) Three antibody sequences (SEQ ID NOs: 20-22) were selected and GC optimized DNA (SEQ ID NOs: 23-25) sequences were generated (essentially according to Example 1). Nucleotide numbers were determined for the respective GC optimized sequences (product 1, product 2, product 3; see Table 8 below) to be able to define optimal numbers of A and U (T) nucleotides for HPLC co-analysis.
(84) TABLE 8 Nucleotide numbers for GC optimized constructs:
product   Length   A count   T (U) count   AT (AU) count   SEQ ID NO:
1         81       19        13            32              23
2         258      59        26            85              24
3         429      77        55            132             25
(85) To adapt the RNA molecule species comprised in the RNA mixture for HPLC co-analysis, the target AU counts for product 2 and product 3 were set to the following values, allowing integrity analysis on HPLC when analyzed as an RNA mixture:
(86) TABLE 9 Adaptation strategy for co-analysis of the RNA mixture:
product   AT (AU) count (non-adapted)   Change in AU count   Target AT (AU) count
1         32                            0                    32
2         85                            +17                  102
3         132                           +40                  172
(87) As indicated in Table 9, the target AU count for each product RNA was set in such a manner that the AU counts of the three RNA sequences differ by at least 70 nucleotides (strategy illustrated in
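The adaptation strategy of Table 9 can be checked with a few lines of arithmetic. The following is a minimal sketch that derives the target AU counts from the non-adapted counts and the planned changes, and verifies the minimum separation of 70 stated in the text; the names min_gap and MIN_SEPARATION are illustrative.

```python
# Values transcribed from Table 9 (keyed by product number).
NON_ADAPTED_AU = {1: 32, 2: 85, 3: 132}
CHANGE_IN_AU = {1: 0, 2: 17, 3: 40}
MIN_SEPARATION = 70  # required AU count difference between any two products

TARGET_AU = {p: NON_ADAPTED_AU[p] + CHANGE_IN_AU[p] for p in NON_ADAPTED_AU}


def min_gap(au_counts):
    """Smallest difference between neighbouring AU counts after sorting."""
    values = sorted(au_counts.values())
    return min(b - a for a, b in zip(values, values[1:]))


print("targets:", TARGET_AU)
print("smallest gap before adaptation:", min_gap(NON_ADAPTED_AU))
print("smallest gap after adaptation: ", min_gap(TARGET_AU))
print("separated:", min_gap(TARGET_AU) >= MIN_SEPARATION)
```

Before adaptation the closest pair of products differs by only 47 in AU count; with the target counts of 32, 102 and 172 every neighbouring pair differs by exactly 70.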
(88) 6.2 AU Sequence Adaptation Method:
(89) In the following, the sequence adaptation method is exemplarily described for product 2 (+17 AU) (SEQ ID NO: 24). As the number of A nucleotides in the sequence was larger than the number of T (U) nucleotides, the adaptation values were set to +8A and +9T(U) in order to maintain the distribution of A and U nucleotides in the resulting AU adapted sequence.
(90) In the initial phase of the method (algorithm), a matrix for each codon comprised in the sequence was created, identifying possible changes (herein referred to as “exchange matrix”). An exemplary “exchange matrix” is shown in Formula (I).
(91)
(92) Formula (I) shows that for codon “CGA” a change to an alternative codon offers the option of increasing the number of A nucleotides by 1 (e.g. CGA → AGA), offers the option of increasing the number of C nucleotides by 1 (e.g. CGA → CGC), offers the option of increasing the number of G nucleotides by 1 (e.g. CGA → CGG), and offers the option of increasing the number of T nucleotides by 1 (e.g. CGA → CGT).
(93) Exchange matrices were generated for each individual codon in the sequence. Using said exchange matrices, the potential maximum number of the respective nucleotides (A and T(U) count, respectively) in each codon was determined (without changing the amino acid sequence). Accordingly, all 63 codons of the sequence were analyzed, and the potential alternative codons were assembled in a table structure as shown in Table 10 by way of example.
(94) TABLE 10 Exemplary table of alternative codons allowing for a change in the number of a nucleotide
Codons   Alternative codons
CGA      CGT, CGC, CGG, AGA, AGG
GAT      GAC
GAC      GAT
ATG      no alternative codon
. . .    . . .
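The construction of the alternative-codon table and of an exchange matrix can be sketched in code. The following is a minimal illustration, not the inventors' implementation: it assumes the standard genetic code and represents an exchange matrix as a mapping from each nucleotide to the largest gain that any synonymous codon offers (0 if none). The names CODON_TABLE, alternative_codons and exchange_matrix are illustrative.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering ("*" marks stop codons).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def alternative_codons(codon):
    """All other codons encoding the same amino acid as the given codon."""
    amino_acid = CODON_TABLE[codon]
    return [c for c, aa in CODON_TABLE.items() if aa == amino_acid and c != codon]


def exchange_matrix(codon):
    """Largest possible gain per nucleotide over all synonymous codons.

    A value of 0 means that no synonymous exchange increases that nucleotide.
    """
    alternatives = alternative_codons(codon)
    return {
        base: max([alt.count(base) - codon.count(base) for alt in alternatives] + [0])
        for base in "ACGT"
    }


for codon in ("CGA", "GAT", "GAC", "ATG"):
    print(codon, alternative_codons(codon), exchange_matrix(codon))
```

For “CGA” this yields the alternatives listed in Table 10 and a possible gain of 1 for each of A, C, G and T, in agreement with the description of Formula (I); for “ATG” all entries are 0.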
(95) Next, the sequence according to SEQ ID NO: 24 was iteratively divided into separate codons and stored in table format, which resulted in a list as exemplarily shown in Table 11 (positions 1, 2, 3, 4 . . . 86 and 87 of the sequence are indicated).
(96) TABLE 11 Codon list of SEQ ID NO 24:
Codon position   Codon
1                ATG
2                AGC
3                ATC
4                ATC
. . .            . . .
86               GAG
87               AGC
(97) Next, the list of codons (see Table 11) was analysed for possible codon changes by step-wise iteration, wherein in each iteration step the corresponding codon was analysed using the respective exchange matrix (as outlined above) for potential nucleotide changes. For example, if no changes were theoretically possible in the respective codon, e.g. as in the case of “ATG” or “TGG”, the corresponding exchange matrix as exemplarily shown in Formula (II) was used (* of exchange matrix=0).
(98)
(99) Formula (II) shows that for codon “ATG” a change to an alternative codon offers no option of increasing the number of A nucleotides, C nucleotides, G nucleotides or T nucleotides (as there are no alternative codons for ATG (Met)).
(100) In cases where changes according to the respective exchange matrix (*>0) were theoretically possible, the codon was further analysed to determine whether these changes could be implemented under the premise that e.g. only codons that offer the option of increasing the number of A and/or T(U) nucleotides were adapted. Therefore, the intersection between the target nucleotides (e.g. A and/or T(U)) and the nucleotides that potentially generate a positive result (that is, A and/or T(U) change; see e.g. Formula (I)) in the current exchange matrix was constructed. As a result, each codon was categorized and grouped into three categories:
(101) Category 1 (Category “Favourable”):
(102) Potential codon exchanges allowing an increase in only one target nucleotide (in the present example A or T(U)). For example, the codon “GAC” (Asp) can be changed to “GAU” (Asp) in order to increase the number of A and T(U) nucleotides. No further analysis is required since that modification does not have any further impact (besides the one mentioned above) on the number of A and T(U) nucleotides.
(103) Category 2 (Category “Possible”):
(104) Potential codon exchanges allowing the increase in both target nucleotides (in the present example A and T(U)). For example, codon “GCA” can be changed to “GCU”, which would increase the T(U) count but at the same time decrease the A count. Accordingly, further analysis would be required with respect to codons belonging to this category in order to decide whether the number of one of the two target nucleotides (T(U) in this example) should be increased at the expense of a reduction of the number of the other target nucleotide (A).
(105) Category 3 (Category “Impossible”):
(106) Codons in the RNA sequence for which no alternative codons exist (*=0). Examples of category 3 are “ATG” (Met; start codon) and “UGG” (Trp).
(107) All codons of the original sequence were categorized in that manner. After this step, there were three categories with a total of 87 entries (for 87 codons present in SEQ ID NO: 24). For the next step, category 3 was no longer considered, as a codon change will not influence the target nucleotide count (A, T(U)).
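The categorization step can be sketched as follows. This is one possible reading of the three categories, not the inventors' implementation: a codon is treated as “favourable” if some synonymous codon raises a target nucleotide without lowering the other, as “possible” if a target can only be raised at the expense of the other, and as “impossible” otherwise. Placing codons that have synonymous alternatives but no target-increasing exchange (e.g. “GAT”) in the last group is an assumption, as is the use of the standard genetic code; all names are illustrative.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering ("*" marks stop codons).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def alternative_codons(codon):
    """All other codons encoding the same amino acid as the given codon."""
    amino_acid = CODON_TABLE[codon]
    return [c for c, aa in CODON_TABLE.items() if aa == amino_acid and c != codon]


def categorize(codon, targets=("A", "T")):
    """Assign a codon to 'favourable', 'possible' or 'impossible' (see lead-in)."""
    alternatives = alternative_codons(codon)
    if not alternatives:
        return "impossible"  # category 3: no synonymous codon exists
    deltas = [
        [alt.count(t) - codon.count(t) for t in targets] for alt in alternatives
    ]
    # Category 1: some exchange raises a target and lowers none of the targets.
    if any(max(d) > 0 and min(d) >= 0 for d in deltas):
        return "favourable"
    # Category 2: a target can be raised, but only by lowering another target.
    if any(max(d) > 0 for d in deltas):
        return "possible"
    # Assumption: no exchange raises any target, so the codon is of no use here.
    return "impossible"


for codon in ("GAC", "GCA", "ATG", "TGG", "AGC", "AAG"):
    print(codon, categorize(codon))
```

With this reading, “GAC” (exchangeable to “GAT”) is favourable, “GCA” (exchangeable to “GCT” at the cost of one A) is possible, and “ATG” and “TGG” are impossible, matching the examples given for the three categories.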
(108) Next, it was calculated how many potential nucleotide changes have been identified for all target nucleotides (A, T(U)). For the SEQ ID NO: 24 the possible nucleotide changes are listed (category 1, category 2; see Table 12).
(109) TABLE 12 Nucleotide changes that can potentially be applied to SEQ ID NO 24
Nucleotide   Potential changes   Category
A            22                  1
T            15                  1
A, T         47                  2
(110) Accordingly, 22 favourable changes (see category 1/“favourable” as explained above) were identified for A and 15 favourable changes were identified for T(U). As the adaptation values were set to +8A and +9T(U), the adaptations at the codon positions were all taken from category 1. Table 13 summarizes the introduced codon exchanges, which were equally distributed across the sequence.
(111) TABLE 13 Codon exchanges introduced into SEQ ID NO 24
Position A   Exchange     A increase   Position T   Exchange     T(U) increase
30           AAG -> AAA   +1           3            AGC -> AGT   +1
48           AAG -> AAA   +1           12           AGC -> AGT   +1
78           AAG -> AAA   +1           63           GAC -> GAT   +1
141          CAG -> CAA   +1           90           AGC -> AGT   +1
165          GAG -> GAA   +1           102          AGC -> AGT   +1
213          GAG -> GAA   +1           117          AAC -> AAT   +1
234          GAG -> GAA   +1           138          AGC -> AGT   +1
252          GAG -> GAA   +1           150          AAC -> AAT   +1
                                       180          AGC -> AGT   +1
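The exchanges of Table 13 can be cross-checked against the stated adaptation values. The following is a minimal sketch, assuming the standard genetic code: it confirms that each listed exchange is synonymous and that the exchanges together add 8 A and 9 T nucleotides, giving the target AT count of 102 from Table 9. The positions are copied from the table as given; all names are illustrative.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering ("*" marks stop codons).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# (position, original codon, replacement codon), transcribed from Table 13.
EXCHANGES = [
    (30, "AAG", "AAA"), (48, "AAG", "AAA"), (78, "AAG", "AAA"),
    (141, "CAG", "CAA"), (165, "GAG", "GAA"), (213, "GAG", "GAA"),
    (234, "GAG", "GAA"), (252, "GAG", "GAA"),
    (3, "AGC", "AGT"), (12, "AGC", "AGT"), (63, "GAC", "GAT"),
    (90, "AGC", "AGT"), (102, "AGC", "AGT"), (117, "AAC", "AAT"),
    (138, "AGC", "AGT"), (150, "AAC", "AAT"), (180, "AGC", "AGT"),
]


def all_synonymous(exchanges):
    """True if every exchange keeps the encoded amino acid unchanged."""
    return all(CODON_TABLE[old] == CODON_TABLE[new] for _, old, new in exchanges)


def net_gain(exchanges, base):
    """Total change in the count of one nucleotide over all exchanges."""
    return sum(new.count(base) - old.count(base) for _, old, new in exchanges)


NON_ADAPTED_AT = 85  # AT count of product 2 (Table 8)
adapted_at = NON_ADAPTED_AT + net_gain(EXCHANGES, "A") + net_gain(EXCHANGES, "T")

print("synonymous:", all_synonymous(EXCHANGES))
print("A gain:", net_gain(EXCHANGES, "A"), "T gain:", net_gain(EXCHANGES, "T"))
print("adapted AT count:", adapted_at)
```

Because every exchange replaces a G or C in the third codon position, the gain in A and T is offset by an equal loss of G and C, and the sequence length is unchanged.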
(112) These exchanges resulted in the following adapted sequence according to SEQ ID NO: 25.
(113) In the above described example, potential nucleotide exchanges from category 2 were not implemented. However, in scenarios where e.g. T(U) counts larger than 15 are needed, codons from category 2 are used as soon as all codons from category 1 have been used, in order to obtain additional adaptation possibilities for A and T(U) counts. If category 2 is required in order to achieve the desired nucleotide counts, calculation of the following ratio will identify the exchange nucleotide (nucleotide A or T(U)):
(114)
wherein i represents the corresponding target, c_i is the count of possible adaptation positions of i in category 2, x_i is the desired threshold for i, and p_i is the count of the adaptation positions already identified and changed. All calculated ratios are ranked and, starting from the lowest ratio and proceeding to the highest, the changes from category 2 are applied until the desired threshold has been reached or until all the possible exchanges from category 2 have been performed. This procedure is carried out iteratively for all targets for which the desired numbers cannot be achieved by using only exchanges according to category 1.
(115) For example, category 2 contains 47 codons (see Table 12), which could potentially be exchanged in order to increase the A or T count. Accordingly, further changes are implemented from category 2 until the desired threshold has been reached or until all the codons from category 2 have been used as well. For SEQ ID NO: 24, a change of the T(U) count to e.g. 20 would result in additional adaptations by using the following alternative codons from category 2 (additional codon exchanges from category 1 are not shown): 6=ATT, 9=ATT, 21=CTT, 144=CCT, 183=TCT.
(116) In cases where the desired target nucleotide count cannot be achieved (as all alternative codons from category 1 and 2 have been used, which means that no further changes are possible), an adapted sequence is generated that matches the target nucleotide count as closely as possible.
(117) In order to further optimize the above described method (algorithm), the following improvements are implemented:
1. The basic equal distribution, which was used in the experiments described above, is based on the exchange possibilities. Other distribution models may also be envisaged, such as normal distribution, first occurrences distribution, last occurrences distribution or random-based distribution. Alternatively, the mean or the median of the possible changes may be determined and all exchanges may be arranged around these values.
2. The exchange matrix contains additional information about the codon for the target sequence (e.g. codon usage etc.). This creates a further criterion for deciding whether a codon exchange is desirable or not, facilitating adaptation to a specific codon usage or a different nucleotide ratio in the target sequence.
3. Implementation of a third category for sequences or motifs that should be avoided in an exchange (e.g. a recognition motif of a restriction enzyme, promoter sequences or sequences forming undesired secondary structures, etc.).
4. Automated binning of input sequences, based on their length and the occurrence of the desired target nucleotides, in order to identify optimal nucleotide counts for A and/or U adaptation.
Example 7: Generation and Use of a Polyvalent Influenza Virus RNA Platform for Fast-Adjustable Influenza Vaccine Production
(118) A pool of GC optimized coding sequences encoding HA antigens was AU adapted to an AU count of 612 (360 A and 252 T), resulting in an AU adapted HA sequence pool (SEQ ID NOs: 26-16263). A pool of GC optimized coding sequences encoding NA antigens was AU adapted to an AU count of 488 (271 A and 217 U), resulting in an AU adapted NA sequence pool (SEQ ID NOs: 16264-30567). AU adaptations were performed according to the invention, essentially as described in Example 6. The adaptation allows co-purification of RNA mixtures comprising adapted HA RNA sequences and co-purification of RNA mixtures comprising adapted NA sequences. Moreover, the adaptation allows co-analysis of an RNA mixture comprising adapted HA and NA RNA species, as the RNA sequences encoding HA (AU count 612) and the RNA sequences encoding NA (AU count 488) generate separated peaks (AU count difference: 124), suitable for analysis of integrity using HPLC.
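The counts given for the two sequence pools can be verified with a short calculation. The following is a minimal sketch; applying the separation threshold of 70 from Example 6 to the HA/NA pair is an assumption made here for illustration, and the names are not taken from the source.

```python
# A and T(U) counts per pool, transcribed from the paragraph above.
HA_POOL = {"A": 360, "TU": 252}
NA_POOL = {"A": 271, "TU": 217}

# Assumption: the minimum AU count difference used for peak separation in
# Example 6 is reused here as the criterion for separated HA and NA peaks.
MIN_SEPARATION = 70


def au_count(pool):
    """Combined number of A and T(U) nucleotides of a pool's target count."""
    return pool["A"] + pool["TU"]


difference = au_count(HA_POOL) - au_count(NA_POOL)

print("HA AU count:", au_count(HA_POOL))
print("NA AU count:", au_count(NA_POOL))
print("difference:", difference, "separated:", difference >= MIN_SEPARATION)
```

Within each pool all sequences share one AU count and are therefore expected to co-elute, whereas the difference of 124 between the pools is what allows the HA and NA peaks to be analyzed side by side.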
(119) HA RNA mixtures are produced according to procedures disclosed in the PCT application WO 2017/109134 using GC optimized, AT adapted DNA templates (generated as described in Example 1). In short, a DNA construct mixture (each construct comprising a different HA coding sequence and a T7 promoter) is used as a template for simultaneous RNA in vitro transcription to generate a mixture of HA mRNA constructs. Subsequently, the obtained harmonized RNA mixture is co-purified using RP-HPLC.
(120) In a parallel reaction, NA RNA mixtures are produced according to procedures disclosed in the PCT application WO 2017/109134 using GC optimized, AT adapted DNA templates. In short, a DNA construct mixture (each construct comprising a different NA coding sequence and a T7 promoter) is used as a template for simultaneous RNA in vitro transcription to generate a mixture of NA mRNA constructs. Subsequently, the obtained harmonized RNA mixture is co-purified using RP-HPLC.
(121) The purified mRNA mixture encoding HA antigens and the purified mRNA mixture encoding NA antigens are mixed to generate an HA/NA RNA mixture. The integrity of the mixture (that is, of the NA peak and the HA peak) is co-analyzed via HPLC as described herein.
(122) Advantageously, the AU adaptation of HA sequences and NA sequences in order to harmonize chromatographic peaks for HPLC based co-purification and co-analysis according to the invention facilitates the production of mRNA-based multivalent influenza vaccines, which may be quickly adapted to demand, e.g. in seasonal influenza vaccine design or in a pandemic scenario (compare with
Example 8: Suitability of the Method on Various Reverse Phase HPLC Matrices
(123) The inventors found that modification of the retention time of an RNA via adaptation of A and/or U count is not restricted to a certain reverse phase column chemistry.
(124) To test whether a modification in retention time via an adaptation of A and/or U count can also be observed on other reverse phase columns (in Examples 1-7, a monolithic poly(styrene-divinylbenzene) matrix was used), the following columns were tested: monolithic ethylvinylbenzene-divinylbenzene copolymer (ThermoFisher Scientific) (see
(125) RNA encoding yellow fever virus antigens was generated. The constructs encode the same antigen (YFV(17D)-prME), comprise the same UTR elements, and have the same size. The AU count of each construct was changed via coding-sequence adaptation. The constructs are listed in Table 14.
(126) TABLE 14: AU adapted YFV constructs

| SEQ ID NO | Antigen | RNA size | A | U | G | C | AU | GC |
|---|---|---|---|---|---|---|---|---|
| 30588 | prME | 2311 | 474 | 350 | 544 | 943 | 824 | 1487 |
| 30589 | prME | 2311 | 587 | 307 | 704 | 713 | 894 | 1417 |
| 30590 | prME | 2311 | 633 | 504 | 627 | 547 | 1137 | 1174 |
| 30591 | prME | 2311 | 788 | 351 | 555 | 617 | 1139 | 1172 |
| 30592 | prME | 2311 | 886 | 538 | 516 | 371 | 1424 | 887 |
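The nucleotide counts in Table 14 can be checked for internal consistency with a few lines of Python. This is a sketch for the reader: the values are copied from the table, and the helper names are hypothetical.

```python
# Rows from Table 14: (SEQ ID NO, RNA size, A, U, G, C, AU, GC)
TABLE_14 = [
    (30588, 2311, 474, 350, 544, 943, 824, 1487),
    (30589, 2311, 587, 307, 704, 713, 894, 1417),
    (30590, 2311, 633, 504, 627, 547, 1137, 1174),
    (30591, 2311, 788, 351, 555, 617, 1139, 1172),
    (30592, 2311, 886, 538, 516, 371, 1424, 887),
]


def row_is_consistent(row):
    """A + U must equal AU, G + C must equal GC, and AU + GC the RNA size."""
    _seq_id, size, a, u, g, c, au, gc = row
    return a + u == au and g + c == gc and au + gc == size


def au_fraction(row):
    """Share of A and U nucleotides in the full-length RNA."""
    return row[6] / row[1]


ALL_CONSISTENT = all(row_is_consistent(r) for r in TABLE_14)
AU_SPREAD = max(r[6] for r in TABLE_14) - min(r[6] for r in TABLE_14)
```

All five rows add up to the stated RNA size of 2311 nucleotides, so the constructs differ only in composition, not in length.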
(127) To evaluate the effect of AU adaptation on retention time, 500 µg of each construct was applied individually to the respective column. In addition, two different flow rates were tested for each column. Results are shown in
(128) In addition, the separation factor alpha was determined for each construct on each column tested. In chromatography, the separation factor alpha expresses the ratio of the retention times of two compounds. Accordingly, a separation factor greater than 1.0 means that the two compounds were separated. In the present analysis, SEQ ID NO: 30592 (AU count 1424) was taken as the reference for calculating the separation factor. The calculated separation factors are shown in Table 15. The obtained separation factor values were plotted against the AU count difference of the constructs relative to this reference. The diagram is shown in
(129) TABLE 15: AU adapted YFV constructs

| SEQ ID NO | RNA size | AU count | AU difference | Alpha monolithic | Alpha PVD | Alpha C4 | Alpha PLPR-S |
|---|---|---|---|---|---|---|---|
| 30592 | 2311 | 1424 | 0 | 1.00 | 1.00 | 1.00 | 1.00 |
| 30591 | 2311 | 1139 | 285 | 1.07 | 1.05 | 1.05 | 1.08 |
| 30590 | 2311 | 1137 | 287 | 1.09 | 1.07 | 1.06 | 1.11 |
| 30589 | 2311 | 894 | 530 | 1.15 | 1.12 | 1.10 | 1.18 |
| 30588 | 2311 | 824 | 600 | 1.19 | 1.15 | 1.12 | 1.24 |
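The relationship between AU count difference and separation factor in Table 15 can be explored with the following Python sketch. The alpha values are copied from the table. The function names are hypothetical, the "larger over smaller" convention for the retention-time ratio is an assumption, and the least-squares slope is only one possible way to summarize the trend, not an analysis described in the text.

```python
AU_DIFFERENCE = [0, 285, 287, 530, 600]

# Separation factors from Table 15, one list per column matrix.
ALPHA = {
    "monolithic": [1.00, 1.07, 1.09, 1.15, 1.19],
    "PVD": [1.00, 1.05, 1.07, 1.12, 1.15],
    "C4": [1.00, 1.05, 1.06, 1.10, 1.12],
    "PLPR-S": [1.00, 1.08, 1.11, 1.18, 1.24],
}


def separation_factor(t_first, t_second):
    """Ratio of two retention times, larger over smaller (assumed
    convention), so that the result is always >= 1."""
    if t_first <= 0 or t_second <= 0:
        raise ValueError("retention times must be positive")
    return max(t_first, t_second) / min(t_first, t_second)


def least_squares_slope(xs, ys):
    """Slope of the ordinary least-squares line through (xs, ys)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    return s_xy / s_xx


# Change in alpha per nucleotide of AU difference, per column matrix.
SLOPES = {name: least_squares_slope(AU_DIFFERENCE, values)
          for name, values in ALPHA.items()}
```

For every matrix in Table 15 the alpha values rise with increasing AU difference, so each slope is positive. The size of the slope differs between matrices, which gives a rough measure of how strongly each column responds to AU adaptation.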
CONCLUSION
(130) As shown in
(131) As shown in
(132) As shown in
(133)
(134) Of course, a harmonization of AU counts (in other words: decreasing the difference in AU count between the constructs) would also lead to a modification in retention time, thereby allowing co-purification on HPLC.