Stable recombinant MVA vectors comprising modified RSV genes with reduced intramolecular recombinatorial activity

11225673 · 2022-01-18

Abstract

The invention relates to vectors comprising two or more homologous nucleotide sequences and methods for generating them. The invention concerns substituting bases in the homologous nucleotide sequences with different bases that do not alter the encoded amino acid sequence. The invention allows for the reduction of intramolecular recombination between homologous nucleotide sequences, in particular in mammalian cells. The invention further relates to nucleotide sequences containing substituted bases.

Claims

1. A recombinant modified vaccinia Ankara (MVA) virus vector that stably encodes homologous sequences, the vector comprising: first and second nucleotide sequences of at least 1500 nucleotides each, each coding for at least 500 amino acids, wherein at least 150 continuous amino acids encoded by each of the two nucleotide sequences have at least 75% amino acid identity; wherein at least one of the first and second nucleotides has at least 400 substituted nucleotides and wherein the substituted nucleotides do not alter the identical amino acids encoded by said two nucleotide sequences; and wherein the first and second nucleotide sequences differ by at least 400 nucleotides; and wherein the first and second nucleotides share stretches of identity of no more than 9 contiguous nucleotides; and wherein the first and second nucleotide sequences each encode a RSV protein.

2. The recombinant MVA virus vector of claim 1, wherein first and second nucleotide sequences encode a full-length RSV-F protein and a truncated RSV-F protein.

3. The recombinant MVA virus vector of claim 1, wherein the first and second nucleotide sequences encode the amino acid sequences of SEQ ID NO:3 and SEQ ID NO:4, respectively.

4. The recombinant MVA virus vector of claim 3, wherein the first and second nucleotide sequences comprise the nucleotide sequences of SEQ ID NO: 1 and SEQ ID NO:2, respectively.

Description

BRIEF DESCRIPTION OF THE DRAWINGS

(1) The invention is more fully understood with reference to the drawings, in which:

(2) FIG. 1A-C depict an alignment of the nucleotide sequence encoding the full-length RSV-F (F) protein (SEQ ID NO:1) with the nucleotide sequence encoding the substituted, truncated RSV-F_trunc (F_trunc) protein (SEQ ID NO:2). The identical sequences are highlighted in black, and the substituted nucleotides remain unhighlighted. The locations of primers A1 and B2 are indicated.

(3) FIG. 2 depicts an alignment of the full-length RSV-F (F) protein (SEQ ID NO:3) with the truncated RSV-F_trunc (F_trunc) protein (SEQ ID NO:4). The full-length sequence of RSV-F is truncated by 50 aa to yield the RSV-F_trunc protein, which covers approximately 91% of the full-length protein.

(4) FIG. 3 depicts expression of RSV-F and RSV-F_trunc from recombinant MVA-BN® viruses in a human cell line. Western blot of extracts from human cells infected with different MVA-BN®-based viruses at an MOI of 10 and lysed 24 h post infection. Lane 1: MVA-BN® (empty vector control); lane 2: MVA-mBN172B (recombinant MVA-BN® with full-length RSV-F); lane 3: MVA-mBN173B (recombinant MVA-BN® with truncated RSV-F_trunc); lane 4: MVA-mBN175B (recombinant MVA-BN® with RSV-F and RSV-F_trunc). The calculated molecular weights of the proteins are 61.6 kDa (RSV-F) and 56.1 kDa (RSV-F_trunc).

(5) FIG. 4A-C depict PCR analysis of MVA-mBN175B. RSV-F (F) and RSV-F_trunc (F_trunc) are shown. A. PCR results with various primer pairs. M=markers (1 kb-ladder, New England Biolabs). Lane 1 is MVA-mBN175B. Lane 2 is a positive control plasmid (pBN345). Lane 3 is MVA-BN®. Lane 4 is a water control. Lane 5 is a positive control plasmid (pBN343). B. Schematic of MVA-mBN175B showing locations of primers used for the PCRs shown in FIG. 4A. C. Schematic of wild-type MVA-BN® showing locations of primers.

(6) FIG. 5A-C depict the hypothetical recombination F/F_trunc between the full-length RSV-F gene (F) and the truncated F gene (F_trunc) in the double recombinant MVA and the locations of the PCR primers in the recombinant and non-recombinant viruses and control plasmids. A. MVA-mBN175B. B. pMISC173. C. pMISC172.

(7) FIG. 6 depicts PCR analysis of DNA isolated from cells infected with MVA-mBN175B. Lanes 1 and 7 are marker lanes. Lane 2 is MVA-mBN175B. Lane 3 is a plasmid control for the F gene (pBN343). Lane 4 is a plasmid control for the truncated F gene (pBN345). Lane 5 is MVA-BN®. Lane 6 is a water control. The expected PCR product from a hypothetical recombination between the RSV-F gene and truncated F gene RSV-F_trunc in MVA-mBN175B is 613 base pairs.

(8) FIG. 7 depicts an alignment of three EBOV (ebolavirus) GP (glycoprotein) protein sequences. The amino acid sequences of three GP proteins of the ebolavirus strains EBOV-B (SEQ ID NO:5), EBOV-S (SEQ ID NO:6) and EBOV-Z (SEQ ID NO:7) are aligned. No gaps were allowed in the alignment. The overall identity in all three protein sequences is 48.5%. Gray background: identical in all three protein sequences. Black background: identical in two proteins.

(9) FIG. 8A-C and 8D-F depict an alignment of three EBOV GP coding sequences used in the recombinant MVA-BN® based construct. The coding sequences for the GP genes originating from three EBOV strains EBOV-B (SEQ ID NO:8), -S (SEQ ID NO:9) and -Z (SEQ ID NO:10) were aligned before (non-opt; see FIG. 8A (SEQ ID NOs:8-10)) and after (opt; see FIG. 8B (SEQ ID NOs:11-13)) optimization. No gaps were allowed in the alignment. Gray background: identical nucleotide positions in three coding sequences. Black background: identical nucleotide positions in two coding sequences. The identity in nucleotide positions of the three genes prior to optimization (non-opt) is 45.3%, while after optimization (opt) it is 44.6%.

(10) FIG. 9 depicts pairwise alignments of three EBOV GP coding sequences used in the recombinant MVA-BN® based construct. The coding sequences for the GP genes originating from three EBOV strains EBOV-B, -S and -Z were aligned pairwise before (non-opt; see FIG. 9A-G, SEQ ID NOs:8-10) and after (opt; see FIG. 9H-N, SEQ ID NOs:11-13) optimization. FIG. 9A-G: EBOV-B non-opt SEQ ID NO:8, EBOV-S non-opt SEQ ID NO:9, EBOV-Z non-opt SEQ ID NO:10; FIG. 9H-N: EBOV-B opt SEQ ID NO:11, EBOV-S opt SEQ ID NO:12, EBOV-Z opt SEQ ID NO:13. No gaps were allowed in the alignments. Gray background: identical nucleotide positions in the coding sequence. The identity in nucleotide positions of three genes prior to (non-opt) and after (opt) optimization is tabulated in Table C.

(11) FIG. 10 depicts a restriction enzyme digest and plasmid map of plasmid pMISC210 encoding the full-length (RSV-F) and truncated (RSV-F_trunc) protein. Lane 1: plasmid pMISC210 comprising RSV-F and RSV-F_trunc; Lane 2: control plasmid pMISC209 comprising RSV-F_trunc only; Lane 3: molecular weight marker. The size of the marker bands in base pairs (bp) is shown.

EXAMPLES

Example 1

(12) Preparation of Substituted, Truncated F Gene

(13) Creation of a recombinant MVA expressing both a full-length RSV-F protein and a truncated version, RSV-F_trunc, was desired. However, based on results with MVA and other vaccinia viruses containing repeat sequences, it was expected that intramolecular recombination between the two copies of the F gene would result in deletion of one of the copies.

(14) To minimize the presence of long stretches of identical nucleotides between the two F genes, the codons in the nucleotide sequence encoding the RSV-F_trunc gene were substituted, while maintaining the amino acid sequence of the F genes. The use of rare codons for mammals and chickens was avoided. Also, substitutions that might introduce nucleic acid signals were avoided. Such signals included internal TATA-boxes, chi-sites, and ribosomal entry sites; AT-rich and GC-rich sequence stretches; ARE, INS, and CRS sequence elements; repeat sequences and RNA secondary structures; (cryptic) splice donor and acceptor sites, and branch points; and vaccinia termination signals (TTTTTNT). The substituted nucleotide sequence is shown in FIG. 1, compared to a coding sequence for a full-length RSV-F protein. Although significant identity remains throughout the two coding sequences, there are no remaining large stretches of identity greater than nine contiguous nucleotides within the two coding sequences. The proteins encoded by the two coding sequences are aligned in FIG. 2. The two proteins have 100% identity over the first 524 amino acids (the substituted F protein is truncated at the carboxy terminus). Thus, although these two coding nucleotide sequences encode a stretch of identical amino acids, one of the sequences has been substituted relative to the other.
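The design constraint described above (unchanged amino acids, different codons, no shared stretch longer than nine contiguous nucleotides) can be checked mechanically. The following is a minimal Python sketch under stated assumptions and is not the procedure used to derive SEQ ID NO:2: the codon table is a small subset of the standard genetic code, the 30-nt sequence is invented for illustration, and the function names `translate`, `longest_shared_run` and `recode` are hypothetical. The additional filters named above (rare codons, cryptic signals, TTTTTNT) are not implemented.

```python
# Toy subset of the standard genetic code; a real tool would use the full table.
SYNONYMS = {
    "L": ["CTG", "CTC", "TTG", "CTT"],
    "S": ["AGC", "TCC", "TCT", "AGT"],
    "A": ["GCC", "GCT", "GCA", "GCG"],
    "G": ["GGC", "GGA", "GGT", "GGG"],
    "K": ["AAG", "AAA"],
    "M": ["ATG"],
}
CODON_TO_AA = {codon: aa for aa, codons in SYNONYMS.items() for codon in codons}


def translate(seq):
    """Translate a coding sequence codon by codon using the toy table."""
    return "".join(CODON_TO_AA[seq[i:i + 3]] for i in range(0, len(seq), 3))


def longest_shared_run(a, b):
    """Longest run of identical nucleotides at corresponding positions (no gaps)."""
    best = run = 0
    for x, y in zip(a, b):
        run = run + 1 if x == y else 0
        best = max(best, run)
    return best


def recode(seq):
    """Replace each codon with a different synonymous codon where one exists."""
    out = []
    for i in range(0, len(seq), 3):
        codon = seq[i:i + 3]
        alternatives = [c for c in SYNONYMS[CODON_TO_AA[codon]] if c != codon]
        out.append(alternatives[0] if alternatives else codon)
    return "".join(out)


original = "ATGCTGAGCGCCGGCAAGCTGAGCGCCGGC"  # invented 10-codon example
substituted = recode(original)

# The encoded amino acid sequence must be unchanged by the substitution.
assert translate(original) == translate(substituted)
print(longest_shared_run(original, original))     # two identical copies
print(longest_shared_run(original, substituted))  # after synonymous recoding
```

In this toy case the shared run drops from the full sequence length to a few nucleotides; on real coding sequences the same check would be compared against the nine-nucleotide limit stated above.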

Example 2

(15) Preparation of Recombinant Viruses Comprising RSV-F Genes

(16) The DNA encoding the full-length RSV-F gene was inserted into MVA at two different integration sites to generate MVA-mBN170B and MVA-mBN172B (in the IGR88/89 site). The substituted RSV-F_trunc gene was inserted into MVA at the IGR148/149 site to generate MVA-mBN173B.

(17) A double recombinant MVA was then created containing the full-length RSV-F gene inserted at the IGR88/89 site and the substituted RSV-F_trunc gene inserted into the same MVA at the IGR148/149 site. The double recombinant virus was called MVA-mBN175B. A schematic of this virus is shown in FIG. 4B.

Example 3

(18) Expression of F Proteins from Recombinant Viruses

(19) To determine whether protein was expressed from the substituted nucleotide sequence, western blot analysis was performed on protein extracts from a human cell line infected with a recombinant MVA-BN®-based virus encoding the full-length RSV-F gene (MVA-mBN172B), a virus encoding the substituted RSV-F_trunc gene (MVA-mBN173B), or a double recombinant virus encoding both the full-length and the RSV-F_trunc gene (MVA-mBN175B). All three viruses showed production of the appropriately sized RSV-F proteins by western blot analysis (FIG. 3), while the MVA-BN® control (empty vector) did not show any bands, as expected. Thus, the full-length F protein and the truncated F protein encoded by the substituted nucleotide sequence were each expressed individually from single recombinant MVA-BN® viruses, and both were co-expressed from one double recombinant MVA-BN® virus (MVA-mBN175B) in a human cell line.

Example 4

(20) Growth of Recombinant Viruses

(21) Chicken embryo fibroblast cells were infected with MVA-mBN175B, a construct containing both the full-length F gene and the substituted RSV-F_trunc gene, or with a construct containing only the full-length F gene, to obtain a first crude virus stock. Similar titers were seen for the double recombinant virus containing both full-length F and truncated F genes (1.34×10^7 TCID50) and for the virus containing only the full-length F gene (1.46×10^7 TCID50). These results indicated that a stable double recombinant MVA was being produced, and that recombination between the two copies of the F gene had been limited by substituting nucleotide bases in the sequences.
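To put the two titers side by side, the short Python calculation below expresses them as a ratio and as a log10 difference. It uses only the two values reported above; the variable names are illustrative, and no threshold for what counts as "similar" is implied by the source.

```python
import math

titer_double = 1.34e7  # TCID50, virus with full-length F and truncated F genes
titer_single = 1.46e7  # TCID50, virus with the full-length F gene only

ratio = titer_double / titer_single
log10_difference = math.log10(titer_single) - math.log10(titer_double)

print(f"ratio double/single: {ratio:.2f}")
print(f"log10 difference:    {log10_difference:.3f}")
```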

Example 5

(22) PCR Analysis of Recombinant Viruses

(23) PCR analysis was performed on DNA from cells infected with MVA-mBN175B or MVA-BN® using the insert-specific and flank-specific primer pairs depicted in FIGS. 4B and C. PCR A with primers A1/A2, which are specific for the full-length F gene, detected a band of 663 base pairs (bp) in cells infected with MVA-mBN175B and in a specific plasmid positive control, as expected. This band, as expected, is absent in cells infected with MVA-BN® and in the water control (FIG. 4A). PCR B with primers B1/B2, which are specific for the substituted, truncated F gene, detected a band of 625 bp in cells infected with MVA-mBN175B and in a specific plasmid positive control, as expected. This band, as expected, is absent in cells infected with MVA-BN® and in the water control (FIG. 4A). PCR C with primers C1/C2, which detect insertions into the IGR88/89 site, detected a band of 2047 bp in cells infected with MVA-mBN175B and in a specific plasmid positive control, as expected. This band, as expected, is absent in cells infected with the empty vector control MVA-BN®; instead, a band of 161 bp indicates the wild-type situation at IGR88/89 in MVA-BN® (FIG. 4A). PCR D with primers D1/D2, which detect insertions into the IGR148/149 site, detected a band of 2062 bp in cells infected with MVA-mBN175B and in a specific plasmid positive control, as expected. This band, as expected, is absent in cells infected with the empty vector control MVA-BN®; instead, a band of 360 bp indicates the wild-type situation at IGR148/149 in MVA-BN® (FIG. 4A).
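The expected band sizes of PCRs A to D can be collected in a small lookup for reading a gel. The Python sketch below is illustrative only: the amplicon sizes are those stated above, while the function name `call_band` and the 5% size tolerance are assumptions introduced here and are not part of the described analysis.

```python
# Expected amplicon sizes in bp, as stated in the text.
EXPECTED = {
    "A": {"insert": 663},                    # primers A1/A2, full-length F gene
    "B": {"insert": 625},                    # primers B1/B2, substituted truncated F gene
    "C": {"insert": 2047, "wildtype": 161},  # primers C1/C2, IGR88/89 site
    "D": {"insert": 2062, "wildtype": 360},  # primers D1/D2, IGR148/149 site
}


def call_band(pcr, observed_bp, tolerance=0.05):
    """Label an observed band as 'insert', 'wildtype' or 'unexpected'.

    tolerance is a relative size window (an assumption, here 5%) that
    allows for the limited resolution of band sizing on a gel.
    """
    for label, size in EXPECTED[pcr].items():
        if abs(observed_bp - size) <= tolerance * size:
            return label
    return "unexpected"


print(call_band("C", 2050))  # close to 2047 bp
print(call_band("C", 160))   # close to 161 bp
print(call_band("A", 300))   # matches no expected size for PCR A
```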

(24) Recombination between the F genes would yield a hybrid F gene having parts of the wild-type F gene and parts of the truncated F gene. (FIG. 5A.) To detect the presence of any such recombinants, PCR analysis was performed on DNA from cells infected with MVA-mBN175B or MVA-BN® using the primer pairs A1/B2 (FIG. 5B.), which should generate a 613 base pair product, specific for the recombinant F gene. The results of this PCR showed no detectable recombinants. (FIG. 6.) These results indicated that a stable double recombinant MVA was being produced, and that recombination between the two copies of the F gene had been limited.

Example 6

(25) Preparation of Recombinant Glycoprotein (GP) Genes of Three Different Ebolavirus (EBOV) Strains

(26) Generation of a recombinant MVA expressing three ebolavirus (EBOV) glycoproteins (GP) was desired. The EBOV strains used herein are EBOV-B (Bundibugyo), EBOV-S (Sudan) and EBOV-Z (Zaire), all belonging to virus strains with high lethality in infected humans. Said three GP share an overall identity of 48.5%, indicating that nearly every second amino acid in the GP proteins is identical in all three strains, while the percent identities over the full-length protein sequences in comparison of combinations of two strains are between 57.0% and 64.2% (FIG. 7).

(27) To minimize the presence of long stretches of identical nucleotides within the three EBOV GP genes, the codons in the three nucleotide sequences were substituted, while maintaining the encoded amino acid sequences of the three GP genes. The use of rare codons for mammals and chickens, as well as substitutions that might introduce nucleic acid signals, was avoided. Such signals included internal TATA-boxes, chi-sites, and ribosomal entry sites; AT-rich and GC-rich sequence stretches; ARE, INS, and CRS sequence elements; repeat sequences and RNA secondary structures; (cryptic) splice donor and acceptor sites, and branch points; and vaccinia termination signals (TTTTTNT). The G after the ATG start codon, which allows for high expression and is present in the original coding sequence of all three EBOV GP genes, was maintained.

(28) Although 23.3 to 24.9% of the nucleotides in each of the three optimized EBOV GP coding sequences were exchanged (see Table A), the overall identities between the three GP coding sequences did not change dramatically (Table B). In two cases, the pairwise comparisons even showed marginally higher identities after optimization of the coding sequences, as shown below in Table B.

(29) TABLE-US-00001 TABLE A Nucleotide exchanges in three optimized EBOV GP genes. The table shows the number of changed nucleotides at the corresponding positions in the optimized GP coding sequences (opt) compared to the non-optimized (non-opt) sequence of different EBOV strains based on the total number of nucleotides in [%]. The total number of nt is 1147.

Comparison                   Exchanged nt positions in optimized GP coding sequences compared to non-optimized sequences [%]
EBOV-B non-opt:EBOV-B opt    23.3
EBOV-S non-opt:EBOV-S opt    24.9
EBOV-Z non-opt:EBOV-Z opt    23.9

(30) TABLE-US-00002 TABLE B Identical nucleotide positions of three EBOV GP coding sequences. The table shows the number of identical nucleotides at the corresponding positions in two GP coding sequences of different EBOV strains based on the total number of nucleotides in [%].

Pairwise comparison of GP genes    Identity of nucleotides in non-optimized genes [%]    Identity of nucleotides in optimized genes [%]
EBOV-B:EBOV-S                      57.0                                                  57.3
EBOV-B:EBOV-Z                      64.2                                                  61.1
EBOV-S:EBOV-Z                      57.6                                                  60.4
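The identity values in Table B are position-by-position comparisons without gaps. A minimal Python sketch of that calculation is given below for illustration; the function name `percent_identity` is hypothetical, and the 12-nt sequences are invented placeholders, not the EBOV GP coding sequences of SEQ ID NOs:8-13.

```python
from itertools import combinations


def percent_identity(a, b):
    """Percentage of positions carrying the same nucleotide in a and b.

    The comparison is ungapped, so both sequences must have the same length.
    """
    if len(a) != len(b):
        raise ValueError("ungapped comparison needs equal-length sequences")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


# Invented placeholders standing in for three aligned coding sequences.
toy = {
    "strain1": "ATGGGCGTTACA",
    "strain2": "ATGGGAGTCACC",
    "strain3": "ATGGAAGTTACT",
}

for (name_a, seq_a), (name_b, seq_b) in combinations(toy.items(), 2):
    print(f"{name_a}:{name_b} {percent_identity(seq_a, seq_b):.1f}")
```

Because overall identity counts matching positions regardless of where they fall, it can stay nearly constant while the lengths of contiguous identical stretches change considerably.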

(31) Pairwise alignments of the GP coding sequences of three EBOV strains EBOV-B, -S and -Z showed the identities in nucleotide positions and the distribution of identities (FIG. 9). Consequently, the method of the present invention led to shorter stretches of nucleotide identity in the EBOV GP sequences. When considering long stretches of identical consecutive nucleotides, it is evident that the interruption or shortening of such stretches of identity is an important part of the strategy to avoid recombination between sequences sharing a certain degree of nucleotide identity. Table C (see below) shows the number of stretches of consecutive identical nucleotides from pairwise comparison of the GP coding sequences. Prior to optimization, there are stretches of up to 23 bp in length, and in total there are 41 stretches of 10 or more identical nucleotides. In the optimized version of the GP genes, only one 13 bp stretch is found, and in total 7 stretches of 10 or more identical nucleotides can be found.

(32) TABLE-US-00003 TABLE C Long stretches of consecutive identical nucleotides. The table shows the number of stretches of consecutive identical nucleotides of a certain length in pairwise comparison of EBOV GP coding sequences before (non-opt) and after (opt) optimization. The numbers of the pairwise comparisons are summarized in the column 'combined numbers'. The longest stretch in the non-optimized comparisons is 23 consecutive identical nucleotides, while in the optimized genes it is reduced to a maximum of 13 nucleotides. Only stretches of 10 or more nucleotides are listed.

Columns: length; EBOV-B:EBOV-S (non-opt, opt); EBOV-B:EBOV-Z (non-opt, opt); EBOV-S:EBOV-Z (non-opt, opt); combined numbers (non-opt, opt). Each row lists only its non-empty cells and ends with the combined numbers.

23 nt 1 1
20 nt 2 2
17 nt 1 1
16 nt 2 2
14 nt 2 2 4
13 nt 1 1 1 2 1
12 nt 1 2 3
11 nt 10 2 4 1 8 22 3
10 nt 1 2 1 1 2 4 3
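The kind of count summarized in Table C can be sketched in a few lines of Python. The function below, whose name `shared_run_lengths` is hypothetical, tallies stretches of consecutive identical positions of at least 10 nt in an ungapped pairwise comparison. The second part simply re-adds the 'combined numbers' columns of Table C to confirm the totals of 41 and 7 given in the text; no sequence data from the figures is used.

```python
from collections import Counter


def shared_run_lengths(a, b, min_len=10):
    """Count runs of consecutive identical positions of at least min_len nt."""
    counts = Counter()
    run = 0
    for x, y in zip(a, b):
        if x == y:
            run += 1
        else:
            if run >= min_len:
                counts[run] += 1
            run = 0
    if run >= min_len:  # a run may extend to the end of the comparison
        counts[run] += 1
    return counts


# 'combined numbers' columns of Table C, keyed by stretch length in nt
combined_non_opt = {23: 1, 20: 2, 17: 1, 16: 2, 14: 4, 13: 2, 12: 3, 11: 22, 10: 4}
combined_opt = {13: 1, 11: 3, 10: 3}

print(sum(combined_non_opt.values()))  # total stretches >= 10 nt before optimization
print(sum(combined_opt.values()))      # total stretches >= 10 nt after optimization
```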

Example 7

(33) Preparation of Recombinant MVA-BN® Viruses with GP Genes of EBOV Strains.

(34) The three EBOV GP genes were synthesized by GeneArt (Regensburg, Germany) and cloned into recombination vectors to allow for integration into MVA-BN®. A recombinant virus comprising the three optimized homologous GP gene sequences from three different EBOV strains was generated. The transcription of the three inserted GP coding sequences is controlled by different individual early-late promoters.

(35) Specific PCR reactions for the three optimized EBOV-GP sequences showed the presence of the three individual genes in the recombinant MVA-BN®.

Example 8

(36) Preparation of Plasmid Comprising RSV-F Genes

(37) The two versions of the RSV-F gene used in Examples 1-5 and shown in FIG. 1 were cloned into one plasmid and maintained in E. coli TZ101 (Trenzyme GmbH, Konstanz, Germany) using standard cloning techniques. The plasmid (see plasmid map in FIG. 10) was isolated, digested with the restriction enzymes Ale I, Dra III and Spe I, and separated on a 1% TAE agarose gel (see FIG. 10). The band patterns for pMISC210 encoding the full-length RSV-F protein and RSV-F_trunc protein (lane 1) and for the control plasmid pMISC209 encoding the RSV-F_trunc protein only (lane 2) were compared with the patterns expected from analysis of the electronic sequences of the plasmids. The expected band sizes for pMISC210 were 404, 573, 809, 1923 and 4874 bp, while for pMISC209 bands of 573, 661, 809 and 4874 bp were expected. All expected bands and no additional bands were found experimentally. If recombination between the RSV-F variants in pMISC210 had occurred, one or more of the smaller fragments would have been lost, depending on the sites of recombination. This was clearly not found in the current example. Thus, the results show the stability of the plasmid pMISC210 with the two RSV-F genes (RSV-F and RSV-F_trunc) in E. coli.
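The comparison of expected and observed band patterns can be illustrated with the short Python sketch below. The expected fragment sizes are those stated above; the function name `missing_bands`, the 5% sizing tolerance and the "observed" band lists are assumptions invented for illustration and are not measured data. The fragment sums are interpreted as plasmid sizes only on the assumption that the listed bands account for the whole of each circular plasmid.

```python
# Expected fragment sizes in bp after digestion with Ale I, Dra III and Spe I.
expected_bp = {
    "pMISC210": [404, 573, 809, 1923, 4874],  # RSV-F and RSV-F_trunc
    "pMISC209": [573, 661, 809, 4874],        # RSV-F_trunc only
}


def missing_bands(observed, expected, tolerance=0.05):
    """Expected bands for which no observed band lies within the tolerance."""
    return [e for e in expected
            if not any(abs(o - e) <= tolerance * e for o in observed)]


shared = sorted(set(expected_bp["pMISC210"]) & set(expected_bp["pMISC209"]))
only_210 = sorted(set(expected_bp["pMISC210"]) - set(expected_bp["pMISC209"]))
print("bands common to both plasmids:", shared)
print("bands specific to pMISC210:   ", only_210)
print("sum of fragments:", {name: sum(sizes) for name, sizes in expected_bp.items()})

# Invented band readings: an intact pattern, and one with small fragments lost.
intact = [400, 570, 800, 1900, 4900]
rearranged = [570, 800, 4900]
print(missing_bands(intact, expected_bp["pMISC210"]))
print(missing_bands(rearranged, expected_bp["pMISC210"]))
```

A non-empty result from `missing_bands` for pMISC210 would correspond to the loss of one or more of the smaller fragments, which is the outcome the example describes for a recombination event.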