VACCINES AND METHODS

20220040284 · 2022-02-10

    Abstract

    Methods for identifying optimized antigenic pathogen polypeptides capable of inducing a broadly neutralizing immune response, and associated T-cell responses, to a pathogen are described, as well as nucleic acid sequences encoding such polypeptides. Methods for determining whether a broadly neutralizing immune response is induced in a subject following immunization with an optimized antigenic pathogen polypeptide, or a nucleic acid encoding the optimized pathogen polypeptide, are also described. Nucleic acid molecules, polypeptides, vectors, cells, fusion proteins, pharmaceutical compositions, and their use as vaccines against pathogens, especially against emerging or re-emerging pathogens (particularly RNA viruses), are also described.
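
    Claim 50 below frames the determination of a broadly neutralizing response as a check on whether serum antibody binds to and/or neutralizes more than one pathogen subtype. The following is a minimal sketch of that check, not the applicants' procedure. The function names, the titre values, and the numerical cutoff are all hypothetical; the source does not specify a threshold or an assay read-out format.

```python
def neutralized_subtypes(titres, cutoff):
    """Return the subtypes whose serum neutralization titre meets the cutoff.

    titres: dict mapping a subtype label to a measured titre (assumed format).
    cutoff: assumed positivity threshold; the source does not define one.
    """
    return sorted(name for name, titre in titres.items() if titre >= cutoff)


def is_broadly_neutralizing(titres, cutoff):
    """Apply the 'more than one pathogen subtype' criterion of claim 50."""
    return len(neutralized_subtypes(titres, cutoff)) > 1


# Hypothetical placeholder values, not data from the source.
example_titres = {"subtype_A": 320.0, "subtype_B": 160.0, "subtype_C": 10.0}

print(neutralized_subtypes(example_titres, cutoff=40.0))
print(is_broadly_neutralizing(example_titres, cutoff=40.0))
```

    In this sketch a serum is counted as broadly neutralizing as soon as two subtypes pass the assumed cutoff; a stricter reading could require every subtype in a panel to pass.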

    Claims

    1. A method for identifying a lead candidate optimized antigenic pathogen polypeptide capable of inducing a broadly neutralizing immune response to a pathogen, which comprises: i) providing a polypeptide library comprising a plurality of different candidate optimized antigenic pathogen polypeptides, wherein the amino acid sequence of each different candidate has been optimized from a plurality of different amino acid sequences of a pathogen polypeptide and is different from each different amino acid sequence of the pathogen polypeptide, wherein each different amino acid sequence of the pathogen polypeptide comprises amino acid sequence of a polypeptide of a different isolate, and wherein each different isolate is an isolate of a pathogen of the same family as the pathogen to which it is desired to induce a broadly neutralizing immune response; ii) screening the candidate optimized antigenic pathogen polypeptides of the polypeptide library for binding by one or more broadly neutralizing antigen-binding molecules, each of which is able to bind and/or neutralize a pathogen of the same family as the pathogen to which it is desired to induce a broadly neutralizing immune response; and iii) identifying a candidate optimized antigenic pathogen polypeptide that is bound by one or more of the antigen-binding molecules in step (ii) as being a lead candidate optimized antigenic pathogen polypeptide capable of inducing a broadly neutralizing immune response to the pathogen.

    2. A method according to claim 1, wherein the one or more broadly neutralizing antigen-binding molecules include an antibody that has been obtained, or derived from an antibody that has been obtained, from a subject that has been exposed to a pathogen of the same family as the pathogen to which it is desired to induce a broadly neutralizing immune response.

    3. A method according to claim 1 or 2, wherein the one or more broadly neutralizing antigen-binding molecules include non-antibody antigen-binding proteins.

    4. A method according to claim 3, wherein the one or more broadly neutralizing antigen-binding molecules include a designed ankyrin repeat protein (DARPin), an anticalin, an aptamer, or a T-cell receptor molecule.

    5. A method according to any preceding claim, wherein the candidate optimized antigenic pathogen polypeptides of the polypeptide library have been expressed in, or on the surface of, mammalian cells.

    6. A method according to any of claims 1 to 4, wherein the candidate optimized antigenic pathogen polypeptides of the polypeptide library have been expressed in, or on the surface of, bacterial, yeast, or insect cells.

    7. A method according to any preceding claim, wherein the pathogen is a virus, the candidate optimized antigenic pathogen polypeptides are candidate optimized antigenic virus polypeptides, and the pathogen polypeptides are virus polypeptides.

    8. A method according to claim 7, wherein the polypeptide library is a viral pseudotype library comprising a plurality of different viral pseudotypes, each different viral pseudotype comprising a different candidate optimized virus polypeptide.

    9. A method according to claim 8, wherein in step (ii) the candidate optimized antigenic virus polypeptides are screened for binding by one or more of the antigen-binding molecules by screening the viral pseudotypes for binding and/or neutralization by one or more of the antigen-binding molecules.

    10. A method according to any of claims 1 to 7, wherein the candidate optimized antigenic pathogen polypeptides are screened for binding by the one or more antigen-binding molecules by a flow cytometric assay.

    11. A method according to any preceding claim, which further comprises generating the polypeptide library.

    12. A method according to claim 11, wherein the polypeptide library is generated by expressing the different candidate optimized antigenic pathogen polypeptides from a nucleic acid library comprising a plurality of different nucleic acids, each different nucleic acid comprising a nucleotide sequence encoding a different candidate optimized antigenic pathogen polypeptide of the polypeptide library.

    13. A method according to claim 12, wherein the different candidate optimized pathogen polypeptides are expressed in, or on the surface of, mammalian cells.

    14. A method according to claim 12 or 13, wherein the nucleotide sequence of each different nucleic acid of the nucleic acid library is codon-optimized, optionally gene-optimized, for expression of the encoded polypeptide in a mammalian cell.

    15. A method according to any of claims 12 to 14, wherein each different nucleic acid of the nucleic acid library is part of an expression vector for expression of the nucleic acid in a mammalian cell.

    16. A method according to any of claims 12 to 15, wherein the pathogen is a virus, the candidate optimized antigenic pathogen polypeptides are candidate optimized antigenic virus polypeptides, and the pathogen polypeptides are virus polypeptides.

    17. A method according to claim 16, wherein the nucleic acid library is a viral pseudotype vector library, and each different nucleic acid of the library is part of an expression vector for production of a viral pseudotype comprising the encoded virus polypeptide, and the polypeptide library is a viral pseudotype library generated by producing viral pseudotypes from the expression vectors of the viral pseudotype vector library, wherein the viral pseudotype library comprises a plurality of different viral pseudotypes, each different viral pseudotype comprising a different candidate optimized virus polypeptide encoded by a different nucleic acid sequence of the viral pseudotype vector library.

    18. A method according to any of claims 15 to 17, wherein the expression vector is also a vaccine vector.

    19. A method according to claim 18, wherein the vaccine vector is a viral vaccine vector, a bacterial vaccine vector, an RNA vaccine vector, or a DNA vaccine vector.

    20. A method according to claim 18 or 19, wherein the vaccine vector is based on a viral delivery vector, such as a poxvirus (e.g. MVA, NYVAC, AVIPOX), herpesvirus (e.g. HSV, CMV, Adenovirus of any host species), Morbillivirus (e.g. measles), Alphavirus (e.g. SFV, Sendai), Flavivirus (e.g. Yellow Fever), or Rhabdovirus (e.g. VSV)-based viral delivery vector, a bacterial delivery vector (e.g. Salmonella, E. coli), an RNA expression vector, or a DNA expression vector.

    21. A method according to any of claims 15 to 20, wherein the vector is a pEVAC-based expression vector.

    22. A method according to claim 12, wherein the different candidate optimized antigenic pathogen polypeptides are expressed in, or on the surface of, bacterial, yeast, or insect cells.

    23. A method according to any of claims 12 to 22, which further comprises generating the nucleic acid library by synthesising a plurality of different nucleic acids, each different nucleic acid comprising a different nucleotide sequence encoding a different candidate optimized antigenic pathogen polypeptide.

    24. A method according to claim 23, which further comprises: i) obtaining amino acid sequences of the pathogen polypeptide, and/or nucleotide sequences encoding the pathogen polypeptide, of the different pathogen isolates; and ii) generating a plurality of different nucleotide sequences, each different nucleotide sequence encoding a different candidate optimized antigenic pathogen polypeptide, wherein the encoded amino acid sequence of each different candidate optimized antigenic pathogen polypeptide is optimized from the obtained amino acid sequences or encoded amino acid sequences of the pathogen polypeptide, and is different from each of the obtained amino acid sequences or encoded amino acid sequences.

    25. A method according to claim 24, wherein generation of the plurality of different nucleotide sequences in step (ii) of claim 24 comprises: carrying out a multiple sequence alignment of the amino acid or nucleotide sequences obtained in step (i) of claim 24; identifying from the multiple sequence alignment amino acid sequence or encoded amino acid sequence that is highly conserved between the polypeptides of the different pathogen isolates; and generating a plurality of different nucleotide sequences, each different nucleotide sequence encoding a different candidate optimized antigenic pathogen polypeptide, wherein one or more of the different nucleotide sequences includes sequence encoding a highly conserved amino acid sequence or encoded amino acid sequence identified from the multiple sequence alignment.

    26. A method according to claim 25, which further comprises: identifying from the multiple sequence alignment amino acid sequence or encoded amino acid sequence that is ancestral amino acid sequence; and including in one or more of the different generated nucleotide sequences sequence encoding an ancestral amino acid sequence identified from the multiple sequence alignment.

    27. A method according to any of claims 24 to 26, which includes codon-optimization, optionally gene-optimization, of the different generated nucleotide sequences for optimal expression of the encoded candidate optimized antigenic pathogen polypeptides in an expression system.

    28. A method according to claim 27, wherein the expression system comprises a mammalian cell.

    29. A method according to claim 27, wherein the expression system comprises a yeast, bacterial, or insect cell.

    30. A method according to any of claims 24 to 29, which includes optimizing the different nucleotide sequences for antigenicity of the encoded candidate optimized antigenic pathogen polypeptides.

    31. A method according to claim 30, wherein the antigenicity optimization includes any of the following: deletion or modification of nucleic acid sequence encoding amino acid sequence that inhibits production and/or function of anti-pathogen polypeptide antibody (for example, deletion or modification of a mucin-like domain); region swapping to recover one or more potential lost encoded epitopes; site-specific mutation, for example of N-linked glycosylation sites; changes to enhance stability (e.g. disulphide bond formation, reduce degradation of the encoded polypeptide by a serine protease); removal of glycans; insertion of nucleic acid sequence, for example to insert nucleic acid sequence encoding a desired epitope.

    32. A method according to any preceding claim, wherein the one or more broadly neutralizing antigen-binding molecules recited in step (ii) of claim 1 include a broadly neutralizing antibody, preferably a broadly neutralizing monoclonal antibody (BNmAb).

    33. A method according to any preceding claim, wherein the one or more antigen-binding molecules recited in step (ii) of claim 1 include an antibody obtained, or derived from an antibody obtained, from a subject that has survived an outbreak of a pathogen of the same family, optionally of the same subtype or type, as the pathogen to which it is desired to induce a broadly neutralizing immune response.

    34. A method according to claim 33, wherein the subject from which the antibody has been obtained or derived is a human or non-human mammalian subject.

    35. A method according to claim 33 or 34, wherein the one or more antigen-binding molecules include a broadly neutralizing monoclonal antibody (BNmAb).

    36. A method according to any preceding claim, wherein the different pathogen isolates include different pathogen isolates from an outbreak of a pathogen of the same subtype as the pathogen to which it is desired to induce a broadly neutralizing immune response.

    37. A method according to any preceding claim, wherein the different pathogen isolates include different pathogen isolates from an outbreak of a pathogen of a different subtype, but the same type, as the pathogen to which it is desired to induce a broadly neutralizing immune response.

    38. A method according to any preceding claim, wherein the different pathogen isolates include different pathogen isolates from an outbreak of a pathogen of a different group, but the same family, as the pathogen to which it is desired to induce a broadly neutralizing immune response.

    39. A method according to any preceding claim, wherein the different pathogen isolates include different prior pathogen isolates of a pathogen of the same subtype, type, or family as the pathogen to which it is desired to induce a broadly neutralizing immune response.

    40. A method according to any preceding claim, wherein each candidate optimized antigenic pathogen polypeptide comprises at least 20 amino acid residues.

    41. A method according to any preceding claim, wherein the pathogen is a virus.

    42. A method according to claim 41, wherein the virus is an RNA virus.

    43. A method according to claim 41 or 42, wherein the virus is an emerging or re-emerging RNA virus.

    44. A method according to any of claims 41 to 43, wherein the virus is a Filovirus, an Arenavirus, or an Orthomyxovirus.

    45. A method according to any of claims 41 to 43, wherein the virus is Ebola virus or Marburg virus.

    46. A method according to any of claims 41 to 43, wherein the virus is Lassa virus.

    47. A method according to any preceding claim, wherein the pathogen polypeptide is a viral glycoprotein.

    48. A method according to any preceding claim, which is an in vitro method.

    49. A method of identifying a nucleic acid sequence encoding an optimized antigenic pathogen polypeptide capable of inducing a broadly neutralizing immune response to a pathogen, which comprises: i) immunizing a human, or a non-human animal, with a nucleic acid comprising a nucleic acid sequence encoding a lead candidate optimized antigenic pathogen polypeptide identified by a method according to any preceding claim; ii) determining whether a broadly neutralizing immune response is induced in the human or non-human animal following the immunization in step (i); and iii) identifying the nucleic acid sequence as a nucleic acid sequence encoding an optimized antigenic pathogen polypeptide capable of inducing a broadly neutralizing immune response to the pathogen if it is determined from step (ii) that a broadly neutralizing immune response is induced in the human or non-human animal.

    50. A method according to claim 49, which comprises determining whether a broadly neutralizing immune response is induced in the human or non-human animal by determining whether antibody in serum obtained from the human or non-human animal binds to and/or neutralizes more than one pathogen subtype.

    51. A method according to claim 49 or 50, wherein the non-human animal is a mammal.

    52. A method according to claim 51, wherein the mammal is a guinea pig, or a mouse.

    53. A method according to claim 49 or 50, wherein the non-human animal is avian.

    54. An isolated nucleic acid molecule, comprising a nucleic acid sequence that is: i) at least 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identical with SEQ ID NO:1, or identical with SEQ ID NO:1; ii) at least 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identical with SEQ ID NO:2, or identical with SEQ ID NO:2; iii) at least 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identical with SEQ ID NO:4, or identical with SEQ ID NO:4; iv) at least 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identical with SEQ ID NO:5, or identical with SEQ ID NO:5; v) at least 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identical with SEQ ID NO:7, or identical with SEQ ID NO:7; or vi) at least 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identical with SEQ ID NO:8, or identical with SEQ ID NO:8; or the complement thereof.

    55. An isolated nucleic acid molecule, comprising a nucleic acid sequence that is: i) at least 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identical with SEQ ID NO:10, or identical with SEQ ID NO:10; ii) at least 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identical with SEQ ID NO:12, or identical with SEQ ID NO:12; or iii) at least 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identical with SEQ ID NO:14, or identical with SEQ ID NO:14; or the complement thereof.

    56. An isolated nucleic acid molecule, comprising a nucleic acid sequence that is: i) at least 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identical with SEQ ID NO:19, or identical with SEQ ID NO:19; ii) at least 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identical with SEQ ID NO:21, or identical with SEQ ID NO:21; iii) at least 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identical with SEQ ID NO:23, or identical with SEQ ID NO:23; iv) at least 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identical with SEQ ID NO:25, or identical with SEQ ID NO:25; v) at least 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identical with SEQ ID NO:27, or identical with SEQ ID NO:27; vi) at least 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identical with SEQ ID NO:29, or identical with SEQ ID NO:29; or vii) at least 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identical with SEQ ID NO:31, or identical with SEQ ID NO:31; or the complement thereof.

    57. An isolated polypeptide, comprising an amino acid sequence that is: i) at least 95%, 96%, 97%, 98%, or 99% identical with an amino acid sequence encoded by SEQ ID NO:1, or identical with the amino acid sequence encoded by SEQ ID NO:1; ii) at least 95%, 96%, 97%, 98%, or 99% identical with an amino acid sequence encoded by SEQ ID NO:2, or identical with the amino acid sequence encoded by SEQ ID NO:2; iii) at least 95%, 96%, 97%, 98%, or 99% identical with an amino acid sequence encoded by SEQ ID NO:4, or identical with the amino acid sequence encoded by SEQ ID NO:4; iv) at least 95%, 96%, 97%, 98%, or 99% identical with an amino acid sequence encoded by SEQ ID NO:5, or identical with the amino acid sequence encoded by SEQ ID NO:5; v) at least 95%, 96%, 97%, 98%, or 99% identical with an amino acid sequence encoded by SEQ ID NO:7, or identical with the amino acid sequence encoded by SEQ ID NO:7; vi) at least 95%, 96%, 97%, 98%, or 99% identical with an amino acid sequence encoded by SEQ ID NO:8, or identical with the amino acid sequence encoded by SEQ ID NO:8; vii) at least 95%, 96%, 97%, 98%, or 99% identical with an amino acid sequence encoded by SEQ ID NO:10, or identical with the amino acid sequence encoded by SEQ ID NO:10; viii) at least 95%, 96%, 97%, 98%, or 99% identical with an amino acid sequence encoded by SEQ ID NO:12, or identical with the amino acid sequence encoded by SEQ ID NO:12; or ix) at least 95%, 96%, 97%, 98%, or 99% identical with an amino acid sequence encoded by SEQ ID NO:14, or identical with the amino acid sequence encoded by SEQ ID NO:14.

    58. An isolated polypeptide, comprising an amino acid sequence that is: i) at least 95%, 96%, 97%, 98%, or 99% identical with SEQ ID NO:3, or identical with SEQ ID NO:3; ii) at least 95%, 96%, 97%, 98%, or 99% identical with SEQ ID NO:6, or identical with SEQ ID NO:6; iii) at least 95%, 96%, 97%, 98%, or 99% identical with SEQ ID NO:9, or identical with SEQ ID NO:9; iv) at least 95%, 96%, 97%, 98%, or 99% identical with SEQ ID NO:11, or identical with SEQ ID NO:11; v) at least 95%, 96%, 97%, 98%, or 99% identical with SEQ ID NO:13, or identical with SEQ ID NO:13; or vi) at least 95%, 96%, 97%, 98%, or 99% identical with SEQ ID NO:15, or identical with SEQ ID NO:15.

    59. An isolated polypeptide, comprising an amino acid sequence that is: i) at least 95%, 96%, 97%, 98%, or 99% identical with SEQ ID NO:18, or identical with SEQ ID NO:18; ii) at least 95%, 96%, 97%, 98%, or 99% identical with SEQ ID NO:20, or identical with SEQ ID NO:20; iii) at least 95%, 96%, 97%, 98%, or 99% identical with SEQ ID NO:22, or identical with SEQ ID NO:22; iv) at least 95%, 96%, 97%, 98%, or 99% identical with SEQ ID NO:24, or identical with SEQ ID NO:24; v) at least 95%, 96%, 97%, 98%, or 99% identical with SEQ ID NO:26, or identical with SEQ ID NO:26; vi) at least 95%, 96%, 97%, 98%, or 99% identical with SEQ ID NO:28, or identical with SEQ ID NO:28; or vii) at least 95%, 96%, 97%, 98%, or 99% identical with SEQ ID NO:30, or identical with SEQ ID NO:30.

    60. An isolated nucleic acid encoding an amino acid sequence encoded by a nucleic acid of claim 54, 55, or 56, wherein the nucleic acid is codon-optimized, optionally gene-optimized, for expression in mammalian cells.

    61. An isolated nucleic acid encoding a polypeptide of claim 57, 58, or 59, wherein the nucleic acid is codon-optimized, optionally gene-optimized, for expression in mammalian cells.

    62. A vector comprising a nucleic acid of claim 54, 55, 56, 60, or 61.

    63. A vector according to claim 62, which further comprises a promoter operably linked to the nucleic acid.

    64. A vector according to claim 63, wherein the promoter is for expression of a polypeptide encoded by the nucleic acid in mammalian cells.

    65. A vector according to claim 63, wherein the promoter is for expression of a polypeptide encoded by the nucleic acid in yeast or insect cells.

    66. A vector according to any of claims 62 to 65, which is a vaccine vector.

    67. A vector according to claim 66, which is a viral vaccine vector, a bacterial vaccine vector, an RNA vaccine vector, or a DNA vaccine vector.

    68. An isolated cell comprising a vector of any of claims 62 to 65.

    69. A pseudotyped virus particle comprising the polypeptide of claim 57, 58, or 59.

    70. A method of producing a pseudotyped virus particle of claim 69, which includes transfecting a host cell with a vector according to any of claims 62 to 64.

    71. A fusion protein comprising a polypeptide according to claim 57, 58, or 59.

    72. A pharmaceutical composition comprising a nucleic acid according to claim 54, 55, 56, 60, or 61, and a pharmaceutically acceptable carrier, excipient, or diluent.

    73. A pharmaceutical composition comprising a vector according to any of claims 62 to 64, 66, or 67, and a pharmaceutically acceptable carrier, excipient, or diluent.

    74. A pharmaceutical composition comprising a polypeptide according to claim 57, 58, or 59, and a pharmaceutically acceptable carrier, excipient, or diluent.

    75. A pharmaceutical composition according to any of claims 72 to 74, which further comprises an adjuvant for enhancing an immune response in a subject to the polypeptide, or to a polypeptide encoded by the nucleic acid, of the composition.

    76. A method of inducing an immune response to a virus of the Filoviridae family in a subject, which comprises administering to the subject a nucleic acid according to any of claims 54, 55, 60, or 61, a polypeptide according to claim 57 or 58, a vector according to any of claims 62 to 64, 66, or 67, or a pharmaceutical composition according to any of claims 72 to 75.

    77. A method of immunizing a subject against a virus of the Filoviridae family, which comprises administering to the subject a nucleic acid according to any of claims 54, 55, 60, or 61, a polypeptide according to claim 57 or 58, a vector according to any of claims 62 to 64, 66, or 67, or a pharmaceutical composition according to any of claims 72 to 75.

    78. A method of inducing an immune response to a virus of the Arenaviridae family in a subject, which comprises administering to the subject a nucleic acid according to any of claims 56, 60, or 61, a polypeptide according to claim 59, a vector according to any of claims 62 to 64, 66, or 67, or a pharmaceutical composition according to any of claims 72 to 75.

    79. A method of immunizing a subject against a virus of the Arenaviridae family, which comprises administering to the subject a nucleic acid according to any of claims 56, 60, or 61, a polypeptide according to claim 59, a vector according to any of claims 62 to 64, 66, or 67, or a pharmaceutical composition according to any of claims 72 to 75.

    80. A method according to any of claims 76 to 79, wherein the composition is administered intramuscularly.

    81. A nucleic acid expression vector, which comprises a multiple cloning site, comprising KpnI and NotI endonuclease sites.

    82. A vector according to claim 81, wherein the multiple cloning site comprises a nucleic acid sequence of SEQ ID NO:16.

    83. A vector according to claim 81 or 82, which is an expression vector, and a viral pseudotype vector.

    84. A vector according to any of claims 81 to 83, which is a vaccine vector.

    85. A vector according to any of claims 81 to 84, which comprises, from a 5′ to 3′ direction: a promoter; a splice donor site; a splice acceptor site; and a terminator signal, wherein the multiple cloning site is located between the splice acceptor site and the terminator signal.

    86. A vector according to claim 85, wherein the promoter comprises a CMV immediate early 1 enhancer/promoter and/or the terminator signal comprises a terminator signal of a bovine growth hormone gene that lacks a KpnI restriction endonuclease site.

    87. A vector according to any of claims 81 to 86, which further comprises an origin of replication, and nucleic acid encoding resistance to an antibiotic.

    88. A vector according to claim 87, wherein the origin of replication comprises a pUC-plasmid origin of replication and/or the nucleic acid encodes resistance to kanamycin.

    89. A vector according to any of claims 81 to 88, which comprises a nucleic acid sequence of SEQ ID NO:17.

    90. An isolated nucleic acid molecule which comprises a nucleotide sequence encoding a polypeptide comprising an amino acid sequence of SEQ ID NO: 6, and a polypeptide comprising an amino acid sequence of SEQ ID NO: 9.

    91. An isolated nucleic acid molecule which comprises a nucleotide sequence encoding a polypeptide comprising an amino acid sequence of SEQ ID NO: 13, and a polypeptide comprising an amino acid sequence of SEQ ID NO: 15.

    92. A composition comprising a first nucleic acid which includes a nucleotide sequence encoding a polypeptide comprising an amino acid sequence of SEQ ID NO: 6, and a second nucleic acid which includes a nucleotide sequence encoding a polypeptide comprising an amino acid sequence of SEQ ID NO: 9.

    93. A composition comprising a first nucleic acid which includes a nucleotide sequence encoding a polypeptide comprising an amino acid sequence of SEQ ID NO: 13, and a second nucleic acid which includes a nucleotide sequence encoding a polypeptide comprising an amino acid sequence of SEQ ID NO: 15.

    94. A combined preparation comprising: (i) a first nucleic acid which includes a nucleotide sequence encoding a polypeptide comprising an amino acid sequence of SEQ ID NO: 6; and (ii) a second nucleic acid which includes a nucleotide sequence encoding a polypeptide comprising an amino acid sequence of SEQ ID NO: 9.

    95. A combined preparation comprising: (i) a first nucleic acid which includes a nucleotide sequence encoding a polypeptide comprising an amino acid sequence of SEQ ID NO: 13; and (ii) a second nucleic acid which includes a nucleotide sequence encoding a polypeptide comprising an amino acid sequence of SEQ ID NO: 15.

    96. A composition comprising a first polypeptide comprising an amino acid sequence of SEQ ID NO: 6, and a second polypeptide comprising an amino acid sequence of SEQ ID NO: 9.

    97. A composition comprising a first polypeptide comprising an amino acid sequence of SEQ ID NO: 13, and a second polypeptide comprising an amino acid sequence of SEQ ID NO: 15.

    98. A fusion protein comprising a first polypeptide comprising an amino acid sequence of SEQ ID NO: 6, and a second polypeptide comprising an amino acid sequence of SEQ ID NO: 9.

    99. A fusion protein comprising a first polypeptide comprising an amino acid sequence of SEQ ID NO: 13, and a second polypeptide comprising an amino acid sequence of SEQ ID NO: 15.

    100. A combined preparation comprising: (i) a first polypeptide comprising an amino acid sequence of SEQ ID NO: 6; and (ii) a second polypeptide comprising an amino acid sequence of SEQ ID NO: 9.

    101. A combined preparation comprising: (i) a first polypeptide comprising an amino acid sequence of SEQ ID NO: 13; and (ii) a second polypeptide comprising an amino acid sequence of SEQ ID NO: 15.

    102. A nucleic acid according to any of claims 54, 55, 60, or 61, a polypeptide according to claim 57 or 58, a vector according to any of claims 62 to 64, 66, or 67, or a pharmaceutical composition according to any of claims 72 to 75, for use as a medicament.

    103. A nucleic acid according to any of claims 54, 55, 60, or 61, a polypeptide according to claim 57 or 58, a vector according to any of claims 62 to 64, 66, or 67, or a pharmaceutical composition according to any of claims 72 to 75, for use in the treatment of a viral infection, preferably a viral infection caused by an emerging or re-emerging virus, preferably a virus of the Filoviridae family.

    104. Use of a nucleic acid according to any of claims 54, 55, 60, or 61, a polypeptide according to claim 57 or 58, a vector according to any of claims 62 to 64, 66, or 67, or a pharmaceutical composition according to any of claims 72 to 75, in the manufacture of a medicament for the treatment of a viral infection, preferably a viral infection caused by an emerging or re-emerging virus, preferably a virus of the Filoviridae family.

    105. A nucleic acid according to any of claims 56, 60, or 61, a polypeptide according to claim 59, a vector according to any of claims 62 to 64, 66, or 67, or a pharmaceutical composition according to any of claims 72 to 75, for use as a medicament.

    106. A nucleic acid according to any of claims 56, 60, or 61, a polypeptide according to claim 59, a vector according to any of claims 62 to 64, 66, or 67, or a pharmaceutical composition according to any of claims 72 to 75, for use in the treatment of a viral infection, preferably a viral infection caused by an emerging or re-emerging virus, preferably a virus of the Arenaviridae family.

    107. Use of a nucleic acid according to any of claims 56, 60, or 61, a polypeptide according to claim 59, a vector according to any of claims 62 to 64, 66, or 67, or a pharmaceutical composition according to any of claims 72 to 75, in the manufacture of a medicament for the treatment of a viral infection, preferably a viral infection caused by an emerging or re-emerging virus, preferably a virus of the Arenaviridae family.

    108. A nucleic acid according to claim 90 or 91, a composition according to claim 92, 93, 96, or 97, a combined preparation according to claim 94, 95, 100, or 101, or a fusion protein according to claim 98 or 99, for use as a medicament.

    109. A nucleic acid according to claim 90 or 91, a composition according to claim 92, 93, 96, or 97, a combined preparation according to claim 94, 95, 100, or 101, or a fusion protein according to claim 98 or 99, for use in the treatment of a viral infection, preferably a viral infection caused by an emerging or re-emerging virus, preferably a virus of the Filoviridae family.

    110. Use of a nucleic acid according to claim 90 or 91, a composition according to claim 92, 93, 96, or 97, a combined preparation according to claim 94, 95, 100, or 101, or a fusion protein according to claim 98 or 99, in the manufacture of a medicament for the treatment of a viral infection, preferably a viral infection caused by an emerging or re-emerging virus, preferably a virus of the Filoviridae family.

    Description

    [0352] Embodiments of the invention are described, by way of illustration only, in the Examples below, with reference to the accompanying drawings in which:

    [0353] FIG. 1 shows an illustration of a phylogenetic tree and its relation to ancestral sequence reconstruction;

    [0354] FIG. 2 shows a phylogenetic tree comparing ebolaviruses and Marburg viruses. Numbers indicate percent confidence of branches;

    [0355] FIG. 3 shows a plasmid map for pEVAC;

    [0356] FIG. 4 shows challenge study results for an Ebola challenge model. The Ebola challenge model was lethal for non-vaccinated guinea pigs (Group 1, lower line), whereas all vaccinated guinea pigs (Group 2, upper line) were protected (left) and continued to gain weight (right);

    [0357] FIG. 5 shows the results of a pseudotype virus neutralisation assay illustrating the strength of neutralising antibody responses to target antigens expressed on the surface of a pseudotyped virus, representative of all Ebola virus species and Marburg viruses. Strength of neutralisation is indicated by the heat-map, where red (darkest shading) is very strong neutralisation, decreasing through orange to yellow (progressively lighter shading), and no neutralisation (values equal to the negative control) is white. T2-4 and T2-6 are nucleic acid vaccines encoding lead candidate optimized antigenic Ebola polypeptides, combined with T2-11, a Marburg candidate, at pre-clinical stage testing with serum samples taken from immunised guinea pigs;

    [0358] FIG. 6 shows the results of a study to determine the effectiveness of nucleic acid vaccines encoding different lead candidate optimized antigenic pathogenic polypeptides, identified using an embodiment of a method of the invention. Antibody binding was measured by incubation of two groups of cells bearing two different group 1 influenza A glycoproteins on their surface (H1 pandemic and seasonal) with pooled mouse serum. Any bound antibodies were then detected by a secondary antibody, and results recorded using a flow cytometer. Binding was significantly increased after vaccination, compared with before vaccination, with all constructs, but not after vaccination with PBS (control). Overall, a vaccine candidate out-performed those from COBRA in both cases (*);

    [0359] FIG. 7 shows the results of a study to determine binding of cells expressing two different group 1 influenza A glycoproteins on their cell surface (seasonal H1N1, and pandemic origin H1N1) by mouse sera from animals immunized with either the COBRA or DIOS HA gene antigens; and

    [0360] FIG. 8 shows the results of cross-HA-group binding (left panel), and pseudotype neutralization (right) of H7N9 (A/Shanghai2/2013), by sera from DIOS or COBRA DNA immunized mice. In the right panel, the uppermost curve is for CR9114, the two curves falling from the lowest two starting points at the left of the graph are for H1N1s, and the remaining two curves are for H1N1pdm.

    [0361] Examples of unoptimized Ebola and Marburg viral ancestral nucleic acid sequences (i.e. sequences which have not been codon-optimized or gene-optimized) are given below, as well as gene-optimized nucleic acid sequences encoding candidate antigenic pathogen polypeptides.

    [0362] Methodology

    [0363] For a given virus species, candidate primary sequences are downloaded, for example, from GenBank (and from any other available sources, such as outbreak data), and are filtered to remove identical sequences, sequences that do not span the protein of interest, and sequences that have a high number of ambiguous nucleotides. A multiple sequence alignment of the filtered sequences is generated (typically using MAFFT), and checked manually to ensure that sequences are in the correct open reading frame. A maximum likelihood phylogeny is generated using IQTREE, with automated model selection, and rooted using one of several methods: an outgroup sequence, midpoint rooting, centre-of-the-tree, or a tree that maximises the association between root-to-tip distance and sampling time. Ancestral sequences are generated using HyPhy assuming an MG94 by F3x4 model of codon substitution, and are checked to ensure that known epitopes have been preserved. A phylogenetic tree with both primary and ancestral sequences is generated using IQTREE to check the placement of the ancestral strains. Ancestral sequences are then modified in a number of ways: deletion of regions (e.g. removal of the mucin-like domain); region swapping (to recover potentially lost epitopes); and mutation of specific sites (e.g. in the fusion domain of the filoviruses), including editing of N-linked glycosylation sites and introduction of mutations to enhance stability.
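
    By way of illustration only, the initial filtering of candidate primary sequences described above might be sketched as follows. This is a minimal sketch and not the implementation used; the function name filter_sequences, the use of a minimum length as a proxy for "spans the protein of interest", and the threshold values are all assumptions introduced here.

```python
# Minimal sketch of the sequence filtering step (illustrative assumptions only).
# IUPAC nucleotide ambiguity codes; anything in this set counts as ambiguous.
AMBIGUOUS = set("NRYKMSWBDHV")


def filter_sequences(records, min_length, max_ambiguous_fraction=0.01):
    """Return (name, sequence) pairs that pass three filters.

    records: iterable of (name, nucleotide sequence) pairs.
    min_length: hypothetical proxy for 'spans the protein of interest'.
    max_ambiguous_fraction: hypothetical cut-off for ambiguous nucleotides.
    """
    seen = set()
    kept = []
    for name, seq in records:
        seq = seq.upper().replace("-", "")
        # Drop empty sequences and sequences too short to span the region.
        if not seq or len(seq) < min_length:
            continue
        # Drop sequences with too many ambiguous nucleotides.
        ambiguous = sum(base in AMBIGUOUS for base in seq)
        if ambiguous / len(seq) > max_ambiguous_fraction:
            continue
        # Drop exact duplicates, keeping the first occurrence.
        if seq in seen:
            continue
        seen.add(seq)
        kept.append((name, seq))
    return kept


if __name__ == "__main__":
    toy = [
        ("isolate_a", "ATGAAATAG"),
        ("isolate_b", "ATGAAATAG"),  # identical to isolate_a
        ("isolate_c", "ATGNNNTAG"),  # too many ambiguous bases
        ("isolate_d", "ATG"),        # too short
    ]
    print(filter_sequences(toy, min_length=9, max_ambiguous_fraction=0.1))
```

    The filtered set would then be passed to the alignment step; the manual check of the open reading frame described above is not replaced by this sketch.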

    EXAMPLE 1

    [0364] Ebola Sudan Ancestor (T2-4)

    TABLE-US-00010 Unoptimised (SEQ ID NO: 1) ATGGGGGGTCTTAGCCTACTCCAATTGCCCAGGGACAAATTTCGGAAAAG CTCTTTCTTTGTTTGGGTCATCATCTTATTCCAAAAGGCCTTTTCCATGC CTTTGGGTGTTGTGACTAACAGCACTTTAGAAGTAACAGAGATTGACCAG CTAGTCTGCAAGGATCATCTTGCATCCACTGACCAGCTGAAATCAGTTGG TCTCAACCTCGAGGGGAGCGGAGTATCTACTGATATCCCATCTGCAACAA AGCGTTGGGGCTTCAGATCTGGTGTTCCTCCCAAGGTGGTCAGCTATGAA GCGGGAGAATGGGCTGAAAATTGCTACAATCTTGAAATAAAGAAGCCGGA CGGGAGCGAATGCTTACCCCCACCGCCAGATGGTGTCAGAGGCTTTCCAA GGTGCCGCTATGTTCACAAAGCCCAAGGAACCGGGCCCTGCCCAGGTGAC TACGCCTTTCACAAGGATGGAGCTTTCTTCCTCTATGACAGGCTGGCTTC AACTGTAATTTACAGAGGAGTCAATTTTGCTGAGGGGGTAATTGCATTCT TGATATTGGCTAAACCAAAAGAAACGTTCCTTCAGTCACCCCCCATTCGA GAGGCAGTAAACTACACTGAAAATACATCAAGTTATTATGCCACATCCTA CTTGGAGTATGAAATCGAAAATTTTGGTGCTCAACACTCCACGACCCTTT TCAAAATTGACAATAATACTTTTGTTCGTCTGGACAGGCCCCACACGCCT CAGTTCCTTTTCCAGCTGAATGATACCATTCACCTTCACCAACAGTTGAG CAACACAACTGGGAGACTAATTTGGACACTAGATGCTAATATCAATGCTG ATATTGGTGAATGGGCTTTTTGGGAAAATAAAAAAAATCTCTCCGAACAA CTACGTGGAGAAGAGCTGTCTTTCGAAGCTTTATCGCTCACAACAGCGGT TAAAACTGTCTTGCCACAGGAGTCCACAAGCAACGGTCTAATAACTTCAA CAGTAACAGGGATTCTTGGGAGTCTTGGGCTTCGAAAACGCAGCAGAAGA CAAGTTAACACCAAAGCCACGGGTAAATGCAATCCCAACTTACACTACTG GACTGCACAAGAACAACATAATGCTGCTGGGATTGCCTGGATCCCGTACT TTGGACCGGGTGCGGAAGGCATATACACTGAAGGCCTGATGCATAACCAA AATGCCTTAGTCTGTGGACTTAGGCAACTTGCAAATGAAACAACTCAAGC TCTGCAGCTTTTCTTAAGAGCCACAACGGAGCTGCGGACATATACCATAC TCAATAGGAAGGCCATAGATTTCCTTCTGCGACGATGGGGCGGGACATGC AGGATCCTGGGACCAGATTGTTGCATTGAGCCACATGATTGGACAAAAAA CATCACTGATAAAATCAACCAAATCATCCATGATTTCATCGACAACCCCT TACCTAATCAGGATAATGATGATAATTGGTGGACGGGCTGGAGACAGTGG ATCCCTGCAGGAATAGGCATTACTGGAATTATTATTGCAATTATTGCTCT TCTTTGCGTTTGCAAGCTGCTTTGCTAG Gene-optimised (SEQ ID NO: 2) ATGGGAGGACTGTCTCTGCTGCAACTGCCCCGGGACAAGTTCCGGAAGTC CAGCTTCTTCGTGTGGGTCATCATCCTGTTCCAGAAAGCCTTCAGCATGC CCCTGGGCGTCGTGACCAATAGCACACTGGAAGTGACCGAGATCGACCAG CTCGTGTGCAAGGATCACCTGGCCAGCACCGATCAGCTGAAGTCTGTGGG ACTGAATCTGGAAGGCAGCGGCGTGTCCACAGATATCCCTAGCGCCACCA AGAGATGGGGCTTTAGAAGCGGAGTGCCTCCTAAGGTGGTGTCTTATGAA 
GCCGGCGAGTGGGCCGAGAACTGCTACAACCTGGAAATCAAGAAGCCCGA CGGCAGCGAGTGTCTGCCTCCTCCACCTGATGGCGTCAGAGGCTTCCCTA GATGCAGATACGTGCACAAGGCCCAAGGCACAGGACCCTGTCCTGGCGAT TACGCCTTTCACAAGGACGGCGCCTTTTTCCTGTACGATCGGCTGGCCTC CACCGTGATCTACAGAGGCGTTAACTTTGCCGAGGGCGTGATCGCCTTCC TGATCCTGGCCAAGCCTAAAGAGACATTCCTGCAAAGCCCTCCAATCCGC GAGGCCGTGAACTACACAGAGAACACCAGCAGCTACTACGCCACCAGCTA CCTGGAATACGAGATCGAGAATTTCGGCGCCCAGCACAGCACCACACTGT TCAAGATCGACAACAACACCTTCGTGCGGCTGGACAGACCCCACACACCT CAGTTTCTGTTCCAGCTGAACGACACCATCCATCTGCATCAGCAGCTGAG CAACACCACCGGCAGACTGATTTGGACCCTGGACGCCAACATCAACGCCG ACATTGGAGAGTGGGCCTTTTGGGAGAACAAGAAGAACCTGAGCGAACAG CTGAGAGGCGAGGAACTGAGCTTTGAGGCCCTGTCTCTGACCACCGCCGT GAAAACAGTGCTGCCTCAAGAGTCCACCAGCAACGGCCTGATCACAAGCA CAGTGACAGGCATCCTGGGCAGCCTGGGCCTGAGAAAAAGGTCCAGACGG CAAGTGAATACCAAGGCCACCGGCAAGTGCAACCCCAACCTGCACTATTG GACAGCCCAAGAGCAGCACAATGCCGCCGGAATCGCCTGGATTCCTTATT TTGGACCTGGCGCCGAGGGCATCTATACCGAGGGACTGATGCACAACCAG AACGCCCTCGTGTGTGGACTGAGACAGCTGGCCAATGAGACAACACAGGC CCTCCAGCTGTTTCTGAGAGCCACCACCGAGCTGAGAACCTACACCATCC TGAACCGGAAGGCCATCGACTTTCTGCTGAGAAGATGGGGCGGCACCTGT AGAATCCTGGGACCTGATTGCTGCATCGAGCCCCACGACTGGACCAAGAA CATCACCGACAAGATCAACCAGATCATCCACGACTTCATCGACAACCCTC TGCCTAACCAGGACAACGACGACAATTGGTGGACAGGCTGGCGGCAGTGG ATTCCTGCCGGAATTGGCATCACCGGCATCATCATTGCCATTATCGCCCT GCTGTGTGTGTGCAAGCTGCTGTGTTGA Amino acid sequence encoded by unoptimised and gene-optimised sequences (SEQ ID NO: 3): MGGLSLLQLPRDKFRKSSFFVWVIILFQKAFSMPLGVVTNSTLEVTEIDQ LVCKDHLASTDQLKSVGLNLEGSGVSTDIPSATKRWGFRSGVPPKVVSYE AGEWAENCYNLEIKKPDGSECLPPPPDGVRGFPRCRYVHKAQGTGPCPGD YAFHKDGAFFLYDRLASTVIYRGVNFAEGVIAFLILAKPKETFLQSPPIR EAVNYTENTSSYYATSYLEYEIENFGAQHSTTLFKIDNNTFVRLDRPHTP QFLFQLNDTIHLHQQLSNTTGRLIWTLDANINADIGEWAFWENKKNLSEQ LRGEELSFEALSLTTAVKTVLPQESTSNGLITSTVTGILGSLGLRKRSRR QVNTKATGKCNPNLHYWTAQEQHNAAGIAWIPYFGPGAEGIYTEGLMHNQ NALVCGLRQLANETTQALQLFLRATTELRTYTILNRKAIDFLLRRWGGTC RILGPDCCIEPHDWTKNITDKINQIIHDFIDNPLPNQDNDDNWWTGWRQW IPAGIGITGIIIAIIALLCVCKLLC
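
    By way of illustration only, the statement that the unoptimised and gene-optimised sequences encode the same amino acid sequence can be checked by translating both with the standard genetic code. The sketch below is not part of the described method; the function name translate is introduced here, and for brevity only the first 48 nucleotides of SEQ ID NO: 1 and SEQ ID NO: 2 are used (the full sequences could be substituted).

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(nucleotides):
    """Translate a DNA sequence in frame 1; unknown codons become 'X'."""
    seq = "".join(nucleotides.split()).upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, usable, 3)
    )


# First 48 nucleotides of SEQ ID NO: 1 (unoptimised) and SEQ ID NO: 2
# (gene-optimised), copied from the listing above.
UNOPTIMISED_PREFIX = "ATGGGGGGTCTTAGCCTACTCCAATTGCCCAGGGACAAATTTCGGAAA"
GENE_OPTIMISED_PREFIX = "ATGGGAGGACTGTCTCTGCTGCAACTGCCCCGGGACAAGTTCCGGAAG"

if __name__ == "__main__":
    print(translate(UNOPTIMISED_PREFIX))
    print(translate(GENE_OPTIMISED_PREFIX))
    print(translate(UNOPTIMISED_PREFIX) == translate(GENE_OPTIMISED_PREFIX))
```

    Both prefixes differ at the nucleotide level but translate to the same 16 residues, which is the expected behaviour of synonymous codon substitution.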

    EXAMPLE 2

    [0365] Ebolavirus Global Ancestor (T2-6)

    TABLE-US-00011 Unoptimised (SEQ ID NO: 4) ATGGGGGGTGGATCCAGACTTCTCCAATTGCCCCGGGAACGCTTTCGGAA AACCTCATTCTTTGTTTGGGTAATCATCCTATTCCAAAAAGCCTTTTCCA TGCCATTGGGTGTTGTAACCAACAGCACTCTAAAAGTAACAGAAATTGAC CAATTGGTTTGCCGGGACAAACTTTCATCCACAAGTCAGCTGAAATCAGT TGGGCTGAATCTGGAAGGGAATGGAGTTGCAACTGATGTCCCATCAGCAA CAAAACGATGGGGCTTCCGATCTGGTGTTCCTCCCAAGGTGGTCAGCTAT GAAGCTGGAGAATGGGCTGAAAATTGCTACAATCTGGAAATCAAGAAGCC AGACGGGAGTGAATGCCTACCTCCACCGCCAGACGGTGTAAGAGGCTTCC CCAGGTGCCGCTATGTCCACAAAGTTCAAGGAACAGGGCCGTGTCCTGGT GACTTCGCCTTCCACAAAGATGGAGCTTTCTTCCTGTATGATAGACTGGC TTCAACTGTCATTTACCGAGGGACAACTTTTGCTGAAGGTGTCGTTGCAT TTTTGATCCTGCCCAAACCTAAAAAGGACTTTTTCCAATCACCCCCAATA CGTGAGCCGGTAAACACCACAGAAGATCCATCAAGTTACTACACCACATC AACACTTAGCTATGAGATTGACAATTTTGGGGCCAATAAAACTAAAACTC TTTTCAAAGTTGACAATCACACTTATGTGCAACTAGACCGACCACACACA CCACAGTTCCTTGTCCAGCTCAATGAAACCATTCATACAAATAACCGTCT AAGCAACACCACAGGGAGACTAATTTGGACATTAGATCCTAAAATTGATA CCGACATTGGTGAGTGGGCCTTCTGGGAAAATAAAAAAAACTTCTCCAAA CAACTTCGTGGAGAAGAGTTGTCTTTCAAAGCTCTATCAACAAAAACTGG AGCTAACGCAGTAGACACTGACGAATCAAGCAAACCTGGCCTAATTACCA ACACAGTAAGAGGGGTTGCTGATTTACTGAGCCCTTGGAGAAGAAAAAGA AGACAAGTCAACCCAAACACAACAAATAAATGCAACCCAAACCTACACTA TTGGACAGCCCAAGATGAAGGTGCTGCCGTTGGATTAGCCTGGATCCCAT ACTTCGGACCAGCAGCAGAAGGCATTTACACTGAAGGAATAATGCATAAT CAAAATGGGTTAATCTGTGGGCTGAGGCAGCTGGCCAATGAAACGACTCA AGCTCTTCAATTATTCTTGAGGGCCACAACGGAGCTGCGGACTTACTCTA TACTCAATAGAAAAGCCATTGATTTCCTTCTCCAACGATGGGGAGGAACA TGCCGCATCTTAGGACCAGATTGTTGCATTGAGCCACATGATTGGACAAA AAACATTACTGATAAAATTAACCAAATCATACATGATTTTATTGACAACC CTCTACCAGATCAGGACGATGATGACAATTGGTGGACAGGCTGGAGACAA TGGATCCCTGCTGGAATTGGAATTACTGGAGTTATAATTGCAATTATAGC TCTACTTTGTATTTGCAAGTTTCTGTGTTAG Gene-optimised (SEQ ID NO: 5) ATGGGCGGAGGATCTAGACTGCTGCAACTGCCCAGAGAGCGGTTCAGAAA GACCAGCTTCTTCGTGTGGGTCATCATCCTGTTCCAGAAAGCCTTCAGCA TGCCCCTGGGCGTCGTGACCAATAGCACCCTGAAAGTGACCGAGATCGAC CAGCTCGTGTGCAGAGATAAGCTGAGCAGCACCAGCCAGCTGAAGTCCGT GGGACTGAATCTGGAAGGCAATGGCGTGGCCACAGATGTGCCTAGCGCCA CCAAAAGATGGGGCTTTAGAAGCGGCGTGCCACCTAAGGTGGTGTCTTAT 
GAAGCCGGCGAGTGGGCCGAGAACTGCTACAACCTGGAAATCAAGAAGCC CGACGGCAGCGAGTGTCTGCCTCCTCCACCTGATGGCGTCAGAGGCTTCC CTAGATGCAGATACGTGCACAAGGTGCAAGGCACAGGCCCCTGTCCTGGC GATTTCGCCTTTCACAAGGACGGCGCCTTTTTCCTGTACGATCGGCTGGC CTCCACCGTGATCTACAGAGGCACAACATTTGCCGAAGGCGTGGTGGCCT TCCTGATCCTGCCTAAGCCTAAGAAGGACTTCTTTCAGAGCCCTCCTATC CGCGAGCCTGTGAACACAACAGAGGACCCCAGCAGCTACTACACCACCAG CACACTGAGCTACGAGATCGATAACTTCGGCGCCAACAAGACCAAGACAC TGTTCAAGGTGGACAACCACACCTACGTGCAGCTGGACAGACCCCACACA CCTCAGTTTCTGGTGCAGCTGAACGAGACAATCCACACCAACAACAGACT GAGCAACACCACCGGCAGGCTGATCTGGACCCTGGATCCTAAGATCGACA CCGACATCGGAGAGTGGGCCTTTTGGGAGAACAAGAAGAACTTCAGCAAG CAGCTGAGAGGCGAGGAACTGAGCTTTAAGGCCCTGAGCACCAAGACAGG CGCCAACGCTGTGGATACCGATGAGTCTAGCAAGCCCGGCCTGATCACCA ACACAGTTAGAGGCGTTGCCGACCTGCTGAGCCCTTGGAGAAGAAAGCGG AGACAAGTGAACCCCAATACCACCAACAAGTGCAACCCTAACCTGCACTA CTGGACAGCCCAGGATGAAGGCGCTGCTGTTGGACTGGCCTGGATTCCTT ATTTTGGACCTGCCGCCGAGGGCATCTACACAGAGGGAATCATGCACAAC CAGAATGGCCTGATCTGCGGCCTGAGACAGCTGGCCAATGAGACAACACA GGCCCTCCAGCTGTTTCTGAGAGCCACCACCGAGCTGAGAACCTACAGCA TCCTGAACCGGAAGGCCATCGACTTTCTGCTGCAAAGATGGGGAGGCACC TGTAGAATCCTGGGACCTGATTGCTGCATCGAGCCCCACGACTGGACCAA GAACATCACCGACAAGATCAACCAGATCATCCACGACTTCATCGACAACC CTCTGCCTGACCAGGACGACGACGATAATTGGTGGACAGGATGGCGGCAG TGGATTCCTGCCGGAATCGGAATCACAGGCGTGATCATTGCCATTATCGC CCTGCTGTGCATCTGCAAGTTTCTGTGCTGA Amino acid sequence encoded by unoptimised and gene-optimised sequences (SEQ ID NO: 6): MGGGSRLLQLPRERFRKTSFFVWVIILFQKAFSMPLGVVTNSTLKVTEID QLVCRDKLSSTSQLKSVGLNLEGNGVATDVPSATKRWGFRSGVPPKVVSY EAGEWAENCYNLEIKKPDGSECLPPPPDGVRGFPRCRYVHKVQGTGPCPG DFAFHKDGAFFLYDRLASTVIYRGTTFAEGVVAFLILPKPKKDFFQSPPI REPVNTTEDPSSYYTTSTLSYEIDNFGANKTKTLFKVDNHTYVQLDRPHT PQFLVQLNETIHTNNRLSNTTGRLIWTLDPKIDTDIGEWAFWENKKNFSK QLRGEELSFKALSTKTGANAVDTDESSKPGLITNTVRGVADLLSPWRRKR RQVNPNTTNKCNPNLHYWTAQDEGAAVGLAWIPYFGPAAEGIYTEGIMHN QNGLICGLRQLANETTQALQLFLRATTELRTYSILNRKAIDFLLQRWGGT CRILGPDCCIEPHDWTKNITDKINQIIHDFIDNPLPDQDDDDNWWTGWRQ WIPAGIGITGVIIAIIALLCICKFLC

    EXAMPLE 3

    [0366] Marburgvirus Ancestor (T2-11)

    TABLE-US-00012 Unoptimised (SEQ ID NO: 7) ATGAAGACCATATATTTTCTGATTAGTCTCATTTTAATCCAAAGTATAAA AACTCTCCCTGTTTTAGAAATTGCTAGTAACAGCCAACCTCAAGATGTAG ATTCAGTGTGCTCCGGAACCCTCCAAAAGACAGAAGATGTTCATCTGATG GGATTTACACTGAGTGGGCAAAAAGTTGCTGATTCCCCTTTGGAAGCATC TAAACGATGGGCTTTCAGGACAGGTGTTCCTCCCAAGAACGTTGAGTATA CGGAAGGAGAAGAAGCCAAAACATGTTACAATATAAGTGTAACAGACCCT TCTGGAAAATCCTTGCTGCTGGATCCTCCCAGTAATATCCGCGATTACCC TAAATGTAAAACTGTTCATCATATTCAAGGTCAAAACCCTCATGCACAGG GGATTGCCCTCCATTTGTGGGGGGCATTTTTCCTGTATGATCGCATTGCC TCCACAACAATGTACCGAGGCAAAGTCTTCACTGAAGGGAACATAGCAGC TATGATTGTCAATAAGACAGTGCACAAAATGATTTTCTCGAGGCAAGGAC AAGGGTACCGTCACATGAATCTGACTTCTACTAATAAATATTGGACAAGT AGCAACGGAACGCAAACGAATGACACTGGATGCTTCGGTGCTCTTCAAGA ATACAATTCTACGAAGAACCAAACATGTGCTCCGTCCAAAATACCTCCAC CACTGCCCACAGCCCGTCCGGAGATCAAACCCACAAGCACCCCAACTGAT GCCACCAAACTCAACACCACAGACCCAAACAGTGATGATGAGGACCTCAC AACATCCGGCTCAGGGTCCGGAGAACAGGAACCCTACACAACTTCTGATG CGGTCACTAAGCAAGGGCTTTCATCAACAATGCCACCCACTCCCTCACCA CAACCAAGCACGCCACAGCAAGGAGGAAACAACACAAACCATTCCCAAGG TGCTGTGACTGAACCCGACAAAACCAACACAACTGCACAACCGTCCATGC CCCCCCACAACACTACTACAATCTCTACTAACAACACCTCCAAGCACAAC TTCAGCACTCTCTCTGCACCACTACAAAACACCACCAATTACAACACACA GAGCACGGCCACTGAAAATGAGCAAACCAGTGCCCCCTCGAAAACAACCC TGCCTCCAACAGGAAATCCTACCACAGCAAAGAGCACCAACAGCACAAAA GGCCCCACCACAACGGCACCAAATACGACAAATGGGCATTTCACCAGTCC CTCCCCCACCCCCAACTCGACTACACAACATCTTGTATATTTCAGAAGGA AACGAAGTATCCTCTGGAGGGAAGGCGACATGTTCCCTTTTTTAGATGGG TTAATAAATACTGAAATTGATTTTGATCCAATCCCAAACACAGAAACAAT CTTTGATGAATCCCCCAGCTTTAATACTTCAACTAATGAGGAACAACACA CTCCCCCGAATATCAGTTTAACTTTCTCTTATTTTCCTGATAAAAATGGA GATACTGCCTACTCTGGGGAAAACGAGAATGATTGTGATGCAGAGTTGAG GATTTGGAGTGTGCAGGAGGACGATTTGGCGGCAGGGCTTAGCTGGATAC CATTTTTTGGCCCTGGAATCGAAGGACTCTATACTGCCGGTTTAATCAAA AATCAGAACAATTTAGTTTGTAGGTTGAGGCGCTTAGCTAATCAAACTGC TAAATCCTTGGAGCTCTTGTTAAGGGTCACAACCGAGGAAAGGACATTTT CCTTAATCAATAGGCATGCAATTGACTTTTTGCTTACGAGGTGGGGCGGA ACATGCAAGGTGCTAGGACCTGATTGTTGCATAGGAATAGAAGATCTATC TAAAAATATCTCAGAACAAATTGACAAAATCAGAAAGGATGAACAAAAGG 
AGGAAACTGGCTGGGGTCTAGGTGGCAAATGGTGGACATCTGACTGGGGT GTTCTCACCAATTTGGGCATCCTGCTACTATTATCTATAGCTGTTCTGAT TGCTCTGTCCTGTATCTGTCGTATCTTCACTAAATATATCGGATAG Gene-optimised (SEQ ID NO: 8) ATGAAGACCATCTACTTTCTGATCAGCCTGATCCTGATCCAGAGCATCAA GACCCTGCCTGTGCTGGAAATCGCCAGCAACAGTCAGCCCCAGGATGTGG ATAGCGTGTGTAGCGGCACCCTCCAGAAAACCGAGGATGTGCACCTGATG GGCTTTACCCTGAGCGGCCAGAAAGTGGCCGATTCTCCACTGGAAGCCAG CAAGAGATGGGCCTTTAGAACCGGCGTGCCACCTAAGAACGTCGAGTACA CAGAGGGCGAAGAGGCCAAGACCTGCTACAACATCAGCGTGACCGATCCT AGCGGCAAGAGCCTGCTGCTGGACCCTCCTAGCAACATCAGAGACTACCC CAAGTGCAAGACCGTGCACCACATCCAGGGACAGAATCCCCATGCTCAGG GAATTGCCCTGCACCTGTGGGGCGCCTTTTTCCTGTATGATCGGATCGCC TCCACCACCATGTACAGAGGCAAAGTGTTCACCGAGGGCAATATCGCCGC CATGATCGTGAACAAGACAGTGCACAAGATGATCTTCAGCCGGCAAGGCC AGGGCTACAGACACATGAATCTGACCAGCACCAACAAGTACTGGACCAGC AGCAACGGCACCCAGACCAATGATACAGGCTGCTTTGGCGCCCTGCAAGA GTACAACAGCACCAAGAATCAGACATGCGCCCCTAGCAAGATCCCTCCTC CACTGCCTACTGCCAGACCTGAGATCAAGCCTACCAGCACACCTACCGAC GCCACCAAGCTGAACACCACCGATCCAAACAGCGACGACGAGGATCTGAC AACAAGCGGATCTGGCTCTGGCGAGCAAGAGCCATACACCACCTCTGATG CCGTGACAAAGCAGGGCCTGAGCAGCACAATGCCTCCAACACCTTCTCCA CAGCCTAGCACACCTCAGCAAGGCGGCAACAACACAAATCACTCTCAGGG CGCCGTGACCGAGCCTGACAAGACAAATACCACAGCTCAGCCCAGCATGC CTCCTCACAACACCACCACAATCTCCACCAACAACACCAGCAAGCACAAC TTCAGCACACTGAGCGCCCCTCTCCAGAATACCACCAACTACAATACCCA GAGCACCGCCACCGAGAACGAGCAGACATCTGCCCCTTCTAAGACCACAC TGCCACCTACCGGCAATCCTACCACCGCCAAGAGCACCAATAGCACAAAG GGCCCTACCACCACCGCTCCTAACACCACAAATGGCCACTTCACAAGCCC AAGTCCTACACCTAACAGCACAACCCAGCACCTGGTGTACTTCAGACGGA AGCGGAGCATCCTTTGGCGCGAGGGCGATATGTTCCCTTTCCTGGACGGC CTGATCAACACCGAGATCGACTTCGACCCCATTCCAAACACCGAAACCAT CTTCGACGAGAGCCCCAGCTTCAACACCTCCACCAATGAGGAACAGCACA CCCCTCCAAACATCTCCCTGACCTTCAGCTACTTCCCCGACAAGAACGGC GATACAGCCTACAGCGGCGAGAATGAGAATGACTGCGACGCCGAGCTGCG GATTTGGAGCGTTCAAGAGGATGATCTGGCTGCCGGCCTGAGCTGGATCC CTTTTTTTGGACCTGGCATCGAGGGCCTGTACACCGCCGGACTGATCAAG AACCAGAACAACCTCGTGTGCAGACTGCGGAGACTGGCCAATCAGACCGC CAAGTCTCTGGAACTGCTGCTGCGCGTGACCACCGAGGAAAGAACCTTCT 
CTCTGATCAACCGGCACGCCATCGATTTTCTGCTGACCAGATGGGGCGGC ACCTGTAAAGTTCTGGGCCCTGATTGCTGCATCGGAATCGAGGACCTGAG CAAGAACATCTCCGAGCAGATCGACAAGATCCGCAAGGACGAGCAGAAAG AGGAAACAGGCTGGGGACTCGGCGGCAAGTGGTGGACATCTGATTGGGGC GTGCTGACCAATCTGGGAATCCTGCTGCTCCTGTCTATCGCCGTGCTGAT CGCCCTGAGCTGCATCTGCCGGATCTTCACCAAGTACATCGGCTGA Amino acid sequence encoded by unoptimised and gene-optimised sequences (SEQ ID NO: 9): MKTIYFLISLILIQSIKTLPVLEIASNSQPQDVDSVCSGTLQKTEDVHLM GFTLSGQKVADSPLEASKRWAFRTGVPPKNVEYTEGEEAKTCYNISVTDP SGKSLLLDPPSNIRDYPKCKTVHHIQGQNPHAQGIALHLWGAFFLYDRIA STTMYRGKVFTEGNIAAMIVNKTVHKMIFSRQGQGYRHMNLTSTNKYWTS SNGTQTNDTGCFGALQEYNSTKNQTCAPSKIPPPLPTARPEIKPTSTPTD ATKLNTTDPNSDDEDLTTSGSGSGEQEPYTTSDAVTKQGLSSTMPPTPSP QPSTPQQGGNNTNHSQGAVTEPDKTNTTAQPSMPPHNTTTISTNNTSKHN FSTLSAPLQNTTNYNTQSTATENEQTSAPSKTTLPPTGNPTTAKSTNSTK GPTTTAPNTTNGHFTSPSPTPNSTTQHLVYFRRKRSILWREGDMFPFLDG LINTEIDFDPIPNTETIFDESPSFNTSTNEEQHTPPNISLTFSYFPDKNG DTAYSGENENDCDAELRIWSVQEDDLAAGLSWIPFFGPGIEGLYTAGLIK NQNNLVCRLRRLANQTAKSLELLLRVTTEERTFSLINRHAIDFLLTRWGG TCKVLGPDCCIGIEDLSKNISEQIDKIRKDEQKEETGWGLGGKWWTSDWG VLTNLGILLLLSIAVLIALSCICRIFTKYIG

    EXAMPLE 4

    [0367] Tier 2-4 (SUDV anc -MLD)

    [0368] Sudan ebolavirus ancestral sequences with deleted (minus “−”) mucin-like domain

    TABLE-US-00013 Nucleotide sequence (SEQ ID NO: 10): atgggaggac tgtctctgct gcaactgccc cgggacaagt tccggaagtc cagcttcttc   60 gtgtgggtca tcatcctgtt ccagaaagcc ttcagcatgc ccctgggcgt cgtgaccaat  120 agcacactgg aagtgaccga gatcgaccag ctcgtgtgca aggatcacct ggccagcacc  180 gatcagctga agtctgtggg actgaatctg gaaggcagcg gcgtgtccac agatatccct  240 agcgccacca agagatgggg ctttagaagc ggagtgcctc ctaaggtggt gtcttatgaa  300 gccggcgagt gggccgagaa ctgctacaac ctggaaatca agaagcccga cggcagcgag  360 tgtctgcctc ctccacctga tggcgtcaga ggcttcccta gatgcagata cgtgcacaag  420 gcccaaggca caggaccctg tcctggcgat tacgcctttc acaaggacgg cgcctttttc  480 ctgtacgatc ggctggcctc caccgtgatc tacagaggcg ttaactttgc cgagggcgtg  540 atcgccttcc tgatcctggc caagcctaaa gagacattcc tgcaaagccc tccaatccgc  600 gaggccgtga actacacaga gaacaccagc agctactacg ccaccagcta cctggaatac  660 gagatcgaga atttcggcgc ccagcacagc accacactgt tcaagatcga caacaacacc  720 ttcgtgcggc tggacagacc ccacacacct cagtttctgt tccagctgaa cgacaccatc  780 catctgcatc agcagctgag caacaccacc ggcagactga tttggaccct ggacgccaac  840 atcaacgccg acattggaga gtgggccttt tgggagaaca agaagaacct gagcgaacag  900 ctgagaggcg aggaactgag ctttgaggcc ctgtctctga ccaccgccgt gaaaacagtg  960 ctgcctcaag agtccaccag caacggcctg atcacaagca cagtgacagg catcctgggc 1020 agcctgggcc tgagaaaaag gtccagacgg caagtgaata ccaaggccac cggcaagtgc 1080 aaccccaacc tgcactattg gacagcccaa gagcagcaca atgccgccgg aatcgcctgg 1140 attccttatt ttggacctgg cgccgagggc atctataccg agggactgat gcacaaccag 1200 aacgccctcg tgtgtggact gagacagctg gccaatgaga caacacaggc cctccagctg 1260 tttctgagag ccaccaccga gctgagaacc tacaccatcc tgaaccggaa ggccatcgac 1320 tttctgctga gaagatgggg cggcacctgt agaatcctgg gacctgattg ctgcatcgag 1380 ccccacgact ggaccaagaa catcaccgac aagatcaacc agatcatcca cgacttcatc 1440 gacaaccctc tgcctaacca ggacaacgac gacaattggt ggacaggctg gcggcagtgg 1500 attcctgccg gaattggcat caccggcatc atcattgcca ttatcgccct gctgtgtgtg 1560 tgcaagctgc tgtgttga 1578 Amino acid sequence (SEQ ID NO: 11): 
MGGLSLLQLPRDKFRKSSFFVWVIILFQKAFSMPLGVVTNSTLEVTEIDQ   50 LVCKDHLASTDQLKSVGLNLEGSGVSTDIPSATKRWGFRSGVPPKVVSYE  100 AGEWAENCYNLEIKKPDGSECLPPPPDGVRGFPRCRYVHKAQGTGPCPGD  150 YAFHKDGAFFLYDRLASTVIYRGVNFAEGVIAFLILAKPKETFLQSPPIR  200 EAVNYTENTSSYYATSYLEYEIENFGAQHSTTLFKIDNNTFVRLDRPHTP  250 QFLFQLNDTIHLHQQLSNTTGRLIWTLDANINADIGEWAFWENKKNLSEQ  300 LRGEELSFEALSLTTAVKTVLPQESTSNGLITSTVTGILGSLGLRKRSRR  350 QVNTKATGKCNPNLHYWTAQEQHNAAGIAWIPYFGPGAEGIYTEGLMHNQ  400 NALVCGLRQLANETTQALQLFLRATTELRTYTILNRKAIDFLLRRWGGTC  450 RILGPDCCIEPHDWTKNITDKINQIIHDFIDNPLPNQDNDDNWWTGWRQW  500 IPAGIGITGIIIAIIALLCVCKLLC*
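
    The nucleotide listings in Examples 4 to 6 interleave blocks of ten bases with running position counters (60, 120, and so on). By way of illustration only, such a listing might be converted to a contiguous sequence as sketched below. The function name parse_listing is an assumption introduced here, and the sketch handles nucleotide listings only; the example input is the first two rows of SEQ ID NO: 10.

```python
import re

# A block of bases: plain nucleotides only (assumption: no other symbols occur).
BASE_BLOCK = re.compile(r"[acgtnACGTN]+")


def parse_listing(text):
    """Join base blocks and check each position counter against the running length."""
    blocks = []
    length = 0
    for token in text.split():
        if token.isdigit():
            # A counter should equal the number of bases read so far.
            if int(token) != length:
                raise ValueError(
                    f"counter {token} does not match running length {length}"
                )
        elif BASE_BLOCK.fullmatch(token):
            blocks.append(token.lower())
            length += len(token)
        else:
            raise ValueError(f"unexpected token: {token!r}")
    return "".join(blocks)


# First two rows of SEQ ID NO: 10 as printed in the listing above.
FIRST_ROWS = (
    "atgggaggac tgtctctgct gcaactgccc cgggacaagt tccggaagtc cagcttcttc   60 "
    "gtgtgggtca tcatcctgtt ccagaaagcc ttcagcatgc ccctgggcgt cgtgaccaat  120"
)

if __name__ == "__main__":
    sequence = parse_listing(FIRST_ROWS)
    print(len(sequence))
    print(sequence[:30])
```

    Checking the counters while parsing gives an early warning if a block was lost or duplicated when the listing was copied.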

    EXAMPLE 5

    [0369] Tier 2-6 (SUDV EBOV-TAFV-BDBV anc -MLD)

    [0370] Ancestral sequence to the four species Sudan, Zaire, Tai Forest, and Bundibugyo ebolavirus with the mucin-like-domain deleted.

    TABLE-US-00014 Nucleotide sequence (SEQ ID NO: 12): atgggcggag gatctagact gctgcaactg cccagagagc ggttcagaaa gaccagcttc   60 ttcgtgtggg tcatcatcct gttccagaaa gccttcagca tgcccctggg cgtcgtgacc  120 aatagcaccc tgaaagtgac cgagatcgac cagctcgtgt gcagagataa gctgagcagc  180 accagccagc tgaagtccgt gggactgaat ctggaaggca atggcgtggc cacagatgtg  240 cctagcgcca ccaaaagatg gggctttaga agcggcgtgc cacctaaggt ggtgtcttat  300 gaagccggcg agtgggccga gaactgctac aacctggaaa tcaagaagcc cgacggcagc  360 gagtgtctgc ctcctccacc tgatggcgtc agaggcttcc ctagatgcag atacgtgcac  420 aaggtgcaag gcacaggccc ctgtcctggc gatttcgcct ttcacaagga cggcgccttt  480 ttcctgtacg atcggctggc ctccaccgtg atctacagag gcacaacatt tgccgaaggc  540 gtggtggcct tcctgatcct gcctaagcct aagaaggact tctttcagag ccctcctatc  600 cgcgagcctg tgaacacaac agaggacccc agcagctact acaccaccag cacactgagc  660 tacgagatcg ataacttcgg cgccaacaag accaagacac tgttcaaggt ggacaaccac  720 acctacgtgc agctggacag accccacaca cctcagtttc tggtgcagct gaacgagaca  780 atccacacca acaacagact gagcaacacc accggcaggc tgatctggac cctggatcct  840 aagatcgaca ccgacatcgg agagtgggcc ttttgggaga acaagaagaa cttcagcaag  900 cagctgagag gcgaggaact gagctttaag gccctgagca ccaagacagg cgccaacgct  960 gtggataccg atgagtctag caagcccggc ctgatcacca acacagttag aggcgttgcc 1020 gacctgctga gcccttggag aagaaagcgg agacaagtga accccaatac caccaacaag 1080 tgcaacccta acctgcacta ctggacagcc caggatgaag gcgctgctgt tggactggcc 1140 tggattcctt attttggacc tgccgccgag ggcatctaca cagagggaat catgcacaac 1200 cagaatggcc tgatctgcgg cctgagacag ctggccaatg agacaacaca ggccctccag 1260 ctgtttctga gagccaccac cgagctgaga acctacagca tcctgaaccg gaaggccatc 1320 gactttctgc tgcaaagatg gggaggcacc tgtagaatcc tgggacctga ttgctgcatc 1380 gagccccacg actggaccaa gaacatcacc gacaagatca accagatcat ccacgacttc 1440 atcgacaacc ctctgcctga ccaggacgac gacgataatt ggtggacagg atggcggcag 1500 tggattcctg ccggaatcgg aatcacaggc gtgatcattg ccattatcgc cctgctgtgc 1560 atctgcaagt ttctgtgctg a 1581 Amino acid sequence (SEQ ID NO: 13): 
MGGGSRLLQLPRERFRKTSFFVWVIILFQKAFSMPLGVVTNSTLKVTEID   50 QLVCRDKLSSTSQLKSVGLNLEGNGVATDVPSATKRWGFRSGVPPKVVSY  100 EAGEWAENCYNLEIKKPDGSECLPPPPDGVRGFPRCRYVHKVQGTGPCPG  150 DFAFHKDGAFFLYDRLASTVIYRGTTFAEGVVAFLILPKPKKDFFQSPPI  200 REPVNTTEDPSSYYTTSTLSYEIDNFGANKTKTLFKVDNHTYVQLDRPHT  250 PQFLVQLNETIHTNNRLSNTTGRLIWTLDPKIDTDIGEWAFWENKKNFSK  300 QLRGEELSFKALSTKTGANAVDTDESSKPGLITNTVRGVADLLSPWRRKR  350 RQVNPNTTNKCNPNLHYWTAQDEGAAVGLAWIPYFGPAAEGIYTEGIMHN  400 QNGLICGLRQLANETTQALQLFLRATTELRTYSILNRKAIDFLLQRWGGT  450 CRILGPDCCIEPHDWTKNITDKINQIIHDFIDNPLPDQDDDDNWWTGWRQ  500 WIPAGIGITGVIIAIIALLCICKFLC*
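
    The Methodology refers to editing of N-linked glycosylation sites. By way of illustration only, candidate sites can be located by searching for the N-X-S/T sequon, where X is any residue other than proline. The sketch below is an assumption-laden example rather than the procedure used: the function name find_sequons is introduced here, the input is the first 50 residues of SEQ ID NO: 13, and a motif match indicates only a potential site, not confirmed glycosylation.

```python
import re

# N, then any residue except P, then S or T. The lookahead allows
# overlapping sequons (e.g. "NNST") to be reported separately.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(protein):
    """Return 1-based positions of the asparagine in each N-X-S/T sequon."""
    return [match.start() + 1 for match in SEQUON.finditer(protein.upper())]


# First 50 residues of SEQ ID NO: 13, copied from the listing above.
T2_6_FRAGMENT = "MGGGSRLLQLPRERFRKTSFFVWVIILFQKAFSMPLGVVTNSTLKVTEID"

if __name__ == "__main__":
    print(find_sequons(T2_6_FRAGMENT))
```

    Running the same search on an ancestral sequence before and after modification would show which potential sites were added or removed by the edits.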

    EXAMPLE 6

    [0371] Tier 2-11 (RAVV MARV anc)

    [0372] Ancestral sequence to the strains Marburg Virus and Ravn Virus

    TABLE-US-00015 Nucleotide sequence (SEQ ID NO: 14): atgaagacca tctactttct gatcagcctg atcctgatcc agagcatcaa gaccctgcct   60 gtgctggaaa tcgccagcaa cagtcagccc caggatgtgg atagcgtgtg tagcggcacc  120 ctccagaaaa ccgaggatgt gcacctgatg ggctttaccc tgagcggcca gaaagtggcc  180 gattctccac tggaagccag caagagatgg gcctttagaa ccggcgtgcc acctaagaac  240 gtcgagtaca cagagggcga agaggccaag acctgctaca acatcagcgt gaccgatcct  300 agcggcaaga gcctgctgct ggaccctcct agcaacatca gagactaccc caagtgcaag  360 accgtgcacc acatccaggg acagaatccc catgctcagg gaattgccct gcacctgtgg  420 ggcgcctttt tcctgtatga tcggatcgcc tccaccacca tgtacagagg caaagtgttc  480 accgagggca atatcgccgc catgatcgtg aacaagacag tgcacaagat gatcttcagc  540 cggcaaggcc agggctacag acacatgaat ctgaccagca ccaacaagta ctggaccagc  600 agcaacggca cccagaccaa tgatacaggc tgctttggcg ccctgcaaga gtacaacagc  660 accaagaatc agacatgcgc ccctagcaag atccctcctc cactgcctac tgccagacct  720 gagatcaagc ctaccagcac acctaccgac gccaccaagc tgaacaccac cgatccaaac  780 agcgacgacg aggatctgac aacaagcgga tctggctctg gcgagcaaga gccatacacc  840 acctctgatg ccgtgacaaa gcagggcctg agcagcacaa tgcctccaac accttctcca  900 cagcctagca cacctcagca aggcggcaac aacacaaatc actctcaggg cgccgtgacc  960 gagcctgaca agacaaatac cacagctcag cccagcatgc ctcctcacaa caccaccaca 1020 atctccacca acaacaccag caagcacaac ttcagcacac tgagcgcccc tctccagaat 1080 accaccaact acaataccca gagcaccgcc accgagaacg agcagacatc tgccccttct 1140 aagaccacac tgccacctac cggcaatcct accaccgcca agagcaccaa tagcacaaag 1200 ggccctacca ccaccgctcc taacaccaca aatggccact tcacaagccc aagtcctaca 1260 cctaacagca caacccagca cctggtgtac ttcagacgga agcggagcat cctttggcgc 1320 gagggcgata tgttcccttt cctggacggc ctgatcaaca ccgagatcga cttcgacccc 1380 attccaaaca ccgaaaccat cttcgacgag agccccagct tcaacacctc caccaatgag 1440 gaacagcaca cccctccaaa catctccctg accttcagct acttccccga caagaacggc 1500 gatacagcct acagcggcga gaatgagaat gactgcgacg ccgagctgcg gatttggagc 1560 gttcaagagg atgatctggc tgccggcctg agctggatcc ctttttttgg acctggcatc 1620 gagggcctgt acaccgccgg 
actgatcaag aaccagaaca acctcgtgtg cagactgcgg 1680 agactggcca atcagaccgc caagtctctg gaactgctgc tgcgcgtgac caccgaggaa 1740 agaaccttct ctctgatcaa ccggcacgcc atcgattttc tgctgaccag atggggcggc 1800 acctgtaaag ttctgggccc tgattgctgc atcggaatcg aggacctgag caagaacatc 1860 tccgagcaga tcgacaagat ccgcaaggac gagcagaaag aggaaacagg ctggggactc 1920 ggcggcaagt ggtggacatc tgattggggc gtgctgacca atctgggaat cctgctgctc 1980 ctgtctatcg ccgtgctgat cgccctgagc tgcatctgcc ggatcttcac caagtacatc 2040 ggctga 2046 Amino acid sequence (SEQ ID NO: 15): MKTIYFLISLILIQSIKTLPVLEIASNSQPQDVDSVCSGTLQKTEDVHLM   50 GFTLSGQKVADSPLEASKRWAFRTGVPPKNVEYTEGEEAKTCYNISVTDP  100 SGKSLLLDPPSNIRDYPKCKTVHHIQGQNPHAQGIALHLWGAFFLYDRIA  150 STTMYRGKVFTEGNIAAMIVNKTVHKMIFSRQGQGYRHMNLTSTNKYWTS  200 SNGTQTNDTGCFGALQEYNSTKNQTCAPSKIPPPLPTARPEIKPTSTPTD  250 ATKLNTTDPNSDDEDLTTSGSGSGEQEPYTTSDAVTKQGLSSTMPPTPSP  300 QPSTPQQGGNNTNHSQGAVTEPDKTNTTAQPSMPPHNTTTISTNNTSKHN  350 FSTLSAPLQNTTNYNTQSTATENEQTSAPSKTTLPPTGNPTTAKSTNSTK  400 GPTTTAPNTTNGHFTSPSPTPNSTTQHLVYFRRKRSILWREGDMFPFLDG  450 LINTEIDFDPIPNTETIFDESPSFNTSTNEEQHTPPNISLTFSYFPDKNG  500 DTAYSGENENDCDAELRIWSVQEDDLAAGLSWIPFFGPGIEGLYTAGLIK  550 NQNNLVCRLRRLANQTAKSLELLLRVTTEERTFSLINRHAIDFLLTRWGG  600 TCKVLGPDCCIGIEDLSKNISEQIDKIRKDEQKEETGWGLGGKWWTSDWG  650 VLTNLGILLLLSIAVLIALSCICRIFTKYIG*

    EXAMPLE 7

    [0373] pEVAC Expression Vector

    [0374] FIG. 3 shows a map of the pEVAC expression vector. The sequence of the multiple cloning site of the vector is given below, followed by its entire nucleotide sequence.

    TABLE-US-00016 Sequence of pEVAC Multiple Cloning Site (MCS) (SEQ ID NO: 16): [00001]embedded image [00002]embedded image Entire Sequence of pEVAC (SEQ ID NO: 17): CMV-IE-E/P:  248-989 CMV immediate early 1 enhancer/promoter KanR: 3445-4098 Kanamycin resistance SD:  990-1220 Splice donor SA: 1221-1343 Splice acceptor Tbgh: 1392-1942 Terminator signal from bovine growth hormone pUC-ori: 2096-2769 pUC-plasmid origin of replication 1 TCGCGCGTTT CGGTGATGAC GGTGAAAACC TCTGACACAT GCAGCTCCCG 51  GAGACGGTCA CAGCTTGTCT GTAAGCGGAT GCCGGGAGCA GACAAGCCCG 101 TCAGGGCGCG TCAGCGGGTG TTGGCGGGTG TCGGGGCTGG CTTAACTATG 151 CGGCATCAGA GCAGATTGTA CTGAGAGTGC ACCATATGCG GTGTGAAATA 201 CCGCACAGAT GCGTAAGGAG AAAATACCGC ATCAGATTGG CTATTGGCCA 251 TTGCATACGT TGTATCCATA TCATAATATG TACATTTATA TTGGCTCATG 301 TCCAACATTA CCGCCATGTT GACATTGATT ATTGACTAGT TATTAATAGT 351 AATCAATTAC GGGGTCATTA GTTCATAGCC CATATATGGA GTTCCGCGTT 401 ACATAACTTA CGGTAAATGG CCCGCCTGGC TGACCGCCCA ACGACCCCCG 451 CCCATTGACG TCAATAATGA CGTATGTTCC CATAGTAACG CCAATAGGGA 501 CTTTCCATTG ACGTCAATGG GTGGAGTATT TACGGTAAAC TGCCCACTTG 551 GCAGTACATC AAGTGTATCA TATGCCAAGT ACGCCCCCTA TTGACGTCAA 601 TGACGGTAAA TGGCCCGCCT GGCATTATGC CCAGTACATG ACCTTATGGG 651 ACTTTCCTAC TTGGCAGTAC ATCTACGTAT TAGTCATCGC TATTACCATG 701 GTGATGCGGT TTTGGCAGTA CATCAATGGG CGTGGATAGC GGTTTGACTC 751 ACGGGGATTT CCAAGTCTCC ACCCCATTGA CGTCAATGGG AGTTTGTTTT 801 GGCACCAAAA TCAACGGGAC TTTCCAAAAT GTCGTAACAA CTCCGCCCCA 851 TTGACGCAAA TGGGCGGTAG GCGTGTACGG TGGGAGGTCT ATATAAGCAG 901 AGCTCGTTTA GTGAACCGTC AGATCGCCTG GAGACGCCAT CCACGCTGTT 951 TTGACCTCCA TAGAAGACAC CGGGACCGAT CCAGCCTCCA TCGGCTCGCA 1001 TCTCTCCTTC ACGCGCCCGC CGCCCTACCT GAGGCCGCCA TCCACGCCGG 1051 TTGAGTCGCG TTCTGCCGCC TCCCGCCTGT GGTGCCTCCT GAACTGCGTC 1101 CGCCGTCTAG GTAAGTTTAA AGCTCAGGTC GAGACCGGGC CTTTGTCCGG 1151 CGCTCCCTTG GAGCCTACCT AGACTCAGCC GGCTCTCCAC GCTTTGCCTG 1201 ACCCTGCTTG CTCAACTCTA GTTAACGGTG GAGGGCAGTG TAGTCTGAGC 1251 AGTACTCGTT GCTGCCGCGC GCGCCACCAG ACATAATAGC TGACAGACTA 1301 ACAGACTGTT 
CCTTTCCATG GGTCTTTTCT GCAGTCACCG TCGGTACCGT 1351 CGACACGTGT GATCATCTAG AGGATCCGCG GCCGCAGATC TGCTGTGCCT 1401 TCTAGTTGCC AGCCATCTGT TGTTTGCCCC TCCCCCGTGC CTTCCTTGAC 1451 CCTGGAAGGT GCCACTCCCA CTGTCCTTTC CTAATAAAAT GAGGAAATTG 1501 CATCGCATTG TCTGAGTAGG TGTCATTCTA TTCTGGGGGG TGGGGTGGGG 1551 CAGGACAGCA AGGGGGAGGA TTGGGAAGAC AATAGCAGGC ATGCTGGGGA 1601 TGCGGTGGGC TCTATGGCTA CCCAGGTGCT GAAGAATTGA CCCGGTTCCT 1651 CCTGGGCCAG AAAGAAGCAG GCACATCCCC TTCTCTGTGA CACACCCTGT 1701 CCACGCCCCT GGTTCTTAGT TCCAGCCCCA CTCATAGGAC ACTCATAGCT 1751 CAGGAGGGCT CCGCCTTCAA TCCCACCCGC TAAAGTACTT GGAGCGGTCT 1801 CTCCCTCCCT CATCAGCCCA CCAAACCAAA CCTAGCCTCC AAGAGTGGGA 1851 AGAAATTAAA GCAAGATAGG CTATTAAGTG CAGAGGGAGA GAAAATGCCT 1901 CCAACATGTG AGGAAGTAAT GAGAGAAATC ATAGAATTTT AAGGCCATGA 1951 TTTAAGGCCA TCATGGCCTT AATCTTCCGC TTCCTCGCTC ACTGACTCGC 2001 TGCGCTCGGT CGTTCGGCTG CGGCGAGCGG TATCAGCTCA CTCAAAGGCG 2051 GTAATACGGT TATCCACAGA ATCAGGGGAT AACGCAGGAA AGAACATGTG 2101 AGCAAAAGGC CAGCAAAAGG CCAGGAACCG TAAAAAGGCC GCGTTGCTGG 2151 CGTTTTTCCA TAGGCTCCGC CCCCCTGACG AGCATCACAA AAATCGACGC 2201 TCAAGTCAGA GGTGGCGAAA CCCGACAGGA CTATAAAGAT ACCAGGCGTT 2251 TCCCCCTGGA AGCTCCCTCG TGCGCTCTCC TGTTCCGACC CTGCCGCTTA 2301 CCGGATACCT GTCCGCCTTT CTCCCTTCGG GAAGCGTGGC GCTTTCTCAT 2351 AGCTCACGCT GTAGGTATCT CAGTTCGGTG TAGGTCGTTC GCTCCAAGCT 2401 GGGCTGTGTG CACGAACCCC CCGTTCAGCC CGACCGCTGC GCCTTATCCG 2451 GTAACTATCG TCTTGAGTCC AACCCGGTAA GACACGACTT ATCGCCACTG 2501 GCAGCAGCCA CTGGTAACAG GATTAGCAGA GCGAGGTATG TAGGCGGTGC 2551 TACAGAGTTC TTGAAGTGGT GGCCTAACTA CGGCTACACT AGAAGAACAG 2601 TATTTGGTAT CTGCGCTCTG CTGAAGCCAG TTACCTTCGG AAAAAGAGTT 2651 GGTAGCTCTT GATCCGGCAA ACAAACCACC GCTGGTAGCG GTGGTTTTTT 2701 TGTTTGCAAG CAGCAGATTA CGCGCAGAAA AAAAGGATCT CAAGAAGATC 2751 CTTTGATCTT TTCTACGGGG TCTGACGCTC AGTGGAACGA AAACTCACGT 2801 TAAGGGATTT TGGTCATGAG ATTATCAAAA AGGATCTTCA CCTAGATCCT 2851 TTTAAATTAA AAATGAAGTT TTAAATCAAT CTAAAGTATA TATGAGTAAA 2901 CTTGGTCTGA CAGTTACCAA TGCTTAATCA GTGAGGCACC TATCTCAGCG 2951 ATCTGTCTAT TTCGTTCATC 
CATAGTTGCC TGACTCGGGG GGGGGGGGCG 3001 CTGAGGTCTG CCTCGTGAAG AAGGTGTTGC TGACTCATAC CAGGCCTGAA 3051 TCGCCCCATC ATCCAGCCAG AAAGTGAGGG AGCCACGGTT GATGAGAGCT 3101 TTGTTGTAGG TGGACCAGTT GGTGATTTTG AACTTTTGCT TTGCCACGGA 3151 ACGGTCTGCG TTGTCGGGAA GATGCGTGAT CTGATCCTTC AACTCAGCAA 3201 AAGTTCGATT TATTCAACAA AGCCGCCGTC CCGTCAAGTC AGCGTAATGC 3251 TCTGCCAGTG TTACAACCAA TTAACCAATT CTGATTAGAA AAACTCATCG 3301 AGCATCAAAT GAAACTGCAA TTTATTCATA TCAGGATTAT CAATACCATA 3351 TTTTTGAAAA AGCCGTTTCT GTAATGAAGG AGAAAACTCA CCGAGGCAGT 3401 TCCATAGGAT GGCAAGATCC TGGTATCGGT CTGCGATTCC GACTCGTCCA 3451 ACATCAATAC AACCTATTAA TTTCCCCTCG TCAAAAATAA GGTTATCAAG 3501 TGAGAAATCA CCATGAGTGA CGACTGAATC CGGTGAGAAT GGCAAAAGCT 3551 TATGCATTTC TTTCCAGACT TGTTCAACAG GCCAGCCATT ACGCTCGTCA 3601 TCAAAATCAC TCGCATCAAC CAAACCGTTA TTCATTCGTG ATTGCGCCTG 3651 AGCGAGACGA AATACGCGAT CGCTGTTAAA AGGACAATTA CAAACAGGAA 3701 TCGAATGCAA CCGGCGCAGG AACACTGCCA GCGCATCAAC AATATTTTCA 3751 CCTGAATCAG GATATTCTTC TAATACCTGG AATGCTGTTT TCCCGGGGAT 3801 CGCAGTGGTG AGTAACCATG CATCATCAGG AGTACGGATA AAATGCTTGA 3851 TGGTCGGAAG AGGCATAAAT TCCGTCAGCC AGTTTAGTCT GACCATCTCA 3901 TCTGTAACAT CATTGGCAAC GCTACCTTTG CCATGTTTCA GAAACAACTC 3951 TGGCGCATCG GGCTTCCCAT ACAATCGATA GATTGTCGCA CCTGATTGCC 4001 CGACATTATC GCGAGCCCAT TTATACCCAT ATAAATCAGC ATCCATGTTG 4051 GAATTTAATC GCGGCCTCGA GCAAGACGTT TCCCGTTGAA TATGGCTCAT 4101 AACACCCCTT GTATTACTGT TTATGTAAGC AGACAGTTTT ATTGTTCATG 4151 ATGATATATT TTTATCTTGT GCAATGTAAC ATCAGAGATT TTGAGACACA 4201 ACGTGGCTTT CCCCCCCCCC CCATTATTGA AGCATTTATC AGGGTTATTG 4251 TCTCATGAGC GGATACATAT TTGAATGTAT TTAGAAAAAT AAACAAATAG 4301 GGGTTCCGCG CACATTTCCC CGAAAAGTGC CACCTGACGT CTAAGAAACC 4351 ATTATTATCA TGACATTAAC CTATAAAAAT AGGCGTATCA CGAGGCCCTT 4401 TCGTC
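
    By way of illustration only, the feature coordinates listed with SEQ ID NO: 17 can be used to compute feature lengths, extract feature subsequences, and confirm that the annotated features do not overlap. The sketch below assumes the coordinates are 1-based and inclusive, which the listing does not state explicitly; the function names are introduced here.

```python
from itertools import combinations

# Feature coordinates as listed with SEQ ID NO: 17
# (assumption: 1-based, inclusive start and end positions).
FEATURES = {
    "CMV-IE-E/P": (248, 989),
    "SD": (990, 1220),
    "SA": (1221, 1343),
    "Tbgh": (1392, 1942),
    "pUC-ori": (2096, 2769),
    "KanR": (3445, 4098),
}


def feature_length(start, end):
    """Length of a feature given 1-based inclusive coordinates."""
    return end - start + 1


def extract_feature(sequence, start, end):
    """Slice a feature out of a sequence using 1-based inclusive coordinates."""
    return sequence[start - 1:end]


def find_overlaps(features):
    """Return pairs of feature names whose coordinate ranges intersect."""
    overlaps = []
    for (name_a, (start_a, end_a)), (name_b, (start_b, end_b)) in combinations(
        features.items(), 2
    ):
        if start_a <= end_b and start_b <= end_a:
            overlaps.append((name_a, name_b))
    return overlaps


if __name__ == "__main__":
    for name, (start, end) in FEATURES.items():
        print(name, feature_length(start, end))
    print(find_overlaps(FEATURES))
```

    To extract a feature from the vector itself, the position counters and spaces would first need to be removed from the printed sequence so that string indices correspond to nucleotide positions.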

    EXAMPLE 8

    [0375] Lead Candidate Optimized Antigenic Ebola Polypeptides Able to Induce a Broadly Neutralizing Antibody Response

    [0376] There was significant interest in developing vaccines against Ebola following the West African outbreak in 2014. Programmes currently in clinical development have so far taken a ‘classical’ approach to vaccine development, using Ebola and/or Marburg virus surface glycoproteins (GPs) from one to three strains expressed in a viral vector backbone. Antigen specificity comes only from the included strains: for example, Merck use a GP from Kikwit; GSK use Mayinga EBOV and Gulu SUDV strains; Crucell and Profectus Biosciences both use a Marburg virus together with Zaire and Sudan Ebola strains; and the Novavax approach is unique in using the 2014 Makona EBOV strain.

    [0377] Table 1 below shows flow cytometric assay results illustrating the strength of antibody binding to target antigens representative of all Ebola virus species (subtypes) and Marburg viruses. Strength of binding is indicated by the heat-map, where red (the darkest shading when viewed in grayscale) is very strong binding, decreasing through orange to yellow (progressively lighter shading when viewed in grayscale), and no binding (values equal to the negative control) is white. Serum samples 1-22 were taken from individuals immunised with other Ebola virus vaccine candidates. T2-4 and T2-6 are nucleic acid vaccines, each encoding a lead candidate optimized antigenic Ebola polypeptide, combined with T2-11, a Marburg candidate. These are at the pre-clinical testing stage, with serum samples taken from immunised guinea pigs.

    EXAMPLE 9

    [0378] Protection Achieved by a Trivalent Lassa, Ebola and Marburg Viral Vaccine (Tri-LEMvac) in an Ebola Challenge Model

    [0379] We have developed a trivalent vaccine (Tri-LEMvac) that generates combined vaccine efficacy against future outbreaks of variants of the haemorrhagic fever Lassa, Ebola and Marburg viruses.

    [0380] We have bioinformatically designed synthetic glycoprotein sequences from the GPC open reading frames of LASV (L) as well as EBOV (E) and MARV (M) from all available Arenavirus and Filovirus databases. These conserved sequences are rich in neutralising antibody and T-cell epitopes for each of these viruses. To ensure that these synthetically designed LASV, EBOV and MARV envelopes were functional and antigenic, they were expressed as pseudotypes and quality controlled for both binding and neutralisation against a panel of broadly neutralising antibodies. We chose the vaccine-derived vector Modified Vaccinia Ankara (MVA) for construction of the trivalent LEM vaccine.

    [0381] Modified Vaccinia Ankara (MVA) is a third-generation smallpox vaccine strain that does not replicate in human cells, and is one of the most advanced recombinant poxviral vaccine vectors in human clinical trials (Cottingham & Carroll, Vaccine, 2013, 31(39):4247-51). MVA is a robust vector system capable of co-expressing up to four transgenes, facilitated by potent promoters and stable insertion sites (Orubu et al, PLoS ONE, 2012, 7(6):e40167). MVA was chosen because: 1) it has significant capacity to stably express multiple independent ORFs via compatible expression cassettes with strong, temporally regulated promoters, allowing trivalent LEM vaccination in one cost-effective vaccine lot; 2) it induces robust B- and T-cell immune responses in animals and humans, especially when primed or boosted with DNA or RNA vectors; and 3) its vaccine lots can be thermally stabilised for storage and transport in developing countries in the absence of a cold chain (Frey et al, Vaccine, 2015, 33(39):5225-34). Proof of principle for the trivalent vaccine candidate has been demonstrated by: i) cassette validation for independent L, E and M GPC expression and epitope presentation; and ii) preclinical efficacy by Filovirus challenge. The challenge study results are shown in FIG. 4. The Ebola challenge model was lethal for non-vaccinated guinea pigs (Group 1, lower line), whereas all vaccinated guinea pigs (Group 2, upper line) were protected (left) and continued to gain weight (right).

    EXAMPLE 10

    [0382] Pseudotype Virus Neutralisation Assay

    [0383] FIG. 5 shows the results of a pseudotype virus neutralisation assay illustrating the strength of neutralising antibody responses to target antigens expressed on the surface of a pseudotyped virus, representative of all Ebola virus species and Marburg viruses. Strength of neutralisation is indicated by the heat-map, where red (darkest shading when viewed in grayscale) is very strong neutralisation, decreasing through orange to yellow (progressively lighter shading when viewed in grayscale), and no neutralisation (values equal to the negative control) is white.

    [0384] T2-4 and T2-6 are nucleic acid vaccines, each encoding a lead candidate optimized antigenic Ebola polypeptide, combined with T2-11, a Marburg candidate. These are at the pre-clinical testing stage, with serum samples taken from immunised guinea pigs.

    [0385] The results show that administering a combination of T2-6 and T2-11 vaccine inserts gave a synergistic increase in the breadth of the immune response.

    EXAMPLE 11

    [0386] Antibody Binding Assay

    [0387] FIG. 6 shows the results of an antibody binding assay. Antibody binding was measured by incubating two groups of cells, bearing two different group 1 influenza A glycoproteins on their surface (H1 pandemic and seasonal), with pooled mouse serum. Any bound antibodies were then detected by a secondary antibody, and results were recorded using a flow cytometer. Binding increased significantly after vaccination, relative to before vaccination, with all constructs, but not after vaccination with PBS (control). Overall, a DIOS vaccine candidate out-performed those from COBRA in both cases (*).

    EXAMPLE 12

    [0388] Comparison of Immune Responses Induced by Two Different Computational Approaches

    [0389] Four groups of six mice were immunized five times, at two-week intervals, with 25 μg of four separate pEVAC plasmids encoding HA gene antigens that were designed either by a method according to an embodiment of the invention (DIOS) or by a conventional method (COBRA).

    [0390] Antibody-based FACS was carried out on cells expressing two different group 1 influenza A glycoproteins on their cell surface (seasonal H1N1, and pandemic origin H1N1). These were used to test mouse sera from animals immunized with either the COBRA or DIOS HA gene antigens. The results are shown in FIG. 7.

    [0391] Overall, the DIOS HA gene antigens matched or significantly out-performed the COBRA HA gene antigens (** p<0.01, *** p<0.001).

    EXAMPLE 13

    [0392] Cross-HA-Group Binding, and Pseudotype Neutralization of H7N9 (A/Shanghai/2/2013)

    [0393] We tested whether the DIOS-H1N1pdm vaccine of Example 12 (which produced higher levels of antibody binding than H1N1-COBRA to the pandemic H1 HA antigen) could evoke antibodies that recognize and bind divergent group 2 virus HA, such as that from the pandemic-potential H7N9 strain A/Shanghai/2/2013.

    [0394] FIG. 8 shows the results of cross-HA-group binding (left panel) and pseudotype neutralization (right panel) of H7N9 (A/Shanghai/2/2013) by sera from DIOS or COBRA DNA-immunized mice. The H7 binding data (left), confirmed by the pseudotype neutralization data (right), show that sera from H1N1pdm-vaccinated mice gave the highest neutralization of the groups tested. Significantly more binding was elicited by the DIOS-H1N1pdm vaccine than by the other groups tested, and this binding was comparable with that of the positive control broadly neutralizing monoclonal antibodies FI6 (Corti et al., 2011, supra) and CR9114 (Dreyfus et al, Science, 2012; 337(6100): 1343-1348).

    [0395] These results support a conclusion that the DIOS-H1N1pdm immunogen cross neutralizes H7, and that cross-HA group immune protection is possible with vaccines produced by methods of the invention.

    EXAMPLE 14

    [0396] Lassa Virus Glycoprotein

    [0397] This example describes a Lassa virus glycoprotein ancestral sequence produced using a method according to an embodiment of the invention, and modifications to the ancestral sequence that improve its immunogenicity by stabilising the structure.

    [0398] Lassa fever is a hemorrhagic disease caused by an Old World (OW) arenavirus known as Lassa virus (LASV). The virus was first isolated in Nigeria in 1969 and is currently endemic in West Africa. Due to the high morbidity and mortality associated with Lassa hemorrhagic fever, LASV is classified as a category A pathogen.

    [0399] Lassa virus is an enveloped ambisense RNA virus with a bisegmented genome. Viral particles are covered in mature glycoprotein (GP) trimeric spikes, which mediate viral entry. Like other class I viral fusion proteins, the envelope glycoprotein precursor (GPC) is translated as a single polypeptide and is proteolytically cleaved into three subunits. Processing occurs first in the endoplasmic reticulum (ER) by a cellular signal peptidase. GPC is then trafficked to the cis-Golgi apparatus and processed by cellular proprotein convertase subtilisin kexin isozyme-1/site-1 protease (SKI-1/S1P) to produce a noncovalent stable-signal peptide (SSP)/GP1/GP2 heterotrimer. Unlike other class I fusion proteins, the relatively long signal peptide of GPC is not degraded; it serves a chaperone-like function necessary for the correct trafficking and processing of GP. SSP interacts with the cytoplasmic domain of GP2 and is involved in pH sensing. GP1 is responsible for binding to cellular receptors, while GP2 mediates membrane fusion during viral entry.

    [0400] Lassa virus glycoprotein ancestral sequence to lineages III and IV (L-10) (construct 1) was produced using a method according to an embodiment of the invention. Modifications were then introduced independently into the parental ancestral sequence (construct 1) to provide: (A) SOSEP (construct 2); and (B) FLEP (construct 4), as well as in combination with a glycan knock-out, called NtoK (to provide constructs 3 and 5), to stabilize the otherwise flexible heterotrimers and prevent dissociation of the external domain of the glycoprotein from the non-covalently linked transmembrane domain.

    [0401] (A) Two cysteine residues were introduced at positions 207 and 360 to allow formation of a disulfide bridge (SOS) between the exterior and the transmembrane domains of GP. To facilitate complete cleavage of these two domains, the furin cleavage site was modified from RRLL to RRRR at positions 256-259. Mutation of glutamate to proline at position 329 (EP) prevents structural rearrangements, making the protein less flexible.

    [0402] (B) The furin cleavage site (256-RRLL-259) between the C-terminus of the external domain and the N-terminus of the transmembrane domain was replaced by a flexible linker with the sequence 256-GGGGSGGGGS-265. Additionally, the EP-mutation as in (A) was introduced at position 335.

    [0403] Variants of both designs were generated that additionally contain an asparagine to lysine mutation at position 272 or 278, for SOSEP-NtoK or FLEP-NtoK, respectively, to inactivate a glycosylation motif. Glycans at this position might block access of some neutralizing antibodies, such as 37.7H.
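    The residue numbering given in (A), (B) and paragraph [0403] can be checked mechanically against the construct sequences of this example. The following is a minimal Python sketch and not part of the described method: it assumes 1-based numbering on the full-length GPC, uses only residues 201-400 of the construct 1 amino acid sequence (SEQ ID NO: 18), and the helper names `substitute` and `replace_range` are hypothetical.

```python
# Residues 201-400 of construct 1 (SEQ ID NO: 18), copied from the sequence listing.
OFFSET = 200  # fragment index 0 corresponds to residue 201
PARENT = (
    "IALDSGRGNWDCIMTSYQYLIIQNTTWEDHCQFSRPSPIGYLGLLSQRTR"
    "DIYISRRLLGTFTWTLSDSEGNETPGGYCLTRWMLIEAELKCFGNTAVAK"
    "CNEKHDEEFCDMLRLFDFNKQAIRRLKAEAQMSIQLINKAVNALINDQLI"
    "MKNHLRDIMGIPYCNYSKYWYLNHTITGKTSLPKCWLVSNGSYLNETHFS"
)


def substitute(seq, pos, old, new):
    """Replace one residue at 1-based position `pos`, checking the expected residue."""
    i = pos - OFFSET - 1
    if seq[i] != old:
        raise ValueError(f"expected {old} at {pos}, found {seq[i]}")
    return seq[:i] + new + seq[i + 1:]


def replace_range(seq, start, old, new):
    """Replace the residues `old` starting at 1-based `start` with `new` (any length)."""
    i = start - OFFSET - 1
    if seq[i:i + len(old)] != old:
        raise ValueError(f"expected {old} at {start}")
    return seq[:i] + new + seq[i + len(old):]


# (A) SOSEP: R207C and G360C (disulfide pair), RRLL -> RRRR at 256-259, E329P.
sosep = substitute(PARENT, 207, "R", "C")
sosep = substitute(sosep, 360, "G", "C")
sosep = replace_range(sosep, 256, "RRLL", "RRRR")
sosep = substitute(sosep, 329, "E", "P")

# (B) FLEP: RRLL replaced by a 10-residue linker, shifting downstream numbering by +6.
flep = replace_range(PARENT, 256, "RRLL", "GGGGSGGGGS")
flep = substitute(flep, 335, "E", "P")  # residue 329 in the parental numbering

# NtoK variants: N272K in SOSEP and N278K in FLEP (the same asparagine, shifted by the linker).
sosep_ntok = substitute(sosep, 272, "N", "K")
flep_ntok = substitute(flep, 278, "N", "K")

# Show the region around the modified cleavage site in both designs.
print(sosep[50:60], flep[50:66])
```

    Because each helper raises an error when the expected parental residue is not found, the script also confirms that positions 335 and 278 in the FLEP constructs are the parental positions 329 and 272 shifted by the six extra linker residues.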

    [0404] Construct 1:

    [0405] Lassa Virus Glycoprotein Ancestral Sequence to Lineages III and IV (L-10=LASV III IV anc)

    TABLE-US-00017
    Amino acid sequence (SEQ ID NO: 18):
    MGQIVTFFQEVPHVIEEVMNIVLIALSLLAILKGLYNVATCGLIGLVTFL  50
    LLCGRSCSTTLYKGVYELQTLELNMETLNMTMPLSCTKNNSHHYIRVGNE 100
    TGLELTLTNTSIINHKFCNLSDAHKKNLYDHALMSIISTFHLSIPNFNQY 150
    EAMSCDFNGGKISVQYNLSHSYAVDAANHCGTVANGVLQTFMRMAWGGSY 200
    IALDSGRGNWDCIMTSYQYLIIQNTTWEDHCQFSRPSPIGYLGLLSQRTR 250
    DIYISRRLLGTFTWTLSDSEGNETPGGYCLTRWMLIEAELKCFGNTAVAK 300
    CNEKHDEEFCDMLRLFDFNKQAIRRLKAEAQMSIQLINKAVNALINDQLI 350
    MKNHLRDIMGIPYCNYSKYWYLNHTITGKTSLPKCWLVSNGSYLNETHFS 400
    DDIEQQADNMITEMLQKEYMDRQGKTPLGLVDLFVFSTSFYLISIFLHLV 450
    KIPTHRHIVGKPCPKPHRLNHMGICSCGLYKQPGVPVRWKR*
    DNA sequence (SEQ ID NO: 19):
       1 ATGGGCCAGA TCGTGACATT CTTCCAAGAG GTGCCCCACG TGATCGAAGA
      51 AGTGATGAAC ATCGTCCTGA TCGCCCTGAG CCTGCTGGCC ATCCTGAAGG
     101 GCCTGTATAA TGTGGCCACC TGTGGCCTGA TCGGCCTGGT CACATTTCTG
     151 CTGCTGTGCG GCAGAAGCTG CTCCACCACA CTGTATAAGG GCGTGTACGA
     201 GCTGCAAACC CTGGAACTGA ACATGGAAAC CCTGAACATG ACCATGCCTC
     251 TGAGCTGCAC CAAGAACAAC AGCCACCACT ACATCAGAGT GGGCAACGAG
     301 ACAGGCCTCG AGCTGACCCT GACCAACACC AGCATCATCA ACCACAAGTT
     351 CTGCAACCTG AGCGACGCCC ACAAGAAGAA CCTGTACGAT CACGCCCTGA
     401 TGAGCATCAT CTCCACCTTC CACCTGAGCA TCCCCAACTT CAACCAGTAC
     451 GAGGCCATGA GCTGCGACTT CAACGGCGGA AAGATCAGCG TGCAGTACAA
     501 TCTGAGCCAC AGCTATGCCG TGGACGCCGC CAATCATTGT GGAACAGTGG
     551 CCAATGGCGT GCTCCAGACC TTCATGAGAA TGGCCTGGGG CGGCAGCTAT
     601 ATCGCCCTGG ATTCTGGCAG AGGCAACTGG GACTGCATCA TGACCAGCTA
     651 CCAGTACCTG ATCATCCAGA ACACCACCTG GGAAGATCAC TGCCAGTTCA
     701 GCAGACCCTC TCCTATCGGA TACCTGGGCC TGCTGTCCCA GAGAACCCGG
     751 GACATCTACA TCTCTAGACG GCTGCTGGGC ACCTTCACCT GGACACTGTC
     801 TGATAGCGAG GGCAATGAGA CACCTGGCGG CTACTGTCTG ACCCGGTGGA
     851 TGCTGATTGA GGCCGAGCTG AAGTGCTTCG GAAATACCGC CGTGGCCAAG
     901 TGCAACGAGA AGCACGACGA GGAATTCTGC GACATGCTGC GGCTGTTCGA
     951 TTTCAACAAG CAGGCCATCA GACGGCTGAA GGCCGAGGCT CAGATGTCCA
    1001 TCCAGCTGAT CAACAAGGCC GTGAATGCCC TGATTAACGA CCAGCTCATC
    1051 ATGAAGAACC ACCTCAGGGA CATCATGGGC ATCCCTTACT GCAACTACAG
    1101 CAAGTACTGG TATCTGAACC ACACCATCAC CGGCAAGACC AGCCTGCCTA
    1151 AGTGCTGGCT GGTGTCCAAC GGCAGCTACC TGAACGAGAC ACACTTCAGC
    1201 GACGACATCG AGCAGCAGGC CGACAACATG ATCACCGAGA TGCTCCAGAA
    1251 AGAGTACATG GACCGGCAGG GCAAGACACC TCTGGGCCTT GTGGATCTGT
    1301 TCGTGTTCAG CACCAGCTTC TACCTGATCT CTATCTTCCT GCACCTGGTC
    1351 AAGATCCCCA CACACAGACA CATCGTGGGC AAGCCCTGTC CTAAGCCTCA
    1401 CAGACTGAAC CATATGGGCA TCTGTAGCTG CGGCCTGTAC AAACAGCCTG
    1451 GCGTGCCAGT GCGGTGGAAG AGATAA
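
    As a consistency check between the amino acid and DNA listings for construct 1, the DNA sequence can be translated and compared with the amino acid sequence. The sketch below is illustrative only: it assumes the standard genetic code applies, uses just nucleotides 1-150 of SEQ ID NO: 19 and residues 1-50 of SEQ ID NO: 18, and the function name `translate` is hypothetical.

```python
from itertools import product

# Standard genetic code with codons enumerated in TCAG order
# (assumption: no alternative genetic code applies to these constructs).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a DNA string codon by codon, ignoring spaces; '*' marks a stop codon."""
    dna = dna.replace(" ", "").upper()
    usable = len(dna) - len(dna) % 3  # drop any trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Nucleotides 1-150 of SEQ ID NO: 19, as printed in the listing.
DNA_1_150 = (
    "ATGGGCCAGA TCGTGACATT CTTCCAAGAG GTGCCCCACG TGATCGAAGA "
    "AGTGATGAAC ATCGTCCTGA TCGCCCTGAG CCTGCTGGCC ATCCTGAAGG "
    "GCCTGTATAA TGTGGCCACC TGTGGCCTGA TCGGCCTGGT CACATTTCTG"
)
# Residues 1-50 of SEQ ID NO: 18.
PROTEIN_1_50 = "MGQIVTFFQEVPHVIEEVMNIVLIALSLLAILKGLYNVATCGLIGLVTFL"

print(translate(DNA_1_150) == PROTEIN_1_50)
```

    The same function could be applied to the full-length listings of each construct; the terminal `*` in the amino acid sequences corresponds to the final TAA or TGA stop codon of the DNA sequences.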

    [0406] Construct 2:

    [0407] SOSEP-Variant of Construct 1 (L-10-SOSEP)

    TABLE-US-00018
    Amino acid sequence (SEQ ID NO: 20):
    MGQIVTFFQEVPHVIEEVMNIVLIALSLLAILKGLYNVATCGLIGLVTFL  50
    LLCGRSCSTTLYKGVYELQTLELNMETLNMTMPLSCTKNNSHHYIRVGNE 100
    TGLELTLTNTSIINHKFCNLSDAHKKNLYDHALMSIISTFHLSIPNFNQY 150
    EAMSCDFNGGKISVQYNLSHSYAVDAANHCGTVANGVLQTFMRMAWGGSY 200
    IALDSGCGNWDCIMTSYQYLIIQNTTWEDHCQFSRPSPIGYLGLLSQRTR 250
    DIYISRRRRGTFTWTLSDSEGNETPGGYCLTRWMLIEAELKCFGNTAVAK 300
    CNEKHDEEFCDMLRLFDFNKQAIRRLKAPAQMSIQLINKAVNALINDQLI 350
    MKNHLRDIMCIPYCNYSKYWYLNHTITGKTSLPKCWLVSNGSYLNETHFS 400
    DDIEQQADNMITEMLQKEYMDRQGKTPLGLVDLFVFSTSFYLISIFLHLV 450
    KIPTHRHIVGKPCPKPHRLNHMGICSCGLYKQPGVPVRWKR*
    DNA sequence (SEQ ID NO: 21):
       1 ATGGGCCAGA TCGTGACATT CTTCCAAGAG GTGCCCCACG TGATCGAAGA
      51 AGTGATGAAC ATCGTCCTGA TCGCCCTGAG CCTGCTGGCC ATCCTGAAGG
     101 GCCTGTATAA TGTGGCCACC TGTGGCCTGA TCGGCCTGGT CACATTTCTG
     151 CTGCTGTGCG GCAGAAGCTG CTCCACCACA CTGTATAAGG GCGTGTACGA
     201 GCTGCAAACC CTGGAACTGA ACATGGAAAC CCTGAACATG ACCATGCCTC
     251 TGAGCTGCAC CAAGAACAAC AGCCACCACT ACATCAGAGT GGGCAACGAG
     301 ACAGGCCTCG AGCTGACCCT GACCAACACC AGCATCATCA ACCACAAGTT
     351 CTGCAACCTG AGCGACGCCC ACAAGAAGAA CCTGTACGAT CACGCCCTGA
     401 TGAGCATCAT CTCCACCTTC CACCTGAGCA TCCCCAACTT CAACCAGTAC
     451 GAGGCCATGA GCTGCGACTT CAACGGCGGA AAGATCAGCG TGCAGTACAA
     501 TCTGAGCCAC AGCTATGCCG TGGACGCCGC CAATCATTGT GGAACAGTGG
     551 CCAATGGCGT GCTCCAGACC TTCATGAGAA TGGCCTGGGG CGGCAGCTAT
     601 ATCGCCCTGG ATTCTGGCTG TGGCAACTGG GACTGCATCA TGACCAGCTA
     651 CCAGTACCTG ATCATCCAGA ACACCACCTG GGAAGATCAC TGCCAGTTCA
     701 GCAGACCCTC TCCTATCGGA TACCTGGGCC TGCTGTCCCA GAGAACCCGG
     751 GACATCTACA TCTCTCGGCG GAGAAGAGGC ACCTTCACCT GGACACTGTC
     801 TGATAGCGAG GGCAATGAGA CACCTGGCGG CTACTGTCTG ACCCGGTGGA
     851 TGCTGATTGA GGCCGAGCTG AAGTGCTTCG GAAATACCGC CGTGGCCAAG
     901 TGCAACGAGA AGCACGACGA GGAATTCTGC GACATGCTGC GGCTGTTCGA
     951 TTTCAACAAG CAGGCCATCA GACGGCTGAA GGCCCCTGCT CAGATGTCCA
    1001 TCCAGCTGAT CAACAAGGCC GTGAATGCCC TGATTAACGA CCAGCTCATC
    1051 ATGAAGAACC ACCTCAGGGA CATCATGTGC ATCCCTTACT GCAACTACAG
    1101 CAAGTACTGG TATCTGAACC ACACCATCAC CGGCAAGACC AGCCTGCCTA
    1151 AGTGCTGGCT GGTGTCCAAC GGCAGCTACC TGAACGAGAC ACACTTCAGC
    1201 GACGACATCG AGCAGCAGGC CGACAACATG ATCACCGAGA TGCTCCAGAA
    1251 AGAGTACATG GACCGGCAGG GCAAGACACC TCTGGGCCTT GTGGATCTGT
    1301 TCGTGTTCAG CACCAGCTTC TACCTGATCT CTATCTTCCT GCACCTGGTC
    1351 AAGATCCCCA CACACAGACA CATCGTGGGC AAGCCCTGTC CTAAGCCTCA
    1401 CAGACTGAAC CATATGGGCA TCTGTAGCTG CGGCCTGTAC AAACAGCCTG
    1451 GCGTGCCAGT GCGGTGGAAG AGATAA

    [0408] Construct 3:

    [0409] SOSEP-Variant of Construct 1 with N-to-K-Mutation (L-10-SOSEP-NtoK)

    TABLE-US-00019
    Amino acid sequence (SEQ ID NO: 22):
    MGQIVTFFQEVPHVIEEVMNIVLIALSLLAILKGLYNVATCGLIGLVTFL  50
    LLCGRSCSTTLYKGVYELQTLELNMETLNMTMPLSCTKNNSHHYIRVGNE 100
    TGLELTLTNTSIINHKFCNLSDAHKKNLYDHALMSIISTFHLSIPNFNQY 150
    EAMSCDFNGGKISVQYNLSHSYAVDAANHCGTVANGVLQTFMRMAWGGSY 200
    IALDSGCGNWDCIMTSYQYLIIQNTTWEDHCQFSRPSPIGYLGLLSQRTR 250
    DIYISRRRRGTFTWTLSDSEGKETPGGYCLTRWMLIEAELKCFGNTAVAK 300
    CNEKHDEEFCDMLRLFDFNKQAIRRLKAPAQMSIQLINKAVNALINDQLI 350
    MKNHLRDIMCIPYCNYSKYWYLNHTITGKTSLPKCWLVSNGSYLNETHFS 400
    DDIEQQADNMITEMLQKEYMDRQGKTPLGLVDLFVFSTSFYLISIFLHLV 450
    KIPTHRHIVGKPCPKPHRLNHMGICSCGLYKQPGVPVRWKR*
    DNA sequence (SEQ ID NO: 23):
       1 ATGGGCCAGA TCGTGACATT CTTCCAAGAG GTGCCCCACG TGATCGAAGA
      51 AGTGATGAAC ATCGTCCTGA TCGCCCTGAG CCTGCTGGCC ATCCTGAAGG
     101 GCCTGTATAA TGTGGCCACC TGTGGCCTGA TCGGCCTGGT CACATTTCTG
     151 CTGCTGTGCG GCAGAAGCTG CTCCACCACA CTGTATAAGG GCGTGTACGA
     201 GCTGCAAACC CTGGAACTGA ACATGGAAAC CCTGAACATG ACCATGCCTC
     251 TGAGCTGCAC CAAGAACAAC AGCCACCACT ACATCAGAGT GGGCAACGAG
     301 ACAGGCCTCG AGCTGACCCT GACCAACACC AGCATCATCA ACCACAAGTT
     351 CTGCAACCTG AGCGACGCCC ACAAGAAGAA CCTGTACGAT CACGCCCTGA
     401 TGAGCATCAT CTCCACCTTC CACCTGAGCA TCCCCAACTT CAACCAGTAC
     451 GAGGCCATGA GCTGCGACTT CAACGGCGGA AAGATCAGCG TGCAGTACAA
     501 TCTGAGCCAC AGCTATGCCG TGGACGCCGC CAATCATTGT GGAACAGTGG
     551 CCAATGGCGT GCTCCAGACC TTCATGAGAA TGGCCTGGGG CGGCAGCTAT
     601 ATCGCCCTGG ATTCTGGCTG TGGCAACTGG GACTGCATCA TGACCAGCTA
     651 CCAGTACCTG ATCATCCAGA ACACCACCTG GGAAGATCAC TGCCAGTTCA
     701 GCAGACCCTC TCCTATCGGA TACCTGGGCC TGCTGTCCCA GAGAACCCGG
     751 GACATCTACA TCTCTCGGCG GAGAAGAGGC ACCTTCACCT GGACACTGTC
     801 TGATAGCGAG GGCAAAGAGA CACCTGGCGG CTACTGTCTG ACCCGGTGGA
     851 TGCTGATTGA GGCCGAGCTG AAGTGCTTCG GAAATACCGC CGTGGCCAAG
     901 TGCAACGAGA AGCACGACGA GGAATTCTGC GACATGCTGC GGCTGTTCGA
     951 TTTCAACAAG CAGGCCATCA GACGGCTGAA GGCCCCTGCT CAGATGTCCA
    1001 TCCAGCTGAT CAACAAGGCC GTGAATGCCC TGATTAACGA CCAGCTCATC
    1051 ATGAAGAACC ACCTCAGGGA CATCATGTGC ATCCCTTACT GCAACTACAG
    1101 CAAGTACTGG TATCTGAACC ACACCATCAC CGGCAAGACC AGCCTGCCTA
    1151 AGTGCTGGCT GGTGTCCAAC GGCAGCTACC TGAACGAGAC ACACTTCAGC
    1201 GACGACATCG AGCAGCAGGC CGACAACATG ATCACCGAGA TGCTCCAGAA
    1251 AGAGTACATG GACCGGCAGG GCAAGACACC TCTGGGCCTT GTGGATCTGT
    1301 TCGTGTTCAG CACCAGCTTC TACCTGATCT CTATCTTCCT GCACCTGGTC
    1351 AAGATCCCCA CACACAGACA CATCGTGGGC AAGCCCTGTC CTAAGCCTCA
    1401 CAGACTGAAC CATATGGGCA TCTGTAGCTG CGGCCTGTAC AAACAGCCTG
    1451 GCGTGCCAGT GCGGTGGAAG AGATAA

    [0410] Construct 4:

    [0411] FLEP-Variant of Construct 1 (L-10-FLEP)

    TABLE-US-00020
    Amino acid sequence (SEQ ID NO: 24):
    MGQIVTFFQEVPHVIEEVMNIVLIALSLLAILKGLYNVATCGLIGLVTFL  50
    LLCGRSCSTTLYKGVYELQTLELNMETLNMTMPLSCTKNNSHHYIRVGNE 100
    TGLELTLTNTSIINHKFCNLSDAHKKNLYDHALMSIISTFHLSIPNFNQY 150
    EAMSCDFNGGKISVQYNLSHSYAVDAANHCGTVANGVLQTFMRMAWGGSY 200
    IALDSGRGNWDCIMTSYQYLIIQNTTWEDHCQFSRPSPIGYLGLLSQRTR 250
    DIYISGGGGSGGGGSGTFTWTLSDSEGNETPGGYCLTRWMLIEAELKCFG 300
    NTAVAKCNEKHDEEFCDMLRLFDFNKQAIRRLKAPAQMSIQLINKAVNAL 350
    INDQLIMKNHLRDIMGIPYCNYSKYWYLNHTITGKTSLPKCWLVSNGSYL 400
    NETHFSDDIEQQADNMITEMLQKEYMDRQGKTPLGLVDLFVFSTSFYLIS 450
    IFLHLVKIPTHRHIVGKPCPKPHRLNHMGICSCGLYKQPGVPVRWKR*
    DNA sequence (SEQ ID NO: 25):
       1 ATGGGCCAGA TCGTGACATT CTTCCAAGAG GTGCCCCACG TGATCGAAGA
      51 AGTGATGAAC ATCGTCCTGA TCGCCCTGAG CCTGCTGGCC ATCCTGAAGG
     101 GCCTGTATAA TGTGGCCACC TGTGGCCTGA TCGGCCTGGT CACATTTCTG
     151 CTGCTGTGCG GCAGAAGCTG CTCCACCACA CTGTATAAGG GCGTGTACGA
     201 GCTGCAAACC CTGGAACTGA ACATGGAAAC CCTGAACATG ACCATGCCTC
     251 TGAGCTGCAC CAAGAACAAC AGCCACCACT ACATCAGAGT GGGCAACGAG
     301 ACAGGCCTCG AGCTGACCCT GACCAACACC AGCATCATCA ACCACAAGTT
     351 CTGCAACCTG AGCGACGCCC ACAAGAAGAA CCTGTACGAT CACGCCCTGA
     401 TGAGCATCAT CTCCACCTTC CACCTGAGCA TCCCCAACTT CAACCAGTAC
     451 GAGGCCATGA GCTGCGACTT CAACGGCGGA AAGATCAGCG TGCAGTACAA
     501 TCTGAGCCAC AGCTATGCCG TGGACGCCGC CAATCATTGT GGAACAGTGG
     551 CCAATGGCGT GCTCCAGACC TTCATGAGAA TGGCCTGGGG CGGCAGCTAT
     601 ATCGCCCTGG ATTCTGGCAG AGGCAACTGG GACTGCATCA TGACCAGCTA
     651 CCAGTACCTG ATCATCCAGA ACACCACCTG GGAAGATCAC TGCCAGTTCA
     701 GCAGACCCTC TCCTATCGGA TACCTGGGCC TGCTGTCCCA GAGAACCCGG
     751 GACATCTACA TCTCTGGCGG CGGAGGATCT GGCGGAGGTG GAAGTGGCAC
     801 CTTCACCTGG ACACTGTCTG ATAGCGAGGG CAATGAGACA CCTGGCGGCT
     851 ACTGTCTGAC CCGGTGGATG CTGATTGAGG CCGAGCTGAA GTGCTTCGGA
     901 AATACCGCCG TGGCCAAGTG CAACGAGAAG CACGACGAGG AATTCTGCGA
     951 CATGCTGCGG CTGTTCGATT TCAACAAGCA GGCCATCAGA CGGCTGAAGG
    1001 CCCCTGCTCA GATGTCCATC CAGCTGATCA ACAAGGCCGT GAATGCCCTG
    1051 ATTAACGACC AGCTCATCAT GAAGAACCAC CTCAGGGACA TCATGGGCAT
    1101 CCCTTACTGC AACTACAGCA AGTACTGGTA TCTGAACCAC ACCATCACCG
    1151 GCAAGACCAG CCTGCCTAAG TGCTGGCTGG TGTCCAACGG CAGCTACCTG
    1201 AACGAGACAC ACTTCAGCGA CGACATCGAG CAGCAGGCCG ACAACATGAT
    1251 CACCGAGATG CTCCAGAAAG AGTACATGGA CCGGCAGGGC AAGACACCTC
    1301 TGGGCCTTGT GGATCTGTTC GTGTTCAGCA CCAGCTTCTA CCTGATCTCT
    1351 ATCTTCCTGC ACCTGGTCAA GATCCCCACA CACAGACACA TCGTGGGCAA
    1401 GCCCTGTCCT AAGCCTCACA GACTGAACCA TATGGGCATC TGTAGCTGCG
    1451 GCCTGTACAA ACAGCCTGGC GTGCCAGTGC GGTGGAAGAG ATAA

    [0412] Construct 5:

    [0413] FLEP-Variant of Construct 1 with N-to-K-Mutation (L-10-FLEP-NtoK)

    TABLE-US-00021
    Amino acid sequence (SEQ ID NO: 26):
    MGQIVTFFQEVPHVIEEVMNIVLIALSLLAILKGLYNVATCGLIGLVTFL  50
    LLCGRSCSTTLYKGVYELQTLELNMETLNMTMPLSCTKNNSHHYIRVGNE 100
    TGLELTLTNTSIINHKFCNLSDAHKKNLYDHALMSIISTFHLSIPNFNQY 150
    EAMSCDFNGGKISVQYNLSHSYAVDAANHCGTVANGVLQTFMRMAWGGSY 200
    IALDSGRGNWDCIMTSYQYLIIQNTTWEDHCQFSRPSPIGYLGLLSQRTR 250
    DIYISGGGGSGGGGSGTFTWTLSDSEGKETPGGYCLTRWMLIEAELKCFG 300
    NTAVAKCNEKHDEEFCDMLRLFDFNKQAIRRLKAPAQMSIQLINKAVNAL 350
    INDQLIMKNHLRDIMGIPYCNYSKYWYLNHTITGKTSLPKCWLVSNGSYL 400
    NETHFSDDIEQQADNMITEMLQKEYMDRQGKTPLGLVDLFVFSTSFYLIS 450
    IFLHLVKIPTHRHIVGKPCPKPHRLNHMGICSCGLYKQPGVPVRWKR*
    DNA sequence (SEQ ID NO: 27):
       1 ATGGGCCAGA TCGTGACATT CTTCCAAGAG GTGCCCCACG TGATCGAAGA
      51 AGTGATGAAC ATCGTCCTGA TCGCCCTGAG CCTGCTGGCC ATCCTGAAGG
     101 GCCTGTATAA TGTGGCCACC TGTGGCCTGA TCGGCCTGGT CACATTTCTG
     151 CTGCTGTGCG GCAGAAGCTG CTCCACCACA CTGTATAAGG GCGTGTACGA
     201 GCTGCAAACC CTGGAACTGA ACATGGAAAC CCTGAACATG ACCATGCCTC
     251 TGAGCTGCAC CAAGAACAAC AGCCACCACT ACATCAGAGT GGGCAACGAG
     301 ACAGGCCTCG AGCTGACCCT GACCAACACC AGCATCATCA ACCACAAGTT
     351 CTGCAACCTG AGCGACGCCC ACAAGAAGAA CCTGTACGAT CACGCCCTGA
     401 TGAGCATCAT CTCCACCTTC CACCTGAGCA TCCCCAACTT CAACCAGTAC
     451 GAGGCCATGA GCTGCGACTT CAACGGCGGA AAGATCAGCG TGCAGTACAA
     501 TCTGAGCCAC AGCTATGCCG TGGACGCCGC CAATCATTGT GGAACAGTGG
     551 CCAATGGCGT GCTCCAGACC TTCATGAGAA TGGCCTGGGG CGGCAGCTAT
     601 ATCGCCCTGG ATTCTGGCAG AGGCAACTGG GACTGCATCA TGACCAGCTA
     651 CCAGTACCTG ATCATCCAGA ACACCACCTG GGAAGATCAC TGCCAGTTCA
     701 GCAGACCCTC TCCTATCGGA TACCTGGGCC TGCTGTCCCA GAGAACCCGG
     751 GACATCTACA TCTCTGGCGG CGGAGGATCT GGCGGAGGTG GAAGTGGCAC
     801 CTTCACCTGG ACACTGTCTG ATAGCGAGGG CAAAGAGACA CCTGGCGGCT
     851 ACTGTCTGAC CCGGTGGATG CTGATTGAGG CCGAGCTGAA GTGCTTCGGA
     901 AATACCGCCG TGGCCAAGTG CAACGAGAAG CACGACGAGG AATTCTGCGA
     951 CATGCTGCGG CTGTTCGATT TCAACAAGCA GGCCATCAGA CGGCTGAAGG
    1001 CCCCTGCTCA GATGTCCATC CAGCTGATCA ACAAGGCCGT GAATGCCCTG
    1051 ATTAACGACC AGCTCATCAT GAAGAACCAC CTCAGGGACA TCATGGGCAT
    1101 CCCTTACTGC AACTACAGCA AGTACTGGTA TCTGAACCAC ACCATCACCG
    1151 GCAAGACCAG CCTGCCTAAG TGCTGGCTGG TGTCCAACGG CAGCTACCTG
    1201 AACGAGACAC ACTTCAGCGA CGACATCGAG CAGCAGGCCG ACAACATGAT
    1251 CACCGAGATG CTCCAGAAAG AGTACATGGA CCGGCAGGGC AAGACACCTC
    1301 TGGGCCTTGT GGATCTGTTC GTGTTCAGCA CCAGCTTCTA CCTGATCTCT
    1351 ATCTTCCTGC ACCTGGTCAA GATCCCCACA CACAGACACA TCGTGGGCAA
    1401 GCCCTGTCCT AAGCCTCACA GACTGAACCA TATGGGCATC TGTAGCTGCG
    1451 GCCTGTACAA ACAGCCTGGC GTGCCAGTGC GGTGGAAGAG ATAA

    EXAMPLE 15

    [0414] Lassa Virus Nucleoprotein

    [0415] This example describes a Lassa virus nucleoprotein ancestral sequence produced using a method according to an embodiment of the invention.

    [0416] Construct 6:

    [0417] Lassa Virus Nucleoprotein Ancestral Sequence of Nigerian Lassa Isolates (L-NP-1=L-NP-CovAnc-1 N)

    TABLE-US-00022
    Amino acid sequence (SEQ ID NO: 28):
    MSASKEVKSFLWTQSLRRELSGYCSNIKLQVVKDAQALLHGLDFSEVSNV  50
    QRLMRKQKRDDSDLKRLRDLNQAVNNLVELKSTQQKSILRVGTLTSDDLL 100
    TLAADLEKLKSKVIRTERPLSSGVYMGNLSTQQLEQRRALLNMIGMVGGA 150
    QGTQPGRDGVVRVWDVKNPDLLNNQFGTMPSLTLACLTKQGQVDLNDAVL 200
    ALTDLGLIYTAKYPNSSDLDRLSQSHPILNMVDTKKSSLNISGYNFSLGA 250
    AVKAGACMLDGGNMLETIKVTPQTMDGILKSILKVKKSLGMFVSDTPGER 300
    NPYENILYKICLSGDGWPYIASRTSIVGRAWENTTVDLESDGKPQKVGTA 350
    GSNKSLQSAGFPTGLTYSQLMTLKDSMMQLDPSAKTWIDIEGRPEDPVEI 400
    ALYQPMSGCYIHFFREPTDLKQFKQDAKYSHGIDVADLFPAQPGLTSAVI 450
    EALPRNMVLTCQGSDDIKRLLDSQGRRDIKLIDIALSKADSRRFENAVWD 500
    QCKDLCHMHTGVVVEKKKRGGKEEITPHCALMDCIMYDAAVSGGLNIPVL 550
    RAVLPRDMVFRTSSPKVVL*
    DNA sequence (SEQ ID NO: 29):
       1 ATGAGCGCCA GCAAAGAAGT GAAAAGCTTC CTCTGGACCC AGAGCCTGCG
      51 GAGAGAGCTG TCTGGCTACT GCTCCAACAT CAAGCTCCAG GTGGTCAAGG
     101 ACGCCCAGGC TCTGCTGCAT GGCCTGGATT TCAGCGAGGT GTCCAACGTG
     151 CAGCGGCTGA TGAGAAAGCA GAAGCGGGAC GACAGCGACC TGAAGAGACT
     201 GAGGGATCTG AACCAGGCCG TGAACAACCT GGTGGAACTG AAGTCTACCC
     251 AGCAGAAATC CATCCTGAGA GTGGGCACCC TGACCAGCGA CGATCTGCTG
     301 ACACTGGCCG CCGATCTGGA AAAGCTGAAG TCCAAAGTGA TCCGGACCGA
     351 GAGGCCACTG TCTAGCGGAG TGTACATGGG CAACCTGAGC ACCCAGCAGC
     401 TGGAACAGAG AAGGGCCCTG CTGAACATGA TCGGCATGGT TGGAGGCGCC
     451 CAGGGAACAC AGCCTGGAAG AGATGGTGTC GTCAGAGTGT GGGACGTGAA
     501 GAACCCCGAC CTGCTCAACA ACCAGTTCGG CACCATGCCT TCTCTGACCC
     551 TGGCCTGCCT GACAAAGCAG GGCCAAGTGG ACCTGAACGA TGCCGTGCTG
     601 GCTCTGACTG ATCTGGGCCT GATCTACACC GCCAAGTATC CCAACAGCTC
     651 CGACCTGGAC AGGCTGAGCC AGTCTCACCC CATCCTGAAC ATGGTGGACA
     701 CCAAGAAGTC CAGCCTGAAC ATCAGCGGCT ACAACTTCTC TCTGGGCGCT
     751 GCCGTGAAAG CCGGCGCTTG TATGCTTGAC GGCGGCAACA TGCTGGAAAC
     801 CATCAAAGTG ACCCCTCAGA CCATGGACGG CATCCTGAAA AGTATCCTGA
     851 AAGTGAAGAA ATCCCTGGGC ATGTTCGTGT CCGACACACC CGGCGAGAGA
     901 AACCCCTACG AGAACATCCT GTACAAGATT TGCCTGAGCG GCGACGGCTG
     951 GCCCTATATC GCCAGCAGAA CATCTATCGT GGGCAGAGCT TGGGAGAACA
    1001 CCACCGTGGA CCTGGAATCC GATGGCAAGC CTCAGAAAGT GGGCACAGCC
    1051 GGCAGCAACA AGAGCCTCCA GTCTGCCGGA TTTCCTACCG GCCTGACATA
    1101 CAGCCAGCTG ATGACCCTGA AGGACAGCAT GATGCAGCTG GACCCTAGCG
    1151 CCAAGACCTG GATCGACATT GAGGGCAGAC CCGAGGATCC CGTGGAAATC
    1201 GCTCTGTACC AGCCTATGAG CGGCTGCTAT ATCCACTTCT TCAGAGAGCC
    1251 CACCGATCTG AAGCAGTTCA AGCAGGACGC CAAGTACAGC CACGGAATCG
    1301 ACGTGGCCGA TCTGTTCCCA GCTCAGCCAG GACTGACATC CGCCGTGATT
    1351 GAAGCCCTGC CTAGAAACAT GGTGCTGACC TGTCAGGGCA GCGACGACAT
    1401 CAAGAGACTG CTGGACAGCC AGGGCAGAAG AGATATCAAG CTGATCGATA
    1451 TCGCCCTGAG CAAGGCCGAC TCTCGGAGAT TCGAAAACGC CGTGTGGGAC
    1501 CAGTGCAAGG ACCTGTGTCA CATGCACACA GGCGTGGTGG TGGAAAAGAA
    1551 GAAGCGCGGA GGCAAAGAGG AAATCACCCC TCACTGCGCC CTGATGGACT
    1601 GCATTATGTA TGACGCCGCC GTGTCTGGCG GCCTGAATAT CCCTGTTCTG
    1651 AGAGCCGTGC TGCCCCGCGA CATGGTGTTT AGAACAAGCA GCCCCAAGGT
    1701 GGTGCTCTGA

    EXAMPLE 16

    [0418] Lassa Virus Nucleoprotein

    [0419] This example describes a Lassa virus nucleoprotein ancestral sequence produced using a method according to an embodiment of the invention.

    [0420] Construct 7:

    [0421] Lassa Virus Nucleoprotein Ancestral Sequence of Sierra Leone Isolates (L-NP-1=L-NP-CovAnc-2 SL)

    TABLE-US-00023
    Amino acid sequence (SEQ ID NO: 30):
    MSASKEIKSFLWTQSLRRELSGYCSNIKLQVVKDAQALLHGLDFSEVSNV  50
    QRLMRKERRDDNDLKRLRDLNQAVNNLVELKSTQQKSILRVGTLTSDDLL 100
    ILAADLEKLKSKVTRTERPLSAGVYMGNLSSQQLDQRRALLNMIGMSGGN 150
    QGARAGRDGVVRVWDVKNAELLNNQFGTMPSLTLACLTKQGQVDLNDAVQ 200
    ALTDLGLIYTAKYPNTSDLDRLTQSHPILNMIDTKKSSLNISGYNFSLGA 250
    AVKAGACMLDGGNMLETIKVSPQTMDGILKSILKVKKALGMFISDTPGER 300
    NPYENILYKICLSGDGWPYIASRTSITGRAWENTVVDLESDGKPQKAGSN 350
    NSNKSLQSAGFTAGLTYSQLMTLKDAMLQLDPNAKTWMDIEGRPEDPVEI 400
    ALYQPSSGCYIHFFREPTDLKQFKQDAKYSHGIDVTDLFAAQPGLTSAVI 450
    DALPRNMVITCQGSDDIRKLLESQGRKDIKLIDIALSKTDSRKYENAVWD 500
    QYKDLCHMHTGVVVEKKKRGGKEEITPHCALMDCIMFDAAVSGGLNTSVL 550
    RAVLPRDMVFRTSTPRVVL*
    DNA sequence (SEQ ID NO: 31):
       1 ATGAGCGCCA GCAAAGAGAT CAAGAGCTTC CTGTGGACCC AGAGCCTGCG
      51 GAGAGAGCTG TCTGGCTACT GCTCCAACAT CAAGCTCCAG GTGGTCAAGG
     101 ACGCCCAGGC TCTGCTGCAT GGCCTGGATT TCAGCGAGGT GTCCAACGTG
     151 CAGCGGCTGA TGCGGAAAGA GAGAAGGGAC GACAACGACC TGAAGCGGCT
     201 GAGGGATCTG AACCAGGCCG TGAACAACCT GGTGGAACTG AAGTCTACCC
     251 AGCAGAAATC CATCCTGAGA GTGGGCACCC TGACCAGCGA CGATCTGCTG
     301 ATTCTGGCCG CCGACCTGGA AAAGCTGAAG TCCAAAGTGA CCCGGACCGA
     351 GAGGCCACTG TCTGCTGGTG TCTACATGGG CAACCTGAGC AGCCAGCAGC
     401 TGGATCAGAG AAGGGCCCTG CTGAACATGA TCGGCATGAG CGGCGGAAAT
     451 CAGGGCGCTA GAGCTGGCAG AGATGGCGTC GTCAGAGTGT GGGACGTGAA
     501 GAATGCCGAG CTGCTCAACA ACCAGTTCGG CACCATGCCT AGCCTGACAC
     551 TGGCCTGCCT GACAAAGCAG GGCCAAGTGG ACCTGAACGA TGCTGTGCAG
     601 GCCCTGACTG ATCTGGGCCT GATCTACACC GCCAAGTATC CCAACACCAG
     651 CGACCTGGAC AGACTGACCC AGTCTCACCC CATCCTGAAT ATGATCGACA
     701 CCAAGAAGTC CAGCCTGAAC ATCAGCGGCT ACAACTTCTC TCTGGGCGCT
     751 GCCGTGAAAG CCGGCGCTTG TATGCTTGAC GGCGGCAACA TGCTGGAAAC
     801 CATCAAGGTG TCCCCACAGA CCATGGACGG CATCCTGAAA AGTATCCTGA
     851 AAGTGAAGAA AGCCCTGGGC ATGTTCATCA GCGACACCCC TGGCGAGAGA
     901 AACCCCTACG AGAACATCCT GTACAAGATT TGCCTGAGCG GCGACGGCTG
     951 GCCCTATATC GCCAGCAGAA CCAGCATTAC CGGCAGAGCT TGGGAGAACA
    1001 CCGTGGTGGA TCTGGAAAGC GACGGCAAGC CTCAGAAGGC CGGCAGCAAC
    1051 AACTCCAACA AGAGCCTCCA GTCCGCCGGC TTCACAGCCG GCCTGACATA
    1101 TAGCCAGCTG ATGACCCTGA AGGACGCCAT GCTGCAACTG GACCCCAATG
    1151 CCAAGACCTG GATGGACATC GAGGGCAGAC CTGAGGACCC TGTGGAAATC
    1201 GCCCTGTACC AGCCTAGCTC CGGCTGCTAT ATCCACTTCT TCAGAGAGCC
    1251 CACCGATCTG AAGCAGTTCA AGCAGGACGC CAAGTACAGC CACGGCATCG
    1301 ACGTGACCGA TCTGTTTGCT GCTCAGCCCG GACTGACCTC CGCCGTGATT
    1351 GATGCCCTGC CTCGGAACAT GGTCATCACC TGTCAGGGCA GCGACGACAT
    1401 CCGGAAGCTG CTGGAATCTC AGGGCAGAAA GGATATCAAG CTGATCGATA
    1451 TCGCCCTGAG CAAGACCGAC AGCCGGAAGT ACGAAAACGC CGTGTGGGAC
    1501 CAGTACAAGG ACCTGTGCCA CATGCACACA GGCGTGGTGG TGGAAAAGAA
    1551 GAAGCGCGGA GGCAAAGAGG AAATCACCCC TCACTGCGCT CTGATGGACT
    1601 GCATCATGTT TGACGCCGCC GTGTCTGGCG GCCTGAATAC CTCTGTTCTG
    1651 AGAGCCGTGC TGCCCAGAGA CATGGTGTTC AGAACAAGCA CCCCTAGAGT
    1701 GGTGCTCTGA