VACCINES AND METHODS
20220040284 · 2022-02-10
Inventors
- Jonathan Luke HEENEY (Cambridge, Cambridgeshire, GB)
- Simon FROST (Cambridge, Cambridgeshire, GB)
- Ralf WAGNER (Regensburg, DE)
- Benedikt ASBACH (Regensburg, DE)
- Rebecca KINSLEY (Cambridge, Cambridgeshire, GB)
- Edward WRIGHT (London, Greater London, GB)
CPC classification (section C: CHEMISTRY; METALLURGY)
- C12N2760/10034
- C12N7/00
- C12N2760/14222
- C12N2760/16134
- C12N2760/14134
- C12N2760/16122
- C07K2317/33
- C12N2760/14122
- C07K2317/76
- C12N2760/14034
- C12N2760/14022
- C12N2760/14234
Abstract
Methods for identifying optimized antigenic pathogen polypeptides capable of inducing a broadly neutralizing immune response, and associated T-cell responses, to a pathogen are described, as well as nucleic acid sequences encoding such polypeptides. Methods for determining whether a broadly neutralizing immune response is induced in a subject following immunization with an optimized antigenic pathogen polypeptide, or a nucleic acid encoding the optimized pathogen polypeptide, are also described. Nucleic acid molecules, polypeptides, vectors, cells, fusion proteins, pharmaceutical compositions, and their use as vaccines against pathogens, especially against emerging or re-emerging pathogens (particularly RNA viruses), are also described.
Claims
1. A method for identifying a lead candidate optimized antigenic pathogen polypeptide capable of inducing a broadly neutralizing immune response to a pathogen, which comprises: i) providing a polypeptide library comprising a plurality of different candidate optimized antigenic pathogen polypeptides, wherein the amino acid sequence of each different candidate has been optimized from a plurality of different amino acid sequences of a pathogen polypeptide and is different from each different amino acid sequence of the pathogen polypeptide, wherein each different amino acid sequence of the pathogen polypeptide comprises amino acid sequence of a polypeptide of a different isolate, and wherein each different isolate is an isolate of a pathogen of the same family as the pathogen to which it is desired to induce a broadly neutralizing immune response; ii) screening the candidate optimized antigenic pathogen polypeptides of the polypeptide library for binding by one or more broadly neutralizing antigen-binding molecules, each of which is able to bind and/or neutralize a pathogen of the same family as the pathogen to which it is desired to induce a broadly neutralizing immune response; and iii) identifying a candidate optimized antigenic pathogen polypeptide that is bound by one or more of the antigen-binding molecules in step (ii) as being a lead candidate optimized antigenic pathogen polypeptide capable of inducing a broadly neutralizing immune response to the pathogen.
2. A method according to claim 1, wherein the one or more broadly neutralizing antigen-binding molecules include an antibody that has been obtained, or derived from an antibody that has been obtained, from a subject that has been exposed to a pathogen of the same family as the pathogen to which it is desired to induce a broadly neutralizing immune response.
3. A method according to claim 1 or 2, wherein the one or more broadly neutralizing antigen-binding molecules include non-antibody antigen-binding proteins.
4. A method according to claim 3, wherein the one or more broadly neutralizing antigen-binding molecules include a designed ankyrin repeat protein (DARPin), an anticalin, an aptamer, or a T-cell receptor molecule.
5. A method according to any preceding claim, wherein the candidate optimized antigenic pathogen polypeptides of the polypeptide library have been expressed in, or on the surface of, mammalian cells.
6. A method according to any of claims 1 to 4, wherein the candidate optimized antigenic pathogen polypeptides of the polypeptide library have been expressed in, or on the surface of, bacterial, yeast, or insect cells.
7. A method according to any preceding claim, wherein the pathogen is a virus, the candidate optimized antigenic pathogen polypeptides are candidate optimized antigenic virus polypeptides, and the pathogen polypeptides are virus polypeptides.
8. A method according to claim 7, wherein the polypeptide library is a viral pseudotype library comprising a plurality of different viral pseudotypes, each different viral pseudotype comprising a different candidate optimized virus polypeptide.
9. A method according to claim 8, wherein in step (ii) the candidate optimized antigenic virus polypeptides are screened for binding by one or more of the antigen-binding molecules by screening the viral pseudotypes for binding and/or neutralization by one or more of the antigen-binding molecules.
10. A method according to any of claims 1 to 7, wherein the candidate optimized antigenic pathogen polypeptides are screened for binding by the one or more antigen-binding molecules by a flow cytometric assay.
11. A method according to any preceding claim, which further comprises generating the polypeptide library.
12. A method according to claim 11, wherein the polypeptide library is generated by expressing the different candidate optimized antigenic pathogen polypeptides from a nucleic acid library comprising a plurality of different nucleic acids, each different nucleic acid comprising a nucleotide sequence encoding a different candidate optimized antigenic pathogen polypeptide of the polypeptide library.
13. A method according to claim 12, wherein the different candidate optimized pathogen polypeptides are expressed in, or on the surface of, mammalian cells.
14. A method according to claim 12 or 13, wherein the nucleotide sequence of each different nucleic acid of the nucleic acid library is codon-optimized, optionally gene-optimized, for expression of the encoded polypeptide in a mammalian cell.
15. A method according to any of claims 12 to 14, wherein each different nucleic acid of the nucleic acid library is part of an expression vector for expression of the nucleic acid in a mammalian cell.
16. A method according to any of claims 12 to 15, wherein the pathogen is a virus, the candidate optimized antigenic pathogen polypeptides are candidate optimized antigenic virus polypeptides, and the pathogen polypeptides are virus polypeptides.
17. A method according to claim 16, wherein the nucleic acid library is a viral pseudotype vector library, and each different nucleic acid of the library is part of an expression vector for production of a viral pseudotype comprising the encoded virus polypeptide, and the polypeptide library is a viral pseudotype library generated by producing viral pseudotypes from the expression vectors of the viral pseudotype vector library, wherein the viral pseudotype library comprises a plurality of different viral pseudotypes, each different viral pseudotype comprising a different candidate optimized virus polypeptide encoded by a different nucleic acid sequence of the viral pseudotype vector library.
18. A method according to any of claims 15 to 17, wherein the expression vector is also a vaccine vector.
19. A method according to claim 18, wherein the vaccine vector is a viral vaccine vector, a bacterial vaccine vector, an RNA vaccine vector, or a DNA vaccine vector.
20. A method according to claim 18 or 19, wherein the vaccine vector is based on a viral delivery vector, such as a poxvirus (e.g. MVA, NYVAC, AVIPOX), herpesvirus (e.g. HSV, CMV, Adenovirus of any host species), Morbillivirus (e.g. measles), Alphavirus (e.g. SFV, Sendai), Flavivirus (e.g. Yellow Fever), or Rhabdovirus (e.g. VSV)-based viral delivery vector, a bacterial delivery vector (e.g. Salmonella, E. coli), an RNA expression vector, or a DNA expression vector.
21. A method according to any of claims 15 to 20, wherein the vector is a pEVAC-based expression vector.
22. A method according to claim 12, wherein the different candidate optimized antigenic pathogen polypeptides are expressed in, or on the surface of, bacterial, yeast, or insect cells.
23. A method according to any of claims 12 to 22, which further comprises generating the nucleic acid library by synthesising a plurality of different nucleic acids, each different nucleic acid comprising a different nucleotide sequence encoding a different candidate optimized antigenic pathogen polypeptide.
24. A method according to claim 23, which further comprises: i) obtaining amino acid sequences of the pathogen polypeptide, and/or nucleotide sequences encoding the pathogen polypeptide, of the different pathogen isolates; and ii) generating a plurality of different nucleotide sequences, each different nucleotide sequence encoding a different candidate optimized antigenic pathogen polypeptide, wherein the encoded amino acid sequence of each different candidate optimized antigenic pathogen polypeptide is optimized from the obtained amino acid sequences or encoded amino acid sequences of the pathogen polypeptide, and is different from each of the obtained amino acid sequences or encoded amino acid sequences.
25. A method according to claim 24, wherein generation of the plurality of different nucleotide sequences in step (ii) of claim 24 comprises: carrying out a multiple sequence alignment of the amino acid or nucleotide sequences obtained in step (i) of claim 24; identifying from the multiple sequence alignment amino acid sequence or encoded amino acid sequence that is highly conserved between the polypeptides of the different pathogen isolates; and generating a plurality of different nucleotide sequences, each different nucleotide sequence encoding a different candidate optimized antigenic pathogen polypeptide, wherein one or more of the different nucleotide sequences includes sequence encoding a highly conserved amino acid sequence or encoded amino acid sequence identified from the multiple sequence alignment.
26. A method according to claim 25, which further comprises: identifying from the multiple sequence alignment amino acid sequence or encoded amino acid sequence that is ancestral amino acid sequence; and including in one or more of the different generated nucleotide sequences sequence encoding an ancestral amino acid sequence identified from the multiple sequence alignment.
27. A method according to any of claims 24 to 26, which includes codon-optimization, optionally gene-optimization, of the different generated nucleotide sequences for optimal expression of the encoded candidate optimized antigenic pathogen polypeptides in an expression system.
28. A method according to claim 27, wherein the expression system comprises a mammalian cell.
29. A method according to claim 27, wherein the expression system comprises a yeast, bacterial, or insect cell.
30. A method according to any of claims 24 to 29, which includes optimizing the different nucleotide sequences for antigenicity of the encoded candidate optimized antigenic pathogen polypeptides.
31. A method according to claim 30, wherein the antigenicity optimization includes any of the following: deletion or modification of nucleic acid sequence encoding amino acid sequence that inhibits production and/or function of anti-pathogen polypeptide antibody (for example, deletion or modification of a mucin-like domain); region swapping to recover one or more potential lost encoded epitopes; site-specific mutation, for example of N-linked glycosylation sites; changes to enhance stability (e.g. disulphide bond formation, reduce degradation of the encoded polypeptide by a serine protease); removal of glycans; insertion of nucleic acid sequence, for example to insert nucleic acid sequence encoding a desired epitope.
32. A method according to any preceding claim, wherein the one or more broadly neutralizing antigen-binding molecules recited in step (ii) of claim 1 include a broadly neutralizing antibody, preferably a broadly neutralizing monoclonal antibody (BNmAb).
33. A method according to any preceding claim, wherein the one or more antigen-binding molecules recited in step (ii) of claim 1 include an antibody obtained, or derived from an antibody obtained, from a subject that has survived an outbreak of a pathogen of the same family, optionally of the same subtype or type, as the pathogen to which it is desired to induce a broadly neutralizing immune response.
34. A method according to claim 33, wherein the subject from which the antibody has been obtained or derived is a human or non-human mammalian subject.
35. A method according to claim 33 or 34, wherein the one or more antigen-binding molecules include a broadly neutralizing monoclonal antibody (BNmAb).
36. A method according to any preceding claim, wherein the different pathogen isolates include different pathogen isolates from an outbreak of a pathogen of the same subtype as the pathogen to which it is desired to induce a broadly neutralizing immune response.
37. A method according to any preceding claim, wherein the different pathogen isolates include different pathogen isolates from an outbreak of a pathogen of a different subtype, but the same type, as the pathogen to which it is desired to induce a broadly neutralizing immune response.
38. A method according to any preceding claim, wherein the different pathogen isolates include different pathogen isolates from an outbreak of a pathogen of a different group, but the same family, as the pathogen to which it is desired to induce a broadly neutralizing immune response.
39. A method according to any preceding claim, wherein the different pathogen isolates include different prior pathogen isolates of a pathogen of the same subtype, type, or family as the pathogen to which it is desired to induce a broadly neutralizing immune response.
40. A method according to any preceding claim, wherein each candidate optimized antigenic pathogen polypeptide comprises at least 20 amino acid residues.
41. A method according to any preceding claim, wherein the pathogen is a virus.
42. A method according to claim 41, wherein the virus is an RNA virus.
43. A method according to claim 41 or 42, wherein the virus is an emerging or re-emerging RNA virus.
44. A method according to any of claims 41 to 43, wherein the virus is a Filovirus, an Arenavirus, or an Orthomyxovirus.
45. A method according to any of claims 41 to 43, wherein the virus is Ebola virus or Marburg virus.
46. A method according to any of claims 41 to 43, wherein the virus is Lassa virus.
47. A method according to any preceding claim, wherein the pathogen polypeptide is a viral glycoprotein.
48. A method according to any preceding claim, which is an in vitro method.
49. A method of identifying a nucleic acid sequence encoding an optimized antigenic pathogen polypeptide capable of inducing a broadly neutralizing immune response to a pathogen, which comprises: i) immunizing a human, or a non-human animal, with a nucleic acid comprising a nucleic acid sequence encoding a lead candidate optimized antigenic pathogen polypeptide identified by a method according to any preceding claim; ii) determining whether a broadly neutralizing immune response is induced in the human or non-human animal following the immunization in step (i); and iii) identifying the nucleic acid sequence as a nucleic acid sequence encoding an optimized antigenic pathogen polypeptide capable of inducing a broadly neutralizing immune response to the pathogen if it is determined from step (ii) that a broadly neutralizing immune response is induced in the human or non-human animal.
50. A method according to claim 49, which comprises determining whether a broadly neutralizing immune response is induced in the human or non-human animal by determining whether antibody in serum obtained from the human or non-human animal binds to and/or neutralizes more than one pathogen subtype.
51. A method according to claim 49 or 50, wherein the non-human animal is a mammal.
52. A method according to claim 51, wherein the mammal is a guinea pig, or a mouse.
53. A method according to claim 49 or 50, wherein the non-human animal is avian.
54. An isolated nucleic acid molecule, comprising a nucleic acid sequence that is: i) at least 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identical with SEQ ID NO:1, or identical with SEQ ID NO:1; ii) at least 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identical with SEQ ID NO:2, or identical with SEQ ID NO:2; iii) at least 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identical with SEQ ID NO:4, or identical with SEQ ID NO:4; iv) at least 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identical with SEQ ID NO:5, or identical with SEQ ID NO:5; v) at least 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identical with SEQ ID NO:7, or identical with SEQ ID NO:7; or vi) at least 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identical with SEQ ID NO:8, or identical with SEQ ID NO:8; or the complement thereof.
55. An isolated nucleic acid molecule, comprising a nucleic acid sequence that is: i) at least 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identical with SEQ ID NO:10, or identical with SEQ ID NO:10; ii) at least 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identical with SEQ ID NO:12, or identical with SEQ ID NO:12; or iii) at least 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identical with SEQ ID NO:14, or identical with SEQ ID NO:14; or the complement thereof.
56. An isolated nucleic acid molecule, comprising a nucleic acid sequence that is: i) at least 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identical with SEQ ID NO:19, or identical with SEQ ID NO:19; ii) at least 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identical with SEQ ID NO:21, or identical with SEQ ID NO:21; iii) at least 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identical with SEQ ID NO:23, or identical with SEQ ID NO:23; iv) at least 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identical with SEQ ID NO:25, or identical with SEQ ID NO:25; v) at least 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identical with SEQ ID NO:27, or identical with SEQ ID NO:27; vi) at least 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identical with SEQ ID NO:29, or identical with SEQ ID NO:29; or vii) at least 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identical with SEQ ID NO:31, or identical with SEQ ID NO:31; or the complement thereof.
57. An isolated polypeptide, comprising an amino acid sequence that is: i) at least 95%, 96%, 97%, 98%, or 99% identical with an amino acid sequence encoded by SEQ ID NO:1, or identical with the amino acid sequence encoded by SEQ ID NO:1; ii) at least 95%, 96%, 97%, 98%, or 99% identical with an amino acid sequence encoded by SEQ ID NO:2, or identical with the amino acid sequence encoded by SEQ ID NO:2; iii) at least 95%, 96%, 97%, 98%, or 99% identical with an amino acid sequence encoded by SEQ ID NO:4, or identical with the amino acid sequence encoded by SEQ ID NO:4; iv) at least 95%, 96%, 97%, 98%, or 99% identical with an amino acid sequence encoded by SEQ ID NO:5, or identical with the amino acid sequence encoded by SEQ ID NO:5; v) at least 95%, 96%, 97%, 98%, or 99% identical with an amino acid sequence encoded by SEQ ID NO:7, or identical with the amino acid sequence encoded by SEQ ID NO:7; vi) at least 95%, 96%, 97%, 98%, or 99% identical with an amino acid sequence encoded by SEQ ID NO:8, or identical with the amino acid sequence encoded by SEQ ID NO:8; vii) at least 95%, 96%, 97%, 98%, or 99% identical with an amino acid sequence encoded by SEQ ID NO:10, or identical with the amino acid sequence encoded by SEQ ID NO:10; viii) at least 95%, 96%, 97%, 98%, or 99% identical with an amino acid sequence encoded by SEQ ID NO:12, or identical with the amino acid sequence encoded by SEQ ID NO:12; or ix) at least 95%, 96%, 97%, 98%, or 99% identical with an amino acid sequence encoded by SEQ ID NO:14, or identical with the amino acid sequence encoded by SEQ ID NO:14.
58. An isolated polypeptide, comprising an amino acid sequence that is: i) at least 95%, 96%, 97%, 98%, or 99% identical with SEQ ID NO:3, or identical with SEQ ID NO:3; ii) at least 95%, 96%, 97%, 98%, or 99% identical with SEQ ID NO:6, or identical with SEQ ID NO:6; iii) at least 95%, 96%, 97%, 98%, or 99% identical with SEQ ID NO:9, or identical with SEQ ID NO:9; iv) at least 95%, 96%, 97%, 98%, or 99% identical with SEQ ID NO:11, or identical with SEQ ID NO:11; v) at least 95%, 96%, 97%, 98%, or 99% identical with SEQ ID NO:13, or identical with SEQ ID NO:13; or vi) at least 95%, 96%, 97%, 98%, or 99% identical with SEQ ID NO:15, or identical with SEQ ID NO:15.
59. An isolated polypeptide, comprising an amino acid sequence that is: i) at least 95%, 96%, 97%, 98%, or 99% identical with SEQ ID NO:18, or identical with SEQ ID NO:18; ii) at least 95%, 96%, 97%, 98%, or 99% identical with SEQ ID NO:20, or identical with SEQ ID NO:20; iii) at least 95%, 96%, 97%, 98%, or 99% identical with SEQ ID NO:22, or identical with SEQ ID NO:22; iv) at least 95%, 96%, 97%, 98%, or 99% identical with SEQ ID NO:24, or identical with SEQ ID NO:24; v) at least 95%, 96%, 97%, 98%, or 99% identical with SEQ ID NO:26, or identical with SEQ ID NO:26; vi) at least 95%, 96%, 97%, 98%, or 99% identical with SEQ ID NO:28, or identical with SEQ ID NO:28; or vii) at least 95%, 96%, 97%, 98%, or 99% identical with SEQ ID NO:30, or identical with SEQ ID NO:30.
60. An isolated nucleic acid encoding an amino acid sequence encoded by a nucleic acid of claim 54, 55, or 56, wherein the nucleic acid is codon-optimized, optionally gene-optimized, for expression in mammalian cells.
61. An isolated nucleic acid encoding a polypeptide of claim 57, 58, or 59, wherein the nucleic acid is codon-optimized, optionally gene-optimized, for expression in mammalian cells.
62. A vector comprising a nucleic acid of claim 54, 55, 56, 60, or 61.
63. A vector according to claim 62, which further comprises a promoter operably linked to the nucleic acid.
64. A vector according to claim 63, wherein the promoter is for expression of a polypeptide encoded by the nucleic acid in mammalian cells.
65. A vector according to claim 63, wherein the promoter is for expression of a polypeptide encoded by the nucleic acid in yeast or insect cells.
66. A vector according to any of claims 62 to 65, which is a vaccine vector.
67. A vector according to claim 66, which is a viral vaccine vector, a bacterial vaccine vector, an RNA vaccine vector, or a DNA vaccine vector.
68. An isolated cell comprising a vector of any of claims 62 to 65.
69. A pseudotyped virus particle comprising the polypeptide of claim 57, 58, or 59.
70. A method of producing a pseudotyped virus particle of claim 69, which includes transfecting a host cell with a vector according to any of claims 62 to 64.
71. A fusion protein comprising a polypeptide according to claim 57, 58, or 59.
72. A pharmaceutical composition comprising a nucleic acid according to claim 54, 55, 56, 60, or 61, and a pharmaceutically acceptable carrier, excipient, or diluent.
73. A pharmaceutical composition comprising a vector according to any of claims 62 to 64, 66, or 67, and a pharmaceutically acceptable carrier, excipient, or diluent.
74. A pharmaceutical composition comprising a polypeptide according to claim 57, 58, or 59, and a pharmaceutically acceptable carrier, excipient, or diluent.
75. A pharmaceutical composition according to any of claims 72 to 74, which further comprises an adjuvant for enhancing an immune response in a subject to the polypeptide, or to a polypeptide encoded by the nucleic acid, of the composition.
76. A method of inducing an immune response to a virus of the Filoviridae family in a subject, which comprises administering to the subject a nucleic acid according to any of claims 54, 55, 60, or 61, a polypeptide according to claim 57 or 58, a vector according to any of claims 62 to 64, 66, or 67, or a pharmaceutical composition according to any of claims 72 to 75.
77. A method of immunizing a subject against a virus of the Filoviridae family, which comprises administering to the subject a nucleic acid according to any of claims 54, 55, 60, or 61, a polypeptide according to claim 57 or 58, a vector according to any of claims 62 to 64, 66, or 67, or a pharmaceutical composition according to any of claims 72 to 75.
78. A method of inducing an immune response to a virus of the Arenaviridae family in a subject, which comprises administering to the subject a nucleic acid according to any of claims 56, 60, or 61, a polypeptide according to claim 59, a vector according to any of claims 62 to 64, 66, or 67, or a pharmaceutical composition according to any of claims 72 to 75.
79. A method of immunizing a subject against a virus of the Arenaviridae family, which comprises administering to the subject a nucleic acid according to any of claims 56, 60, or 61, a polypeptide according to claim 59, a vector according to any of claims 62 to 64, 66, or 67, or a pharmaceutical composition according to any of claims 72 to 75.
80. A method according to any of claims 76 to 79, wherein the composition is administered intramuscularly.
81. A nucleic acid expression vector, which comprises a multiple cloning site, comprising KpnI and NotI endonuclease sites.
82. A vector according to claim 81, wherein the multiple cloning site comprises a nucleic acid sequence of SEQ ID NO:16.
83. A vector according to claim 81 or 82, which is an expression vector, and a viral pseudotype vector.
84. A vector according to any of claims 81 to 83, which is a vaccine vector.
85. A vector according to any of claims 81 to 84, which comprises, from a 5′ to 3′ direction: a promoter; a splice donor site; a splice acceptor site; and a terminator signal, wherein the multiple cloning site is located between the splice acceptor site and the terminator signal.
86. A vector according to claim 85, wherein the promoter comprises a CMV immediate early 1 enhancer/promoter and/or the terminator signal comprises a terminator signal of a bovine growth hormone gene that lacks a KpnI restriction endonuclease site.
87. A vector according to any of claims 81 to 86, which further comprises an origin of replication, and nucleic acid encoding resistance to an antibiotic.
88. A vector according to claim 87, wherein the origin of replication comprises a pUC-plasmid origin of replication and/or the nucleic acid encodes resistance to kanamycin.
89. A vector according to any of claims 81 to 88, which comprises a nucleic acid sequence of SEQ ID NO:17.
90. An isolated nucleic acid molecule which comprises a nucleotide sequence encoding a polypeptide comprising an amino acid sequence of SEQ ID NO: 6, and a polypeptide comprising an amino acid sequence of SEQ ID NO: 9.
91. An isolated nucleic acid molecule which comprises a nucleotide sequence encoding a polypeptide comprising an amino acid sequence of SEQ ID NO: 13, and a polypeptide comprising an amino acid sequence of SEQ ID NO: 15.
92. A composition comprising a first nucleic acid which includes a nucleotide sequence encoding a polypeptide comprising an amino acid sequence of SEQ ID NO: 6, and a second nucleic acid which includes a nucleotide sequence encoding a polypeptide comprising an amino acid sequence of SEQ ID NO: 9.
93. A composition comprising a first nucleic acid which includes a nucleotide sequence encoding a polypeptide comprising an amino acid sequence of SEQ ID NO: 13, and a second nucleic acid which includes a nucleotide sequence encoding a polypeptide comprising an amino acid sequence of SEQ ID NO: 15.
94. A combined preparation comprising: (i) a first nucleic acid which includes a nucleotide sequence encoding a polypeptide comprising an amino acid sequence of SEQ ID NO: 6; and (ii) a second nucleic acid which includes a nucleotide sequence encoding a polypeptide comprising an amino acid sequence of SEQ ID NO: 9.
95. A combined preparation comprising: (i) a first nucleic acid which includes a nucleotide sequence encoding a polypeptide comprising an amino acid sequence of SEQ ID NO: 13; and (ii) a second nucleic acid which includes a nucleotide sequence encoding a polypeptide comprising an amino acid sequence of SEQ ID NO: 15.
96. A composition comprising a first polypeptide comprising an amino acid sequence of SEQ ID NO: 6, and a second polypeptide comprising an amino acid sequence of SEQ ID NO: 9.
97. A composition comprising a first polypeptide comprising an amino acid sequence of SEQ ID NO: 13, and a second polypeptide comprising an amino acid sequence of SEQ ID NO: 15.
98. A fusion protein comprising a first polypeptide comprising an amino acid sequence of SEQ ID NO: 6, and a second polypeptide comprising an amino acid sequence of SEQ ID NO: 9.
99. A fusion protein comprising a first polypeptide comprising an amino acid sequence of SEQ ID NO: 13, and a second polypeptide comprising an amino acid sequence of SEQ ID NO: 15.
100. A combined preparation comprising: (i) a first polypeptide comprising an amino acid sequence of SEQ ID NO: 6; and (ii) a second polypeptide comprising an amino acid sequence of SEQ ID NO: 9.
101. A combined preparation comprising: (i) a first polypeptide comprising an amino acid sequence of SEQ ID NO: 13; and (ii) a second polypeptide comprising an amino acid sequence of SEQ ID NO: 15.
102. A nucleic acid according to any of claims 54, 55, 60, or 61, a polypeptide according to claim 57 or 58, a vector according to any of claims 62 to 64, 66, or 67, or a pharmaceutical composition according to any of claims 72 to 75, for use as a medicament.
103. A nucleic acid according to any of claims 54, 55, 60, or 61, a polypeptide according to claim 57 or 58, a vector according to any of claims 62 to 64, 66, or 67, or a pharmaceutical composition according to any of claims 72 to 75, for use in the treatment of a viral infection, preferably a viral infection caused by an emerging or re-emerging virus, preferably a virus of the Filoviridae family.
104. Use of a nucleic acid according to any of claims 54, 55, 60, or 61, a polypeptide according to claim 57 or 58, a vector according to any of claims 62 to 64, 66, or 67, or a pharmaceutical composition according to any of claims 72 to 75, in the manufacture of a medicament for the treatment of a viral infection, preferably a viral infection caused by an emerging or re-emerging virus, preferably a virus of the Filoviridae family.
105. A nucleic acid according to any of claims 56, 60, or 61, a polypeptide according to claim 59, a vector according to any of claims 62 to 64, 66, or 67, or a pharmaceutical composition according to any of claims 72 to 75, for use as a medicament.
106. A nucleic acid according to any of claims 56, 60, or 61, a polypeptide according to claim 59, a vector according to any of claims 62 to 64, 66, or 67, or a pharmaceutical composition according to any of claims 72 to 75, for use in the treatment of a viral infection, preferably a viral infection caused by an emerging or re-emerging virus, preferably a virus of the Arenaviridae family.
107. Use of a nucleic acid according to any of claims 56, 60, or 61, a polypeptide according to claim 59, a vector according to any of claims 62 to 64, 66, or 67, or a pharmaceutical composition according to any of claims 72 to 75, in the manufacture of a medicament for the treatment of a viral infection, preferably a viral infection caused by an emerging or re-emerging virus, preferably a virus of the Arenaviridae family.
108. A nucleic acid according to claim 90 or 91, a composition according to claim 92, 93, 96, or 97, a combined preparation according to claim 94, 95, 100, or 101, or a fusion protein according to claim 98 or 99, for use as a medicament.
109. A nucleic acid according to claim 90 or 91, a composition according to claim 92, 93, 96, or 97, a combined preparation according to claim 94, 95, 100, or 101, or a fusion protein according to claim 98 or 99, for use in the treatment of a viral infection, preferably a viral infection caused by an emerging or re-emerging virus, preferably a virus of the Filoviridae family.
110. Use of a nucleic acid according to claim 90 or 91, a composition according to claim 92, 93, 96, or 97, a combined preparation according to claim 94, 95, 100, or 101, or a fusion protein according to claim 98 or 99, in the manufacture of a medicament for the treatment of a viral infection, preferably a viral infection caused by an emerging or re-emerging virus, preferably a virus of the Filoviridae family.
Description
[0352] Embodiments of the invention are described, by way of illustration only, in the Examples below, with reference to the accompanying drawings in which:
[0353]
[0354]
[0355]
[0356]
[0357]
[0358]
[0359]
[0360]
[0361] Examples of unoptimized Ebola and Marburg viral ancestral nucleic acid sequences (i.e. sequences which have not been codon-optimized or gene-optimized) are given below, as well as gene-optimized nucleic acid sequences encoding candidate antigenic pathogen polypeptides.
[0362] Methodology
[0363] For a given virus species, candidate primary sequences are downloaded, for example, from GenBank (and from any other available sources, such as outbreak data), and are filtered to remove identical sequences, sequences that do not span the protein of interest, and sequences that have a high number of ambiguous nucleotides. A multiple sequence alignment of the filtered sequences is generated (typically using MAFFT), and checked manually to ensure that sequences are in the correct open reading frame. A maximum likelihood phylogeny is generated using IQ-TREE, with automated model selection, and rooted using one of several methods: an outgroup sequence, midpoint rooting, centre-of-the-tree, or a tree that maximises the association between root-to-tip distance and sampling time. Ancestral sequences are generated using HyPhy assuming an MG94 by F3x4 model of codon substitution, and are checked to ensure that known epitopes have been preserved. A phylogenetic tree with both primary and ancestral sequences is generated using IQ-TREE to check the placement of the ancestral strains. Ancestral sequences are then modified in a number of ways: deletion of regions (e.g. removal of the mucin-like domain); region swapping (to recover potential lost epitopes); and mutation of specific sites (e.g. in the fusion domain of the filoviruses), including editing of N-linked glycosylation sites and introduction of mutations to enhance stability.
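The filtering step described in the methodology can be sketched in a few lines of Python. This is a minimal illustration only: the function name `filter_sequences`, the `min_length` cut-off used as a stand-in for "spans the protein of interest", and the 1% ambiguity threshold are assumptions made for the example and are not values given in this specification.

```python
def filter_sequences(records, min_length, max_ambiguous_fraction=0.01):
    """Filter (name, sequence) pairs before alignment.

    Drops sequences that are too short to span the region of interest,
    sequences with too many ambiguous nucleotides, and exact duplicates.
    All thresholds here are illustrative assumptions.
    """
    seen = set()
    kept = []
    for name, seq in records:
        s = seq.upper().replace("-", "")  # ignore case and gap characters
        if not s or len(s) < min_length:
            continue
        ambiguous = sum(1 for base in s if base not in "ACGT")
        if ambiguous / len(s) > max_ambiguous_fraction:
            continue
        if s in seen:  # identical to a sequence already kept
            continue
        seen.add(s)
        kept.append((name, s))
    return kept


# Toy input; the isolate names and sequences are hypothetical.
records = [
    ("isolate_A", "ATGGGCGGAGGATCTAGACTG"),
    ("isolate_B", "ATGGGCGGAGGATCTAGACTG"),  # identical to isolate_A
    ("isolate_C", "ATGGGC"),                 # too short
    ("isolate_D", "ATGNNNNNNGGATCTAGACTG"),  # many ambiguous bases
    ("isolate_E", "ATGGGGGGTGGATCCAGACTT"),
]
filtered = filter_sequences(records, min_length=18)
print([name for name, _ in filtered])
```

In practice the filtered records would then be passed to the alignment and phylogeny tools named in the methodology; those steps are not reproduced here.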
EXAMPLE 1
[0364] Ebola Sudan Ancestor (T2-4)
TABLE-US-00010 Unoptimised (SEQ ID NO: 1) ATGGGGGGTCTTAGCCTACTCCAATTGCCCAGGGACAAATTTCGGAAAAG CTCTTTCTTTGTTTGGGTCATCATCTTATTCCAAAAGGCCTTTTCCATGC CTTTGGGTGTTGTGACTAACAGCACTTTAGAAGTAACAGAGATTGACCAG CTAGTCTGCAAGGATCATCTTGCATCCACTGACCAGCTGAAATCAGTTGG TCTCAACCTCGAGGGGAGCGGAGTATCTACTGATATCCCATCTGCAACAA AGCGTTGGGGCTTCAGATCTGGTGTTCCTCCCAAGGTGGTCAGCTATGAA GCGGGAGAATGGGCTGAAAATTGCTACAATCTTGAAATAAAGAAGCCGGA CGGGAGCGAATGCTTACCCCCACCGCCAGATGGTGTCAGAGGCTTTCCAA GGTGCCGCTATGTTCACAAAGCCCAAGGAACCGGGCCCTGCCCAGGTGAC TACGCCTTTCACAAGGATGGAGCTTTCTTCCTCTATGACAGGCTGGCTTC AACTGTAATTTACAGAGGAGTCAATTTTGCTGAGGGGGTAATTGCATTCT TGATATTGGCTAAACCAAAAGAAACGTTCCTTCAGTCACCCCCCATTCGA GAGGCAGTAAACTACACTGAAAATACATCAAGTTATTATGCCACATCCTA CTTGGAGTATGAAATCGAAAATTTTGGTGCTCAACACTCCACGACCCTTT TCAAAATTGACAATAATACTTTTGTTCGTCTGGACAGGCCCCACACGCCT CAGTTCCTTTTCCAGCTGAATGATACCATTCACCTTCACCAACAGTTGAG CAACACAACTGGGAGACTAATTTGGACACTAGATGCTAATATCAATGCTG ATATTGGTGAATGGGCTTTTTGGGAAAATAAAAAAAATCTCTCCGAACAA CTACGTGGAGAAGAGCTGTCTTTCGAAGCTTTATCGCTCACAACAGCGGT TAAAACTGTCTTGCCACAGGAGTCCACAAGCAACGGTCTAATAACTTCAA CAGTAACAGGGATTCTTGGGAGTCTTGGGCTTCGAAAACGCAGCAGAAGA CAAGTTAACACCAAAGCCACGGGTAAATGCAATCCCAACTTACACTACTG GACTGCACAAGAACAACATAATGCTGCTGGGATTGCCTGGATCCCGTACT TTGGACCGGGTGCGGAAGGCATATACACTGAAGGCCTGATGCATAACCAA AATGCCTTAGTCTGTGGACTTAGGCAACTTGCAAATGAAACAACTCAAGC TCTGCAGCTTTTCTTAAGAGCCACAACGGAGCTGCGGACATATACCATAC TCAATAGGAAGGCCATAGATTTCCTTCTGCGACGATGGGGCGGGACATGC AGGATCCTGGGACCAGATTGTTGCATTGAGCCACATGATTGGACAAAAAA CATCACTGATAAAATCAACCAAATCATCCATGATTTCATCGACAACCCCT TACCTAATCAGGATAATGATGATAATTGGTGGACGGGCTGGAGACAGTGG ATCCCTGCAGGAATAGGCATTACTGGAATTATTATTGCAATTATTGCTCT TCTTTGCGTTTGCAAGCTGCTTTGCTAG Gene-optimised (SEQ ID NO: 2) ATGGGAGGACTGTCTCTGCTGCAACTGCCCCGGGACAAGTTCCGGAAGTC CAGCTTCTTCGTGTGGGTCATCATCCTGTTCCAGAAAGCCTTCAGCATGC CCCTGGGCGTCGTGACCAATAGCACACTGGAAGTGACCGAGATCGACCAG CTCGTGTGCAAGGATCACCTGGCCAGCACCGATCAGCTGAAGTCTGTGGG ACTGAATCTGGAAGGCAGCGGCGTGTCCACAGATATCCCTAGCGCCACCA AGAGATGGGGCTTTAGAAGCGGAGTGCCTCCTAAGGTGGTGTCTTATGAA 
GCCGGCGAGTGGGCCGAGAACTGCTACAACCTGGAAATCAAGAAGCCCGA CGGCAGCGAGTGTCTGCCTCCTCCACCTGATGGCGTCAGAGGCTTCCCTA GATGCAGATACGTGCACAAGGCCCAAGGCACAGGACCCTGTCCTGGCGAT TACGCCTTTCACAAGGACGGCGCCTTTTTCCTGTACGATCGGCTGGCCTC CACCGTGATCTACAGAGGCGTTAACTTTGCCGAGGGCGTGATCGCCTTCC TGATCCTGGCCAAGCCTAAAGAGACATTCCTGCAAAGCCCTCCAATCCGC GAGGCCGTGAACTACACAGAGAACACCAGCAGCTACTACGCCACCAGCTA CCTGGAATACGAGATCGAGAATTTCGGCGCCCAGCACAGCACCACACTGT TCAAGATCGACAACAACACCTTCGTGCGGCTGGACAGACCCCACACACCT CAGTTTCTGTTCCAGCTGAACGACACCATCCATCTGCATCAGCAGCTGAG CAACACCACCGGCAGACTGATTTGGACCCTGGACGCCAACATCAACGCCG ACATTGGAGAGTGGGCCTTTTGGGAGAACAAGAAGAACCTGAGCGAACAG CTGAGAGGCGAGGAACTGAGCTTTGAGGCCCTGTCTCTGACCACCGCCGT GAAAACAGTGCTGCCTCAAGAGTCCACCAGCAACGGCCTGATCACAAGCA CAGTGACAGGCATCCTGGGCAGCCTGGGCCTGAGAAAAAGGTCCAGACGG CAAGTGAATACCAAGGCCACCGGCAAGTGCAACCCCAACCTGCACTATTG GACAGCCCAAGAGCAGCACAATGCCGCCGGAATCGCCTGGATTCCTTATT TTGGACCTGGCGCCGAGGGCATCTATACCGAGGGACTGATGCACAACCAG AACGCCCTCGTGTGTGGACTGAGACAGCTGGCCAATGAGACAACACAGGC CCTCCAGCTGTTTCTGAGAGCCACCACCGAGCTGAGAACCTACACCATCC TGAACCGGAAGGCCATCGACTTTCTGCTGAGAAGATGGGGCGGCACCTGT AGAATCCTGGGACCTGATTGCTGCATCGAGCCCCACGACTGGACCAAGAA CATCACCGACAAGATCAACCAGATCATCCACGACTTCATCGACAACCCTC TGCCTAACCAGGACAACGACGACAATTGGTGGACAGGCTGGCGGCAGTGG ATTCCTGCCGGAATTGGCATCACCGGCATCATCATTGCCATTATCGCCCT GCTGTGTGTGTGCAAGCTGCTGTGTTGA Amino acid sequence encoded by unoptimised and gene-optimised sequences (SEQ ID NO: 3): MGGLSLLQLPRDKFRKSSFFVWVIILFQKAFSMPLGVVTNSTLEVTEIDQ LVCKDHLASTDQLKSVGLNLEGSGVSTDIPSATKRWGFRSGVPPKVVSYE AGEWAENCYNLEIKKPDGSECLPPPPDGVRGFPRCRYVHKAQGTGPCPGD YAFHKDGAFFLYDRLASTVIYRGVNFAEGVIAFLILAKPKETFLQSPPIR EAVNYTENTSSYYATSYLEYEIENFGAQHSTTLFKIDNNTFVRLDRPHTP QFLFQLNDTIHLHQQLSNTTGRLIWTLDANINADIGEWAFWENKKNLSEQ LRGEELSFEALSLTTAVKTVLPQESTSNGLITSTVTGILGSLGLRKRSRR QVNTKATGKCNPNLHYWTAQEQHNAAGIAWIPYFGPGAEGIYTEGLMHNQ NALVCGLRQLANETTQALQLFLRATTELRTYTILNRKAIDFLLRRWGGTC RILGPDCCIEPHDWTKNITDKINQIIHDFIDNPLPNQDNDDNWWTGWRQW IPAGIGITGIIIAIIALLCVCKLLC
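Because the unoptimised and gene-optimised sequences are stated to encode the same polypeptide, a simple translation check can be used to confirm this. The sketch below is illustrative only: it uses the standard genetic code and, for brevity, just the first 60 nucleotides of SEQ ID NO: 1 and SEQ ID NO: 2; the names `translate`, `UNOPT_PREFIX` and `OPT_PREFIX` are introduced here for the example.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate a DNA string in frame 1, stopping at the first stop codon."""
    dna = "".join(dna.split()).upper()  # drop spaces/line breaks from listings
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    protein = []
    for i in range(0, len(dna), 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# First 60 nucleotides of SEQ ID NO: 1 (unoptimised) and SEQ ID NO: 2 (gene-optimised).
UNOPT_PREFIX = "ATGGGGGGTCTTAGCCTACTCCAATTGCCCAGGGACAAATTTCGGAAAAGCTCTTTCTTT"
OPT_PREFIX = "ATGGGAGGACTGTCTCTGCTGCAACTGCCCCGGGACAAGTTCCGGAAGTCCAGCTTCTTC"

# Different codons, same encoded residues.
assert translate(UNOPT_PREFIX) == translate(OPT_PREFIX)
print(translate(OPT_PREFIX))
```

Applied to the full-length listings, the same comparison would flag any position at which a nucleotide sequence and its stated amino acid sequence disagree.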
EXAMPLE 2
[0365] Ebolavirus Global Ancestor (T2-6)
TABLE-US-00011 Unoptimised (SEQ ID NO: 4) ATGGGGGGTGGATCCAGACTTCTCCAATTGCCCCGGGAACGCTTTCGGAA AACCTCATTCTTTGTTTGGGTAATCATCCTATTCCAAAAAGCCTTTTCCA TGCCATTGGGTGTTGTAACCAACAGCACTCTAAAAGTAACAGAAATTGAC CAATTGGTTTGCCGGGACAAACTTTCATCCACAAGTCAGCTGAAATCAGT TGGGCTGAATCTGGAAGGGAATGGAGTTGCAACTGATGTCCCATCAGCAA CAAAACGATGGGGCTTCCGATCTGGTGTTCCTCCCAAGGTGGTCAGCTAT GAAGCTGGAGAATGGGCTGAAAATTGCTACAATCTGGAAATCAAGAAGCC AGACGGGAGTGAATGCCTACCTCCACCGCCAGACGGTGTAAGAGGCTTCC CCAGGTGCCGCTATGTCCACAAAGTTCAAGGAACAGGGCCGTGTCCTGGT GACTTCGCCTTCCACAAAGATGGAGCTTTCTTCCTGTATGATAGACTGGC TTCAACTGTCATTTACCGAGGGACAACTTTTGCTGAAGGTGTCGTTGCAT TTTTGATCCTGCCCAAACCTAAAAAGGACTTTTTCCAATCACCCCCAATA CGTGAGCCGGTAAACACCACAGAAGATCCATCAAGTTACTACACCACATC AACACTTAGCTATGAGATTGACAATTTTGGGGCCAATAAAACTAAAACTC TTTTCAAAGTTGACAATCACACTTATGTGCAACTAGACCGACCACACACA CCACAGTTCCTTGTCCAGCTCAATGAAACCATTCATACAAATAACCGTCT AAGCAACACCACAGGGAGACTAATTTGGACATTAGATCCTAAAATTGATA CCGACATTGGTGAGTGGGCCTTCTGGGAAAATAAAAAAAACTTCTCCAAA CAACTTCGTGGAGAAGAGTTGTCTTTCAAAGCTCTATCAACAAAAACTGG AGCTAACGCAGTAGACACTGACGAATCAAGCAAACCTGGCCTAATTACCA ACACAGTAAGAGGGGTTGCTGATTTACTGAGCCCTTGGAGAAGAAAAAGA AGACAAGTCAACCCAAACACAACAAATAAATGCAACCCAAACCTACACTA TTGGACAGCCCAAGATGAAGGTGCTGCCGTTGGATTAGCCTGGATCCCAT ACTTCGGACCAGCAGCAGAAGGCATTTACACTGAAGGAATAATGCATAAT CAAAATGGGTTAATCTGTGGGCTGAGGCAGCTGGCCAATGAAACGACTCA AGCTCTTCAATTATTCTTGAGGGCCACAACGGAGCTGCGGACTTACTCTA TACTCAATAGAAAAGCCATTGATTTCCTTCTCCAACGATGGGGAGGAACA TGCCGCATCTTAGGACCAGATTGTTGCATTGAGCCACATGATTGGACAAA AAACATTACTGATAAAATTAACCAAATCATACATGATTTTATTGACAACC CTCTACCAGATCAGGACGATGATGACAATTGGTGGACAGGCTGGAGACAA TGGATCCCTGCTGGAATTGGAATTACTGGAGTTATAATTGCAATTATAGC TCTACTTTGTATTTGCAAGTTTCTGTGTTAG Gene-optimised (SEQ ID NO: 5) ATGGGCGGAGGATCTAGACTGCTGCAACTGCCCAGAGAGCGGTTCAGAAA GACCAGCTTCTTCGTGTGGGTCATCATCCTGTTCCAGAAAGCCTTCAGCA TGCCCCTGGGCGTCGTGACCAATAGCACCCTGAAAGTGACCGAGATCGAC CAGCTCGTGTGCAGAGATAAGCTGAGCAGCACCAGCCAGCTGAAGTCCGT GGGACTGAATCTGGAAGGCAATGGCGTGGCCACAGATGTGCCTAGCGCCA CCAAAAGATGGGGCTTTAGAAGCGGCGTGCCACCTAAGGTGGTGTCTTAT 
GAAGCCGGCGAGTGGGCCGAGAACTGCTACAACCTGGAAATCAAGAAGCC CGACGGCAGCGAGTGTCTGCCTCCTCCACCTGATGGCGTCAGAGGCTTCC CTAGATGCAGATACGTGCACAAGGTGCAAGGCACAGGCCCCTGTCCTGGC GATTTCGCCTTTCACAAGGACGGCGCCTTTTTCCTGTACGATCGGCTGGC CTCCACCGTGATCTACAGAGGCACAACATTTGCCGAAGGCGTGGTGGCCT TCCTGATCCTGCCTAAGCCTAAGAAGGACTTCTTTCAGAGCCCTCCTATC CGCGAGCCTGTGAACACAACAGAGGACCCCAGCAGCTACTACACCACCAG CACACTGAGCTACGAGATCGATAACTTCGGCGCCAACAAGACCAAGACAC TGTTCAAGGTGGACAACCACACCTACGTGCAGCTGGACAGACCCCACACA CCTCAGTTTCTGGTGCAGCTGAACGAGACAATCCACACCAACAACAGACT GAGCAACACCACCGGCAGGCTGATCTGGACCCTGGATCCTAAGATCGACA CCGACATCGGAGAGTGGGCCTTTTGGGAGAACAAGAAGAACTTCAGCAAG CAGCTGAGAGGCGAGGAACTGAGCTTTAAGGCCCTGAGCACCAAGACAGG CGCCAACGCTGTGGATACCGATGAGTCTAGCAAGCCCGGCCTGATCACCA ACACAGTTAGAGGCGTTGCCGACCTGCTGAGCCCTTGGAGAAGAAAGCGG AGACAAGTGAACCCCAATACCACCAACAAGTGCAACCCTAACCTGCACTA CTGGACAGCCCAGGATGAAGGCGCTGCTGTTGGACTGGCCTGGATTCCTT ATTTTGGACCTGCCGCCGAGGGCATCTACACAGAGGGAATCATGCACAAC CAGAATGGCCTGATCTGCGGCCTGAGACAGCTGGCCAATGAGACAACACA GGCCCTCCAGCTGTTTCTGAGAGCCACCACCGAGCTGAGAACCTACAGCA TCCTGAACCGGAAGGCCATCGACTTTCTGCTGCAAAGATGGGGAGGCACC TGTAGAATCCTGGGACCTGATTGCTGCATCGAGCCCCACGACTGGACCAA GAACATCACCGACAAGATCAACCAGATCATCCACGACTTCATCGACAACC CTCTGCCTGACCAGGACGACGACGATAATTGGTGGACAGGATGGCGGCAG TGGATTCCTGCCGGAATCGGAATCACAGGCGTGATCATTGCCATTATCGC CCTGCTGTGCATCTGCAAGTTTCTGTGCTGA Amino acid sequence encoded by unoptimised and gene-optimised sequences (SEQ ID NO: 6): MGGGSRLLQLPRERFRKTSFFVWVIILFQKAFSMPLGVVTNSTLKVTEID QLVCRDKLSSTSQLKSVGLNLEGNGVATDVPSATKRWGFRSGVPPKVVSY EAGEWAENCYNLEIKKPDGSECLPPPPDGVRGFPRCRYVHKVQGTGPCPG DFAFHKDGAFFLYDRLASTVIYRGTTFAEGVVAFLILPKPKKDFFQSPPI REPVNTTEDPSSYYTTSTLSYEIDNFGANKTKTLFKVDNHTYVQLDRPHT PQFLVQLNETIHTNNRLSNTTGRLIWTLDPKIDTDIGEWAFWENKKNFSK QLRGEELSFKALSTKTGANAVDTDESSKPGLITNTVRGVADLLSPWRRKR RQVNPNTTNKCNPNLHYWTAQDEGAAVGLAWIPYFGPAAEGIYTEGIMHN QNGLICGLRQLANETTQALQLFLRATTELRTYSILNRKAIDFLLQRWGGT CRILGPDCCIEPHDWTKNITDKINQIIHDFIDNPLPDQDDDDNWWTGWRQ WIPAGIGITGVIIAIIALLCICKFLC
EXAMPLE 3
[0366] Marburgvirus Ancestor (T2-11)
TABLE-US-00012 Unoptimised (SEQ ID NO: 7) ATGAAGACCATATATTTTCTGATTAGTCTCATTTTAATCCAAAGTATAAA AACTCTCCCTGTTTTAGAAATTGCTAGTAACAGCCAACCTCAAGATGTAG ATTCAGTGTGCTCCGGAACCCTCCAAAAGACAGAAGATGTTCATCTGATG GGATTTACACTGAGTGGGCAAAAAGTTGCTGATTCCCCTTTGGAAGCATC TAAACGATGGGCTTTCAGGACAGGTGTTCCTCCCAAGAACGTTGAGTATA CGGAAGGAGAAGAAGCCAAAACATGTTACAATATAAGTGTAACAGACCCT TCTGGAAAATCCTTGCTGCTGGATCCTCCCAGTAATATCCGCGATTACCC TAAATGTAAAACTGTTCATCATATTCAAGGTCAAAACCCTCATGCACAGG GGATTGCCCTCCATTTGTGGGGGGCATTTTTCCTGTATGATCGCATTGCC TCCACAACAATGTACCGAGGCAAAGTCTTCACTGAAGGGAACATAGCAGC TATGATTGTCAATAAGACAGTGCACAAAATGATTTTCTCGAGGCAAGGAC AAGGGTACCGTCACATGAATCTGACTTCTACTAATAAATATTGGACAAGT AGCAACGGAACGCAAACGAATGACACTGGATGCTTCGGTGCTCTTCAAGA ATACAATTCTACGAAGAACCAAACATGTGCTCCGTCCAAAATACCTCCAC CACTGCCCACAGCCCGTCCGGAGATCAAACCCACAAGCACCCCAACTGAT GCCACCAAACTCAACACCACAGACCCAAACAGTGATGATGAGGACCTCAC AACATCCGGCTCAGGGTCCGGAGAACAGGAACCCTACACAACTTCTGATG CGGTCACTAAGCAAGGGCTTTCATCAACAATGCCACCCACTCCCTCACCA CAACCAAGCACGCCACAGCAAGGAGGAAACAACACAAACCATTCCCAAGG TGCTGTGACTGAACCCGACAAAACCAACACAACTGCACAACCGTCCATGC CCCCCCACAACACTACTACAATCTCTACTAACAACACCTCCAAGCACAAC TTCAGCACTCTCTCTGCACCACTACAAAACACCACCAATTACAACACACA GAGCACGGCCACTGAAAATGAGCAAACCAGTGCCCCCTCGAAAACAACCC TGCCTCCAACAGGAAATCCTACCACAGCAAAGAGCACCAACAGCACAAAA GGCCCCACCACAACGGCACCAAATACGACAAATGGGCATTTCACCAGTCC CTCCCCCACCCCCAACTCGACTACACAACATCTTGTATATTTCAGAAGGA AACGAAGTATCCTCTGGAGGGAAGGCGACATGTTCCCTTTTTTAGATGGG TTAATAAATACTGAAATTGATTTTGATCCAATCCCAAACACAGAAACAAT CTTTGATGAATCCCCCAGCTTTAATACTTCAACTAATGAGGAACAACACA CTCCCCCGAATATCAGTTTAACTTTCTCTTATTTTCCTGATAAAAATGGA GATACTGCCTACTCTGGGGAAAACGAGAATGATTGTGATGCAGAGTTGAG GATTTGGAGTGTGCAGGAGGACGATTTGGCGGCAGGGCTTAGCTGGATAC CATTTTTTGGCCCTGGAATCGAAGGACTCTATACTGCCGGTTTAATCAAA AATCAGAACAATTTAGTTTGTAGGTTGAGGCGCTTAGCTAATCAAACTGC TAAATCCTTGGAGCTCTTGTTAAGGGTCACAACCGAGGAAAGGACATTTT CCTTAATCAATAGGCATGCAATTGACTTTTTGCTTACGAGGTGGGGCGGA ACATGCAAGGTGCTAGGACCTGATTGTTGCATAGGAATAGAAGATCTATC TAAAAATATCTCAGAACAAATTGACAAAATCAGAAAGGATGAACAAAAGG 
AGGAAACTGGCTGGGGTCTAGGTGGCAAATGGTGGACATCTGACTGGGGT GTTCTCACCAATTTGGGCATCCTGCTACTATTATCTATAGCTGTTCTGAT TGCTCTGTCCTGTATCTGTCGTATCTTCACTAAATATATCGGATAG Gene-optimised (SEQ ID NO: 8) ATGAAGACCATCTACTTTCTGATCAGCCTGATCCTGATCCAGAGCATCAA GACCCTGCCTGTGCTGGAAATCGCCAGCAACAGTCAGCCCCAGGATGTGG ATAGCGTGTGTAGCGGCACCCTCCAGAAAACCGAGGATGTGCACCTGATG GGCTTTACCCTGAGCGGCCAGAAAGTGGCCGATTCTCCACTGGAAGCCAG CAAGAGATGGGCCTTTAGAACCGGCGTGCCACCTAAGAACGTCGAGTACA CAGAGGGCGAAGAGGCCAAGACCTGCTACAACATCAGCGTGACCGATCCT AGCGGCAAGAGCCTGCTGCTGGACCCTCCTAGCAACATCAGAGACTACCC CAAGTGCAAGACCGTGCACCACATCCAGGGACAGAATCCCCATGCTCAGG GAATTGCCCTGCACCTGTGGGGCGCCTTTTTCCTGTATGATCGGATCGCC TCCACCACCATGTACAGAGGCAAAGTGTTCACCGAGGGCAATATCGCCGC CATGATCGTGAACAAGACAGTGCACAAGATGATCTTCAGCCGGCAAGGCC AGGGCTACAGACACATGAATCTGACCAGCACCAACAAGTACTGGACCAGC AGCAACGGCACCCAGACCAATGATACAGGCTGCTTTGGCGCCCTGCAAGA GTACAACAGCACCAAGAATCAGACATGCGCCCCTAGCAAGATCCCTCCTC CACTGCCTACTGCCAGACCTGAGATCAAGCCTACCAGCACACCTACCGAC GCCACCAAGCTGAACACCACCGATCCAAACAGCGACGACGAGGATCTGAC AACAAGCGGATCTGGCTCTGGCGAGCAAGAGCCATACACCACCTCTGATG CCGTGACAAAGCAGGGCCTGAGCAGCACAATGCCTCCAACACCTTCTCCA CAGCCTAGCACACCTCAGCAAGGCGGCAACAACACAAATCACTCTCAGGG CGCCGTGACCGAGCCTGACAAGACAAATACCACAGCTCAGCCCAGCATGC CTCCTCACAACACCACCACAATCTCCACCAACAACACCAGCAAGCACAAC TTCAGCACACTGAGCGCCCCTCTCCAGAATACCACCAACTACAATACCCA GAGCACCGCCACCGAGAACGAGCAGACATCTGCCCCTTCTAAGACCACAC TGCCACCTACCGGCAATCCTACCACCGCCAAGAGCACCAATAGCACAAAG GGCCCTACCACCACCGCTCCTAACACCACAAATGGCCACTTCACAAGCCC AAGTCCTACACCTAACAGCACAACCCAGCACCTGGTGTACTTCAGACGGA AGCGGAGCATCCTTTGGCGCGAGGGCGATATGTTCCCTTTCCTGGACGGC CTGATCAACACCGAGATCGACTTCGACCCCATTCCAAACACCGAAACCAT CTTCGACGAGAGCCCCAGCTTCAACACCTCCACCAATGAGGAACAGCACA CCCCTCCAAACATCTCCCTGACCTTCAGCTACTTCCCCGACAAGAACGGC GATACAGCCTACAGCGGCGAGAATGAGAATGACTGCGACGCCGAGCTGCG GATTTGGAGCGTTCAAGAGGATGATCTGGCTGCCGGCCTGAGCTGGATCC CTTTTTTTGGACCTGGCATCGAGGGCCTGTACACCGCCGGACTGATCAAG AACCAGAACAACCTCGTGTGCAGACTGCGGAGACTGGCCAATCAGACCGC CAAGTCTCTGGAACTGCTGCTGCGCGTGACCACCGAGGAAAGAACCTTCT 
CTCTGATCAACCGGCACGCCATCGATTTTCTGCTGACCAGATGGGGCGGC ACCTGTAAAGTTCTGGGCCCTGATTGCTGCATCGGAATCGAGGACCTGAG CAAGAACATCTCCGAGCAGATCGACAAGATCCGCAAGGACGAGCAGAAAG AGGAAACAGGCTGGGGACTCGGCGGCAAGTGGTGGACATCTGATTGGGGC GTGCTGACCAATCTGGGAATCCTGCTGCTCCTGTCTATCGCCGTGCTGAT CGCCCTGAGCTGCATCTGCCGGATCTTCACCAAGTACATCGGCTGA Amino acid sequence encoded by unoptimised and gene-optimised sequences (SEQ ID NO: 9): MKTIYFLISLILIQSIKTLPVLEIASNSQPQDVDSVCSGTLQKTEDVHLM GFTLSGQKVADSPLEASKRWAFRTGVPPKNVEYTEGEEAKTCYNISVTDP SGKSLLLDPPSNIRDYPKCKTVHHIQGQNPHAQGIALHLWGAFFLYDRIA STTMYRGKVFTEGNIAAMIVNKTVHKMIFSRQGQGYRHMNLTSTNKYWTS SNGTQTNDTGCFGALQEYNSTKNQTCAPSKIPPPLPTARPEIKPTSTPTD ATKLNTTDPNSDDEDLTTSGSGSGEQEPYTTSDAVTKQGLSSTMPPTPSP QPSTPQQGGNNTNHSQGAVTEPDKTNTTAQPSMPPHNTTTISTNNTSKHN FSTLSAPLQNTTNYNTQSTATENEQTSAPSKTTLPPTGNPTTAKSTNSTK GPTTTAPNTTNGHFTSPSPTPNSTTQHLVYFRRKRSILWREGDMFPFLDG LINTEIDFDPIPNTETIFDESPSFNTSTNEEQHTPPNISLTFSYFPDKNG DTAYSGENENDCDAELRIWSVQEDDLAAGLSWIPFFGPGIEGLYTAGLIK NQNNLVCRLRRLANQTAKSLELLLRVTTEERTFSLINRHAIDFLLTRWGG TCKVLGPDCCIGIEDLSKNISEQIDKIRKDEQKEETGWGLGGKWWTSDWG VLTNLGILLLLSIAVLIALSCICRIFTKYIG
EXAMPLE 4
[0367] Tier 2-4 (SUDV anc -MLD)
[0368] Sudan ebolavirus ancestral sequences with deleted (minus “−”) mucin-like domain
TABLE-US-00013 Nucleotide sequence (SEQ ID NO: 10): atgggaggac tgtctctgct gcaactgccc cgggacaagt tccggaagtc cagcttcttc 60 gtgtgggtca tcatcctgtt ccagaaagcc ttcagcatgc ccctgggcgt cgtgaccaat 120 agcacactgg aagtgaccga gatcgaccag ctcgtgtgca aggatcacct ggccagcacc 180 gatcagctga agtctgtggg actgaatctg gaaggcagcg gcgtgtccac agatatccct 240 agcgccacca agagatgggg ctttagaagc ggagtgcctc ctaaggtggt gtcttatgaa 300 gccggcgagt gggccgagaa ctgctacaac ctggaaatca agaagcccga cggcagcgag 360 tgtctgcctc ctccacctga tggcgtcaga ggcttcccta gatgcagata cgtgcacaag 420 gcccaaggca caggaccctg tcctggcgat tacgcctttc acaaggacgg cgcctttttc 480 ctgtacgatc ggctggcctc caccgtgatc tacagaggcg ttaactttgc cgagggcgtg 540 atcgccttcc tgatcctggc caagcctaaa gagacattcc tgcaaagccc tccaatccgc 600 gaggccgtga actacacaga gaacaccagc agctactacg ccaccagcta cctggaatac 660 gagatcgaga atttcggcgc ccagcacagc accacactgt tcaagatcga caacaacacc 720 ttcgtgcggc tggacagacc ccacacacct cagtttctgt tccagctgaa cgacaccatc 780 catctgcatc agcagctgag caacaccacc ggcagactga tttggaccct ggacgccaac 840 atcaacgccg acattggaga gtgggccttt tgggagaaca agaagaacct gagcgaacag 900 ctgagaggcg aggaactgag ctttgaggcc ctgtctctga ccaccgccgt gaaaacagtg 960 ctgcctcaag agtccaccag caacggcctg atcacaagca cagtgacagg catcctgggc 1020 agcctgggcc tgagaaaaag gtccagacgg caagtgaata ccaaggccac cggcaagtgc 1080 aaccccaacc tgcactattg gacagcccaa gagcagcaca atgccgccgg aatcgcctgg 1140 attccttatt ttggacctgg cgccgagggc atctataccg agggactgat gcacaaccag 1200 aacgccctcg tgtgtggact gagacagctg gccaatgaga caacacaggc cctccagctg 1260 tttctgagag ccaccaccga gctgagaacc tacaccatcc tgaaccggaa ggccatcgac 1320 tttctgctga gaagatgggg cggcacctgt agaatcctgg gacctgattg ctgcatcgag 1380 ccccacgact ggaccaagaa catcaccgac aagatcaacc agatcatcca cgacttcatc 1440 gacaaccctc tgcctaacca ggacaacgac gacaattggt ggacaggctg gcggcagtgg 1500 attcctgccg gaattggcat caccggcatc atcattgcca ttatcgccct gctgtgtgtg 1560 tgcaagctgc tgtgttga 1578 Amino acid sequence (SEQ ID NO: 11): MGGLSLLQLPRDKFRKSSFFVWVIILFQKAFSMPLGVVTNSTLEVTEIDQ 50 
LVCKDHLASTDQLKSVGLNLEGSGVSTDIPSATKRWGFRSGVPPKVVSYE 100 AGEWAENCYNLEIKKPDGSECLPPPPDGVRGFPRCRYVHKAQGTGPCPGD 150 YAFHKDGAFFLYDRLASTVIYRGVNFAEGVIAFLILAKPKETFLQSPPIR 200 EAVNYTENTSSYYATSYLEYEIENFGAQHSTTLFKIDNNTFVRLDRPHTP 250 QFLFQLNDTIHLHQQLSNTTGRLIWTLDANINADIGEWAFWENKKNLSEQ 300 LRGEELSFEALSLTTAVKTVLPQESTSNGLITSTVTGILGSLGLRKRSRR 350 QVNTKATGKCNPNLHYWTAQEQHNAAGIAWIPYFGPGAEGIYTEGLMHNQ 400 NALVCGLRQLANETTQALQLFLRATTELRTYTILNRKAIDFLLRRWGGTC 450 RILGPDCCIEPHDWTKNITDKINQIIHDFIDNPLPNQDNDDNWWTGWRQW 500 IPAGIGITGIIIAIIALLCVCKLLC*
EXAMPLE 5
[0369] Tier 2-6 (SUDV EBOV-TAFV-BDBV anc -MLD)
[0370] Ancestral sequence to the four species Sudan, Zaire, Tai Forest, and Bundibugyo ebolavirus with the mucin-like-domain deleted.
TABLE-US-00014 Nucleotide sequence (SEQ ID NO: 12): atgggcggag gatctagact gctgcaactg cccagagagc ggttcagaaa gaccagcttc 60 ttcgtgtggg tcatcatcct gttccagaaa gccttcagca tgcccctggg cgtcgtgacc 120 aatagcaccc tgaaagtgac cgagatcgac cagctcgtgt gcagagataa gctgagcagc 180 accagccagc tgaagtccgt gggactgaat ctggaaggca atggcgtggc cacagatgtg 240 cctagcgcca ccaaaagatg gggctttaga agcggcgtgc cacctaaggt ggtgtcttat 300 gaagccggcg agtgggccga gaactgctac aacctggaaa tcaagaagcc cgacggcagc 360 gagtgtctgc ctcctccacc tgatggcgtc agaggcttcc ctagatgcag atacgtgcac 420 aaggtgcaag gcacaggccc ctgtcctggc gatttcgcct ttcacaagga cggcgccttt 480 ttcctgtacg atcggctggc ctccaccgtg atctacagag gcacaacatt tgccgaaggc 540 gtggtggcct tcctgatcct gcctaagcct aagaaggact tctttcagag ccctcctatc 600 cgcgagcctg tgaacacaac agaggacccc agcagctact acaccaccag cacactgagc 660 tacgagatcg ataacttcgg cgccaacaag accaagacac tgttcaaggt ggacaaccac 720 acctacgtgc agctggacag accccacaca cctcagtttc tggtgcagct gaacgagaca 780 atccacacca acaacagact gagcaacacc accggcaggc tgatctggac cctggatcct 840 aagatcgaca ccgacatcgg agagtgggcc ttttgggaga acaagaagaa cttcagcaag 900 cagctgagag gcgaggaact gagctttaag gccctgagca ccaagacagg cgccaacgct 960 gtggataccg atgagtctag caagcccggc ctgatcacca acacagttag aggcgttgcc 1020 gacctgctga gcccttggag aagaaagcgg agacaagtga accccaatac caccaacaag 1080 tgcaacccta acctgcacta ctggacagcc caggatgaag gcgctgctgt tggactggcc 1140 tggattcctt attttggacc tgccgccgag ggcatctaca cagagggaat catgcacaac 1200 cagaatggcc tgatctgcgg cctgagacag ctggccaatg agacaacaca ggccctccag 1260 ctgtttctga gagccaccac cgagctgaga acctacagca tcctgaaccg gaaggccatc 1320 gactttctgc tgcaaagatg gggaggcacc tgtagaatcc tgggacctga ttgctgcatc 1380 gagccccacg actggaccaa gaacatcacc gacaagatca accagatcat ccacgacttc 1440 atcgacaacc ctctgcctga ccaggacgac gacgataatt ggtggacagg atggcggcag 1500 tggattcctg ccggaatcgg aatcacaggc gtgatcattg ccattatcgc cctgctgtgc 1560 atctgcaagt ttctgtgctg a 1581 Amino acid sequence (SEQ ID NO: 13): MGGGSRLLQLPRERFRKTSFFVWVIILFQKAFSMPLGVVTNSTLKVTEID 
50 QLVCRDKLSSTSQLKSVGLNLEGNGVATDVPSATKRWGFRSGVPPKVVSY 100 EAGEWAENCYNLEIKKPDGSECLPPPPDGVRGFPRCRYVHKVQGTGPCPG 150 DFAFHKDGAFFLYDRLASTVIYRGTTFAEGVVAFLILPKPKKDFFQSPPI 200 REPVNTTEDPSSYYTTSTLSYEIDNFGANKTKTLFKVDNHTYVQLDRPHT 250 PQFLVQLNETIHTNNRLSNTTGRLIWTLDPKIDTDIGEWAFWENKKNFSK 300 QLRGEELSFKALSTKTGANAVDTDESSKPGLITNTVRGVADLLSPWRRKR 350 RQVNPNTTNKCNPNLHYWTAQDEGAAVGLAWIPYFGPAAEGIYTEGIMHN 400 QNGLICGLRQLANETTQALQLFLRATTELRTYSILNRKAIDFLLQRWGGT 450 CRILGPDCCIEPHDWTKNITDKINQIIHDFIDNPLPDQDDDDNWWTGWRQ 500 WIPAGIGITGVIIAIIALLCICKFLC*
EXAMPLE 6
[0371] Tier 2-11 (RAVV MARV anc)
[0372] Ancestral sequence to the strains Marburg Virus and Ravn Virus
TABLE-US-00015 Nucleotide sequence (SEQ ID NO: 14): atgaagacca tctactttct gatcagcctg atcctgatcc agagcatcaa gaccctgcct 60 gtgctggaaa tcgccagcaa cagtcagccc caggatgtgg atagcgtgtg tagcggcacc 120 ctccagaaaa ccgaggatgt gcacctgatg ggctttaccc tgagcggcca gaaagtggcc 180 gattctccac tggaagccag caagagatgg gcctttagaa ccggcgtgcc acctaagaac 240 gtcgagtaca cagagggcga agaggccaag acctgctaca acatcagcgt gaccgatcct 300 agcggcaaga gcctgctgct ggaccctcct agcaacatca gagactaccc caagtgcaag 360 accgtgcacc acatccaggg acagaatccc catgctcagg gaattgccct gcacctgtgg 420 ggcgcctttt tcctgtatga tcggatcgcc tccaccacca tgtacagagg caaagtgttc 480 accgagggca atatcgccgc catgatcgtg aacaagacag tgcacaagat gatcttcagc 540 cggcaaggcc agggctacag acacatgaat ctgaccagca ccaacaagta ctggaccagc 600 agcaacggca cccagaccaa tgatacaggc tgctttggcg ccctgcaaga gtacaacagc 660 accaagaatc agacatgcgc ccctagcaag atccctcctc cactgcctac tgccagacct 720 gagatcaagc ctaccagcac acctaccgac gccaccaagc tgaacaccac cgatccaaac 780 agcgacgacg aggatctgac aacaagcgga tctggctctg gcgagcaaga gccatacacc 840 acctctgatg ccgtgacaaa gcagggcctg agcagcacaa tgcctccaac accttctcca 900 cagcctagca cacctcagca aggcggcaac aacacaaatc actctcaggg cgccgtgacc 960 gagcctgaca agacaaatac cacagctcag cccagcatgc ctcctcacaa caccaccaca 1020 atctccacca acaacaccag caagcacaac ttcagcacac tgagcgcccc tctccagaat 1080 accaccaact acaataccca gagcaccgcc accgagaacg agcagacatc tgccccttct 1140 aagaccacac tgccacctac cggcaatcct accaccgcca agagcaccaa tagcacaaag 1200 ggccctacca ccaccgctcc taacaccaca aatggccact tcacaagccc aagtcctaca 1260 cctaacagca caacccagca cctggtgtac ttcagacgga agcggagcat cctttggcgc 1320 gagggcgata tgttcccttt cctggacggc ctgatcaaca ccgagatcga cttcgacccc 1380 attccaaaca ccgaaaccat cttcgacgag agccccagct tcaacacctc caccaatgag 1440 gaacagcaca cccctccaaa catctccctg accttcagct acttccccga caagaacggc 1500 gatacagcct acagcggcga gaatgagaat gactgcgacg ccgagctgcg gatttggagc 1560 gttcaagagg atgatctggc tgccggcctg agctggatcc ctttttttgg acctggcatc 1620 gagggcctgt acaccgccgg actgatcaag aaccagaaca 
acctcgtgtg cagactgcgg 1680 agactggcca atcagaccgc caagtctctg gaactgctgc tgcgcgtgac caccgaggaa 1740 agaaccttct ctctgatcaa ccggcacgcc atcgattttc tgctgaccag atggggcggc 1800 acctgtaaag ttctgggccc tgattgctgc atcggaatcg aggacctgag caagaacatc 1860 tccgagcaga tcgacaagat ccgcaaggac gagcagaaag aggaaacagg ctggggactc 1920 ggcggcaagt ggtggacatc tgattggggc gtgctgacca atctgggaat cctgctgctc 1980 ctgtctatcg ccgtgctgat cgccctgagc tgcatctgcc ggatcttcac caagtacatc 2040 ggctga 2046 Amino acid sequence (SEQ ID NO: 15): MKTIYFLISLILIQSIKTLPVLEIASNSQPQDVDSVCSGTLQKTEDVHLM 50 GFTLSGQKVADSPLEASKRWAFRTGVPPKNVEYTEGEEAKTCYNISVTDP 100 SGKSLLLDPPSNIRDYPKCKTVHHIQGQNPHAQGIALHLWGAFFLYDRIA 150 STTMYRGKVFTEGNIAAMIVNKTVHKMIFSRQGQGYRHMNLTSTNKYWTS 200 SNGTQTNDTGCFGALQEYNSTKNQTCAPSKIPPPLPTARPEIKPTSTPTD 250 ATKLNTTDPNSDDEDLTTSGSGSGEQEPYTTSDAVTKQGLSSTMPPTPSP 300 QPSTPQQGGNNTNHSQGAVTEPDKTNTTAQPSMPPHNTTTISTNNTSKHN 350 FSTLSAPLQNTTNYNTQSTATENEQTSAPSKTTLPPTGNPTTAKSTNSTK 400 GPTTTAPNTTNGHFTSPSPTPNSTTQHLVYFRRKRSILWREGDMFPFLDG 450 LINTEIDFDPIPNTETIFDESPSFNTSTNEEQHTPPNISLTFSYFPDKNG 500 DTAYSGENENDCDAELRIWSVQEDDLAAGLSWIPFFGPGIEGLYTAGLIK 550 NQNNLVCRLRRLANQTAKSLELLLRVTTEERTFSLINRHAIDFLLTRWGG 600 TCKVLGPDCCIGIEDLSKNISEQIDKIRKDEQKEETGWGLGGKWWTSDWG 650 VLTNLGILLLLSIAVLIALSCICRIFTKYIG*
EXAMPLE 7
[0373] pEVAC Expression Vector
[0374]
TABLE-US-00016 Sequence of pEVAC Multiple Cloning Site (MCS) (SEQ ID NO: 16):
EXAMPLE 8
[0375] Lead Candidate Optimized Antigenic Ebola Polypeptides Able to Induce a Broadly Neutralizing Antibody Response
[0376] There was significant interest in developing vaccines against Ebola following the West African outbreak in 2014. Programmes currently in clinical development have so far taken a ‘classical’ approach to vaccine development, using Ebola and/or Marburg virus surface glycoproteins (GPs) from one to three strains expressed in a viral vector backbone. Antigen specificity comes only from the included EBOV strains: for example, Merck use a GP from Kikwit; GSK use Mayinga EBOV and Gulu SUDV strains; Crucell and Profectus Biosciences both use a Marburg virus together with Zaire and Sudan Ebola strains; and the Novavax approach is unique in using the 2014 Makona EBOV strain.
[0377] Table 1 below shows flow cytometric assay results illustrating the strength of antibody binding to target antigens, representative of all Ebola virus species (subtypes) and Marburg viruses. Strength of binding is indicated by the heat-map, where red (the darkest shading when viewed in grayscale) is very strong binding, decreasing through orange to yellow (progressively lighter shading when viewed in grayscale), and no binding (values equal to the negative control) is white. Serum samples 1-22 were taken from individuals immunised with other Ebola virus vaccine candidates. T2-4 and T2-6 are nucleic acid vaccines, each encoding a lead candidate optimized antigenic Ebola polypeptide; they were combined with T2-11, a Marburg candidate, and are at the pre-clinical testing stage, with serum samples taken from immunised guinea pigs.
EXAMPLE 9
[0378] Protection Achieved by a Trivalent Lassa, Ebola and Marburg Viral Vaccine (Tri-LEMvac) in an Ebola Challenge Model
[0379] We have developed a trivalent vaccine (Tri-LEMvac) that generates combined vaccine efficacy against future outbreaks of variants of the haemorrhagic fever Lassa, Ebola and Marburg viruses.
[0380] We have bioinformatically designed synthetic glycoprotein sequences from the GPC open reading frames of LASV (L) as well as EBOV (E) and MARV (M) from all available Arenavirus and Filovirus databases. These conserved sequences consist of neutralising antibody and T-cell rich epitopes for each of these viruses. To ensure that these synthetically designed LASV, EBOV and MARV envelopes were functional and antigenic, they were expressed as pseudotypes and quality controlled for both binding and neutralisation against a panel of broadly neutralising antibodies. We chose the vaccine-derived vector Modified Vaccinia Ankara (MVA) for construction of the trivalent LEM vaccine.
[0381] The Modified Vaccinia Ankara (MVA) vaccine platform is a non-replicating strain (i.e. non-replicating in human cells), a third generation smallpox vaccine, and one of the most advanced recombinant poxviral vaccine vectors in human clinical trials (Cottingham & Carroll, Vaccine, 2013, 31(39):4247-51). MVA is a robust vector system capable of co-expressing up to four transgenes using potent promoters and stable insertion sites (Orubu et al, PLoS ONE, 2012, 7(6):e40167). MVA was chosen because of: 1) its significant capacity to stably express multiple independent ORFs via compatible expression cassettes with strong and timely regulated promoters, allowing trivalent LEM vaccination in one cost-effective vaccine lot; 2) its ability to induce robust B and T-cell immune responses in animals and humans, especially when primed or boosted with DNA or RNA vectors; and 3) the fact that vaccine lots can be thermally stabilised for storage and transport in developing countries in the absence of a cold chain (Frey et al, Vaccine, 2015, 33(39):5225-34). Proof of principle for the trivalent vaccine candidate has been demonstrated by: i) cassette validation for independent L, E and M GPC expression and epitope presentation; and ii) preclinical efficacy by Filovirus challenge. The challenge study results are shown in
EXAMPLE 10
[0382] Pseudotype Virus Neutralisation Assay
[0383]
[0384] T2-4 and T2-6 are nucleic acid vaccines, each encoding a lead candidate optimized antigenic Ebola polypeptide; they were combined with T2-11, a Marburg candidate, and are at the pre-clinical testing stage, with serum samples taken from immunised guinea pigs.
[0385] The results show that administering a combination of T2-6 and T2-11 vaccine inserts gave a synergistic increase in the breadth of the immune response.
EXAMPLE 11
[0386] Antibody Binding Assay
[0387]
EXAMPLE 12
[0388] Comparison of Immune Responses Induced by Two Different Computational Approaches
[0389] Four groups of six mice were immunized five times, at two-week intervals, with 25 μg of four separate pEVAC plasmids encoding HA gene antigens that were designed either by a method according to an embodiment of the invention (DIOS) or by a conventional method (COBRA).
[0390] Antibody-based FACS was carried out on cells expressing two different group 1 influenza A glycoproteins on their cell surface (seasonal H1N1, and pandemic origin H1N1). These were used to test mouse sera from animals immunized with either the COBRA or DIOS HA gene antigens. The results are shown in
[0391] Overall, the DIOS HA gene antigens matched or significantly out-performed the COBRA HA gene antigens (** p<0.01, *** p<0.001).
EXAMPLE 13
[0392] Cross-HA-Group Binding, and Pseudotype Neutralization of H7N9 (A/Shanghai/2/2013)
[0393] We tested whether the DIOS-H1N1pdm vaccine of Example 12 (which produced higher levels of antibody binding than H1N1-COBRA to the pandemic H1 HA antigen) could evoke antibodies that recognize and bind divergent group 2 virus HA, such as that from the pandemic-potential H7N9 strain A/Shanghai/2/2013.
[0394]
[0395] These results support a conclusion that the DIOS-H1N1pdm immunogen cross-neutralizes H7, and that cross-HA-group immune protection is possible with vaccines produced by methods of the invention.
EXAMPLE 14
[0396] Lassa Virus Glycoprotein
[0397] This example describes Lassa virus glycoprotein ancestral sequence produced using a method according to an embodiment of the invention, and modifications to the ancestral sequence to improve its immunogenicity by stabilising the structure.
[0398] Lassa fever is a hemorrhagic disease caused by an Old World (OW) arenavirus known as Lassa virus (LASV). The virus was first isolated in Nigeria in 1969 and is currently endemic in West Africa. Due to the high morbidity and mortality associated with Lassa hemorrhagic fever, LASV is classified as a category A pathogen.
[0399] Lassa virus is an enveloped ambisense RNA virus with a bisegmented genome. Viral particles are covered in mature glycoprotein (GP) trimeric spikes, which mediate viral entry. Like other class I viral fusion proteins, the envelope glycoprotein precursor (GPC) is translated as a single polypeptide and is proteolytically cleaved into three subunits. Processing occurs first in the endoplasmic reticulum (ER) by a cellular signal peptidase. GPC is then trafficked to the cis-Golgi apparatus and processed by cellular proprotein convertase subtilisin kexin isozyme-1/site-1 protease (SKI-1/S1P) to produce a noncovalent stable-signal peptide (SSP)/GP1/GP2 heterotrimer. Unlike other class I fusion proteins, the relatively long signal peptide of GPC is not degraded; it serves a chaperone-like function necessary for the correct trafficking and processing of GP. SSP interacts with the cytoplasmic domain of GP2 and is involved in pH sensing. GP1 is responsible for binding to cellular receptors, while GP2 mediates membrane fusion during viral entry.
[0400] Lassa virus glycoprotein ancestral sequence to lineages III and IV (L-10) (construct 1) was produced using a method according to an embodiment of the invention. Modifications were then introduced independently into the parental ancestral sequence (construct 1) to provide: (A) SOSEP (construct 2); and (B) FLEP (construct 4), as well as in combination with a glycan knock-out, called NtoK (to provide constructs 3 and 5), to stabilize the otherwise flexible heterotrimers and prevent dissociation of the external domain of the glycoprotein from the non-covalently linked transmembrane domain.
[0401] (A) Two cysteine residues were introduced at positions 207 and 360 to allow formation of a disulfide bridge (SOS) between the exterior and the transmembrane domains of GP. To facilitate complete cleavage between these two domains, the furin cleavage site was modified from RRLL to RRRR at positions 256-259. Mutation of glutamate to proline at position 329 (EP) prevents structural rearrangements, making the protein less flexible.
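[0402] (B) The furin cleavage site (256-RRLL-259) between the C-terminus of the external domain and the N-terminus of the transmembrane domain was replaced by a flexible linker with the sequence 256-GGGGSGGGGS-265. Additionally, the EP mutation as in (A) was introduced at position 335, which corresponds to position 329 of construct 1 because the ten-residue linker replaces four residues and shifts downstream numbering by six.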
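The edits described in (A) and (B) can be replayed on a fragment of construct 1 as a consistency check on the position numbering. The following is a minimal Python sketch, not the procedure used to generate the constructs. The fragment is residues 201-400 copied from SEQ ID NO: 18, and the helper names (`substitute`, `replace_range`, `shifted`) are illustrative assumptions.

```python
# Residues 201-400 of construct 1 (SEQ ID NO: 18), copied from the listing.
FRAGMENT_START = 201
FRAGMENT = (
    "IALDSGRGNWDCIMTSYQYLIIQNTTWEDHCQFSRPSPIGYLGLLSQRTR"  # 201-250
    "DIYISRRLLGTFTWTLSDSEGNETPGGYCLTRWMLIEAELKCFGNTAVAK"  # 251-300
    "CNEKHDEEFCDMLRLFDFNKQAIRRLKAEAQMSIQLINKAVNALINDQLI"  # 301-350
    "MKNHLRDIMGIPYCNYSKYWYLNHTITGKTSLPKCWLVSNGSYLNETHFS"  # 351-400
)


def substitute(seq, start, subs):
    """Apply point substitutions {position: (old, new)}; positions are 1-based."""
    residues = list(seq)
    for pos, (old, new) in subs.items():
        idx = pos - start
        if residues[idx] != old:
            raise ValueError(f"expected {old} at {pos}, found {residues[idx]}")
        residues[idx] = new
    return "".join(residues)


def replace_range(seq, start, first, last, expected, new):
    """Replace residues first..last (inclusive, 1-based) with the string `new`."""
    i, j = first - start, last - start + 1
    if seq[i:j] != expected:
        raise ValueError(f"expected {expected} at {first}-{last}, found {seq[i:j]}")
    return seq[:i] + new + seq[j:]


def shifted(pos, first, last, new_len):
    """Renumber a position downstream of a replaced range first..last."""
    if pos <= last:
        raise ValueError("position must lie downstream of the replaced range")
    return pos + new_len - (last - first + 1)


# (A) SOSEP: cysteine pair, RRLL -> RRRR, and E329P (construct 1 numbering).
SOSEP_SUBS = {
    207: ("R", "C"),
    360: ("G", "C"),
    258: ("L", "R"),
    259: ("L", "R"),
    329: ("E", "P"),
}
sosep = substitute(FRAGMENT, FRAGMENT_START, SOSEP_SUBS)

# (B) FLEP: RRLL (256-259) replaced by a ten-residue linker, then the EP
# mutation applied at the renumbered position.
LINKER = "GGGGSGGGGS"
ep_in_flep = shifted(329, 256, 259, len(LINKER))
flep = replace_range(FRAGMENT, FRAGMENT_START, 256, 259, "RRLL", LINKER)
flep = substitute(flep, FRAGMENT_START, {ep_in_flep: ("E", "P")})

print(sosep[50:60])   # residues 251-260 of the SOSEP fragment: DIYISRRRRG
print(flep[50:66])    # residues 251-266 of the FLEP fragment: DIYISGGGGSGGGGSG
print(ep_in_flep)     # 335
```

The same `shifted` helper gives 278 for position 272, matching the NtoK position stated for the FLEP design.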
[0403] Variants of both designs were generated that additionally contain an asparagine-to-lysine mutation at position 272 (SOSEP-NtoK) or 278 (FLEP-NtoK) to inactivate a glycosylation motif. Glycans at this position might block access by some neutralizing antibodies, such as 37.7H.
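The effect of the NtoK mutation on the glycosylation motif can be illustrated with a short Python sketch. It assumes the standard N-linked glycosylation sequon, N-X-S/T with X being any residue except proline, which the text above does not spell out. The windows are copied from SEQ ID NO: 20 and SEQ ID NO: 22 (residues 261-280), and the function name `sequon_positions` is a hypothetical helper.

```python
import re

# Assumed motif: N-X-S/T, where X is any residue except proline.
# The lookahead lets overlapping motifs be reported as well.
SEQUON = re.compile(r"(?=N[^P][ST])")


def sequon_positions(seq, start=1):
    """Return 1-based positions of the asparagine in each sequon found in seq."""
    return [m.start() + start for m in SEQUON.finditer(seq)]


# Residues 261-280 of construct 2 (SOSEP) and construct 3 (SOSEP-NtoK).
WINDOW_START = 261
sosep_window = "TFTWTLSDSEGNETPGGYCL"
sosep_ntok_window = "TFTWTLSDSEGKETPGGYCL"

print(sequon_positions(sosep_window, WINDOW_START))       # [272]
print(sequon_positions(sosep_ntok_window, WINDOW_START))  # []

# In the FLEP constructs the same window starts at residue 267, because the
# linker adds six residues upstream, so the motif is reported at 278.
print(sequon_positions(sosep_window, 267))                # [278]
```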
[0404] Construct 1:
[0405] Lassa Virus Glycoprotein Ancestral Sequence to Lineages III and IV (L-10=LASV III IV anc)
TABLE-US-00017 Amino acid sequence (SEQ ID NO: 18): MGQIVTFFQEVPHVIEEVMNIVLIALSLLAILKGLYNVATCGLIGLVTFL 50 LLCGRSCSTTLYKGVYELQTLELNMETLNMTMPLSCTKNNSHHYIRVGNE 100 TGLELTLTNTSIINHKFCNLSDAHKKNLYDHALMSIISTFHLSIPNFNQY 150 EAMSCDFNGGKISVQYNLSHSYAVDAANHCGTVANGVLQTFMRMAWGGSY 200 IALDSGRGNWDCIMTSYQYLIIQNTTWEDHCQFSRPSPIGYLGLLSQRTR 250 DIYISRRLLGTFTWTLSDSEGNETPGGYCLTRWMLIEAELKCFGNTAVAK 300 CNEKHDEEFCDMLRLFDFNKQAIRRLKAEAQMSIQLINKAVNALINDQLI 350 MKNHLRDIMGIPYCNYSKYWYLNHTITGKTSLPKCWLVSNGSYLNETHFS 400 DDIEQQADNMITEMLQKEYMDRQGKTPLGLVDLFVFSTSFYLISIFLHLV 450 KIPTHRHIVGKPCPKPHRLNHMGICSCGLYKQPGVPVRWKR* DNA sequence (SEQ ID NO: 19): 1 ATGGGCCAGA TCGTGACATT CTTCCAAGAG GTGCCCCACG TGATCGAAGA 51 AGTGATGAAC ATCGTCCTGA TCGCCCTGAG CCTGCTGGCC ATCCTGAAGG 101 GCCTGTATAA TGTGGCCACC TGTGGCCTGA TCGGCCTGGT CACATTTCTG 151 CTGCTGTGCG GCAGAAGCTG CTCCACCACA CTGTATAAGG GCGTGTACGA 201 GCTGCAAACC CTGGAACTGA ACATGGAAAC CCTGAACATG ACCATGCCTC 251 TGAGCTGCAC CAAGAACAAC AGCCACCACT ACATCAGAGT GGGCAACGAG 301 ACAGGCCTCG AGCTGACCCT GACCAACACC AGCATCATCA ACCACAAGTT 351 CTGCAACCTG AGCGACGCCC ACAAGAAGAA CCTGTACGAT CACGCCCTGA 401 TGAGCATCAT CTCCACCTTC CACCTGAGCA TCCCCAACTT CAACCAGTAC 451 GAGGCCATGA GCTGCGACTT CAACGGCGGA AAGATCAGCG TGCAGTACAA 501 TCTGAGCCAC AGCTATGCCG TGGACGCCGC CAATCATTGT GGAACAGTGG 551 CCAATGGCGT GCTCCAGACC TTCATGAGAA TGGCCTGGGG CGGCAGCTAT 601 ATCGCCCTGG ATTCTGGCAG AGGCAACTGG GACTGCATCA TGACCAGCTA 651 CCAGTACCTG ATCATCCAGA ACACCACCTG GGAAGATCAC TGCCAGTTCA 701 GCAGACCCTC TCCTATCGGA TACCTGGGCC TGCTGTCCCA GAGAACCCGG 751 GACATCTACA TCTCTAGACG GCTGCTGGGC ACCTTCACCT GGACACTGTC 801 TGATAGCGAG GGCAATGAGA CACCTGGCGG CTACTGTCTG ACCCGGTGGA 851 TGCTGATTGA GGCCGAGCTG AAGTGCTTCG GAAATACCGC CGTGGCCAAG 901 TGCAACGAGA AGCACGACGA GGAATTCTGC GACATGCTGC GGCTGTTCGA 951 TTTCAACAAG CAGGCCATCA GACGGCTGAA GGCCGAGGCT CAGATGTCCA 1001 TCCAGCTGAT CAACAAGGCC GTGAATGCCC TGATTAACGA CCAGCTCATC 1051 ATGAAGAACC ACCTCAGGGA CATCATGGGC ATCCCTTACT GCAACTACAG 1101 CAAGTACTGG TATCTGAACC ACACCATCAC CGGCAAGACC AGCCTGCCTA 1151 AGTGCTGGCT 
GGTGTCCAAC GGCAGCTACC TGAACGAGAC ACACTTCAGC 1201 GACGACATCG AGCAGCAGGC CGACAACATG ATCACCGAGA TGCTCCAGAA 1251 AGAGTACATG GACCGGCAGG GCAAGACACC TCTGGGCCTT GTGGATCTGT 1301 TCGTGTTCAG CACCAGCTTC TACCTGATCT CTATCTTCCT GCACCTGGTC 1351 AAGATCCCCA CACACAGACA CATCGTGGGC AAGCCCTGTC CTAAGCCTCA 1401 CAGACTGAAC CATATGGGCA TCTGTAGCTG CGGCCTGTAC AAACAGCCTG 1451 GCGTGCCAGT GCGGTGGAAG AGATAA
[0406] Construct 2:
[0407] SOSEP-Variant of Construct 1 (L-10-SOSEP)
TABLE-US-00018 Amino acid sequence (SEQ ID NO: 20): MGQIVTFFQEVPHVIEEVMNIVLIALSLLAILKGLYNVATCGLIGLVTFL 50 LLCGRSCSTTLYKGVYELQTLELNMETLNMTMPLSCTKNNSHHYIRVGNE 100 TGLELTLTNTSIINHKFCNLSDAHKKNLYDHALMSIISTFHLSIPNFNQY 150 EAMSCDFNGGKISVQYNLSHSYAVDAANHCGTVANGVLQTFMRMAWGGSY 200 IALDSGCGNWDCIMTSYQYLIIQNTTWEDHCQFSRPSPIGYLGLLSQRTR 250 DIYISRRRRGTFTWTLSDSEGNETPGGYCLTRWMLIEAELKCFGNTAVAK 300 CNEKHDEEFCDMLRLFDFNKQAIRRLKAPAQMSIQLINKAVNALINDQLI 350 MKNHLRDIMCIPYCNYSKYWYLNHTITGKTSLPKCWLVSNGSYLNETHFS 400 DDIEQQADNMITEMLQKEYMDRQGKTPLGLVDLFVFSTSFYLISIFLHLV 450 KIPTHRHIVGKPCPKPHRLNHMGICSCGLYKQPGVPVRWKR* DNA-sequence (SEQ ID NO: 21): 1 ATGGGCCAGA TCGTGACATT CTTCCAAGAG GTGCCCCACG TGATCGAAGA 51 AGTGATGAAC ATCGTCCTGA TCGCCCTGAG CCTGCTGGCC ATCCTGAAGG 101 GCCTGTATAA TGTGGCCACC TGTGGCCTGA TCGGCCTGGT CACATTTCTG 151 CTGCTGTGCG GCAGAAGCTG CTCCACCACA CTGTATAAGG GCGTGTACGA 201 GCTGCAAACC CTGGAACTGA ACATGGAAAC CCTGAACATG ACCATGCCTC 251 TGAGCTGCAC CAAGAACAAC AGCCACCACT ACATCAGAGT GGGCAACGAG 301 ACAGGCCTCG AGCTGACCCT GACCAACACC AGCATCATCA ACCACAAGTT 351 CTGCAACCTG AGCGACGCCC ACAAGAAGAA CCTGTACGAT CACGCCCTGA 401 TGAGCATCAT CTCCACCTTC CACCTGAGCA TCCCCAACTT CAACCAGTAC 451 GAGGCCATGA GCTGCGACTT CAACGGCGGA AAGATCAGCG TGCAGTACAA 501 TCTGAGCCAC AGCTATGCCG TGGACGCCGC CAATCATTGT GGAACAGTGG 551 CCAATGGCGT GCTCCAGACC TTCATGAGAA TGGCCTGGGG CGGCAGCTAT 601 ATCGCCCTGG ATTCTGGCTG TGGCAACTGG GACTGCATCA TGACCAGCTA 651 CCAGTACCTG ATCATCCAGA ACACCACCTG GGAAGATCAC TGCCAGTTCA 701 GCAGACCCTC TCCTATCGGA TACCTGGGCC TGCTGTCCCA GAGAACCCGG 751 GACATCTACA TCTCTCGGCG GAGAAGAGGC ACCTTCACCT GGACACTGTC 801 TGATAGCGAG GGCAATGAGA CACCTGGCGG CTACTGTCTG ACCCGGTGGA 851 TGCTGATTGA GGCCGAGCTG AAGTGCTTCG GAAATACCGC CGTGGCCAAG 901 TGCAACGAGA AGCACGACGA GGAATTCTGC GACATGCTGC GGCTGTTCGA 951 TTTCAACAAG CAGGCCATCA GACGGCTGAA GGCCCCTGCT CAGATGTCCA 1001 TCCAGCTGAT CAACAAGGCC GTGAATGCCC TGATTAACGA CCAGCTCATC 1051 ATGAAGAACC ACCTCAGGGA CATCATGTGC ATCCCTTACT GCAACTACAG 1101 CAAGTACTGG TATCTGAACC ACACCATCAC CGGCAAGACC AGCCTGCCTA 1151 AGTGCTGGCT 
GGTGTCCAAC GGCAGCTACC TGAACGAGAC ACACTTCAGC 1201 GACGACATCG AGCAGCAGGC CGACAACATG ATCACCGAGA TGCTCCAGAA 1251 AGAGTACATG GACCGGCAGG GCAAGACACC TCTGGGCCTT GTGGATCTGT 1301 TCGTGTTCAG CACCAGCTTC TACCTGATCT CTATCTTCCT GCACCTGGTC 1351 AAGATCCCCA CACACAGACA CATCGTGGGC AAGCCCTGTC CTAAGCCTCA 1401 CAGACTGAAC CATATGGGCA TCTGTAGCTG CGGCCTGTAC AAACAGCCTG 1451 GCGTGCCAGT GCGGTGGAAG AGATAA
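As a cross-check between the protein and DNA listings, the codon carrying the cysteine substitution at position 207 can be located by comparing one row of SEQ ID NO: 19 with the same row of SEQ ID NO: 21. The following Python sketch assumes the standard genetic code and uses only the 50-nucleotide row that starts at nucleotide 601 in each listing. The helper names are illustrative and not part of the original disclosure.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption: the
# listings use the standard code, which the text does not state explicitly).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codons(dna):
    """Split dna into complete codons, dropping any trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return [dna[i:i + 3] for i in range(0, usable, 3)]


def translate(dna):
    """Translate the complete codons of dna into one-letter amino acids."""
    return "".join(CODON_TABLE[c] for c in codons(dna))


def codon_changes(ref, alt, first_codon):
    """List (codon number, ref codon, ref aa, alt codon, alt aa) where codons differ."""
    return [
        (first_codon + i, r, CODON_TABLE[r], a, CODON_TABLE[a])
        for i, (r, a) in enumerate(zip(codons(ref), codons(alt)))
        if r != a
    ]


# Row starting at nucleotide 601, with the spacing removed.
ROW_START_NT = 601
construct1_row = "ATCGCCCTGGATTCTGGCAGAGGCAACTGGGACTGCATCATGACCAGCTA"  # SEQ ID NO: 19
construct2_row = "ATCGCCCTGGATTCTGGCTGTGGCAACTGGGACTGCATCATGACCAGCTA"  # SEQ ID NO: 21

# The row is in frame only if it starts on the first base of a codon.
assert (ROW_START_NT - 1) % 3 == 0
first_codon = (ROW_START_NT - 1) // 3 + 1

print(first_codon)                # 201
print(translate(construct1_row))  # IALDSGRGNWDCIMTS
print(translate(construct2_row))  # IALDSGCGNWDCIMTS
print(codon_changes(construct1_row, construct2_row, first_codon))
# [(207, 'AGA', 'R', 'TGT', 'C')]
```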
[0408] Construct 3:
[0409] SOSEP-Variant of Construct 1 with N-to-K-Mutation (L-10-SOSEP-NtoK)
TABLE-US-00019 Amino acid sequence (SEQ ID NO: 22): MGQIVTFFQEVPHVIEEVMNIVLIALSLLAILKGLYNVATCGLIGLVTFL 50 LLCGRSCSTTLYKGVYELQTLELNMETLNMTMPLSCTKNNSHHYIRVGNE 100 TGLELTLTNTSIINHKFCNLSDAHKKNLYDHALMSIISTFHLSIPNFNQY 150 EAMSCDFNGGKISVQYNLSHSYAVDAANHCGTVANGVLQTFMRMAWGGSY 200 IALDSGCGNWDCIMTSYQYLIIQNTTWEDHCQFSRPSPIGYLGLLSQRTR 250 DIYISRRRRGTFTWTLSDSEGKETPGGYCLTRWMLIEAELKCFGNTAVAK 300 CNEKHDEEFCDMLRLFDFNKQAIRRLKAPAQMSIQLINKAVNALINDQLI 350 MKNHLRDIMCIPYCNYSKYWYLNHTITGKTSLPKCWLVSNGSYLNETHFS 400 DDIEQQADNMITEMLQKEYMDRQGKTPLGLVDLFVFSTSFYLISIFLHLV 450 KIPTHRHIVGKPCPKPHRLNHMGICSCGLYKQPGVPVRWKR* DNA-sequence (SEQ ID NO: 23): 1 ATGGGCCAGA TCGTGACATT CTTCCAAGAG GTGCCCCACG TGATCGAAGA 51 AGTGATGAAC ATCGTCCTGA TCGCCCTGAG CCTGCTGGCC ATCCTGAAGG 101 GCCTGTATAA TGTGGCCACC TGTGGCCTGA TCGGCCTGGT CACATTTCTG 151 CTGCTGTGCG GCAGAAGCTG CTCCACCACA CTGTATAAGG GCGTGTACGA 201 GCTGCAAACC CTGGAACTGA ACATGGAAAC CCTGAACATG ACCATGCCTC 251 TGAGCTGCAC CAAGAACAAC AGCCACCACT ACATCAGAGT GGGCAACGAG 301 ACAGGCCTCG AGCTGACCCT GACCAACACC AGCATCATCA ACCACAAGTT 351 CTGCAACCTG AGCGACGCCC ACAAGAAGAA CCTGTACGAT CACGCCCTGA 401 TGAGCATCAT CTCCACCTTC CACCTGAGCA TCCCCAACTT CAACCAGTAC 451 GAGGCCATGA GCTGCGACTT CAACGGCGGA AAGATCAGCG TGCAGTACAA 501 TCTGAGCCAC AGCTATGCCG TGGACGCCGC CAATCATTGT GGAACAGTGG 551 CCAATGGCGT GCTCCAGACC TTCATGAGAA TGGCCTGGGG CGGCAGCTAT 601 ATCGCCCTGG ATTCTGGCTG TGGCAACTGG GACTGCATCA TGACCAGCTA 651 CCAGTACCTG ATCATCCAGA ACACCACCTG GGAAGATCAC TGCCAGTTCA 701 GCAGACCCTC TCCTATCGGA TACCTGGGCC TGCTGTCCCA GAGAACCCGG 751 GACATCTACA TCTCTCGGCG GAGAAGAGGC ACCTTCACCT GGACACTGTC 801 TGATAGCGAG GGCAAAGAGA CACCTGGCGG CTACTGTCTG ACCCGGTGGA 851 TGCTGATTGA GGCCGAGCTG AAGTGCTTCG GAAATACCGC CGTGGCCAAG 901 TGCAACGAGA AGCACGACGA GGAATTCTGC GACATGCTGC GGCTGTTCGA 951 TTTCAACAAG CAGGCCATCA GACGGCTGAA GGCCCCTGCT CAGATGTCCA 1001 TCCAGCTGAT CAACAAGGCC GTGAATGCCC TGATTAACGA CCAGCTCATC 1051 ATGAAGAACC ACCTCAGGGA CATCATGTGC ATCCCTTACT GCAACTACAG 1101 CAAGTACTGG TATCTGAACC ACACCATCAC CGGCAAGACC AGCCTGCCTA 1151 AGTGCTGGCT 
GGTGTCCAAC GGCAGCTACC TGAACGAGAC ACACTTCAGC 1201 GACGACATCG AGCAGCAGGC CGACAACATG ATCACCGAGA TGCTCCAGAA 1251 AGAGTACATG GACCGGCAGG GCAAGACACC TCTGGGCCTT GTGGATCTGT 1301 TCGTGTTCAG CACCAGCTTC TACCTGATCT CTATCTTCCT GCACCTGGTC 1351 AAGATCCCCA CACACAGACA CATCGTGGGC AAGCCCTGTC CTAAGCCTCA 1401 CAGACTGAAC CATATGGGCA TCTGTAGCTG CGGCCTGTAC AAACAGCCTG 1451 GCGTGCCAGT GCGGTGGAAG AGATAA
[0410] Construct 4:
[0411] FLEP-Variant of Construct 1 (L-10-FLEP)
TABLE-US-00020 Amino acid sequence (SEQ ID NO: 24): MGQIVTFFQEVPHVIEEVMNIVLIALSLLAILKGLYNVATCGLIGLVTFL 50 LLCGRSCSTTLYKGVYELQTLELNMETLNMTMPLSCTKNNSHHYIRVGNE 100 TGLELTLTNTSIINHKFCNLSDAHKKNLYDHALMSIISTFHLSIPNFNQY 150 EAMSCDFNGGKISVQYNLSHSYAVDAANHCGTVANGVLQTFMRMAWGGSY 200 IALDSGRGNWDCIMTSYQYLIIQNTTWEDHCQFSRPSPIGYLGLLSQRTR 250 DIYISGGGGSGGGGSGTFTWTLSDSEGNETPGGYCLTRWMLIEAELKCFG 300 NTAVAKCNEKHDEEFCDMLRLFDFNKQAIRRLKAPAQMSIQLINKAVNAL 350 INDQLIMKNHLRDIMGIPYCNYSKYWYLNHTITGKTSLPKCWLVSNGSYL 400 NETHFSDDIEQQADNMITEMLQKEYMDRQGKTPLGLVDLFVFSTSFYLIS 450 IFLHLVKIPTHRHIVGKPCPKPHRLNHMGICSCGLYKQPGVPVRWKR* DNA-sequence (SEQ ID NO: 25): 1 ATGGGCCAGA TCGTGACATT CTTCCAAGAG GTGCCCCACG TGATCGAAGA 51 AGTGATGAAC ATCGTCCTGA TCGCCCTGAG CCTGCTGGCC ATCCTGAAGG 101 GCCTGTATAA TGTGGCCACC TGTGGCCTGA TCGGCCTGGT CACATTTCTG 151 CTGCTGTGCG GCAGAAGCTG CTCCACCACA CTGTATAAGG GCGTGTACGA 201 GCTGCAAACC CTGGAACTGA ACATGGAAAC CCTGAACATG ACCATGCCTC 251 TGAGCTGCAC CAAGAACAAC AGCCACCACT ACATCAGAGT GGGCAACGAG 301 ACAGGCCTCG AGCTGACCCT GACCAACACC AGCATCATCA ACCACAAGTT 351 CTGCAACCTG AGCGACGCCC ACAAGAAGAA CCTGTACGAT CACGCCCTGA 401 TGAGCATCAT CTCCACCTTC CACCTGAGCA TCCCCAACTT CAACCAGTAC 451 GAGGCCATGA GCTGCGACTT CAACGGCGGA AAGATCAGCG TGCAGTACAA 501 TCTGAGCCAC AGCTATGCCG TGGACGCCGC CAATCATTGT GGAACAGTGG 551 CCAATGGCGT GCTCCAGACC TTCATGAGAA TGGCCTGGGG CGGCAGCTAT 601 ATCGCCCTGG ATTCTGGCAG AGGCAACTGG GACTGCATCA TGACCAGCTA 651 CCAGTACCTG ATCATCCAGA ACACCACCTG GGAAGATCAC TGCCAGTTCA 701 GCAGACCCTC TCCTATCGGA TACCTGGGCC TGCTGTCCCA GAGAACCCGG 751 GACATCTACA TCTCTGGCGG CGGAGGATCT GGCGGAGGTG GAAGTGGCAC 801 CTTCACCTGG ACACTGTCTG ATAGCGAGGG CAATGAGACA CCTGGCGGCT 851 ACTGTCTGAC CCGGTGGATG CTGATTGAGG CCGAGCTGAA GTGCTTCGGA 901 AATACCGCCG TGGCCAAGTG CAACGAGAAG CACGACGAGG AATTCTGCGA 951 CATGCTGCGG CTGTTCGATT TCAACAAGCA GGCCATCAGA CGGCTGAAGG 1001 CCCCTGCTCA GATGTCCATC CAGCTGATCA ACAAGGCCGT GAATGCCCTG 1051 ATTAACGACC AGCTCATCAT GAAGAACCAC CTCAGGGACA TCATGGGCAT 1101 CCCTTACTGC AACTACAGCA AGTACTGGTA TCTGAACCAC ACCATCACCG 1151 GCAAGACCAG 
CCTGCCTAAG TGCTGGCTGG TGTCCAACGG CAGCTACCTG 1201 AACGAGACAC ACTTCAGCGA CGACATCGAG CAGCAGGCCG ACAACATGAT 1251 CACCGAGATG CTCCAGAAAG AGTACATGGA CCGGCAGGGC AAGACACCTC 1301 TGGGCCTTGT GGATCTGTTC GTGTTCAGCA CCAGCTTCTA CCTGATCTCT 1351 ATCTTCCTGC ACCTGGTCAA GATCCCCACA CACAGACACA TCGTGGGCAA 1401 GCCCTGTCCT AAGCCTCACA GACTGAACCA TATGGGCATC TGTAGCTGCG 1451 GCCTGTACAA ACAGCCTGGC GTGCCAGTGC GGTGGAAGAG ATAA
[0412] Construct 5:
[0413] FLEP-Variant of Construct 1 with N-to-K-Mutation (L-10-FLEP-NtoK)
TABLE-US-00021 Amino acid sequence (SEQ ID NO: 26): MGQIVTFFQEVPHVIEEVMNIVLIALSLLAILKGLYNVATCGLIGLVTFL 50 LLCGRSCSTTLYKGVYELQTLELNMETLNMTMPLSCTKNNSHHYIRVGNE 100 TGLELTLTNTSIINHKFCNLSDAHKKNLYDHALMSIISTFHLSIPNFNQY 150 EAMSCDFNGGKISVQYNLSHSYAVDAANHCGTVANGVLQTFMRMAWGGSY 200 IALDSGRGNWDCIMTSYQYLIIQNTTWEDHCQFSRPSPIGYLGLLSQRTR 250 DIYISGGGGSGGGGSGTFTWTLSDSEGKETPGGYCLTRWMLIEAELKCFG 300 NTAVAKCNEKHDEEFCDMLRLFDFNKQAIRRLKAPAQMSIQLINKAVNAL 350 INDQLIMKNHLRDIMGIPYCNYSKYWYLNHTITGKTSLPKCWLVSNGSYL 400 NETHFSDDIEQQADNMITEMLQKEYMDRQGKTPLGLVDLFVFSTSFYLIS 450 IFLHLVKIPTHRHIVGKPCPKPHRLNHMGICSCGLYKQPGVPVRWKR* DNA-sequence (SEQ ID NO: 27): 1 ATGGGCCAGA TCGTGACATT CTTCCAAGAG GTGCCCCACG TGATCGAAGA 51 AGTGATGAAC ATCGTCCTGA TCGCCCTGAG CCTGCTGGCC ATCCTGAAGG 101 GCCTGTATAA TGTGGCCACC TGTGGCCTGA TCGGCCTGGT CACATTTCTG 151 CTGCTGTGCG GCAGAAGCTG CTCCACCACA CTGTATAAGG GCGTGTACGA 201 GCTGCAAACC CTGGAACTGA ACATGGAAAC CCTGAACATG ACCATGCCTC 251 TGAGCTGCAC CAAGAACAAC AGCCACCACT ACATCAGAGT GGGCAACGAG 301 ACAGGCCTCG AGCTGACCCT GACCAACACC AGCATCATCA ACCACAAGTT 351 CTGCAACCTG AGCGACGCCC ACAAGAAGAA CCTGTACGAT CACGCCCTGA 401 TGAGCATCAT CTCCACCTTC CACCTGAGCA TCCCCAACTT CAACCAGTAC 451 GAGGCCATGA GCTGCGACTT CAACGGCGGA AAGATCAGCG TGCAGTACAA 501 TCTGAGCCAC AGCTATGCCG TGGACGCCGC CAATCATTGT GGAACAGTGG 551 CCAATGGCGT GCTCCAGACC TTCATGAGAA TGGCCTGGGG CGGCAGCTAT 601 ATCGCCCTGG ATTCTGGCAG AGGCAACTGG GACTGCATCA TGACCAGCTA 651 CCAGTACCTG ATCATCCAGA ACACCACCTG GGAAGATCAC TGCCAGTTCA 701 GCAGACCCTC TCCTATCGGA TACCTGGGCC TGCTGTCCCA GAGAACCCGG 751 GACATCTACA TCTCTGGCGG CGGAGGATCT GGCGGAGGTG GAAGTGGCAC 801 CTTCACCTGG ACACTGTCTG ATAGCGAGGG CAAAGAGACA CCTGGCGGCT 851 ACTGTCTGAC CCGGTGGATG CTGATTGAGG CCGAGCTGAA GTGCTTCGGA 901 AATACCGCCG TGGCCAAGTG CAACGAGAAG CACGACGAGG AATTCTGCGA 951 CATGCTGCGG CTGTTCGATT TCAACAAGCA GGCCATCAGA CGGCTGAAGG 1001 CCCCTGCTCA GATGTCCATC CAGCTGATCA ACAAGGCCGT GAATGCCCTG 1051 ATTAACGACC AGCTCATCAT GAAGAACCAC CTCAGGGACA TCATGGGCAT 1101 CCCTTACTGC AACTACAGCA AGTACTGGTA TCTGAACCAC ACCATCACCG 1151 GCAAGACCAG 
CCTGCCTAAG TGCTGGCTGG TGTCCAACGG CAGCTACCTG 1201 AACGAGACAC ACTTCAGCGA CGACATCGAG CAGCAGGCCG ACAACATGAT 1251 CACCGAGATG CTCCAGAAAG AGTACATGGA CCGGCAGGGC AAGACACCTC 1301 TGGGCCTTGT GGATCTGTTC GTGTTCAGCA CCAGCTTCTA CCTGATCTCT 1351 ATCTTCCTGC ACCTGGTCAA GATCCCCACA CACAGACACA TCGTGGGCAA 1401 GCCCTGTCCT AAGCCTCACA GACTGAACCA TATGGGCATC TGTAGCTGCG 1451 GCCTGTACAA ACAGCCTGGC GTGCCAGTGC GGTGGAAGAG ATAA
EXAMPLE 15
[0414] Lassa Virus Nucleoprotein
[0415] This example describes a Lassa virus nucleoprotein sequence ancestral to Nigerian Lassa isolates, produced using a method according to an embodiment of the invention.
[0416] Construct 6:
[0417] Lassa Virus Nucleoprotein Ancestral Sequence of Nigerian Lassa Isolates (L-NP-1=L-NP-CovAnc-1 N)
TABLE-US-00022 Amino acid sequence (SEQ ID NO: 28): MSASKEVKSFLWTQSLRRELSGYCSNIKLQVVKDAQALLHGLDFSEVSNV 50 QRLMRKQKRDDSDLKRLRDLNQAVNNLVELKSTQQKSILRVGTLTSDDLL 100 TLAADLEKLKSKVIRTERPLSSGVYMGNLSTQQLEQRRALLNMIGMVGGA 150 QGTQPGRDGVVRVWDVKNPDLLNNQFGTMPSLTLACLTKQGQVDLNDAVL 200 ALTDLGLIYTAKYPNSSDLDRLSQSHPILNMVDTKKSSLNISGYNFSLGA 250 AVKAGACMLDGGNMLETIKVTPQTMDGILKSILKVKKSLGMFVSDTPGER 300 NPYENILYKICLSGDGWPYIASRTSIVGRAWENTTVDLESDGKPQKVGTA 350 GSNKSLQSAGFPTGLTYSQLMTLKDSMMQLDPSAKTWIDIEGRPEDPVEI 400 ALYQPMSGCYIHFFREPTDLKQFKQDAKYSHGIDVADLFPAQPGLTSAVI 450 EALPRNMVLTCQGSDDIKRLLDSQGRRDIKLIDIALSKADSRRFENAVWD 500 QCKDLCHMHTGVVVEKKKRGGKEEITPHCALMDCIMYDAAVSGGLNIPVL 550 RAVLPRDMVFRTSSPKVVL* DNA-sequence (SEQ ID NO: 29): 1 ATGAGCGCCA GCAAAGAAGT GAAAAGCTTC CTCTGGACCC AGAGCCTGCG 51 GAGAGAGCTG TCTGGCTACT GCTCCAACAT CAAGCTCCAG GTGGTCAAGG 101 ACGCCCAGGC TCTGCTGCAT GGCCTGGATT TCAGCGAGGT GTCCAACGTG 151 CAGCGGCTGA TGAGAAAGCA GAAGCGGGAC GACAGCGACC TGAAGAGACT 201 GAGGGATCTG AACCAGGCCG TGAACAACCT GGTGGAACTG AAGTCTACCC 251 AGCAGAAATC CATCCTGAGA GTGGGCACCC TGACCAGCGA CGATCTGCTG 301 ACACTGGCCG CCGATCTGGA AAAGCTGAAG TCCAAAGTGA TCCGGACCGA 351 GAGGCCACTG TCTAGCGGAG TGTACATGGG CAACCTGAGC ACCCAGCAGC 401 TGGAACAGAG AAGGGCCCTG CTGAACATGA TCGGCATGGT TGGAGGCGCC 451 CAGGGAACAC AGCCTGGAAG AGATGGTGTC GTCAGAGTGT GGGACGTGAA 501 GAACCCCGAC CTGCTCAACA ACCAGTTCGG CACCATGCCT TCTCTGACCC 551 TGGCCTGCCT GACAAAGCAG GGCCAAGTGG ACCTGAACGA TGCCGTGCTG 601 GCTCTGACTG ATCTGGGCCT GATCTACACC GCCAAGTATC CCAACAGCTC 651 CGACCTGGAC AGGCTGAGCC AGTCTCACCC CATCCTGAAC ATGGTGGACA 701 CCAAGAAGTC CAGCCTGAAC ATCAGCGGCT ACAACTTCTC TCTGGGCGCT 751 GCCGTGAAAG CCGGCGCTTG TATGCTTGAC GGCGGCAACA TGCTGGAAAC 801 CATCAAAGTG ACCCCTCAGA CCATGGACGG CATCCTGAAA AGTATCCTGA 851 AAGTGAAGAA ATCCCTGGGC ATGTTCGTGT CCGACACACC CGGCGAGAGA 901 AACCCCTACG AGAACATCCT GTACAAGATT TGCCTGAGCG GCGACGGCTG 951 GCCCTATATC GCCAGCAGAA CATCTATCGT GGGCAGAGCT TGGGAGAACA 1001 CCACCGTGGA CCTGGAATCC GATGGCAAGC CTCAGAAAGT GGGCACAGCC 1051 GGCAGCAACA AGAGCCTCCA GTCTGCCGGA TTTCCTACCG 
GCCTGACATA 1101 CAGCCAGCTG ATGACCCTGA AGGACAGCAT GATGCAGCTG GACCCTAGCG 1151 CCAAGACCTG GATCGACATT GAGGGCAGAC CCGAGGATCC CGTGGAAATC 1201 GCTCTGTACC AGCCTATGAG CGGCTGCTAT ATCCACTTCT TCAGAGAGCC 1251 CACCGATCTG AAGCAGTTCA AGCAGGACGC CAAGTACAGC CACGGAATCG 1301 ACGTGGCCGA TCTGTTCCCA GCTCAGCCAG GACTGACATC CGCCGTGATT 1351 GAAGCCCTGC CTAGAAACAT GGTGCTGACC TGTCAGGGCA GCGACGACAT 1401 CAAGAGACTG CTGGACAGCC AGGGCAGAAG AGATATCAAG CTGATCGATA 1451 TCGCCCTGAG CAAGGCCGAC TCTCGGAGAT TCGAAAACGC CGTGTGGGAC 1501 CAGTGCAAGG ACCTGTGTCA CATGCACACA GGCGTGGTGG TGGAAAAGAA 1551 GAAGCGCGGA GGCAAAGAGG AAATCACCCC TCACTGCGCC CTGATGGACT 1601 GCATTATGTA TGACGCCGCC GTGTCTGGCG GCCTGAATAT CCCTGTTCTG 1651 AGAGCCGTGC TGCCCCGCGA CATGGTGTTT AGAACAAGCA GCCCCAAGGT 1701 GGTGCTCTGA
EXAMPLE 16
[0418] Lassa Virus Nucleoprotein
[0419] This example describes a Lassa virus nucleoprotein sequence ancestral to Sierra Leone Lassa isolates, produced using a method according to an embodiment of the invention.
[0420] Construct 7:
[0421] Lassa Virus Nucleoprotein Ancestral Sequence of Sierra Leone Isolates (L-NP-1=L-NP-CovAnc-2 SL)
TABLE-US-00023 Amino acid sequence (SEQ ID NO: 30): MSASKEIKSFLWTQSLRRELSGYCSNIKLQVVKDAQALLHGLDFSEVSNV 50 QRLMRKERRDDNDLKRLRDLNQAVNNLVELKSTQQKSILRVGTLTSDDLL 100 ILAADLEKLKSKVTRTERPLSAGVYMGNLSSQQLDQRRALLNMIGMSGGN 150 QGARAGRDGVVRVWDVKNAELLNNQFGTMPSLTLACLTKQGQVDLNDAVQ 200 ALTDLGLIYTAKYPNTSDLDRLTQSHPILNMIDTKKSSLNISGYNFSLGA 250 AVKAGACMLDGGNMLETIKVSPQTMDGILKSILKVKKALGMFISDTPGER 300 NPYENILYKICLSGDGWPYIASRTSITGRAWENTVVDLESDGKPQKAGSN 350 NSNKSLQSAGFTAGLTYSQLMTLKDAMLQLDPNAKTWMDIEGRPEDPVEI 400 ALYQPSSGCYIHFFREPTDLKQFKQDAKYSHGIDVTDLFAAQPGLTSAVI 450 DALPRNMVITCQGSDDIRKLLESQGRKDIKLIDIALSKTDSRKYENAVWD 500 QYKDLCHMHTGVVVEKKKRGGKEEITPHCALMDCIMFDAAVSGGLNTSVL 550 RAVLPRDMVFRTSTPRVVL* DNA-sequence (SEQ ID NO: 31): 1 ATGAGCGCCA GCAAAGAGAT CAAGAGCTTC CTGTGGACCC AGAGCCTGCG 51 GAGAGAGCTG TCTGGCTACT GCTCCAACAT CAAGCTCCAG GTGGTCAAGG 101 ACGCCCAGGC TCTGCTGCAT GGCCTGGATT TCAGCGAGGT GTCCAACGTG 151 CAGCGGCTGA TGCGGAAAGA GAGAAGGGAC GACAACGACC TGAAGCGGCT 201 GAGGGATCTG AACCAGGCCG TGAACAACCT GGTGGAACTG AAGTCTACCC 251 AGCAGAAATC CATCCTGAGA GTGGGCACCC TGACCAGCGA CGATCTGCTG 301 ATTCTGGCCG CCGACCTGGA AAAGCTGAAG TCCAAAGTGA CCCGGACCGA 351 GAGGCCACTG TCTGCTGGTG TCTACATGGG CAACCTGAGC AGCCAGCAGC 401 TGGATCAGAG AAGGGCCCTG CTGAACATGA TCGGCATGAG CGGCGGAAAT 451 CAGGGCGCTA GAGCTGGCAG AGATGGCGTC GTCAGAGTGT GGGACGTGAA 501 GAATGCCGAG CTGCTCAACA ACCAGTTCGG CACCATGCCT AGCCTGACAC 551 TGGCCTGCCT GACAAAGCAG GGCCAAGTGG ACCTGAACGA TGCTGTGCAG 601 GCCCTGACTG ATCTGGGCCT GATCTACACC GCCAAGTATC CCAACACCAG 651 CGACCTGGAC AGACTGACCC AGTCTCACCC CATCCTGAAT ATGATCGACA 701 CCAAGAAGTC CAGCCTGAAC ATCAGCGGCT ACAACTTCTC TCTGGGCGCT 751 GCCGTGAAAG CCGGCGCTTG TATGCTTGAC GGCGGCAACA TGCTGGAAAC 801 CATCAAGGTG TCCCCACAGA CCATGGACGG CATCCTGAAA AGTATCCTGA 851 AAGTGAAGAA AGCCCTGGGC ATGTTCATCA GCGACACCCC TGGCGAGAGA 901 AACCCCTACG AGAACATCCT GTACAAGATT TGCCTGAGCG GCGACGGCTG 951 GCCCTATATC GCCAGCAGAA CCAGCATTAC CGGCAGAGCT TGGGAGAACA 1001 CCGTGGTGGA TCTGGAAAGC GACGGCAAGC CTCAGAAGGC CGGCAGCAAC 1051 AACTCCAACA AGAGCCTCCA GTCCGCCGGC TTCACAGCCG 
GCCTGACATA 1101 TAGCCAGCTG ATGACCCTGA AGGACGCCAT GCTGCAACTG GACCCCAATG 1151 CCAAGACCTG GATGGACATC GAGGGCAGAC CTGAGGACCC TGTGGAAATC 1201 GCCCTGTACC AGCCTAGCTC CGGCTGCTAT ATCCACTTCT TCAGAGAGCC 1251 CACCGATCTG AAGCAGTTCA AGCAGGACGC CAAGTACAGC CACGGCATCG 1301 ACGTGACCGA TCTGTTTGCT GCTCAGCCCG GACTGACCTC CGCCGTGATT 1351 GATGCCCTGC CTCGGAACAT GGTCATCACC TGTCAGGGCA GCGACGACAT 1401 CCGGAAGCTG CTGGAATCTC AGGGCAGAAA GGATATCAAG CTGATCGATA 1451 TCGCCCTGAG CAAGACCGAC AGCCGGAAGT ACGAAAACGC CGTGTGGGAC 1501 CAGTACAAGG ACCTGTGCCA CATGCACACA GGCGTGGTGG TGGAAAAGAA 1551 GAAGCGCGGA GGCAAAGAGG AAATCACCCC TCACTGCGCT CTGATGGACT 1601 GCATCATGTT TGACGCCGCC GTGTCTGGCG GCCTGAATAC CTCTGTTCTG 1651 AGAGCCGTGC TGCCCAGAGA CATGGTGTTC AGAACAAGCA CCCCTAGAGT 1701 GGTGCTCTGA