Influenza A 2009 pandemic H1N1 polypeptide fragments comprising endonuclease activity and their use

09783794 · 2017-10-10

Abstract

The present invention relates to polypeptide fragments comprising an amino-terminal fragment of the PA subunit of a viral RNA-dependent RNA polymerase possessing endonuclease activity, wherein the PA subunit is from Influenza A 2009 pandemic H1N1 virus or is a variant thereof. This invention also relates to (i) crystals of the polypeptide fragments which are suitable for structure determination of the polypeptide fragments using X-ray crystallography and (ii) computational methods using the structural coordinates of the polypeptide to screen for and design compounds that modulate, preferably inhibit the endonucleolytically active site within the polypeptide fragment. In addition, this invention relates to methods of identifying compounds that bind to the PA polypeptide fragments possessing endonuclease activity and preferably inhibit the endonucleolytic activity, as well as the compounds themselves. Preferably, the compounds are identifiable by the methods disclosed herein or the pharmaceutical compositions are producible by the methods disclosed herein.

Claims

1. A polypeptide fragment comprising an N-terminal fragment of the PA subunit of a viral RNA-dependent RNA polymerase possessing endonuclease activity, wherein said PA subunit has at least 95% sequence identity to SEQ ID NO: 2 having a maximum length of 240 amino acids, wherein the N-terminus is identical to or corresponds to amino acid position 1 and the C-terminus is identical to or corresponds to an amino acid at a position selected from positions 190 to 198 of the amino acid sequence of the PA subunit according to SEQ ID NO: 2, or is a variant thereof, wherein said variant comprises the amino acid serine at an amino acid position 186 according to SEQ ID NO: 2 or at an amino acid position corresponding thereto.

2. The polypeptide fragment of claim 1, which is soluble and remains in the supernatant after centrifugation for 30 min at 100,000×g in an aqueous buffer under physiologically isotonic conditions.

3. The polypeptide fragment of claim 1, which is crystallizable.

4. The polypeptide fragment of claim 1, wherein the N-terminal fragment has at least amino acids 1 to 190 of the PA subunit of the RNA-dependent RNA polymerase of Influenza A 2009 pandemic H1N1 virus according to SEQ ID NO: 2.

5. The polypeptide fragment of claim 1, wherein said polypeptide fragment is purified to an extent to be suitable for crystallization and is at least 85% pure.

6. The polypeptide fragment of claim 1 to which two divalent cations are bound.

7. The polypeptide fragment of claim 6, wherein the divalent cation is manganese and/or magnesium.

8. The polypeptide fragment of claim 1, which (a) consists of amino acids 1 to 198 of the amino acid sequence set forth in SEQ ID NO: 2 and of an N-terminal linker having the amino acid sequence MGSGMA (SEQ ID NO: 3) and which has the structure defined by (i) the structure coordinates as shown in FIG. 1, (ii) the structure coordinates as shown in FIG. 2, (iii) the structure coordinates as shown in FIG. 3, (iv) the structure coordinates as shown in FIG. 4, or (v) the structure coordinates as shown in FIG. 5, or (b) consists of amino acids 1 to 198 of the amino acid sequence set forth in SEQ ID NO: 2 with amino acids 52 to 64 replaced by the amino acid glycine and of an N-terminal linker having the amino acid sequence MGSGMA (SEQ ID NO: 3) and which has the structure defined by (vi) the structure coordinates as shown in FIG. 15, or (vii) the structure coordinates as shown in FIG. 16.

9. The polypeptide fragment of claim 8, wherein the polypeptide fragment having the structure defined by (i) has a crystalline form with space group C2 and unit cell dimensions of a=26.36 nm±0.5 nm, b=6.62 nm±0.3 nm, c=6.63 nm±0.3 nm, α=90 deg, β=96±2 deg, γ=90 deg, (ii) to (v) has a crystalline form with space group P2₁2₁2₁ and unit cell dimensions of a=5.46±0.3 nm, b=12.25±0.4 nm, c=13.0±0.3 nm, α=90 deg, β=90 deg, γ=90 deg, (vi) has a crystalline form with space group P6₂22 and unit cell dimensions of a=7.50 nm±0.3 nm, b=7.50 nm±0.3 nm, c=12.00 nm±0.5 nm, α=90 deg, β=90 deg, γ=120 deg, or (vii) has a crystalline form with space group P6₄22 and unit cell dimensions of a=9.99 nm±0.5 nm, b=9.99 nm±0.5 nm, c=8.27 nm±0.3 nm, α=90 deg, β=90 deg, γ=120 deg.

10. The polypeptide fragment of claim 8, wherein the crystal diffracts X-rays to a resolution of 2.6 Å, 2.1 Å, 1.9 Å or higher.

11. The polypeptide fragment of claim 8, wherein the polypeptide fragment has a crystalline form, and the crystal diffracts X-rays to a resolution of 2.1 Å or higher.

12. The polypeptide fragment of claim 8, wherein the polypeptide fragment has a crystalline form, and the crystal diffracts X-rays to a resolution of 1.9 Å or higher.

13. A polypeptide consisting of the amino acid sequence according to SEQ ID NO:14.

14. The polypeptide of claim 13, wherein the polypeptide is in a crystal form selected from the group consisting of a crystal in space group P6₂22 with unit cell dimensions of a=75.0 ű3 Å, b=75.0 ű3 Å, c=120.0 ű5 Å, α=90 deg, β=90 deg, γ=120 deg, or a crystal in space group P6₄22 with unit cell dimensions of a=99.9 ű5 Å, b=99.9 ű5 Å, c=82.7 ű3 Å, α=90 deg, β=90 deg, γ=120 deg.

Description

BRIEF DESCRIPTION OF THE DRAWINGS

(1) FIGS. 1-5 Refined atomic structure coordinates for PA polypeptide fragment amino acids 1 to 198 according to amino acids 1 to 198 of the amino acid sequence set forth in SEQ ID NO: 2 (PA H1N1 1 to 198), with or without bound compounds. For each structure there are generally four molecules in the crystallographic asymmetric unit (ASU) denoted A, B, C and D. In FIGS. 1 to 5, however, only one selected molecule is shown (see below). The file header gives information about the structure refinement. “Atom” refers to the element whose coordinates are measured. The first letter in the column defines the element. The 3-letter code of the respective amino acid is given and the amino acid sequence position. The first 3 values in the line “Atom” define the atomic position of the element as measured. The fourth value corresponds to the occupancy and the fifth (last) value is the temperature factor (B factor). The occupancy factor refers to the fraction of the molecules in which each atom occupies the position specified by the coordinates. A value of “1” indicates that each atom has the same conformation, i.e., the same position, in all equivalent molecules of the crystal. B is a thermal factor that measures movement of the atom around its atomic center. This nomenclature corresponds to the Protein Data Bank (PDB) format.

(2) FIG. 1: (Sheets labelled FIG. 1A to FIG. 1AB) Structural co-ordinates of the native (i.e. without compound/ligand) H1N1 PA endonuclease domain in standard Protein Data Bank (PDB) format. Only the chain A (with associated divalent cations, i.e. one magnesium and one manganese ion, and water molecules) from the asymmetric unit is included. Chains B, C and D are very similar.

(3) FIG. 2: (Sheets labelled FIG. 2A to FIG. 2AC) Structural co-ordinates of the H1N1 PA endonuclease domain with bound 4-[3-[(4-chlorophenyl)methyl]-1-(phenylmethyl)-3-piperidinyl]-2-hydroxy-4-oxo-2-butenoic acid (EMBL-R05-3) in standard Protein Data Bank (PDB) format. Only the chain A (with associated divalent cations, i.e. two manganese ions, and water molecules) from the asymmetric unit is included. The compound EMBL-R05-3 has residue descriptor ci3.

(4) FIG. 3: (Sheets labelled FIG. 3A to FIG. 3AB) Structural co-ordinates of the H1N1 PA endonuclease domain with bound 4-[3-[(4-chlorophenyl)methyl]-1-(phenylmethyl)-3-piperidinyl]-2-hydroxy-4-oxo-2-butenoic acid (EMBL-R05-3) in standard Protein Data Bank (PDB) format. Only the chain D (with associated divalent cations, i.e. two manganese ions, and water molecules) from the asymmetric unit is included. The compound EMBL-R05-3 has a different configuration than in chain A (FIG. 2). Said compound has residue descriptor ci3.

(5) FIG. 4: (Sheets labelled FIG. 4A to FIG. 4AC) Structural co-ordinates of the H1N1 PA endonuclease domain with bound 4-[4-[(4-chlorophenyl)methyl]-1-(cyclohexylmethyl)-4-piperidinyl]-2-hydroxy-4-oxo-2-butenoic acid (EMBL-R05-2) in standard Protein Data Bank (PDB) format. Only the chain A (with associated divalent cations, i.e. two manganese ions, and water molecules) from the asymmetric unit is included. Chains B, C and D are very similar. The compound EMBL-R05-2 has residue descriptor cit.

(6) FIG. 5: (Sheets labelled FIG. 5A to FIG. 5AA) Structural co-ordinates of the H1N1 PA endonuclease domain with bound ribo-Uridine Monophosphate (rUMP) in standard Protein Data Bank (PDB) format. Only the chain A (with associated divalent cations, i.e. two manganese ions, and water molecules) from the asymmetric unit is included. Chains B, C and D are very similar. The compound rUMP has residue descriptor U.

(7) FIG. 6: Diagram using the structural co-ordinates of FIG. 1 illustrating the endonuclease active site (PA polypeptide fragment (chain A)), showing the divalent cations (one manganese and one magnesium) and key active site residues.

(8) FIG. 7: (Sheets labelled FIG. 7A to 7B) (A) Diagram using the structural co-ordinates of FIG. 2 illustrating the endonuclease active site (PA polypeptide fragment (chain A)), showing the bound compound EMBL-R05-3, the divalent cations (two manganese) and key active site residues that interact with the compound (notably Arg84) or are close to it and (B) Diagram comparing the co-ordinates of FIG. 1 (chain subjacent on left hand side) and FIG. 2 showing change in conformation of the loop in the vicinity of Tyr24 upon binding of EMBL-R05-3 (chain A). Tyr24 side-chain moves to partially stack with the chlorobenzene of EMBL-R05-3.

(9) FIG. 8: Diagram using the structural co-ordinates of FIG. 3 illustrating the endonuclease active site (PA polypeptide fragment (chain D)), showing the bound compound EMBL-R05-3, the divalent cations (two manganese) and key active site residues that interact with the compound (notably Tyr24 and Arg84) or are close to it.

(10) FIG. 9: Diagram using the structural co-ordinates of FIG. 4 illustrating the endonuclease active site (PA polypeptide fragment (chain A)), showing the bound compound EMBL-R05-2, the divalent cations (two manganese) and key active site residues that interact with the compound (notably Tyr24 and Phe105) or are close to it.

(11) FIG. 10: Diagram using the structural co-ordinates of FIG. 5 illustrating the endonuclease active site (PA polypeptide fragment (chain A)), showing the bound compound rUMP, the divalent cations (two manganese) and key active site residues that interact with the compound (notably Tyr24 and Lys34) or are close to it.

(12) FIGS. 11A, 11B, and 11C: 15% PAGE gels and gel filtration profile from a typical purification of H1N1 PA1-198 (SEQ ID NO:13). The arrow indicates faint traces of residual MBP.

(13) FIG. 12: Frozen crystal of H1N1 PA-Nter co-crystallised with rUMP in the P2₁2₁2₁ space-group.

(14) FIG. 13: Divalent ion co-ordination in the native endonuclease structure.

(15) FIG. 14: Electron density for rUMP and divalent cations (manganese, Mn1 and Mn2) in co-crystals with H1N1 PA-Nter. Refined 2Fo-Fc electron density contoured at 1.1 σ. Unbiased Fo-Fc electron density contoured at 2.8σ. Anomalous difference map contoured at 4σ.

(16) FIGS. 15-16: Refined atomic structure coordinates for PA polypeptide fragment amino acids 1 to 198 of the amino acid sequence set forth in SEQ ID NO: 2 with amino acids 52 to 64 replaced by the amino acid glycine (PA H1N1 1 to 198 Δ52-64: Gly (SEQ ID NO:14)) with bound compounds. For each structure there is one molecule in the crystallographic asymmetric unit (ASU). The file header gives information about the structure refinement. “Atom” refers to the element whose coordinates are measured. The first letter in the column defines the element. The 3-letter code of the respective amino acid is given and the amino acid sequence position. The first 3 values in the line “Atom” define the atomic position of the element as measured. The fourth value corresponds to the occupancy and the fifth (last) value is the temperature factor (B factor). The occupancy factor refers to the fraction of the molecules in which each atom occupies the position specified by the coordinates. A value of “1” indicates that each atom has the same conformation, i.e., the same position, in all equivalent molecules of the crystal. B is a thermal factor that measures movement of the atom around its atomic center. This nomenclature corresponds to the Protein Data Bank (PDB) format.

(17) FIG. 15: (Sheets labelled FIG. 15A to FIG. 15AB) Structural co-ordinates of the H1N1 PA endonuclease domain with bound 4-[3-[(4-chlorophenyl)methyl]-1-(phenylmethylsulpho)-3-piperidinyl]-2-hydroxy-4-oxo-2-butenoic acid (EMBL-R05-1) in standard Protein Data Bank (PDB) format. The chain A (with associated divalent cations, i.e. two manganese ions, and water molecules) from the asymmetric unit is included. The compound EMBL-R05-1 has residue descriptor ci1.

(18) FIG. 16: (Sheets labelled FIG. 16A to FIG. 16AA) Structural co-ordinates of the H1N1 PA endonuclease domain with bound epigallocatechin 3-gallate (EGCG) in standard Protein Data Bank (PDB) format. The chain A (with associated divalent cations, i.e. two manganese ions, and water molecules) from the asymmetric unit is included. The compound EGCG has residue descriptor tte.

(19) FIG. 17: Diagram using the structural co-ordinates of FIG. 15 illustrating the endonuclease active site (PA polypeptide fragment), showing the bound compound EMBL-R05-1, the divalent cations (two manganese) and key active site residues that interact with the compound (notably Tyr24 and Arg84) or are close to it.

(20) FIG. 18: (Sheets labelled FIG. 18A to FIG. 18B) (A) Diagram using the structural co-ordinates of FIG. 16 illustrating the endonuclease active site (PA polypeptide fragment), showing the bound compound EGCG, the divalent cations (two manganese ions) and key active site residues that interact with the compound or are close to it. (B) Bound EGCG, the divalent cations (two manganese ions) and key active site residues that interact with the compound or are close to it. Interactions less than 3.3 Å are shown as grey dotted lines and additional possible interactions less than 4 Å as dark grey dotted lines.

(21) FIG. 19: Superposition of all diketo inhibitor compounds (EMBL-R05-1, EMBL-R05-2, and EMBL-R05-3A and D) and EGCG bound in the PA active site. As shown in FIG. 19, the mode of binding of the three diketo inhibitors to the metals is conserved (although there is some variability in exact position) but the two ‘arms’ of each compound are inserted into different combinations of the pockets 1 to 4. EMBL-R05-1 has a similar configuration to EMBL-R05-3D, with the two arms occupying pockets 2 and 3. EMBL-R05-3A occupies pockets 2 and 4. EMBL-R05-2, which differs notably from R05-1 and R05-3 in the point of substitution on the piperidinyl ring (Table 2), occupies pocket 3 and uniquely pocket 1 (FIG. 7B). The green tea compound EGCG occupies pockets 3 and 4. More potent and specific compounds could perhaps be designed that occupy more than two of the pockets.
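
The column layout described above for the co-ordinate sheets (FIGS. 1-5 and 15-16) can be illustrated by a minimal parsing sketch. It assumes the standard fixed-width PDB ATOM/HETATM column positions; the names AtomRecord and parse_atom_line and the example co-ordinates are hypothetical and are not taken from any of the figures.

```python
from typing import NamedTuple


class AtomRecord(NamedTuple):
    name: str         # atom name, e.g. "OH"
    res_name: str     # 3-letter residue code, e.g. "TYR"
    chain: str        # chain identifier, e.g. "A"
    res_seq: int      # amino acid sequence position
    x: float          # the first three values: atomic position
    y: float
    z: float
    occupancy: float  # fourth value: fraction of molecules with this position
    b_factor: float   # fifth value: temperature factor (B factor)


def parse_atom_line(line: str) -> AtomRecord:
    """Parse one fixed-width ATOM/HETATM record of a PDB-format file."""
    if not line.startswith(("ATOM", "HETATM")):
        raise ValueError("not an ATOM/HETATM record")
    return AtomRecord(
        name=line[12:16].strip(),
        res_name=line[17:20].strip(),
        chain=line[21],
        res_seq=int(line[22:26]),
        x=float(line[30:38]),
        y=float(line[38:46]),
        z=float(line[46:54]),
        occupancy=float(line[54:60]),
        b_factor=float(line[60:66]),
    )


# Hypothetical record built field by field so that the column widths are explicit.
# The co-ordinates are invented for illustration only.
example = (
    "ATOM  "            # columns 1-6: record type
    + f"{187:>5}"       # columns 7-11: atom serial number
    + " "
    + " OH "            # columns 13-16: atom name
    + " "               # column 17: alternate location
    + "TYR"             # columns 18-20: residue name
    + " "
    + "A"               # column 22: chain
    + f"{24:>4}"        # columns 23-26: residue number
    + "    "            # columns 27-30: insertion code and padding
    + f"{12.345:8.3f}{-6.789:8.3f}{30.120:8.3f}"  # x, y, z
    + f"{1.00:6.2f}{25.40:6.2f}"                  # occupancy, B factor
)

atom = parse_atom_line(example)
print(atom)
```

An occupancy of 1.00 in such a record corresponds to the case described above in which the atom has the same position in all equivalent molecules of the crystal.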

EXAMPLES

(22) The Examples are designed in order to further illustrate the present invention and serve a better understanding. They are not to be construed as limiting the scope of the invention in any way.

(23) 1. Methods

(24) 1.1 Cloning, Expression and Purification of PA-Nter (PA H1N1 1 to 198 (SEQ ID NO:13)) and PA-Nter Mutant (PA H1N1 1 to 198 Δ52-64: Gly (SEQ ID NO:14)) from Influenza Strain A/California/04/2009-H1N1

(25) The DNA coding for PA-Nter (residues 1-198 SEQ ID NO:13) (see SEQ ID NO: 1 and 2) from influenza strain A/California/04/2009-H1N1 was synthesized (PA H1N1 1 to 198 (SEQ ID NO:13)) and sub-cloned in the expression vector pESPRIT002 (EMBL) by GeneArt, Regensburg, Germany. The sequence was designed to contain a MGSGMA (SEQ ID NO: 3) polypeptide linker between the tobacco etch virus (TEV) cleavage site and the N-terminus to obtain 100% cleavage by TEV protease.

(26) To further improve crystallisation properties, a deletion of part of the flexible loop (52-73) was engineered by site-directed mutagenesis. To this end, a PCR amplification of the whole vector containing the wild-type gene was performed using two primers flanking the mutation site, one of them phosphorylated, and TurboPfu polymerase (Stratagene). Subsequently, the template vector was digested with DpnI (New England Biolabs) and the mutated vector was re-ligated. In the PA-Nter mutant, the amino acid sequence encompassing amino acids 52-64 was replaced by a single glycine (PA H1N1 1 to 198 Δ52-64: Gly (SEQ ID NO:14)).
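
The effect of the Δ52-64: Gly replacement on the sequence numbering can be sketched as follows. This is an illustration only: the helper replace_segment is hypothetical, and the 198-residue placeholder string is not SEQ ID NO: 2 (it deliberately contains no glycine so that the inserted residue is easy to spot).

```python
def replace_segment(seq: str, start: int, end: int, replacement: str) -> str:
    """Replace residues start..end (1-based, inclusive) with `replacement`."""
    if not (1 <= start <= end <= len(seq)):
        raise ValueError("segment out of range")
    return seq[: start - 1] + replacement + seq[end:]


# Placeholder sequence of 198 residues, NOT the real PA sequence.
alphabet_without_gly = "ACDEFHIKLMNPQRSTVWY"
wild_type = "".join(alphabet_without_gly[i % 19] for i in range(198))

# Amino acids 52 to 64 (13 residues) replaced by a single glycine.
mutant = replace_segment(wild_type, 52, 64, "G")

print(len(wild_type), len(mutant))  # 198 residues become 198 - 13 + 1 = 186
```

Residues 1 to 51 keep their numbering in the variant, the glycine takes position 52, and former residue 65 follows directly after it.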

(27) The wild-type and mutant plasmids were transformed into E. coli BL21(DE3) (Stratagene) and the protein was expressed in LB medium overnight at 20° C. after induction at an OD of 0.8-1.0 with 0.2 mM isopropyl-β-thiogalactopyranoside (IPTG). The protein was purified by an immobilized metal affinity column (IMAC). A second IMAC step was performed after cleavage by the His-tagged TEV protease, followed by gel filtration on a Superdex 75 column (GE Healthcare). Finally, the protein was concentrated to 10-15 mg/ml. See FIG. 11.

(28) 1.2 Compounds

(29) Compounds used for co-crystallisation are given in Table 2. Compound rUMP was purchased from Sigma. Compounds EMBL-R05-1, EMBL-R05-2, and EMBL-R05-3 (first described in Tomassini et al., Antimicrob Agents Chemother 1994, 38:2827-2837) were custom synthesised by Shanghai ChemPartner. EGCG was purchased from Sigma (E4143).

(30) 1.3 Crystallization

(31) Initial sitting drop screening was carried out at 20° C. mixing 100 nL of protein solution (15 mg/mL) with 100 nL of reservoir solution using a Cartesian robot at the EMBL Grenoble crystallization platform. Around 600 conditions were screened. Subsequently, larger crystals were obtained at 20° C. by the hanging drop method mixing protein and reservoir solutions in a ratio of 1:1. The protein solution contained 10-15 mg/mL of PA-Nter in 20 mM HEPES pH 7.5, 150 mM NaCl, 2 mM MnCl₂, 2 mM MgCl₂. The refined reservoir compositions for native crystals and co-crystallization with different compounds/ligands are listed in Table 1. For co-crystallisations, compounds/ligands rUMP, EMBL-R05-3 and EGCG were added to the protein solution to final concentrations of 5 mM, 1.5 mM and 5 mM, respectively. Native crystals and those co-crystallized with EMBL-R05-3 and EGCG were flash frozen in liquid nitrogen in their reservoir solution with additional 25% glycerol as cryoprotectant. Co-crystals with rUMP were frozen in their reservoir solution with additional rUMP at 10 mM concentration and additional 20% glycerol as cryoprotectant. The structure with EMBL-R05-2 was obtained by soaking co-crystals of PA-Nter and rUMP for 2 h with reservoir solution containing 1.5 mM EMBL-R05-2 and no rUMP (the inhibitor displaces the rUMP) followed by cryoprotection in reservoir solution containing 20% glycerol and 1.5 mM EMBL-R05-2. FIG. 12 shows a typical co-crystal with rUMP. The structure with EMBL-R05-1 was obtained by soaking co-crystals of PA-Nter mutant and dTMP for 2 h with reservoir solution containing the inhibitor followed by cryoprotection in reservoir solution containing 20% glycerol and the inhibitor. See also Table 2 for the compounds/ligands.

(32) TABLE 1 Summary of crystallisation conditions and crystallographic parameters for various compounds

| Fragment/FIG.(s) | Compound/Ligand (final concentration) | Crystallisation reservoir condition | Space group and unit cell parameters (a, b, c in Å; α, β, γ in deg) | Resolution; Refinement R-factor/R-free |
|---|---|---|---|---|
| PA H1N1 1 to 198 fragment (SEQ ID NO: 13) (chain A), see FIGS. 1 and 6 | No compound (native) | 1.6M sodium formate; 0.1M HEPES pH 7; 5% Glycerol | C2; 4 Molecules/ASU; 263.630 66.240 66.32; 90.00 95.98 90.00 | 2.1 Å; 0.222/0.268 |
| PA H1N1 1 to 198 fragment (SEQ ID NO: 13) (chain A), see FIGS. 2 and 7; PA H1N1 fragment (chain D), see FIGS. 3 and 8 | EMBL-R05-3 (1.5 mM) | 2.0M ammonium sulphate; 0.1M BisTris pH 5.5 | P2₁2₁2₁; 4 Molecules/ASU; 54.57 122.54 129.78; 90.00 90.00 90.00 | 2.5 Å; 0.205/0.276 |
| PA H1N1 1 to 198 fragment (SEQ ID NO: 13) (chain A), see FIGS. 4 and 9 | EMBL-R05-2 (1.5 mM, soaked into crystals initially grown with rUMP) | 0.1M ammonium sulphate; 0.1M Bis-Tris pH 5.5; 25% (w/v) PEG 3350 | P2₁2₁2₁; 4 Molecules/ASU; 56.59 120.81 128.20; 90.00 90.00 90.00 | 2.07 Å; 0.205/0.259 |
| PA H1N1 1 to 198 fragment (chain A), see FIGS. 5 and 10 | rUMP (5.0 mM) | 0.1M ammonium sulphate; 0.1M Bis-Tris pH 5.5; 25% (w/v) PEG 3350 | P2₁2₁2₁; 4 Molecules/ASU; 54.94 120.11 128.05; 90.00 90.00 90.00 | 2.05 Å; 0.206/0.246 |
| PA H1N1 1-198 Δ52-64: Gly Fragment (SEQ ID NO: 14) (chain A), see FIGS. 15 and 17 | EMBL-R05-1 (1.5 mM, soaked into crystals initially grown with dTMP) | 25-30% PEG4K; 0.1M Tris pH 8.5; 0.2M NaCl | P6₂22 (180); 1 molecule/ASU; a = b = 75.06, c = 120.05 Å | 1.9 Å; 0.201/0.251 |
| PA H1N1 1-198 Δ52-64: Gly Fragment (SEQ ID NO: 14) (chain A), see FIGS. 16 and 18 | EGCG (5 mM) | 10% PEG 3350; 0.1M NaCl; 0.1M HEPES pH 7.0 | P6₄22 (181); 1 molecule/ASU; a = b = 99.9, c = 82.7 Å | 2.65 Å; 0.250/0.304 |
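
As a rough cross-check of the cell parameters in Table 1 against the claimed unit cell dimensions, the following sketch computes unit cell volumes with the general triclinic volume formula and converts cell edges from Å to nm. The function names and dictionary keys are hypothetical; only the numerical cell parameters are taken from Table 1.

```python
import math


def cell_volume(a, b, c, alpha=90.0, beta=90.0, gamma=90.0):
    """Unit cell volume (in the cubed length unit of a, b, c); angles in degrees."""
    ca, cb, cg = (math.cos(math.radians(angle)) for angle in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1.0 - ca * ca - cb * cb - cg * cg + 2.0 * ca * cb * cg)


def angstrom_to_nm(length):
    """Convert a length in Å to nm (1 nm = 10 Å)."""
    return length / 10.0


# Cell parameters copied from Table 1 (a, b, c in Å; alpha, beta, gamma in degrees).
CELLS = {
    "native, C2": (263.630, 66.240, 66.32, 90.00, 95.98, 90.00),
    "EMBL-R05-3, P2(1)2(1)2(1)": (54.57, 122.54, 129.78, 90.0, 90.0, 90.0),
    "EMBL-R05-1, P6(2)22": (75.06, 75.06, 120.05, 90.0, 90.0, 120.0),
    "EGCG, P6(4)22": (99.9, 99.9, 82.7, 90.0, 90.0, 120.0),
}

volumes = {name: cell_volume(*cell) for name, cell in CELLS.items()}

for name, volume in volumes.items():
    a_nm = angstrom_to_nm(CELLS[name][0])
    print(f"{name}: a = {a_nm:.2f} nm, V = {volume:.0f} cubic Å")
```

For the hexagonal forms the general formula reduces to a·a·c·sin(120°), and for the orthorhombic form to a·b·c.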

(33) TABLE 2 Compounds used

| Compound | CA Index Name | Formula | Chemical structure |
|---|---|---|---|
| EMBL-R05-3 (Tomassini et al. Antimicrob Agents Chemother 1994, 38:2827-2837) | 4-[3-[(4-chlorophenyl)methyl]-1-(phenylmethyl)-3-piperidinyl]-2-hydroxy-4-oxo-2-butenoic acid | C23 H24 Cl N O4 | (embedded image) |
| EMBL-R05-2 (Tomassini et al. Antimicrob Agents Chemother 1994, 38:2827-2837) | 4-[4-[(4-chlorophenyl)methyl]-1-(cyclohexylmethyl)-4-piperidinyl]-2-hydroxy-4-oxo-2-butenoic acid | C23 H30 Cl N O4 | (embedded image) |
| rUMP (ribo-uridine monophosphate) | 5'-Uridylic acid | C9 H13 N2 O9 P | (embedded image) |
| EMBL-R05-1 (Tomassini et al. Antimicrob Agents Chemother 1994, 38:2827-2837) | 4-[3-[(4-chlorophenyl)methyl]-1-(phenylmethylsulpho)-3-piperidinyl]-2-hydroxy-4-oxo-2-butenoic acid | C23 H24 Cl N S O6 | (embedded image) |
| (−)-Epigallocatechin gallate (EGCG) | [(2R,3R)-5,7-dihydroxy-2-(3,4,5-trihydroxyphenyl)chroman-3-yl] 3,4,5-trihydroxybenzoate | C22 H18 O11 | (embedded image) |

1.4 Crystal Structure Determination

(34) Diffraction data were collected on various beamlines at the European Synchrotron Radiation Facility. Data sets were integrated with XDS and scaled with XSCALE. Subsequent data analysis was performed with the CCP4i programme suite. The initial H1N1 structure was solved by molecular replacement with PHASER using the previously determined H3N2 PA N-ter structure (Dias et al., Nature 2009). Subsequent co-crystal structures were determined with PHASER using the H1N1 structure. Refinement was carried out with REFMAC and model building with COOT or O. In the C2 and P2₁2₁2₁ crystal forms there are four molecules per asymmetric unit. However, because of structural variations between the molecules arising from plasticity (in particular the 53-73 region) and because of the generally good resolution, NCS restraints were not applied.
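
The R-factor and R-free values reported in Table 1 measure the agreement between observed and calculated structure factor amplitudes. The following minimal sketch uses the conventional definition R = Σ| |Fobs| − |Fcalc| | / Σ|Fobs|, evaluated separately for the reflections used in refinement and for a held-out "free" set. It is an illustration under simplifying assumptions (no scale factor, invented amplitudes, hypothetical function names) and does not reproduce what the refinement program computes internally.

```python
def r_factor(f_obs, f_calc):
    """Conventional crystallographic R-factor, without any scale factor."""
    if len(f_obs) != len(f_calc) or len(f_obs) == 0:
        raise ValueError("amplitude lists must be non-empty and of equal length")
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


def work_and_free_r(f_obs, f_calc, free_flags):
    """Return (R-work, R-free); flagged reflections form the free set."""
    work = [(o, c) for o, c, free in zip(f_obs, f_calc, free_flags) if not free]
    held_out = [(o, c) for o, c, free in zip(f_obs, f_calc, free_flags) if free]
    r_work = r_factor([o for o, _ in work], [c for _, c in work])
    r_free = r_factor([o for o, _ in held_out], [c for _, c in held_out])
    return r_work, r_free


# Invented amplitudes for illustration only.
f_obs = [100.0, 50.0, 25.0, 25.0]
f_calc = [90.0, 55.0, 25.0, 30.0]
free_flags = [False, False, False, True]

r_work, r_free = work_and_free_r(f_obs, f_calc, free_flags)
print(f"R-work = {r_work:.3f}, R-free = {r_free:.3f}")
```

Because the free reflections are excluded from refinement, R-free is normally somewhat higher than the R-factor, as is the case for every entry in Table 1.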

(35) 2. Results

(36) 2.1 PA H1N1 Polypeptide Fragment or PA H1N1 Polypeptide Fragment Variant Generation and Crystallisation

(37) The inventors of the present invention found that a polypeptide fragment comprising amino acids 1 to 198 of SEQ ID NO: 2 of influenza strain A/California/04/2009-H1N1 (2009 pandemic strain) (PA H1N1 1 to 198) readily crystallised with and without relevant compounds/ligands (see FIGS. 1 to 10). Crystallisation properties could further be improved with a polypeptide fragment variant comprising amino acids 1 to 198 of SEQ ID NO: 2 of influenza strain A/California/04/2009-H1N1 (2009 pandemic strain) with amino acids 52 to 64 replaced by the amino acid glycine (PA H1N1 1 to 198 Δ52-64: Gly) (see FIGS. 15 to 18).

(38) Thus, in contrast to the numerous unsuccessful attempts undertaken in the prior art, the inventors of the present invention were able to obtain structures of PA H1N1 polypeptide fragments or PA H1N1 polypeptide fragment variants with and without compounds/ligands.

(39) The structures of PA H1N1 polypeptide fragments or PA H1N1 polypeptide fragment variants with and without compounds/ligands are described below. All these structures show in detail how these compounds/ligands bind directly to the metal ions as well as interacting with a number of residues in the active site. Furthermore, several of the interacting residues change conformation upon ligand binding, information which was unavailable before. This three-dimensional knowledge of the ligand-interacting residues and the regions of plasticity of the active site is critical for the optimised design of modifications to existing inhibitors to improve their potency or for structure-based design and optimisation of novel inhibitors that effectively block endonuclease activity.

(40) 2.2 H1N1 PA Native Structure

(41) Structural co-ordinates of the native (i.e. without compound/ligand) H1N1 PA endonuclease domain (PA H1N1 1 to 198(SEQ ID NO:13)) (chain A) in standard Protein Data Bank (PDB) format are shown in FIG. 1. Only the chain A (with associated divalent cations, i.e. one magnesium and one manganese ion, and water molecules) from the asymmetric unit is included. Chains B, C and D are very similar. A diagram using the structural co-ordinates of FIG. 1 illustrating the endonuclease active site (PA polypeptide fragment (chain A)) is shown in FIG. 6. It shows the divalent cations (one manganese and one magnesium) and key active site residues.

(42) 2.3 EMBL-R05-3-Bound Structure

(43) Structural co-ordinates of the H1N1 PA endonuclease domain (PA H1N1 1 to 198(SEQ ID NO:13)) (chain A) with bound 4-[3-[(4-chlorophenyl)methyl]-1-(phenylmethyl)-3-piperidinyl]-2-hydroxy-4-oxo-2-butenoic acid (EMBL-R05-3) in standard Protein Data Bank (PDB) format are shown in FIG. 2. Only the chain A (with associated divalent cations, i.e. two manganese ions, and water molecules) from the asymmetric unit is included. The compound EMBL-R05-3 has residue descriptor ci3. A diagram using the structural co-ordinates of FIG. 2 illustrating the endonuclease active site (PA polypeptide fragment (chain A)) is shown in FIG. 7A. It shows the bound compound EMBL-R05-3, the divalent cations (two manganese) and key active site residues that interact with the compound (notably Arg84) or are close to it.

(44) A diagram comparing the co-ordinates of FIG. 1 (chain subjacent on left hand side) and FIG. 2 showing change in conformation of the loop in the vicinity of Tyr24 upon binding of EMBL-R05-3 (chain A) is shown in FIG. 7B. The loop around Tyr24 is poorly ordered in the native structure. Tyr24 side-chain moves to partially stack with the chlorobenzene of EMBL-R05-3. This indicates a plasticity of the active site and an induced fit mode of ligand binding. An important conclusion for designing more potent inhibitors is to ensure that the extensions (‘arms’) to any ion-binding scaffold optimise interactions in one or more pocket(s). Imperfect matching will lead to residual flexibility and sub-optimal potency, as seems to be the case for the current compounds, none of which exhibit very well defined, full occupancy binding modes.

(45) Further, structural co-ordinates of the H1N1 PA endonuclease domain (PA H1N1 1 to 198(SEQ ID NO:13)) (chain D) with bound 4-[3-[(4-chlorophenyl)methyl]-1-(phenylmethyl)-3-piperidinyl]-2-hydroxy-4-oxo-2-butenoic acid (EMBL-R05-3) in standard Protein Data Bank (PDB) format are shown in FIG. 3. Only the chain D (with associated divalent cations, i.e. two manganese ions, and water molecules) from the asymmetric unit is included. The compound EMBL-R05-3 has a different configuration than in chain A (FIG. 2). Said compound has residue descriptor ci3. A diagram using the structural co-ordinates of FIG. 3 illustrating the endonuclease active site (PA polypeptide fragment (chain D)) is shown in FIG. 8. It shows the bound compound EMBL-R05-3, the divalent cations (two manganese) and key active site residues that interact with the compound (notably Tyr24 and Arg84) or are close to it.

(46) 2.4 EMBL-R05-2-Bound Structure

(47) Structural co-ordinates of the H1N1 PA endonuclease domain (PA H1N1 1 to 198 (SEQ ID NO:13)) (chain A) with bound 4-[4-[(4-chlorophenyl)methyl]-1-(cyclohexylmethyl)-4-piperidinyl]-2-hydroxy-4-oxo-2-butenoic acid (EMBL-R05-2) in standard Protein Data Bank (PDB) format are shown in FIG. 4. Only the chain A (with associated divalent cations, i.e. two manganese ions, and water molecules) from the asymmetric unit is included. Chains B, C and D are very similar. The compound EMBL-R05-2 has residue descriptor cit. A diagram using the structural co-ordinates of FIG. 4 illustrating the endonuclease active site (PA polypeptide fragment (chain A)) is shown in FIG. 9. It shows the bound compound EMBL-R05-2, the divalent cations (two manganese) and key active site residues that interact with the compound (notably Tyr24 and Phe105) or are close to it.

(48) 2.5 Ribo-Uridine Monophosphate (rUMP)-Bound Structure

(49) Structural co-ordinates of the H1N1 PA endonuclease domain (PA H1N1 1 to 198 (SEQ ID NO:13)) (chain A) with bound ribo-Uridine Monophosphate (rUMP) in standard Protein Data Bank (PDB) format are shown in FIG. 5. Only the chain A (with associated divalent cations, i.e. two manganese ions, and water molecules) from the asymmetric unit is included. Chains B, C and D are very similar. The compound rUMP has residue descriptor U.

(50) Co-crystallisation trials with rUMP gave large, well-ordered crystals in a new orthorhombic space-group (FIGS. 12 and 14).

(51) A diagram using the structural co-ordinates of FIG. 5 illustrating the endonuclease active site (PA polypeptide fragment (chain A)) is shown in FIG. 10. It shows the bound compound rUMP, the divalent cations (two manganese) and key active site residues that interact with the compound (notably Tyr24 and Lys34) or are close to it. The rUMP binds with two oxygens of the phosphate completing the co-ordination sphere of Mn1, one of them also co-ordinating Mn2. The base is well stacked on Tyr24, and Lys34 makes a hydrogen bond to the O2 position. The ribose hydroxyl groups do not make hydrogen bonds to the protein, consistent with the fact that deoxyribose binds equally well and the protein is a DNAase as much as an RNAase (Dias et al., Nature 2009).

(52) The conformation observed for rUMP is quite different from that previously published (PDB entry 3hw3; Zhao et al., 2009). The latter structure was obtained by soaking nucleotides into existing crystals of the endonuclease in the absence of manganese and the electron density is very poor. In this structure, a water molecule replaces Mn1 and a magnesium ion replaces Mn2. This difference in metal ligation is reflected in the altered conformation of Glu119. The ribose and base positions are quite different from the positions in the structure provided herein and are unable to interact with Lys34 or Tyr24. The differences between the two structures may reflect firstly the lack of manganese and secondly the fact that soaking pre-grown crystals does not allow the active site to adapt to the ligand, as is more likely the case for co-crystallisation.

(53) 2.6 EMBL-R05-1-Bound Structure

(54) Structural co-ordinates of the H1N1 PA endonuclease domain (PA H1N1 1 to 198 Δ52-64: Gly (SEQ ID NO:14)) with bound 4-[3-[(4-chlorophenyl)methyl]-1-(phenylmethylsulpho)-3-piperidinyl]-2-hydroxy-4-oxo-2-butenoic acid (EMBL-R05-1) in standard Protein Data Bank (PDB) format are shown in FIG. 15. The chain (with associated divalent cations, i.e. two manganese ions, and water molecules) from the asymmetric unit is included. The compound EMBL-R05-1 has residue descriptor ci1. A diagram using the structural co-ordinates of FIG. 15 illustrating the endonuclease active site (PA polypeptide fragment) is shown in FIG. 17. It shows the bound compound EMBL-R05-1, the divalent cations (two manganese) and key active site residues that interact with the compound (notably Tyr24 and Arg84) or are close to it.

(55) 2.7 EGCG-Bound Structure

(56) Epigallocatechin 3-gallate (EGCG) is the ester of epigallocatechin and gallic acid and is the most abundant catechin in green tea. It has recently been reported that EGCG inhibits the influenza endonuclease (Kuzuhara et al., 2009). Co-crystallisation of PA H1N1 1 to 198 Δ52-64: Gly (SEQ ID NO:14) with (−)-Epigallocatechin gallate gave a new crystal form (Table 1) diffracting to 2.65 Å resolution. The compound was clearly observed in the active site. Strong extra density also exists around a 2-fold crystallographic axis and most likely represents other EGCG molecules trapped by crystal packing.

(57) Structural co-ordinates of the H1N1 PA endonuclease domain (PA H1N1 1 to 198 Δ52-64: Gly (SEQ ID NO:14)) with bound epigallocatechin 3-gallate (EGCG) in standard Protein Data Bank (PDB) format are shown in FIG. 16. The chain (with associated divalent cations, i.e. two manganese ions, and water molecules) from the asymmetric unit is included. The compound EGCG has residue descriptor tte. A diagram using the structural co-ordinates of FIG. 16 illustrating the endonuclease active site (PA polypeptide fragment) is shown in FIG. 18A. It shows the bound compound EGCG, the divalent cations (two manganese ions) and key active site residues that interact with the compound or are close to it. FIG. 18B shows the bound EGCG, the divalent cations (two manganese ions) and key active site residues that interact with the compound or are close to it. Interactions less than 3.3 Å (grey dotted lines) and additional possible interactions less than 4 Å (dark grey dotted lines) are shown.
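
The two distance thresholds used for FIG. 18B (less than 3.3 Å and less than 4 Å) can be applied to any pair of atomic positions taken from the structure co-ordinates. The sketch below is illustrative only: the function name classify_contact and the example co-ordinates are hypothetical and do not correspond to atoms in FIG. 16.

```python
import math


def classify_contact(atom_a, atom_b, close_cutoff=3.3, possible_cutoff=4.0):
    """Classify a pair of (x, y, z) positions in Å by their separation."""
    distance = math.dist(atom_a, atom_b)
    if distance < close_cutoff:
        return "interaction"
    if distance < possible_cutoff:
        return "possible interaction"
    return "no interaction"


# Invented positions (Å) for a ligand atom and three protein atoms.
ligand_atom = (0.0, 0.0, 0.0)
protein_atoms = {
    "atom 1": (3.0, 0.0, 0.0),  # 3.0 Å away
    "atom 2": (0.0, 0.0, 3.5),  # 3.5 Å away
    "atom 3": (0.0, 5.0, 0.0),  # 5.0 Å away
}

contacts = {
    label: classify_contact(ligand_atom, position)
    for label, position in protein_atoms.items()
}
print(contacts)
```

A purely geometric cut-off of this kind ignores atom types and angles, so it only flags candidate interactions of the sort drawn as dotted lines in the figure.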

(58) 2.8 Superposition of all Diketo Inhibitor Compounds

(59) A superposition of all diketo inhibitor compounds (EMBL-R05-1, EMBL-R05-2, and EMBL-R05-3A and D) and EGCG bound in the PA active site is shown in FIG. 19. As shown in FIG. 19, the mode of binding of the three diketo inhibitors to the metals is conserved (although there is some variability in exact position) but the two ‘arms’ of each compound are inserted into different combinations of the pockets 1 to 4. EMBL-R05-1 has a similar configuration to EMBL-R05-3D, with the two arms occupying pockets 2 and 3. EMBL-R05-3A occupies pockets 2 and 4. EMBL-R05-2, which differs notably from R05-1 and R05-3 in the point of substitution on the piperidinyl ring (Table 2), occupies pocket 3 and uniquely pocket 1 (FIG. 7B). The green tea compound EGCG occupies pockets 3 and 4. More potent and specific compounds could perhaps be designed that occupy more than two of the pockets.
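
The pocket occupancies described above can be tabulated to show which pocket combinations have been observed and which have not. The sketch below encodes only the assignments stated in the preceding paragraph; the variable names are hypothetical.

```python
from itertools import combinations

# Pockets occupied by the two 'arms' of each bound compound, as described above.
POCKETS = {
    "EMBL-R05-1": {2, 3},
    "EMBL-R05-3 (chain D)": {2, 3},
    "EMBL-R05-3 (chain A)": {2, 4},
    "EMBL-R05-2": {1, 3},
    "EGCG": {3, 4},
}

# Which compounds reach each of the pockets 1 to 4.
occupants = {
    pocket: sorted(name for name, used in POCKETS.items() if pocket in used)
    for pocket in (1, 2, 3, 4)
}

# Pairs of pockets observed together, and pairs not yet observed.
observed_pairs = {frozenset(used) for used in POCKETS.values()}
all_pairs = {frozenset(pair) for pair in combinations((1, 2, 3, 4), 2)}
unobserved_pairs = sorted(sorted(pair) for pair in all_pairs - observed_pairs)

max_pockets = max(len(used) for used in POCKETS.values())

print(occupants)
print("unobserved pocket pairs:", unobserved_pairs)
print("largest number of pockets filled by one compound:", max_pockets)
```

In this tabulation no compound fills more than two pockets, which is the basis for the suggestion that compounds occupying more than two pockets might be more potent and specific.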

(60) 2.9 Divalent Cation Binding

(61) In the native structure (FIG. 1 and FIG. 6), two divalent cations are observed in the active site. These are identified (using magnitude of electron density and anomalous scattering, data not shown) to be a manganese atom in site 1 (Mn1 in FIG. 13), ligated by His41, Asp108, Glu119, Ile120 and two water molecules (W4 and W5) and a magnesium ion in site 2 (Mg2 in FIG. 13), ligated by Glu80 and Asp108 and four water molecules (W1, W2, W3 and W4). In the other structures (FIGS. 2-5 and FIGS. 7-10 as well as FIGS. 15-18) the ions in both site 1 and site 2 are manganese ions (Mn1 and Mn2), as identified by magnitude of electron density and anomalous scattering (e.g. for rUMP, see FIG. 14).
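
The ligation pattern of the two metal sites in the native structure can be summarised with simple sets. The sketch below records only the ligands named in the preceding paragraph and counts each named residue or water molecule once; whether a given residue contributes one or more co-ordinating atoms is not stated here and is not modelled.

```python
# Ligands of each metal site in the native structure, as listed above.
SITE_1 = {"His41", "Asp108", "Glu119", "Ile120", "W4", "W5"}  # Mn1
SITE_2 = {"Glu80", "Asp108", "W1", "W2", "W3", "W4"}          # Mg2

# Ligands shared by both sites bridge the two metal ions.
bridging = SITE_1 & SITE_2
only_site_1 = SITE_1 - SITE_2
only_site_2 = SITE_2 - SITE_1

print("ligands at site 1:", len(SITE_1))
print("ligands at site 2:", len(SITE_2))
print("bridging ligands:", sorted(bridging))
```

Each site has six named ligands, and Asp108 and water W4 are shared between the two sites.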