A FUSION PROTEIN
20230365656 · 2023-11-16
Inventors
- Mohammadmehdi MOBLI (Upper Mount Gravatt, Queensland, AU)
- Xinying JIA (Wishart, Queensland, AU)
- Yanni Ka-Yan CHIN (Sunnybank Hills, Queensland, AU)
- Alan Harry ZHANG (Canberra, Australian Capital Territory, AU)
- Theo Gene CRAWFORD (Toowong, Queensland, AU)
Cpc classification
C12Y304/22034
CHEMISTRY; METALLURGY
C12N9/50
CHEMISTRY; METALLURGY
International classification
Abstract
This invention relates to a method of producing a circularised form of a target protein from a fusion protein and to a fusion protein capable of producing a circularised form of a target protein. The fusion protein can comprise: the target protein; at least one circularisation site adjacent the target protein; and an enzyme domain capable of interacting with the at least one circularisation site and circularising the target protein. The target protein can be a membrane scaffold protein or a cyclotide such as SFTI, Vc1.1, Kalata B1 or MCOTI-II.
Claims
1. (canceled)
2. A fusion protein capable of producing a circularised form of a target protein, the fusion protein comprising: the target protein; at least one circularisation site adjacent the target protein; at least one inhibitory domain; and an enzyme domain capable of interacting with the at least one circularisation site and circularising the target protein.
3. The fusion protein of claim 2, wherein the target protein is circularised by way of a unimolecular reaction.
4. The fusion protein of claim 2, wherein the fusion protein integrates protein expression, purification and circularisation into one single molecule.
5. (canceled)
6. The fusion protein of claim 2, wherein the enzyme domain comprises at least one ligase or cyclase, or an enzymatically active fragment, variant or derivative thereof.
7. The fusion protein of claim 6, wherein the at least one circularisation site comprises first and second circularisation sites adjacent respective terminal ends of the target protein.
8. The fusion protein of claim 6, wherein the enzyme domain is a sortase or an asparaginyl endopeptidase (AEP), or a combination thereof, or an enzymatically active fragment, variant or derivative thereof.
9. The fusion protein of claim 6, further comprising at least one spacer, wherein preferably the at least one spacer is situated between the target protein and the enzyme domain.
10. The fusion protein of claim 9, wherein the at least one inhibitory domain is adjacent one or more of the at least one circularisation site, and optionally the at least one inhibitory domain is positioned at, near, adjacent or towards an N- or C-terminus of the fusion protein.
11. The fusion protein of claim 2, wherein the circularised form of the target protein is capable of binding to a target of interest, such as a therapeutic target or pesticide target, wherein preferably the target of interest is a biomacromolecule, such as a protein, a peptide, a nucleic acid, a polycarbohydrate, or a small molecule such as an organic compound or an organometallic complex, or any other molecule that contributes to a disease or is a target of a pesticide.
12. The fusion protein of claim 2, wherein the target protein is or comprises a membrane scaffold protein (MSP), or a fragment, variant or derivative thereof, wherein preferably a circularised MSP or fragment, variant or derivative thereof is capable of being used in the production of a nanodisc.
13. The fusion protein of claim 2, wherein the target protein is or comprises a cyclotide, or a fragment, variant or derivative thereof, wherein preferably the cyclotide is SFTI, Vc1.1, Kalata B1 or MCOTI-II, or an orthologue, fragment, variant or derivative thereof.
14-16. (canceled)
17. An isolated nucleic acid encoding the fusion protein of claim 2; a genetic construct comprising said nucleic acid; or a host cell comprising said nucleic acid and/or said genetic construct.
18. (canceled)
19. A method for circularising a target protein, said method including the steps of: (a) providing a fusion protein comprising: a target protein; at least one circularisation site adjacent the target protein; at least one inhibitory domain; an enzyme domain capable of circularising the target protein; and optionally a spacer positioned between the target protein and the enzyme domain; and (b) facilitating interaction of the enzyme domain with the at least one circularisation site to thereby circularise the amino acid sequence of the target protein.
20-21. (canceled)
22. The method of claim 19, comprising the step of modulating activity of the enzyme domain by introducing at least one mutation into the enzyme domain, wherein modulated activity results in: i) increased or reduced catalytic activity; ii) increased or reduced binding to the at least one circularisation site; and/or iii) an altered circularisation site.
23. The method of claim 19, wherein the target protein of the fusion protein is circularised by way of an enzyme domain released from a like said fusion protein.
24. The method of claim 19: (i) further including an initial step of producing the fusion protein; (ii) wherein the step of facilitating interaction of the enzyme domain with the at least one circularisation site comprises the step of activating the enzyme domain; (iii) wherein the step of facilitating interaction of the enzyme domain with the at least one circularisation site comprises the step of removing the inhibitory domain adjacent one or more of the at least one circularisation site; (iv) further including a subsequent step of removing the spacer from the enzyme domain; (v) further including a subsequent step of isolating or purifying the enzyme domain removed from the fusion protein by way of an affinity tag positioned at or towards an N- or C-terminus of the fusion protein and adjacent the enzyme domain.
25. (canceled)
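By way of illustration only, the circularisation recited in claims 2 and 19 can be sketched in Python for the sortase-based constructs of the sequence listing. The sketch assumes, as is generally understood but not stated in this section, that TEV protease cleaves after the Q of ENLYFQ to expose an N-terminal glycine, and that sortase A cleaves between the T and G of an LPXTG site and joins the threonine to that glycine. The function name circularised_product and the truncation of the input sequences are assumptions made for the example; it is not presented as the applicants' procedure.

```python
import re

# Assumed cleavage rules (standard enzymology, not spelled out in this section).
TEV_SITE = re.compile(r"ENLYFQ(?=[GS])")  # TEV protease cuts after the Q
SORTASE_SITE = re.compile(r"LP.T(?=G)")   # sortase A cuts between T and G of LPXTG


def circularised_product(fusion: str) -> str:
    """Return the residues that end up joined head-to-tail in the cyclic product."""
    tev = TEV_SITE.search(fusion)
    if tev is None:
        raise ValueError("no TEV site (ENLYFQ followed by G or S) found")
    exposed = fusion[tev.end():]  # residues N-terminal to the TEV cut are removed
    if not exposed.startswith("G"):
        raise ValueError("sortase ligation needs an N-terminal glycine")
    site = SORTASE_SITE.search(exposed)
    if site is None:
        raise ValueError("no LPXTG circularisation site found")
    # The threonine of LPXT is joined to the exposed glycine; everything after
    # the cut (linker, enzyme domain, His tag) is assumed to leave as one fragment.
    return exposed[:site.end()]


# N-terminal portions of SEQ ID NOs: 14, 19 and 20, truncated after the LVPRS site.
PRECURSORS = {
    "G-SFTI": "MASSENLYFQGRCTKSIPPICFPDLPGTGAAALEGTLVPRSQ",
    "G-kB1": "MASSENLYFQGCGETCVGGTCNTPGCTCSWPVCTRNGLPVTGAAALEGTLVPRSQ",
    "GG-Vc1.1": "MASSENLYFQGGCCSDPRCNYDHPEICGLPGTGAAALEGTLVPRSQ",
}

# SEQ ID NOs: 40, 41 and 42 as given in the sequence listing.
EXPECTED = {
    "G-SFTI": "GRCTKSIPPICFPDLPGT",
    "G-kB1": "GCGETCVGGTCNTPGCTCSWPVCTRNGLPVT",
    "GG-Vc1.1": "GGCCSDPRCNYDHPEICGLPGT",
}

for name, precursor in PRECURSORS.items():
    product = circularised_product(precursor)
    print(name, product, product == EXPECTED[name])
```

Under these assumptions the predicted products coincide with SEQ ID NOs: 40 to 42. The returned string is a linear rendering of a cyclic peptide, so the choice of first residue is arbitrary.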
Description
BRIEF DESCRIPTION OF THE FIGURES
BRIEF DESCRIPTION OF THE SEQUENCES
SEQ ID NO: | Description | Sequence
1 | His.sub.6-TEV-MSP9-eSrtA amino acid sequence, shown in FIG. 7. | MGSSHHHHHHENLYFQGSTFSKLREQLGPVTQEFWDNLE KETEGLRQEMSKDLEEVKAKVQPYLDDFQKKWQEEMELY RQKVEPLGEEMRDRARAHVDALRTHLAPYSDELRQRLAA RLEALKENGGARLAEYHAKATEHLSTLSEKAKPALEDLRQ GLLPVLESFKVSFLSALEEYTKKLNTQLPGTGAAALEGTQA KPQIPKDKSKVAGYIEIPDADIKEPVYPGPATREQLNRGVSF AEENESLDDQNISIAGHTFIDRPNYQFTNLKAAKKGSMVYF KVGNETRKYKMTSIRNVKPTAVEVLDEQKGKDKQLTLITC DDYNEETGVWETRKIFVATEVKLEHHHHHH
2 | MSP9-LPGT(GGS)x5-SrtA-His.sub.10 amino acid sequence, shown in FIG. 14A. | MASSENLYFQGSTFSKLREQLGPVTQEFWDNLEKETEGLR QEMSKDLEEVKAKVQPYLDDFQKKWQEEMELYRQKVEP LGEEMRDRARAHVDALRTHLAPYSDELRQRLAARLEALK ENGGARLAEYHAKATEHLSTLSEKAKPALEDLRQGLLPVL ESFKVSFLSALEEYTKKLNTQLPGTGGSGGSGGSGGSGGSQ AKPQIPKDKSKVAGYIEIPDADIKEPVYPGPATPEQLNRGVS FAEENESLDDQNISIAGHTFIDRPNYQFTNLKAAKKGSMVY FKVGNETRKYKMTSIRDVKPTDVGVLDEQKGKDKQLTLIT CDDYNEKTGVWEKRKIFVATEVKLEHHHHHHHHHH
3 | MSP11-LPGT(GGS)5-wtSrtA-His.sub.10 amino acid sequence, shown in FIG. 14B. | MASSENLYFQGSTFSKLREQLGPVTQEFWDNLEKETEGLR QEMSKDLEEVKAKVQPYLDDFQKKWQEEMELYRQKVEP LRAELQEGARQKLHELQEKLSPLGEEMRDRARAHVDALR THLAPYSDELRQRLAARLEALKENGGARLAEYHAKATEH LSTLSEKAKPALEDLRQGLLPVLESFKVSFLSALEEYTKKL NTQLPGTGGSGGSGGSGGSGGSQAKPQIPKDKSKVAGYIEI PDADIKEPVYPGPATPEQLNRGVSFAEENESLDDQNISIAG HTFIDRPNYQFTNLKAAKKGSMVYFKVGNETRKYKMTSIR DVKPTDVGVLDEQKGKDKQLTLITCDDYNEKTGVWEKRK IFVATEVKLEHHHHHHHHHH
4 | MSP20-LPGT(GGS)5-wtSrtA-His.sub.10 amino acid sequence, shown in FIG. 14C. | MASSENLYFQGSTFSKLREQLGPVTQEFWDNLEKETEGLR QEMSKDLEEVKAKVQPYLDDFQKKWQEEMELYRQKVEP LRAELQEGARQKLHELQEKLSPLGEEMRDRARAHVDALR
THLAPYSDELRQRLAARLEALKENGGARLAEYHAKATEH LSTLSEKAKPALEDLRQGLLPVLESFKVSFLSALEEYTKKL NTQGTPVTQEFWDNLEKETEGLRQEMSKDLEEVKAKVQP YLDDFQKKWQEEMELYRQKVEPLRAELQEGARQKLHELQ EKLSPLGEEMRDRARAHVDALRTHLAPYSDELRQRLAARL EALKENGGARLAEYHAKATEHLSTLSEKAKPALEDLRQGL LPVLESFKVSFLSALEEYTKKLNTQLPGTGGSGGSGGSGGS GGSQAKPQIPKDKSKVAGYIEIPDADIKEPVYPGPATPEQL NRGVSFAEENESLDDQNISIAGHTFIDRPNYQFTNLKAAKK GSMVYFKVGNETRKYKMTSIRDVKPTDVGVLDEQKGKDK QLTLITCDDYNEKTGVWEKRKIFVATEVKLEHHHHHHHH HH
5 | MSP7-LPGT(GGS)5-wtSrtA-His.sub.10 amino acid sequence, shown in FIG. 14D. | MASSENLYFQGSTFSKLREQLGPVTQEFWDNLEKETEGLR QEMSKDLEEVKAKVQPLGEEMRDRARAHVDALRTHLAPY SDELRQRLAARLEALKENGGARLAEYHAKATEHLSTLSEK AKPALEDLRQGLLPVLESFKVSFLSALEEYTKKLNTQLPGT GGSGGSGGSGGSGGSQAKPQIPKDKSKVAGYIEIPDADIKE PVYPGPATPEQLNRGVSFAEENESLDDQNISIAGHTFIDRPN YQFTNLKAAKKGSMVYFKVGNETRKYKMTSIRDVKPTDV GVLDEQKGKDKQLTLITCDDYNEKTGVWEKRKIFVATEV KLEHHHHHHHHHH
6 | MSP6-LPGT(GGS)5-wtSrtA-His.sub.10 amino acid sequence, shown in FIG. 14E. | MASSENLYFQGSTFSKLREQLGPVTQEFWDNLEKETEGLR QEMSKDLEEVKAKVQPYSDELRQRLAARLEALKENGGAR LAEYHAKATEHLSTLSEKAKPALEDLRQGLLPVLESFKVSF LSALEEYTKKLNTQLPGTGGSGGSGGSGGSGGSQAKPQIP KDKSKVAGYIEIPDADIKEPVYPGPATPEQLNRGVSFAEEN ESLDDQNISIAGHTFIDRPNYQFTNLKAAKKGSMVYFKVG NETRKYKMTSIRDVKPTDVGVLDEQKGKDKQLTLITCDDY NEKTGVWEKRKIFVATEVKLEHHHHHHHHHH
7 | MSP9-LPGTGAAALEGTLVPRS-SrtA-His.sub.10 amino acid sequence, shown in FIG. 15A. | MASSENLYFQGSTFSKLREQLGPVTQEFWDNLEKETEGLR QEMSKDLEEVKAKVQPYLDDFQKKWQEEMELYRQKVEP LGEEMRDRARAHVDALRTHLAPYSDELRQRLAARLEALK ENGGARLAEYHAKATEHLSTLSEKAKPALEDLRQGLLPVL ESFKVSFLSALEEYTKKLNTQLPGTGAAALEGTLVPRSQAK PQIPKDKSKVAGYIEIPDADIKEPVYPGPATPEQLNRGVSFA EENESLDDQNISIAGHTFIDRPNYQFTNLKAAKKGSMVYFK VGNETRKYKMTSIRDVKPTDVGVLDEQKGKDKQLTLITCD DYNEKTGVWEKRKIFVATEVKLEHHHHHHHHHH
8 | MSP7-LPGTGAAALEGTLVPRS-SrtA-His.sub.10 amino acid sequence, shown in FIG. 15B. | MASSENLYFQGSTFSKLREQLGPVTQEFWDNLEKETEGLR QEMSKDLEEVKAKVQPLGEEMRDRARAHVDALRTHLAPY SDELRQRLAARLEALKENGGARLAEYHAKATEHLSTLSEK AKPALEDLRQGLLPVLESFKVSFLSALEEYTKKLNTQLPGT
GAAALEGTLVPRSQAKPQIPKDKSKVAGYIEIPDADIKEPV YPGPATPEQLNRGVSFAEENESLDDQNISIAGHTFIDRPNYQ FTNLKAAKKGSMVYFKVGNETRKYKMTSIRDVKPTDVGV LDEQKGKDKQLTLITCDDYNEKTGVWEKRKIFVATEVKLE HHHHHHHHHH
9 | MSP6-LPGTGAAALEGTLVPRS-SrtA-His.sub.10 amino acid sequence, shown in FIG. 15C. | MASSENLYFQGSTFSKLREQLGPVTQEFWDNLEKETEGLR QEMSKDLEEVKAKVQPYSDELRQRLAARLEALKENGGAR LAEYHAKATEHLSTLSEKAKPALEDLRQGLLPVLESFKVSF LSALEEYTKKLNTQLPGTGAAALEGTLVPRSQAKPQIPKDK SKVAGYIEIPDADIKEPVYPGPATPEQLNRGVSFAEENESLD DQNISIAGHTFIDRPNYQFTNLKAAKKGSMVYFKVGNETR KYKMTSIRDVKPTDVGVLDEQKGKDKQLTLITCDDYNEKT GVWEKRKIFVATEVKLEHHHHHHHHHH
10 | MSP11-LPGTGAAALEGTLVPRS-SrtA-His.sub.10 amino acid sequence, shown in FIG. 15D. | MASSENLYFQGSTFSKLREQLGPVTQEFWDNLEKETEGLR QEMSKDLEEVKAKVQPYLDDFQKKWQEEMELYRQKVEP LRAELQEGARQKLHELQEKLSPLGEEMRDRARAHVDALR THLAPYSDELRQRLAARLEALKENGGARLAEYHAKATEH LSTLSEKAKPALEDLRQGLLPVLESFKVSFLSALEEYTKKL NTQLPGTGAAALEGTLVPRSQAKPQIPKDKSKVAGYIEIPD ADIKEPVYPGPATPEQLNRGVSFAEENESLDDQNISIAGHTF IDRPNYQFTNLKAAKKGSMVYFKVGNETRKYKMTSIRDV KPTDVGVLDEQKGKDKQLTLITCDDYNEKTGVWEKRKIF VATEVKLEHHHHHHHHHH
11 | MSP20-LPGTGAAALEGTLVPRS-SrtA-His.sub.10 amino acid sequence, shown in FIG. 15E. | MASSENLYFQGSTFSKLREQLGPVTQEFWDNLEKETEGLR QEMSKDLEEVKAKVQPYLDDFQKKWQEEMELYRQKVEP LRAELQEGARQKLHELQEKLSPLGEEMRDRARAHVDALR THLAPYSDELRQRLAARLEALKENGGARLAEYHAKATEH LSTLSEKAKPALEDLRQGLLPVLESFKVSFLSALEEYTKKL NTQGTPVTQEFWDNLEKETEGLRQEMSKDLEEVKAKVQP YLDDFQKKWQEEMELYRQKVEPLRAELQEGARQKLHELQ EKLSPLGEEMRDRARAHVDALRTHLAPYSDELRQRLAARL EALKENGGARLAEYHAKATEHLSTLSEKAKPALEDLRQGL LPVLESFKVSFLSALEEYTKKLNTQLPGTGAAALEGTLVPR SQAKPQIPKDKSKVAGYIEIPDADIKEPVYPGPATPEQLNRG VSFAEENESLDDQNISIAGHTFIDRPNYQFTNLKAAKKGSM VYFKVGNETRKYKMTSIRDVKPTDVGVLDEQKGKDKQLT LITCDDYNEKTGVWEKRKIFVATEVKLEHHHHHHHHHH
12 | G-SFTI-LPGT(GGS)5LVPRS-SrtA-His.sub.10 amino acid sequence, shown in FIG. 16A. | MASSLPRDAENLYFQGRCTKSIPPICFPDLPGTGGSGGSGGS GGSGGSLVPRSQAKPQIPKDKSKVAGYIEIPDADIKEPVYP GPATPEQLNRGVSFAEENESLDDQNISIAGHTFIDRPNYQFT NLKAAKKGSMVYFKVGNETRKYKMTSIRDVKPTDVGVLD
EQKGKDKQLTLITCDDYNEKTGVWEKRKIFVATEVKLEHH HHHHHHHH
13 | G-kB1-LPGT(GGS)5LVPRS-SrtA-His.sub.10 amino acid sequence, shown in FIG. 16B. | MASSLPRDAENLYFQGCGETCVGGTCNTPGCTCSWPVCTR NGLPVTGGSGGSGGSGGSGGSLVPRSQAKPQIPKDKSKVA GYIEIPDADIKEPVYPGPATPEQLNRGVSFAEENESLDDQNI SIAGHTFIDRPNYQFTNLKAAKKGSMVYFKVGNETRKYK MTSIRDVKPTDVGVLDEQKGKDKQLTLITCDDYNEKTGV WEKRKIFVATEVKLEHHHHHHHHHH
14 | G-SFTI-LPGTGAAALEGTLVPRS-SrtA-His.sub.10 amino acid sequence of FIG. 16C. | MASSENLYFQGRCTKSIPPICFPDLPGTGAAALEGTLVPRSQ AKPQIPKDKSKVAGYIEIPDADIKEPVYPGPATPEQLNRGVS FAEENESLDDQNISIAGHTFIDRPNYQFTNLKAAKKGSMVY FKVGNETRKYKMTSIRDVKPTDVGVLDEQKGKDKQLTLIT CDDYNEKTGVWEKRKIFVATEVKLEHHHHHHHHHH
15 | GGG-SFTI-LPGTGAAALEGTLVPRS-SrtA-His.sub.10 amino acid sequence of FIG. 16D. | MASSENLYFQGGGGRCTKSIPPICFPDLPGTGAAALEGTLV PRSQAKPQIPKDKSKVAGYIEIPDADIKEPVYPGPATPEQLN RGVSFAEENESLDDQNISIAGHTFIDRPNYQFTNLKAAKKG SMVYFKVGNETRKYKMTSIRDVKPTDVGVLDEQKGKDKQ LTLITCDDYNEKTGVWEKRKIFVATEVKLEHHHHHHHHH H
16 | GGG-SFTI-LPGT(GGS)5LVPRS-SrtA-His.sub.10 amino acid sequence, shown in FIG. 16E. | MASSENLYFQGGGRCTKSIPPICFPDLPGTGGSGGSGGSGG SGGSLVPRSQAKPQIPKDKSKVAGYIEIPDADIKEPVYPGPA TPEQLNRGVSFAEENESLDDQNISIAGHTFIDRPNYQFTNLK AAKKGSMVYFKVGNETRKYKMTSIRDVKPTDVGVLDEQK GKDKQLTLITCDDYNEKTGVWEKRKIFVATEVKLEHHHH HHHHHH
17 | GGG-kB1-LPGT(GGS)5LVPRS-SrtA-His.sub.10 amino acid sequence, shown in FIG. 16F. | MASSENLYFQGGGCGETCVGGTCNTPGCTCSWPVCTRNG LPVTGGSGGSGGSGGSGGSLVPRSQAKPQIPKDKSKVAGYI EIPDADIKEPVYPGPATPEQLNRGVSFAEENESLDDQNISIA GHTFIDRPNYQFTNLKAAKKGSMVYFKVGNETRKYKMTSI RDVKPTDVGVLDEQKGKDKQLTLITCDDYNEKTGVWEKR KIFVATEVKLEHHHHHHHHHH
18 | GGG-kB1-LPVTGAAALEGTLVPRS-SrtA-His.sub.10 amino acid sequence, shown in FIG. 16G. | MASSENLYFQGGGCGETCVGGTCNTPGCTCSWPVCTRNG LPVTGAAALEGTLVPRSQAKPQIPKDKSKVAGYIEIPDADI KEPVYPGPATPEQLNRGVSFAEENESLDDQNISIAGHTFIDR PNYQFTNLKAAKKGSMVYFKVGNETRKYKMTSIRDVKPT
DVGVLDEQKGKDKQLTLITCDDYNEKTGVWEKRKIFVAT EVKLEHHHHHHHHHH
19 | G-kB1-LPVTGAAALEGTLVPRS-SrtA-His.sub.10 amino acid sequence, shown in FIG. 16H. | MASSENLYFQGCGETCVGGTCNTPGCTCSWPVCTRNGLP VTGAAALEGTLVPRSQAKPQIPKDKSKVAGYIEIPDADIKE PVYPGPATPEQLNRGVSFAEENESLDDQNISIAGHTFIDRPN YQFTNLKAAKKGSMVYFKVGNETRKYKMTSIRDVKPTDV GVLDEQKGKDKQLTLITCDDYNEKTGVWEKRKIFVATEV KLEHHHHHHHHHH
20 | GG-Vc1.1-LPGTGAAALEGTLVPRS-SrtA-His.sub.10 amino acid sequence, shown in FIG. 16I. | MASSENLYFQGGCCSDPRCNYDHPEICGLPGTGAAALEGT LVPRSQAKPQIPKDKSKVAGYIEIPDADIKEPVYPGPATPEQ LNRGVSFAEENESLDDQNISIAGHTFIDRPNYQFTNLKAAK KGSMVYFKVGNETRKYKMTSIRDVKPTDVGVLDEQKGKD KQLTLITCDDYNEKTGVWEKRKIFVATEVKLEHHHHHHH HHH
21 | GG-Vc1.1-LPGT(GGS)5LVPRS-SrtA-His.sub.10 amino acid sequence, shown in FIG. 16J. | MASSENLYFQGGCCSDPRCNYDHPEICGLPGTGGSGGSGG SGGSGGSLVPRSQAKPQIPKDKSKVAGYIEIPDADIKEPVYP GPATPEQLNRGVSFAEENESLDDQNISIAGHTFIDRPNYQFT NLKAAKKGSMVYFKVGNETRKYKMTSIRDVKPTDVGVLD EQKGKDKQLTLITCDDYNEKTGVWEKRKIFVATEVKLEHH HHHHHHHH
22 | His.sub.6-TEV-MSP9-eSrtA nucleotide sequence, shown in FIG. 7. | ATGGGGTCGTCCCACCATCACCACCATCATGAGAATTTG TACTTCCAAGGATCGACGTTTTCCAAGTTACGCGAACAG
TTAGGACCCGTAACGCAGGAATTCTGGGACAACCTTGA GAAAGAGACGGAAGGCCTTCGCCAGGAGATGTCAAAAG ACCTTGAGGAAGTGAAGGCTAAGGTACAACCCTATCTG GACGATTTTCAAAAGAAGTGGCAAGAAGAAATGGAGTT GTATCGTCAAAAAGTTGAACCTTTGGGGGAGGAGATGC GTGATCGCGCCCGCGCCCACGTGGATGCATTGCGCACGC ATTTAGCTCCATATAGTGATGAGTTGCGCCAGCGTTTGG CCGCACGTTTAGAGGCTTTGAAAGAGAATGGCGGTGCC CGTCTGGCCGAGTACCATGCAAAGGCGACAGAACATTT GTCCACCTTGAGCGAGAAAGCTAAACCGGCTCTGGAGG ACTTGCGTCAGGGCTTGCTTCCGGTACTTGAATCATTCA AGGTGTCCTTTCTGTCTGCCTTAGAAGAGTATACTAAGA AGCTTAACACACAACTGCCTGGCACAGGTGCTGCAGCTT TAGAGGGTACCCAAGCTAAACCGCAGATCCCCAAAGAC AAATCTAAAGTTGCAGGTTATATTGAGATCCCAGACGCG GATATTAAGGAGCCCGTGTATCCGGGTCCCGCCACTCGC GAGCAGTTGAATCGCGGAGTCTCCTTTGCAGAGGAAAA TGAATCGTTGGATGACCAGAATATCTCTATTGCCGGTCA TACATTCATCGACCGTCCAAATTACCAATTCACTAACCT TAAAGCCGCGAAAAAGGGGTCGATGGTCTATTTCAAGG TGGGCAATGAAACACGCAAATATAAAATGACTTCGATT CGTAACGTCAAACCAACGGCTGTGGAAGTGTTAGACGA GCAAAAAGGCAAGGATAAGCAACTTACTTTAATTACGT GTGACGATTATAATGAAGAGACAGGAGTATGGGAGACA CGCAAAATCTTCGTGGCGACGGAGGTTAAGCTCGAGCA TCATCATCATCACCACTAG 23 MSP9-LPGT(GGS)x5- ATGGCTTCGTCCGAGAATTTGTACTTCCAAGGATCGACG SrtA-His.sub.10 nucleotide TTTTCCAAGTTACGCGAACAGTTAGGACCCGTAACGCAG sequence, shown in FIG. GAATTCTGGGACAACCTTGAGAAAGAGACGGAAGGCCT 14A. 
TCGCCAGGAGATGTCAAAAGACCTTGAGGAAGTGAAGG CTAAGGTACAACCCTATCTGGACGATTTTCAAAAGAAGT GGCAAGAAGAAATGGAGTTGTATCGTCAAAAAGTTGAA CCTTTGGGGGAGGAGATGCGTGATCGCGCCCGCGCCCA CGTGGATGCATTGCGCACGCATTTAGCTCCATATAGTGA TGAGTTGCGCCAGCGTTTGGCCGCACGTTTAGAGGCTTT GAAAGAGAATGGCGGTGCCCGTCTGGCCGAGTACCATG CAAAGGCGACAGAACATTTGTCCACCTTGAGCGAGAAA GCTAAACCGGCTCTGGAGGACTTGCGTCAGGGCTTGCTT CCGGTACTTGAATCATTCAAGGTGTCCTTTCTGTCTGCCT TAGAAGAGTATACTAAGAAGCTTAACACACAACTGCCT GGTACCGGGGGATCGGGAGGTTCAGGTGGGTCCGGTGG TAGTGGTGGGAGTCAAGCTAAACCTCAAATTCCGAAAG ATAAATCGAAAGTGGCAGGCTATATTGAAATTCCAGAT GCTGATATTAAAGAACCAGTATATCCAGGACCAGCAAC ACCTGAACAATTAAATAGAGGTGTAAGCTTTGCAGAAG AAAATGAATCACTAGATGATCAAAATATTTCAATTGCAG GACACACTTTCATTGACCGTCCGAACTATCAATTTACAA ATCTTAAAGCAGCCAAAAAAGGTAGTATGGTGTACTTTA AAGTTGGTAATGAAACACGTAAGTATAAAATGACAAGT ATAAGAGATGTTAAGCCTACAGATGTAGGAGTTCTAGAT GAACAAAAAGGTAAAGATAAACAATTAACATTAATTAC TTGTGATGATTACAATGAAAAGACAGGCGTTTGGGAAA AACGTAAAATCTTTGTAGCTACAGAAGTCAAACTCGAGC ACCACCACCACCACCACCATCATCATCATTGA 24 MSP11-LPGT(GGS)5- ATGGCTAGCAGCGAAAACCTGTATTTTCAGGGCAGCACC wtSrtA-His.sub.10 nucleotide TTTAGCAAACTGCGTGAACAGCTGGGCCCGGTGACCCA sequence, shown in FIG. GGAATTTTGGGATAACCTGGAAAAAGAAACCGAAGGCC 14B. 
TGCGTCAGGAAATGAGCAAAGATCTGGAAGAGGTGAAA GCGAAAGTGCAGCCGTATCTGGATGACTTTCAGAAAAA ATGGCAGGAAGAGATGGAACTGTATCGTCAGAAAGTGG AACCGCTGCGTGCGGAACTGCAGGAAGGCGCGCGTCAG AAACTGCATGAACTGCAGGAAAAACTGAGCCCGCTGGG CGAAGAGATGCGTGATCGTGCGCGTGCGCATGTGGATG CGCTGCGTACCCATCTGGCGCCGTATAGCGATGAACTGC GTCAGCGTCTGGCGGCCCGTCTGGAAGCGCTGAAAGAA AACGGCGGTGCGCGTCTGGCGGAATATCATGCGAAAGC GACCGAACATCTGAGCACCCTGAGCGAAAAAGCGAAAC CGGCGCTGGAAGATCTGCGTCAGGGCCTGCTGCCGGTG CTGGAAAGCTTTAAAGTGAGCTTTCTGAGCGCGCTGGAA GAGTATACCAAAAAACTGAACACCCAGCTGCCGGGTAC CGGGGGATCGGGAGGTTCAGGTGGGTCCGGTGGTAGTG GTGGGAGTCAAGCTAAACCTCAAATTCCGAAAGATAAA TCGAAAGTGGCAGGCTATATTGAAATTCCAGATGCTGAT ATTAAAGAACCAGTATATCCAGGACCAGCAACACCTGA ACAATTAAATAGAGGTGTAAGCTTTGCAGAAGAAAATG AATCACTAGATGATCAAAATATTTCAATTGCAGGACACA CTTTCATTGACCGTCCGAACTATCAATTTACAAATCTTA AAGCAGCCAAAAAAGGTAGTATGGTGTACTTTAAAGTT GGTAATGAAACACGTAAGTATAAAATGACAAGTATAAG AGATGTTAAGCCTACAGATGTAGGAGTTCTAGATGAAC AAAAAGGTAAAGATAAACAATTAACATTAATTACTTGT GATGATTACAATGAAAAGACAGGCGTTTGGGAAAAACG TAAAATCTTTGTAGCTACAGAAGTCAAACTCGAGCACCA CCACCACCACCACCATCATCATCATTGA 25 MSP20-LPGT(GGS)5- ATGGCATCGTCGGAGAACTTGTATTTCCAAGGCTCTACT wtSrtA-His.sub.10 nucleotide TTCTCGAAGTTGCGTGAGCAGTTGGGACCTGTGACACAA sequence, shown in FIG. GAGTTCTGGGATAATTTAGAAAAGGAGACAGAAGGGCT 14C. 
GCGTCAAGAGATGAGTAAAGACCTTGAAGAAGTTAAAG CAAAGGTGCAGCCCTATCTGGATGATTTCCAAAAAAAAT GGCAAGAAGAAATGGAATTATACCGTCAGAAGGTAGAG CCACTTCGTGCAGAATTGCAAGAAGGCGCACGCCAGAA GTTGCACGAACTGCAAGAAAAATTGTCACCTTTGGGGG AGGAGATGCGCGACCGTGCACGCGCGCACGTTGACGCC TTGCGTACGCATCTGGCGCCGTACTCTGACGAATTACGT CAGCGCTTGGCCGCGCGCTTAGAGGCCTTGAAGGAGAA CGGGGGAGCGCGTCTTGCAGAGTACCATGCCAAAGCCA CGGAACATCTGTCCACCTTGAGCGAGAAGGCGAAGCCA GCACTGGAAGACTTACGCCAGGGTTTGCTGCCAGTCCTT GAGTCTTTTAAAGTATCGTTTCTTTCTGCGCTTGAGGAAT ACACGAAGAAGTTAAACACTCAGGGTACTCCAGTGACA CAGGAGTTTTGGGATAATTTGGAAAAAGAGACTGAAGG GCTTCGCCAAGAGATGTCGAAGGATTTGGAAGAGGTAA AGGCGAAGGTCCAACCTTACCTGGACGATTTCCAAAAG AAGTGGCAGGAAGAAATGGAGTTATACCGTCAGAAAGT CGAACCTTTACGTGCCGAATTACAAGAAGGAGCACGCC AAAAACTTCATGAGCTTCAGGAGAAGCTGTCCCCCCTTG GTGAGGAGATGCGCGACCGTGCGCGTGCTCATGTAGAT GCATTACGTACCCACCTTGCCCCCTATAGCGATGAGTTG CGTCAGCGTCTTGCCGCCCGCCTGGAAGCTTTGAAAGAG AATGGCGGTGCTCGTTTAGCAGAGTATCACGCCAAGGCC ACCGAACATCTTTCAACTTTGTCTGAGAAAGCCAAACCT GCGTTAGAAGACTTGCGTCAAGGGCTTCTGCCTGTCTTA GAGTCGTTCAAGGTGTCATTTCTGTCGGCGCTTGAAGAA TATACTAAAAAGTTGAATACACAGTTACCTGGTACCGGG GGATCGGGAGGTTCAGGTGGGTCCGGTGGTAGTGGTGG GAGTCAAGCTAAACCTCAAATTCCGAAAGATAAATCGA AAGTGGCAGGCTATATTGAAATTCCAGATGCTGATATTA AAGAACCAGTATATCCAGGACCAGCAACACCTGAACAA TTAAATAGAGGTGTAAGCTTTGCAGAAGAAAATGAATC ACTAGATGATCAAAATATTTCAATTGCAGGACACACTTT CATTGACCGTCCGAACTATCAATTTACAAATCTTAAAGC AGCCAAAAAAGGTAGTATGGTGTACTTTAAAGTTGGTA ATGAAACACGTAAGTATAAAATGACAAGTATAAGAGAT GTTAAGCCTACAGATGTAGGAGTTCTAGATGAACAAAA AGGTAAAGATAAACAATTAACATTAATTACTTGTGATGA TTACAATGAAAAGACAGGCGTTTGGGAAAAACGTAAAA TCTTTGTAGCTACAGAAGTCAAACTCGAGCACCACCACC ACCACCACCATCATCATCATTGA 26 MSP7-LPGT(GGS)5- ATGGCTTCGTCCGAGAATTTGTACTTCCAAGGATCGACG wtSrtA-His.sub.10 nucleotide TTTTCCAAGTTACGCGAACAGTTAGGACCCGTAACGCAG sequence, shown in FIG. GAATTCTGGGACAACCTTGAGAAAGAGACGGAAGGCCT 14D. 
TCGCCAGGAGATGTCAAAAGACCTTGAGGAAGTGAAGG CTAAGGTACAACCCTTGGGGGAGGAGATGCGTGATCGC GCCCGCGCCCACGTGGATGCATTGCGCACGCATTTAGCT CCATATAGTGATGAGTTGCGCCAGCGTTTGGCCGCACGT TTAGAGGCTTTGAAAGAGAATGGCGGTGCCCGTCTGGCC GAGTACCATGCAAAGGCGACAGAACATTTGTCCACCTTG AGCGAGAAAGCTAAACCGGCTCTGGAGGACTTGCGTCA GGGCTTGCTTCCGGTACTTGAATCATTCAAGGTGTCCTTT CTGTCTGCCTTAGAAGAGTATACTAAGAAGCTTAACACA CAACTGCCTGGTACCGGGGGATCGGGAGGTTCAGGTGG GTCCGGTGGTAGTGGTGGGAGTCAAGCTAAACCTCAAA TTCCGAAAGATAAATCGAAAGTGGCAGGCTATATTGAA ATTCCAGATGCTGATATTAAAGAACCAGTATATCCAGGA CCAGCAACACCTGAACAATTAAATAGAGGTGTAAGCTTT GCAGAAGAAAATGAATCACTAGATGATCAAAATATTTC AATTGCAGGACACACTTTCATTGACCGTCCGAACTATCA ATTTACAAATCTTAAAGCAGCCAAAAAAGGTAGTATGG TGTACTTTAAAGTTGGTAATGAAACACGTAAGTATAAAA TGACAAGTATAAGAGATGTTAAGCCTACAGATGTAGGA GTTCTAGATGAACAAAAAGGTAAAGATAAACAATTAAC ATTAATTACTTGTGATGATTACAATGAAAAGACAGGCGT TTGGGAAAAACGTAAAATCTTTGTAGCTACAGAAGTCA AACTCGAGCACCACCACCACCACCACCATCATCATCATT GA 27 MSP6-LPGT(GGS)5- ATGGCTTCGTCCGAGAATTTGTACTTCCAAGGATCGACG wtSrtA-His.sub.10 nucleotide TTTTCCAAGTTACGCGAACAGTTAGGACCCGTAACGCAG sequence, shown in FIG. GAATTCTGGGACAACCTTGAGAAAGAGACGGAAGGCCT 14E. 
TCGCCAGGAGATGTCAAAAGACCTTGAGGAAGTGAAGG CTAAGGTACAACCCTATAGTGATGAGTTGCGCCAGCGTT TGGCCGCACGTTTAGAGGCTTTGAAAGAGAATGGCGGT GCCCGTCTGGCCGAGTACCATGCAAAGGCGACAGAACA TTTGTCCACCTTGAGCGAGAAAGCTAAACCGGCTCTGGA GGACTTGCGTCAGGGCTTGCTTCCGGTACTTGAATCATT CAAGGTGTCCTTTCTGTCTGCCTTAGAAGAGTATACTAA GAAGCTTAACACACAACTGCCTGGTACCGGGGGATCGG GAGGTTCAGGTGGGTCCGGTGGTAGTGGTGGGAGTCAA GCTAAACCTCAAATTCCGAAAGATAAATCGAAAGTGGC AGGCTATATTGAAATTCCAGATGCTGATATTAAAGAACC AGTATATCCAGGACCAGCAACACCTGAACAATTAAATA GAGGTGTAAGCTTTGCAGAAGAAAATGAATCACTAGAT GATCAAAATATTTCAATTGCAGGACACACTTTCATTGAC CGTCCGAACTATCAATTTACAAATCTTAAAGCAGCCAAA AAAGGTAGTATGGTGTACTTTAAAGTTGGTAATGAAACA CGTAAGTATAAAATGACAAGTATAAGAGATGTTAAGCC TACAGATGTAGGAGTTCTAGATGAACAAAAAGGTAAAG ATAAACAATTAACATTAATTACTTGTGATGATTACAATG AAAAGACAGGCGTTTGGGAAAAACGTAAAATCTTTGTA GCTACAGAAGTCAAACTCGAGCACCACCACCACCACCA CCATCATCATCATTGA 28 MSP9- ATGGCTTCGTCCGAGAATTTGTACTTCCAAGGATCGACG LPGTGAAALEGTLVPR TTTTCCAAGTTACGCGAACAGTTAGGACCCGTAACGCAG S-SrtA-His.sub.10 nucleotide GAATTCTGGGACAACCTTGAGAAAGAGACGGAAGGCCT sequence, shown in FIG. TCGCCAGGAGATGTCAAAAGACCTTGAGGAAGTGAAGG 15A. 
CTAAGGTACAACCCTATCTGGACGATTTTCAAAAGAAGT GGCAAGAAGAAATGGAGTTGTATCGTCAAAAAGTTGAA CCTTTGGGGGAGGAGATGCGTGATCGCGCCCGCGCCCA CGTGGATGCATTGCGCACGCATTTAGCTCCATATAGTGA TGAGTTGCGCCAGCGTTTGGCCGCACGTTTAGAGGCTTT GAAAGAGAATGGCGGTGCCCGTCTGGCCGAGTACCATG CAAAGGCGACAGAACATTTGTCCACCTTGAGCGAGAAA GCTAAACCGGCTCTGGAGGACTTGCGTCAGGGCTTGCTT CCGGTACTTGAATCATTCAAGGTGTCCTTTCTGTCTGCCT TAGAAGAGTATACTAAGAAGCTTAACACACAACTGCCT GGCACAGGTGCTGCAGCTTTAGAGGGTACCCTGGTGCCG CGCAGCCAAGCTAAACCTCAAATTCCGAAAGATAAATC GAAAGTGGCAGGCTATATTGAAATTCCAGATGCTGATAT TAAAGAACCAGTATATCCAGGACCAGCAACACCTGAAC AATTAAATAGAGGTGTAAGCTTTGCAGAAGAAAATGAA TCACTAGATGATCAAAATATTTCAATTGCAGGACACACT TTCATTGACCGTCCGAACTATCAATTTACAAATCTTAAA GCAGCCAAAAAAGGTAGTATGGTGTACTTTAAAGTTGGT AATGAAACACGTAAGTATAAAATGACAAGTATAAGAGA TGTTAAGCCTACAGATGTAGGAGTTCTAGATGAACAAA AAGGTAAAGATAAACAATTAACATTAATTACTTGTGATG ATTACAATGAAAAGACAGGCGTTTGGGAAAAACGTAAA ATCTTTGTAGCTACAGAAGTCAAACTCGAGCACCACCAC CACCACCACCATCATCATCATTGA 29 MSP7- ATGGCTTCGTCCGAGAATTTGTACTTCCAAGGATCGACG LPGTGAAALEGTLVPR TTTTCCAAGTTACGCGAACAGTTAGGACCCGTAACGCAG S-SrtA-His.sub.10 nucleotide GAATTCTGGGACAACCTTGAGAAAGAGACGGAAGGCCT sequence, shown in FIG. TCGCCAGGAGATGTCAAAAGACCTTGAGGAAGTGAAGG 15B. 
CTAAGGTACAACCCTTGGGGGAGGAGATGCGTGATCGC GCCCGCGCCCACGTGGATGCATTGCGCACGCATTTAGCT CCATATAGTGATGAGTTGCGCCAGCGTTTGGCCGCACGT TTAGAGGCTTTGAAAGAGAATGGCGGTGCCCGTCTGGCC GAGTACCATGCAAAGGCGACAGAACATTTGTCCACCTTG AGCGAGAAAGCTAAACCGGCTCTGGAGGACTTGCGTCA GGGCTTGCTTCCGGTACTTGAATCATTCAAGGTGTCCTTT CTGTCTGCCTTAGAAGAGTATACTAAGAAGCTTAACACA CAACTGCCTGGCACAGGTGCTGCAGCTTTAGAGGGTACC CTGGTGCCGCGCAGCCAAGCTAAACCTCAAATTCCGAA AGATAAATCGAAAGTGGCAGGCTATATTGAAATTCCAG ATGCTGATATTAAAGAACCAGTATATCCAGGACCAGCA ACACCTGAACAATTAAATAGAGGTGTAAGCTTTGCAGA AGAAAATGAATCACTAGATGATCAAAATATTTCAATTGC AGGACACACTTTCATTGACCGTCCGAACTATCAATTTAC AAATCTTAAAGCAGCCAAAAAAGGTAGTATGGTGTACT TTAAAGTTGGTAATGAAACACGTAAGTATAAAATGACA AGTATAAGAGATGTTAAGCCTACAGATGTAGGAGTTCTA GATGAACAAAAAGGTAAAGATAAACAATTAACATTAAT TACTTGTGATGATTACAATGAAAAGACAGGCGTTTGGGA AAAACGTAAAATCTTTGTAGCTACAGAAGTCAAACTCG AGCACCACCACCACCACCACCATCATCATCATTGA 30 MSP6- ATGGCTTCGTCCGAGAATTTGTACTTCCAAGGATCGACG LPGTGAAALEGTLVPR TTTTCCAAGTTACGCGAACAGTTAGGACCCGTAACGCAG S-SrtA-His.sub.10 nucleotide GAATTCTGGGACAACCTTGAGAAAGAGACGGAAGGCCT sequence, shown in FIG. TCGCCAGGAGATGTCAAAAGACCTTGAGGAAGTGAAGG 15C. 
CTAAGGTACAACCCTATAGTGATGAGTTGCGCCAGCGTT TGGCCGCACGTTTAGAGGCTTTGAAAGAGAATGGCGGT GCCCGTCTGGCCGAGTACCATGCAAAGGCGACAGAACA TTTGTCCACCTTGAGCGAGAAAGCTAAACCGGCTCTGGA GGACTTGCGTCAGGGCTTGCTTCCGGTACTTGAATCATT CAAGGTGTCCTTTCTGTCTGCCTTAGAAGAGTATACTAA GAAGCTTAACACACAACTGCCTGGCACAGGTGCTGCAG CTTTAGAGGGTACCCTGGTGCCGCGCAGCCAAGCTAAA CCTCAAATTCCGAAAGATAAATCGAAAGTGGCAGGCTA TATTGAAATTCCAGATGCTGATATTAAAGAACCAGTATA TCCAGGACCAGCAACACCTGAACAATTAAATAGAGGTG TAAGCTTTGCAGAAGAAAATGAATCACTAGATGATCAA AATATTTCAATTGCAGGACACACTTTCATTGACCGTCCG AACTATCAATTTACAAATCTTAAAGCAGCCAAAAAAGG TAGTATGGTGTACTTTAAAGTTGGTAATGAAACACGTAA GTATAAAATGACAAGTATAAGAGATGTTAAGCCTACAG ATGTAGGAGTTCTAGATGAACAAAAAGGTAAAGATAAA CAATTAACATTAATTACTTGTGATGATTACAATGAAAAG ACAGGCGTTTGGGAAAAACGTAAAATCTTTGTAGCTACA GAAGTCAAACTCGAGCACCACCACCACCACCACCATCA TCATCATTGA 31 MSP11- ATGGCTAGCAGCGAAAACCTGTATTTTCAGGGCAGCACC LPGTGAAALEGTLVPR TTTAGCAAACTGCGTGAACAGCTGGGCCCGGTGACCCA S-SrtA-His.sub.10 nucleotide GGAATTTTGGGATAACCTGGAAAAAGAAACCGAAGGCC sequence, shown in FIG. TGCGTCAGGAAATGAGCAAAGATCTGGAAGAGGTGAAA 15D. 
GCGAAAGTGCAGCCGTATCTGGATGACTTTCAGAAAAA ATGGCAGGAAGAGATGGAACTGTATCGTCAGAAAGTGG AACCGCTGCGTGCGGAACTGCAGGAAGGCGCGCGTCAG AAACTGCATGAACTGCAGGAAAAACTGAGCCCGCTGGG CGAAGAGATGCGTGATCGTGCGCGTGCGCATGTGGATG CGCTGCGTACCCATCTGGCGCCGTATAGCGATGAACTGC GTCAGCGTCTGGCGGCCCGTCTGGAAGCGCTGAAAGAA AACGGCGGTGCGCGTCTGGCGGAATATCATGCGAAAGC GACCGAACATCTGAGCACCCTGAGCGAAAAAGCGAAAC CGGCGCTGGAAGATCTGCGTCAGGGCCTGCTGCCGGTG CTGGAAAGCTTTAAAGTGAGCTTTCTGAGCGCGCTGGAA GAGTATACCAAAAAACTGAACACCCAGCTGCCGGGTAC GGGCGCCGCTGCACTGGAAGGTACCCTGGTGCCGCGCA GCCAAGCTAAACCTCAAATTCCGAAAGATAAATCGAAA GTGGCAGGCTATATTGAAATTCCAGATGCTGATATTAAA GAACCAGTATATCCAGGACCAGCAACACCTGAACAATT AAATAGAGGTGTAAGCTTTGCAGAAGAAAATGAATCAC TAGATGATCAAAATATTTCAATTGCAGGACACACTTTCA TTGACCGTCCGAACTATCAATTTACAAATCTTAAAGCAG CCAAAAAAGGTAGTATGGTGTACTTTAAAGTTGGTAATG AAACACGTAAGTATAAAATGACAAGTATAAGAGATGTT AAGCCTACAGATGTAGGAGTTCTAGATGAACAAAAAGG TAAAGATAAACAATTAACATTAATTACTTGTGATGATTA CAATGAAAAGACAGGCGTTTGGGAAAAACGTAAAATCT TTGTAGCTACAGAAGTCAAACTCGAGCACCACCACCAC CACCACCATCATCATCATTGA 32 MSP20- ATGGCCAGTTCTGAAAACCTGTATTTTCAGGGATCGACG LPGTGAAALEGTLVPR TTTTCCAAGTTACGTGAGCAGTTAGGACCTGTTACACAA S-SrtA-His.sub.10 nucleotide GAGTTCTGGGATAACTTAGAGAAAGAGACAGAAGGGCT sequence, shown in FIG. 
GCGTCAAGAGATGAGTAAAGACCTTGAAGAAGTTAAAG 15E CAAAGGTTCAGCCCTATCTGGATGATTTCCAGAAGAAAT GGCAGGAGGAAATGGAATTATACCGTCAGAAGGTAGAG CCACTTCGTGCAGAATTGCAAGAAGGCGCACGCCAGAA GTTACACGAACTGCAAGAAAAATTATCACCTTTAGGGG AGGAGATGCGCGACCGTGCACGCGCGCACGTTGACGCC TTACGTACGCATCTGGCGCCGTACTCTGACGAATTACGT CAGCGCTTAGCCGCGCGCTTAGAGGCCTTAAAGGAGAA CGGGGGAGCGCGTCTTGCAGAGTACCATGCCAAAGCCA CGGAACATCTGTCCACCTTGAGCGAGAAGGCGAAGCCA GCACTGGAAGACTTACGCCAGGGTTTACTGCCAGTCCTT GAGTCTTTTAAAGTATCGTTTCTTTCTGCGCTTGAGGAAT ACACGAAGAAGTTAAACACTCAGGGTACTCCAGTTACA CAGGAGTTTTGGGATAATTTAGAAAAAGAGACTGAAGG GCTTCGCCAAGAGATGTCGAAGGATTTAGAAGAGGTAA AGGCGAAGGTCCAACCTTACCTGGACGATTTCCAGAAG AAGTGGCAAGAAGAAATGGAGTTATACCGTCAGAAAGT CGAACCTTTACGTGCCGAATTACAAGAAGGAGCACGCC AAAAACTTCATGAGCTTCAGGAGAAGCTGTCCCCCCTTG GTGAAGAGATGCGCGACCGTGCGCGTGCTCATGTAGAT GCATTACGTACCCACCTTGCCCCCTATAGCGATGAGTTA CGTCAGCGTCTTGCCGCCCGCCTGGAAGCTTTAAAAGAG AATGGCGGTGCTCGTTTAGCAGAGTATCACGCCAAGGCC ACCGAACATCTTTCAACTTTATCTGAGAAAGCCAAACCT GCGTTAGAAGACTTACGTCAAGGGCTTCTGCCTGTCTTA GAGTCGTTCAAGGTTTCATTTCTGTCGGCGCTTGAAGAA TATACTAAAAAGTTAAATACACAGTTACCTGGTACAGGT GCTGCAGCTTTAGAGGGTACCCTGGTGCCGCGCAGCCA AGCTAAACCTCAAATTCCGAAAGATAAATCGAAAGTGG CAGGCTATATTGAAATTCCAGATGCTGATATTAAAGAAC CAGTATATCCAGGACCAGCAACACCTGAACAATTAAAT AGAGGTGTAAGCTTTGCAGAAGAAAATGAATCACTAGA TGATCAAAATATTTCAATTGCAGGACACACTTTCATTGA CCGTCCGAACTATCAATTTACAAATCTTAAAGCAGCCAA AAAAGGTAGTATGGTGTACTTTAAAGTTGGTAATGAAAC ACGTAAGTATAAAATGACAAGTATAAGAGATGTTAAGC CTACAGATGTAGGAGTTCTAGATGAACAAAAAGGTAAA GATAAACAATTAACATTAATTACTTGTGATGATTACAAT GAAAAGACAGGCGTTTGGGAAAAACGTAAAATCTTTGT AGCTACAGAAGTCAAACTCGAGCACCACCACCACCACC ACCATCATCATCATTGA 33 G-SFTI- ATGGCCAGTTCTTTACCTCGTGACGCGGAAAACCTGTAT LPGT(GGS)5LVPRS- TTTCAGGGACGCTGCACCAAAAGCATTCCGCCGATTTGC SrtA-His.sub.10 nucleotide TTTCCGGATCTGCCTGGTACCGGGGGATCGGGAGGTTCA sequence, shown in FIG. GGTGGGTCCGGTGGTAGTGGTGGGAGTCTCGTGCCGCGC 16A. 
TCCCAAGCTAAACCTCAAATTCCGAAAGATAAATCGAA AGTGGCAGGCTATATTGAAATTCCAGATGCTGATATTAA AGAACCAGTATATCCAGGACCAGCAACACCTGAACAAT TAAATAGAGGTGTAAGCTTTGCAGAAGAAAATGAATCA CTAGATGATCAAAATATTTCAATTGCAGGACACACTTTC ATTGACCGTCCGAACTATCAATTTACAAATCTTAAAGCA GCCAAAAAAGGTAGTATGGTGTACTTTAAAGTTGGTAAT GAAACACGTAAGTATAAAATGACAAGTATAAGAGATGT TAAGCCTACAGATGTAGGAGTTCTAGATGAACAAAAAG GTAAAGATAAACAATTAACATTAATTACTTGTGATGATT ACAATGAAAAGACAGGCGTTTGGGAAAAACGTAAAATC TTTGTAGCTACAGAAGTCAAACTCGAGCACCACCACCAC CACCACCATCATCATCATTGA 34 G-kB1- ATGGCCAGTTCTTTACCTCGTGACGCGGAAAACCTGTAT LPGT(GGS)5LVPRS- TTTCAGGGATGCGGCGAAACCTGCGTGGGCGGCACCTG SrtA-His.sub.10 nucleotide CAACACCCCGGGCTGCACCTGCAGCTGGCCGGTGTGCA sequence, shown in FIG. CCCGCAACGGCCTGCCGGTGACCGGGGGATCGGGAGGT 16B. TCAGGTGGGTCCGGTGGTAGTGGTGGGAGTCTCGTGCCG CGCTCCCAAGCTAAACCTCAAATTCCGAAAGATAAATC GAAAGTGGCAGGCTATATTGAAATTCCAGATGCTGATAT TAAAGAACCAGTATATCCAGGACCAGCAACACCTGAAC AATTAAATAGAGGTGTAAGCTTTGCAGAAGAAAATGAA TCACTAGATGATCAAAATATTTCAATTGCAGGACACACT TTCATTGACCGTCCGAACTATCAATTTACAAATCTTAAA GCAGCCAAAAAAGGTAGTATGGTGTACTTTAAAGTTGGT AATGAAACACGTAAGTATAAAATGACAAGTATAAGAGA TGTTAAGCCTACAGATGTAGGAGTTCTAGATGAACAAA AAGGTAAAGATAAACAATTAACATTAATTACTTGTGATG ATTACAATGAAAAGACAGGCGTTTGGGAAAAACGTAAA ATCTTTGTAGCTACAGAAGTCAAACTCGAGCACCACCAC CACCACCACCATCATCATCATTGA 35 Amino acid linker L1, GAAALEGTQAKP shown in FIG. 2. 36 Amino acid linker L1a, GGSGGSGGQAKP shown in FIG. 2. 37 Amino acid linker L1b, GAAALEGTLVPRS shown in FIG. 2. 38 Amino acid linker L2, GGSGGSGGSGGSGGS shown in FIG. 2. 39 Amino acid linker L2b, GGSGGSGGSGGSGGSLVPRS shown in FIG. 2. 40 G-SFTI-LPGT amino acid GRCTKSIPPICFPDLPGT sequence, shown in FIG. 4. 41 G-(kB1-LPV)T amino GCGETCVGGTCNTPGCTCSWPVCTRNGLPVT acid sequence, shown in FIG. 4. 42 GG-Vc1.1-LPGT amino GGCCSDPRCNYDHPEICGLPGT acid sequence, shown in FIG. 4. 43 Optimized nucleotide TCTAGAAATAATTTTGTTTAACTTTAAGAAGGAGATATA sequence for an mRNA CATATGGCCAGTTCTTTACCTCGTGACGCGGAAAACCTG translation initiation TATTTTCAG region 1, shown in FIG. 12. 
44 | Optimized nucleotide sequence for an mRNA translation initiation region 2, shown in FIG. 12. | TCTAGAAATAATTTTGTTTAACTTTAAGAAGGAGATATACATATGGCCAGTTCTCTACCCCGTGATGCGGAAAACCTGTATTTTCAG
45 | Optimized nucleotide sequence for an mRNA translation initiation region 3, shown in FIG. 12. | TCTAGAAATAATTTTGTTTAACTTTAAGAAGGAGATATACATATGGCTAGTTCCCTACCCCGTGATGCAGAGAATCTGTACTTTCAG
46 | Optimized nucleotide sequence for an mRNA translation initiation region 4, shown in FIG. 12. | TCTAGAAATAATTTTGTTTAACTTTAAGAAGGAGATATACATATGGCTTCCTCCCTTCCACGCGACGCAGAGAATTTGTATTTCCAG
47 | Optimized nucleotide sequence for an mRNA translation initiation region 5, shown in FIG. 12. | TCTAGAAATAATTTTGTTTAACTTTAAGAAGGAGATATACATATGGCAAGTTCACTCCCTCGGGACGCAGAAAATCTGTACTTTCAA
48 | Optimized nucleotide sequence for an mRNA translation initiation region 6, shown in FIG. 12. | TCTAGAAATAATTTTGTTTAACTTTAAGAAGGAGATATACATATGGCCAGTTCGTTGCCCCGTGATGCTGAGAATCTGTACTTCCAA
49 | Optimized nucleotide sequence for an mRNA translation initiation region 7, shown in FIG. 12. | TCTAGAAATAATTTTGTTTAACTTTAAGAAGGAGATATACATATGGCTTCGAGTTTACCACGTGACGCTGAGAATCTGTACTTCCAG
50 | Optimized nucleotide sequence for an mRNA translation initiation region 8, shown in FIG. 12. | TCTAGAAATAATTTTGTTTAACTTTAAGAAGGAGATATACATATGGCCTCGTCTTTACCCCGTGATGCAGAGAATCTGTATTTTCAA
51 | Optimized nucleotide sequence for an mRNA translation initiation region 9, shown in FIG. 12. | TCTAGAAATAATTTTGTTTAACTTTAAGAAGGAGATATACATATGGCCTCTTCCCTTCCGCGCGATGCAGAGAACTTGTATTTCCAA
52 | P1-forward oligonucleotide, for wt-SrtA. | CGGGAATCCGGTACCCAAGCTAAACCTCAAATTCCGAAAG
53 | P1-reverse oligonucleotide. | TTTTTTCCGCTCGAGTTTGACTTCTGTAGCTACAAAGATTTTACG
54 | P2-forward oligonucleotide, for Gly2Ala. | TATACATATGGCTTCGTCCCACCATCAC
55 | P2-reverse oligonucleotide. | TCTCCTTCTTAAAGTTAAACAAAATTATTTC
56 | P3-forward oligonucleotide, for MSP11. | CGCGGATCCCATATGGCTAGCAGCGAAAACCTGTATTTTCAGGGCAGCACC
57 | P3-reverse oligonucleotide. | GGCGAATTCGGTACCCGGCAGCTGGGTG
58 P4-forward GAGAATTTGTACTTCCAAGGATC oligonucleotide, for removing N-terminal His- tag. 59 P4-reverse GGACGAAGCCATATGTATATC oligonucleotide. 60 P5-forward CATCATTGAGATCCGGCTGCTAAC oligonucleotide, for introducing His.sub.4 into His.sub.6 at the c-terminal. 61 P5-reverse ATGATGGTGGTGGTGGTGGTGGTG oligonucleotide. 62 P6-forward TGGGTCCGGTGGTAGTGGTGGGAGTCAAGCTAAACCGC oligonucleotide, for AGATC introducing (GGS)x5 linker in MSP9-eSrtA. 63 P6-reverse CCTGAACCTCCCGATCCCCCGGTACCAGGCAGTTGTGTG oligonucleotide. TTAAG 64 P7-forward TGGGTCCGGTGGTAGTGGTGGGAGTCAAGCTAAACCTC oligonucleotide, for AAATTC introducing (GGS)x5 linker in MSP9-wtSrtA. 65 P7-reverse CCTGAACCTCCCGATCCCCCGGTACCAGGCAGTTGTGTG oligonucleotide = P6- TTAAG reverse oligonucleotide. 66 P8-forward TTGGGGGAGGAGATGCGT oligonucleotide, for deleting H4 in MSP9 to make MSP7. 67 P8-reverse GGGTTGTACCTTAGCCTTCAC oligonucleotide. 68 P9-forward TATAGTGATGAGTTGCGC oligonucleotide, for deleting H4 and H.sub.6 in MSP9 to make MSP6. 69 P9-reverse GGGTTGTACCTTAGCCTTC oligonucleotide. 70 P10-forward CGATGCCGAGAATTTGTACTTCCAAGG oligonucleotide, for introducing an inhibitor peptide. 71 P10-reverse CGTGGCAAGGACGAAGCCATATGTATATC oligonucleotide. 72 P11-forward CGCGGAAAACCTGTATTTTCAGGGATCGACGTTTTCCAA oligonucleotide, G optimized inhibitory peptide, option1. 73 P11-reverse TCACGAGGTAAAGAACTGGCCATATGTATATCTCCTTCT oligonucleotide. TAAAGTTAAAC 74 P12-forward CGCAGAGAATTTGTATTTCCAGGGATCGACGTTTTCCAA oligonucleotide, G optimized inhibitory peptide, option2. 75 P12-reverse TCGCGTGGAAGGGAGGAAGCCATATGTATATCTCCTTCT oligonucleotide. TAAAGTTAAAC 76 P13-forward GCGCAGCCAAGCTAAACCTCAAATTCC oligonucleotide, for introducing a Thrombin site between LPGTGAAALEGT linker and SrtA. 77 P13-reverse GGCACCAGGGTACCCTCTAAAGCTGC oligonucleotide. 78 SEQ ID NO: 78 - P14- GCGCTCCCAAGCTAAACCTCAAATTCCGAAAG forward oligonucleotide, for introducing a Thrombin site between LPGTG(GGS)5 linker and SrtA. 79 P14-reverse GGCACGAGACTCCCACCACTACCACC oligonucleotide. 
80 P15-forward GAAAACCTGTATTTTCAGGG oligonucleotide, for removing N-terminal his tag in MSP11- LPGTGAAALEGTLVPR S-SrtA-His.sub.10. 81 P15-reverse GCTGCTAGCCATATGTATATC oligonucleotide. 82 P16-forward AAGAAGGAGATATACATATGGCCAGTTCTGAAAACCTGT oligonucleotide, for ATTTTCAGGGATCGACG amplification of MSP20 and replace MSP9 in MSP9- LPGTGAAALEGTLVPR S-wtSrtA-His.sub.10. 83 P16-reverse CACCAGGGTACCCTCTAAAGCTGCAGCACCTGTACCAG oligonucleotide. GTAACTGTGTATTTAACTTTTTAGTATATTCTTC 84 P17-forward CTGCCTGGTACCGGGGGA oligonucleotide, for generating empty autocyclase-L2a vector. 85 P17-reverse TCCCTGAAAATACAGGTTTTCCGCG oligonucleotide. 86 P18-forward GCCGATTTGCTTTCCGGATCTGCCTGGTACCGGGGGA oligonucleotide, to generate autocyclase-L2a- G-SFTI. 87 P18-reverse GGAATGCTTTTGGTGCAGCGTCCCTGAAAATACAGGTTT oligonucleotide. TCCGCG 88 P19-forward CACCTGCAGCTGGCCGGTGTGCACCCGCAACGGCCTGCC oligonucleotide, to GGTGACCGGGGGATCGGGAGGT generate autocyclase-L2a- G-KalataB1. 89 P19-reverse CAGCCCGGGGTGTTGCAGGTGCCGCCCACGCAGGTTTCG oligonucleotide. CCGCATCCCTGAAAATACAGGTTTTCCGCG 90 P20-forward GCCGATTTGCTTTCCGGATCTGCCTGGCACAGGTGCT oligonucleotide, to generate autocyclase-L1b- G-SFTI. 91 P20-reverse GGAATGCTTTTGGTGCAGCGTCCTTGGAAGTACAAATTC oligonucleotide. TCGGAC 92 P21-forward, to generate TCCGCCGATTTGCTTTCCGGATCTGCCTGGCACAGGTGC autocyclase-L1b-GGG- T SFTI. 93 P21-reverse ATGCTTTTGGTGCAGCGGCCACCTCCTTGGAAGTACAAAT oligonucleotide. TCTCGGAC 94 P22-forward TGGGTCCGGTGGTAGTGGTGGGAGTCTGGTGCCGCGCA oligonucleotide, to GCCAA generate autocyclase-L2a- GGG-SFTI. 95 P22-reverse CCTGAACCTCCCGATCCCCCGGTACCAGGCAGATCCGGAA oligonucleotide. AGCAAATCGG 96 P23-forward GGTGGCTGCGGCGAAACCTGCGTG oligonucleotide, to generate autocyclase-L1b- GGG-kB1. 97 P23-reverse TCCTTGGAAGTACAAATTCTCGGACG oligonucleotide 98 P24-forward GTCCGGTGGTAGTGGTGGGAGTCTGGTGCCGCGCAGCC oligonucleotide, to AA generate autocyclase-L2a- GGG-kB1. 99 P24-reverse CCACCTGAACCTCCCGATCCCCCTGTCACAGGCAGGCCG oligonucleotide. 
TTG 100 P25-forward CTATGATCATCCGGAAATTTGCGGTCTGCCTGGCACAGG oligonucleotide, to TGCT generate autocyclase-L1b- GG-Vc1.1. 101 P25-reverse TTGCAGCGCGGATCGCTGCAGCAACCTCCTTGGAAGTAC oligonucleotide. AAATTCTCGGAC 102 P26-forward TGGGTCCGGTGGTAGTGGTGGGAGTCTGGTGCCGCGCA oligonucleotide, to GCCAA generate autocyclase-L2a- GG-Vc1.1. 103 P26-reverse CCTGAACCTCCCGATCCCCCGGTACCAGGCAGACCGCA oligonucleotide. AATTTCCGG 104 Sortase A (SrtA) amino LPXTG/A acid recognition motif, [LPXTGA] where X is any amino acid 1. 105 Sortase A (SrtA) amino LAXTG acid recognition motif, where X is any amino acid 2. 106 Sortase A (SrtA) amino LPXSG acid recognition motif, where X is any amino acid 3. 107 Sortase A (SrtA) amino LPGTG/A acid recognition motif 1. [LPGTGA] 108 Sortase A (SrtA) amino LPSTG/A acid recognition motif 2. [LPSTGA] 109 Sortase A (SrtA) amino LPETG/A acid recognition motif 3. [LPETGA] 110 Sortase A (SrtA) amino LPGTG acid recognition motif 4. 111 Sortase A (SrtA) amino LPGTA acid recognition motif 5. 112 Sortase A (SrtA) amino LPSTG acid recognition motif 6. 113 Sortase A (SrtA) amino LPSTA acid recognition motif 7. 114 Sortase A (SrtA) amino LPETG acid recognition motif 8. 115 Sortase A (SrtA) amino LPETA acid recognition motif 9. 116 Sortase A (SrtA) amino LPATG acid recognition motif 10. 117 eSrtA(4S-9) amino acid LPXSG recognition motif, where X is any amino acid. 118 Sortase A (SrtA) amino LPXTG acid recognition motif, where X is any amino acid 4. 119 eSrtA(2A-9) amino acid LAXTG recognition motif, where X is any amino acid. 120 SrtA-F40 and SrtA-A1-22 APXTG amino acid recognition motif, where X is any amino acid. 121 SrtA-F1-20 amino acid FPXTG recognition motif, where X is any amino acid. 122 SrtAβ amino acid LMVGG recognition motif. 123 Amino acid linker/spacer, GS(GGS).sub.N where n is an integer of at least one, two, three, four, or five. 124 Amino acid linker/spacer GAAA 1. 125 Amino acid linker/spacer LEGT 2. 126 Amino acid spacer/linker GAAALEGT 3. 
127 Amino acid spacer/linker GS(GGS)4 4. [GS GGS GGS GGS GGS] 128 Amino acid spacer/linker LPGTGAAALEGT 5. 129 Amino acid spacer/linker LPGT(GGS)5 6. [LPGT GGS GGS GGS GGS GGS] 130 Amino acid sequence of LVPRS Thrombin cleavage site. 131 Amino acid spacer/linker GGSGGSGG 7. 132 Amino acid sequence of LPRDA inhibitory peptide before TEV recognition site. 133 Amino acid sequence of GLU-ASN-LEU-TYR-PHE-GLN-(GLY/SER) TEV protease cleavage [ENLYFQGS] site 1. 134 Amino acid sequence of ENLYFQG TEV protease cleavage site 2. 135 Optimized amino acid GAAALEGT spacer/linker 1. 136 Optimized amino acid GS(GGS)4 spacer/linker 2. [GS GGS GGS GGS GGS] 137 Polyhistidine (affinity) tag HHHHHHHHHH 1. 138 Polyhistidine (affinity) tag HHHHHH 2. 139 Amino acid sequence of QAKP flexible segment of srtA. 140 5′-UTR Sequence, shown TCTAGAAATAATTTTGTTTAACTTTAAGAAGGAGATATA in FIG. 12. CAT 141 Protein coding sequence, ATGGCTTCGTCCTTGCCACGCGATGCCGAGAATTTGTAC shown in FIG. 12. TTCCAA 142 16s rRNA sequence, AAGAAGGAGA shown in FIG. 12. 143 KpnI nucleotide GGTACC recognition site. 144 Butelase-1 amino acid ASX-HIS-VAL recognition motif, where Asx is Asn (N) or Asp (D). 145 Butelase-1 amino acid ASP-HIS-VAL recognition motif. 146 OaAEP1 amino acid ASN-GLY-LEU recognition motif. 147 AEP amino acid X.sub.1X.sub.2X.sub.3 recognition motif, where X.sub.1 is N or D; X.sub.2 is G or S; and X.sub.3 is L, A or I. 148 AEP amino acid acceptor X.sub.4X.sub.5 motif, wherein X.sub.4 is optional and any amino acid or G, Q, K, V or L; and X.sub.5 is optional or any amino acid or L, F or I or a hydrophobic amino acid residue. 149 Amino acid linker/spacer (G).sub.n polymer, where n is an integer of at least one, two, three, four, or five. 150 Amino acid linker/spacer (G.sub.1-5S.sub.1-5).sub.n polymer, where n is an integer of at least one, two, three, four, or five. 151 Amino acid linker/spacer. GGS
EXAMPLE
Introduction
[0445] The present Example was designed to overcome the shortcomings of the conventional enzyme-catalyzed cyclisation of proteins. Below is described the first example of a unimolecular cyclisation reaction, which has a fundamentally different reaction mechanism and behaves according to first order reaction kinetics over a wide range of concentrations. The reaction is performed by a new family of proteins we have termed ‘autocyclases’, where the ligase is fused to the protein being cyclised. We present a workflow for the use of autocyclases for production of cyclic proteins (including peptides) which includes expression, purification and cyclisation. The general utility of autocyclases is demonstrated by circularisation of two challenging systems: (1) α-helical membrane scaffold proteins (MSPs) for making circular nanodiscs (cNDs) and (2) disulfide-rich cyclotides.
[0446] Methods
Reagents and Plasmids
[0447] All lipids were purchased from Avanti Polar Lipids, Inc. (Alabaster, AL) or Sigma-Aldrich. Enzymes and buffers used for polymerase chain reaction and molecular cloning were purchased from Genesearch, the exclusive Australian distributor of New England Biolabs (NEB) molecular biology products. The Quick-stick ligase was purchased from Bioline. The DNA miniprep kit was purchased from Qiagen. The gel extraction and PCR clean-up kits were ordered from Macherey-Nagel. All sequencing was performed by Sanger sequencing at the Australian Genome Research Facility (AGRF). All primers and codon-optimized gene fragments were ordered from Integrated DNA Technologies (IDT).
[0448] The plasmids for expressing evolved P94R/D160N/D165A/K190E/K196T sortase A and MSP11 were kindly provided by Prof. David Liu and Prof. Gerhard Wagner, respectively, both at Harvard University.
The Molecular Cloning of Autocyclase for MSPs
[0449] The amino acid sequences of the MSPs were based on literature reports (including MSP9 and MSP11.sup.22, MSP6 and MSP7.sup.43, and MSP20.sup.41). A codon-optimized gene fragment was designed to encode N-terminal His.sub.6, TEV site, MSP9 and evolved sortase A (eSrtA) (His.sub.6-G2-MSP9-LPGTGAAALEGT-eSrtA-His.sub.6). To facilitate the replacement of SrtA gene, a KpnI site (GGTACC) was introduced before the SrtA gene. This fusion gene was ordered from IDT, digested with NdeI/XhoI to generate an overhang for cloning into the pET29 vector and cleaned using a PCR clean up kit. The eSrtA expression plasmid was digested with NdeI/XhoI.sup.30 and purified by DNA agarose electrophoresis to generate the pET29a vector. The vector was then ligated with a suitable gene fragment (via NdeI/XhoI sites) to deliver a vector expressing N-terminal His.sub.6, TEV site, MSP9 and evolved sortase A (eSrtA).
[0450] To generate the fusion protein of MSP9-LPGTGAAALEGT-wild type SrtA (His.sub.6-G2-MSP9-LPGTGAAALEGT-wtSrtA-His.sub.6), SrtA-staph-A59 was amplified using suitable primers (P1 pairs, Table 1). The PCR product was digested with KpnI and XhoI to replace the eSrtA fragment of His.sub.6-G2-MSP9-LPGTGAAALEGT-eSrtA-His.sub.6 to yield His.sub.6-G2-MSP9-LPGTGAAALEGT-wtSrtA-His.sub.6. The mutation of Gly2 to Ala2 in both MSP9-wild type SrtA and pentamutant SrtA was achieved using the primer P2 pairs (Table 1) with the NEB Q5 mutagenesis kit and home-made CaCl.sub.2 competent cells.
[0451] The removal of N-terminal His.sub.6-tag in His.sub.6-MSP9-LPGTGAAALEGT-wtSrtA/eSrtA-His.sub.6 was achieved using the primer P4 pairs while the introduction of four more histidines into His.sub.6-tag at the C-terminus was achieved using the P5 primer pairs (Table 1), yielding MSP9-LPGTGAAALEGT-wtSrtA/eSrtA-His.sub.10.
[0452] The linker replacement in MSP9-LPGTGAAALEGT-wtSrtA-His.sub.10 and MSP9-LPGTGAAALEGT-eSrtA-His.sub.10 was achieved using the P6 and P7 primer pairs respectively (Table 1), yielding MSP9-LPGT(GGS)5-wtSrtA-His.sub.10 (
[0453] To generate the fusion of MSP11-LPGT(GGS)5-wtSrtA-His.sub.10, MSP11 was amplified from MSP1D1.sup.22 expression plasmid using the primer P3 pair. The PCR product was digested with NdeI/KpnI to replace the MSP9 fragment of MSP9-LPGT(GGS)5-wtSrtA-His.sub.10 and to yield MSP11-LPGT(GGS)5-wtSrtA-His.sub.10 (
[0454] To generate the fusion of MSP20-linker-wtSrtA, codon optimized MSP20 gene block was digested with NdeI/KpnI to replace the MSP9 fragment at MSP9-LPGT(GGS)5-wtSrtA-His.sub.10 to yield MSP20-LPGT(GGS)5-wtSrtA-His.sub.10 (
The Molecular Cloning of Autocyclases for Cyclotides
[0455] We used the primer pair P17 to delete MSP9 in A2-inhibitory-peptide-TEV-MSP9-LPGT(GGS)5LVPRS-SrtA-His.sub.10 to generate the empty autocyclase vector A2-inhibitory-peptide-TEV-LPGT(GGS)5LVPRS-SrtA-His.sub.10 (autocyclase-L2a vector). In this autocyclase-L2a vector, primer pair P18 was used to insert G-SFTI between TEV site and linker L2a, and primer pair P19 was used to insert G-kB1. To generate SFTI in autocyclase-L1b, primer pair P20 was used to insert G-SFTI and P21 for GGG-SFTI between TEV site and linker L1b. Then GGG-SFTI in autocyclase-L2a was made by replacing L1b with L2a linker using primer pair P22.
[0456] To generate G-kB1 in autocyclase-L1b, a gene block coding G-kB1 was ordered from IDT and used to replace MSP9 in autocyclase-L1b-MSP9, generating autocyclase-L1b-G-kB1. Autocyclase-L1b-G-kB1 was converted to autocyclase-L1b-GGG-kB1 using the primer pair P23. Subsequently, autocyclase-L1b-GGG-kB1 was changed into autocyclase-L2a-GGG-kB1 using the primer pair P24.
[0457] Autocyclase-L1b-GG-Vc1.1 was constructed by replacing MSP9 in autocyclase-L1b-MSP9 by GG-Vc1.1 with the primer pair P25. Primer pair P26 was used in PCR mutagenesis to replace L1b by L2a to produce autocyclase-L2a-GG-Vc1.1 (Table 1).
SDS-PAGE Analysis
[0458] Protein samples were run on 12 or 15% SDS-PAGE gels (made in-house) at 180-200 V for 30-50 min. The Precision Plus Protein Dual Xtra ladder (Bio-Rad) was used as a detectable standard, consisting of 20, 25, 37, 50, 75, 100, 150 and 250 kDa markers. After electrophoresis, gels were washed three times with warm distilled water for 3-5 minutes per wash, stained with Coomassie Brilliant Blue (CBB), then destained in distilled water.sup.50. Band intensity of Coomassie staining was quantified using ImageLab software (Bio-Rad). Circularisation efficiency was calculated as the intensity of circular protein bands divided by the intensity of the intact autocyclase.
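The efficiency calculation described above can be sketched in a few lines. This is a minimal illustration only: the function name and the band intensities are hypothetical (arbitrary densitometry units), not values from the Example.

```python
def circularisation_efficiency(circular_intensity, intact_intensity):
    """Intensity of the circular product band divided by the intensity
    of the intact autocyclase band, as defined in the text."""
    if intact_intensity <= 0:
        raise ValueError("intact autocyclase band intensity must be positive")
    if circular_intensity < 0:
        raise ValueError("band intensity cannot be negative")
    return circular_intensity / intact_intensity


# Hypothetical densitometry readings (arbitrary units), for illustration only.
efficiency = circularisation_efficiency(4200.0, 6000.0)
print(f"circularisation efficiency: {efficiency:.2f}")
```

Both intensities are assumed to come from the same gel and the same quantification settings, so that the ratio is meaningful.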
Autocyclase Expression and Purification
[0459] Each fusion protein expression construct (in pET29 vector) was transformed into E. coli BL21(DE3) cells. Freshly transformed colonies or a glycerol stock were inoculated into 10 mL LB media containing 50 μg/mL of kanamycin. This preculture was shaken at 30° C. at 220 rpm overnight. 1% of the preculture (3 mL) was used to inoculate 300 mL LB broth containing 50 μg/mL of kanamycin. The culture was then incubated at 37° C. (shaking at 250 rpm) until the OD.sub.600 reached about 1.0. Expression was induced by the addition of 0.2 mM IPTG and the culture was left shaking at 250 rpm for 1-6 h at 30° C. The cells were harvested by centrifugation in 0.5 L bottles using a JLA-10.500 rotor (Beckman) operating at 6,000 g for 10 min at 4° C. and the cell pellets were stored at −20° C. The cell pellets were resuspended in lysis buffer (25 mM sodium phosphate, 500 mM NaCl, pH 7.4, 20 mM imidazole) plus 1 mg/mL lysozyme and were stirred at 4° C. for 0.5 hour. The resuspended cells were lysed by sonication (digital sonifier 450 Branson) on ice (40% power, 3 s on and 12 s off) for 5 min; the suspension was then mixed for better cooling and the sonication was repeated.
[0460] The sonicated sample was centrifuged in 50 mL bottles using a JA-25.50 rotor (Beckman) at 30,000 g for 30 min at 4° C. The supernatant was loaded onto a gravity column containing 3 mL Ni-NTA resin (pre-equilibrated with 4° C. lysis buffer). The column was washed with five column volumes of lysis buffer. The autocyclase was then eluted with five column volumes of elution buffer (25 mM sodium phosphate, 500 mM NaCl, pH 7.4, 500 mM imidazole).
[0461] The collected fractions from the column were supplemented with 20 mM EDTA and buffer exchanged into the reaction buffer (20 mM Tris HCl pH 7.5, 150 mM NaCl) using a 10,000 MW cutoff Amicon centrifugal filter unit by centrifugation (Sigma 2-16K Centrifuge, SciQuip) at 4000 g, 4° C. The protein solution was then supplemented with 1 mM β-mercaptoethanol (BME), 0.5 mM EDTA and TEV protease at a TEV to protein ratio of 1 mg:50 mg, and cleavage was allowed to proceed overnight at 4° C. The cleaved protein sample was then spun down at 3000 g for 5 min at 4° C. to remove any precipitates.
MSP and Cyclotides Circularization
[0462] For MSP cyclisation, TEV protease-cleaved autocyclase was kept at a total protein concentration <100 μM in equilibration buffer (20 mM Tris HCl pH 7.5, 150 mM NaCl, 0.5 mM EDTA, 1 mM β-mercaptoethanol). The sample was supplemented with 1 mM DDM and 10 mM CaCl.sub.2 to initiate cyclisation. The reaction was carried out at 37° C. for 6-8 hours or overnight with shaking at 200 rpm.
[0463] For cyclotide cyclisation, the autocyclase was also cleaved by TEV protease to expose the N-terminal glycine. The reaction was initiated at a total concentration of <100 μM in the reaction buffer (20 mM Tris HCl pH 7.5, 150 mM NaCl, 10 mM CaCl.sub.2). The reaction was supplemented with 3 mM GSH/0.3 mM GSSG to produce cyclised and oxidized cyclotides. The glutathione can be replaced by 3 mM β-mercaptoethanol for producing cyclised and reduced peptides.
cMSPs Purification
[0464] Following completion of the reaction, the reaction mixture was filtered or centrifuged to remove any precipitate and directly loaded onto a gravity column containing 3 mL Ni-NTA resin (pre-equilibrated with the reaction buffer, i.e. 20 mM Tris HCl pH 7.5, 150 mM NaCl). The flow-through containing cyclised MSP products was collected while the liberated free SrtA remained on the column.
[0465] cMSPs were buffer exchanged into equilibration buffer (20 mM Tris HCl pH 7.5, 1 mM DDM) using a 10,000 MW cutoff Amicon centrifugal filter unit by centrifugation (4,000 g, 4° C.). The sample was loaded onto a 5 mL HiScreen Q HP (GE Healthcare) anion-exchange column at 4° C. and purified on an AKTA FPLC system (GE Healthcare). The flow rate was 0.3 mL/min and a linear gradient of 0 to 25% of equilibration buffer supplemented with 1 M NaCl was applied over 20 column volumes. Chromatograms were recorded as A.sub.280 over volume (mL) and samples were collected as 4 mL fractions by an automated fraction collector (Frac-920) module (GE Healthcare). The fractions containing >95% pure cMSPs as judged by SDS-PAGE were pooled, concentrated to about 0.5 mM, aliquoted and flash frozen to be stored at −80° C. or used directly for nanodisc assembly.
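The gradient parameters above imply a gradient volume, run time and final salt concentration that can be worked out directly. The sketch below is illustrative only: the function names are hypothetical, the nominal 5 mL column volume is taken from the text as written, and the starting buffer is assumed to contain no NaCl, as listed for the equilibration buffer.

```python
def gradient_summary(column_volume_ml=5.0, gradient_cv=20.0, flow_ml_min=0.3,
                     final_percent_b=25.0, buffer_b_nacl_mm=1000.0,
                     fraction_ml=4.0):
    """Derive basic run parameters for a linear salt gradient."""
    gradient_volume_ml = column_volume_ml * gradient_cv
    return {
        "gradient_volume_ml": gradient_volume_ml,
        "run_time_min": gradient_volume_ml / flow_ml_min,
        "final_nacl_mm": buffer_b_nacl_mm * final_percent_b / 100.0,
        "n_fractions": gradient_volume_ml / fraction_ml,
    }


def nacl_at_volume(volume_ml, summary):
    """NaCl concentration (mM) after volume_ml of the linear gradient,
    assuming the gradient starts at 0 mM NaCl."""
    fraction_done = min(max(volume_ml / summary["gradient_volume_ml"], 0.0), 1.0)
    return fraction_done * summary["final_nacl_mm"]


summary = gradient_summary()
print(summary)
print(nacl_at_volume(50.0, summary))  # salt concentration at the gradient midpoint
```

With the stated values the gradient spans 100 mL, ends at 250 mM NaCl, and takes roughly five and a half hours at 0.3 mL/min.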
Sortase A Regeneration
[0466] After MSP cyclisation, any unreacted autocyclase, SrtA carrying the linker, SrtA carrying MSP degradants and other by-products were captured on Ni-NTA resin. The resin was rinsed with a buffer containing 20 mM Tris HCl pH 7.5, 50 mM NaCl and incubated with thrombin protease (Sigma T4648-1KU) at a ratio of 1 unit thrombin protease to 1 mg fusion protein at room temperature for six hours or overnight. After thrombin cleavage, the resin was washed with the same buffer and the SrtA was eluted with the elution buffer (25 mM sodium phosphate, 500 mM NaCl, 500 mM imidazole, pH 7.4). Finally, the collected eluate containing pure SrtA was buffer exchanged into 20 mM Tris·HCl, 100 mM NaCl, pH 7.5, 2 mM DTT using a 10,000 MW cutoff Amicon Centricon (Merck Millipore) by centrifugation in a refrigerated Sigma 2-16K Centrifuge (SciQuip) at 4000 rcf, 4° C. Aliquots of the SrtA were flash frozen and stored at −80° C. Wild-type sortase A (wtSrtA) concentration was calculated from the measured A.sub.280 using the extinction coefficient of 17,420 M.sup.−1 cm.sup.−1 (https://web.expasy.org/protparam/).
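The concentration calculation from A.sub.280 follows the Beer-Lambert law, c = A / (ε × l). A minimal sketch is given below; the extinction coefficient is the value stated above, while the 1 cm path length and the example absorbance reading are assumptions made for illustration.

```python
WT_SRTA_EXTINCTION = 17420.0  # M^-1 cm^-1, value stated in the text


def concentration_molar(a280, extinction=WT_SRTA_EXTINCTION, path_cm=1.0):
    """Beer-Lambert law: c = A / (epsilon * l). Returns mol/L."""
    if extinction <= 0 or path_cm <= 0:
        raise ValueError("extinction coefficient and path length must be positive")
    return a280 / (extinction * path_cm)


# Hypothetical absorbance reading, assuming a 1 cm cuvette.
conc_um = concentration_molar(0.871) * 1e6
print(f"{conc_um:.1f} uM")
```

A shorter path length (for example a microvolume spectrophotometer) would need to be passed explicitly as `path_cm`.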
ND Production
[0467] Lipid stocks of POPC, POPG or DOTAP (all Tm<4° C.), stored as powders at −80° C., were suspended in reconstitution buffer (25 mM Tris pH 7.5, 100 mM NaCl, 0.5 mM EDTA and 100 mM cholate) and always handled on ice.
[0468] To assemble cNDs, [lipid]/[cMSP] ratios were defined based on past literature, using the equation: N.sub.L×S=(0.423×M−9.75).sup.2, where N.sub.L is the number of lipids per ND, M is the number of amino acids in the scaffold protein and S is the mean surface area per lipid used to form the lipid-nanodisc, measured in Å.sup.2.sup.58. POPC and POPG have been estimated to have a similar mean surface area of around 70 Å.sup.2.sup.59. We therefore determined ratios of x:1, x:1, 40:1, 50:1 and 60:1 for cMSP6, cMSP7, cMSP9, cMSP11 and cMSP20, respectively. MSP and lipid were both dissolved at the desired ratio and rocked for 1 hour at 4° C. Subsequently, 0.6 g of Bio-Beads SM-2 per mL of solution were added to absorb detergent and initiate ND assembly. The mixture was gently stirred for 4 h at 4° C. The solution was filtered through a 0.45 μm PES membrane to remove the Bio-Beads and then concentrated using an Amicon 10 kDa Centricon at 4° C. and 3000 g. The assembled discs were analysed by size exclusion chromatography to monitor aggregation behavior in a buffer of 20 mM Tris·HCl, 50 mM NaCl, 1 mM EDTA, pH 7.5.
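The empirical relation N.sub.L×S=(0.423×M−9.75).sup.2 can be rearranged to estimate N.sub.L for a given scaffold length. The sketch below simply evaluates that formula; the function name is hypothetical, and the scaffold length of 200 residues is an arbitrary illustrative input rather than the length of any MSP used here.

```python
def lipids_per_nanodisc(n_residues, area_per_lipid_A2=70.0):
    """Solve N_L * S = (0.423 * M - 9.75)**2 for N_L.

    n_residues        -- M, number of amino acids in the scaffold protein
    area_per_lipid_A2 -- S, mean surface area per lipid in square angstroms
    """
    if area_per_lipid_A2 <= 0:
        raise ValueError("area per lipid must be positive")
    return (0.423 * n_residues - 9.75) ** 2 / area_per_lipid_A2


# Arbitrary illustrative scaffold length, with S = 70 A^2 as quoted for POPC/POPG.
print(round(lipids_per_nanodisc(200), 1))
```

The result is an estimate of N.sub.L as defined in the text; how it maps onto a working [lipid]/[cMSP] ratio is not specified here, and in practice such ratios are usually refined empirically.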
Results
The Rational Design of Autocyclases
[0469] The design of an autocyclase involves engineering several modules: (i) an activation site which is liberated by application of a suitable protease; (ii) a target protein (which could be a peptide or polypeptide) to be cyclised; (iii) a ligation recognition site; (iv) a spacer of suitable length and flexibility; (v) the ligase enzyme sequence and finally (vi) a purification/affinity tag to remove the ligase byproduct once the reaction has taken place. The ligase once liberated may contain a reactive N-terminal amino acid (glycine in the case of SrtA), making it of little value as a ligase enzyme in other applications. However, the ligase can be recovered in a useful form if an additional module is introduced between modules (iv) and (v). In our design this module (iv′) is a protease site, orthogonal to that used in step (i), which removes the reactive N-terminal sequence of the liberated ligase. Thus, the principles of the autocyclase approach are general, and each module may be optimized or swapped for a module with similar properties.
[0470] The activation site (i): The N-terminus of the protein to be cyclised generally requires a specific recognition sequence. However, it is typically important that this sequence is shielded from premature exposure and therefore, the N-terminus of the protein is extended beyond this sequence by addition of a protease recognition sequence. Here we have inserted a TEV-protease recognition site N-terminal to the ligase recognition site. Once the autocyclase is purified, the reactive N-terminal sequence may be liberated by application of TEV protease. In this case this leaves an N-terminal glycine as the substrate for the sortase A ligation reaction.
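The activation step can be illustrated with a short string-based sketch. It assumes that TEV protease cuts between the Q and G of the ENLYFQG site (SEQ ID NO: 134); the leader fragment is the translation of the protein coding sequence of SEQ ID NO: 141, and the target is G-SFTI-LPGT (SEQ ID NO: 40). The concatenation of these pieces and the function name are illustrative assumptions, not a construct taken verbatim from the Example.

```python
TEV_SITE = "ENLYFQG"  # SEQ ID NO: 134; cleavage assumed between Q and G


def tev_cleave(sequence, site=TEV_SITE):
    """Split a protein sequence at the first TEV site.

    Returns (leader, product); the product begins with the glycine
    that follows the scissile Q-G bond.
    """
    start = sequence.find(site)
    if start == -1:
        raise ValueError("no TEV site found in sequence")
    cut = start + len(site) - 1  # index of the G that follows ENLYFQ
    return sequence[:cut], sequence[cut:]


leader_fragment = "MASSLPRDAENLYFQ"  # translation of SEQ ID NO: 141
target = "GRCTKSIPPICFPDLPGT"       # G-SFTI-LPGT, SEQ ID NO: 40
leader, product = tev_cleave(leader_fragment + target)
print(leader, product)
```

After cleavage the product starts with glycine, which is the N-terminal substrate needed for the sortase A ligation described above.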
[0471] The target (ii) and recognition site (iii): The target protein (or peptide) to be cyclised generally requires N- and C-termini that are in close proximity when the protein is folded. Thus, this includes both naturally cyclic (e.g., cyclotides) and naturally linear (e.g., MSP) targets. Furthermore, the C-terminus requires a ligase recognition site for fusion to the reactive N-terminal sequence. Here, we have used two cyclotides and five MSPs as the targets and C-terminally added the SrtA recognition site to these sequences.
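As a small illustration of how the C-terminal recognition site arises in the fusion, the sketch below joins a target ending in LPGT to a glycine-initiated linker and searches for the LPXTG motif with a regular expression. The sequences are SEQ ID NO: 40 and the linker listed as L2b (SEQ ID NO: 39); pairing these two particular sequences, and the helper name, are assumptions made for illustration.

```python
import re

SORTASE_MOTIF = re.compile(r"LP.TG")  # LPXTG, where X is any amino acid


def find_sortase_site(sequence):
    """Return (start_index, matched_motif) for the first LPXTG site, or None."""
    match = SORTASE_MOTIF.search(sequence)
    if match is None:
        return None
    return match.start(), match.group()


target = "GRCTKSIPPICFPDLPGT"        # G-SFTI-LPGT, SEQ ID NO: 40
linker = "GGSGGSGGSGGSGGSLVPRS"      # linker L2b, SEQ ID NO: 39
fusion_fragment = target + linker

print(find_sortase_site(fusion_fragment))
print(fusion_fragment[0] == "G")  # N-terminal glycine available after activation
```

The target alone ends in LPGT and does not contain a complete LPXTG site; the motif is completed by the first glycine of the linker.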
[0472] The linker (iv): In our design the linker fuses the SrtA recognition site LPXTG (where X denotes any amino acid) to the N-terminal end of SrtA. Thus, a linker is required that would put the LPXTG site in contact with the ligase catalytic site (i.e. catalytic C184 in
[0473] The structure of SrtA in complex with its substrate (LPXTG) reveals that the distance from the glycine in the LPXTG motif to the N-terminal Q64 in SrtA is ˜35 Å along the surface of the protein, corresponding to a linker of approximately 12 amino acids. After subtracting the flexible N-terminal part (QAKP) of SrtA, eight amino acids would represent the shortest possible linker without the need to further unfold the N-terminal region beyond this point. Here, the composition of those eight amino acids is chosen as GAAALEGT (L1 in
[0474] The above design was used to autocircularize an MSP9-L1-SrtA (9 refers to the diameter, in nanometers, of the nanodisc produced by this protein) and an MSP9-L1b-SrtA construct to produce cMSP9 at 4° C., 16° C., 23° C. (room temperature) and 37° C. All reactions proceeded successfully with the highest yield achieved at 37° C. (
[0475] To relax the conformational constraints imposed by the shorter L1 and L1b linkers, a longer linker (L2) was also engineered that included five GGS repeats (GGS)5.sup.26. This is a significantly longer and more flexible linker that can easily span the distance between the active and recognition sites. Indeed, the production of cMSP9 from MSP9-L2-SrtA (
[0476] While the use of MSP9-L2-SrtA led to improved reaction kinetics, it unfortunately also led to more in vivo hydrolysis and decreased overall yields of the fusion protein (
[0477] The ligase: While natural cyclases are now available, SrtA remains one of the most widely utilised ligases for biochemical reactions. In addition, sortase fusions have been utilized to simplify protein purification and labeling.sup.26, 28, 29.
[0478] Our initial autocyclase design was constructed using the evolved sortase pentamutant (eSrtA).sup.30 to prepare circular target protein.sup.22. However, we found that the fusion protein is not very soluble when expressed at 37° C., while soluble expression was found at 30° C. (
[0479] These species likely represent polymeric products. The major species present was a 75 kDa protein which corresponds to the weight of 2x(MSP-SrtA). For these polymeric products to form, an N-terminal glycine is required, and it is possible.sup.28 that in E. coli an endogenous methionine aminopeptidase removes Met1 to expose Gly2.sup.31, which then is a substrate for the enzyme. This hypothesis was then supported by a G2A mutation that resulted in expression without the in vivo polymerization products (
[0480] Nevertheless, even when expressing the G2A MSP9-eSrtA, we observed that ˜35% of the fusion protein underwent in vivo hydrolysis (cleavage) after three hours resulting in high levels of impurities (
[0481] The purification tag: The autocyclase can be purified after expression by use of a suitable purification tag. Here we have investigated the use of hexa- and deca-histidine tags (His.sub.6 and His.sub.10 respectively) at both termini. We find that if a hexahistidine-tag is placed at both ends of autocyclase, e.g., His.sub.6-MSP9-SrtA-His.sub.6, the in vivo hydrolysis will produce his-tagged linear His.sub.6-MSP9, which downstream contaminates the product of the autocyclase reaction (data not shown).
[0482] The purification tag is therefore ideally placed only at the C-terminal end of the autocyclase. We further found that, compared to a His.sub.6-tag, a His.sub.10-tag yielded improved purity of the final product without otherwise affecting the process.
[0483] In vivo stability of the target (ii): It is well-known that human derived MSP is prone to strong bacterial proteolysis. Faas et al. performed detailed experiments to determine the time-course and degradation rate of MSP1D1 in a standard pET expression system. Their work revealed that the maximum MSP1D1 yields are achieved at 4 h post-induction before the degradation rate overtakes the production rate.sup.33. We performed similar time-course experiments which show that our MSP containing autocyclase also had the highest expression level of the fusion protein at 4 h post-induction at 30° C. using 1 mM IPTG for induction. Furthermore, the IPTG concentration used for SrtA fusions has previously been optimized to 0.2 mM.sup.28, 29. We therefore also performed time-course experiments at 0.2 mM IPTG and 1 mM IPTG. We found that with the longer and more flexible L2 linker, protein expression peaks at 4 h regardless of IPTG concentration, whereas with the shorter, more rigid L1 linker, expression reaches a maximum after 4 hours at the higher IPTG concentration and peaks after 5-6 hours when the lower concentration is used (
The General Application of Autocyclase
[0484] Next, we investigated the generality of the autocyclase design outlined above using different sizes of MSPs as well as a number of cyclic peptides.
[0485] The MSP-autocyclases: First, we investigated the production of cMSP6, 7, 11 and 20, by introducing these sequences into the autocyclase-L1b construct. In all cases we were able to successfully produce circularised MSPs through the procedure outlined above for cMSP9, albeit with some minor adjustments in each case.
[0486] A significant challenge in the production of MSPs is the need to reduce self-association of the protein to prevent polymerization during the cyclisation reaction. Indeed, the crystal structure.sup.34 of MSP reveals that the protein forms stable dimers, which may explain the challenges in producing monomeric circular proteins. To address this issue, studies have shown that the addition of detergents (DDM.sup.35 or Triton X-100.sup.36) significantly reduces the presence of polymeric byproducts. Our result for the autocyclase construct is consistent with these reports and we find that the addition of 1 mM DDM dramatically improves the yield of cMSP9 (termed detergent-assisted cyclisation). Thus, we include 1 mM DDM in the autocyclase reaction for producing all MSPs investigated here.
[0487] We note here that since Triton X-100 is present in the lysis buffer it will co-purify with the protein unless specific measures are taken to reduce its concentration. Indeed, we find that while addition of DDM to the MSP9-autocyclase containing co-purified Triton X-100 results in improved yields, complete removal of Triton X-100 prior to addition of DDM results in significant levels of polymeric products. Thus, our results suggest that Triton X-100 is important in the detergent-assisted cyclisation process, while DDM may also be used if Triton X-100 is co-purified with the protein from earlier steps. We also note that residual Triton X-100, while fortuitous for the cyclisation process, interferes with absorbance readings at 280 nm and should be removed prior to quantitation; its removal can be readily verified by 1D .sup.1H NMR after ion exchange chromatography or treatment with biobeads.
[0488] The requirement of detergents in the circularization of MSPs is also dependent on the size of the MSP. The circularization of MSP9 to cMSP9 heavily relies on the presence of 1 mM DDM while the cyclisation of cMSP20 and cMSP6 is independent of the presence of additional detergents. MSPs 6, 7, and 11 all show moderate improvements in yields (˜5%, ˜15% and ˜10% respectively) upon addition of detergents (1 mM DDM) during cyclisation. The extent of polymerization is dependent on the rate of aggregation and this of course is also influenced by the protein concentration. Details of the reaction mechanism are discussed in the next section, but we note here that cMSP9 is efficiently produced at concentrations up to ˜50 μM, while MSPs less prone to aggregation can be produced effectively also at higher concentrations (˜100 μM;
[0489] The disulfide-stabilized peptide-autocyclases: The second class of molecules that we tested were the naturally occurring cyclic peptides. These peptides feature head-to-tail circularization and are further stabilized by disulfide bonds. Here, we have tested several well-characterized peptides including SFTI (one disulfide bond).sup.37, Vc1.1 (two disulfide bonds).sup.38 and KalataB1 (kB1; three disulfide bonds).sup.39. We note that in these reactions the proteins remain monomeric in solution and thus do not require addition of detergents.
[0490] Initially, the autocyclase-L1b construct was used (as for the MSPs), but we found very slow rates of cyclisation reaction using this construct (>3 days for SFTI and kB1). Thus, we generated a series of autocyclase-L2a constructs. We find that for SFTI-L2a this significantly improves the reaction velocity and the reaction is complete within 24 h (at either 37° C. or room temperature). The L2a-kB1 reacts even faster and is completed within 3 h at 37° C. (or 4 h at room temperature). Since SFTI contains only one disulfide bond, the inclusion of GSH/GSSG in the buffer simply produces circularised, oxidized and natively folded SFTI (
[0491] It has been shown that SrtA-mediated ligation is sensitive to the number of glycine residues at the N-terminus of the ligated protein, with higher numbers of glycines yielding improved cyclisation efficiency. In the case of Vc1.1, the native sequence has been modified for improved oral stability.sup.40. Thus, to maintain the length of the linker used in this modified version, a GG-Vc1.1-L2a design was followed here.sup.21.
The Mechanism of Autocyclase
[0492] As noted above, the main motivation for generating the autocyclase enzyme was to change the reaction mechanism to overcome challenges associated with polymerisation in the bimolecular reaction. While the bimolecular reaction is an enzymatic reaction, which often follows zero order kinetics, in this case the near-stoichiometric amounts of enzyme used (i.e. in the cMSP reactions reported) mean that the reaction will more closely follow second order kinetics. Thus, while the half-life of the unimolecular autocyclase reaction will be independent of the reactant concentration and its rate will scale linearly with the reactant concentration, in the bimolecular reaction the rate will scale with the square of the reactant concentration. However, since the product of the unimolecular reaction may catalyse subsequent reactions, the reaction is expected to deviate from first order kinetics once significant amounts of free sortase A are released. A summary of the different reaction pathways is shown in
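The scaling argument in the preceding paragraph can be written out as idealised rate laws. This is only a sketch: the symbols k.sub.1, k.sub.2, A, E, S and c are introduced here for illustration and do not appear in the original text, and the bimolecular case assumes enzyme and substrate are present at roughly equal concentration c.

```latex
\begin{align}
  % Unimolecular autocyclase: first order in the fusion protein A
  v_{\mathrm{uni}} &= k_{1}[A] \\
  % Bimolecular reaction with near-stoichiometric enzyme E and substrate S,
  % assuming [E] \approx [S] = c
  v_{\mathrm{bi}} &= k_{2}[E][S] \approx k_{2}c^{2} \\
  % Effect of doubling the concentration
  \frac{v_{\mathrm{uni}}(2c)}{v_{\mathrm{uni}}(c)} &= 2, \qquad
  \frac{v_{\mathrm{bi}}(2c)}{v_{\mathrm{bi}}(c)} = 4 \\
  % For a first-order process the half-life does not depend on concentration
  t_{1/2} &= \frac{\ln 2}{k_{1}}
\end{align}
```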
[0493] To characterise the reaction mechanism and kinetics, we measured the initial reaction velocity (ν.sub.0) for the L2a-kB1 autocyclase at three different starting concentrations. At early time points, the reaction rate doubles when the concentration of the L2a-kB1 autocyclase is doubled from 50 μM to 100 μM. This is consistent with first order reaction kinetics and confirms the proposed unimolecular reaction mechanism. When the concentration is further increased to 150 μM, the reaction rate increases 4.35-fold compared to the reaction rate at 50 μM (ν.sub.0.sup.1.33), indicating that the reaction is predominantly first order with contributions from a higher order reaction, consistent with competition from bimolecular reactions (
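The apparent reaction order implied by these fold changes can be estimated with a short calculation. The sketch below assumes a simple power-law dependence of the initial velocity on concentration (ν.sub.0 = k·C.sup.n); the function name and the power-law model are illustrative assumptions and are not taken from the original analysis.

```python
import math


def apparent_order(rate_fold_change, conc_fold_change):
    """Apparent reaction order n, assuming v0 = k * C**n (illustrative model).

    n = ln(rate fold change) / ln(concentration fold change)
    """
    return math.log(rate_fold_change) / math.log(conc_fold_change)


# Fold changes quoted in the text, relative to the 50 uM reaction.
n_100 = apparent_order(2.0, 100 / 50)   # rate doubles when concentration doubles
n_150 = apparent_order(4.35, 150 / 50)  # rate rises 4.35-fold at 3x concentration

print(f"50 -> 100 uM: apparent order = {n_100:.2f}")
print(f"50 -> 150 uM: apparent order = {n_150:.2f}")
```

Under this assumption the first step gives an order of 1 and the second an order of roughly 1.3, in line with the exponent quoted in the text.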
Discussion
[0494] Nanodiscs (NDs) provide a physiologically relevant bilayer environment for performing biochemical and biophysical characterization of membrane proteins, with applications in a wide variety of fields.sup.41. In the pioneering work establishing cyclised MSPs, the reaction yielded undesired byproducts.sup.22. Subsequently, these polymeric byproducts were effectively suppressed by detergent-assisted cyclisation and dropwise addition of the MSP to the eSrtA solution.sup.36. Johansen et al. further engineered a solubility-enhanced cMSP with improved production yields by introducing a high abundance of negatively charged amino acids.sup.46. However, both methods still require a low concentration of MSP to suppress the undesired polymeric byproducts.
[0495] Furthermore, the high molar ratio of eSrtA to MSP used in these experiments requires an extra step for separate preparation of large amounts of eSrtA.
[0496] Recently, an alternative approach was introduced for the generation of cMSPs based on in vivo split intein ligation in E. coli.sup.47. Although the intein method eliminates the additional in vitro enzymatic reaction, extra purification steps are required, and it is also necessary to introduce a His.sub.6-tag and an extra cysteine residue into the final cMSP sequence. The presence of a free cysteine residue further complicates application of this methodology to disulfide-bond containing proteins such as the cyclic peptides described here.
[0497] The autocyclase method described herein produces circular MSPs that are identical in sequence to those originally presented. The unimolecular reaction design results in higher yields, fewer reaction steps and reduced time (<two days including protein expression and purification). We also demonstrate the versatility of the method by producing a wide range of cMSPs of varied lengths, including cMSP6, cMSP7, cMSP9, cMSP11, and cMSP20; this includes the first reports of a cysteine- and His-tag-free cMSP6 and cMSP7. Further, the presented procedure allows for purification of reactive sortase A as a byproduct. Finally, we note that previous studies have reported significant amounts of insoluble MSP9 and MSP11 after cell lysis, while the original MSP30 and MSP50 were purified under denaturing conditions. Here SrtA appears to function as a solubility enhancement tag and we find that all of the expressed MSP9, MSP11, and MSP20 are in the soluble fraction upon cell lysis.sup.22. The autocyclase approach, when applied to MSPs, therefore represents a significant improvement of the current nanodisc technology, and may facilitate increased uptake and utility of this powerful technology.
TABLE-US-00002 TABLE 1 Primers used in this study.
Primer Name | Primer Sequence
P1-forward with KpnI site underlined, for wt-SrtA | CGG GAATCC GGTACC CAA GCTAAACCTCAAATT CCGAAAG
P1-reverse with XhoI site underlined | TTTTTT CCG CTCGAG TTT GAC TTC TGT AGC TAC AAA GAT TTT ACG
P2-forward, for Gly2Ala | TATACATATGgctTCGTCCCACCATCAC
P2-reverse | TCTCCTTCTTAAAGTTAAACAAAATTATTTC
P3-forward with NdeI site underlined, for MSP11 | CGCGGATCC CAT ATG GCT AGC AGC GAA AAC CTG TAT TTT CAG GGC AGC ACC
P3-reverse with KpnI site underlined | GGCGAATTC GGT ACC CGG CAG CTG GGT G
P4-forward, for removing N-terminal His-tag | GAGAATTTGTACTTCCAAGGATC
P4-reverse | GGACGAAGCCATATGTATATC
P5-forward, for introducing His.sub.4 into His.sub.6 at the C-terminal | catcatTGAGATCCGGCTGCTAAC
P5-reverse | atgatgGTGGTGGTGGTGGTGGTG
P6-forward, for introducing (GGS)x5 linker in MSP9-eSrtA | tgggtccggtggtagtggtgggagtCAAGCTAAACCGCAGATC
P6-reverse | cctgaacctcccgatcccccggtaccAGGCAGTTGTGTGTTAAG
P7-forward, for introducing (GGS)x5 linker in MSP9-wtSrtA | tgggtccggtggtagtggtgggagtCAAGCTAAACCTCAAATTC
P7-reverse = P6-reverse | cctgaacctcccgatcccccggtaccAGGCAGTTGTGTGTTAAG
P8-forward, for deleting H4 in MSP9 to make MSP7 | TTGGGGGAGGAGATGCGT
P8-reverse | GGGTTGTACCTTAGCCTTCAC
P9-forward, for deleting H4 and H6 in MSP9 to make MSP6 | TATAGTGATGAGTTGCGC
P9-reverse | GGGTTGTACCTTAGCCTTC
P10-forward, for introducing an inhibitor peptide | cgatgccGAGAATTTGTACTTCCAAGG
P10-reverse | cgtggcaaGGACGAAGCCATATGTATATC
P11-forward, optimized inhibitory peptide, option1 | cgcggaaaacctgtattttcagGGATCGACGTTTTCCAAG
P11-reverse | tcacgaggtaaagaactggccatATGTATATCTCCTTCTTAAAGTTAAAC
P12-forward, optimized inhibitory peptide, option2 | cgcagagaatttgtatttccagGGATCGACGTTTTCCAAG
P12-reverse | tcgcgtggaagggaggaagccatATGTATATCTCCTTCTTAAAGTTAAAC
P13-forward, for introducing a Thrombin site between LPGTGAAALEGT linker and SrtA | gcgcagcCAAGCTAAACCTCAAATTCC
P13-reverse | ggcaccagGGTACCCTCTAAAGCTGC
P14-forward, for introducing a Thrombin site between LPGTG(GGS)5 linker and SrtA | gcgctccCAAGCTAAACCTCAAATTCCGAAAG
P14-reverse | ggcacgagACTCCCACCACTACCACC
P15-forward, for removing N-terminal his tag in MSP11-LPGTGAAALEGTLVPRS-SrtA-His.sub.10 | GAAAACCTGTATTTTCAGGG
P15-reverse | GCTGCTAGCCATATGTATATC
P16-forward, for amplification of MSP20 and replace MSP9 in MSP9-LPGTGAAALEGTLVPRS-wtSrtA-His.sub.10 | AAGAAGGAGATATAcatatg gccagttct gaaaacctgtattttcagggatcgacg
P16-reverse | CAC CAG GGT ACC CTC TAA AGC TGC AGC ACC TGT ACC AGG TAA CTG TGT ATT TAA CTT TTT AGT ATA TTC TTC
P17-forward, for generating empty autocyclase-L2a vector | CTGCCTGGTACCGGGGGA
P17-reverse | TCCCTGAAAATACAGGTTTTCCGCG
P18-forward, to generate autocyclase-L2a-G-SFTI | gccgatttgctttccggatCTGCCTGGTACCGGGGGA
P18-reverse | ggaatgcttttggtgcagcgTCCCTGAAAATACAGGTTTTCCGCG
P19-forward, to generate autocyclase-L2a-G-KalataB1 | Cacctgcagctggccggtgtgcacccgcaacggcctgccggtg ACCGGGGGATCGGGAGGT
P19-reverse | Cagcccggggtgttgcaggtgccgcccacgcaggtttcgccgca TCCCTGAAAATACAGGTTTTCCGCG
P20-forward, to generate autocyclase-L1b-G-SFTI | gccgatttgctttccggatCTGCCTGGCACAGGTGCT
P20-reverse | ggaatgcttttggtgcagcgTCCTTGGAAGTACAAATTCTCGGAC
P21-forward, to generate autocyclase-L1b-GGG-SFTI | tccgccgatttgctttccggatCTGCCTGGCACAGGTGCT
P21-reverse | atgcttttggtgcagcggccaccTCCTTGGAAGTACAAATTCTCGGAC
P22-forward, to generate autocyclase-L2a-GGG-SFTI | tgggtccggtggtagtggtgggagtCTGGTGCCGCGCAGCCAA
P22-reverse | cctgaacctcccgatcccccggtaccAGGCAGATCCGGAAAGCAAATCGG
P23-forward, to generate autocyclase-L1b-GGG-kB1 | ggtggcTGCGGCGAAACCTGCGTG
P23-reverse | TCCTTGGAAGTACAAATTCTCGGACG
P24-forward, to generate autocyclase-L2a-GGG-kB1 | gtccggtggtagtggtgggagtCTGGTGCCGCGCAGCCAA
P24-reverse | ccacctgaacctcccgatcccccTGTCACAGGCAGGCCGTTG
P25-forward, to generate autocyclase-L1b-GG-Vc1.1 | ctatgatcatccggaaatttgcggtCTGCCTGGCACAGGTGCT
P25-reverse | ttgcagcgcggatcgctgcagcaaccTCCTTGGAAGTACAAATTCTCGGAC
P26-forward, to generate autocyclase-L2a-GG-Vc1.1 | tgggtccggtggtagtggtgggagtCTGGTGCCGCGCAGCCAA
P26-reverse | cctgaacctcccgatcccccggtaccAGGCAGACCGCAAATTTCCGG
TABLE-US-00003 TABLE 2 The average yields of cMSP of various sizes per liter of E. coli culture from srtA-fusion based circularisation.
Columns: Constructs; Yield of fusion (mg/400 mL); Yield of circular MSP (cMSP) (mg/400 mL)
cMSPΔH4-6 (cMSP6)   2.2
cMSPΔH4-5 (cMSP7)   2.1
cMSPΔH5 (cMSP9)   25.0   6.7
cMSPD1 (cMSP11)   19.6   8.4
cMSP2N2 (cMSP20)*   22.9   6.9
*The yield of MSP20 is heavily underestimated, since the amount of Ni-NTA is not enough to capture MSP20-sortase. Half of MSP20-sortase is estimated to have been lost.
TABLE-US-00004 TABLE 3 The characterization of circular and intact MSP proteins by mass spectrometry.
Constructs | Mass, Da, circularised (calculated) | Mass, Da, circularised (observed)
cMSPΔH4-6 (cMSP6) | 14,456.37 | 14456
cMSPΔH4-5 (cMSP7) | 16,953.20 | 16953
cMSPΔH5 (cMSP9) | |
cMSPD1 (cMSP11) | |
cMSP2N2 (cMSP15) | |
TABLE-US-00005 TABLE 4 Kinetic parameters of SrtA mutants with a reduced activity compared to the autocyclase enzyme.
Mutants | k.sub.cat [s.sup.−1] | K.sub.m LPETG [mM] | k.sub.cat/K.sub.m LPETG [M.sup.−1 s.sup.−1] | Activity loss [fold]
WT | 1.10 ± 0.06 | 8.76 ± 0.78 | 125 ± 18 |
V168A | 0.15 ± 0.01 | 6.56 ± 0.64 | 22.7 ± 3.6 | 5.5
L169A | 1.23 × 10.sup.−2 ± 0.06 × 10.sup.−2 | 9.14 ± 0.15 | 1.35 ± 0.14 | 93
E171A | 0.16 ± 0.01 | 6.74 ± 0.69 | 23.1 ± 3.6 | 5.4
Q172A | 1.13 ± 0.11 | 12.7 ± 1.9 | 89.3 ± 22 | 1.4
R197A | 6.28 × 10.sup.−4 ± 0.60 × 10.sup.−5 | 4.69 ± 0.12 | 0.13 ± 0.01 | 960
R197K | 1.90 × 10.sup.−3 ± 0.20 × 10.sup.−4 | 10.4 ± 0.19 | 0.18 ± 0.01 | 690
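The "Activity loss" column of Table 4 appears to be the ratio of the wild-type catalytic efficiency to that of each mutant. The sketch below reproduces that arithmetic from the tabulated k.sub.cat/K.sub.m values; the definition of fold loss used here is an assumption inferred from the numbers, and the dictionary and function names are hypothetical.

```python
# Catalytic efficiencies (k_cat/K_m, in M^-1 s^-1) as listed in Table 4.
EFFICIENCY = {
    "WT": 125.0,
    "V168A": 22.7,
    "L169A": 1.35,
    "E171A": 23.1,
    "Q172A": 89.3,
    "R197A": 0.13,
    "R197K": 0.18,
}


def fold_loss(mutant):
    """Assumed definition: wild-type efficiency divided by mutant efficiency."""
    return EFFICIENCY["WT"] / EFFICIENCY[mutant]


for name in EFFICIENCY:
    if name != "WT":
        # Two significant figures, matching the precision used in the table.
        print(f"{name}: {fold_loss(name):.2g}-fold loss")
```

Rounded to two significant figures, these ratios agree with the tabulated fold losses, which supports reading the column as a simple efficiency ratio.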
REFERENCES
[0498] 1. Cascales, L. & Craik, D. J. Naturally occurring circular proteins: distribution, biosynthesis and evolution. Org. Biomol. Chem. 8, 5035-5047 (2010).
[0499] 2. Clark, R. J., Akcan, M., Kaas, Q., Daly, N. L. & Craik, D. J. Cyclization of conotoxins to improve their biopharmaceutical properties. Toxicon (2010).
[0500] 3. Wong, C. T. T. et al. Orally Active Peptidic Bradykinin B1 Receptor Antagonists Engineered from a Cyclotide Scaffold for Inflammatory Pain Treatment. Angewandte Chemie International Edition, n/a-n/a (2012).
[0501] 4. Driggers, E. M., Hale, S. P., Lee, J. & Terrett, N. K. The exploration of macrocycles for drug discovery—an underexploited structural class. Nat. Rev. Drug Discov. 7, 608-624 (2008).
[0502] 5. Dawson, P. E., Muir, T. W., Clark-Lewis, I. & Kent, S. B. Synthesis of proteins by native chemical ligation. Science 266, 776-779 (1994).
[0503] 6. Muir, T. W. Semisynthesis of proteins by expressed protein ligation. Annu. Rev. Biochem. 72, 249-289 (2003).
[0504] 7. Muir, T. W., Sondhi, D. & Cole, P. A. Expressed protein ligation: a general method for protein engineering. Proc. Natl. Acad. Sci. U.S.A. 95, 6705-6710 (1998).
[0505] 8. Kimura, R. & Camarero, J. A. Expressed protein ligation: a new tool for the biosynthesis of cyclic polypeptides. Protein Pept. Lett. 12, 789-794 (2005).
[0506] 9. Kimura, R. H., Tran, A. T. & Camarero, J. A. Biosynthesis of the cyclotide Kalata B1 by using protein splicing. Angew. Chem. Int. Ed. Engl. 45, 973-976 (2006).
[0507] 10. Tavassoli, A. & Benkovic, S. J. Split-intein mediated circular ligation used in the synthesis of cyclic peptide libraries in E. coli. Nat. Protoc. 2, 1126-1133 (2007).
[0508] 11. Kawakami, T. et al. Diverse backbone-cyclized peptides via codon reprogramming. Nat. Chem. Biol. 5, 888-890 (2009).
[0509] 12. Nguyen, G. K. T. et al. Butelase 1 is an Asx-specific ligase enabling peptide macrocyclization and synthesis. Nat. Chem. Biol. 10, 732-738 (2014).
[0510] 13. Harris, K. S. et al. Efficient backbone cyclization of linear peptides by a recombinant asparaginyl endopeptidase. Nat. Commun. 6, 10199 (2015).
[0511] 14. Yang, R. et al. Engineering a Catalytically Efficient Recombinant Protein Ligase. J. Am. Chem. Soc. 139, 5351-5358 (2017).
[0512] 15. Lee, J., McIntosh, J., Hathaway, B. J. & Schmidt, E. W. Using marine natural products to discover a protease that catalyzes peptide macrocyclization of diverse substrates. J. Am. Chem. Soc. 131, 2122-2124 (2009).
[0513] 16. Barber, C. J. et al. The two-step biosynthesis of cyclic peptides from linear precursors in a member of the plant family Caryophyllaceae involves cyclization by a serine protease-like enzyme. J. Biol. Chem. 288, 12500-12510 (2013).
[0514] 17. Luo, H. et al. Peptide macrocyclization catalyzed by a prolyl oligopeptidase involved in alpha-amanitin biosynthesis. Chem. Biol. 21, 1610-1617 (2014).
[0515] 18. Pi, N. et al. Recombinant Butelase-Mediated Cyclization of the p53-Binding Domain of the Oncoprotein MdmX-Stabilized Protein Conformation as a Promising Model for Structural Investigation. Biochemistry 58, 3005-3015 (2019).
[0516] 19. Mazmanian, S. K., Liu, G., Ton-That, H. & Schneewind, O. Staphylococcus aureus sortase, an enzyme that anchors surface proteins to the cell wall. Science 285, 760-763 (1999).
[0517] 20. Antos, J. M. et al. Site-specific N- and C-terminal labeling of a single polypeptide using sortases of different specificity. J. Am. Chem. Soc. 131, 10800-10801 (2009).
[0518] 21. Jia, X. et al. Semienzymatic cyclization of disulfide-rich peptides using Sortase A. J. Biol. Chem. 289, 6627-6638 (2014).
[0519] 22. Nasr, M. L. et al. Covalently circularized nanodiscs for studying membrane proteins and viral entry. Nat. Methods 14, 49-52 (2017).
[0520] 23. Suree, N. et al. The structure of the Staphylococcus aureus sortase-substrate complex reveals how the universally conserved LPXTG sorting signal is recognized. J. Biol. Chem. 284, 24465-24477 (2009).
[0521] 24. Jacobitz, A. W., Kattke, M. D., Wereszczynski, J. & Clubb, R. T. Sortase Transpeptidases: Structural Biology and Catalytic Mechanism. Adv Protein Chem Struct Biol 109, 223-264 (2017).
[0522] 25. Zhou, C., Yan, Y., Fang, J., Cheng, B. & Fan, J. A new fusion protein platform for quantitatively measuring activity of multiple proteases. Microb. Cell Fact. 13, 44 (2014).
[0523] 26. Warden-Rothman, R., Caturegli, I., Popik, V. & Tsourkas, A. Sortase-tag expressed protein ligation: combining protein purification and site-specific bioconjugation into a single step. Anal. Chem. 85, 11090-11097 (2013).
[0524] 27. Wang, J. et al. Oligopeptide Targeting Sortase A as Potential Anti-infective Therapy for Staphylococcus aureus. Front Microbiol 9, 245 (2018).
[0525] 28. Mao, H. Y. A self-cleavable sortase fusion for one-step purification of free recombinant proteins. Protein Expr. Purif. 37, 253-263 (2004).
[0526] 29. Jia, X., Crawford, T., Zhang, A. H. & Mobli, M. A new vector coupling ligation-independent cloning with sortase a fusion for efficient cloning and one-step purification of tag-free recombinant proteins. Protein Expr. Purif. (2019).
[0527] 30. Chen, I., Dorr, B. M. & Liu, D. R. A general strategy for the evolution of bond-forming enzymes using yeast display. Proc. Natl. Acad. Sci. U.S.A. 108, 11399-11404 (2011).
[0528] 31. Ben-Bassat, A. et al. Processing of the initiation methionine from proteins: properties of the Escherichia coli methionine aminopeptidase and its gene structure. J. Bacteriol. 169, 751-757 (1987).
[0529] 32. Gangola, P. & Rosen, B. P. Maintenance of intracellular calcium in Escherichia coli. J. Biol. Chem. 262, 12570-12574 (1987).
[0530] 33. Faas, R. et al. Time-course and degradation rate of membrane scaffold protein (MSP1D1) during recombinant production. Biotechnol Rep (Amst) 17, 45-48 (2018).
[0531] 34. Mei, X. & Atkinson, D. Crystal structure of C-terminal truncated apolipoprotein A-I reveals the assembly of high density lipoprotein (HDL) by dimerization. J. Biol. Chem. 286, 38570-38582 (2011).
[0532] 35. Zhang, A. H. et al. Elucidating the Lipid Binding Properties of Membrane-Active Peptides Using Cyclised Nanodiscs. Frontiers in Chemistry 7 (2019).
[0533] 36. Yusuf, Y. et al. Optimization of the Production of Covalently Circularized Nanodiscs and Their Characterization in Physiological Conditions. Langmuir 34, 3525-3532 (2018).
[0534] 37. Luckett, S. et al. High-resolution structure of a potent, cyclic proteinase inhibitor from sunflower seeds. J. Mol. Biol. 290, 525-533 (1999).
[0535] 38. Sandall, D. et al. A novel α-conotoxin identified by gene sequencing is active in suppressing the vascular response to selective stimulation of sensory nerves in vivo. Biochemistry 42, 6904-6911 (2003).
[0536] 39. Saether, O. et al. Elucidation of the primary and three-dimensional structure of the uterotonic polypeptide kalata B1. Biochemistry 34, 4147-4158 (1995).
[0537] 40. Clark, R. J. et al. The engineering of an orally active conotoxin for the treatment of neuropathic pain. Angew. Chem. Int. Ed. Engl. 49, 6545-6548 (2010).
[0538] 41. Denisov, I. G. & Sligar, S. G. Nanodiscs in Membrane Biochemistry and Biophysics. Chem. Rev. 117, 4669-4713 (2017).
[0539] 42. Denisov, I. G., Grinkova, Y. V., Lazarides, A. A. & Sligar, S. G. Directed self-assembly of monodisperse phospholipid bilayer Nanodiscs with controlled size. J. Am. Chem. Soc. 126, 3477-3487 (2004).
[0540] 43. Hagn, F., Etzkorn, M., Raschle, T. & Wagner, G. Optimized Phospholipid Bilayer Nanodiscs Facilitate High-Resolution Structure Determination of Membrane Proteins. J. Am. Chem. Soc. 135, 1919-1925 (2013).
[0541] 44. Hagn, F. & Wagner, G. Structure refinement and membrane positioning of selectively labeled OmpX in phospholipid nanodiscs. J. Biomol. NMR 61, 249-260 (2015).
[0542] 45. Raschle, T. et al. Structural and functional characterization of the integral membrane protein VDAC-1 in lipid bilayer nanodiscs. J. Am. Chem. Soc. 131, 17777-17779 (2009).
[0543] 46. Johansen, N. T. et al. Circularized and solubility-enhanced MSPs facilitate simple and high yield production of stable nanodiscs for studies of membrane proteins in solution. FEBS J (2019).
[0544] 47. Miehling, J., Goricanec, D. & Hagn, F. A Split-Intein-Based Method for the Efficient Production of Circularized Nanodiscs for Structural Studies of Membrane Proteins. ChemBioChem 19, 1927-1933 (2018).
[0545] 48. Jennings, M. J., Barrios, A. F. & Tan, S. Elimination of truncated recombinant protein expressed in Escherichia coli by removing cryptic translation initiation site. Protein Expr. Purif. 121, 17-21 (2016).
[0546] 49. Whitaker, W. R., Lee, H., Arkin, A. P. & Dueber, J. E. Avoidance of truncated proteins from unintended ribosome binding sites within heterologous protein coding sequences. ACS Synth Biol 4, 249-257 (2015).
[0547] 50. Lawrence, A.-M. & Besir, H. Staining of proteins in gels with Coomassie G-250 without organic solvent and acetic acid. JoVE (Journal of Visualized Experiments), e1350 (2009).
[0548] 51. Ritchie, T. K. et al. Chapter 11—Reconstitution of membrane proteins in phospholipid bilayer nanodiscs. Methods in Enzymology 464, 211-231 (2009).
[0549] 52. Janosi, L. & Gorfe, A. A. Simulating POPC and POPC/POPG Bilayers: Conserved Packing and Altered Surface Reactivity. Journal of Chemical Theory and Computation 6, 3267-3273 (2010).