DELIVERY OF CRISPR/MCAS9 THROUGH EXTRACELLULAR VESICLES FOR GENOME EDITING
20220195455 · 2022-06-23
Inventors
Cpc classification
C12N2310/20
CHEMISTRY; METALLURGY
C12N9/22
CHEMISTRY; METALLURGY
C07K2319/033
CHEMISTRY; METALLURGY
International classification
C12N15/11
CHEMISTRY; METALLURGY
C12N15/90
CHEMISTRY; METALLURGY
Abstract
Disclosed herein is a fusion protein for gene editing, comprising a Cas9 domain that is configured to be encapsulated into exosomes and to localize to the nucleus of recipient cells. Also disclosed are recombinant polynucleotides that comprise a nucleic acid sequence encoding the disclosed Cas9 fusion protein. Also disclosed are cells comprising the disclosed polynucleotides. Also disclosed are methods of making a gene editing composition that involve culturing the disclosed cells under conditions suitable to produce extracellular vesicles encapsulating the guide RNA and fusion protein. Also disclosed are gene editing compositions that involve extracellular vesicles encapsulating the disclosed Cas9 fusion proteins and guide RNA. Finally, also disclosed herein are methods for editing a gene in a cell that involve contacting the cell with the herein disclosed gene editing compositions.
Claims
1. A fusion protein, comprising a myristoylation domain, a Cas9 domain, and a nuclear localization signal, wherein the myristoylation domain does not comprise a palmitoylation motif, wherein the polypeptide is configured to be myristoylated during translation, to be encapsulated into exosomes, and to localize to the nucleus of recipient cells.
2. The fusion protein of claim 1, wherein the myristoylation domain comprises the amino acid sequence G-X1-X1-X1-S/T-X2-X2-X2 (SEQ ID NO:1), wherein X1 is any amino acid other than Cys, and wherein X2 is any amino acid or nothing.
3. A recombinant polynucleotide, comprising a nucleic acid sequence encoding a guide RNA operably linked to a first expression control sequence, and a nucleic acid sequence encoding the fusion protein of claim 1 operably linked to a second expression control sequence.
4. A cell comprising the polynucleotide of claim 3.
5. A method of making a gene editing composition, comprising culturing the cell of claim 4 under conditions suitable to produce extracellular vesicles encapsulating the guide RNA and fusion protein.
6. A gene editing composition, comprising an extracellular vesicle encapsulating the fusion protein of claim 1 and a guide RNA.
7. The gene editing composition of claim 6 produced by the method of claim 5.
8. A method for editing a gene in a cell, comprising contacting the cell with the gene editing composition of claim 6.
9. A method for encapsulating a protein into an extracellular vesicle, comprising providing a fusion of the protein with a myristoylation domain, wherein the myristoylation domain does not comprise a palmitoylation motif, wherein the polypeptide is configured to be myristoylated during translation and encapsulated into extracellular vesicles.
Description
DESCRIPTION OF DRAWINGS
DETAILED DESCRIPTION
[0030] Before the present disclosure is described in greater detail, it is to be understood that this disclosure is not limited to particular embodiments described, and as such may, of course, vary. It is also to be understood that the terminology used herein is for the purpose of describing particular embodiments only, and is not intended to be limiting, since the scope of the present disclosure will be limited only by the appended claims.
[0031] Where a range of values is provided, it is understood that each intervening value, to the tenth of the unit of the lower limit unless the context clearly dictates otherwise, between the upper and lower limit of that range and any other stated or intervening value in that stated range, is encompassed within the disclosure. The upper and lower limits of these smaller ranges may independently be included in the smaller ranges and are also encompassed within the disclosure, subject to any specifically excluded limit in the stated range. Where the stated range includes one or both of the limits, ranges excluding either or both of those included limits are also included in the disclosure.
[0032] Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure belongs. Although any methods and materials similar or equivalent to those described herein can also be used in the practice or testing of the present disclosure, the preferred methods and materials are now described.
[0033] All publications and patents cited in this specification are herein incorporated by reference as if each individual publication or patent were specifically and individually indicated to be incorporated by reference and are incorporated herein by reference to disclose and describe the methods and/or materials in connection with which the publications are cited. The citation of any publication is for its disclosure prior to the filing date and should not be construed as an admission that the present disclosure is not entitled to antedate such publication by virtue of prior disclosure. Further, the dates of publication provided could be different from the actual publication dates that may need to be independently confirmed.
[0034] As will be apparent to those of skill in the art upon reading this disclosure, each of the individual embodiments described and illustrated herein has discrete components and features which may be readily separated from or combined with the features of any of the other several embodiments without departing from the scope or spirit of the present disclosure. Any recited method can be carried out in the order of events recited or in any other order that is logically possible.
[0035] Embodiments of the present disclosure will employ, unless otherwise indicated, techniques of chemistry, biology, and the like, which are within the skill of the art.
[0036] The following examples are put forth so as to provide those of ordinary skill in the art with a complete disclosure and description of how to perform the methods and use the probes disclosed and claimed herein. Efforts have been made to ensure accuracy with respect to numbers (e.g., amounts, temperature, etc.), but some errors and deviations should be accounted for. Unless indicated otherwise, parts are parts by weight, temperature is in ° C., and pressure is at or near atmospheric. Standard temperature and pressure are defined as 20° C. and 1 atmosphere.
[0037] Before the embodiments of the present disclosure are described in detail, it is to be understood that, unless otherwise indicated, the present disclosure is not limited to particular materials, reagents, reaction materials, manufacturing processes, or the like, as such can vary. It is also to be understood that the terminology used herein is for purposes of describing particular embodiments only, and is not intended to be limiting. It is also possible in the present disclosure that steps can be executed in different sequence where this is logically possible.
[0038] It must be noted that, as used in the specification and the appended claims, the singular forms “a,” “an,” and “the” include plural referents unless the context clearly dictates otherwise.
Cas9 Fusion Protein
[0039] Disclosed herein is a fusion protein for gene editing, comprising a Cas9 domain that is configured to be encapsulated into extracellular vesicles (EVs) and to localize to the nucleus of recipient cells. The fusion protein should satisfy two criteria: 1) it should be encapsulated into EVs; and 2) it should be taken up by recipient cells and localize to the nucleus for genome editing. The fusion protein can therefore contain a myristoylation domain and possess a positive charge, which allows encapsulation of the protein in EVs. As disclosed herein, palmitoylation of the peptide can significantly inhibit encapsulation and/or nuclear localization. Therefore, in some embodiments, the disclosed fusion protein contains a myristoylation domain that contains a myristoylation motif but does not contain a palmitoylation motif. Accordingly, disclosed herein is a fusion protein, comprising a myristoylation domain, a Cas9 domain, and a nuclear localization signal (NLS), wherein the polypeptide is configured to be myristoylated during protein translation. In some embodiments, the fusion protein comprises a myristoylation domain that possesses a myristoylation motif and a positive charge, but does not contain a palmitoylation motif.
[0040] In some embodiments, one or more domains of the fusion protein are separated by a polypeptide linker.
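The domain arrangement described above can be illustrated with a minimal sketch. It assumes an N-terminal myristoylation domain directly following the start Met, a C-terminal NLS, and an optional linker placed between adjacent domains. The NLS position, the helper name `assemble_fusion`, and the use of a nine-residue Cas9 fragment as a stand-in for the full domain are illustrative assumptions and are not taken from the disclosed construct.

```python
def assemble_fusion(myr_domain, cas9, nls, linker=""):
    """Concatenate domains N- to C-terminus: Met, myristoylation domain, Cas9, NLS.

    Hypothetical helper for illustration only. The initiator Met of the Cas9
    sequence is dropped so that the Gly of the myristoylation domain directly
    follows the single start Met of the fusion.
    """
    if not myr_domain.startswith("G"):
        raise ValueError("myristoylation domain must begin with Gly")
    if "C" in myr_domain:
        raise ValueError("Cys would introduce a potential palmitoylation site")
    cas9_body = cas9[1:] if cas9.startswith("M") else cas9
    return linker.join(["M" + myr_domain, cas9_body, nls])


# Stand-in values: GSNKS (SEQ ID NO:340), the first nine residues of
# SEQ ID NO:4 in place of full-length Cas9, and the SV40 NLS PKKKRKV.
fusion = assemble_fusion("GSNKS", "MDKKYSIGL", "PKKKRKV")
print(fusion)
```

Passing a non-empty `linker` (for example a short flexible peptide) places the same linker at both junctions; a real design could use a different linker at each junction.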
[0041] Myristoylation Domain
[0042] Myristoylation is a lipidation modification where a myristoyl group, derived from myristic acid, is covalently attached by an amide bond to the alpha-amino group of an N-terminal glycine residue. Briefly, proteins that will become myristoylated begin with a consensus sequence Met-Gly-X-X-X-Ser/Thr (SEQ ID NO:3). The start Met is cotranslationally, proteolytically removed and the myristate is added to the exposed N-terminal glycine via a stable amide bond.
[0043] As used herein, “palmitoylation” refers to the covalent attachment of fatty acids, such as palmitic acid, to cysteine. Therefore, in some embodiments, the myristoylation domain of the disclosed fusion protein does not comprise a cysteine residue.
[0044] Therefore, in some cases, the myristoylation domain comprises the amino acid sequence G-X-X-X-S/T (SEQ ID NO:1), wherein X is any amino acid other than Cys. In some embodiments, the myristoylation domain comprises the amino acid sequence GSNKS (SEQ ID NO:340). In some cases, the myristoylation domain comprises 5 to 10 amino acids, including 5, 6, 7, 8, 9, or 10 amino acids. Therefore, in some cases, the myristoylation domain comprises the amino acid sequence G-X1-X1-X1-S/T-X2-X2-X2-X2-X2 (SEQ ID NO:2), wherein X1 is any amino acid other than Cys, and wherein X2 is a basic amino acid, any amino acid, or nothing. For example, in some embodiments, the myristoylation domain comprises or consists of the amino acid sequence GSNKSKPKDA (SEQ ID NO:341). In some cases, the myristoylation domain is encoded by the nucleic acid sequence
TABLE-US-00001 (SEQ ID NO: 344) GGCAGCAACAAGAGCAAGCCCAAG.
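The motif constraints described above can be checked programmatically. The following is a minimal sketch, assuming the domain is supplied without the initiator Met and that Cys is excluded at every position (a conservative reading that combines the motif with the statement that the domain does not comprise a cysteine residue). The function name and the two negative-control peptides are hypothetical.

```python
import re

# The 19 standard amino acids other than Cys.
NON_CYS = "ADEFGHIKLMNPQRSTVWY"

# Gly, three non-Cys residues, Ser/Thr, then zero to five further non-Cys
# residues, giving a total length of 5 to 10.
MYR_DOMAIN = re.compile(rf"G[{NON_CYS}]{{3}}[ST][{NON_CYS}]{{0,5}}")


def is_myristoylation_domain(peptide):
    """Return True if the peptide fits the Cys-free G-X-X-X-S/T motif."""
    return MYR_DOMAIN.fullmatch(peptide.upper()) is not None


# GSNKS and GSNKSKPKDA come from the text; GCIKS and ASNKS are
# hypothetical negative controls (internal Cys, no N-terminal Gly).
for candidate in ("GSNKS", "GSNKSKPKDA", "GCIKS", "ASNKS"):
    print(candidate, is_myristoylation_domain(candidate))
```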
[0045] Cas9 Domain
[0046] The term “Cas9” or “Cas9 nuclease” refers to an RNA-guided nuclease comprising a Cas9 protein, or a fragment thereof (e.g., a protein comprising an active or inactive DNA cleavage domain of Cas9, and/or the gRNA binding domain of Cas9). A Cas9 nuclease is also referred to sometimes as a casn1 nuclease or a CRISPR (clustered regularly interspaced short palindromic repeat)-associated nuclease. CRISPR is an adaptive immune system that provides protection against mobile genetic elements (viruses, transposable elements and conjugative plasmids). CRISPR clusters contain spacers, sequences complementary to antecedent mobile elements, and target invading nucleic acids. CRISPR clusters are transcribed and processed into CRISPR RNA (crRNA). In type II CRISPR systems, correct processing of pre-crRNA requires a trans-encoded small RNA (tracrRNA), endogenous ribonuclease 3 (rnc) and a Cas9 protein. The tracrRNA serves as a guide for ribonuclease 3-aided processing of pre-crRNA. Subsequently, Cas9/crRNA/tracrRNA endonucleolytically cleaves linear or circular dsDNA target complementary to the spacer. The target strand not complementary to crRNA is first cut endonucleolytically, then trimmed 3′-5′ exonucleolytically. In nature, DNA-binding and cleavage typically require protein and both RNAs. However, single guide RNAs (“sgRNA”, or simply “gRNA”) can be engineered so as to incorporate aspects of both the crRNA and tracrRNA into a single RNA species. See, e.g., Jinek M., Chylinski K., Fonfara I., Hauer M., Doudna J. A., Charpentier E. Science 337:816-821 (2012), the entire contents of which are hereby incorporated by reference. Cas9 recognizes a short motif in the CRISPR repeat sequences (the PAM or protospacer adjacent motif) to help distinguish self versus non-self. Cas9 nuclease sequences and structures are well known to those of skill in the art (see, e.g., “Complete genome sequence of an M1 strain of Streptococcus pyogenes.” Ferretti J. J., McShan W. M., Ajdic D. J., Savic D. J., Savic G., Lyon K., Primeaux C., Sezate S., Suvorov A. N., Kenton S., Lai H. S., Lin S. P., Qian Y., Jia H. G., Najar F. Z., Ren Q., Zhu H., Song L., White J., Yuan X., Clifton S. W., Roe B. A., McLaughlin R. E., Proc. Natl. Acad. Sci. U.S.A. 98:4658-4663 (2001); “CRISPR RNA maturation by trans-encoded small RNA and host factor RNase III.” Deltcheva E., Chylinski K., Sharma C. M., Gonzales K., Chao Y., Pirzada Z. A., Eckert M. R., Vogel J., Charpentier E., Nature 471:602-607 (2011); and “A programmable dual-RNA-guided DNA endonuclease in adaptive bacterial immunity.” Jinek M., Chylinski K., Fonfara I., Hauer M., Doudna J. A., Charpentier E. Science 337:816-821 (2012), the entire contents of each of which are incorporated herein by reference). Cas9 orthologs have been described in various species, including, but not limited to, S. pyogenes and S. thermophilus. Additional suitable Cas9 nucleases and sequences will be apparent to those of skill in the art based on this disclosure, and such Cas9 nucleases and sequences include Cas9 sequences from the organisms and loci disclosed in Chylinski, Rhun, and Charpentier, “The tracrRNA and Cas9 families of type II CRISPR-Cas immunity systems” (2013) RNA Biology 10:5, 726-737; the entire contents of which are incorporated herein by reference. In some embodiments, a Cas9 nuclease has an inactive (e.g., an inactivated) DNA cleavage domain.
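The role of the PAM in target recognition can be illustrated with a short sketch that scans a DNA strand for candidate target sites. It assumes the commonly reported S. pyogenes Cas9 PAM of 5′-NGG-3′ immediately 3′ of a 20-nucleotide protospacer; neither the PAM sequence nor the protospacer length is stated in this disclosure, and the function name and the example sequence are hypothetical.

```python
import re


def find_spcas9_sites(dna, protospacer_len=20):
    """Return (start, protospacer, pam) for each NGG PAM on the given strand.

    Only the supplied strand is scanned; a practical design tool would also
    scan the reverse complement and evaluate off-target matches.
    """
    dna = dna.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % protospacer_len)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(dna)]


# Hypothetical 26-nt region: a 20-nt protospacer, an AGG PAM, then TTT.
example = "GACGTTACCGATCGATTACA" + "AGG" + "TTT"
for start, protospacer, pam in find_spcas9_sites(example):
    print(start, protospacer, pam)
```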
[0047] In some embodiments, the Cas9 domain comprises wild type Cas9 from Streptococcus pyogenes (NCBI Reference Sequence: NC_017053.1). Therefore, in some embodiments, the Cas9 domain comprises the amino acid sequence:
TABLE-US-00002 (SEQ ID NO: 4) MDKKYSIGLDIGTNSVGWAVITDDYKVPSKKFKVLGNTDRHSIKKNLIGA LLFGSGETAEATRLKRTARRRYTRRKNRICYLQEIFSNEMAKVDDSFFHR LEESFLVEEDKKHERHPIFGNIVDEVAYHEKYPTIYHLRKKLADSTDKAD LRLIYLALAHMIKFRGHFLIEGDLNPDNSDVDKLFIQLVQIYNQLFEENP INASRVDAKAILSARLSKSRRLENLIAQLPGEKRNGLFGNLIALSLGLTP NFKSNFDLAEDAKLQLSKDTYDDDLDNLLAQIGDQYADLFLAAKNLSDAI LLSDILRVNSEITKAPLSASMIKRYDEHHQDLTLLKALVRQQLPEKYKEI FFDQSKNGYAGYIDGGASQEEFYKFIKPILEKMDGTEELLVKLNREDLLR KQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREKIEKILTFRIPY YVGPLARGNSRFAWMTRKSEETITPWNFEEVVDKGASAQSFIERMTNFDK NLPNEKVLPKHSLLYEYFTVYNELTKVKYVTEGMRKPAFLSGEQKKAIVD LLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVEDRFNASLGAYHDLLKI IKDKDFLDNEENEDILEDIVLTLTLFEDRGMIEERLKTYAHLFDDKVMKQ LKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANRNFMQLIHDD SLTFKEDIQKAQVSGQGHSLHEQIANLAGSPAIKKGILQTVKIVDELVKV MGHKPENIVIEMARENQTTQKGQKNSRERMKRIEEGIKELGSQILKEHPV ENTQLQNEKLYLYYLQNGRDMYVDQELDINRLSDYDVDHIVPQSFIKDDS IDNKVLTRSDKNRGKSDNVPSEEVVKKMKNYWRQLLNAKLITQRKFDNLT KAERGGLSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLIR EVKVITLKSKLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALIKKY PKLESEFVYGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEIT LANGEIRKRPLIETNGETGEIVWDKGRDFATVRKVLSMPQVNIVKKTEVQ TGGFSKESILPKRNSDKLIARKKDWDPKKYGGFDSPTVAYSVLVVAKVEK GKSKKLKSVKELLGITIMERSSFEKNPIDFLEAKGYKEVKKDLIIKLPKY SLFELENGRKRMLASAGELQKGNELALPSKYVNFLYLASHYEKLKGSPED NEQKQLFVEQHKHYLDEIIEQISEFSKRVILADANLDKVLSAYNKHRDKP IREQAENIIHLFTLTNLGAPAAFKYFDTTIDRKRYTSTKEVLDATLIHQS ITGLYETRIDLSQLGGD.
[0048] In some embodiments, the Cas9 domain comprises the amino acid sequence:
TABLE-US-00003 (SEQ ID NO: 5) MDKKYSIGLAIGTNSVGWAVITDEYKVPSKKFKVLGNTDRHSIKKNLIGA LLFDSGETAEATRLKRTARRRYTRRKNRICYLQEIFSNEMAKVDDSFFHR LEESFLVEEDKKHERHPIFGNIVDEVAYHEKYPTIYHLRKKLVDSTDKAD LRLIYLALAHMIKFRGHFLIEGDLNPDNSDVDKLFIQLVQTYNQLFEENP INASGVDAKAILSARLSKSRRLENLIAQLPGEKKNGLFGNLIALSLGLTP NFKSNFDLAEDAKLQLSKDTYDDDLDNLLAQIGDQYADLFLAAKNLSDAI LLSDILRVNTEITKAPLSASMIKRYDEHHQDLTLLKALVRQQLPEKYKEI FFDQSKNGYAGYIDGGASQEEFYKFIKPILEKMDGTEELLVKLNREDLLR KQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREKIEKILTFRIPY YVGPLARGNSRFAVVMTRKSEETITPWNFEEVVDKGASAQSFIERMTNFD KNLPNEKVLPKHSLLYEYFTVYNELTKVKYVTEGMRKPAFLSGEQKKAIV DLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVEDRFNASLGTYHDLLK IIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYAHLFDDKVMK QLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANRNFMQLIHD DSLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIKKGILQTVKVVDELVK VMGRHKPENIVIEMARENQTTQKGQKNSRERMKRIEEGIKELGSQILKEH PVENTQLQNEKLYLYYLQNGRDMYVDQELDINRLSDYDVDHIVPQSFLKD DSIDNKVLTRSDKNRGKSDNVPSEEVVKKMKNYWRQLLNAKLITQRKFDN LTKAERGGLSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKL IREVKVITLKSKLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALIK KYPKLESEFVYGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTE ITLANGEIRKRPLIETNGETGEIVWDKGRDFATVRKVLSMPQVNIVKKTE VQTGGFSKESILPKRNSDKLIARKKDWDPKKYGGFDSPTVAYSVLVVAKV EKGKSKKLKSVKELLGITIMERSSFEKNPIDFLEAKGYKEVKKDLIIKLP KYSLFELENGRKRMLASAGELQKGNELALPSKYVNFLYLASHYEKLKGSP EDNEQKQLFVEQHKHYLDEIIEQISEFSKRVILADANLDKVLSAYNKHRD KPIREQAENIIHLFTLTNLGAPAAFKYFDTTIDRKRYTSTKEVLDATLIH QSITGLYETRIDLSQLGGD.
[0049] In some embodiments, the Cas9 domain comprises wild type Cas9 from Corynebacterium ulcerans (NCBI Refs: NC_015683.1, NC_017317.1); Corynebacterium diphtheriae (NCBI Refs: NC_016782.1, NC_016786.1); Spiroplasma syrphidicola (NCBI Ref: NC_021284.1); Prevotella intermedia (NCBI Ref: NC_017861.1); Spiroplasma taiwanense (NCBI Ref: NC_021846.1); Streptococcus iniae (NCBI Ref: NC_021314.1); Belliella baltica (NCBI Ref: NC_018010.1); Psychroflexus torquis (NCBI Ref: NC_018721.1); Streptococcus thermophilus (NCBI Ref: YP_820832.1); Listeria innocua (NCBI Ref: NP_472073.1); Campylobacter jejuni (NCBI Ref: YP_002344900.1); or Neisseria meningitidis (NCBI Ref: YP_002342100.1).
[0050] In some embodiments, the Cas9 domain is nuclease-inactive. Point mutations can be introduced into Cas9 to abolish nuclease activity, resulting in a dead Cas9 (dCas9) that still retains its ability to bind DNA in a sgRNA-programmed manner. In principle, when fused to another protein or domain, dCas9 can target that protein to virtually any DNA sequence simply by co-expression with an appropriate sgRNA. Methods for generating a Cas9 protein (or a fragment thereof) having an inactive DNA cleavage domain are known (See, e.g., Jinek et al., Science. 337:816-821(2012); Qi et al., “Repurposing CRISPR as an RNA-Guided Platform for Sequence-Specific Control of Gene Expression” (2013) Cell. 28; 152(5):1173-83, the entire contents of each of which are incorporated herein by reference). For example, the DNA cleavage domain of Cas9 is known to include two subdomains, the HNH nuclease subdomain and the RuvC1 subdomain. The HNH subdomain cleaves the strand complementary to the gRNA, whereas the RuvC1 subdomain cleaves the non-complementary strand. Mutations within these subdomains can silence the nuclease activity of Cas9. For example, the mutations D10A and H840A completely inactivate the nuclease activity of S. pyogenes Cas9 (Jinek et al., Science. 337:816-821(2012); Qi et al., Cell. 28; 152(5):1173-83 (2013)).
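The point-mutation notation used above (wild-type residue, 1-based position, replacement residue) can be made concrete with a minimal sketch. It applies D10A to the first 20 residues of SEQ ID NO:4 rather than to the full-length protein; the helper name `apply_point_mutation` is hypothetical.

```python
def apply_point_mutation(protein, mutation):
    """Apply a substitution written as <wild type><1-based position><new>, e.g. 'D10A'."""
    wild_type, position, new = mutation[0], int(mutation[1:-1]), mutation[-1]
    index = position - 1
    # Refuse to mutate if the sequence does not carry the stated wild-type residue.
    if protein[index] != wild_type:
        raise ValueError(
            f"expected {wild_type} at position {position}, found {protein[index]}"
        )
    return protein[:index] + new + protein[index + 1:]


# First 20 residues of SEQ ID NO:4 (wild type S. pyogenes Cas9).
n_terminus = "MDKKYSIGLDIGTNSVGWAV"
mutated = apply_point_mutation(n_terminus, "D10A")
print(mutated)
```

The wild-type check guards against numbering offsets, which matter when a sequence carries an N-terminal extension such as a myristoylation domain.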
[0051] For example, in some embodiments, the Cas9 domain comprises the amino acid sequence:
TABLE-US-00004 (dCas9 with D10A and H840A, SEQ ID NO: 6) MDKKYSIGLAIGTNSVGWAVITDEYKVPSKKFKVLGNTDRHSIKKNLIGA LLFDSGETAEATRLKRTARRRYTRRKNRICYLQEIFSNEMAKVDDSFFHR LEESFLVEEDKKHERHPIFGNIVDEVAYHEKYPTIYHLRKKLVDSTDKAD LRLIYLALAHMIKFRGHFLIEGDLNPDNSDVDKLFIQLVQTYNQLFEENP INASGVDAKAILSARLSKSRRLENLIAQLPGEKKNGLFGNLIALSLGLTP NFKSNFDLAEDAKLQLSKDTYDDDLDNLLAQIGDQYADLFLAAKNLSDAI LLSDILRVNTEITKAPLSASMIKRYDEHHQDLTLLKALVRQQLPEKYKEI FFDQSKNGYAGYIDGGASQEEFYKFIKPILEKMDGTEELLVKLNREDLLR KQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREKIEKILTFRIPY YVGPLARGNSRFAVVMTRKSEETITPWNFEEVVDKGASAQSFIERMTNFD KNLPNEKVLPKHSLLYEYFTVYNELTKVKYVTEGMRKPAFLSGEQKKAIV DLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVEDRFNASLGTYHDLLK IIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYAHLFDDKVMK QLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANRNFMQLIHD DSLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIKKGILQTVKVVDELVK VMGRHKPENIVIEMARENQTTQKGQKNSRERMKRIEEGIKELGSQILKEH PVENTQLQNEKLYLYYLQNGRDMYVDQELDINRLSDYDVDAIVPQSFLKD DSIDNKVLTRSDKNRGKSDNVPSEEVVKKMKNYWRQLLNAKLITQRKFDN LTKAERGGLSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKL IREVKVITLKSKLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALIK KYPKLESEFVYGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTE ITLANGEIRKRPLIETNGETGEIVWDKGRDFATVRKVLSMPQVNIVKKTE VQTGGFSKESILPKRNSDKLIARKKDWDPKKYGGFDSPTVAYSVLVVAKV EKGKSKKLKSVKELLGITIMERSSFEKNPIDFLEAKGYKEVKKDLIIKLP KYSLFELENGRKRMLASAGELQKGNELALPSKYVNFLYLASHYEKLKGSP EDNEQKQLFVEQHKHYLDEIIEQISEFSKRVILADANLDKVLSAYNKHRD KPIREQAENIIHLFTLTNLGAPAAFKYFDTTIDRKRYTSTKEVLDATLIH QSITGLYETRIDLSQLGGD.
[0052] In some embodiments, the Cas9 domain is encoded by the nucleic acid sequence:
TABLE-US-00005 (SEQ ID NO: 345) ATGGGCAGCAACAAGAGCAAGCCCAAGGATAAGAAATACTCAATAGGACT GGATATTGGCACAAATAGCGTCGGATGGGCTGTGATCACTGATGAATATA AGGTTCCTTCTAAAAAGTTCAAGGTTCTGGGAAATACAGACCGCCACAGT ATCAAAAAAAATCTTATAGGGGCTCTTCTGTTTGACAGTGGAGAGACAGC CGAAGCTACTAGACTCAAACGGACAGCTAGGAGAAGGTATACAAGACGGA AGAATAGGATTTGTTATCTCCAGGAGATTTTTTCAAATGAGATGGCCAAA GTGGATGATAGTTTCTTTCATAGACTTGAAGAGTCTTTTTTGGTGGAAGA AGACAAGAAGCATGAAAGACATCCTATTTTTGGAAATATAGTGGATGAAG TTGCTTATCACGAGAAATATCCAACTATCTATCATCTGAGAAAAAAATTG GTGGATTCTACTGATAAAGCCGATTTGCGCCTGATCTATTTGGCCCTGGC CCACATGATTAAGTTTAGAGGTCATTTTTTGATTGAGGGCGATCTGAATC CTGATAATAGTGATGTGGACAAACTGTTTATCCAGTTGGTGCAAACCTAC AATCAACTGTTTGAAGAAAACCCTATTAACGCAAGTGGAGTGGATGCTAA AGCCATTCTTTCTGCAAGATTGAGTAAATCAAGAAGACTGGAAAATCTCA TTGCTCAGCTCCCCGGTGAGAAGAAAAATGGCCTGTTTGGGAATCTCATT GCTTTGTCATTGGGTTTGACCCCTAATTTTAAATCAAATTTTGATTTGGC AGAAGATGCTAAACTCCAGCTTTCAAAAGATACTTACGATGATGATCTGG ATAATCTGTTGGCTCAAATTGGAGATCAATATGCTGATTTGTTTTTGGCA GCTAAGAATCTGTCAGATGCTATTCTGCTTTCAGACATCCTGAGAGTGAA TACTGAAATAACTAAGGCTCCCCTGTCAGCTTCAATGATTAAACGCTACG ATGAACATCATCAAGACTTGACTCTTCTGAAAGCCCTGGTTAGACAACAA CTTCCAGAAAAGTATAAAGAAATCTTTTTTGATCAATCAAAAAACGGATA TGCAGGTTATATTGATGGCGGCGCAAGCCAAGAAGAATTTTATAAATTTA TCAAACCAATTCTGGAAAAAATGGATGGTACTGAGGAACTGTTGGTGAAA CTGAATAGAGAAGATTTGCTGCGCAAGCAACGGACCTTTGACAACGGCTC TATTCCCCATCAAATTCACTTGGGTGAGCTGCATGCTATTTTGAGAAGAC AAGAAGACTTTTATCCATTTCTGAAAGACAATAGAGAGAAGATTGAAAAA ATCTTGACTTTTAGGATTCCTTATTATGTTGGTCCATTGGCCAGAGGCAA TAGTAGGTTTGCATGGATGACTCGGAAGTCTGAAGAAACAATTACCCCAT GGAATTTTGAAGAAGTTGTCGATAAAGGTGCTTCAGCTCAATCATTTATT GAACGCATGACAAACTTTGATAAAAATCTTCCAAATGAAAAAGTGCTGCC AAAACATAGTTTGCTTTATGAGTATTTTACCGTTTATAACGAATTGACAA AGGTCAAATATGTTACTGAAGGAATGAGAAAACCAGCATTTCTTTCAGGT GAACAGAAGAAAGCCATTGTTGATCTGCTCTTCAAAACAAATAGGAAAGT GACCGTTAAGCAACTGAAAGAAGATTATTTCAAAAAAATAGAATGTTTTG ATAGTGTTGAAATTTCAGGAGTTGAAGATAGATTTAATGCTTCACTGGGT ACATACCATGATTTGCTGAAAATTATTAAAGATAAAGATTTTTTGGATAA TGAAGAAAATGAAGACATCCTGGAGGATATTGTTCTGACATTGACCCTGT 
TTGAAGATAGGGAGATGATTGAGGAAAGACTTAAAACATACGCTCACCTC TTTGATGATAAGGTGATGAAACAGCTTAAAAGACGCAGATATACTGGTTG GGGAAGGTTGTCCAGAAAATTGATTAATGGTATTAGGGATAAGCAATCTG GCAAAACAATACTGGATTTTTTGAAATCAGATGGTTTTGCCAATCGCAAT TTTATGCAGCTCATCCATGATGATAGTTTGACATTTAAAGAAGACATCCA AAAAGCACAAGTGTCTGGACAAGGCGATAGTCTGCATGAACATATTGCAA ATCTGGCTGGTAGCCCTGCTATTAAAAAAGGTATTCTCCAGACTGTGAAA GTTGTTGATGAATTGGTCAAAGTGATGGGGCGGCATAAGCCAGAAAATAT CGTTATTGAAATGGCAAGAGAAAATCAGACAACTCAAAAGGGCCAGAAAA ATTCCAGAGAGAGGATGAAAAGAATCGAAGAAGGTATCAAAGAACTGGGA AGTCAGATTCTTAAAGAGCATCCTGTTGAAAATACTCAATTGCAAAATGA AAAGCTCTATCTCTATTATCTCCAAAATGGAAGAGATATGTATGTGGACC AAGAACTGGATATTAATAGGCTGAGTGATTATGATGTCGATCACATTGTT CCACAAAGTTTCCTTAAAGACGATTCAATAGACAATAAGGTCCTGACCAG GTCTGATAAAAATAGAGGTAAATCCGATAACGTTCCAAGTGAAGAAGTGG TCAAAAAGATGAAAAACTATTGGAGACAACTTCTGAACGCCAAGCTGATC ACTCAAAGGAAGTTTGATAATCTGACCAAAGCTGAAAGAGGAGGTTTGAG TGAACTTGATAAAGCTGGTTTTATCAAACGCCAATTGGTTGAAACTCGCC AAATCACTAAGCATGTGGCACAAATTTTGGATAGTCGCATGAATACTAAA TACGATGAAAATGATAAACTTATTAGAGAGGTTAAAGTGATTACCCTGAA ATCTAAACTGGTTTCTGACTTCAGAAAAGATTTCCAATTCTATAAAGTGA GAGAGATTAACAATTACCATCATGCCCATGATGCCTATCTGAATGCCGTC GTTGGAACTGCTTTGATTAAGAAATATCCAAAACTTGAAAGCGAGTTTGT CTATGGTGATTATAAAGTTTATGATGTTAGGAAAATGATTGCTAAGTCTG AGCAAGAAATAGGCAAAGCAACCGCAAAGTATTTCTTTTACTCTAATATC ATGAACTTCTTCAAAACAGAAATTACACTTGCAAATGGAGAGATTCGCAA ACGCCCTCTGATCGAAACTAATGGGGAAACTGGAGAAATTGTCTGGGATA AAGGGAGAGATTTTGCCACAGTGCGCAAAGTGTTGTCCATGCCCCAAGTC AATATCGTCAAGAAAACAGAAGTGCAGACAGGCGGATTCTCTAAGGAGTC AATTCTGCCAAAAAGAAATTCCGACAAGCTGATTGCTAGGAAAAAAGACT GGGACCCAAAAAAATATGGTGGTTTTGATAGTCCAACCGTGGCTTATTCA GTCCTGGTGGTTGCTAAGGTGGAAAAAGGGAAATCCAAGAAGCTGAAATC CGTTAAAGAGCTGCTGGGGATCACAATTATGGAAAGAAGTTCCTTTGAAA AAAATCCCATTGACTTTCTGGAAGCTAAAGGATATAAGGAAGTTAAAAAA GACCTGATCATTAAACTGCCTAAATATAGTCTTTTTGAGCTGGAAAACGG TAGGAAACGGATGCTGGCTAGTGCCGGAGAACTGCAAAAAGGAAATGAGC TGGCTCTGCCAAGCAAATATGTGAATTTTCTGTATCTGGCTAGTCATTAT GAAAAGTTGAAGGGTAGTCCAGAAGATAACGAACAAAAACAATTGTTTGT GGAGCAGCATAAGCATTATCTGGATGAGATTATTGAGCAAATCAGTGAAT 
TTTCTAAGAGAGTTATTCTGGCAGATGCCAATCTGGATAAAGTTCTTAGT GCATATAACAAACATAGAGACAAACCAATAAGAGAACAAGCAGAAAATAT CATTCATCTGTTTACCTTGACCAATCTTGGAGCACCCGCTGCTTTTAAAT ACTTTGATACAACAATTGATAGGAAAAGATATACCTCTACAAAAGAAGTT CTGGATGCCACTCTTATCCATCAATCCATCACTGGTCTTTATGAAACACG CATTGATTTGAGTCAGCTGGGAGGTGAC.
[0053] In some embodiments, the Cas9 domain is a Cas9 variant. For example, a Cas9 variant is at least about 70% identical, at least about 80% identical, at least about 90% identical, at least about 95% identical, at least about 96% identical, at least about 97% identical, at least about 98% identical, at least about 99% identical, at least about 99.5% identical, or at least about 99.9% identical to wild type Cas9. In some embodiments, the Cas9 variant comprises a fragment of Cas9 (e.g., a gRNA binding domain or a DNA-cleavage domain), such that the fragment is at least about 70% identical, at least about 80% identical, at least about 90% identical, at least about 95% identical, at least about 96% identical, at least about 97% identical, at least about 98% identical, at least about 99% identical, at least about 99.5% identical, or at least about 99.9% identical to the corresponding fragment of Cas9.
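The percent identity thresholds above presuppose a way of computing identity between two sequences. The following is a minimal sketch that assumes the two sequences have already been aligned to the same length, with `-` marking gaps, and that identity is counted over all aligned columns. The disclosure does not specify an alignment method or a denominator convention, so both are assumptions here, as is the function name.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over aligned columns, skipping columns where both are gaps."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    columns = [
        (a, b) for a, b in zip(aligned_a, aligned_b) if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    # A column counts as identical only when both residues match and are not gaps.
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


# First 20 residues of SEQ ID NO:4 and SEQ ID NO:5, which differ at position 10.
identity = percent_identity("MDKKYSIGLDIGTNSVGWAV", "MDKKYSIGLAIGTNSVGWAV")
print(identity)
```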
[0054] Nuclear Localization Signal (NLS)
[0055] In some embodiments, the NLS sequence comprises, in part or in whole, the amino acid sequence of a single or dual SV40 NLS sequence (PKKKRKV, SEQ ID NO:342). In some embodiments, the NLS sequence comprises, in part or in whole, the amino acid sequence of nucleoplasmin (AVKRPAATKKAGQAKKKKLD, SEQ ID NO: 343), EGL-13 (MSRRRKANPTKLSENAKKLAKEVEN, SEQ ID NO: 344), c-Myc (PAAKRVKLD, SEQ ID NO: 345), or TUS-protein (KLKIKRPVK, SEQ ID NO: 346). In some embodiments, the NLS sequence is encoded by the nucleic acid sequence CCCAAGAAAAAACGCAAGGTG (SEQ ID NO:347), CCTAAGAAAAAGCGGAAAGTG (SEQ ID NO:348), or a combination thereof.
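The relationship between the two NLS coding sequences and the SV40 NLS peptide can be verified by translation. The sketch below assumes the standard genetic code and in-frame input; the `translate` helper is a hypothetical minimal implementation and not a reference to any particular library.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code with codons enumerated in TCAG order (TTT, TTC, TTA, ...).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate an in-frame DNA coding sequence using the standard genetic code."""
    dna = dna.upper()
    if len(dna) % 3:
        raise ValueError("length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


nls_347 = translate("CCCAAGAAAAAACGCAAGGTG")
nls_348 = translate("CCTAAGAAAAAGCGGAAAGTG")
print(nls_347, nls_348)
```

Both coding sequences yield the same peptide because they differ only in synonymous codons, which is consistent with using the two in combination to encode a dual NLS without repeating an identical nucleotide sequence.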
[0056] Additional features may be present, for example, one or more linker sequences between the NLS and the rest of the fusion protein and/or between the nucleic acid-editing enzyme or domain and the Cas9. Other exemplary features that may be present are localization sequences, such as cytoplasmic localization sequences, export sequences, such as nuclear export sequences, or other localization sequences, as well as sequence tags that are useful for solubilization, purification, or detection of the fusion proteins. Suitable localization signal sequences and sequences of protein tags are provided herein, and include, but are not limited to, biotin carboxylase carrier protein (BCCP) tags, myc-tags, calmodulin-tags, FLAG-tags, hemagglutinin (HA)-tags, polyhistidine tags, also referred to as histidine tags or His-tags, maltose binding protein (MBP)-tags, nus-tags, glutathione-S-transferase (GST)-tags, green fluorescent protein (GFP)-tags, thioredoxin-tags, S-tags, Softags (e.g., Softag 1, Softag 3), strep-tags, biotin ligase tags, FlAsH tags, V5 tags, and SBP-tags. Additional suitable sequences will be apparent to those of skill in the art. For example, in some embodiments, a myc tag is encoded by the nucleic acid sequence GAGCAGAAACTCATCTCAGAAGAGGATCTG (SEQ ID NO:349). For example, in some embodiments, a FLAG tag is encoded by the nucleic acid sequence
TABLE-US-00006 (SEQ ID NO: 350) GATTACAAGGATGACGACGATAAG.
[0057] In some embodiments, the polynucleotide encoding the disclosed fusion protein comprises the nucleic acid sequence:
TABLE-US-00007 (SEQ ID NO: 351) GTCGACGGATCGGGAGATCTCCCGATCCCCTATGGTGCACTCTCAGTACAATCTGCTC TGATGCCGCATAGTTAAGCCAGTATCTGCTCCCTGCTTGTGTGTTGGAGGTCGCTGAGT AGTGCGCGAGCAAAATTTAAGCTACAACAAGGCAAGGCTTGACCGACAATTGCATGAA GAATCTGCTTAGGGTTAGGCGTTTTGCGCTGCTTCGCGATGTACGGGCCAGATATACG CGTTGACATTGATTATTGACTAGTTATTAATAGTAATCAATTACGGGGTCATTAGTTCATA GCCCATATATGGAGTTCCGCGTTACATAACTTACGGTAAATGGCCCGCCTGGCTGACC GCCCAACGACCCCCGCCCATTGACGTCAATAATGACGTATGTTCCCATAGTAACGCCAA TAGGGACTTTCCATTGACGTCAATGGGTGGACTATTTACGGTAAACTGCCCACTTGGCA GTACATCAAGTGTATCATATGCCAAGTACGCCCCCTATTGACGTCAATGACGGTAAATG GCCCGCCTGGCATTATGCCCAGTACATGACCTTACGGGACTTTCCTACTTGGCAGTACA TCTACGTATTAGTCATCGCTATTACCATGGTGATGCGGTTTTGGCAGTACACCAATGGG CGTGGATAGCGGTTTGACTCACGGGGATTTCCAAGTCTCCACCCCATTGACGTCAATG GGAGTTTGTTTTGGCACCAAAATCAACGGGACTTTCCAAAATGTCGTAACAACTCCGCC CCATTGACGCAAATGGGCGGTAGGCGTGTACGGTGGGAGGTCTCTGTACTGGGTCTCT CTGGTTAGACCAGATCTGAGCCTGGGAGCTCTCTGGCTAACTAGGGAACCCACTGCTT AAGCCTCAATAAAGCTTGCCTTGAGTGCTTCAAGTAGTGTGTGCCCGTCTGTTGTGTGA CTCTGGTAACTAGAGATCCCTCAGACCCTTTTAGTCAGTGTGGAAAATCTCTAGCAGTG GCGCCCGAACAGGGACTTGAAAGCGAAAGGGAAACCAGAGGAGCTCTCTCGACGCAG GACTCGGCTTGCTGAAGCGCGCACGGCAAGAGGCGAGGGGCGGCGACTGGTGAGTA CGCCAAAAATTTTGACTAGCGGAGGCTAGAAGGAGAGAGATGGGTGCGAGAGCGTCA GTATTAAGCGGGGGAGAATTAGATCGCGATGGGAAAAAATTCGGTTAAGGCCAGGGGG AAAGAAAAAATATAAATTAAAACATATAGTATGGGCAAGCAGGGAGCTAGAACGATTCG CAGTTAATCCTGGCCTGTTAGAAACATCAGAAGGCTGTAGACAAATACTGGGACAGCTA CAACCATCCCTTCAGACAGGATCAGAAGAACTTAGATCATTATATAATACAGTAGCAACC CTCTATTGTGTGCATCAAAGGATAGAGATAAAAGACACCAAGGAAGCTTTAGACAAGAT AGAGGAAGAGCAAAACAAAAGTAAGACCACCGCACAGCAAGCGGCCGCTGATCTTCAG ACCTGGAGGAGGAGATATGAGGGACAATTGGAGAAGTGAATTATATAAATATAAAGTAG TAAAAATTGAACCATTAGGAGTAGCACCCACCAAGGCAAAGAGAAGAGTGGTGCAGAG AGAAAAAAGAGCAGTGGGAATAGGAGCTTTGTTCCTTGGGTTCTTGGGAGCAGCAGGA AGCACTATGGGCGCAGCGTCAATGACGCTGACGGTACAGGCCAGACAATTATTGTCTG GTATAGTGCAGCAGCAGAACAATTTGCTGAGGGCTATTGAGGCGCAACAGCATCTGTT GCAACTCACAGTCTGGGGCATCAAGCAGCTCCAGGCAAGAATCCTGGCTGTGGAAAGA TACCTAAAGGATCAACAGCTCCTGGGGATTTGGGGTTGCTCTGGAAAACTCATTTGCAC 
CACTGCTGTGCCTTGGAATGCTAGTTGGAGTAATAAATCTCTGGAACAGATTTGGAATC ACACGACCTGGATGGAGTGGGACAGAGAAATTAACAATTACACAAGCTTAATACACTCC TTAATTGAAGAATCGCAAAACCAGCAAGAAAAGAATGAACAAGAATTATTGGAATTAGAT AAATGGGCAAGTTTGTGGAATTGGTTTAACATAACAAATTGGCTGTGGTATATAAAATTA TTCATAATGATAGTAGGAGGCTTGGTAGGTTTAAGAATAGTTTTTGCTGTACTTTCTATA GTGAATAGAGTTAGGCAGGGATATTCACCATTATCGTTTCAGACCCACCTCCCAACCCC GAGGGGACCCGACAGGCCCGAAGGAATAGAAGAAGAAGGTGGAGAGAGAGACAGAGA CAGATCCATTCGATTAGTGAACGGATCGGCACTGCGTGCGCCAATTCTGCAGACAAAT GGCAGTATTCATCCACAATTTTAAAAGAAAAGGGGGGATTGGGGGGTACAGTGCAGGG GAAAGAATAGTAGAAATAATAGCAACAGACATACAAACTAAAGAATTACAAAAACAAATT ACAAAAATTCAAAATTTTCGGGTTTATTACAGGGACAGCAGAGATCCAGTTTGGTTAATC CGCTAGCTCTAGAGGATCTGAATTCCCCAGTGGAAAGACGCGCAGGCAAAACGCACCA CGTGACGGAGCGTGACCGCGCGCCGAGCGCGCGCCAAGGTCGGGCAGGAAGAGGGC CTATTTCCCATGATTCCTTCATATTTGCATATACGATACAAGGCTGTTAGAGAGATAATT AGAATTAATTTGACTGTAAACACAAAGATATTAGTACAAAATACGTGACGTAGAAAGTAA TAATTTCTTGGGTAGTTTGCAGTTTTAAAATTATGTTTTAAAATGGACTATCATATGCTTA CCGTAACTTGAAAGTATTTCGATTTCTTGGGTTTATATATCTTGTGGAAAGGACGCGGG ATCCACTGGACCAGGCAGCAGCGTCAGAAGACTTTTTTGGAACGTCTCGTTTTAGAGCT AGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGT CGGTGCTTTTTTTGGTGTACATTTATATTGGCTCATGTCCAATATGACCGCCATGTTGAC ATTGATTATTGACTAGTTATTAATAGTAATCAATTACGGGGTCATTAGTTCATAGCCCATA TATGGAGTTCCGCGTTACATAACTTACGGTAAATGGCCCGCCTGGCTGACCGCCCAAC GACCCCCGCCCATTGACGTCAATAATGACGTATGTTCCCATAGTAACGCCAATAGGGAC TTTCCATTGACGTCAATGGGTGGAGTATTTACGGTAAACTGCCCACTTGGCAGTACATC AAGTGTATCATATGCCAAGTCCGCCCCCTATTGACGTCAATGACGGTAAATGGCCCGC CTGGCATTATGCCCAGTACATGACCTTACGGGACTTTCCTACTTGGCAGTACATCTACG TATTAGTCATCGCTATTACCATGGTGATGCGGTTTTGGCAGTACACCAATGGGCGTGGA TAGCGGTTTGACTCACGGGGATTTCCAAGTCTCCACCCCATTGACGTCAATGGGAGTTT GTTTTGGCACCAAAATCAACGGGACTTTCCAAAATGTCGTAATAACCCCGCCCCGTTGA CGCAAATGGGCGGTAGGCGTGTACGGTGGGAGGTCTATATAAGCAGAGCTCGTTTAGT GAACCGTCAGAATTTTGTAATACGACTCACTATAGGGCGGCCGGGAATTCGTCGACTG GAACCGGTACCGAGGAGATCTGCCGCCGCGATCGCCATGGGCAGCAACAAGAGCAAG CCCAAGGATAAGAAATACTCAATAGGACTGGATATTGGCACAAATAGCGTCGGATGGG 
CTGTGATCACTGATGAATATAAGGTTCCTTCTAAAAAGTTCAAGGTTCTGGGAAATACAG ACCGCCACAGTATCAAAAAAAATCTTATAGGGGCTCTTCTGTTTGACAGTGGAGAGACA GCCGAAGCTACTAGACTCAAACGGACAGCTAGGAGAAGGTATACAAGACGGAAGAATA GGATTTGTTATCTCCAGGAGATTTTTTCAAATGAGATGGCCAAAGTGGATGATAGTTTCT TTCATAGACTTGAAGAGTCTTTTTTGGTGGAAGAAGACAAGAAGCATGAAAGACATCCT ATTTTTGGAAATATAGTGGATGAAGTTGCTTATCACGAGAAATATCCAACTATCTATCAT CTGAGAAAAAAATTGGTGGATTCTACTGATAAAGCCGATTTGCGCCTGATCTATTTGGC CCTGGCCCACATGATTAAGTTTAGAGGTCATTTTTTGATTGAGGGCGATCTGAATCCTG ATAATAGTGATGTGGACAAACTGTTTATCCAGTTGGTGCAAACCTACAATCAACTGTTTG AAGAAAACCCTATTAACGCAAGTGGAGTGGATGCTAAAGCCATTCTTTCTGCAAGATTG AGTAAATCAAGAAGACTGGAAAATCTCATTGCTCAGCTCCCCGGTGAGAAGAAAAATGG CCTGTTTGGGAATCTCATTGCTTTGTCATTGGGTTTGACCCCTAATTTTAAATCAAATTTT GATTTGGCAGAAGATGCTAAACTCCAGCTTTCAAAAGATACTTACGATGATGATCTGGA TAATCTGTTGGCTCAAATTGGAGATCAATATGCTGATTTGTTTTTGGCAGCTAAGAATCT GTCAGATGCTATTCTGCTTTCAGACATCCTGAGAGTGAATACTGAAATAACTAAGGCTC CCCTGTCAGCTTCAATGATTAAACGCTACGATGAACATCATCAAGACTTGACTCTTCTGA AAGCCCTGGTTAGACAACAACTTCCAGAAAAGTATAAAGAAATCTTTTTTGATCAATCAA AAAACGGATATGCAGGTTATATTGATGGCGGCGCAAGCCAAGAAGAATTTTATAAATTT ATCAAACCAATTCTGGAAAAAATGGATGGTACTGAGGAACTGTTGGTGAAACTGAATAG AGAAGATTTGCTGCGCAAGCAACGGACCTTTGACAACGGCTCTATTCCCCATCAAATTC ACTTGGGTGAGCTGCATGCTATTTTGAGAAGACAAGAAGACTTTTATCCATTTCTGAAAG ACAATAGAGAGAAGATTGAAAAAATCTTGACTTTTAGGATTCCTTATTATGTTGGTCCAT TGGCCAGAGGCAATAGTAGGTTTGCATGGATGACTCGGAAGTCTGAAGAAACAATTAC CCCATGGAATTTTGAAGAAGTTGTCGATAAAGGTGCTTCAGCTCAATCATTTATTGAACG CATGACAAACTTTGATAAAAATCTTCCAAATGAAAAAGTGCTGCCAAAACATAGTTTGCT TTATGAGTATTTTACCGTTTATAACGAATTGACAAAGGTCAAATATGTTACTGAAGGAAT GAGAAAACCAGCATTTCTTTCAGGTGAACAGAAGAAAGCCATTGTTGATCTGCTCTTCA AAACAAATAGGAAAGTGACCGTTAAGCAACTGAAAGAAGATTATTTCAAAAAAATAGAAT GTTTTGATAGTGTTGAAATTTCAGGAGTTGAAGATAGATTTAATGCTTCACTGGGTACAT ACCATGATTTGCTGAAAATTATTAAAGATAAAGATTTTTTGGATAATGAAGAAAATGAAGA CATCCTGGAGGATATTGTTCTGACATTGACCCTGTTTGAAGATAGGGAGATGATTGAGG AAAGACTTAAAACATACGCTCACCTCTTTGATGATAAGGTGATGAAACAGCTTAAAAGAC GCAGATATACTGGTTGGGGAAGGTTGTCCAGAAAATTGATTAATGGTATTAGGGATAAG 
CAATCTGGCAAAACAATACTGGATTTTTTGAAATCAGATGGTTTTGCCAATCGCAATTTT ATGCAGCTCATCCATGATGATAGTTTGACATTTAAAGAAGACATCCAAAAAGCACAAGT GTCTGGACAAGGCGATAGTCTGCATGAACATATTGCAAATCTGGCTGGTAGCCCTGCTA TTAAAAAAGGTATTCTCCAGACTGTGAAAGTTGTTGATGAATTGGTCAAAGTGATGGGG CGGCATAAGCCAGAAAATATCGTTATTGAAATGGCAAGAGAAAATCAGACAACTCAAAA GGGCCAGAAAAATTCCAGAGAGAGGATGAAAAGAATCGAAGAAGGTATCAAAGAACTG GGAAGTCAGATTCTTAAAGAGCATCCTGTTGAAAATACTCAATTGCAAAATGAAAAGCTC TATCTCTATTATCTCCAAAATGGAAGAGATATGTATGTGGACCAAGAACTGGATATTAAT AGGCTGAGTGATTATGATGTCGATCACATTGTTCCACAAAGTTTCCTTAAAGACGATTCA ATAGACAATAAGGTCCTGACCAGGTCTGATAAAAATAGAGGTAAATCCGATAACGTTCC AAGTGAAGAAGTGGTCAAAAAGATGAAAAACTATTGGAGACAACTTCTGAACGCCAAGC TGATCACTCAAAGGAAGTTTGATAATCTGACCAAAGCTGAAAGAGGAGGTTTGAGTGAA CTTGATAAAGCTGGTTTTATCAAACGCCAATTGGTTGAAACTCGCCAAATCACTAAGCAT GTGGCACAAATTTTGGATAGTCGCATGAATACTAAATACGATGAAAATGATAAACTTATT AGAGAGGTTAAAGTGATTACCCTGAAATCTAAACTGGTTTCTGACTTCAGAAAAGATTTC CAATTCTATAAAGTGAGAGAGATTAACAATTACCATCATGCCCATGATGCCTATCTGAAT GCCGTCGTTGGAACTGCTTTGATTAAGAAATATCCAAAACTTGAAAGCGAGTTTGTCTAT GGTGATTATAAAGTTTATGATGTTAGGAAAATGATTGCTAAGTCTGAGCAAGAAATAGGC AAAGCAACCGCAAAGTATTTCTTTTACTCTAATATCATGAACTTCTTCAAAACAGAAATTA CACTTGCAAATGGAGAGATTCGCAAACGCCCTCTGATCGAAACTAATGGGGAAACTGG AGAAATTGTCTGGGATAAAGGGAGAGATTTTGCCACAGTGCGCAAAGTGTTGTCCATGC CCCAAGTCAATATCGTCAAGAAAACAGAAGTGCAGACAGGCGGATTCTCTAAGGAGTC AATTCTGCCAAAAAGAAATTCCGACAAGCTGATTGCTAGGAAAAAAGACTGGGACCCAA AAAAATATGGTGGTTTTGATAGTCCAACCGTGGCTTATTCAGTCCTGGTGGTTGCTAAG GTGGAAAAAGGGAAATCCAAGAAGCTGAAATCCGTTAAAGAGCTGCTGGGGATCACAA TTATGGAAAGAAGTTCCTTTGAAAAAAATCCCATTGACTTTCTGGAAGCTAAAGGATATA AGGAAGTTAAAAAAGACCTGATCATTAAACTGCCTAAATATAGTCTTTTTGAGCTGGAAA ACGGTAGGAAACGGATGCTGGCTAGTGCCGGAGAACTGCAAAAAGGAAATGAGCTGG CTCTGCCAAGCAAATATGTGAATTTTCTGTATCTGGCTAGTCATTATGAAAAGTTGAAGG GTAGTCCAGAAGATAACGAACAAAAACAATTGTTTGTGGAGCAGCATAAGCATTATCTG GATGAGATTATTGAGCAAATCAGTGAATTTTCTAAGAGAGTTATTCTGGCAGATGCCAAT CTGGATAAAGTTCTTAGTGCATATAACAAACATAGAGACAAACCAATAAGAGAACAAGC AGAAAATATCATTCATCTGTTTACCTTGACCAATCTTGGAGCACCCGCTGCTTTTAAATA 
CTTTGATACAACAATTGATAGGAAAAGATATACCTCTACAAAAGAAGTTCTGGATGCCAC TCTTATCCATCAATCCATCACTGGTCTTTATGAAACACGCATTGATTTGAGTCAGCTGGG AGGTGACCCCAAGAAAAAACGCAAGGTGGAAGATCCTAAGAAAAAGCGGAAAGTGGAC ACGCGTACGCGGCCGCTCGAGCAGAAACTCATCTCAGAAGAGGATCTGGCAGCAAATG ATATCCTGGATTACAAGGATGACGACGATAAGGTTTAACTTAATTAATTCGATATCAAGC TTATCGATAATCAACCTCTGGATTACAAAATTTGTGAAAGATTGACTGGTATTCTTAACTA TGTTGCTCCTTTTACGCTATGTGGATACGCTGCTTTAATGCCTTTGTATCATGCTATTGC TTCCCGTATGGCTTTCATTTTCTCCTCCTTGTATAAATCCTGGTTGCTGTCTCTTTATGA GGAGTTGTGGCCCGTTGTCAGGCAACGTGGCGTGGTGTGCACTGTGTTTGCTGACGCA ACCCCCACTGGTTGGGGCATTGCCACCACCTGTCAGCTCCTTTCCGGGACTTTCGCTT TCCCCCTCCCTATTGCCACGGCGGAACTCATCGCCCGCCTGCCTTGCCCGCTGCTGGA CAGGGGCTCGGCTGTTGGGCACTGACAATTCCGTGGTGTTGTCGGGGAAATCATCGTC CTTTCCTTGGCTGCTCGCCTGTGTTGCCACCTGGATTCTGCGCGGGACGTCCTTCTGC TACGTCCTTCGGCCCTCAATCCAAGCGGACCTTCCTTCCCGCGGCCTGCTGCCGGCTC TGCGGGCCTCTTCCGCGTCTTTCGCCTTCGCCCTCAGACGAGTCGGATCTCCCTTTGG GCGCTCCCCGCATCGATGTCGACCTCGAGACCGGCCGAACTCGAAGACCTAGAAAAAA CATTGGAGCAATCACAAGTAGCAATACAGCAGCTACCAATGCTGATTGTGCCTGGCTAG AAGCACAAGAGGAGGAGGAGGTGGGTTTTCCAGTCACACCTCAGGTACCTTTAAGACC AATGACTTACAAGGCAGCTGTAGATCTTAGCCACTTTTTAAAAGAAAAGGGGGGACTGG AAGGGCTAATTCACTCCCAACGAAGACAAGATATCCTTGATCTGTGGATCTACCACACA CAAGGCTACTTCCCTGATTGGCAGAACTACACACCAGGGCCAGGGATCAGATATCCAC TGACCTTTGGATGGTGCTACAAGCTAGTACCAGTTGAGCAAGAGAAGGTAGAAGAAGC CAATGAAGGAGAGAACACCCGCTTGTTACACCCTGTGAGCCTGCATGGGATGGATGAC CCGGAGAGAGAAGTATTAGAGTGGAGGTTTGACAGCCGCCTAGCATTTCATCACATGG CCCGAGAGCTGCATCCGGACTGTACTGGGTCTCTCTGGTTAGACCAGATCTGAGCCTG GGAGCTCTCTGGCTAACTAGGGAACCCACTGCTTAAGCCTCAATAAAGCTTGCCTTGAG TGCTTCAAGTAGTGTGTGCCCGTCTGTTGTGTGACTCTGGTAACTAGAGATCCCTCAGA CCCTTTTAGTCAGTGTGGAAAATCTCTAGCAGGGCCCGTTTAAACCCGCTGATCAGCCT CGACTGTGCCTTCTAGTTGCCAGCCATCTGTTGTTTGCCCCTCCCCCGTGCCTTCCTTG ACCCTGGAAGGTGCCACTCCCACTGTCCTTTCCTAATAAAATGAGGAAATTGCATCGCA TTGTCTGAGTAGGTGTCATTCTATTCTGGGGGGTGGGGTGGGGCAGGACAGCAAGGG GGAGGATTGGGAAGACAATAGCAGGCATGCTGGGGATGCGGTGGGCTCTATGGCTTC TGAGGCGGAAAGAACCAGCTGGGGCTCTAGGGGGTATCCCCACGCGCCCTGTAGCGG 
CGCATTAAGCGCGGCGGGTGTGGTGGTTACGCGCAGCGTGACCGCTACACTTGCCAG CGCCCTAGCGCCCGCTCCTTTCGCTTTCTTCCCTTCCTTTCTCGCCACGTTCGCCGGCT TTCCCCGTCAAGCTCTAAATCGGGGCATCCCTTTAGGGTTCCGATTTAGTGCTTTACGG CACCTCGACCCCAAAAAACTTGATTAGGGTGATGGTTCACGTAGTGGGCCATCGCCCT GATAGACGGTTTTTCGCCCTTTGACGTTGGAGTCCACGTTCTTTAATAGTGGACTCTTG TTCCAAACTGGAACAACACTCAACCCTATCTCGGTCTATTCTTTTGATTTATAAGGGATT TTGGGGATTTCGGCCTATTGGTTAAAAAATGAGCTGATTTAACAAAAATTTAACGCGAAT TAATTCTGTGGAATGTGTGTCAGTTAGGGTGTGGAAAGTCCCCAGGCTCCCCAGCAGG CAGAAGTATGCAAAGCATGCATCTCAATTAGTCAGCAACCAGGTGTGGAAAGTCCCCA GGCTCCCCAGCAGGCAGAAGTATGCAAAGCATGCATCTCAATTAGTCAGCAACCATAG TCCCGCCCCTAACTCCGCCCATCCCGCCCCTAACTCCGCCCAGTTCCGCCCATTCTCC GCCCCATGGCTGACTAATTTTTTTTATTTATGCAGAGGCCGAGGCCGCCTCGGCCTCTG AGCTATTCCAGAAGTAGTGAGGAGGCTTTTTTGGAGGCCTAGGCTTTTGCAAAAAGCTC CCGGGAGCTTGTATATCCATTTTCGGATCTGATCAGCACGTGTTGACAATTAATCATCG GCATAGTATATCGGCATAGTATAATACGACAAGGTGAGGAACTAAACCATGGCCAAGTT GACCAGTGCCGTTCCGGTGCTCACCGCGCGCGACGTCGCCGGAGCGGTCGAGTTCTG GACCGACCGGCTCGGGTTCTCCCGGGACTTCGTGGAGGACGACTTCGCCGGTGTGGT CCGGGACGACGTGACCCTGTTCATCAGCGCGGTCCAGGACCAGGTGGTGCCGGACAA CACCCTGGCCTGGGTGTGGGTGCGCGGCCTGGACGAGCTGTACGCCGAGTGGTCGG AGGTCGTGTCCACGAACTTCCGGGACGCCTCCGGGCCGGCCATGACCGAGATCGGCG AGCAGCCGTGGGGGCGGGAGTTCGCCCTGCGCGACCCGGCCGGCAACTGCGTGCAC TTCGTGGCCGAGGAGCAGGACTGACACGTGCTACGAGATTTCGATTCCACCGCCGCCT TCTATGAAAGGTTGGGCTTCGGAATCGTTTTCCGGGACGCCGGCTGGATGATCCTCCA GCGCGGGGATCTCATGCTGGAGTTCTTCGCCCACCCCAACTTGTTTATTGCAGCTTATA ATGGTTACAAATAAAGCAATAGCATCACAAATTTCACAAATAAAGCATTTTTTTCACTGCA TTCTAGTTGTGGTTTGTCCAAACTCATCAATGTATCTTATCATGTCTGTATACCGTCGAC CTCTAGCTAGAGCTTGGCGTAATCATGGTCATAGCTGTTTCCTGTGTGAAATTGTTATCC GCTCACAATTCCACACAACATACGAGCCGGAAGCATAAAGTGTAAAGCCTGGGGTGCC TAATGAGTGAGCTAACTCACATTAATTGCGTTGCGCTCACTGCCCGCTTTCCAGTCGGG AAACCTGTCGTGCCAGCTGCATTAATGAATCGGCCAACGCGCGGGGAGAGGCGGTTT GCGTATTGGGCGCTCTTCCGCTTCCTCGCTCACTGACTCGCTGCGCTCGGTCGTTCGG CTGCGGCGAGCGGTATCAGCTCACTCAAAGGCGGTAATACGGTTATCCACAGAATCAG GGGATAACGCAGGAAAGAACATGTGAGCAAAAGGCCAGCAAAAGGCCAGGAACCGTA 
AAAAGGCCGCGTTGCTGGCGTTTTTCCATAGGCTCCGCCCCCCTGACGAGCATCACAA AAATCGACGCTCAAGTCAGAGGTGGCGAAACCCGACAGGACTATAAAGATACCAGGCG TTTCCCCCTGGAAGCTCCCTCGTGCGCTCTCCTGTTCCGACCCTGCCGCTTACCGGAT ACCTGTCCGCCTTTCTCCCTTCGGGAAGCGTGGCGCTTTCTCAATGCTCACGCTGTAG GTATCTCAGTTCGGTGTAGGTCGTTCGCTCCAAGCTGGGCTGTGTGCACGAACCCCCC GTTCAGCCCGACCGCTGCGCCTTATCCGGTAACTATCGTCTTGAGTCCAACCCGGTAA GACACGACTTATCGCCACTGGCAGCAGCCACTGGTAACAGGATTAGCAGAGCGAGGTA TGTAGGCGGTGCTACAGAGTTCTTGAAGTGGTGGCCTAACTACGGCTACACTAGAAGG ACAGTATTTGGTATCTGCGCTCTGCTGAAGCCAGTTACCTTCGGAAAAAGAGTTGGTAG CTCTTGATCCGGCAAACAAACCACCGCTGGTAGCGGTGGTTTTTTTGTTTGCAAGCAGC AGATTACGCGCAGAAAAAAAGGATCTCAAGAAGATCCTTTGATCTTTTCTACGGGGTCT GACGCTCAGTGGAACGAAAACTCACGTTAAGGGATTTTGGTCATGAGATTATCAAAAAG GATCTTCACCTAGATCCTTTTAAATTAAAAATGAAGTTTTAAATCAATCTAAAGTATATAT GAGTAAACTTGGTCTGACAGTTACCAATGCTTAATCAGTGAGGCACCTATCTCAGCGAT CTGTCTATTTCGTTCATCCATAGTTGCCTGACTCCCCGTCGTGTAGATAACTACGATAC GGGAGGGCTTACCATCTGGCCCCAGTGCTGCAATGATACCGCGAGACCCACGCTCAC CGGCTCCAGATTTATCAGCAATAAACCAGCCAGCCGGAAGGGCCGAGCGCAGAAGTG GTCCTGCAACTTTATCCGCCTCCATCCAGTCTATTAATTGTTGCCGGGAAGCTAGAGTA AGTAGTTCGCCAGTTAATAGTTTGCGCAACGTTGTTGCCATTGCTACAGGCATCGTGGT GTCACGCTCGTCGTTTGGTATGGCTTCATTCAGCTCCGGTTCCCAACGATCAAGGCGA GTTACATGATCCCCCATGTTGTGCAAAAAAGCGGTTAGCTCCTTCGGTCCTCCGATCGT TGTCAGAAGTAAGTTGGCCGCAGTGTTATCACTCATGGTTATGGCAGCACTGCATAATT CTCTTACTGTCATGCCATCCGTAAGATGCTTTTCTGTGACTGGTGAGTACTCAACCAAG TCATTCTGAGAATAGTGTATGCGGCGACCGAGTTGCTCTTGCCCGGCGTCAATACGGG ATAATACCGCGCCACATAGCAGAACTTTAAAAGTGCTCATCATTGGAAAACGTTCTTCG GGGCGAAAACTCTCAAGGATCTTACCGCTGTTGAGATCCAGTTCGATGTAACCCACTCG TGCACCCAACTGATCTTCAGCATCTTTTACTTTCACCAGCGTTTCTGGGTGAGCAAAAA CAGGAAGGCAAAATGCCGCAAAAAAGGGAATAAGGGCGACACGGAAATGTTGAATACT CATACTCTTCCTTTTTCAATATTATTGAAGCATTTATCAGGGTTATTGTCTCATGAGCGGA TACATATTTGAATGTATTTAGAAAAATAAACAAATAGGGGTTCCGCGCACATTTCCCCGA AAAGTGCCACCTGAC.
[0058] Extracellular Vesicles
[0059] Disclosed herein is a gene editing composition that comprises an extracellular vesicle (EV) encapsulating the Cas9 fusion protein disclosed herein and a guide RNA. Exemplary extracellular vesicles may include but are not limited to exosomes. However, the term “extracellular vesicles” should be interpreted to include all nanometer-scale lipid vesicles that are secreted by cells such as secreted vesicles formed from lysosomes.
[0060] EVs are cell-derived vesicles with a closed double-layer membrane structure. According to their size and density, EVs mainly include exosomes (30-150 nm), microvesicles (MVs) (100-1000 nm), and apoptotic bodies or cancer-related oncosomes (1-10 μm). EVs are able to carry various molecules, such as proteins, lipids, and RNAs, on their surface as well as within their lumen. The EV and exosomal surface proteins can mediate organ-specific homing of circulating EVs.
[0061] EVs are produced by many different types of cells, including immune cells such as B lymphocytes, T lymphocytes, dendritic cells (DCs), and mast cells. EVs are also produced, for example, by glioma cells, platelets, reticulocytes, neurons, intestinal epithelial cells, and tumor cells. EVs for use in the disclosed compositions and methods can be derived from any suitable cells, including the cells identified above. EVs have also been isolated from physiological fluids, such as plasma, urine, amniotic fluid, and malignant effusions. Non-limiting examples of suitable EV-producing cells for mass production include dendritic cells (e.g., immature dendritic cells), Human Embryonic Kidney 293 (HEK) cells, 293T cells, Chinese hamster ovary (CHO) cells, and human ESC-derived mesenchymal stem cells.
[0062] EVs can also be obtained from any autologous patient-derived, heterologous haplotype-matched, or heterologous stem cells so as to reduce or avoid the generation of an immune response in a patient to whom the EVs are delivered. Any EV-producing cell can be used for this purpose.
[0063] EVs produced from cells can be collected from the culture medium by any suitable method. Typically, a preparation of EVs can be prepared from cell culture or tissue supernatant by centrifugation, filtration, or combinations of these methods. For example, EVs can be prepared by differential centrifugation, that is, low speed (<20,000×g) centrifugation to pellet larger particles followed by high speed (>100,000×g) centrifugation to pellet EVs; by size filtration with appropriate filters (for example, a 0.22 μm filter); by gradient ultracentrifugation (for example, with a sucrose gradient); or by a combination of these methods.
[0064] In one embodiment, the EVs comprising the disclosed fusion protein are obtained by culturing a cell expressing the fusion protein and subsequently isolating indirectly modified EVs from the culture medium.
[0065] The disclosed EVs may be administered to a subject by any suitable means. Administration to a human or animal subject may be selected from parenteral, intramuscular, intracerebral, intravascular, subcutaneous, or transdermal administration. Typically the method of delivery is by injection. Preferably the injection is intramuscular or intravascular (e.g. intravenous). A physician will be able to determine the required route of administration for each particular patient.
[0066] The EVs are preferably delivered as a composition. The composition may be formulated for parenteral, intramuscular, intracerebral, intravascular (including intravenous), subcutaneous, or transdermal administration. Compositions for parenteral administration may include sterile aqueous solutions which may also contain buffers, diluents and other suitable additives. The EVs may be formulated in a pharmaceutical composition, which may include pharmaceutically acceptable carriers, thickeners, diluents, buffers, preservatives, and other pharmaceutically acceptable carriers or excipients and the like in addition to the EVs.
[0067] EVs may be administered within a pharmaceutically-acceptable diluent, carrier, or excipient, in unit dosage form. Conventional pharmaceutical practice may be employed to provide suitable formulations or compositions to administer the compounds to patients suffering from a disease (e.g., cancer). Administration may begin before the patient is symptomatic. Any appropriate route of administration may be employed, for example, administration may be parenteral, intravenous, intraarterial, subcutaneous, intratumoral, intramuscular, intracranial, intraorbital, ophthalmic, intraventricular, intrahepatic, intracapsular, intrathecal, intracisternal, intraperitoneal, intranasal, aerosol, suppository, or oral administration. For example, therapeutic formulations may be in the form of liquid solutions or suspensions; for oral administration, formulations may be in the form of tablets or capsules; and for intranasal formulations, in the form of powders, nasal drops, or aerosols.
[0068] The disclosed extracellular vesicles further may comprise an agent, such as a therapeutic agent, where the extracellular vesicles deliver the agent to a target cell. Agents comprised by the extracellular vesicles may include but are not limited to therapeutic drugs (e.g., small molecule drugs), therapeutic proteins, and therapeutic nucleic acids (e.g., therapeutic RNA). In some embodiments, the disclosed extracellular vesicles comprise a therapeutic RNA as a so-called “cargo RNA.” For example, in some embodiments the fusion protein further may comprise an RNA-domain (e.g., at a cytosolic C-terminus of the fusion protein) that binds to one or more RNA-motifs present in the cargo RNA in order to package the cargo RNA into the extracellular vesicle prior to the extracellular vesicles being secreted from a cell. As such, the fusion protein may function as both a “targeting protein” and a “packaging protein.” In some embodiments, the packaging protein may be referred to as an extracellular vesicle-loading protein or “EV-loading protein.” (See Hung and Leonard, “A platform for actively loading cargo RNA to elucidate limiting steps in EV-mediated delivery,” J. Extracellular Vesicles, 2016, 5: 31027, published 13 May 2016, the content of which is incorporated herein by reference in its entirety.)
Methods for DNA Editing
[0069] Disclosed herein are methods for editing DNA in a cell with a gene editing composition disclosed herein. In some embodiments, any of the methods provided herein can be performed on DNA in a cell, for example a bacterium, a yeast cell, or a mammalian cell. In some embodiments, the DNA contacted by any Cas9 protein provided herein is in a eukaryotic cell. In some embodiments, the methods can be performed on a cell or tissue in vitro or ex vivo. In some embodiments, the eukaryotic cell is in an individual, such as a patient or research animal. In some embodiments, the individual is a human.
Polynucleotides, Vectors, Cells, Kits
[0070] Also disclosed herein are polynucleotides encoding one or more of the proteins and/or gRNAs described herein. For example, polynucleotides encoding any of the proteins described herein are provided, e.g., for recombinant expression and purification. In some embodiments, an isolated polynucleotide comprises one or more sequences encoding a gRNA, alone or in combination with a sequence encoding any of the proteins described herein.
[0071] In some embodiments, vectors encoding any of the proteins described herein are provided, e.g., for recombinant expression and purification of Cas9 proteins, and/or fusions comprising Cas9 fusion proteins. In some embodiments, the vector comprises or is engineered to include an isolated polynucleotide, e.g., those described herein. In some embodiments, the vector comprises one or more sequences encoding a Cas9 fusion protein (as described herein), a gRNA, or combinations thereof, as described herein. Typically, the vector comprises a sequence encoding the fusion protein operably linked to a promoter, such that the fusion protein is expressed in a host cell.
[0072] In some embodiments, cells are provided, e.g., for recombinant expression and encapsulation of the disclosed Cas9 fusion proteins and gRNA into extracellular vesicles (EVs). The cells include any cell suitable for recombinant protein expression, for example, cells comprising a genetic construct expressing or capable of expressing a fusion protein disclosed herein (e.g., cells that have been transformed with one or more vectors described herein, or cells having genomic modifications, for example, those that express a protein provided herein from an allele that has been incorporated in the cell's genome). Methods for transforming cells, genetically modifying cells, and expressing genes and proteins in such cells are well known in the art, and include those provided by, for example, Green and Sambrook, Molecular Cloning: A Laboratory Manual (4th ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (2012)) and Friedman and Rossi, Gene Transfer: Delivery and Expression of DNA and RNA, A Laboratory Manual (1st ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (2006)).
[0073] Some aspects of this disclosure provide kits comprising a polynucleotide encoding a Cas9 fusion protein provided herein. In some embodiments, the kit comprises a vector for recombinant protein expression, wherein the vector comprises a polynucleotide encoding any of the proteins provided herein. In some embodiments, the kit comprises a cell (e.g., any cell suitable for expressing Cas9 fusion proteins, such as bacterial, yeast, or mammalian cells) that comprises a genetic construct for expressing any of the proteins provided herein. In some embodiments, any of the kits provided herein further comprise one or more gRNAs and/or vectors for expressing one or more gRNAs. In some embodiments, the kit comprises an excipient and instructions for contacting the nuclease and/or recombinase with the excipient to generate a composition suitable for contacting a nucleic acid with the nuclease and/or recombinase such that hybridization to and cleavage and/or recombination of a target nucleic acid occurs. In some embodiments, the composition is suitable for delivering a Cas9 protein to a cell. In some embodiments, the composition is suitable for delivering a Cas9 protein to a subject. In some embodiments, the excipient is a pharmaceutically acceptable excipient.
[0074] A number of embodiments of the invention have been described. Nevertheless, it will be understood that various modifications may be made without departing from the spirit and scope of the invention. Accordingly, other embodiments are within the scope of the following claims.
EXAMPLES
Example 1: Fatty Acylation Regulates the Encapsulation of Src Family Kinases into Extracellular Vesicles
[0075] Protein N-myristoylation is a co/post-translational modification that results in covalent attachment of the myristoyl group (14-carbon saturated fatty acyl) to the N-terminus of a target protein (Wright M H, et al. J Chem Biol. 2010 3:19-35). A consensus sequence of Met-Gly-x-x-x-Ser/Thr (SEQ ID NO:3) at the N-terminus is essential for the N-myristoylation process. Myristoylation modification occurs after the first methionine is removed by methionine aminopeptidase during protein translation, and Gly2 is the site of the attachment of the myristoyl group (Udenwobele D I, et al. 2017 8:751). A panel of proteins have been reported to be myristoylated in mammalian cells (Resh M D. Biochimica et biophysica acta. 1999 1451:1-16). Myristoylation allows these proteins to participate in a variety of molecular functions such as cellular localization, cell signaling, and cell-cell communication (Kim S, et al. J Biol Chem. 2017; Casey P J. Science. 1995 268:221). These activities can subsequently regulate the proliferation of cancer cells, tumor progression, immune response, and other biological functions (Udenwobele D I, et al. 2017 8:751; Kim S, et al. Cancer Res. 2017 77:6950-62). Targeting protein myristoylation is a potential therapeutic approach for the treatment of cancer progression (Kim S, et al. Cancer Res. 2017 77:6950-62; Li Q, et al. J Biol Chem. 2018 293:6434-48; Sulejmani E, et al. Oncoscience. 2018 5:3-5).
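By way of illustration only, the consensus described above can be expressed as a simple pattern match. The following Python sketch is not part of the disclosed methods; the function name `has_myristoylation_consensus` is hypothetical, and a match indicates only that the N-terminus fits Met-Gly-x-x-x-Ser/Thr, not that the protein is actually myristoylated in a cell. The example peptide MGSNKSKPKD is the translation of the ATGGGCAGCAACAAGAGCAAGCCCAAGGAT stretch found in SEQ ID NO: 351, and the second peptide carries a Gly2-to-Ala (G2A) substitution.

```python
import re

# Met-Gly-x-x-x-Ser/Thr anchored at the N-terminus, as quoted in the text.
# "x" is taken here to mean any single-letter amino acid code (an assumption).
CONSENSUS = re.compile(r"^MG[A-Z]{3}[ST]")


def has_myristoylation_consensus(protein: str) -> bool:
    """Return True if the sequence begins with Met-Gly-x-x-x-Ser/Thr.

    This is a pattern check only; it does not predict whether the
    protein is myristoylated in a cell.
    """
    return CONSENSUS.match(protein.strip().upper()) is not None


if __name__ == "__main__":
    # N-terminus translated from SEQ ID NO: 351 (ATGGGCAGCAACAAGAGCAAGCCCAAGGAT).
    print(has_myristoylation_consensus("MGSNKSKPKD"))
    # Same peptide with Gly2 replaced by Ala (G2A).
    print(has_myristoylation_consensus("MASNKSKPKD"))
```

Because the pattern is anchored at the first residue, an internal Gly-x-x-x-Ser/Thr stretch does not count as a match, consistent with Gly2 being the attachment site.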
[0076] Src family kinases (SFKs), a group of non-receptor tyrosine kinases, are among the identified myristoylated proteins (Martin G S. Nat Rev Mol Cell Biol. 2001 2:467-75). All SFK members contain an N-terminal Src Homology (SH) 4 domain controlling membrane association via myristoylation and, depending on the SFK, palmitoylation. For example, both Src and Fyn kinase are N-myristoylated, but Fyn kinase is also palmitoylated at cysteine residues at sites 3 and 6 in the N-terminus (Resh M D. Biochimica et biophysica acta. 1999 1451:1-16; Cai H, et al. Proc Natl Acad Sci USA. 2011 108:6579-84; Resh M D. Cell. 1994 76:411-3). SFKs also contain SH3, SH2, and tyrosine kinase (SH1) domains, and a short C-terminal tail containing an autoinhibitory phosphorylation site, such as Tyr529 in human Src kinase (Xu W, et al. Nature. 1997 385:595; Sicheri F, et al. Curr Opin Cell Biol. 1997 7:777-85). The expression and activity of Src kinase are highly up-regulated in various cancers including aggressive prostate cancer (Guo Z, et al. Cancer Cell. 2006 10:309-19; Drake J M, et al. Proc Natl Acad Sci USA. 2013 110:E4762-9), which is associated with short life expectancy and a high probability of distant metastasis (Fizazi K. Ann Oncol. 2007 18:1765-73; Erpel T, et al. Curr Opin Cell Biol. 1995 7:176-82; Parsons J T, et al. Curr Opin Cell Biol. 1997 9:187-92; Tatarov O, et al. Clin Cancer Res. 2009 15:3540-9; Irby R B, et al. Oncogene. 2000 19:5636). Differential patterns of myristoylation and/or palmitoylation of SFKs determine their cellular localization (Kim S, et al. J Biol Chem. 2017; Patwardhan P, et al. Mol Cell Biol. 2010 30:4094-107), the interaction of Src kinase with androgen receptor (Kim S, et al. Cancer Res. 2017 77:6950-62), intracellular trafficking (Sato I, et al. J Cell Sci. 2009 122:965-75), and subsequently their kinase activity and transformation potential (Kim S, et al. J Biol Chem. 2017; Cai H, et al. Proc Natl Acad Sci USA. 2011 108:6579-84; Patwardhan P, et al. Mol Cell Biol. 2010 30:4094-107; Oneyama C, et al. 2008 30:426-36; Oneyama C, et al. Mol Cell Biol. 2009 29:6462-72). Exogenous myristate in a high-fat diet can regulate Src kinase levels at the cell membrane via myristoylation, and accelerate Src-mediated oncogenic potential and tumorigenesis (Kim S, et al. J Biol Chem. 2017; Kim S, et al. Cancer Res. 2017 77:6950-62).
[0077] Extracellular vesicles (EVs) are nanovesicles with a diameter of 30-150 nm secreted from almost all cell types (Kowal J, et al. Curr Opin Cell Biol. 2014 29:116-25). EVs mediate cell-to-cell communication through the transfer of lipids, proteins, mRNAs, microRNAs, and other exosomal contents (Villarroya-Beltri C, et al. Sem Cell Biol. 2014 28:3-13; Simons M, et al. Curr Opin Cell Biol. 2009 21:575-81). EV-mediated cellular interaction can facilitate the dissemination of diseases, promote tumor progression and metastasis, and enable escape from the immune system (Hoshino A, et al. Nature. 2015 527:329-35; Kahlert C, et al. J Mol Med. 2013 91:431-7; Skog J, et al. Nat Cell Biol. 2008 10:1470-6; Abusamra A J, et al. Blood Cells Mol Dis. 2005 35:169-73). EVs are generated through cell exocytosis originating from the fusion of multi-vesicular bodies with the plasma membrane (Thery C, et al. Nat Rev Immunol. 2002 2:569-79; Colombo M, et al. Annu Rev Cell Dev Biol. 2014 30:255-89; Keller S, et al. Immunol Lett. 2006 107:102-8). Here, we study how fatty acylation modulates the encapsulation of proteins into EVs. As disclosed herein, the encapsulation of SFK members into EVs is regulated by myristoylation, palmitoylation, and Src kinase activity, and the encapsulation process involves the syntenin-ESCRT mediated biogenesis pathway.
[0078] Materials and Methods
[0079] Plasmids
[0080] Lentiviral vectors expressing Src(WT), Src(G2A), Src(Y529F), Src(Y529F/G2A), Src(S3C/S6C), Fyn(WT), Fyn(G2A), or Fyn(C3S/C6S) were constructed in the FUCRW parental lentiviral vector as previously reported (Kim S, et al. J Biol Chem. 2017; Cai H, et al. Proc Natl Acad Sci USA. 2011 108:6579-84). The shRNA construct for knockdown of Src kinase was generated in a previous study (Kim S, et al. Cancer Res. 2017 77:6950-62). Two lentiviral vectors expressing shRNA-TSG101 were obtained from Sigma Aldrich. The sequence of shRNA-TSG101-1 was 5′-CCGGACTGGACACATACCCATATAACTCGAGTTATATGGGTATGTGTCCAGTTTTTTG-3′ (SEQ ID NO:7) and the sequence of shRNA-TSG101-2 was 5′-CCGGGCCTTATAGAGGTAATACATACTCGAGTATGTATTACCTCTATAAGGCTTTTG-3′ (SEQ ID NO:8). Lentiviruses were generated from these lentiviral vectors to create stable cell lines. Lentiviral production followed the guidelines of the University of Georgia.
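By way of illustration only, the hairpin organization of the two shRNA sequences above can be checked computationally. The following Python sketch assumes a layout that is inferred from the sequences themselves rather than stated in the text: a CCGG overhang, a sense stem, a CTCGAG loop, an antisense stem of the same length as the sense stem, and a trailing run of T residues. The function names are hypothetical.

```python
# Sequences copied from SEQ ID NO:7 and SEQ ID NO:8 as given in the text.
SHRNA_TSG101_1 = "CCGGACTGGACACATACCCATATAACTCGAGTTATATGGGTATGTGTCCAGTTTTTTG"
SHRNA_TSG101_2 = "CCGGGCCTTATAGAGGTAATACATACTCGAGTATGTATTACCTCTATAAGGCTTTTG"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(_COMPLEMENT)[::-1]


def split_hairpin(oligo: str, overhang: str = "CCGG", loop: str = "CTCGAG"):
    """Split an oligo into (sense, antisense) stems around the loop.

    Assumed layout (inferred, not stated in the source):
    overhang + sense + loop + antisense + terminator.
    The antisense stem is taken to be as long as the sense stem.
    """
    if not oligo.startswith(overhang):
        raise ValueError("oligo does not start with the expected overhang")
    loop_start = oligo.find(loop, len(overhang))
    if loop_start < 0:
        raise ValueError("loop sequence not found")
    sense = oligo[len(overhang):loop_start]
    antisense_start = loop_start + len(loop)
    antisense = oligo[antisense_start:antisense_start + len(sense)]
    return sense, antisense


def stems_are_complementary(oligo: str) -> bool:
    """True if the antisense stem is the reverse complement of the sense stem."""
    sense, antisense = split_hairpin(oligo)
    return antisense == reverse_complement(sense)


if __name__ == "__main__":
    for name, seq in [("shRNA-TSG101-1", SHRNA_TSG101_1),
                      ("shRNA-TSG101-2", SHRNA_TSG101_2)]:
        sense, antisense = split_hairpin(seq)
        print(name, sense, antisense, stems_are_complementary(seq))
```

A check of this kind is one way to catch transcription errors in a stem before an oligonucleotide is ordered; it says nothing about knockdown efficiency.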
[0081] Cell Lines
[0082] SYF1 (Src.sup.−/−Fyn.sup.−/−Yes.sup.−/−), 3T3, and human prostate cancer cell lines including DU145, PC3, 22Rv1, and LNCaP were purchased from American Type Culture Collection (ATCC). The cells were grown in the medium recommended by ATCC. Mycoplasma contamination was examined periodically. The cells were used up to 20 passages.
[0083] Isolation of EVs and Characterization
[0084] To isolate EVs from the cell culture medium, the cell lines were grown in ATCC recommended medium in a 150-mm petri dish. After the cells reached 90% confluence, the medium was replaced with fresh medium containing 5% exosome-free FBS (Life Technology Inc.), and the cells were grown in a 37° C. incubator with 5% CO.sub.2 for another 24 h. The conditioned medium was collected for EV isolation. Specifically, the conditioned medium was sequentially centrifuged at 4° C. at 300×g for 10 min, 2,000×g for 10 min, and 10,000×g for 30 min to remove live cells, dead cells, and cell debris, respectively. The supernatant was further ultracentrifuged at 100,000×g at 4° C. for 90 min. The EV pellet was re-suspended in 1×PBS to wash out the residual medium, and re-centrifuged at 100,000×g at 4° C. for 90 min. The pelleted EVs were re-suspended either in RIPA buffer for protein analysis or in 1×PBS for Dynamic Light Scattering (DLS) analysis. The size, zeta potential, and concentration of EVs were measured by nanoparticle tracking analysis (NTA, Particle Metrix, Germany) with ZetaView software for data recording and analysis.
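By way of illustration only, the centrifugation series above is specified in relative centrifugal force (×g), whereas many instruments are set in revolutions per minute. The following Python sketch lists the steps from the text and converts between the two using the standard relation RCF = 1.118×10.sup.−5 × r × rpm.sup.2, with r in centimeters. The rotor radius of 10 cm used in the example is purely an assumption, and the function names are hypothetical; the actual radius depends on the rotor used.

```python
import math

# (purpose, relative centrifugal force in x g, duration in minutes), from the text.
STEPS = [
    ("remove live cells", 300, 10),
    ("remove dead cells", 2_000, 10),
    ("remove cell debris", 10_000, 30),
    ("pellet EVs", 100_000, 90),
    ("wash EVs in PBS and re-pellet", 100_000, 90),
]

# Standard conversion constant for radius in cm and speed in rpm.
_K = 1.118e-5


def rpm_for_rcf(rcf: float, radius_cm: float) -> float:
    """Rotor speed (rpm) needed to reach a given RCF at a given radius."""
    return math.sqrt(rcf / (_K * radius_cm))


def rcf_for_rpm(rpm: float, radius_cm: float) -> float:
    """RCF (x g) produced by a given rotor speed at a given radius."""
    return _K * radius_cm * rpm ** 2


if __name__ == "__main__":
    radius_cm = 10.0  # hypothetical rotor radius; substitute the real value
    for purpose, rcf, minutes in STEPS:
        rpm = rpm_for_rcf(rcf, radius_cm)
        print(f"{rcf:>7} x g for {minutes:>2} min ({purpose}): about {rpm:,.0f} rpm")
```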
[0085] Protein Concentration Determination
[0086] The protein concentration of EVs and cell lysates was determined by detergent compatible (DC) protein assay (Bio-Rad Laboratories). The total cell lysates (TCL) and EVs were dissolved in RIPA buffer [50 mM Tris-base (pH 7.4), 1% NP-40, 0.50% sodium deoxycholate, 0.1% SDS, 150 mM NaCl, 2 mM EDTA and protease inhibitor (1×)] and the manufacturer's protocol was followed.
[0087] Antibodies and Western Blotting Analysis
[0088] The total cell lysate and EVs dissolved in RIPA buffer were subjected to standard immunoblotting analysis. The following antibodies were used: rabbit anti-Src (Cat #: 2109), rabbit anti-calnexin (Cat #: 2679), rabbit anti-CD-9 (Cat #: 13403 for human species, Cat #: 2118 for mouse species), rabbit anti-GAPDH (Cat #: 13403), rabbit anti-Fyn (Cat #: 4023), rabbit anti-FAK (Cat #: 13009), and rabbit anti-CD81 (Cat #: 10037), all purchased from Cell Signaling Technology; rabbit anti-RFP (Cat #: 600-401-379, Rockland Inc); rabbit anti-AR (Cat #: sc-816, Santa Cruz Biotechnology); and secondary anti-rabbit IgG HRP antibody (Cat #: 7074, Cell Signaling Technology). All antibodies were used at the manufacturer's recommended dilution. The band intensity was quantified with ImageJ software.
[0089] Determination of Myristoylated Src Kinase by Click Chemistry
[0090] Cells expressing Src kinase were grown until 90% confluence in EMEM medium with 5% FBS. The medium was replaced with EMEM medium containing exosome-free FBS and 50 μM of myristic acid-azide (an analog of myristic acid) and the cells were grown for another 24 h. The conditioned medium was collected and used for EVs isolation as described above. The cells or EVs were lysed in M-PER buffer (Thermo Scientific) containing protease inhibitors and phosphatase inhibitors. The cell lysates or EVs lysate (10 μg protein) were added to a working solution containing biotin-alkyne (0.1 mM), CuSO.sub.4 (1 mM), TCEP (1 mM) and TBTA (0.1 mM) and incubated at room temperature for 1 h. After the Click reaction, the samples were mixed with loading dye and boiled at 95° C. for 5 min. The lysates were subjected to SDS-PAGE and transferred to a nitrocellulose membrane. After blocking with 5% milk overnight, the membrane was incubated with High Sensitivity Streptavidin-HRP (catalog No. 21130, ThermoFisher Scientific) at room temperature for 1 h. Myristoylated proteins (e.g., myristoylated Src kinase) were detected by ECL.
[0091] Lipid Raft Disruption
[0092] PC3 and DU145 cells were grown overnight. The medium was replaced with the same growth medium but containing EVs/exosome-free FBS with DMSO (control) or Filipin III (0-1 μM) for 24 h to disrupt lipid rafts. The EVs were isolated from the conditioned medium by sequential centrifugation as described above. The isolated EVs and cells were lysed with RIPA buffer for immunoblotting analysis.
[0093] Xenograft Tumors and EVs Isolation and Characterization from the Plasma
[0094] All animal studies were approved by the Institutional Animal Care and Use Committee (IACUC) of the University of Georgia. To establish the xenograft tumors, DU145 cells were transduced with control, Src(Y529F), or Src(Y529F/G2A) by lentiviral infection. Male SCID mice at the age of 8-10 weeks were randomly divided into 4 groups. The transduced cells were implanted into the sub-renal capsule of SCID mice. The mice were routinely examined and euthanized after 5 weeks of incubation. The xenograft tumors and the blood from the host were collected for further analysis.
[0095] After centrifugation at 2,000×g for 10 min, the supernatant from the collected blood samples was collected. The plasma EVs were isolated by the Exoquick kit according to manufacturer's instructions (Cat #: EXOQ5A-1, System Biosciences). The isolated EVs were re-suspended in PBS buffer for characterization of size and zeta potential by DLS with zetasizer (Malvern, USA). The isolated EVs were lysed in RIPA buffer for Western blot analysis.
[0096] Identification of Myristoylated Proteins by Bioinformatics
[0097] To identify potential myristoylated proteins in the mammalian genome, the UniProt database was searched using the keyword “myristate” and the filters “Reviewed” and “Homo sapiens”. The search returned 194 results, which were downloaded for further analysis. The protein sequences were analyzed, and any sequence lacking a glycine at the second position was removed from the list. The remaining 182 proteins were cross-checked against the EV data from the NCI-60 cell lines and grouped by the number of cell lines in whose EVs each protein appeared, with 60 being the highest and 0 being the lowest (Hurwitz S N, et al. Oncotarget. 2016 7:86999; Khoury G A, et al. Sci Rep. 2011 1:90; Consortium U. Nucleic Acids Res. 2016 45:D158-D69).
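The glycine filter described above can be illustrated with a minimal Python sketch. It assumes the downloaded entries are held as a mapping from accession to amino acid sequence; the function names and the entry "X00000" are hypothetical and are used only for illustration, while the two real N-terminal sequences are taken from Table 1.

```python
def has_gly2(sequence: str) -> bool:
    """Return True if the second residue (after the initiator Met) is glycine."""
    return len(sequence) >= 2 and sequence[1] == "G"


def filter_gly2(entries: dict) -> dict:
    """Keep only the entries whose sequence has glycine at position 2."""
    return {acc: seq for acc, seq in entries.items() if has_gly2(seq)}


# N-terminal sequences of SRC (P12931) and ARF1 (P84077) as listed in Table 1.
# "X00000" is a hypothetical entry without Gly2, included to show the removal step.
candidates = {
    "P12931": "MGSNKSKPKDASQRRRSLEPAENVHGAGGG",
    "P84077": "MGNIFANLFKGLFGKKEMRILMVGLDAAGK",
    "X00000": "MASNKSKPKD",
}

kept = filter_gly2(candidates)
print(sorted(kept))
```

The sketch covers only the position-2 check; the keyword search and the grouping by appearance in the NCI-60 data are separate steps.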
[0098] A literature review focusing on the proteomic analysis of EVs uncovered three published studies on thymic, breast milk, and urine EVs: “Characterization of human thymic exosomes”, “Comprehensive Proteomic Analysis of Human Milk-derived Extracellular Vesicles Unveils a Novel Functional Proteome Distinct from Other Milk Components”, and “Proteomic analysis of urine exosomes by multidimensional protein identification technology (MudPIT)” (Wang Z, et al. Proteomics. 2012 12:329-38; van Herwijnen M J, et al. Mol Cell Proteomics. 2016 15:3412-23; Skogberg G, et al. PloS one. 2013 8:e67554). The 182 proteins taken from the Uniprot database were checked against the EVs data from each of the three studies, and their appearances in each of the three studies were recorded.
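As a rough illustration of how the appearances could be recorded, the following Python sketch looks up each candidate accession in the protein sets reported by each study. The function and variable names are assumptions, the protein sets are small excerpts of Tables 3-5 rather than the full data, and "X00000" is a hypothetical accession that appears in none of them.

```python
def record_appearances(candidate_ids, study_proteomes):
    """Map each candidate accession to the sorted list of studies that detected it."""
    return {
        acc: sorted(study for study, ids in study_proteomes.items() if acc in ids)
        for acc in candidate_ids
    }


# Small excerpts of Tables 3-5 (not the complete EV proteomes).
study_proteomes = {
    "breast milk": {"P18085", "P62330"},
    "thymus": {"P18085", "P12931"},
    "urine": {"P18085", "P12931"},
}

# "X00000" is a hypothetical accession used to show a protein found in no study.
candidate_ids = ["P18085", "P12931", "X00000"]

appearances = record_appearances(candidate_ids, study_proteomes)
for acc, studies in appearances.items():
    print(acc, len(studies), studies)
```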
[0099] Statistical Analysis
[0100] The data are presented as mean±SEM (standard error of the mean). Data with more than two groups were analyzed by one-way ANOVA with a post hoc Tukey test in GraphPad Prism software, and two groups were compared by an unpaired Student's t-test. * p<0.05; ** p<0.01; *** p<0.001; NS: not significant.
[0101] Hematoxylin and Eosin (H&E) Staining
[0102] The tissue samples were fixed in PBS-buffered 10% formaldehyde. The samples were paraffin-embedded, sectioned at 4 μm thickness on a Leica RM2235 rotary microtome, and mounted on microscope slides (catalog No. 12-550-15, Fisher Scientific). Paraffin-embedded sections were treated as follows: 100% xylene for 5 min (3×) to deparaffinize, 100% ethanol for 2 min (2×) to rehydrate, 95% ethanol for 2 min (2×), 75% ethanol for 2 min (2×), and then a thorough rinse in distilled water (3×). The sections were stained in Ehrlich's Hematoxylin for 5 min and washed with distilled water (3×), differentiated by 5-6 quick dips in acid alcohol (0.3%), and washed thoroughly with distilled water (3×). The tissue sections were dipped into Scott's Tap Solution for 2 min and rinsed thoroughly with distilled water (3×), counterstained in Eosin solution for 2 min, washed with distilled water (3×), and dehydrated in 95% alcohol for 5 dips (2×) and 100% alcohol for 5 dips (2×). After xylene clearing for 1 min (3×), the tissue sections were mounted with a coverslip in mounting medium.
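The summary statistic and the significance labels defined above can be sketched in Python as follows. This is only an illustration using the standard library: the ANOVA, Tukey test, and t-test themselves were run in GraphPad Prism and are not reproduced here, and the function names and sample values are hypothetical.

```python
import math
import statistics


def mean_sem(values):
    """Return (mean, SEM), where SEM = sample standard deviation / sqrt(n)."""
    mean = statistics.fmean(values)
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean, sem


def significance_label(p_value):
    """Map a p-value to the labels used in this document."""
    if p_value < 0.001:
        return "***"
    if p_value < 0.01:
        return "**"
    if p_value < 0.05:
        return "*"
    return "NS"


# Hypothetical replicate measurements, for illustration only.
mean, sem = mean_sem([1.0, 2.0, 3.0])
print(f"{mean:.2f} ± {sem:.2f}")
print(significance_label(0.03))
```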
[0103] Immunohistochemistry (IHC) Staining
[0104] Tissue sections (4 μm thick) on microscope slides were baked for 60 min at 65° C., deparaffinized in 100% xylene for 5 min (2×), and rehydrated in 100% ethanol for 5 min (2×), 95% ethanol for 5 min (2×), and 70% ethanol for 5 min. After washing with PBS for 10 min (3×), the tissue slides were heated in 0.01 M citrate buffer (pH 6.0) in a steamer cooker in a microwave at 60% power for 15 min and then at 10% power. After cooling, the tissue slides were washed with PBS for 10 min (2×). The tissues were circled with a PAP Pen liquid blocker (Part #6505, Newcomer Supply). 300 μL of 0.3% H.sub.2O.sub.2 in distilled water was added to each tissue spot for 5-10 min, and the slides were then washed with PBS for 10 min (3×). The tissues were blocked in 2.5% goat serum in PBS for 1 h at room temperature, and then incubated with primary Src antibody (1:250) in PBST overnight at 4° C. The tissue slides were washed with PBST for 10 min (3×), and then incubated with secondary antibody (Cat: M7401) in PBST at room temperature for 1 h. After washing with PBS for 10 min (3×), the tissue slides were incubated with DAB solution (catalog No. SK-4100) for development. As soon as brown color appeared under a microscope, the reaction was stopped by dipping the slide into distilled water. The development time was kept the same for control and treatment samples. The tissue slides were stained in Hematoxylin for 1 min and washed with distilled water (3×), then immersed in NaHCO.sub.3 solution for 3 min and washed with distilled water (3×). The tissue slides were dehydrated again in a series of alcohol solutions (75%, 95%, and 100% ethanol, each for 5 min, 2×), and then air dried for 10 min. After treatment with xylene for 5 min (2×), the tissue sections were air dried for 10 min and mounted with mounting medium and a coverslip.
[0105] Detection of Palmitoylation by Click Chemistry
[0106] Cells expressing Src kinase were grown until 90% confluence in the EMEM medium with 5% FBS. The medium was replaced with the EMEM medium containing exosome-free FBS and 50 μM of myristic acid-azide (an analog of myristic acid) and the cells were grown for another 24 h. The conditioned medium was collected and used for isolation of extracellular vesicles (EVs) by ultracentrifugation. The cells or EVs were lysed in M-PER buffer (Thermo Scientific) containing protease inhibitors and phosphatase inhibitors. The cell lysates or EV lysates (10 μg protein) were added to a working solution containing biotin-alkyne (0.1 mM), CuSO.sub.4 (1 mM), TCEP (1 mM) and TBTA (0.1 mM) and incubated at room temperature for 1 h. After the Click reaction, the samples were mixed with loading dye and boiled at 95° C. for 5 min. The lysates were subjected to SDS-PAGE and transferred to a nitrocellulose membrane. After blocking with 5% milk overnight, the membrane was incubated with High Sensitivity Streptavidin-HRP (catalog No. 21130, ThermoFisher Scientific) at room temperature for 1 h. Myristoylated proteins (e.g., myristoylated Src kinase) were detected by ECL.
[0107] Results
[0108] The appearance frequency of myristoylated proteins is elevated in extracellular vesicles.
[0109] The N-terminal glycine (Gly2) is required for protein myristoylation after removal of methionine by methionine aminopeptidase. By searching the mammalian genome for proteins that fit the essential myristoylation requirement, 182 potentially myristoylated proteins were identified (Hurwitz S N, et al. Oncotarget. 2016 7:86999; Khoury G A, et al. Sci Rep. 2011 1:90; Consortium U. Nucleic Acids Res. 2016 45:D158-D69). Given a total of about 20,000 proteins in a mammalian cell, the percentage of myristoylated proteins accounts for about 0.9% of the mammalian genome (
TABLE-US-00008 TABLE 1 182 potential myristoylated proteins in mammalian cells and their appearance frequency in extracellular vesicles of 60 cancer cell lines Appearance frequency in Protein 60 cancer ID Gene Name N-terminus sequence cell lines P84077 ARF1 MGNIFANLFKGLFGKKEMRILMVGLDAAGK (SEQ ID NO: 9) 60 P18085 ARF4 ARF2 MGLTISSLFSRLFGKKQMRILMVGLDAAGK (SEQ ID NO: 10) 60 P62330 ARF6 MGKVLSKIFGNKEMWILMLGLDAAGKTTIL (SEQ ID NO: 11) 60 P04899 GNAI2 GNAI2B MGCTVSAEDKAAAERSKMIDKNLREDGEKA (SEQ ID NO: 12) 60 P08754 GNAI3 MGCTLSAEDKAAVERSKMIDRNLREDGEKA (SEQ ID NO: 13) 60 P62241 RPS8 OK/SW-cl.83 MGISRDNWHKRRKTGGKRKPYHKKRKYELG (SEQ ID NO: 14) 60 Q96TA1 FAM129B C9orf88 MGDVLSTHLDDARRQHIAEKTGKILTEFLQ (SEQ ID NO: 15) 58 Q6IAA8 LAMTOR1 C11orf59 PDRO MGCCYSSENEDSDQDREERKLLLDPSSPPT (SEQ ID NO: 16) 57 PP7157 Q14254 FLOT2 ESA1 M17S1 MGNCHTVGPNEALVVSGGCCGSDYKQYVFG (SEQ ID NO: 17) 56 P84085 ARF5 MGLTVSALFSRIFGKKQMRILMVGLDAAGK (SEQ ID NO: 18) 54 P61313 RPL15 EC45 TCBAP0781 MGAYKYIQELWRKKQSDVMRFLLRVRCWQY (SEQ ID NO: 19) 54 P07947 YES1 YES MGCIKSKENKSPAIKYRPENTPEPVSTSVS (SEQ ID NO: 20) 54 Q9NUQ9 FAM49B BM-009 MGNLLKVLTCTDLEQGPNFFLDFENAQPTE (SEQ ID NO: 21) 52 Q9H4G4 GLIPR2 C9orf19 GAPR1 MGKSASKQFHNEVLKAHNEYRQKHGVPPLK (SEQ ID NO: 22) 52 P63096 GNAI1 MGCTLSAEDKAAVERSKMIDRNLREDGEKA (SEQ ID NO: 23) 52 P36405 ARL3 ARFL3 MGLLSILRKLKSAPDQEVRILLLGLDNAGK (SEQ ID NO: 24) 51 P36404 ARL2 MGLLTILKKMKQKERELRLLMLGLDNAGKT (SEQ ID NO: 25) 50 Q96FZ7 CHMP6 VPS20 MGNLFGRKKQSRVTEQDKAILQLKQQRDKL (SEQ ID NO: 26) 50 Q99653 CHP1 CHP MGSRASTLLRDEELEEIKKETGFSHSQITR (SEQ ID NO: 27) 50 Q8WWI5 SLC44A1 CD92 CDW92 CTL1 MGCCSSASSAAQSSKREWKPLEDRSCTDIP (SEQ ID NO: 28) 49 P07948 LYN JTK8 MGCIKSKGKDSLSDDGVDLKTQPVRNTERT (SEQ ID NO: 29) 47 P49006 MARCKSL1 MLP MRP MGSQSSKAPRGDVTAEEAAGASPAKANGQE (SEQ ID NO: 30) 47 O75695 RP2 MGCFFSKRRKADKESRPENEEERPKQYSWD (SEQ ID NO: 31) 47 P29966 MARCKS MACS PRKCSL MGAQFSKTAAKGEAAAERPGEAAVASSPSK (SEQ ID NO: 32) 46 Q8N9N7 LRRC57 MGNSALRAHVETAQKTGVFQLKDRGLTEFP (SEQ ID NO: 33) 45 P37235 HPCAL1 BDR1 
MGKQNSKLRPEVLQDLRENTEFTDHELQEW (SEQ ID NO: 34) 44 P00387 CYB5R3 DIA1 MGAQLSTLGHMVLFPVWFLYSLLMKLFQRS (SEQ ID NO: 35) 43 Q9NRX5 SERINC1 KIAA1253 TDE1L MGSVLGLCSMASWIPCLCGSAPCLLCRCCP (SEQ ID NO: 36) 42 TDE2 UNQ396/PRO732 P12931 SRC SRC1 MGSNKSKPKDASQRRRSLEPAENVHGAGGG (SEQ ID NO: 37) 42 P40616 ARL1 MGGFFSSIFSSLFGTREMRILILGLDGAGK (SEQ ID NO: 38) 40 P80723 BASP1 NAP22 MGGKLSKKKKGYNVNDEKAKEKDKKAEGAA (SEQ ID NO: 39) 40 Q9NX63 CHCHD3 MIC19 MINOS3 MGGTTSTRRVTFEADENENITVVKGIRLSE (SEQ ID NO: 40) 39 Q96PY5 FMNL2 FHOD2 KIAA1902 MGNAGSMDSQQTDFRAHNVPLKLPMPEPGE (SEQ ID NO: 41) 38 P62166 NCS1 FLUP FREQ MGKSNSKLKPEVVEELTRKTYFTEKEVQQW (SEQ ID NO: 42) 38 Q9BZQ8 FAM 129A C1orf24 NIBAN MGGSASSQLDEGKCAYIRGKTEAAIKNFSP (SEQ ID NO: 43) 37 GIG39 Q8NHG7 SVIP MGLCFPCPGESAPPTPDLEEKRAKLAEAAE (SEQ ID NO: 44) 37 Q9Y3E7 CHMP3 CGI149 NEDF VP524 MGLFGKTQEKPPKELVNEWSLKIRKEMRVV (SEQ ID NO: 45) 35 CGI-149 Q99828 CIB1 CIB KIP PRKDCIP MGGSGSRLSKELLAEYQDLTFLTKQEILLA (SEQ ID NO: 46) 32 P17612 PRKACA PKACA MGNAAAAKKGSEQESVKEFLAKAKEDFLKK (SEQ ID NO: 47) 31 Q8ND76 CCNY C10orf9 CBCP1 CFP1 MGNTTSCCVSSSPKLRRNAHSRLESYRPDT (SEQ ID NO: 48) 30 Q9H8Y8 GORASP2 GOLPH6 MGSSQSVEIPGGGTEGYHVLRVQENSPGHR (SEQ ID NO: 49) 29 Q99570 PIK3R4 VPS15 MGNQLAGIAPSQILSVESYFSDIHDFEYDK (SEQ ID NO: 50) 28 Q14699 RFTN1 KIAA0084 MIG2 MGCGLNKLEKRDEKRPGNIYSTLKRPQVET (SEQ ID NO: 51) 25 Q7L014 DDX46 KIAA0801 MGRESRHYRKRSASRGRSGSRSRSRSPSDK (SEQ ID NO: 52) 24 O60936 NOL3 ARC NOP MGNAQERPSETIDRERKRLVETLQADSGLL (SEQ ID NO: 53) 24 P08473 MME EPN MGKSESQMDITDINTPKPKKKQRWTPLEIS (SEQ ID NO: 54) 22 P22694 PRKACB MGNAATAKKGSEVESVKEFLAKAKEDFLKK (SEQ ID NO: 55) 22 Q8IV36 HID1 C17orf28 DMC1 MGSTDSKLNFRKAVIQLTTKTQPVEATDDA (SEQ ID NO: 56) 21 Q8IVF7 FMNL3 FHOD3 FRL2 MGNLESAEGVPGEPPSVPLLLPPGKMPMPE (SEQ ID NO: 57) 19 KIAA2014 WBP3 O15355 PPM1G PPM1C MGAYLSQPNTVKCSGDGVGAPRLPLPYGFS (SEQ ID NO: 58) 19 Q9NUM4 TMEM106B MGKSLSHLPLHSSKEDAYDGVTSENMRNGL (SEQ ID NO: 59) 19 P09471 GNAO1 MGCTLSAEERAALERSKAIEKNLKEDGISA (SEQ ID NO: 60) 17 O75896 TUSC2 C3orf11 FUS1 LGCC 
MGASGSKARGLWPFASAAGGGGSEAAGAEQ (SEQ ID NO: 61) 16 PDAP2 Q9NS886 LANCL2 GPR69B TASP MGETMSKRLKLHLGGEAEMEERAFVNPFPD (SEQ ID NO: 62) 15 Q02952 AKAP12 AKAP250 MGAGSSTEQRSPEQPPEGSSTPAEPEPSGG (SEQ ID NO: 63) 13 P06239 LCK MGCGCSSHPEDDWMENIDVCENCHYPIVPL (SEQ ID NO: 64) 11 P27216 ANXA13 ANX13 MGNRHAKASSPQGFDVDRDAKKLNKACKGM (SEQ ID NO: 65) 10 P06241 FYN MGCVQCKDKEATKLTEERDGSLNQSSGYRY (SEQ ID NO: 66) 10 O00461 GOLIM4 GIMPC GOLPH4 MGNGMCSRKQKRIFQTLLLLTVVFGFLYGA (SEQ ID NO: 67) 9 GPP130 P63098 PPP3R1 CNA2 CNB MGNEASYPLEMCSHFDADEIKRLGKRFKKL (SEQ ID NO: 68) 9 P62760 VSNL1 VISL1 MGKQNSKLAPEVMEDLVKSTEFNEHELKQW (SEQ ID NO: 69) 9 Q8IWE4 DCUN1D3 SCCRO3 MGQCVTKCKNPSSTLGSKNGDREPSNKSHS (SEQ ID NO: 70) 8 P29728 OAS2 MGNGESQLSSVPAQKLGWFIQEYLKPYEEC (SEQ ID NO: 71) 8 O75688 PPM1B PP2CB MGAFLDKPKTEKHNAHGAGNGLRYGLSSMQ (SEQ ID NO: 72) 7 P56559 ARL4C ARL7 MGNISSNISAFQSLHIVMLGLDSAGKTTVL (SEQ ID NO: 73) 6 Q86UY6 NAA40 NAT11 PATT1 MGRKSSKAKEKKQKRLEERAAMDAVCAKVD (SEQ ID NO: 74) 6 Q9ULE6 PALD1 KIAA1274 PALD MGTTASTAQQTVSAGTPFEGLQGSGTMDSR (SEQ ID NO: 75) 6 O43149 ZZEF1 KIAA0399 MGNAPSHSSEDEAAAAGGEGWGPHQDWAAV (SEQ ID NO: 76) 6 Q9BRQ8 AIFM2 AMID PRG3 MGSQVSVESGALHVVIVGGGFGGIAAASQL (SEQ ID NO: 77) 5 Q9YNA8 ERVK-19 MGQTKSKIKSKYASYLSFIKILLKRGGVKV (SEQ ID NO: 78) 5 Q9C0E8 LNPK KIAA1715 LNP MGGLFSRWRTKPSTVEVLESIDKEIQALEE (SEQ ID NO: 79) 5 Q96BS2 TESC CHP3 MGAAHSASEEVRELEGKTGFSSDQIEQLHR (SEQ ID NO: 80) 5 Q9Y250 LZTS1 FEZ1 MGSVSSLISGHSFHSKHCRASQYKLRKSSH (SEQ ID NO: 81) 4 Q969G9 NKD1 NKD PP7246 MGKLHSKPAAVCKRRESPEGDSFAVSAAWA (SEQ ID NO: 82) 4 Q9Y3C5 RNF11 CGI-123 MGNCLKSPTSDDISLLHESQSDRASFGEGT (SEQ ID NO: 84) 4 Q8NHG8 ZNRF2 RNF202 MGAKQSGPAAANGRTRAYSGSDLPSSSSGG (SEQ ID NO: 85) 4 O15121 DEGS1 DES1 MLD MIG15 MGSRVSREDFEWVYTDQPHADRRREILAKY (SEQ ID NO: 86) 3 Q8WU20 FRS2 MGSCCSCPDKDTVPDNHRNKFKVINVDDDG (SEQ ID NO: 87) 3 P08631 HCK MGGRSSCEDPGCPRDEERAPRMGCMKSKFL (SEQ ID NO: 88) 3 Q9P032 NDUFAF4 C6orf66 HRPAP20 MGALVIRGIRNFNLENRAEREISKMKPSVA (SEQ ID NO: 89) 3 HSPC125 My013 P17568 NDUFB7 MGAHLVRRYLGDASVEPDPLQMPTFPPDYG 
(SEQ ID NO: 90) 3 P40617 ARL4A ARL4 MGNGLSDQTSILSNLPSFQSFHIVILGLDC (SEQ ID NO: 91) 2 Q9H0F7 ARL6 BBS3 MGLLDRLSVLLGLKKKEVHVLCLGLDNSGK (SEQ ID NO: 92) 2 Q9BSF0 C2orf88 MGCMKSKQTFPFPTIYEGEKQHESEEPFMP (SEQ ID NO: 93) 2 Q9BRQ6 CHCHD6 CHCM1 MIC25 MGSTESSEGRRVSFGVDEEERVRVLQGVRL (SEQ ID NO: 94) 2 Q7L9B9 EEPD1 KIAA1706 MGSTLGCHRSIPRDPSDLSHSRKFSAACNF (SEQ ID NO: 95) 2 P63130 ERVK-7 MGQTKSKIKSKYASYLSFIKILLKRGGVKV (SEQ ID NO: 96) 2 P19086 GNAZ MGCRQSSEEKEAARRSRRIDRHLRSESQRQ (SEQ ID NO: 97) 2 Q9Y6M0 PSMC1 MGARGALLLALLLARAGLRKPESQEAAPLS (SEQ ID NO: 98) 2 P19087 GNAT2 GNATC MGSGASAEDKELAKRSKELEKKLQEDADKE (SEQ ID NO: 99) 1 A8MTJ3 GNAT3 MGSGISSESKESAKRSKELEKKLQEDAERD (SEQ ID NO: 100) 1 O60291 MGRN1 KIAA0544 RNF156 MGSILSRRIAGVEDIDIQANSAYRYPPKSG (SEQ ID NO: 101) 1 Q6BDI9 REP15 MGQKASQQLALKDSKEVPVVCEVVSEAIVH (SEQ ID NO: 102) 1 Q52LD8 RFTN2 C2orf11 MGCGLRKLEDPDDSSPGKIFSTLKRPQVET (SEQ ID NO: 103) 1 Q8IZE3 SCYL3 PACE1 MGSENSALKSYTLREPPFTLPSGLAVYPAV (SEQ ID NO: 104) 1 Q9H6Q3 SLA2 C20orf156 SLAP2 MGSLPSRRKSLPSPSLSSSVQGQGPVTMEA (SEQ ID NO: 105) 1 O75716 STK16 MPSK1 PKL12 TSF1 MGHALCVCSRGTVIIDNKRYLFIQKLGEGG (SEQ ID NO: 106) 1 Q99487 PAFAH2 MGVNQSVGFPPVTGPHLVGCGDVMEGQNLQ (SEQ ID NO: 107) 0 P42684 ABL2 ABLL ARG MGQQVGRVGEAPGLQQPQPRGIRGSSAARP (SEQ ID NO: 108) 0 O43687 AKAP7 AKAP15 AKAP18 MGQLCCFPFSRDEGKISELESSSSAVLQRY (SEQ ID NO: 109) 0 Q9P2G1 ANKIB1 KIAA1386 MGNTTTKFRKALINGDENLACQIYENNPQL (SEQ ID NO: 110) 0 P61204 ARF3 MGNIFGNLLKSLIGKKEMRILMVGLDAAGK (SEQ ID NO: 111) 0 Q969Q4 ARL11 ARLTS1 MGSVNSRGHKAEAQVVMMGLDSAGKTTLLY (SEQ ID NO: 112) 0 Q8N4G2 ARL14 ARF7 MGSLGSKNPQTKQAQVLLLGLDSAGKSTLL (SEQ ID NO: 113) 0 Q8IVW1 ARL17A ARL17P1; ARL17B MGNIFEKLFKSLLGKKKMRILILSLDTAG (SEQ ID NO: 114) 0 ARF1P2 ARL17A PRO2667 P49703 ARL4D ARF4L MGNHLTEMAPTASSFLPHFQALHVVVIGLD (SEQ ID NO: 115) 0 Q9Y689 ARL5A ARFLP5 ARL5 MGILFTRIWRLFNHQEHKVIIVGLDNAGKT (SEQ ID NO: 116) 0 Q96KC2 ARL5B ARL8 MGLIFAKLWSLFCNQEHKVIIVGLDNAGKT (SEQ ID NO: 117) 0 A6NH57 ARL5C ARL12 MGQLIAKLMSIFGNQEHTVIIVGLDNEGKT (SEQ ID NO: 118) 0 Q8WXS3 BAALC 
MGCGGSRADAIEPRYYESWTRETESTWLTY (SEQ ID NO: 119) 0 P51451 BLK MGLVSSKKPDKEKPIKEKDKGQWSPLKVSA (SEQ ID NO: 120) 0 Q969J3 BORCS5 LOH12CR1 MGSEQSSEAESRPNDLNSSVTPSPAKHRAK (SEQ ID NO: 121) 0 Q9UPA5 BSN KIAA0434 ZNF231 MGNEVSLEGGAGDGPLPPGGAGPGPGPGPG (SEQ ID NO: 122) 0 Q9P203 BTBD7 KIAA1525 MGANASNYPHSCSPRVGGNSQAQQTFIGTS (SEQ ID NO: 123) 0 A6NGG8 C2orf71 MGCTPSHSDLVNSVAKSGIQFLKKPKAIRP (SEQ ID NO: 124) 0 Q9NZU7 CABP1 MGGGDGAAFKRPGDGARLQRVLGLGSRREP (SEQ ID NO: 125) 0 Q9NPB3 CABP2 MGNCAKRPWRRGPKDPLQWLGSPPRGSCPS (SEQ ID NO: 126) 0 A6NI79 CCDC69 MGCRHSRLSSCKPPKKKRQEPEPEQPPRPE (SEQ ID NO: 127) 0 Q15078 CDK5R1 CDK5R NCK5A MGTVLSLSPSYRKATLFEDGAATVGHYTAV (SEQ ID NO: 128) 0 Q13319 CDK5R2 NCK5A1 MGTVLSLSPASSAKGRRPGGLPEEKKKAPP (SEQ ID NO: 129) 0 O43745 CHP2 HCA520 MGSRSSHAAVIPDGDSIRRETGFSQASLLR (SEQ ID NO: 130) 0 Q717R9 CYS1 MGSGSSRSSRTLRRRRSPESLPAGPGAAAL (SEQ ID NO: 131) 0 Q6QHC5 DEGS2 C14orf66 MGNSASRSDFEWVYTDQPHTQRRKEILAKY (SEQ ID NO: 132) 0 Q9NRW4 DUSP22 JSP1 LMWDSP2 MGNGMNKILPGLYIGNFKDARDAEQLSKNK (SEQ ID NO: 133) 0 MKPX Q7RTS9 DYM MGSNSSRIGDLPKNEYLKKLSGTESISEND (SEQ ID NO: 134) 0 P16452 EPB42 E42P MGQALGIKSCDFQAARNNEEHHTKALSSRR (SEQ ID NO: 135) 0 P87889 ERVK-10 MGQTKSKIKSKYASYLSFIKILLKRGGVKV (SEQ ID NO: 136) 0 P62683 ERVK-21 MGQTKSKIKSKYASYLSFIKILLKRGGVKV (SEQ ID NO: 137) 0 P63145 ERVK-24 MGQTKSKIKSKYASYLSFIKILLKRGGVKV (SEQ ID NO: 138) 0 Q9HDB9 ERVK-5 ERVK5 MGQTKSKTKSKYASYLSFIKILLKRGGVRV (SEQ ID NO: 139) 0 Q7LDI9 ERVK-6 ERVK6 MGQTKSKIKSKYASYLSFIKILLKRGGVKV (SEQ ID NO: 140) 0 P62685 ERVK-8 MGQTKSKIKSKYASYLSFIKILLKRGGVKV (SEQ ID NO: 141) 0 P63126 ERVK-9 MGQTKSKIKSKYASYLSFIKILLKRGGVKV (SEQ ID NO: 142) 0 P63128 ERVK-9 MGQTKSKIKSKYASYLSFIKILLKRGGVKV (SEQ ID NO: 143) 0 P09769 FGR SRC2 MGCVFCKKLEPVATAKEDAGLEGDFRSYGA (SEQ ID NO: 144) 0 O95466 FMNL1 C17orf1 C17orf1B MGNAAGSAEQPAGPAAPPPKQPAPPKQPMP (SEQ ID NO: 145) 0 FMNL FRL1 O43559 FRS3 MGSCCSCLNRDSVPDNHPTKFKVTNVDDEG (SEQ ID NO: 146) 0 P11488 GNAT1 GNATR MGCTLSAEDKAAVERSKMIDRNLREDGEKA (SEQ ID NO: 147) 0 Q9BQQ3 GORASP1 GOLPH5 GRASP65 
MGLGVSAEQPAGGAEGFHLHGVQENSPAQQ (SEQ ID NO: 148) 0 P43080 GUCA1A C6orf131 GCAP MGNVMEGKSVEELSSTECHQWYKKFMTECP (SEQ ID NO: 149) 0 GCAP1 GUCA1 Q9UMX6 GUCA1B GCAP2 MGQEFSWEEAEAAGEIDVAELQEWYKKFVM (SEQ ID NO: 150) 0 O95843 GUCA1C GCAP3 MGNGKSIAGDQKAVPTQETHVWYRTFMMEY (SEQ ID NO: 151) 0 P53701 HCCS CCHL MGLSPSAPAVAVQASNASASPPSGCPMHEG (SEQ ID NO: 152) 0 P62684 HERVK_113 MGQTKSKIKSKYASYLSFIKILLKRGGVKV (SEQ ID NO: 153) 0 Q8TB92 HMGCLL1 MGNVPSAVKHCLSYQQLLREHLWIGDSVAG (SEQ ID NO: 154) 0 P84074 HPCA BDR2 MGKQNSKLRPEMLQDLRENTEFSELELQEW (SEQ ID NO: 155) 0 Q9UM19 HPCAL4 MGKTNSKLAPEVLEDLVQNTEFSEQELKQW (SEQ ID NO: 156) 0 P63252 KCNJ2 IRK1 MGSVRTNRYSIVSSEEDGMKLATMAVANGF (SEQ ID NO: 157) 0 Q6VT66 MARC1 MOSC1 MGAAGSSALARFVLLAQSRPGWLGVAALGL (SEQ ID NO: 158) 0 P61601 NCALD MGKQNSKLRPEVMQDLLESTDFTEHEIQEW (SEQ ID NO: 159) 0 O76050 NEURL1 NEURL NEURL1A MGNNFSSIPSLPRGNPSRAPRGHPQNLKDS (SEQ ID NO: 160) 0 RNF67 Q969F2 NKD2 MGKLQSKHAAAARKRRESPEGDSFVASAYA (SEQ ID NO: 161) 0 P29474 NOS3 MGNLKSVAQEPGPPCGLGLGLGLGLCGKQG (SEQ ID NO: 162) 0 Q7Z494 NPHP3 KIAA2000 MGTASSLVSPAGGEVIEDTYGAGGGEACEI (SEQ ID NO: 163) 0 Q6X4W1 NSMF NELF LRSEAMSSVAAKVRAARAFG (SEQ ID NO: 164) 0 Q96MG8 PCMTD1 MGGAVSAGEDNDDLIDNLKEAQYIRTERVE (SEQ ID NO: 165) 0 Q9NV79 PCMTD2 C20orf36 MGGAVSAGEDNDELIDNLKEAQYIRTELVE (SEQ ID NO: 166) 0 O00408 PDE2A MGQACGHSILCRSQQYPAARPAEPRGQQVF (SEQ ID NO: 167) 0 Q9UPV7 PHF24 KIAA1045 MGVLMSKRQTVEQVQKVSLAVSAFKDGLRD (SEQ ID NO: 168) 0 Q494U1 PLEKHN1 MGNSHCVPQAPRRLRASFSRKPSLKGNRED (SEQ ID NO: 169) 0 P35813 PPM1A PPPM1A MGAFLDKPKMEKHNAQGQGNGLRYGLSSMQ (SEQ ID NO: 170) 0 Q96LZ3 PPP3R2 CBLP PPP3RL MGNEASYPAEMCSHFDNDEIKRLGRRFKKL (SEQ ID NO: 171) 0 Q9Y478 PRKAB1 AMPK MGNTSSERAALERHGGHKTPRRDSSGGTKD (SEQ ID NO: 172) 0 P22612 PRKACG MGNAPAKKDTEQEESVNEFLAKARGDFLYR (SEQ ID NO: 173) 0 Q13237 PRKG2 PRKGR2 MGNGSVKPKHSKHPDGHSGNLTTDALRNKV (SEQ ID NO: 174) 0 Q9NR22 PRMT8 HRMT1L3 HRMT1L4 MGMKHSSRCLLLRRKMAENAAESTEVNSPP (SEQ ID NO: 175) 0 P11801 PSKH1 MGCGTSKVLPEPPKDVQLDLVKKVEPFSGT (SEQ ID NO: 176) 0 Q13702 RAPSN RNF205 
MGQDQTKQQIEKGLQLYQSNQTEKALQVWT (SEQ ID NO: 177) 0 P35243 RCVRN RCV1 MGNSKSGALSKEILEELQLNTKFSEEELCS (SEQ ID NO: 178) 0 Q96EQ8 RNF125 MGSVLSTDSGKSAPASATARALERRRDPEL (SEQ ID NO: 179) 0 Q8WVD5 RNF141 ZNF230 MGQQISDQTQLVINKLPEKVAKHVTLVRES (SEQ ID NO: 180) 0 Q96PX1 RNF157 KIAA1917 MGALTSRQHAGVEEVDIPSNSVYRYPPKSG (SEQ ID NO: 181) 0 Q13239 SLA SLAP SLAP1 MGNSMKSTPAPAERPLPNPEGLDSDFLAVL (SEQ ID NO: 182) 0 Q8WU08 STK32A YANK1 MGANTSRKPPVFDENEDVNFDHFEILRAIG (SEQ ID NO: 183) 0 H3BQB6 STMND1 MGCGPSQPAEDRRRVRAPKKGWKEEFKADV (SEQ ID NO: 184) 0 Q13009 TIAM1 MGNAESQHVEHEFYGEKHASLGRKHTSRSL (SEQ ID NO: 185) 0 Q81VF5 TIAM2 KIAA2016 STEF MGNSDSQYTLQGSKNHSNTITGAKQIPCSL (SEQ ID NO: 186) 0 Q86XR7 TICAM2 TIRAP3 TIRP TRAM MGIGKSKINSCPLSLSWGKRHSVDTSPGYH (SEQ ID NO: 187) 0 Q6P9B6 TLDC1 KIAA1609 MGNSRSRVGRSFCSQFLPEEQAEIDQLFDA (SEQ ID NO: 188) 0 Q9BVX2 TMEM106C EMOC MGSQHSAAARPSSCRRKQEDDRDGLLAERE (SEQ ID NO: 189) 0 P98073 TMPRSS15 ENTK PRSS7 MGSKRGISSRHHSLSSYEIMFAALFAILVV (SEQ ID NO: 190) 0 Q8ND25 ZNRF1 NIN283 MGGKQSTAARSRGPFPGVSTDDSAVPPPGG (SEQ ID NO: 191) 0
TABLE-US-00009 TABLE 2 The number of detected proteins and potentially myristoylated proteins in extracellular vesicles of 60 cancer cell lines
Organ | Cell line | Number of detected proteins in exosomes | Number of detected potentially myristoylated proteins in exosomes | Appearance frequency of myristoylated proteins in exosomes (%)
Leukemia | SR | 1772 | 28 | 1.58
Kidney | TK-10 | 1880 | 31 | 1.65
Leukemia | RPMI-8226 | 1694 | 29 | 1.71
Lung | HOP-62 | 1740 | 30 | 1.72
Lung | NCI-H322M | 1208 | 21 | 1.74
Leukemia | K562 | 2155 | 38 | 1.76
Kidney | A498 | 2536 | 45 | 1.77
Melanoma | LOX IMVI | 2382 | 43 | 1.81
Kidney | ACHN | 1486 | 27 | 1.82
Kidney | UO-31 | 1427 | 26 | 1.82
Breast | MCF7 | 2299 | 42 | 1.83
Lung | HOP-92 | 1525 | 28 | 1.84
Colon | HT29 | 2059 | 38 | 1.85
Ovary | OVCAR-3 | 2245 | 42 | 1.87
Ovary | OVCAR-4 | 2717 | 51 | 1.88
Leukemia | MOLT-4 | 2020 | 38 | 1.88
Lung | EKVX | 1136 | 22 | 1.94
Ovary | IGROV1 | 1699 | 33 | 1.94
Breast | T-47D | 2092 | 41 | 1.96
Leukemia | HL-60 | 1678 | 33 | 1.97
Breast | BT549 | 2269 | 45 | 1.98
Lung | NCI-H522 | 1608 | 32 | 1.99
Melanoma | SK-MEL-5 | 2225 | 45 | 2.02
Melanoma | UACC-62 | 1728 | 35 | 2.03
Breast | MDA-MB-468 | 2377 | 49 | 2.06
Colon | KM12 | 2423 | 50 | 2.06
Colon | Colo205 | 2545 | 53 | 2.08
Leukemia | CCRF-CEM | 2331 | 49 | 2.10
Kidney | RXF 393 | 1830 | 39 | 2.13
Lung | A549 | 1868 | 40 | 2.14
Melanoma | SK-MEL-2 | 2262 | 49 | 2.17
Ovary | SK-OV-3 | 1569 | 34 | 2.17
Colon | HCT-15 | 2476 | 54 | 2.18
Kidney | 786-O | 1442 | 32 | 2.22
Lung | NCI-H23 | 1663 | 37 | 2.22
Colon | HCT-116 | 2510 | 56 | 2.23
Colon | SW620 | 2691 | 61 | 2.27
Melanoma | M14 | 1409 | 32 | 2.27
Lung | NCI-H226 | 1755 | 40 | 2.28
Ovary | OVCAR-5 | 2000 | 46 | 2.30
Melanoma | MALME-3M | 2074 | 48 | 2.31
Lung | NCI-H460 | 1336 | 31 | 2.32
Kidney | CAKI | 1401 | 33 | 2.36
Breast | MDA-MB-231 | 2237 | 53 | 2.37
CNS | SF295 | 2041 | 49 | 2.40
Melanoma | SK-MEL-28 | 1817 | 44 | 2.42
Colon | HCC 2998 | 1841 | 45 | 2.44
CNS | U251 | 1862 | 46 | 2.47
Melanoma | UACC-257 | 1940 | 48 | 2.47
CNS | SNB-19 | 1857 | 46 | 2.48
Ovary | NCI-ADR-RES | 2341 | 58 | 2.48
CNS | SF539 | 1761 | 44 | 2.50
Prostate | PC-3 | 1558 | 39 | 2.50
Prostate | DU145 | 1274 | 32 | 2.51
CNS | SNB-75 | 1909 | 48 | 2.51
CNS | SF268 | 1819 | 46 | 2.53
Kidney | SN12C | 1716 | 44 | 2.56
Ovary | OVCAR-8 | 2005 | 53 | 2.64
Melanoma | MDA-MB-435 | 1680 | 45 | 2.68
Breast | HS 578T | 1228 | 34 | 2.77
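The appearance frequencies in Table 2 are consistent with a simple percentage: the number of detected potentially myristoylated proteins divided by the number of detected proteins. The Python sketch below reproduces the lowest and highest rows and compares them with the approximately 0.9% genome-wide baseline (182 of about 20,000 proteins). The function and variable names are illustrative assumptions, not part of the original analysis.

```python
def appearance_frequency(n_myristoylated: int, n_detected: int) -> float:
    """Percentage of detected proteins that are potentially myristoylated."""
    return 100.0 * n_myristoylated / n_detected


# Genome-wide baseline: 182 candidates out of about 20,000 proteins.
baseline = appearance_frequency(182, 20000)

# Lowest and highest rows of Table 2.
sr = appearance_frequency(28, 1772)        # Leukemia, SR
hs578t = appearance_frequency(34, 1228)    # Breast, HS 578T

for name, freq in [("SR", sr), ("HS 578T", hs578t)]:
    print(f"{name}: {freq:.2f}% ({freq / baseline:.1f}-fold over baseline)")
```

Even the lowest row exceeds the baseline, which is the sense in which the appearance frequency is described as elevated in EVs.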
TABLE-US-00010 TABLE 3 The potential myristoylated proteins detected in extracellular vesicles of breast milk. Protein The peptide sequence in the N-terminus of in ID Gene Name potential myristoylated proteins P18085 ARF4 ARF2 MGLTISSLFSRLFGKKQMRILMVGLDAAGK (SEQ ID NO: 192) P62330 ARF6 MGKVLSKIFGNKEMWILMLGLDAAGKTTIL (SEQ ID NO: 193) P04899 GNAI2 GNAI2B MGCTVSAEDKAAAERSKMIDKNLREDGEKA (SEQ ID NO: 194) P08754 GNAI3 MGCTLSAEDKAAVERSKMIDRNLREDGEKA (SEQ ID NO: 195) Q96TA1 FAM129B C9orf88 MGDVLSTHLDDARRQHIAEKTGKILTEFLQ (SEQ ID NO: 196) Q6IAA8 LAMTOR1 C11orf59 PDRO PP7157 MGCCYSSENEDSDQDREERKLLLDPSSPPT (SEQ ID NO: 197) Q14254 FLOT2 ESA1 M17S1 MGNCHTVGPNEALVVSGGCCGSDYKQYVFG (SEQ ID NO: 198) P84085 ARF5 MGLTVSALFSRIFGKKQMRILMVGLDAAGK (SEQ ID NO: 199) P61313 RPL15 EC45 TCBAP0781 MGAYKYIQELWRKKQSDVMRFLLRVRCWQY (SEQ ID NO: 200) P07947 YES1 YES MGCIKSKENKSPAIKYRPENTPEPVSTSVS (SEQ ID NO: 201) Q9NUQ9 FAM49B BM-009 MGNLLKVLTCTDLEQGPNFFLDFENAQPTE (SEQ ID NO: 202) Q9H4G4 GLIPR2 C9orf19 GAPR1 MGKSASKQFHNEVLKAHNEYRQKHGVPPLK (SEQ ID NO: 203) P63096 GNAI1 MGCTLSAEDKAAVERSKMIDRNLREDGEKA (SEQ ID NO: 204) P36405 ARL3 ARFL3 MGLLSILRKLKSAPDQEVRILLLGLDNAGK (SEQ ID NO: 205) Q96FZ7 CHMP6 VPS20 MGNLFGRKKQSRVTEQDKAILQLKQQRDKL (SEQ ID NO: 206) Q99653 CHP1 CHP MGSRASTLLRDEELEEIKKETGFSHSQITR (SEQ ID NO: 207) Q8WWI5 SLC44A1 CD92 CDW92 CTL1 MGCCSSASSAAQSSKREWKPLEDRSCTDIP (SEQ ID NO: 208) P07948 LYN JTK8 MGCIKSKGKDSLSDDGVDLKTQPVRNTERT (SEQ ID NO: 209) P49006 MARCKSL1 MLP MRP MGSQSSKAPRGDVTAEEAAGASPAKANGQE (SEQ ID NO: 210) O75695 RP2 MGCFFSKRRKADKESRPENEEERPKQYSWD (SEQ ID NO: 211) P29966 MARCKS MACS PRKCSL MGAQFSKTAAKGEAAAERPGEAAVASSPSK (SEQ ID NO: 212) Q8N9N7 LRRC57 MGNSALRAHVETAQKTGVFQLKDRGLTEFP (SEQ ID NO: 213) P37235 HPCAL1 BDR1 MGKQNSKLRPEVLQDLRENTEFTDHELQEW (SEQ ID NO: 214) Q9NRX5 SERINC1 KIAA1253 TDE1L TDE2 MGSVLGLCSMASWIPCLCGSAPCLLCRCCP (SEQ ID NO: 215) UNQ396/PRO732 P40616 ARL1 MGGFFSSIFSSLFGTREMRILILGLDGAGK (SEQ ID NO: 216) P80723 BASP1 NAP22 MGGKLSKKKKGYNVNDEKAKEKDKKAEGAA (SEQ ID NO: 217) 
Q96PY5 FMNL2 FHOD2 KIAA1902 MGNAGSMDSQQTDFRAHNVPLKLPMPEPGE (SEQ ID NO: 218) Q9BZQ8 FAM129A C1orf24 NIBAN GIG39 MGGSASSQLDEGKCAYIRGKTEAAIKNFSP (SEQ ID NO: 219) Q8NHG7 SVIP MGLCFPCPGESAPPTPDLEEKRAKLAEAAE (SEQ ID NO: 220) Q9Y3E7 CHMP3 CGI149 NEDF VPS24 CGI-149 MGLFGKTQEKPPKELVNEWSLKIRKEMRVV (SEQ ID NO: 221) Q99828 CIB1 CIB KIP PRKDCIP MGGSGSRLSKELLAEYQDLTFLTKQEILLA (SEQ ID NO: 222) P17612 PRKACA PKACA MGNAAAAKKGSEQESVKEFLAKAKEDFLKK (SEQ ID NO: 223) Q8ND76 CCNY C10orf9 CBCP1 CFP1 MGNTTSCCVSSSPKLRRNAHSRLESYRPDT (SEQ ID NO: 224) O00461 GOLIM4 GIMPC GOLPH4 GPP130 MGNGMCSRKQKRIFQTLLLLTVVFGFLYGA (SEQ ID NO: 225) Q8NHG8 ZNRF2 RNF202 MGAKQSGPAAANGRTRAYSGSDLPSSSSGG (SEQ ID NO: 226) P40617 ARL4A ARL4 MGNGLSDQTSILSNLPSFQSFHIVILGLDC (SEQ ID NO: 227) O60291 MGRN1 KIAA0544 RNF156 MGSILSRRIAGVEDIDIQANSAYRYPPKSG (SEQ ID NO: 228) Q9P2G1 ANKIB1 KIAA1386 MGNTTTKFRKALINGDENLACQIYENNPQL (SEQ ID NO: 229) P61204 ARF3 MGNIFGNLLKSLIGKKEMRILMVGLDAAGK (SEQ ID NO: 230) P35813 PPM1A PPPM1A MGAFLDKPKMEKHNAQGQGNGLRYGLSSMQ (SEQ ID NO: 231) Q9Y478 PRKAB1 AMPK MGNTSSERAALERHGGHKTPRRDSSGGTKD (SEQ ID NO: 232)
TABLE-US-00011 TABLE 4 The potential myristoylated proteins detected in exosomes of human thymus Protein The peptide sequence in the N-terminus of in ID Gene Name potential myristoylated proteins Q02952 AKAP12 AKAP250 MGAGSSTEQRSPEQPPEGSSTPAEPEPSGG (SEQ ID NO: 233) P84077 ARF1 MGNIFANLFKGLFGKKEMRILMVGLDAAGK (SEQ ID NO: 234) P18085 ARF4 ARF2 MGLTISSLFSRLFGKKQMRILMVGLDAAGK (SEQ ID NO: 235) P84085 ARF5 MGLTVSALFSRIFGKKQMRILMVGLDAAGK (SEQ ID NO: 236) P62330 ARF6 MGKVLSKIFGNKEMWILMLGLDAAGKTTIL (SEQ ID NO: 237) P40616 ARL1 MGGFFSSIFSSLFGTREMRILILGLDGAGK (SEQ ID NO: 238) P36405 ARL3 ARFL3 MGLLSILRKLKSAPDQEVRILLLGLDNAGK (SEQ ID NO: 239) P80723 BASP1 NAP22 MGGKLSKKKKGYNVNDEKAKEKDKKAEGAA (SEQ ID NO: 240) Q96FZ7 CHMP6 VPS20 MGNLFGRKKQSRVTEQDKAILQLKQQRDKL (SEQ ID NO: 241) P00387 CYB5R3 DIA1 MGAQLSTLGHMVLFPVWFLYSLLMKLFQRS (SEQ ID NO: 242) Q7L014 DDX46 KIAA0801 MGRESRHYRKRSASRGRSGSRSRSRSPSDK (SEQ ID NO: 243) Q9BZQ8 FAM129A C1orf24 NIBAN GIG39 MGGSASSQLDEGKCAYIRGKTEAAIKNFSP (SEQ ID NO: 244) Q96TA1 FAM129B C9orf88 MGDVLSTHLDDARRQHIAEKTGKILTEFLQ (SEQ ID NO: 245) Q9NUQ9 FAM49B BM-009 MGNLLKVLTCTDLEQGPNFFLDFENAQPTE (SEQ ID NO: 246) Q14254 FLOT2 ESA1 M17S1 MGNCHTVGPNEALVVSGGCCGSDYKQYVFG (SEQ ID NO: 247) Q96PY5 FMNL2 FHOD2 KIAA1902 MGNAGSMDSQQTDFRAHNVPLKLPMPEPGE (SEQ ID NO: 248) P06241 FYN MGCVQCKDKEATKLTEERDGSLNQSSGYRY (SEQ ID NO: 249) Q9H4G4 GLIPR2 C9orf19 GAPR1 MGKSASKQFHNEVLKAHNEYRQKHGVPPLK (SEQ ID NO: 250) P63096 GNAI1 MGCTLSAEDKAAVERSKMIDRNLREDGEKA (SEQ ID NO: 251) P04899 GNAI2 GNAI2B MGCTVSAEDKAAAERSKMIDKNLREDGEKA (SEQ ID NO: 252) P08754 GNAI3 MGCTLSAEDKAAVERSKMIDRNLREDGEKA (SEQ ID NO: 253) Q9H8Y8 GORASP2 GOLPH6 MGSSQSVEIPGGGTEGYHVLRVQENSPGHR (SEQ ID NO: 254) P08631 HCK MGGRSSCEDPGCPRDEERAPRMGCMKSKFL (SEQ ID NO: 255) P37235 HPCAL1 BDR1 MGKQNSKLRPEVLQDLRENTEFTDHELQEW (SEQ ID NO: 256) P06239 LCK MGCGCSSHPEDDWMENIDVCENCHYPIVPL (SEQ ID NO: 257) Q8N9N7 LRRC57 MGNSALRAHVETAQKTGVFQLKDRGLTEFP (SEQ ID NO: 258) P07948 LYN JTK8 MGCIKSKGKDSLSDDGVDLKTQPVRNTERT (SEQ ID NO: 259) P29966 MARCKS 
MACS PRKCSL MGAQFSKTAAKGEAAAERPGEAAVASSPSK (SEQ ID NO: 260) P49006 MARCKSL1 MLP MRP MGSQSSKAPRGDVTAEEAAGASPAKANGQE (SEQ ID NO: 261) P08473 MME EPN MGKSESQMDITDINTPKPKKKQRWTPLEIS (SEQ ID NO: 262) P29728 OAS2 MGNGESQLSSVPAQKLGWFIQEYLKPYEEC (SEQ ID NO: 263) Q99570 PIK3R4 VPS15 MGNQLAGIAPSQILSVESYFSDIHDFEYDK (SEQ ID NO: 264) P17612 PRKACA PKACA MGNAAAAKKGSEQESVKEFLAKAKEDFLKK (SEQ ID NO: 265) P22694 PRKACB MGNAATAKKGSEVESVKEFLAKAKEDFLKK (SEQ ID NO: 266) Q14699 RFTN1 KIAA0084 MIG2 MGCGLNKLEKRDEKRPGNIYSTLKRPQVET (SEQ ID NO: 267) O75695 RP2 MGCFFSKRRKADKESRPENEEERPKQYSWD (SEQ ID NO: 268) P61313 RPL15 EC45 TCBAP0781 MGAYKYIQELWRKKQSDVMRFLLRVRCWQY (SEQ ID NO: 269) P62241 RPS8 OK/SW-cl.83 MGISRDNWHKRRKTGGKRKPYHKKRKYELG (SEQ ID NO: 270) Q8WWI5 SLC44A1 CD92 CDW92 CTL1 MGCCSSASSAAQSSKREWKPLEDRSCTDIP (SEQ ID NO: 271) P12931 SRC SRC1 MGSNKSKPKDASQRRRSLEPAENVHGAGGG (SEQ ID NO: 272) P07947 YES1 YES MGCIKSKENKSPAIKYRPENTPEPVSTSVS (SEQ ID NO: 273) O43149 ZZEF1 KIAA0399 MGNAPSHSSEDEAAAAGGEGWGPHQDWAAV (SEQ ID NO: 274) P61204 ARF3 MGNIFGNLLKSLIGKKEMRILMVGLDAAGK (SEQ ID NO: 275) O95466 FMNL1 C17orf1 C17orf1B FMNL FRL1 MGNAAGSAEQPAGPAAPPPKQPAPPKQPMP (SEQ ID NO: 276) P11488 GNAT1 GNATR MGCTLSAEDKAAVERSKMIDRNLREDGEKA (SEQ ID NO: 277) P61601 NCALD MGKQNSKLRPEVMQDLLESTDFTEHEIQEW (SEQ ID NO: 278) O00408 PDE2A MGQACGHSILCRSQQYPAARPAEPRGQQVF (SEQ ID NO: 279) Q9NR22 PRMT8 HRMT1L3 HRMT1L4 MGMKHSSRCLLLRRKMAENAAESTEVNSPP (SEQ ID NO: 280)
TABLE-US-00012 TABLE 5 The potential myristoylated proteins detected in extracellular vesicles of human urine.
Protein ID | Gene Name | The peptide sequence in the N-terminus of potential myristoylated proteins
Q9BRQ8 | AIFM2 AMID PRG3 | MGSQVSVESGALHVVIVGGGFGGIAAASQL (SEQ ID NO: 281)
Q02952 | AKAP12 AKAP250 | MGAGSSTEQRSPEQPPEGSSTPAEPEPSGG (SEQ ID NO: 282)
P27216 | ANXA13 ANX13 | MGNRHAKASSPQGFDVDRDAKKLNKACKGM (SEQ ID NO: 283)
P84077 | ARF1 | MGNIFANLFKGLFGKKEMRILMVGLDAAGK (SEQ ID NO: 284)
P18085 | ARF4 ARF2 | MGLTISSLFSRLFGKKQMRILMVGLDAAGK (SEQ ID NO: 285)
P84085 | ARF5 | MGLTVSALFSRIFGKKQMRILMVGLDAAGK (SEQ ID NO: 286)
P62330 | ARF6 | MGKVLSKIFGNKEMWILMLGLDAAGKTTIL (SEQ ID NO: 287)
P36405 | ARL3 ARFL3 | MGLLSILRKLKSAPDQEVRILLLGLDNAGK (SEQ ID NO: 288)
Q9H0F7 | ARL6 BBS3 | MGLLDRLSVLLGLKKKEVHVLCLGLDNSGK (SEQ ID NO: 289)
P80723 | BASP1 NAP22 | MGGKLSKKKKGYNVNDEKAKEKDKKAEGAA (SEQ ID NO: 290)
Q8ND76 | CCNY C10orf9 CBCP1 CFP1 | MGNTTSCCVSSSPKLRRNAHSRLESYRPDT (SEQ ID NO: 291)
Q9Y3E7 | CHMP3 CGI149 NEDF VPS24 CGI-149 | MGLFGKTQEKPPKELVNEWSLKIRKEMRVV (SEQ ID NO: 292)
Q96FZ7 | CHMP6 VPS20 | MGNLFGRKKQSRVTEQDKAILQLKQQRDKL (SEQ ID NO: 293)
Q99653 | CHP1 CHP | MGSRASTLLRDEELEEIKKETGFSHSQITR (SEQ ID NO: 294)
Q99828 | CIB1 CIB KIP PRKDCIP | MGGSGSRLSKELLAEYQDLTFLTKQEILLA (SEQ ID NO: 295)
P00387 | CYB5R3 DIA1 | MGAQLSTLGHMVLFPVWFLYSLLMKLFQRS (SEQ ID NO: 296)
Q9BZQ8 | FAM129A C1orf24 NIBAN GIG39 | MGGSASSQLDEGKCAYIRGKTEAAIKNFSP (SEQ ID NO: 297)
Q96TA1 | FAM129B C9orf88 | MGDVLSTHLDDARRQHIAEKTGKILTEFLQ (SEQ ID NO: 298)
Q9NUQ9 | FAM49B BM-009 | MGNLLKVLTCTDLEQGPNFFLDFENAQPTE (SEQ ID NO: 299)
Q14254 | FLOT2 ESA1 M17S1 | MGNCHTVGPNEALVVSGGCCGSDYKQYVFG (SEQ ID NO: 300)
P06241 | FYN | MGCVQCKDKEATKLTEERDGSLNQSSGYRY (SEQ ID NO: 301)
Q9H4G4 | GLIPR2 C9orf19 GAPR1 | MGKSASKQFHNEVLKAHNEYRQKHGVPPLK (SEQ ID NO: 302)
P63096 | GNAI1 | MGCTLSAEDKAAVERSKMIDRNLREDGEKA (SEQ ID NO: 303)
P04899 | GNAI2 GNAI2B | MGCTVSAEDKAAAERSKMIDKNLREDGEKA (SEQ ID NO: 304)
P08754 | GNAI3 | MGCTLSAEDKAAVERSKMIDRNLREDGEKA (SEQ ID NO: 305)
P09471 | GNAO1 | MGCTLSAEERAALERSKAIEKNLKEDGISA (SEQ ID NO: 306)
P19086 | GNAZ | MGCRQSSEEKEAARRSRRIDRHLRSESQRQ (SEQ ID NO: 307)
O00461 | GOLIM4 GIMPC GOLPH4 GPP130 | MGNGMCSRKQKRIFQTLLLLTVVFGFLYGA (SEQ ID NO: 308)
P08631 | HCK | MGGRSSCEDPGCPRDEERAPRMGCMKSKFL (SEQ ID NO: 309)
Q8IV36 | HID1 C17orf28 DM01 | MGSTDSKLNFRKAVIQLTTKTQPVEATDDA (SEQ ID NO: 310)
P37235 | HPCAL1 BDR1 | MGKQNSKLRPEVLQDLRENTEFTDHELQEW (SEQ ID NO: 311)
Q6IAA8 | LAMTOR1 C11orf59 PDRO PP7157 | MGCCYSSENEDSDQDREERKLLLDPSSPPT (SEQ ID NO: 312)
P06239 | LCK | MGCGCSSHPEDDWMENIDVCENCHYPIVPL (SEQ ID NO: 313)
Q8N9N7 | LRRC57 | MGNSALRAHVETAQKTGVFQLKDRGLTEFP (SEQ ID NO: 314)
P29966 | MARCKS MACS PRKCSL | MGAQFSKTAAKGEAAAERPGEAAVASSPSK (SEQ ID NO: 315)
P49006 | MARCKSL1 MLP MRP | MGSQSSKAPRGDVTAEEAAGASPAKANGQE (SEQ ID NO: 316)
O60291 | MGRN1 KIAA0544 RNF156 | MGSILSRRIAGVEDIDIQANSAYRYPPKSG (SEQ ID NO: 317)
P08473 | MME EPN | MGKSESQMDITDINTPKPKKKQRWTPLEIS (SEQ ID NO: 318)
O75688 | PPM1B PP2CB | MGAFLDKPKTEKHNAHGAGNGLRYGLSSMQ (SEQ ID NO: 319)
P17612 | PRKACA PKACA | MGNAAAAKKGSEQESVKEFLAKAKEDFLKK (SEQ ID NO: 320)
P22694 | PRKACB | MGNAATAKKGSEVESVKEFLAKAKEDFLKK (SEQ ID NO: 321)
Q14699 | RFTN1 KIAA0084 MIG2 | MGCGLNKLEKRDEKRPGNIYSTLKRPQVET (SEQ ID NO: 322)
O75695 | RP2 | MGCFFSKRRKADKESRPENEEERPKQYSWD (SEQ ID NO: 323)
P62241 | RPS8 OK/SW-cl.83 | MGISRDNWHKRRKTGGKRKPYHKKRKYELG (SEQ ID NO: 324)
Q9NRX5 | SERINC1 KIAA1253 TDE1L TDE2 UNQ396/PRO732 | MGSVLGLCSMASWIPCLCGSAPCLLCRCCP (SEQ ID NO: 325)
Q8WWI5 | SLC44A1 CD92 CDW92 CTL1 | MGCCSSASSAAQSSKREWKPLEDRSCTDIP (SEQ ID NO: 326)
P12931 | SRC SRC1 | MGSNKSKPKDASQRRRSLEPAENVHGAGGG (SEQ ID NO: 327)
Q8NHG7 | SVIP | MGLCFPCPGESAPPTPDLEEKRAKLAEAAE (SEQ ID NO: 328)
P07947 | YES1 YES | MGCIKSKENKSPAIKYRPENTPEPVSTSVS (SEQ ID NO: 329)
Q9P2G1 | ANKIB1 KIAA1386 | MGNTTTKFRKALINGDENLACQIYENNPQL (SEQ ID NO: 330)
P61204 | ARF3 | MGNIFGNLLKSLIGKKEMRILMVGLDAAGK (SEQ ID NO: 331)
Q9P203 | BTBD7 KIAA1525 | MGANASNYPHSCSPRVGGNSQAQQTFIGTS (SEQ ID NO: 332)
Q717R9 | CYS1 | MGSGSSRSSRTLRRRRSPESLPAGPGAAAL (SEQ ID NO: 333)
Q7Z494 | NPHP3 KIAA2000 | MGTASSLVSPAGGEVIEDTYGAGGGEACEI (SEQ ID NO: 334)
P35813 | PPM1A PPPM1A | MGAFLDKPKMEKHNAQGQGNGLRYGLSSMQ (SEQ ID NO: 335)
Q9Y478 | PRKAB1 AMPK | MGNTSSERAALERHGGHKTPRRDSSGGTKD (SEQ ID NO: 336)
Q13237 | PRKG2 PRKGR2 | MGNGSVKPKHSKHPDGHSGNLTTDALRNKV (SEQ ID NO: 337)
P11801 | PSKH1 | MGCGTSKVLPEPPKDVQLDLVKKVEPFSGT (SEQ ID NO: 338)
Q6P9B6 | TLDC1 KIAA1609 | MGNSRSRVGRSFCSQFLPEEQAEIDQLFDA (SEQ ID NO: 339)
[0110] Src Kinase is Detected and/or Enriched in EVs of Prostate Cancer Cells.
[0111] Src kinase has been well known to be myristoylated (Kim S, et al. Cancer Res. 2017 77:6950-62; Patwardhan P, et al. Mol Cell Biol. 2010 30:4094-107). To examine how myristoylation contributes to the encapsulation of a protein into EVs, we focused on Src kinase in EVs of four prostate cancer cell lines including PC3, DU145, LNCaP, and 22Rv1 cells. The average size of EVs derived from these cell lines was about 140 nm, and the size distribution showed no significant difference (
[0112] Myristoylation Mediates the Encapsulation of Src Kinase into EVs.
[0113] To examine the role of myristoylation in the encapsulation of Src kinase, four cell lines including DU145, NIH 3T3, SYF1, and 22Rv1 were transduced with wild type Src [Src(WT)] or Src(G2A), a mutant with loss of myristoylation by lentiviral infection (
[0114] To further analyze if Src protein in EVs was myristoylated, DU145 cells expressing vector control, Src(WT), or Src(G2A) were cultured in medium containing myristic acid-azide (MA-azide, an analog of myristic acid). As expected, the endogenous Src levels in EVs were increased in comparison with those in total cell lysate (
[0115] An Increase of Src Kinase Activity Enhances its Encapsulation into EVs.
[0116] Src(Y529F) is a constitutively active Src kinase mutant (
[0117] Palmitoylation Inhibits the Encapsulation of Proteins into EVs.
[0118] Some SFK members such as Fyn kinase are both myristoylated and palmitoylated at the N-terminus (Resh M D. Cell. 1994 76:411-3; Aicart-Ramos C, et al. 2011 1808:2981-94). A goal was set to study the role of palmitoylation in the regulation of protein encapsulation into EVs. A mutant with a gain of palmitoylation sites, Src(S3C/S6C), and a mutant with a loss of palmitoylation sites, Fyn(C3S/C6S), were previously created (
[0119] Myristoylation Mediates the Encapsulation of Src Kinase into Plasma EVs.
[0120] To further investigate if myristoylation mediates Src encapsulation into plasma EVs in vivo, DU145 cells or DU145 cells expressing vector control, Src(Y529F), or Src(Y529F/G2A) were implanted sub-renally into SCID mice. The isolated plasma EVs were characterized as mono-dispersed particles with an average size of ~100 nm and a zeta potential of −25 mV. The size and zeta potential were not significantly different among EVs isolated from xenograft-free mice, or mice carrying DU145 xenografts expressing control vector, Src(Y529F/G2A), or Src(Y529F) (
[0121] To exclude the possibility that higher Src levels in the plasma EVs were due to larger tumor size of Src(Y529F) induced xenograft tumors, ten times more DU145 cells or DU145 cells expressing Src(Y529F/G2A) were implanted relative to those expressing Src(Y529F). Similar to the previous experiment, the size and zeta potential were not significantly different among the plasma EVs in the different groups (
[0122] The encapsulation of Src kinase into EVs is mediated through the ESCRT pathway, not the lipid rafts pathway.
[0123] Lipid rafts are membrane-associated microdomains enriched with cholesterol and saturated phospholipids like sphingolipids. Lipid rafts provide one of the essential pathways mediating the encapsulation of proteins into EVs (Tan S S, et al. J Extracell Vesicles. 2013 2:22614; Trajkovic K, et al. Science. 2008 319:1244-7). To examine if lipid rafts mediate the encapsulation of Src kinase into EVs, cells were treated with Filipin III, a lipid raft disruption agent, and cholesterol levels significantly decreased (
[0124] Syntenin is an important mediator of EV biogenesis and is also enriched in EVs. Over-expression of Src(Y529F) in DU145 cells significantly increased levels of syntenin in EVs (
[0125] Syntenin is involved in multi-vesicular body (MVB) formation and ESCRT-mediated biogenesis (Thery C, et al. Nat Rev Immunol. 2002 2:569-79). To further study if Src encapsulation into EVs is regulated by the ESCRT pathway, TSG101, an essential protein in the ESCRT pathway, was knocked down in PC3 or 22Rv1 cells. Down-regulation of TSG101 did not change cellular levels of Src protein, but significantly decreased its levels in EVs (
[0126] Discussion
[0127] The disclosed studies have demonstrated that myristoylation mediates the encapsulation of Src kinase into EVs. Myristoylation is one of the important lipid modifications for a panel of proteins (Resh M D. Biochimica et biophysica acta. 1999 1451:1-16). At least 182 proteins, which account for about 0.9% of the mammalian genome, possess an N-terminal glycine that is required for myristoylation. As shown herein, these potentially myristoylated proteins occur more frequently in EVs according to proteomic studies. Among the identified proteins, Src kinase is experimentally confirmed to be myristoylated (Kim S, et al. J Biol Chem. 2017). Src kinase is detected and/or enriched in EVs from all four tested prostate cancer cell lines, which is consistent with a report about expression levels of Src kinase in EVs (DeRita R M, et al. J Cell Biochem. 2017 118:66-73). Loss of myristoylation significantly reduces Src or Fyn levels in EVs. Myristoylation allows for the association of Src kinase with the cell membrane (Kim S, et al. J Biol Chem. 2017), which is important for its incorporation into EVs during their biogenesis. In an analysis of proteins containing a myristoylation epitope fused to the N-terminus of GFP, loss of myristoylation in Acyl(G2A)TyA-GFP and Gag(G2A)TyA-GFP suppresses their encapsulation into the secreted vesicles or HIV virus (Shen B, et al. J Biol Chem. 2011 286:14383-95). Therefore, because myristoylated proteins can be preferentially encapsulated into EVs, this fatty acyl modification might be considered as a strategy for delivery of proteins using EVs.
[0128] Myristoylation facilitating the encapsulation of Src kinase into EVs relies on two intertwined factors. First, myristoylation confers the association of Src kinase with the cell membrane to mediate the protein-protein interactions with other membrane-bound proteins (
[0129] As disclosed herein, encapsulation of Src kinase members into EVs is suppressed by palmitoylation at the N-terminus. Gain of palmitoylation sites in the Src(S3C/S6C) mutant significantly reduced its levels in EVs. In contrast, removal of palmitoylation sites in the Fyn(C3S/C6S) mutant significantly increased Fyn encapsulation into EVs. Loss or gain of palmitoylation in Src family kinase members can potentially change their kinase activity and oncogenic potential (Cai H, et al. Proc Natl Acad Sci USA. 2011 108:6579-84). Therefore, on one hand, suppression of Src encapsulation into EVs by palmitoylation might be due to a reduction of Src kinase activity, thereby inhibiting the activation of the syndecan-syntenin-ESCRT pathway as described above. On the other hand, differential lipidation, that is, myristoylation with or without palmitoylation, could considerably change the localization of SFK members in the cell membrane and their intracellular trafficking pathways (Sato I, et al. J Cell Sci. 2009 122:965-75; Sandilands E, et al. J Cell Sci. 2007 120:2555-64). For example, palmitoylation promotes the localization of SFK members to the lipid raft and caveolae regions of the cell membrane (Shenoy-Scaria A M, et al. J Cell Biol. 1994 126:353-64). Diversion of palmitoylated SFK members such as Fyn kinase toward the caveolae-concentrated domain of the cell membrane could likely regulate their encapsulation into EVs.
[0130] Given that the expression levels or activity of Src kinase are usually dysregulated in numerous cancers including prostate cancer (Irby R B, et al. Oncogene. 2000 19:5636) and metastatic castration resistant prostate cancer (Drake J M, et al. Proc Natl Acad Sci USA. 2013 110:E4762-9), the detection of myristoylated Src in the plasma EVs may potentially serve as an early biomarker for aggressive tumors. The number of EVs in urine or plasma is usually higher in cancer patients and correlates with a high Gleason score and with metastatic prostate cancer (Vlaeminck-Guillem V. Front Oncol. 2018 8:222). Besides the number of EVs, the components of EVs including lipids, proteins, mRNA, microRNA, long non-coding RNAs and others have also been considered as potential biomarkers (Skog J, et al. Nat Cell Biol. 2008 10:1470-6). This study demonstrates that myristoylated proteins, in particular myristoylated Src kinase, could potentially reflect Src-driven xenograft tumors through the detection of Src levels in the plasma EVs. This is supported by the evidence that Src is detected in the plasma EVs of TRAMP mice, a Src-driven prostate tumor progression model (DeRita R M, et al. J Cell Biochem. 2017 118:66-73). Additionally, there is a report that an increase of c-Src levels is observed in EVs from multiple myeloma and immunoglobulin light chain (AL) amyloidosis (Di Noto G, et al. PLoS One. 2013 8:e70811). Future studies should explore whether Src or myristoylated Src levels in the plasma EVs from prostate cancer patients reflect tumor progression, which could potentially provide a biomarker for non-invasively monitoring aggressive prostate cancer.
Example 2: Genetically Engineering Cas9 to Encapsulate the CRISPR System into Extracellular Vesicles by Protein Myristoylation
[0131] Material and Methods
[0132] Plasmid constructs: To create a non-lentiviral vector expressing myristoylated Cas9 (mCas9), Cas9-Guide or Cas9-Scramble CRISPR vectors (OriGene, Rockville, Md., USA) were used as the PCR template. The Src(WT; 8 a.a) forward primer and the mCas9 reverse primer (Table 6) were used to obtain a PCR product that fused the DNA sequence encoding the first eight amino acids of the N-terminus of Src kinase to the N-terminus of the Cas9 gene. The PCR product and the Cas9/sgRNA-Guide or Cas9/sgRNA-Scramble vectors were digested with BglII and BstZ17I. Ligation of the PCR product into the digested parental vector yielded the non-viral vectors mCas9/sgRNA-Guide and mCas9/sgRNA-Scramble. To generate mCas9(G2A) vectors, a PCR product was generated using the mCas9 vector as the DNA template with the Src(G2A; 8 a.a) forward primer and the mCas9 reverse primer. The PCR product was cloned in at the BglII and BstZ17I sites. To generate Cas9/sgRNAs in the bicistronic vector targeting the GFP gene, three sets of sgRNA primers were designed and commercially synthesized (Table 6). The annealed products were cloned into the above vectors between the BamHI and BsmBI sites. As a result, Cas9/sgRNA-GFP, mCas9/sgRNA-GFP, and mCas9(G2A)/sgRNA-GFP were created. All DNA constructs were verified by sequencing.
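The annealing step for the sgRNA oligonucleotides can be checked computationally. The following is a minimal sketch, assuming that each forward oligo in Table 6 carries a 5′ GATC overhang and each reverse oligo a 5′ AAAA overhang, with the remaining bases forming the double-stranded core; this overhang layout is inferred from the listed sequences and is not stated in the text. The function and variable names are hypothetical.

```python
# Sketch: confirm that each sgRNA oligo pair from Table 6 anneals into a
# duplex core flanked by 4-nt 5' overhangs (assumed layout, see lead-in).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def annealed_core(forward: str, reverse: str,
                  fwd_overhang: str = "GATC",
                  rev_overhang: str = "AAAA") -> str:
    """Return the double-stranded core (top strand, 5'->3').

    Raises ValueError if an overhang is missing or the cores do not pair.
    """
    if not forward.startswith(fwd_overhang):
        raise ValueError("forward oligo lacks the expected 5' overhang")
    if not reverse.startswith(rev_overhang):
        raise ValueError("reverse oligo lacks the expected 5' overhang")
    core = forward[len(fwd_overhang):]
    if reverse_complement(reverse[len(rev_overhang):]) != core:
        raise ValueError("oligo cores are not complementary")
    return core


# Oligo sequences copied from Table 6 (SEQ ID NOs: 390-397).
SGRNA_OLIGOS = {
    "sgRNA-GFP1": ("GATCGGGGCGAGGAGCTGTTCACCGG", "AAAACCGGTGAACAGCTCCTCGCCCC"),
    "sgRNA-GFP2": ("GATCGGAGCTGGACGGCGACGTAAAG", "AAAACTTTACGTCGCCGTCCAGCTCC"),
    "sgRNA-GFP3": ("GATCGGGCCACAAGTTCAGCGTGTCG", "AAAACGACACGCTGAACTTGTGGCCC"),
    "sgRNA-Luciferase": ("GATCGACAACTTTACCGACCGCGCCG", "AAAACGGCGCGGTCGGTAAAGTTGTC"),
}

CORES = {name: annealed_core(fwd, rev) for name, (fwd, rev) in SGRNA_OLIGOS.items()}

for name, core in CORES.items():
    print(f"{name}: {len(core)}-bp core {core}")
```

A GATC overhang is compatible with a BamHI-cut end, which is consistent with the cloning sites named above; the check itself only tests base pairing and makes no claim about cloning efficiency.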
TABLE-US-00013 TABLE 6 Primer sequences used for cloning Src mutants, sgRNA-GFP on Cas9 vectors
Gene | Direction | Sequence (5′-3′)
Src (WT; 8 a.a) | Forward | CATAGATCTGCCGCCGCGATCGCCATGGGCAGCAACAAGAGCAAGCCCAAGGATAAGAAATACTCAATAGGACTGGATATTGG (SEQ ID NO: 384)
Src (G2A; 8 a.a) | Forward | CATAGATCTGCCGCCGCGATCGCCATGGCCAGCAACAAGAGCAAGCCCAAGG (SEQ ID NO: 385)
Src (S3C; 8 a.a) | Forward | CATAGATCTGCCGCCGCGATCGCCATGGGCTGCAACAAGAGCAAGCCCAAGG (SEQ ID NO: 386)
Src (S6C; 8 a.a) | Forward | CATAGATCTGCCGCCGCGATCGCCATGGGCAGCAACAAGTGCAAGCCCAAGG (SEQ ID NO: 387)
Src (S3C/S6C) | Forward | CATAGATCTGCCGCCGCGATCGCCATGGGCTGCAACAAGTGCAAGCCCAAGG (SEQ ID NO: 388)
mCas9 | Reverse | CATGTATACCTTCTCCTAGCTGTCCG (SEQ ID NO: 389)
sgRNA-GFP1 | Forward | GATCGGGGCGAGGAGCTGTTCACCGG (SEQ ID NO: 390)
sgRNA-GFP1 | Reverse | AAAACCGGTGAACAGCTCCTCGCCCC (SEQ ID NO: 391)
sgRNA-GFP2 | Forward | GATCGGAGCTGGACGGCGACGTAAAG (SEQ ID NO: 392)
sgRNA-GFP2 | Reverse | AAAACTTTACGTCGCCGTCCAGCTCC (SEQ ID NO: 393)
sgRNA-GFP3 | Forward | GATCGGGCCACAAGTTCAGCGTGTCG (SEQ ID NO: 394)
sgRNA-GFP3 | Reverse | AAAACGACACGCTGAACTTGTGGCCC (SEQ ID NO: 395)
sgRNA-Luciferase | Forward | GATCGACAACTTTACCGACCGCGCCG (SEQ ID NO: 396)
sgRNA-Luciferase | Reverse | AAAACGGCGCGGTCGGTAAAGTTGTC (SEQ ID NO: 397)
Luciferase-T7 | Forward | AAATTGCTTCTGGTGGCGC (SEQ ID NO: 398)
Luciferase-T7 | Reverse | CGTCTTCGTCCCAGTAAGCT (SEQ ID NO: 399)
U6-Cas9 primers | Forward | GGACTATCATATGCTTACCGTAAC (SEQ ID NO: 400)
U6-Cas9 primers | Reverse | CATGTATACCTTCTCCTAGCTGTCCG (SEQ ID NO: 401)
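As a sanity check on the Src forward primers in Table 6, each primer can be translated from its first ATG to confirm the intended N-terminal residues. The sketch below assumes the standard genetic code and that the first ATG in each primer is the start codon; the helper names are hypothetical, and trailing bases that do not complete a codon are ignored.

```python
# Sketch: translate the Src forward primers from Table 6, starting at the
# first ATG, to read out the encoded N-terminal peptide.
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate_from_first_atg(dna: str) -> str:
    """Translate complete codons beginning at the first ATG in `dna`."""
    start = dna.find("ATG")
    if start < 0:
        raise ValueError("no ATG start codon found")
    return "".join(
        CODON_TABLE[dna[i:i + 3]] for i in range(start, len(dna) - 2, 3)
    )


# Primer sequences copied from Table 6 (SEQ ID NOs: 384-388).
SRC_PRIMERS = {
    "Src(WT)": "CATAGATCTGCCGCCGCGATCGCCATGGGCAGCAACAAGAG"
               "CAAGCCCAAGGATAAGAAATACTCAATAGGACTGGATATTGG",
    "Src(G2A)": "CATAGATCTGCCGCCGCGATCGCCATGGCCAGCAACAAGAGCAAGCCCAAGG",
    "Src(S3C)": "CATAGATCTGCCGCCGCGATCGCCATGGGCTGCAACAAGAGCAAGCCCAAGG",
    "Src(S6C)": "CATAGATCTGCCGCCGCGATCGCCATGGGCAGCAACAAGTGCAAGCCCAAGG",
    "Src(S3C/S6C)": "CATAGATCTGCCGCCGCGATCGCCATGGGCTGCAACAAGTGCAAGCCCAAGG",
}

PEPTIDES = {name: translate_from_first_atg(seq) for name, seq in SRC_PRIMERS.items()}

for name, peptide in PEPTIDES.items():
    print(f"{name}: {peptide}")
```

Under these assumptions the wild-type primer reads Met followed by the octapeptide GSNKSKPK and then DKKYSIGLDI, while each mutant primer differs only at the residue positions its name indicates.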
[0133] To generate lentivirus-based Cas9/sgRNA vectors, the FlinkW lentiviral vector was used as a parental vector. First, FlinkW was digested with EcoRI and HpaI. The above non-lentiviral mCas9 or Cas9/sgRNA vectors were digested with EcoRI and PmeI, which generated two DNA fragments, one of 1 kb (EcoRI at both ends) and the other of 4 kb (EcoRI at the 5′-end and PmeI at the 3′-end). The 4 kb fragment was then inserted into the digested FlinkW lentiviral vector. After confirmation by sequencing, the 1 kb fragment was inserted into the resulting vector. In this way, the 5 kb DNA fragment containing mCas9/sgRNA derived from the non-viral vector was cloned into the FlinkW lentiviral vector.
[0134] Additionally, Src(WT), Src(G2A), Src(Y529F), and Src(Y529F/G2A) were cloned into the FUCRW parental lentiviral vector. Lentiviruses were generated from these lentiviral vectors to create stable cell lines.
[0135] Cell lines: SYF1 (Src.sup.−/−Fyn.sup.−/−Yes.sup.−/−), 3T3, and human prostate cancer cell lines including DU145, PC3, 22Rv1, and LNCaP were purchased from American Type Culture Collection (ATCC). The cells were grown in the medium recommended by ATCC. Mycoplasma contamination was examined periodically. The cells were used up to 20 passages.
[0136] Isolation of EVs and characterization: To isolate EVs from the cell culture medium, the cell lines were grown in ATCC recommended medium in a 150-mm petri-dish. After reaching 90% confluence, the medium was replaced with fresh medium containing 5% exosome-free FBS (Life Technology Inc.), and the cells were grown in a 5% CO.sub.2, 37° C. incubator for another 24 h. The conditioned medium was collected for EV isolation. Specifically, the conditioned medium was sequentially centrifuged at 4° C. at 300×g for 10 min, 2,000×g for 10 min, and 10,000×g for 30 min to remove live cells, dead cells, and cell debris, respectively. The supernatant was further ultracentrifuged at 100,000×g at 4° C. for 90 min. The EV pellet was re-suspended in 1×PBS to wash out the residual medium, and re-centrifuged at 100,000×g at 4° C. for 90 min. The pelleted EVs were re-suspended either in RIPA buffer for protein analysis or 1×PBS for Dynamic Light Scattering (DLS) analysis. The size, zeta potential, and concentration of EVs were measured by nanoparticle tracking analysis (NTA, Particle Metrix, Germany) with ZetaView software for data recording and analysis.
[0137] Protein concentration determination: The protein concentration of EVs and cell lysates was determined by detergent compatible (DC) protein assay (Bio-Rad Laboratories). The total cell lysates (TCL) and EVs were dissolved in RIPA buffer [50 mM Tris-base (pH 7.4), 1% NP-40, 0.50% sodium deoxycholate, 0.1% SDS, 150 mM NaCl, 2 mM EDTA and protease inhibitor (1×)] and the manufacturer's protocol was followed.
[0138] Antibodies and Western blotting analysis: The total cell lysate and EVs dissolved in RIPA buffer were subjected to standard immunoblotting analysis. The following antibodies were used: rabbit anti-Src (Cat #: 2109), rabbit anti-calnexin (Cat #: 2679), rabbit anti-CD-9 (Cat #: 13403 for human species, Cat #: 2118 for mouse species), rabbit anti-GAPDH (Cat #: 13403), rabbit anti-Fyn (Cat #: 4023), rabbit anti-FAK (Cat #: 13009), and rabbit anti-CD81 (Cat #: 10037) were purchased from Cell Signaling Technology; rabbit anti-RFP (Cat #: 600-401-379, Rockland Inc), rabbit anti-AR (Cat #: sc-816, Santa Cruz Biotechnology), and secondary antibody anti-rabbit IgG HRP (Cat #: 7074, Cell Signaling Technology) were used at the manufacturer's recommended dilution. The band intensity was quantified with ImageJ software.
[0139] Computational docking analysis: The docking analysis of NMT1 with the first amino acid, and a leading peptide containing the first 2, 3, 4, 5, 6, 7, 8, 9, or 10 amino acids from c-Src, indicates that a peptide with 7-8 amino acids has favorable docking with NMT1 enzyme (lower score).
[0140] NMT1 activity assay: NMT1 catalyzes the transfer of the myristoyl group to the N-terminal glycine of an octapeptide, such as Gly-Ser-Asn-Lys-Ser-Lys-Pro-Lys derived from the leading sequence of Src kinase, designated as Src8(WT), and releases CoA. The released CoA was reacted with 7-diethylamino-3-(4′-maleimidylphenyl)-4-methylcoumarin. The assay was performed in 96-well black microplates. The resulting fluorescence intensity was measured on a Flex Station 3 microplate reader (excitation at 390 nm; emission at 479 nm). To measure the Km and Vmax of NMT1 for octapeptide substrates derived from various proteins, twenty-five octapeptides were synthesized by GenScript. These peptides included Src8(G2A), a mutant octapeptide [Ala-Ser-Asn-Lys-Ser-Lys-Pro-Lys, SEQ ID NO: 383], which is not a substrate of the NMT1 enzyme. Each data point represents three replicates.
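The text does not state how Km and Vmax were derived from the fluorescence readings. As one illustration only, the sketch below assumes Michaelis-Menten kinetics, v = Vmax·[S]/(Km + [S]), and recovers the two parameters by a Lineweaver-Burk (double-reciprocal) least-squares line. The substrate concentrations are hypothetical and the rates are synthetic, noise-free values generated from the Km and Vmax reported for the Src octapeptide in Table 7; they are not measured data.

```python
# Sketch: Michaelis-Menten rate law and a double-reciprocal fit.
# Assumption: the assay follows simple Michaelis-Menten kinetics.
from typing import Sequence, Tuple


def michaelis_menten_rate(substrate: float, km: float, vmax: float) -> float:
    """Initial rate v = Vmax * [S] / (Km + [S])."""
    return vmax * substrate / (km + substrate)


def lineweaver_burk_fit(substrate: Sequence[float],
                        rates: Sequence[float]) -> Tuple[float, float]:
    """Fit 1/v = (Km/Vmax) * (1/[S]) + 1/Vmax by least squares.

    Returns (Km, Vmax). All inputs must be positive.
    """
    xs = [1.0 / s for s in substrate]
    ys = [1.0 / v for v in rates]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    vmax = 1.0 / intercept
    km = slope * vmax
    return km, vmax


# Hypothetical substrate series (uM); rates are synthetic, generated from
# the Src octapeptide values in Table 7 (Km 14.3 uM, Vmax 25.8 uM/min).
CONCENTRATIONS = [2.5, 5.0, 10.0, 20.0, 40.0, 80.0]
SYNTHETIC_RATES = [michaelis_menten_rate(s, 14.3, 25.8) for s in CONCENTRATIONS]

km_fit, vmax_fit = lineweaver_burk_fit(CONCENTRATIONS, SYNTHETIC_RATES)
print(f"Km = {km_fit:.2f} uM, Vmax = {vmax_fit:.2f} uM/min")
```

The double-reciprocal form is used here only because it needs no external libraries; with noisy replicate data, direct nonlinear regression on the rate law is generally the more robust choice.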
[0141] Determination of myristoylated Src kinase by Click chemistry: Cells expressing Src kinase were grown until 90% confluence in EMEM medium with 5% FBS. The medium was replaced with EMEM medium containing exosome-free FBS and 50 μM of myristic acid-azide (an analog of myristic acid) and the cells were grown for another 24 h. The conditioned medium was collected and used for EVs isolation as described above. The cells or EVs were lysed in M-PER buffer (Thermo Scientific) containing protease inhibitors and phosphatase inhibitors. The cell lysates or EVs lysate (10 μg protein) were added to a working solution containing biotin-alkyne (0.1 mM), CuSO.sub.4 (1 mM), TCEP (1 mM) and TBTA (0.1 mM) and incubated at room temperature for 1 h. After the Click reaction, the samples were mixed with loading dye and boiled at 95° C. for 5 min. The lysates were subjected to SDS-PAGE and transferred to a nitrocellulose membrane. After blocking with 5% milk overnight, the membrane was incubated with High Sensitivity Streptavidin-HRP (catalog No. 21130, ThermoFisher Scientific) at room temperature for 1 h. Myristoylated proteins (e.g., myristoylated Src kinase) were detected by ECL.
[0142] Alternatively, myristoylated Src or Cas9 was detected with an antibody against the myristoylated octapeptide derived from Src kinase. To develop an antibody that detects myristoylated proteins, particularly proteins containing the octapeptide Gly-Ser-Asn-Lys-Ser-Lys-Pro-Lys (SEQ ID NO: 367) at the N-terminus, such as Src kinase or the octapeptide-fused Cas9, Myristoyl-Gly-Ser-Asn-Lys-Ser-Lys-Pro-Lys (SEQ ID NO: 367) was synthesized as an antigen by GenScript and injected into two rabbits (4857 and 4858) to generate antibodies. After the third immunization, the antibody was purified using the myristoylated octapeptide antigen. The reactivity was measured by ELISA using myristoylated and non-myristoylated octapeptides.
[0143] Statistical analysis: The data are presented as mean±SEM (standard error of the mean). All data with more than two groups were analyzed by one-way ANOVA with a post hoc Tukey test in GraphPad Prism software, and two groups were compared by an unpaired Student's t-test. * p<0.05; ** p<0.01; *** p<0.001; NS: not significant.
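The summary statistics and significance labels described above can be expressed compactly. This is a minimal sketch using only the Python standard library; it covers the mean±SEM calculation and the star notation, not the ANOVA, Tukey, or t-test computations, which the text attributes to GraphPad Prism. The function names are hypothetical.

```python
# Sketch: mean +/- SEM and the p-value star notation used in this section.
import math
import statistics
from typing import Sequence, Tuple


def mean_sem(values: Sequence[float]) -> Tuple[float, float]:
    """Return (mean, standard error of the mean) for at least two values.

    SEM is the sample standard deviation divided by sqrt(n).
    """
    if len(values) < 2:
        raise ValueError("SEM requires at least two observations")
    return statistics.mean(values), statistics.stdev(values) / math.sqrt(len(values))


def significance_label(p_value: float) -> str:
    """Map a p-value to '*', '**', '***', or 'NS' per the stated thresholds."""
    if p_value < 0.001:
        return "***"
    if p_value < 0.01:
        return "**"
    if p_value < 0.05:
        return "*"
    return "NS"
```

For example, `mean_sem([1.0, 2.0, 3.0])` gives a mean of 2.0 and an SEM of 1/√3, and `significance_label(0.03)` gives a single star.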
[0144] Results
[0145] The Octapeptide Derived from Src Kinase was a Favorable Substrate of N-Myristoyltransferase 1.
[0146] Protein myristoylation is catalyzed by N-myristoyltransferase (NMT) (41). Two mammalian isozymes of NMTs, NMT1 and NMT2 (77% identity), catalyze this myristoylation process. NMT1/2 binds myristoyl-CoA and transfers the myristoyl group to an N-terminal glycine with release of CoA (43) (
[0147] The feasibility of twenty-six octapeptides as substrates of N-myristoyltransferase 1 was examined (Table 7). Octapeptides derived from the leading sequences of 25 myristoylated proteins with glycine at the N-terminus, together with a mutant octapeptide from Src kinase, called Src(G2A), were tested as NMT1 substrates using the NMT1 activity assay (described in Material and Methods). Km and Vmax for catalysis by full-length NMT1 protein were calculated. The docking score was analyzed based on the reconstructed full-length NMT1 protein structure. Count is the number of cancer cell lines, out of 60, in whose EVs a particular protein was detected by mass spectrometry.
TABLE-US-00014 TABLE 7 Octapeptide substrates of N-myristoyltransferase 1
Protein Name | Peptide sequence (8 residues) | Count | Docking Score | Km (µM) | Vmax (µM/min)
YES1 | GCIKSKEN (SEQ ID NO: 358) | 54 | -12.6 | 14.4 | 61.0
FYN | GCVQCKDK (SEQ ID NO: 359) | 10 | -12.3 | 5.2 | 54.9
MARCKS | GAQFSKTA (SEQ ID NO: 360) | 46 | -11.7 | 38.4 | 6.4
MARCKSL1 | GSQSSKAP (SEQ ID NO: 361) | 47 | -11.2 | 11.7 | 6.6
NOL3 | GNAQERPS (SEQ ID NO: 362) | 24 | -11.2 | 1.4 | 2.0
NAA40 | GRKSSKAK (SEQ ID NO: 363) | 6 | -11.0 | 1.2 | 1.8
PSMC1 | GQSQSGGH (SEQ ID NO: 364) | 60 | -11.0 | 40 | 9.6
ZNRF2 | GAKQSGPA (SEQ ID NO: 365) | 4 | -10.9 | 2.0 | 1.6
RNF11 | GNCLKSPT (SEQ ID NO: 366) | 4 | -10.6 | 16.7 | 61.1
SRC | GSNKSKPK (SEQ ID NO: 367) | 42 | -10.5 | 14.3 | 25.8
LYN | GCIKSKGK (SEQ ID NO: 368) | 47 | -9.6 | 22.5 | 64.7
SCYL3 | GSENSALK (SEQ ID NO: 369) | 1 | -9.2 | 0.8 | 1.7
FRS2 | GSCCSCPD (SEQ ID NO: 370) | 3 | -8.2 | 28.2 | 54.7
RP2 | GCFFSKRR (SEQ ID NO: 371) | 47 | -6.0 | 13.6 | 60.8
LNP | GGLFSRWR (SEQ ID NO: 372) | 5 | -6.0 | 10.3 | 21.9
NDUFAF4 | GALVIRGI (SEQ ID NO: 373) | 3 | -5.8 | 0.5 | 1.2
REP15 | GQKASQQL (SEQ ID NO: 374) | 1 | -5.4 | 15.7 | 3.4
GNAZ | GCRQSSEE (SEQ ID NO: 375) | 2 | -5.3 | 15.7 | 64.4
LANCL2 | GETMSKRL (SEQ ID NO: 376) | 15 | -5.1 | 13.0 | 5.3
DEGS1 | GSRVSRED (SEQ ID NO: 377) | 3 | -5.0 | 79.2 | 12.9
ARL6 | GLLDRLSV (SEQ ID NO: 378) | 2 | -4.9 | <0.1 | 1.8
ARF6 | GKVLSKIF (SEQ ID NO: 379) | 60 | -3.5 | 4.4 | 13.6
ARL2 | GLLTILKK (SEQ ID NO: 380) | 50 | -3.4 | 0.4 | 1.2
NDUFB7 | GAHLVRRY (SEQ ID NO: 381) | 3 | No Score | 16.4 | 2.8
DDX46 | GRESRHYR (SEQ ID NO: 382) | 24 | No Score | <0.1 | 2.0
SRC(G2A) | ASNKSKPK (SEQ ID NO: 383) | N/A | N/A | <0.1 | 1.0
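The octapeptides in Table 7 can be screened against the myristoylation-domain motif recited in claim 2, G-X1-X1-X1-S/T-X2-X2-X2, where X1 is any amino acid other than Cys. The sketch below is one possible regular-expression rendering of that motif and is an assumption of this example, not a method described in the text: it checks only the first five positions, since the X2 positions are unconstrained, and it does not validate that the input contains only amino acid letters. A match says nothing about NMT1 kinetics.

```python
# Sketch: test octapeptides against the claim 2 motif (Gly, then three
# non-Cys residues, then Ser or Thr). Names here are hypothetical.
import re

# [^C] accepts any character other than C; inputs are assumed to be
# one-letter amino acid codes with the initiator Met already removed.
MYRISTOYLATION_MOTIF = re.compile(r"G[^C]{3}[ST]")


def matches_claimed_motif(peptide: str) -> bool:
    """Return True if the peptide starts with G, three non-Cys residues, S/T."""
    return MYRISTOYLATION_MOTIF.match(peptide.upper()) is not None


# A few octapeptides copied from Table 7.
TABLE7_EXAMPLES = {
    "SRC": "GSNKSKPK",
    "MARCKS": "GAQFSKTA",
    "FYN": "GCVQCKDK",
    "LYN": "GCIKSKGK",
    "SRC(G2A)": "ASNKSKPK",
}

for name, peptide in TABLE7_EXAMPLES.items():
    print(f"{name}: {peptide} -> {matches_claimed_motif(peptide)}")
```

Under this rendering the Src and MARCKS octapeptides match, whereas Fyn and Lyn are excluded by the cysteine at the second position and Src(G2A) by the missing N-terminal glycine.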
[0148] Fusion of Octapeptide to the N-Terminus of Cas9 Maintained its Genome Editing Function, and Promoted Cas9 Protein to be Encapsulated into EVs.
[0149] Up to this point, a favorable octapeptide derived from the leading sequence of Src kinase had been identified as an NMT1 substrate. To fuse the octapeptide to the N-terminus of Cas9, a bi-cistronic lentiviral vector expressing Cas9 and sgRNA (no target), or expressing myristoylated or non-myristoylated Cas9, designated as mCas9 or mCas9(G2A), respectively, together with an sgRNA targeting the GFP gene, was generated (
[0150] Isolation of EVs-Producing Cells Expressing mCas9/sgRNA-Luciferase, and Encapsulation of mCas9/sgRNA-Luciferase into EVs.
[0151] Using a similar approach, lentiviral vectors expressing Cas9/sgRNA-luciferase (Luc), mCas9/sgRNA-Luc, or mCas9(G2A)/sgRNA-Luc were generated. To create EV-producing 3T3 cells, 3T3 cells expressing the luciferase gene were transduced with Cas9, mCas9, or mCas9(G2A)/sgRNA-Luc by lentiviral infection. Single cell clones transduced with Cas9, mCas9, or mCas9(G2A)/sgRNA-Luc were isolated through dilution in 96-well plates (
[0152] Unless defined otherwise, all technical and scientific terms used herein have the same meanings as commonly understood by one of skill in the art to which the disclosed invention belongs. Publications cited herein and the materials for which they are cited are specifically incorporated by reference.
[0153] Those skilled in the art will recognize, or be able to ascertain using no more than routine experimentation, many equivalents to the specific embodiments of the invention described herein. Such equivalents are intended to be encompassed by the following claims.