PRODUCTION AND USES OF ARTIFICIAL HISTONE H1 FOR ANALYZING, DIAGNOSING, TREATING, AND/OR PREVENTING SENESCENCE
20220162274 · 2022-05-26
Abstract
The present invention provides a method for producing artificial protein sequences and artificial nucleic acid sequences for the linker histone variants H1.0 (also known as histone H1°; H1(0); H5; H1δ; RI H1; or H1 histone family, member 0) and H1x (also known as histone H1.10 or H1 histone family, member X). In particular, the artificial protein sequences produced by the method feature engineered α-helical motifs, the three structural motifs in the histone H1 that bind to nucleosomal and/or linker DNA in chromatin. These artificial-sequence histone H1 proteins, when they replace or supplement their wild-type counterparts in vivo, confer on multicellular individuals significant resistance to senescence and/or age-related health conditions such as age-related cancer.
Claims
1. A method for producing an artificial protein sequence for histone H1 variants to induce resistance and/or protection against senescence and/or age-related health conditions, wherein the method comprises the steps of: a. selecting a wild-type histone H1.0 or H1x protein sequence, or the wild-type sequence of a respective protein ortholog in the species of interest; b. within the sequence selected in step a, recognizing the subsequences determined by regions or individual sites in the globular domain of the protein that form the DNA-binding site of the histone H1.0 or H1x proteins, particularly the amino acid residues directly or indirectly interacting with the DNA; c. applying a set of at least one amino acid substitution, insertion, and/or deletion to one or more of the amino acid subsequences corresponding to the regions or sites recognized in step b, where the modifications do not alter the structure of the α-helical motifs and where the respective net electric charge (z) associated with each resulting modified amino acid subsequence is greater than before the modifications; and d. obtaining the artificial protein sequence by applying the set of at least one amino acid substitution, insertion, and/or deletion determined by step c to the wild-type histone H1.0, histone H1x, or respective orthologous protein sequence selected in step a, thereby producing the complete artificial protein sequence.
2. The method according to claim 1, wherein the increase of net electric charge (z) in step c is estimated particularly at physiological pH.
3. The method according to claim 2, wherein an artificial nucleic acid sequence that encodes the artificial protein sequence obtained in step d is produced.
4. The method according to claim 1, wherein depending on the variant of the wild-type histone H1.0 or H1x protein sequence selected in step a, it is recognized: i. the first α-helical motif α.sub.1 by using as a sequence homology guide the amino acid sequence SEQ. ID No. 1 if the wild-type histone variant is H1.0 or the amino acid sequence SEQ. ID No. 4 if the wild-type histone variant is H1x; ii. the second α-helical motif α.sub.2 by using as a sequence homology guide the amino acid sequence SEQ. ID No. 2 if the wild-type histone variant is H1.0 or the amino acid sequence SEQ. ID No. 5 if the wild-type histone variant is H1x; iii. the third α-helical motif α.sub.3 by using as a sequence homology guide the amino acid sequence SEQ. ID No. 3 if the wild-type histone variant is H1.0 or the amino acid sequence SEQ. ID No. 6 if the wild-type histone variant is H1x.
5. The method according to claim 4, wherein within each α-helical motif identified in steps i, ii, and iii, a set of at least one amino acid substitution sites is defined as follows: (S1,α.sub.3,12), (S2,α.sub.3,13), (S3,α.sub.2,1), (S4,α.sub.1,1), (S5,α.sub.3,1), (S6,α.sub.2,3), (S7,α.sub.3,3), (S8,α.sub.3,5), (S9,α.sub.3,9), (S10,α.sub.2,2) and (S11,α.sub.3,11); where each triplet shows the substitution site, the α-helical motif, and its relative position (counting from N- to C-terminus) within the α-helical motif.
6. The method according to claim 5, wherein the amino acid substitution sites S1, S2, S3, S4, S5, S6, S7, S8, S9, S10, and S11 are mapped into the wild-type protein sequence selected in step a with respect to its three α-helix subsequences α.sub.1, α.sub.2, and α.sub.3 using SEQ. ID No. 1-6.
7. The method according to claim 6, wherein the amino acid substitutions are optimized by using alternative substitute residues with the same rationale of increased net electric charge (z), particularly at physiological pH, in the artificial-sequence histone H1.0/H1x while preserving the secondary structure and overall function of the wild-type histone H1.0/H1x.
8. The method according to claim 7, wherein, once the substitution sites are mapped, a set of at least one and up to eleven amino acid substitutions is applied to the wild-type protein sequence according to the following criteria: S1((K,S,T);R); S2((S,T,M,L);R); S3((K,L);R); S3((S,T);P); S4((S,T);P); S5(K;M); S5((S,T);N); S6((S,T);A); S7((D,E);N); S8((S,T,Y);R); S9((S,T);A); S10(Y;R); and S11(¬R;R), where, for each substitution site, the first part of the pair shows the amino acid residues that may be found in the wild-type sequences and the second part shows the preferred substitute amino acid, and where ¬R denotes an amino acid residue other than R.
9. The method according to claim 8, wherein it is verified that the set of amino acid substitutions applied satisfies the condition of increased net electric charge (z), particularly at physiological pH, by estimating z for each modified α-helical motif at physiological pH and comparing it to the z estimate at physiological pH for its wild-type counterpart when the artificial-sequence α-helical motif and the wild-type α-helical motif are each in their respective post-translationally unmodified forms or when each is subjected to plausible PTMs.
10. The method according to claim 8, wherein the amino acid substitutions, insertions, and/or deletions are intended to redesign the histone H1 α-helical motifs α.sub.3 (most preferred, which binds to both nucleosomal and linker DNA), α.sub.2 (second most preferred, which binds to nucleosomal DNA), and α.sub.1 (third most preferred, which binds to linker DNA), and in particular to stabilize or enhance the electrostatic binding affinity of the α-helical motifs to nucleosomal and/or linker DNA.
11. An artificial histone H1.0 or H1x protein sequence for inducing resistance and/or protection against senescence, and/or age-related health conditions wherein the artificial protein sequence contains a set of at least one amino acid substitutions, insertions, and/or deletions to the DNA-binding site of the histone H1.0 or H1x proteins in the α-helical regions, where the substitutions, insertions, and/or deletions do not alter the structure of the α-helices and entail an increase in the net electric charge (z), particularly at physiological pH, of the resulting artificial-sequence protein.
12. An artificial protein sequence according to claim 11 wherein the increase in net electric charge (z) is estimated particularly at physiological pH.
13. An artificial protein sequence according to claim 11 wherein the DNA binding sites are located in the first, second, and/or third (counting from N- to C-terminus) α-helices of the histone H1.0 and histone H1x proteins.
14. An artificial protein sequence according to claim 13 wherein: the amino acid sequence that corresponds to the first α-helix, denoted by α.sub.1, of the wild-type histone H1 protein counterpart is identical or homologous to SEQ. ID. No. 1 if the wild-type histone variant is H1.0 or to the SEQ. ID. No. 4 if the wild-type histone variant is H1x; the amino acid sequence that corresponds to the second α-helix, denoted by α.sub.2, of the wild-type histone H1 protein counterpart is identical or homologous to SEQ. ID. No. 2 if the wild-type histone variant is H1.0 or to SEQ. ID. No. 5 if the wild-type histone variant is H1x; and the amino acid sequence that corresponds to the third α-helix, denoted by α.sub.3, of the wild-type histone H1 protein counterpart is identical or homologous to SEQ. ID. No. 3 if the wild-type histone variant is H1.0 or to SEQ. ID. No. 6 if the wild-type histone variant is H1x.
15. An artificial protein sequence according to claim 14, wherein the set of amino acid modifications corresponds to at least one and up to eleven amino acid substitutions within the binding site in the α-helical motif.
16. An artificial protein sequence according to claim 15, wherein the eleven amino acid substitution sites S1 to S11 comprise at least one substitution for each of the first, second, and third α-helical motifs selected from: (S1,α.sub.3,12), (S2,α.sub.3,13), (S3,α.sub.2,1), (S4,α.sub.1,1), (S5,α.sub.3,1), (S6,α.sub.2,3), (S7,α.sub.3,3), (S8,α.sub.3,5), (S9,α.sub.3,9), (S10,α.sub.2,2) and (S11,α.sub.3,11); where each triplet shows the substitution site, the α-helical motif and their relative position (counting from N- to C-terminus).
17. An artificial protein sequence according to claim 16, wherein the substitute amino acid residues are selected from alanine, methionine, leucine, and arginine for any substitution site, or proline for substitution sites S3 and/or S4.
18. A synthetic or recombinant nucleic acid sequence including the cDNA and RNA codifying such sequences which encodes an artificial protein or an artificial peptide sequence according to any of the claims 11 to 17.
19. Use of the artificial protein sequence according to any of the claims 11 to 17 for analyzing and/or diagnosing senescence, and/or age-related health conditions in multicellular species such as the human species, other animal species, or plant species.
20. Use of the artificial protein sequence according to any of the claims 11 to 17 for inducing resistance and/or protection against senescence, and/or age-related health conditions in multicellular species such as the human species, other animal species, or plant species.
21. Use according to claim 20 wherein the resistance and/or protection includes but is not limited to the arrest, slowdown, and/or prevention of senescence, and/or age-related health conditions in multicellular species such as the human species, other animal species, or plant species.
22. Use of the artificial protein sequence according to claim 21, wherein the age-related health conditions are selected from age-related cancer, atherosclerosis and cardiovascular disease, arthritis, cataracts, osteoporosis, type-2 diabetes, hypertension, Alzheimer's disease, benign prostate hyperplasia, hearing disability, age-related macular degeneration, neurodegenerative diseases, degenerative diseases, immune senescence diseases, skin aging, and skin wrinkles.
23. Use of the artificial protein sequence according to any of the claims 11 to 17 for biomedical, cosmetic, industrial, and/or agricultural applications.
Description
BRIEF DESCRIPTION OF THE DRAWING
[0030] For a fuller understanding of this invention, reference is made to the following description and accompanying drawing, in which:
[0031]
DETAILED DESCRIPTION OF THE INVENTION
[0032] In particular, the present invention provides a method for producing artificial protein and artificial nucleic acid sequences (such as RNA or DNA) for two histone H1 variants or any of their respective orthologs, wherein the method entails the application of a set of amino acid (aa) substitutions, insertions, and/or deletions to the respective wild-type (wt) protein sequence at clearly defined sites, and wherein the method entails the satisfaction of a very specific electrochemical condition for each artificial protein sequence produced by the method with respect to its wild-type counterpart.
[0033] In a preferred embodiment of the invention, the method entails the application of a set of amino acid substitutions, insertions, and/or deletions which in turn entails an increase in the net electric charge (z) (at physiological pH) of the globular domain in the artificial-sequence histone H1 with respect to its wild-type counterpart when both globular domains are each in their respective post-translationally unmodified forms or when each globular domain is subjected to plausible post-translational modifications (PTMs), in order to stabilize the electrostatic binding affinity of the artificial-sequence histone H1 (in particular when subjected to PTMs in vivo) to the negatively charged DNA. This effect is possible by virtue of the net electric charge (z) as a function of (i) the side chain and (ii) the ability to undergo PTMs for each amino acid residue placed by the method in the artificial protein sequence and/or those for the amino acid residues displaced by the method from the wild-type protein sequence.
[0034] In a more preferred embodiment of the invention, the method entails the application of a set of amino acid substitutions, insertions, and/or deletions to the α.sub.3, α.sub.2, and/or α.sub.1-helix subsequences within the histone H1.0 (also known as histone H1°; H1(0); H5; H1δ; RI H1; or H1 histone family, member 0) protein sequence, to the α.sub.3, α.sub.2, and/or α.sub.1-helix subsequences within the histone H1x (also known as histone H1.10 or H1 histone family, member X) protein sequence, and to the α.sub.3, α.sub.2, and/or α.sub.1-helix subsequences within any respective histone H1.0/H1x ortholog sequence, because said α.sub.3, α.sub.2, and α.sub.1 subsequences correspond to the structural motifs that bind to nucleosomal and/or linker DNA in chromatin.
[0035] In an even more preferred embodiment of the invention, the method entails the application of a set of at least one and up to eleven amino acid substitutions at eleven respective specific sites spanning the α.sub.3 amino acid subsequence, the N-terminal region of the α.sub.2 subsequence, and the N-terminal region of the α.sub.1 subsequence, where all three subsequences are within the full sequence of the histone H1.0, histone H1x, and of any respective orthologous protein. This embodiment is even more preferred because amino acid substitutions (as opposed to insertions or deletions) are less likely to disrupt the α-helical geometry and the eleven amino acid substitution sites are those most proximal (within the α-helical motifs) to nucleosomal or linker DNA. The selection of the set of at least one and up to eleven amino acid substitutions includes at least one amino acid substitution in the α.sub.3-helix subsequence. In a further embodiment of the invention, the method comprises at least two more amino acid substitutions, where at least one of the remaining substitutions is made to a α.sub.2, and/or a α.sub.1-helix subsequence within the histone H1.
[0036] Therefore, the invention relates in particular to the redesign of the histone H1 α-helical motifs α.sub.3 (most preferred, which binds to both nucleosomal and linker DNA), α.sub.2 (second most preferred, which binds to nucleosomal DNA), and α.sub.1 (third most preferred, which binds to linker DNA), through amino acid sequence modification (by amino acid substitutions, insertions, and/or deletions).
[0037] The method for producing artificial histone H1 protein sequences, according to the present invention, is useful for treating and/or preventing senescence, cancer, and/or age-related health conditions.
[0038] In this invention, reference is made to the following naturally occurring α-amino acids (aa): alanine (IUPAC one-letter symbol: A, three-letter symbol: Ala), cysteine (C, Cys), aspartic acid (D, Asp), glutamic acid (E, Glu), phenylalanine (F, Phe), glycine (G, Gly), histidine (H, His), isoleucine (I, Ile), lysine (K, Lys), leucine (L, Leu), methionine (M, Met), asparagine (N, Asn), proline (P, Pro), glutamine (Q, Gln), arginine (R, Arg), serine (S, Ser), threonine (T, Thr), valine (V, Val), tryptophan (W, Trp), and tyrosine (Y, Tyr).
[0039] Reference is made to amino acid substitutions with the nomenclature X.sub.W#X.sub.A, where # is the site (identified by counting from the translation initiator Met residue numbered as +1) occupied by a wild-type (wt) amino acid residue X.sub.W to be substituted by an amino acid residue X.sub.A according to the present invention.
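By way of non-limiting illustration only, the X.sub.W#X.sub.A nomenclature described above can be handled programmatically as in the following minimal sketch. The function names, the `Substitution` type, and the five-residue toy sequence are hypothetical and are not taken from this application; the sketch assumes one-letter residue symbols and numbering from the initiator Met as +1.

```python
import re
from typing import NamedTuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


class Substitution(NamedTuple):
    wt: str    # wild-type residue X_W (one-letter symbol)
    site: int  # 1-based site, initiator Met numbered as +1
    new: str   # substitute residue X_A (one-letter symbol)


_LABEL = re.compile(rf"^([{AMINO_ACIDS}])(\d+)([{AMINO_ACIDS}])$")


def parse_substitution(label: str) -> Substitution:
    """Parse a label such as 'K3R' into (wt residue, site, substitute)."""
    match = _LABEL.match(label)
    if match is None:
        raise ValueError(f"not a valid substitution label: {label!r}")
    return Substitution(match.group(1), int(match.group(2)), match.group(3))


def apply_substitution(sequence: str, sub: Substitution) -> str:
    """Return a copy of the sequence with the substitution applied."""
    index = sub.site - 1
    if not 0 <= index < len(sequence):
        raise ValueError(f"site {sub.site} is outside the sequence")
    if sequence[index] != sub.wt:
        raise ValueError(
            f"expected {sub.wt} at site {sub.site}, found {sequence[index]}"
        )
    return sequence[:index] + sub.new + sequence[index + 1:]


if __name__ == "__main__":
    toy_sequence = "MTKSA"  # hypothetical toy sequence, not a histone
    print(apply_substitution(toy_sequence, parse_substitution("K3R")))
```

Checking the wild-type residue before substituting guards against off-by-one numbering errors, which are easy to make when a helix-relative position is converted to a full-sequence site.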
[0040] Reference is made to the net electric charge (IUPAC symbol: z), defined as the algebraic sum of the charges present at the surface of a molecule divided by the elementary charge of the proton. In this context, reference is also made in this application to z.sub.P, hereby defined as the net electric charge of a molecule at physiological pH (i.e., when pH = 7.4). Consequently, a higher z.sub.P implies a more positive (or, equivalently, less negative) net electric charge in a molecule at physiological pH. It should be noted that, although the application of the set of amino acid substitutions, insertions, and/or deletions according to the present invention entails an increase in the net electric charge of the molecule at physiological pH (z.sub.P), it may also modify the net electric charge (z) of the molecule at other pH values or ranges.
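By way of non-limiting illustration only, z.sub.P for a peptide subsequence may be estimated from its amino acid sequence with the Henderson-Hasselbalch relation, as in the minimal sketch below. This is an assumption of the sketch and not a procedure prescribed by this application: it sums side-chain contributions from the sequence rather than measuring surface charge, the pKa values are approximate textbook figures, and the function name and toy peptides are hypothetical.

```python
# Approximate side-chain pKa values (assumed textbook figures; published
# tables differ slightly and real values depend on the local environment).
PKA_POSITIVE = {"K": 10.5, "R": 12.5, "H": 6.0}            # basic groups
PKA_NEGATIVE = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1}   # acidic groups
PKA_N_TERMINUS = 9.0   # assumed
PKA_C_TERMINUS = 2.0   # assumed


def net_charge(sequence: str, ph: float = 7.4,
               include_termini: bool = False) -> float:
    """Estimate the net charge z of a peptide at the given pH.

    Termini are excluded by default because an internal subsequence
    (e.g., an alpha-helical motif) has no free terminal groups.
    """
    z = 0.0
    for residue in sequence:
        if residue in PKA_POSITIVE:
            z += 1.0 / (1.0 + 10.0 ** (ph - PKA_POSITIVE[residue]))
        elif residue in PKA_NEGATIVE:
            z -= 1.0 / (1.0 + 10.0 ** (PKA_NEGATIVE[residue] - ph))
    if include_termini:
        z += 1.0 / (1.0 + 10.0 ** (ph - PKA_N_TERMINUS))
        z -= 1.0 / (1.0 + 10.0 ** (PKA_C_TERMINUS - ph))
    return z


if __name__ == "__main__":
    # Hypothetical toy peptides: replacing Asp with Asn removes one
    # negative charge, so the estimate at pH 7.4 rises by about one unit.
    for peptide in ("AKDA", "AKNA"):
        print(peptide, round(net_charge(peptide), 3))
```

Because the same function accepts any pH, it can also be used to observe how a given modification shifts z away from physiological pH.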
[0041] Reference is made to nucleosomal DNA (also known as core nucleosomal DNA), understood as the DNA that is left-hand wrapped around the histone octamer forming a complex known as the nucleosome core particle (NCP), which is the building block of chromatin.
[0042] Reference is made to linker DNA, understood as the DNA that extends in between nucleosome core particles. Importantly for this invention, the phosphate group repeated across the backbone of nucleic acids in particular makes both nucleosomal and linker DNA negatively charged at physiological pH.
[0043] Reference is made to the histone H1 (also known as linker histone) protein, which constitutes one of the five major histone protein families necessary for the formation of chromatin in the eukaryotic cell. Specific regions within the histone H1 protein bind to nucleosomal and/or linker DNA, which in turn stabilizes the higher-order constraints on chromatin dynamics.
[0044] The histone H1 family comprises a number of variants. These histone H1 variants are encoded by paralog gene families and classified under a phylogeny-based nomenclature. However, they are also grouped according to protein biosynthesis in terms of its relationship with the cell cycle and its tissue specificity.
[0045] Reference is made to a major structural motif known as “winged” helix-turn-helix (wHTH) in the form of α.sub.1-β.sub.1-α.sub.2-α.sub.3-β.sub.2-β.sub.3 where the α.sub.i motifs are alpha helices and the β.sub.j motifs are beta sheets, which characterizes the globular domain of the histone H1 protein. Importantly for this invention, the histone-H1 α.sub.1, α.sub.2, and α.sub.3 helices are motifs with amino acid residues known to be proximal (and thus more likely to bind) to nucleosomal and/or linker DNA.
[0046] Reference is also made to post-translational modifications (PTMs), which are covalent and typically (but not necessarily) enzymatic modifications undergone by amino acid residues in proteins following protein biosynthesis.
[0047] Histone H1 proteins are subjected to PTMs. Some of the most common PTMs undergone by histone H1 proteins are acetylations and phosphorylations. Lys acetylation is known to lower the otherwise positive electric charge of the Lys residue at physiological pH, in turn decreasing the Lys-mediated electrostatic binding affinity of histone H1 to the negatively charged nucleosomal and linker DNA. The post-translational phosphorylation of amino acid residues such as Ser, Thr, Tyr, and His is also known to decrease the electrostatic DNA binding affinity of the histone H1.
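By way of non-limiting illustration only, the effect of such PTMs on net charge can be tabulated with a simplified integer-charge model, as in the sketch below. The model is an assumption of the sketch: Lys and Arg are counted as +1, Asp and Glu as -1, acetyl-Lys as 0, and phospho-Ser/Thr/Tyr as approximately -2 at physiological pH; His and its phosphorylation are not modeled, and no PTM is defined for Arg or Ala. The function name, the PTM codes "ac" and "ph", and the toy peptides are hypothetical.

```python
from typing import Optional

# Simplified per-residue charges near pH 7.4 (assumed, rounded to integers).
BASE_CHARGE = {"K": 1.0, "R": 1.0, "D": -1.0, "E": -1.0}

# Charge of a residue carrying a PTM: "ac" = acetylation,
# "ph" = phosphorylation (assumed approximate values).
PTM_CHARGE = {
    ("K", "ac"): 0.0,
    ("S", "ph"): -2.0,
    ("T", "ph"): -2.0,
    ("Y", "ph"): -2.0,
}


def charge_with_ptms(sequence: str,
                     ptms: Optional[dict[int, str]] = None) -> float:
    """Sum residue charges; `ptms` maps 1-based positions to PTM codes."""
    ptms = ptms or {}
    total = 0.0
    for position, residue in enumerate(sequence, start=1):
        mark = ptms.get(position)
        if mark is None:
            total += BASE_CHARGE.get(residue, 0.0)
        elif (residue, mark) in PTM_CHARGE:
            total += PTM_CHARGE[(residue, mark)]
        else:
            raise ValueError(f"PTM {mark!r} not modeled for {residue}")
    return total


if __name__ == "__main__":
    wild_type = "KSAK"   # hypothetical toy peptide
    artificial = "RAAR"  # Lys -> Arg and Ser -> Ala at the same sites
    print(charge_with_ptms(wild_type))                      # unmodified
    print(charge_with_ptms(wild_type, {1: "ac", 4: "ac"}))  # acetylated Lys
    print(charge_with_ptms(wild_type, {2: "ph"}))           # phospho-Ser
    print(charge_with_ptms(artificial))                     # no PTM sites
```

In this toy case both peptides carry the same charge when unmodified, but only the wild-type peptide can lose charge through the modeled acetylation and phosphorylation.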
[0048] When proximal to DNA-binding regions, the negatively charged Asp and Glu residues may reduce, by electrostatic repulsion, the binding affinity of those regions to nucleosomal and linker DNA.
[0049] In general, any wild-type amino acid residue (even a residue of null z, sometimes used itself as a substitute residue in other instances) located in a DNA-binding region of the histone H1 protein may be substituted in the method according to this invention with an amino acid residue such that z.sub.P is increased in that region, thereby stabilizing the binding affinity to the negatively charged nucleosomal and linker DNA when that region and/or others in the histone H1 protein are subjected to PTMs in vivo.
[0050] When the present invention applies a set of at least one and up to eleven amino acid substitutions to a wild-type histone H1.0 or histone H1x protein sequence, adequate substitute residues include but are not limited to Arg, Ala, Met, Asn, and Pro (the latter being adequate only for substituting an amino acid residue located at the N-terminal site of an α-helical motif).
[0051] Since the present invention aims to elicit a significant phenotypic change in the entire multicellular individual in its adult form, the method must target a histone H1 variant that (i) accumulates in terminally differentiated cells and (ii) is synthesized in the whole body of the individual. In other words, the targeted histone H1 variant must be both replication-independent (i.e., synthesized throughout the cell cycle) and somatic.
[0052] Importantly, only two histone H1 variants embody both characteristics: the H1.0 (also known as histone H1°; H1(0); H5; H1δ; RI H1; or H1 histone family, member 0) variant, which is common to all multicellular species and the H1x (also known as histone H1.10 or H1 histone family, member X) variant, which is unique to vertebrate species.
[0053] Within the histone H1.0/H1x structure, the α.sub.1 helix is 13 amino acid residues long, the α.sub.2 helix is 12 amino acid residues long, and the α.sub.3 helix is 16 amino acid residues long.
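By way of non-limiting illustration only, the helix lengths stated above can be combined with the eleven substitution sites listed in claim 5 to confirm that every site position falls inside its helix and to count the sites per helix. The dictionary keys ("alpha1", "alpha2", "alpha3") and the function names are hypothetical labels chosen for this sketch.

```python
# Helix lengths as stated in the description (residues per helix).
HELIX_LENGTH = {"alpha1": 13, "alpha2": 12, "alpha3": 16}

# Substitution sites as (helix, 1-based position counted N- to C-terminus),
# transcribed from the triplets in claim 5.
SITES = {
    "S1": ("alpha3", 12), "S2": ("alpha3", 13), "S3": ("alpha2", 1),
    "S4": ("alpha1", 1), "S5": ("alpha3", 1), "S6": ("alpha2", 3),
    "S7": ("alpha3", 3), "S8": ("alpha3", 5), "S9": ("alpha3", 9),
    "S10": ("alpha2", 2), "S11": ("alpha3", 11),
}


def sites_in_range() -> bool:
    """True if every site position lies within its helix."""
    return all(
        1 <= position <= HELIX_LENGTH[helix]
        for helix, position in SITES.values()
    )


def sites_per_helix() -> dict[str, int]:
    """Count how many substitution sites fall in each helix."""
    counts = {helix: 0 for helix in HELIX_LENGTH}
    for helix, _position in SITES.values():
        counts[helix] += 1
    return counts


if __name__ == "__main__":
    print(sites_in_range())
    print(sites_per_helix())
```

The count reflects the stated order of preference: most sites lie in α.sub.3, fewer in the N-terminal region of α.sub.2, and a single site at the N-terminus of α.sub.1.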
[0054] The histone H1.0/H1x α.sub.1 motif (in particular its N-terminal region) binds mainly to linker DNA, the α.sub.2 motif (in particular its N-terminal region) binds mainly to nucleosomal DNA and, importantly, the α.sub.3 binds to both nucleosomal and linker DNA (see
[0055] Amino acid residues in the histone H1.0/H1x α.sub.1, α.sub.2, and/or α.sub.3 motifs can be post-translationally modified. In particular, specific residues can be acetylated or phosphorylated, which entails a decrease in z.sub.P for the α-helical motifs, which in turn decreases the electrostatic binding affinity of the histone H1.0/H1x protein to both nucleosomal and/or linker DNA.
[0056] The invention presented here corrects for the PTM-dependent decrease in electrostatic DNA binding affinity and/or for an intrinsically low DNA binding affinity—the latter caused mainly by negatively charged amino acid residues—in the wild-type protein without impairing the function of the α-helical motifs within the histone H1.0/H1x protein nor the function of the protein as a whole.
[0057] Specifically, this invention claims a method for producing artificial protein sequences and artificial nucleic acid sequences for the histone H1.0 and histone H1x variants and their orthologs, which are useful for the analysis, diagnosis, treatment, and/or prevention of senescence (also known as biological aging) and/or age-related health conditions, such as some types of cancer. This invention also encompasses the artificial protein sequences and artificial nucleic acid sequences produced by the method of the invention.
[0058] In the method the application of a set of amino acid substitutions, insertions, and/or deletions to produce artificial histone H1.0 and histone H1x protein sequences (from the three respective wild-type, α-helical subsequences) entails an increase of z.sub.P in the artificial-sequence α.sub.3, α.sub.2, and/or α.sub.1 motifs with respect to their wild-type counterparts when the artificial and wild-type motifs are each in their respective post-translationally unmodified forms or when each is subjected to plausible PTMs.
[0059] The most preferred embodiment of this invention is the application of a set of amino acid substitutions only (as opposed to insertions or deletions) to a wild-type histone H1.0 or histone H1x, because substitutions are more likely to preserve the secondary structure and overall function of the wild-type protein in the artificial-sequence protein derived from it while creating in the artificial-sequence protein a new function for the multicellular individual.
[0060] When the artificial-sequence α.sub.3, α.sub.2, and/or α.sub.1 motifs undergo post-translational modifications in vivo, the set of amino acid substitutions, insertions, and/or deletions applied by the method to produce the artificial-sequence α.sub.3, α.sub.2, and/or α.sub.1 motifs effectively creates a “reservoir” of positive electric charge that stabilizes or enhances the electrostatic binding affinity of the artificial-sequence histone H1.0/H1x protein to the negatively charged nucleosomal and/or linker DNA.
[0061] This stabilization or enhancement of the electrostatic binding affinity of the artificial-sequence histone H1.0/H1x protein to nucleosomal and/or linker DNA in turn stabilizes the higher-order constraints on chromatin dynamics in the terminally differentiated cells of the multicellular individual, which in turn translates into significant resistance to senescence and/or to age-related health conditions for the multicellular individual.
[0062] For a person skilled in the art, it would be evident that, although the intracellular activity is given by the artificial histone H1 protein, these proteins can be obtained through (i) artificial DNA sequences prepared using any technique available in the state of the art such as genome editing or plasmid systems and (ii) artificial RNA sequences, where all these artificial nucleic acid sequences encoding the artificial protein sequences produced by the claimed method, as well as their complementary reverse in the case of the DNA sequences, are considered within the scope of the present invention.
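By way of non-limiting illustration only, a DNA sequence encoding a given protein sequence, its reverse complement, and the corresponding RNA can be derived as in the minimal sketch below. The sketch uses the standard genetic code with one arbitrarily chosen codon per amino acid; that choice is an assumption made for brevity and ignores codon usage bias and any constraints of the experimental technique. The function names and the toy peptide are hypothetical.

```python
# One valid standard-code codon per amino acid (arbitrary choice, assumed;
# a real design would weigh the codon usage of the species of interest).
CODON = {
    "A": "GCC", "C": "TGC", "D": "GAC", "E": "GAG", "F": "TTC",
    "G": "GGC", "H": "CAC", "I": "ATC", "K": "AAG", "L": "CTG",
    "M": "ATG", "N": "AAC", "P": "CCC", "Q": "CAG", "R": "CGG",
    "S": "AGC", "T": "ACC", "V": "GTG", "W": "TGG", "Y": "TAC",
}
AMINO_ACID = {codon: aa for aa, codon in CODON.items()}
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def back_translate(protein: str) -> str:
    """Return one DNA coding sequence for the protein (no stop codon)."""
    return "".join(CODON[residue] for residue in protein)


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(dna))


def to_rna(dna: str) -> str:
    """Return the RNA with the same sequence as the DNA coding strand."""
    return dna.replace("T", "U")


def translate(dna: str) -> str:
    """Translate DNA built from the codons above (round-trip check only)."""
    return "".join(AMINO_ACID[dna[i:i + 3]] for i in range(0, len(dna), 3))


if __name__ == "__main__":
    peptide = "MKR"  # hypothetical toy peptide
    coding = back_translate(peptide)
    print(coding, reverse_complement(coding), to_rna(coding))
```

Translating the back-translated sequence and comparing it with the input is a simple consistency check before any codon optimization is attempted.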
[0063] The non-triviality and high specificity of the set of amino acid substitutions, insertions, and/or deletions—in terms of location (DNA-binding protein regions) and required effect (increase of z in DNA-binding protein regions)—that must in turn be applied to the wild-type sequence of highly specific, functionally well-defined proteins (histone H1.0 and/or histone H1x) constitutes a clear inventive step in this patent application.
[0064] The method according to this invention is intended to induce resistance and/or protection against senescence (also known as biological aging) and/or age-related health conditions in multicellular species by producing artificial protein sequences and nucleic acid sequences for the histone H1 variants H1.0 (also known as histone H1°; H1(0); H5; H1δ; RI H1; or H1 histone family, member 0) and H1x (also known as histone H1.10 or H1 histone family, member X), and comprises the steps of:
[0065] a. selecting, according to the preferred order of histone H1 variants specified in this invention, a wild-type histone H1.0 protein sequence, or a wild-type histone H1x protein sequence, or the sequence of a respective protein ortholog in the species of interest;
[0066] b. within the sequence selected in step a, recognizing the regions or individual sites in the globular domain or in its α-helical motifs (for the latter, see steps b.2, b.3, and b.4) of the protein that are most proximal to DNA (using, if necessary, the X. laevis histone H1 protein 3D structure contained in the publicly available PDB data file 5NL0 as a structural homology guide) and/or recognizing the regions or individual sites in the globular domain of the protein known to bind to DNA in the species of interest. In a more preferred realization of this step:
[0067] b.2. depending on the variant of the wild-type histone H1 protein sequence selected (i.e., H1.0 or H1x) in step a, recognizing the first (counting from N- to C-terminus) α-helical motif α.sub.1 by using as a phylogenetic homology guide the amino acid sequence of SEQ. ID NO. 1 if the wild-type histone variant is H1.0 or the amino acid sequence of SEQ. ID NO. 4 if the wild-type histone variant is H1x;
[0068] b.3. depending on the variant of the wild-type histone H1 protein sequence selected (i.e., H1.0 or H1x) in step a, recognizing the second (counting from N- to C-terminus) α-helical motif α.sub.2 by using as a phylogenetic homology guide the amino acid sequence of SEQ. ID NO. 2 if the wild-type histone variant is H1.0 or the amino acid sequence of SEQ. ID NO. 5 if the wild-type histone variant is H1x;
[0069] b.4. depending on the variant of the wild-type histone H1 protein sequence selected (i.e., H1.0 or H1x) in step a, recognizing the third (counting from N- to C-terminus) α-helical motif α.sub.3 by using as a phylogenetic homology guide the amino acid sequence of SEQ. ID NO. 3 if the wild-type histone variant is H1.0 or the amino acid sequence of SEQ. ID NO. 6 if the wild-type histone variant is H1x, and also, in an even more preferred realization of this step:
[0070] b.5. identifying the following eleven amino acid substitution sites (S1, . . . , S11) defined according to their relative position (counting from N- to C-terminus) within each α-helix motif as follows:
TABLE-US-00001
substitution site    α-helix motif    position # (N- to C-terminus)
S1                   α.sub.3          12
S2                   α.sub.3          13
S3                   α.sub.2          1
S4                   α.sub.1          1
S5                   α.sub.3          1
S6                   α.sub.2          3
S7                   α.sub.3          3
S8                   α.sub.3          5
S9                   α.sub.3          9
S10                  α.sub.2          2
S11                  α.sub.3          11;
[0071] b.6. mapping the amino acid substitution sites (S1, . . . , S11) identified in step b.5 into the wild-type protein sequence selected in step a with respect to its three α-helix subsequences recognized in steps b.2, b.3, and b.4;
[0072] c. applying a set of amino acid substitutions, insertions, and/or deletions to one or more of the amino acid subsequences corresponding to the regions or sites (for individual sites the subsequence length is equal to one amino acid residue) recognized only in step b or in steps b.2 to b.4 such that the modifications do not alter the α-helical structures and such that the respective net electric charge (z) associated with each resulting modified amino acid subsequence is greater, particularly at physiological pH, than the net electric charge (z) associated with its wild-type amino acid subsequence counterpart when the modified amino acid subsequence and the wild-type amino acid subsequence are each in their respective post-translationally unmodified forms or when each is subjected to a respective combination of post-translational modifications (PTMs). Or, alternatively, if steps b.5 and b.6 were also made:
[0073] c.2. applying a set of amino acid substitutions, in a sequential yet not necessarily comprehensive manner, at the sites (S1, . . . , S11), which are now mapped into the wild-type protein sequence. Here, the set of amino acid substitutions is not only applied to the wild-type protein sequence in a sequential manner but also tested experimentally/clinically in a sequential manner, i.e., experimentally/clinically testing the amino acid substitutions specified by the present invention only one or two at a time (preferred in the context of artificial-sequence proteins for humans, because the application of a minimal number k<11 of amino acid substitutions, provided the k amino acid substitutions elicit the desired phenotypic change, in turn renders the remaining k+1, . . . , 11 amino acid substitutions superfluous). Any substitute amino acid residue identical to the wild-type amino acid residue it is supposed to replace at any site among (S1, . . . , S11) simply implies no action taken, and the set of amino acid substitutions is reduced by one element for each such case (preferred for simplicity and for keeping the amino acid substitutions applied as few as possible provided they elicit the desired phenotypic change, as explained previously). The substitutions are applied according to the following criteria:
TABLE-US-00002
substitution site    if wt residue is     substitute with
S1                   (K ∨ S ∨ T)          R
S2                   (S ∨ T ∨ M ∨ L)      R
S3                   (K ∨ L)              R
¬: logical NOT; ∨: logical OR;
TABLE-US-00003
substitution site   if wt residue is    substitute with
S3                  (S ∨ T)             P
S4                  (S ∨ T)             P
S5                  K                   M
S5                  (S ∨ T)             N
S6                  (S ∨ T)             A
S7                  (D ∨ E)             N
S8                  (S ∨ T ∨ Y)         R
S9                  (S ∨ T)             A
S10                 Y                   R
S11                 ¬R                  R
¬: logical NOT; ∨: logical OR
[0074] c.3. verifying that the set of amino acid substitutions applied in step c.2 satisfies the condition of increased z.sub.P by estimating z at physiological pH for each modified α-helical motif and comparing it to the z estimate at physiological pH for its wild-type counterpart when the artificial-sequence (i.e., modified) α-helical motif and the wild-type α-helical motif are each in their respective post-translationally unmodified forms or when each is subjected to plausible PTMs;
[0075] d. optimizing (if necessary for technical and/or biological reasons) the set of amino acid substitutions applied in step c or in step c.2 by using alternative substitute residues (i.e., other than those suggested in this method) with the same rationale of increased z.sub.P in the artificial-sequence histone H1.0/H1x while preserving the secondary structure and overall function of the wild-type histone H1.0/H1x, thereby allowing in the artificial-sequence histone H1.0/H1x the creation of a novel function for the multicellular individual on top of the regular function inherited from its wild-type protein counterpart;
[0076] e. consolidating the set of amino acid substitutions, insertions, and/or deletions determined by steps c, c.2, c.3, and d into the wild-type histone H1.0, histone H1x, or respective orthologous protein sequence selected in step a, thereby producing the complete artificial protein sequence, where the applied set of amino acid substitutions, insertions, and/or deletions effectively creates a “reservoir” of positive electric charge in the artificial-sequence histone H1.0/H1x protein produced, thereby stabilizing or enhancing its electrostatic binding affinity (with respect to its wild-type counterpart) to DNA; and
[0077] f.
optionally, producing (by virtue of the degeneracy of the genetic code and, if necessary, under the constraints imposed by the species of interest, e.g., codon usage bias, and by the experimental technique, e.g., CRISPR/Cas sgRNA design) an artificial nucleic acid sequence that encodes the artificial protein sequence produced in step e.
[0078] Steps b.2, b.3, and b.4 are preferred because (i) the histone H1.0/H1x α-helical motifs are specifically known to bind to nucleosomal and/or linker DNA and (ii) the condition of increased net electric charge (z) in the artificial α-helical motifs with respect to their wild-type counterparts creates a “reservoir” of positive electric charge in the artificial α-helical motifs. Steps b.5, b.6, c.2, and c.3 are even more preferred because (i) in the histone H1.0/H1x the α.sub.3 motif is known to bind to both nucleosomal and linker DNA (most preferred), the α.sub.2 motif is known to bind to nucleosomal DNA (second most preferred), and the α.sub.1 motif is known to bind to linker DNA (third most preferred), (ii) the eleven amino acid substitution sites (S1, . . . , S11) are highly specific and most proximal (within the α-helical motifs) to nucleosomal or linker DNA, and (iii) the substitute amino acid residues Arg, Ala, Met, Asn, and Pro are able to create, when replacing specific amino acid residues at specific sites, a “reservoir” of positive electric charge in the artificial α-helical motifs as detailed previously.
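The criteria of step c.2 amount to a lookup from a substitution site and its wild-type residue to a substitute residue. The following Python sketch transcribes the two criteria tables as they are printed above. The names `RULES`, `substitute_for`, and `plan_substitutions` are hypothetical, introduced only for illustration, and the input is assumed to be the already-mapped position and wild-type residue of each site (step b.6).

```python
# Sketch only: criteria transcribed from TABLE-US-00002 and TABLE-US-00003.
# All names here are illustrative and not part of the described method.
ALL_RESIDUES = "ACDEFGHIKLMNPQRSTVWY"

RULES = {
    "S1": [("KST", "R")],
    "S2": [("STML", "R")],
    "S3": [("KL", "R"), ("ST", "P")],
    "S4": [("ST", "P")],
    "S5": [("K", "M"), ("ST", "N")],
    "S6": [("ST", "A")],
    "S7": [("DE", "N")],
    "S8": [("STY", "R")],
    "S9": [("ST", "A")],
    "S10": [("Y", "R")],
    "S11": [(ALL_RESIDUES.replace("R", ""), "R")],  # "not R" is replaced by R
}


def substitute_for(site, wt_residue):
    """Return the substitute residue for a site, or None if no criterion applies."""
    for wt_set, substitute in RULES[site]:
        if wt_residue in wt_set:
            return substitute
    return None


def plan_substitutions(mapped_sites, last_site):
    """Visit sites S1..last_site in order and list the applicable substitutions.

    mapped_sites: {"S1": (position, wt_residue), ...} with 1-based positions.
    """
    plan = []
    for n in range(1, int(last_site[1:]) + 1):
        site = f"S{n}"
        position, wt = mapped_sites[site]
        substitute = substitute_for(site, wt)
        if substitute is not None:
            plan.append(f"{wt}{position}{substitute}")
    return plan
```

Under these assumptions, feeding the mouse histone H1.0 sites of example 3 (S1 to S7) into `plan_substitutions` yields L75R, S49A, and D65N, and the sites marked n/a in that example produce no entry.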
[0079] The present invention encompasses artificial histone H1.0 or H1x protein sequences for inducing resistance and/or protection against senescence and/or age-related health conditions, which include but are not limited to age-related cancer, atherosclerosis and cardiovascular disease, arthritis, cataracts, osteoporosis, type-2 diabetes, hypertension, Alzheimer's disease, benign prostate hyperplasia, hearing disability, age-related macular degeneration, neurodegenerative diseases, degenerative diseases, immune senescence diseases, skin aging, and skin wrinkles. The artificial protein sequence contains a set of at least one amino acid substitution, insertion, and/or deletion in the DNA-binding site of the histone H1.0 or H1x proteins in the α-helical regions, where the substitutions, insertions, and/or deletions do not alter the structure of the α-helical motifs and also entail an increase in the net electric charge (z) of the resulting artificial-sequence protein. The net electric charge (z) is estimated particularly at physiological pH.
[0080] The DNA binding sites are located in the first, second, and/or third (counting from N- to C-terminus) α-helices of the histone H1.0 and histone H1x proteins.
[0081] The amino acid sequence that corresponds to the first α-helix, denoted by α.sub.1, of the wild-type histone H1 protein counterpart is identical or homologous to SEQ. ID No. 1 if the wild-type histone variant is H1.0 or to SEQ. ID No. 4 if the wild-type histone variant is H1x.
[0082] The amino acid sequence that corresponds to the second α-helix, denoted by α.sub.2, of the wild-type histone H1 protein counterpart is identical or homologous to SEQ. ID No. 2 if the wild-type histone variant is H1.0 or to SEQ. ID No. 5 if the wild-type histone variant is H1x.
[0083] The amino acid sequence that corresponds to the third α-helix, denoted by α.sub.3, of the wild-type histone H1 protein counterpart is identical or homologous to SEQ. ID No. 3 if the wild-type histone variant is H1.0 or to SEQ. ID No. 6 if the wild-type histone variant is H1x.
[0084] In an embodiment of the invention, the set of amino acid modifications corresponds to at least one and up to eleven amino acid substitutions within the binding site in the α-helical motifs.
[0085] In a further embodiment of the invention, the eleven amino acid substitution sites S1 to S11 comprise at least one substitution for each of the first, second, and third α-helical motifs selected from: (S1,α.sub.3,12), (S2,α.sub.3,13), (S3,α.sub.2,1), (S4,α.sub.1,1), (S5,α.sub.3,1), (S6,α.sub.2,3), (S7,α.sub.3,3), (S8,α.sub.3,5), (S9,α.sub.3,9), (S10,α.sub.2,2) and (S11,α.sub.3,11); where each triplet shows the substitution site, the α-helical motif, and the relative position (counting from N- to C-terminus); and where the substitute amino acid residues are selected from alanine, methionine, leucine, and arginine for any substitution site, or proline for substitution sites S3 and/or S4.
[0086] For a proper z.sub.P comparison between artificial and wild-type protein sequences, the same dissociation-constant data—i.e., the same set of pK.sub.a values for the α-carboxylic acid group, α-ammonium group, and side chain group (if applicable) of each amino acid residue—must be used as calculation base.
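The comparison described above can be sketched with the Henderson-Hasselbalch relation, evaluating the wild-type and artificial subsequences against one shared pK.sub.a table. The Python sketch below rests on stated assumptions: the pK.sub.a values are approximate, illustrative figures and not the set used for the estimates in the examples; physiological pH is taken as 7.4; a single pK.sub.a is used for every α-ammonium and α-carboxylic acid group; and PTMs are ignored. It is therefore not expected to reproduce the z.sub.P estimates of the examples, only the direction of the change. The names `net_charge`, `PKA_POSITIVE`, and `PKA_NEGATIVE` are hypothetical.

```python
# Sketch only: illustrative pKa values (an assumption, not the source's data set).
PKA_POSITIVE = {"K": 10.5, "R": 12.5, "H": 6.0}            # protonated form carries +1
PKA_NEGATIVE = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1}   # deprotonated form carries -1
PKA_N_TERM = 9.0   # alpha-ammonium group (assumed identical for all residues)
PKA_C_TERM = 2.0   # alpha-carboxylic acid group (assumed identical for all residues)


def net_charge(sequence, ph=7.4):
    """Estimate the net charge z of an unmodified peptide at the given pH."""
    positive = [PKA_N_TERM] + [PKA_POSITIVE[aa] for aa in sequence if aa in PKA_POSITIVE]
    negative = [PKA_C_TERM] + [PKA_NEGATIVE[aa] for aa in sequence if aa in PKA_NEGATIVE]
    z = sum(1.0 / (1.0 + 10.0 ** (ph - pka)) for pka in positive)
    z -= sum(1.0 / (1.0 + 10.0 ** (pka - ph)) for pka in negative)
    return z


# The same pKa table is used for both sequences, as the comparison requires.
wild_type = "NADSQIKLSIKRLVTT"   # alpha-3 motif, wild-type (from the examples)
artificial = "NADSQIKLSIKRRVTT"  # alpha-3 motif after L75R
print(net_charge(artificial) > net_charge(wild_type))
```

Because both calls read the same constants, any difference between the two estimates comes from the sequences alone, which is the point of using a single dissociation-constant data set.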
[0087] The artificial-sequence histone H1.0 and histone H1x proteins according to the present invention, when synthesized by engineered cells (e.g., via genome editing or synthetic mRNA delivery) or administered extrinsically to cells (if extracellular histone H1 cytotoxicity can be countered) so that in treated multicellular individuals the artificial-sequence proteins reach abundance levels comparable to those of their wild-type protein counterparts in untreated individuals, confer the treated individuals significant resistance to senescence and/or age-related health conditions.
[0088] The artificial histone H1.0 and histone H1x protein sequences, the synthetic or recombinant nucleic acid sequences encoding said artificial protein sequences, and the methods for producing such sequences according to this invention are useful for analyzing and/or diagnosing senescence and/or age-related health conditions in multicellular species.
[0089] The artificial histone H1.0 and histone H1x protein sequences according to this invention are useful for inducing resistance and/or protection against senescence and age-related health conditions in multicellular species. Said resistance and/or protection includes, but is not limited to, the analysis, diagnosis, treatment and/or prevention of senescence and/or other age-related health conditions, such as certain types of cancer.
[0090] The artificial histone H1.0 and histone H1x protein sequences, and the synthetic or recombinant nucleic acid sequences encoding them, according to this invention can be used in biomedical, cosmetic, industrial, and agricultural applications.
[0091] The method of the invention was tested in vivo on a simple organism, C. elegans, in order to verify its effectiveness. As can be seen in detail in examples 1 and 11, a C. elegans organism was obtained featuring only three amino acid substitutions in the sequence of its histone H1.X protein (ortholog of the human histone H1.0) and displaying great resistance to senescence, which translates to a very significant increase in the survival rate. By day 14 only 50% of the wild-type individuals survived versus 100% of the mutants (hil-1 gene), and by day 24 no wild-type individuals were left alive, while the worms that synthesized the mutant histone H1.X protein developed in accordance with the method of the present invention showed a survival rate above 98%.
[0092] For a person skilled in the art, it would be evident that, although the intracellular activity is given by the artificial histone H1 protein, these proteins can be obtained through (i) artificial DNA sequences prepared using any technique available in the state of the art such as genome editing or plasmid systems and (ii) artificial RNA sequences, where all these artificial nucleic acid sequences encoding the artificial protein sequences produced by the claimed method, as well as their reverse complements in the case of the DNA sequences, are considered within the scope of the present invention.
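As an illustration of how a nucleic acid sequence encoding an artificial protein sequence can be derived through the degeneracy of the genetic code, the following Python sketch back-translates a peptide using the standard genetic code and checks the result by translating it again. It is a minimal sketch under stated assumptions: the codon chosen for each residue is simply the first one in table order, so codon usage bias and any constraints of the editing technique are ignored. The names `back_translate` and `translate` are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Group synonymous codons per residue; the list length reflects the degeneracy.
AA_TO_CODONS = {}
for codon, aa in CODON_TO_AA.items():
    AA_TO_CODONS.setdefault(aa, []).append(codon)


def back_translate(protein):
    """Pick one codon per residue (arbitrarily, the first in table order)."""
    return "".join(AA_TO_CODONS[aa][0] for aa in protein)


def translate(dna):
    """Translate a coding DNA sequence whose length is a multiple of three."""
    return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, len(dna), 3))


artificial_alpha3 = "NADSQIKLSIKRRVTT"  # alpha-3 motif after L75R (from the examples)
coding = back_translate(artificial_alpha3)
assert translate(coding) == artificial_alpha3
```

A species-specific design would replace the arbitrary choice in `back_translate` with a selection among the synonymous codons in `AA_TO_CODONS`, for instance according to a codon usage table for the species of interest.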
Examples
[0093] The following examples are intended to illustrate the present invention and they cannot be used for limiting its scope.
[0094] Example 1: Application of the most preferred embodiment of the method for producing an artificial protein sequence for the somatic, replication-independent histone H1 variant in the model organism Caenorhabditis elegans. [0095] The only replication-independent and somatic histone H1 variant (ortholog of the human histone H1.0) in C. elegans is the histone H1.X protein (NCBI ID: NP_506680.1), SEQ.ID No. 7:
TABLE-US-00004 >NP_506680.1 Histone H1.X [Caenorhabditis elegans] MTTSLIHMANHLDASTEEISLNYVLLGHPHHERAQHHPSYMDMIKGAIQA IDNGTGSSKAAILKYIAQNYHVGENLPKVNNHLRSVLKKAVDSGDIEQTR GHGATGSFRMGKECEKNLQVGIPVQTKPMLMLKEVRQKLENISKAEKTKP STSSMSTNKKGKPISTMKKRGVMSKKRSSKNKMAPKAKSHGLKKKGPATK SSGLVHKAAGAKNEAAPTTKMELRTGTRKSYC [0096] Using the sequences of the human histone H1.0 α-helical motifs provided in the method as phylogenetic homology guides (SEQ. ID Nos. 1, 2, and 3), the respective subsequences corresponding to the α.sub.1, α.sub.2, and α.sub.3 motifs were recognized in the C. elegans wild-type histone H1.X protein sequence:
TABLE-US-00005 C. elegans H1.X 39 SYMDMIKGAIQAI DNGT GSS KAAILKYIAQNY HVGENLP KVNNHLRSVLKKAVDS 93 H. sapiens H1.0 27 KYSDMIVAAIQAE KNRA GSS RQSIQKYIKSHY KVGE NADSQIKLSIKRLVTT 78 wHTH motif α.sub.1 α.sub.2 α.sub.3 [0097] Next, the predefined amino acid substitution sites (S1, . . . , S11) were mapped into the C. elegans wild-type histone H1.X sequence:
TABLE-US-00006
substitution site S   4   3   10  6   5   7   8   9   11  1   2
                      ↓   ↓   ↓   ↓   ↓   ↓   ↓   ↓   ↓   ↓   ↓
position [#]          39  59  60  61  78  80  82  86  88  89  90
C. elegans H1.X 39 SYMDMIKGAIQAI DNGT GSS KAAILKYIAQNY HVGENLP KVNNHLRSVLKKAVDS 93
wHTH motif α.sub.1 α.sub.2 α.sub.3
[0098] In this example only one of the possible embodiments of this invention was produced, by applying the method to the wild-type C. elegans histone H1.X reference sequence up to the substitution site S4:
TABLE-US-00007
site    position [#]   wt residue   substitute residue   aa substitution
S1 ✓    89             K            R                    K89R
S2      90             A            n/a                  n/a
S3 ✓    59             K            R                    K59R
S4 ✓    39             S            P                    S39P
n/a: not applicable in the method
[0099] Since the amino acid substitutions K89R, K59R, and S39P are encompassed by the helical motifs α.sub.3, α.sub.2, and α.sub.1 respectively, it was next verified that the estimated z.sub.P of each artificial-sequence α-helical motif (substitute amino acid residues underlined) is greater than that of its wild-type C. elegans counterpart, when both are post-translationally unmodified or when both are subjected to plausible PTMs (online PTM prediction programs can be useful for this step):
TABLE-US-00008
α-helix   sequence   seq. type   z.sub.p (est.)
α.sub.3 (no PTMs)   (H.sub.3N.sup.+)-KVNNHLRSVLKKAVDS-(coo.sup.−)   wild-type   +2.193
α.sub.3 (no PTMs)   (H.sub.3N.sup.+)-KVNNHLRSVLKRAVDS-(coo.sup.−)   artificial   +2.197 > +2.193 ✓
α.sub.3 (with plausible PTMs)   (H.sub.3N.sup.+)-KVNNHLR(pS)VL(K-ac)KAVDS-(coo.sup.−)   wild-type   −0.563
α.sub.3 (with plausible PTMs)   (H.sub.3N.sup.+)-KVNNHLR(pS)VL(K-ac)RAVDS-(coo.sup.−)   artificial   −0.559 > −0.563 ✓
α.sub.2 (no PTMs)   (H.sub.3N.sup.+)-KAAILKYIAQNY-(coo.sup.−)   wild-type   +1.146
α.sub.2 (no PTMs)   (H.sub.3N.sup.+)-RAAILKYIAQNY-(coo.sup.−)   artificial   +1.178 > +1.146 ✓
α.sub.2 (with plausible PTMs)   (H.sub.3N.sup.+)-KAAIL(K-ac)YIAQNY-(coo.sup.−)   wild-type   +0.150
α.sub.2 (with plausible PTMs)   (H.sub.3N.sup.+)-RAAIL(K-ac)YIAQNY-(coo.sup.−)   artificial   +0.182 > +0.150 ✓
α.sub.1 (no PTMs)   (H.sub.3N.sup.+)-SYMDMIKGAIQAI-(coo.sup.−)   wild-type   −0.783
α.sub.1 (no PTMs)   (H.sub.3N.sup.+)-PYMDMIKGAIQAI-(coo.sup.−)   artificial   −0.106 > −0.783 ✓
α.sub.1 (with plausible PTMs)   (H.sub.3N.sup.+)-(pS)YMD(oM)I(K-ac)GAIQAI-(coo.sup.−)   wild-type   −3.539
α.sub.1 (with plausible PTMs)   (H.sub.3N.sup.+)-PYMD(oM)I(K-ac)GAIQAI-(coo.sup.−)   artificial   −1.102 > −3.539 ✓
(K-ac): acetylated Lys; (oM): oxidized Met; (pS): phosphorylated Ser
[0100] One of the possible artificial protein sequences was finally produced (SEQ. ID No. 8, as claimed in this invention), which is defined by the set of amino acid substitutions {K89R, K59R, S39P} when applied to the wild-type C. elegans histone H1.X reference sequence (substitute amino acid residues underlined):
TABLE-US-00009 >example-01 artificial-sequence histone H1.X for C. elegans MTTSLIHMANHLDASTEEISLNYVLLGHPHHERAQHHPPYMDMIKGAIQA IDNGTGSSRAAILKYIAQNYHVGENLPKVNNHLRSVLKRAVDSGDIEQTR GHGATGSFRMGKECEKNLQVGIPVQTKPMLMLKEVRQKLENISKAEKTKP STSSMSTNKKGKPISTMKKRGVMSKKRSSKNKMAPKAKSHGLKKKGPATK SSGLVHKAAGAKNEAAPTTKMELRTGTRKSYC
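The site mapping used in this example (step b.6) can be sketched as locating each recognized α-helical subsequence in the wild-type sequence and adding the relative position of each site within its motif. The Python sketch below uses the globular-domain fragment of the C. elegans histone H1.X (residues 39 to 93) and the motif subsequences given in this example. The names `SITES` and `map_sites` are hypothetical, and the search assumes each motif occurs only once in the fragment.

```python
# Sketch only: (motif, relative position) per site, as listed in TABLE-US-00001.
SITES = {
    "S1": ("a3", 12), "S2": ("a3", 13), "S3": ("a2", 1), "S4": ("a1", 1),
    "S5": ("a3", 1), "S6": ("a2", 3), "S7": ("a3", 3), "S8": ("a3", 5),
    "S9": ("a3", 9), "S10": ("a2", 2), "S11": ("a3", 11),
}


def map_sites(fragment, fragment_start, motifs):
    """Return {site: 1-based position in the full protein sequence}.

    fragment: part of the wild-type sequence containing the three motifs.
    fragment_start: 1-based position of the fragment's first residue.
    motifs: {"a1": subsequence, "a2": subsequence, "a3": subsequence}.
    """
    positions = {}
    for site, (motif, relative_position) in SITES.items():
        offset = fragment.find(motifs[motif])
        if offset < 0:
            raise ValueError(f"motif {motif} not found in fragment")
        positions[site] = fragment_start + offset + relative_position - 1
    return positions


# C. elegans histone H1.X, residues 39-93 (from the alignment in this example).
fragment = "SYMDMIKGAIQAIDNGTGSSKAAILKYIAQNYHVGENLPKVNNHLRSVLKKAVDS"
motifs = {"a1": "SYMDMIKGAIQAI", "a2": "KAAILKYIAQNY", "a3": "KVNNHLRSVLKKAVDS"}
positions = map_sites(fragment, 39, motifs)
print(positions["S1"], positions["S3"], positions["S4"])
```

For an ortholog, only `fragment`, its start position, and the three recognized motif subsequences change; the `SITES` table stays the same.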
[0101] Examples 2-3: Application of the most preferred embodiment of the method for producing two artificial sequences for the mouse (Mus musculus) histone H1.0 protein.
[0102] The reference sequence for the mouse histone H1.0 protein (NCBI ID: NP_032223.2), SEQ. ID No. 9, is the following:
TABLE-US-00010 >NP_032223.2 histone H1.0 [Mus musculus] MTENSTSAPAAKPKRAKASKKSTDHPKYSDMIVAAIQAEKNRAGSSRQS IQKYIKSHYKVGENADSQIKLSIKRLVTTGVLKQTKGVGASGSFRLAKG DEPKRSVAFKKTKKEVKKVATPKKAAKPKKAASKAPSKKPKATPVKKAK KKPAATPKKAKKPKVVKVKPVKASKPKKAKTVKPKAKSSAKRASKKK [0103] Using the sequences of the human histone H1.0 α-helical motifs provided in the method as phylogenetic homology guides (SEQ. ID Nos. 1, 2, and 3), the respective subsequences corresponding to the α.sub.1, α.sub.2, and α.sub.3 motifs were recognized in the M. musculus wild-type histone H1.0 protein sequence:
TABLE-US-00011 M.musculus H1.0 27 KYSDMIVAAIQAE KNRA GSS RQSIQKYIKSHY KVGE NADSQIKLSIKRLVTT 78 H. sapiens H1.0 27 KYSDMIVAAIQAE KNRA GSS RQSIQKYIKSHY KVGE NADSQIKLSIKRLVTT 78 wHTH motif α.sub.1 α.sub.2 α.sub.3 [0104] Next, the predefined amino acid substitution sites (S1, . . . , S11) were mapped into the M. musculus wild-type histone H1.0 sequence:
TABLE-US-00012
substitution site S   4   3   10  6   5   7   8   9   11  1   2
                      ↓   ↓   ↓   ↓   ↓   ↓   ↓   ↓   ↓   ↓   ↓
position [#]          27  47  48  49  63  65  67  71  73  74  75
M. musculus H1.0 27 KYSDMIVAAIQAE KNRA GSS RQSIQKYIKSHY KVGE NADSQIKLSIKRLVTT 78
wHTH motif α.sub.1 α.sub.2 α.sub.3
[0105] In example 2 the method was used to produce a “conservative” artificial protein sequence (in terms of departure from its wild-type counterpart) and thus it was produced by applying the method to the wild-type mouse histone H1.0 reference sequence only up to the substitution site S2:
TABLE-US-00013 site position [#] wt residue substitute residue aa substitution S1 74 R n/a n/a S2 ✓ 75 L R L75R n/a: not applicable in the method [0106] Since the amino acid substitution L75R is encompassed by the helical motif α.sub.3, it was next verified that the estimated z.sub.P of the artificial-sequence α.sub.3 helix (substitute amino acid residue underlined) is greater than that of its wild-type mouse counterpart—when both are post-translationally unmodified or when both are subjected to plausible PTMs (online PTM prediction programs can be useful for this step):
TABLE-US-00014
α-helix   sequence   seq. type   z.sub.p (est.)
α.sub.3 (no PTMs)   (H.sub.3N.sup.+)-NADSQIKLSIKRLVTT-(coo.sup.-)   wild-type   +1.391
α.sub.3 (no PTMs)   (H.sub.3N.sup.+)-NADSQIKLSIKRRVTT-(coo.sup.-)   artificial   +2.391 > +1.391 ✓
α.sub.3 (with plausible PTMs)   (H.sub.3N.sup.+)-NAD(pS)QI(K-ac)LSIKRLVTT-(coo.sup.-)   wild-type   -1.365
α.sub.3 (with plausible PTMs)   (H.sub.3N.sup.+)-NAD(pS)QI(K-ac)LSIKRRVTT-(coo.sup.-)   artificial   -0.365 > -1.365 ✓
(pS): phosphorylated Ser; (K-ac): acetylated Lys
[0107] An artificial protein sequence was finally produced (SEQ. ID No. 10, as claimed in this invention), which is defined by the set of amino acid substitutions {L75R} when applied to the wild-type mouse histone H1.0 reference sequence (substitute amino acid residue underlined):
[0108] >example-02 artificial-sequence histone H1.0 for M. musculus
TABLE-US-00015 MTENSTSAPAAKPKRAKASKKSTDHPKYSDMIVAAIQAEKNRAGSSRQS IQKYIKSHYKVGENADSQIKLSIKRRVTTGVLKQTKGVGASGSFRLAKG DEPKRSVAFKKTKKEVKKVATPKKAAKPKKAASKAPSKKPKATPVKKAK KKPAATPKKAKKPKVVKVKPVKASKPKKAKTVKPKAKSSAKRASKKK
[0109] In example 3 the method was used to produce a less “conservative” artificial protein sequence (in terms of departure from its wild-type counterpart) and thus it was produced by applying the method to the wild-type mouse histone H1.0 reference sequence up to the substitution site S7:
TABLE-US-00016
site    position [#]   wt residue   substitute residue   aa substitution
S1      74             R            n/a                  n/a
S2 ✓    75             L            R                    L75R
S3      47             R            n/a                  n/a
S4      27             K            n/a                  n/a
S5      63             N            n/a                  n/a
S6 ✓    49             S            A                    S49A
S7 ✓    65             D            N                    D65N
n/a: not applicable in the method
[0110] Since the amino acid substitutions L75R and D65N are encompassed by the helical motif α.sub.3 and the amino acid substitution S49A is encompassed by the helical motif α.sub.2, it was next verified that the estimated z.sub.P of each of the artificial-sequence α.sub.3 and α.sub.2 motifs (substitute amino acid residues underlined) is greater than that of its wild-type mouse counterpart, when both are post-translationally unmodified or when both are subjected to plausible PTMs (online PTM prediction programs can be useful for this step):
TABLE-US-00017
α-helix   sequence   seq. type   z.sub.p (est.)
α.sub.3 (no PTMs)   (H.sub.3N.sup.+)-NADSQIKLSIKRLVTT-(coo.sup.-)   wild-type   +1.391
α.sub.3 (no PTMs)   (H.sub.3N.sup.+)-NANSQIKLSIKRRVTT-(coo.sup.-)   artificial   +3.390 > +1.391 ✓
α.sub.3 (with plausible PTMs)   (H.sub.3N.sup.+)-NAD(pS)QI(K-ac)LSIKRLVTT-(coo.sup.-)   wild-type   -1.365
α.sub.3 (with plausible PTMs)   (H.sub.3N.sup.+)-NAN(pS)QI(K-ac)LSIKRRVTT-(coo.sup.-)   artificial   +0.634 > -1.365 ✓
α.sub.2 (no PTMs)   (H.sub.3N.sup.+)-RQSIQKYIKSHY-(coo.sup.-)   wild-type   +2.219
α.sub.2 (no PTMs)   (H.sub.3N.sup.+)-RQAIQKYIKSHY-(coo.sup.-)   artificial   +2.219 = +2.219
α.sub.2 (with plausible PTMs)   (H.sub.3N.sup.+)-RQ(pS)IQ(K-ac)YIKSHY-(coo.sup.-)   wild-type   -0.536
α.sub.2 (with plausible PTMs)   (H.sub.3N.sup.+)-RQAIQ(K-ac)YIKSHY-(coo.sup.-)   artificial   +1.223 > -0.536 ✓
(K-ac): acetylated Lys; (pS): phosphorylated Ser
[0111] An artificial protein sequence was finally produced (SEQ. ID No. 11, as claimed in this invention), which is defined by the set of amino acid substitutions {L75R, S49A, D65N} when applied to the wild-type mouse histone H1.0 reference sequence (substitute amino acid residues underlined):
[0112] >example-03 artificial-sequence histone H1.0 for M. musculus
TABLE-US-00018 MTENSTSAPAAKPKRAKASKKSTDHPKYSDMIVAAIQAEKNRAGSSRQA IQKYIKSHYKVGENANSQIKLSIKRRVTTGVLKQTKGVGASGSFRLAKG DEPKRSVAFKKTKKEVKKVATPKKAAKPKKAASKAPSKKPKATPVKKAK KKPAATPKKAKKPKVVKVKPVKASKPKKAKTVKPKAKSSAKRASKKK
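Consolidating a set of substitutions into the wild-type sequence (step e) can be sketched as a checked string edit: each substitution names the expected wild-type residue, its 1-based position, and the substitute residue. The Python sketch below applies the set from example 3 to residues 27 to 78 of the mouse histone H1.0. The name `apply_substitutions` and the `first_position` parameter are hypothetical conveniences for working on a fragment rather than the full sequence.

```python
import re


def apply_substitutions(sequence, substitutions, first_position=1):
    """Apply substitutions such as "L75R" and return the modified sequence.

    first_position is the 1-based protein position of sequence[0], so a
    fragment can be edited using full-length numbering.
    """
    residues = list(sequence)
    for substitution in substitutions:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", substitution)
        if match is None:
            raise ValueError(f"cannot parse substitution {substitution!r}")
        wt, position, substitute = match.group(1), int(match.group(2)), match.group(3)
        index = position - first_position
        if not 0 <= index < len(residues):
            raise ValueError(f"position {position} lies outside the sequence")
        if sequence[index] != wt:
            # Guards against an off-by-one mapping or a wrong reference sequence.
            raise ValueError(f"expected {wt} at {position}, found {sequence[index]}")
        residues[index] = substitute
    return "".join(residues)


# Mouse histone H1.0, residues 27-78 (wild-type), from the alignment above.
wild_type = "KYSDMIVAAIQAEKNRAGSSRQSIQKYIKSHYKVGENADSQIKLSIKRLVTT"
artificial = apply_substitutions(wild_type, ["L75R", "S49A", "D65N"], first_position=27)
print(artificial)
```

Checking the wild-type residue before each edit is a deliberate choice: it makes a mismatch between the mapped positions and the reference sequence fail loudly instead of silently producing a different protein.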
[0113] Examples 4-5: Application of the most preferred embodiment of the method for producing one artificial sequence for the human histone H1.0 protein and one artificial sequence for the human histone H1x protein.
[0114] The reference sequence for the human histone H1.0 protein (NCBI ID: NP_005309.1), SEQ. ID No. 12, is the following:
TABLE-US-00019 >NP_005309.1 histone H1.0 [Homo sapiens] MTENSTSAPAAKPKRAKASKKSTDHPKYSDMIVAAIQAEKNRAGSSRQS IQKYIKSHYKVGENADSQIKLSIKRLVTTGVLKQTKGVGASGSFRLAKS DEPKKSVAFKKTKKEIKKVATPKKASKPKKAASKAPTKKPKATPVKKAK KKLAATPKKAKKPKTVKAKPVKASKPKKAKPVKPKAKSSAKRAGKKK [0115] The phylogenetic homology guides provided in the method correspond to the respective α-helical motifs from the human histone H1.0 and H1x variants, thus recognizing the respective subsequences corresponding to the three α-helical motifs in the wild-type histone H1.0, using the sequences SEQ. ID Nos. 1, 2, and 3, and mapping the predefined amino acid substitution sites into its sequence were trivial steps:
TABLE-US-00020
substitution site S   4   3   10  6   5   7   8   9   11  1   2
                      ↓   ↓   ↓   ↓   ↓   ↓   ↓   ↓   ↓   ↓   ↓
position [#]          27  47  48  49  63  65  67  71  73  74  75
H. sapiens H1.0 27 KYSDMIVAAIQAE KNRA GSS RQSIQKYIKSHY KVGE NADSQIKLSIKRLVTT 78
wHTH motif α.sub.1 α.sub.2 α.sub.3
[0116] In example 4 the method was used to produce a “conservative” artificial protein sequence (in terms of departure from its wild-type counterpart) because it is for use in humans. Thus, the artificial protein sequence was produced by applying the method to the wild-type human histone H1.0 reference sequence only up to the substitution site S2:
TABLE-US-00021 site position [#] wt residue substitute residue aa substitution S1 74 R n/a n/a S2 ✓ 75 L R L75R n/a: not applicable in the method [0117] Since the amino acid substitution L75R is encompassed by the helical motif α.sub.3, it was next verified that the estimated z.sub.P of the artificial-sequence α.sub.3 helix (substitute amino acid residue underlined) is greater than that of its wild-type human counterpart—when both are post-translationally unmodified or when both are subjected to plausible PTMs (online PTM prediction programs can be useful for this step):
TABLE-US-00022
α-helix   sequence   seq. type   z.sub.p (est.)
α.sub.3 (no PTMs)   (H.sub.3N.sup.+)-NADSQIKLSIKRLVTT-(coo.sup.−)   wild-type   +1.391
α.sub.3 (no PTMs)   (H.sub.3N.sup.+)-NADSQIKLSIKRRVTT-(coo.sup.−)   artificial   +2.391 > +1.391 ✓
α.sub.3 (with plausible PTMs)   (H.sub.3N.sup.+)-NAD(pS)QI(K-ac)LSIKRLVTT-(coo.sup.−)   wild-type   −1.365
α.sub.3 (with plausible PTMs)   (H.sub.3N.sup.+)-NAD(pS)QI(K-ac)LSIKRRVTT-(coo.sup.−)   artificial   −0.365 > −1.365 ✓
(pS): phosphorylated Ser; (K-ac): acetylated Lys
[0118] An artificial protein sequence was finally produced (SEQ. ID No. 13, as claimed in this invention), which is defined by the set of amino acid substitutions {L75R} when applied to the wild-type human histone H1.0 reference sequence (substitute amino acid residue underlined):
[0119] >example-04 artificial-sequence histone H1.0 for H. sapiens
TABLE-US-00023 MTENSTSAPAAKPKRAKASKKSTDHPKYSDMIVAAIQAEKNRAGSSRQS IQKYIKSHYKVGENADSQIKLSIKRRVTTGVLKQTKGVGASGSFRLAKS DEPKKSVAFKKTKKEIKKVATPKKASKPKKAASKAPTKKPKATPVKKAK KKLAATPKKAKKPKTVKAKPVKASKPKKAKPVKPKAKSSAKRAGKKK
[0120] The reference sequence for the human histone H1x protein (NCBI ID: NP_006017.1), SEQ. ID No. 14, is the following:
[0121] >NP_006017.1 histone H1x [Homo sapiens]
TABLE-US-00024 MSVELEEALPVTTAEGMAKKVTKAGGSAALSPSKKRKNSKKKNQPGKYS QLVVETIRRLGERNGSSLAKIYTEAKKVPWFDQQNGRTYLKYSIKALVQ NDTLLQVKGTGANGSFKLNRKKLEGGGERRGAPAAATAPAPTAHKAKKA APGAAGSRRADKKPARGQKPEQRSHKKGAGAKKDKGGKAKKTAAAGGKK VKKAAKPSVPKVPKGRK [0122] The phylogenetic homology guides provided in the method correspond to the respective α-helical motifs from the human histone H1.0 and H1x variants, thus recognizing the respective subsequences corresponding to the three α-helical motifs in the wild-type histone H1x, using the sequences SEQ. ID Nos. 4, 5 and 6, and mapping the predefined amino acid substitution sites into its sequence were trivial steps:
TABLE-US-00025
substitution site S   4   3   10  6   5   7   8   9   11  1   2
                      ↓   ↓   ↓   ↓   ↓   ↓   ↓   ↓   ↓   ↓   ↓
position [#]          47  67  68  69  84  86  88  92  94  95  96
H. sapiens H1x 47 KYSQLVVETIRRL GERN GSS LAKIYTEAKKVP WFDQQ NGRTYLKYSIKALVQN 99
wHTH motif α.sub.1 α.sub.2 α.sub.3
[0123] In example 5 the method was used to produce a “conservative” artificial protein sequence (in terms of departure from its wild-type counterpart) because it is for use in humans. Thus, the artificial protein sequence was produced by applying the method to the wild-type human histone H1x reference sequence only up to the substitution site S2:
TABLE-US-00026
site    position [#]   wt residue   substitute residue   aa substitution
S1 ✓    95             A            R                    A95R
S2 ✓    96             L            R                    L96R
[0124] Since the amino acid substitutions A95R and L96R are encompassed by the helical motif α.sub.3, it was next verified that the estimated z.sub.P of the artificial-sequence α.sub.3 helix (substitute amino acid residues underlined) is greater than that of its wild-type human counterpart, when both are post-translationally unmodified or when both are subjected to plausible PTMs (online PTM prediction programs can be useful for this step):
TABLE-US-00027
α.sub.3-helix   subsequence   subseq. type   z.sub.p (est.)
α.sub.3 (no PTMs)   (H.sub.3N.sup.+)-NGRTYLKYSIKALVQN-(coo.sup.−)   wild-type   +2.383
α.sub.3 (no PTMs)   (H.sub.3N.sup.+)-NGRTYLKYSIKRRVQN-(coo.sup.−)   artificial   +4.383 > +2.383 ✓
α.sub.3 (with plausible PTMs)   (H.sub.3N.sup.+)-NGR(pT)YL(K-ac)YSIKALVQN-(coo.sup.−)   wild-type   −0.373
α.sub.3 (with plausible PTMs)   (H.sub.3N.sup.+)-NGR(pT)YL(K-ac)YSIKRRVQN-(coo.sup.−)   artificial   +1.627 > −0.373 ✓
(pT): phosphorylated Thr; (K-ac): acetylated Lys
[0125] An artificial protein sequence was finally produced (SEQ. ID No. 15, as claimed in this invention), which is defined by the set of amino acid substitutions {A95R, L96R} when applied to the wild-type human histone H1x reference sequence (substitute amino acid residues underlined):
[0126] >example-05 artificial-sequence histone H1x for H. sapiens
TABLE-US-00028 MSVELEEALPVTTAEGMAKKVTKAGGSAALSPSKKRKNSKKKNQPGKYS QLVVETIRRLGERNGSSLAKIYTEAKKVPWFDQQNGRTYLKYSIKRRVQ NDTLLQVKGTGANGSFKLNRKKLEGGGERRGAPAAATAPAPTAHKAKKA APGAAGSRRADKKPARGQKPEQRSHKKGAGAKKDKGGKAKKTAAAGGKK VKKAAKPSVPKVPKGRK
[0127] Examples 6-10: Since a protein ortholog of the human histone H1.0 can be found in all multicellular species and a protein ortholog of the human histone H1x can be found in all vertebrate species, the method informing the present invention can produce artificial histone H1.0 sequences for any multicellular species and artificial histone H1x sequences for any vertebrate species. Five examples of artificial histone H1.0/H1x protein sequences produced with the method claimed in this invention, using its most preferred steps, for different species are shown in TABLE 1.
TABLE-US-00029 TABLE 1
example   species (H1 variant)             set of aa substitutions applied   wt histone H1.0/H1x-orthologous sequence [SEQ. ID.]   artificial sequence produced {SEQ. ID}
6         Rattus norvegicus (H1.0)         {L75R}                            [16]                                                  {17}
7         Rattus norvegicus (H1x)          {A94R, L95R}                      [18]                                                  {19}
8         Nothobranchius furzeri (H1.0)    {L73R, S47A, D63N}                [20]                                                  {21}
9         Drosophila melanogaster (H1.0)   {S96R, L68R, K85M}                [22]                                                  {23}
10        Arabidopsis thaliana (H1.0)      {S76R, S64N, T68R, Y47R}          [24]                                                  {25}
[0128] Example 11: In vivo testing of the method for conferring C. elegans resistance to senescence. [0129] a. The artificial sequence for the C. elegans histone H1.X protein (amino acid substitutions underlined) produced in example 1 (SEQ. ID No. 8) is:
TABLE-US-00030 >example-01 artificial-sequence histone H1.X for C. elegans MTTSLIHMANHLDASTEEISLNYVLLGHPHHERAQHHPPYMDMIKGAIQ AIDNGTGSSRAAILKYIAQNYHVGENLPKVNNHLRSVLKRAVDSGDIEQ TRGHGATGSFRMGKECEKNLQVGIPVQTKPMLMLKEVRQKLENISKAEK TKPSTSSMSTNKKGKPISTMKKRGVMSKKRSSKNKMAPKAKSHGLKKKG PATKSSGLVHKAAGAKNEAAPTTKMELRTGTRKSYC [0130] b. The wild-type C. elegans histone H1.X protein is encoded by the hil-1 gene. Thus, it was necessary to edit the hil-1 gene in the wild-type C. elegans genome (with the CRISPR/Cas genome-editing technique) so that the resulting mutant hil-1 gene encodes the artificial protein sequence shown in step a. [0131] c. The CRISPR/Cas genome editing in the wild-type C. elegans (strain N2) was carried out successfully. The mutant hil-1 allele obtained was fluorescently tagged, then outcrossed to N2 ten times and found to be viable and fertile at least in the heterozygous form (we did not confirm homozygosity due to budget constraints). [0132] d. A survival assay (C. elegans individuals kept at 20° C. and fed with E. coli OP50) was conducted to assess resistance to senescence (if any) in the hil-1 mutant strain with respect to the wild-type C. elegans (strain N2) used as a negative control for the CRISPR/Cas genome editing. The results obtained showed a significant increase in lifespan for the C. elegans mutant strain (χ.sup.2=4.58; corrected P-value=0.032) when compared to the C. elegans N2 strain (see TABLE 2).
TABLE-US-00031 TABLE 2
days after hatching   % alive (wt N2)   % alive (mutant hil-1)
0.0                   100               100
1.0                   100               100
1.9                   100               100
3.1                   100               100
4.2                   100               100
6.0                   96.7              100
8.1                   88.3              100
10.2                  81.7              100
12.0                  71.7              100
14.0                  50.0              100
16.1                  13.3              100
18.2                  3.3               100
20.0                  1.7               98.3
22.1                  1.7               98.3
24.2                  0.0               98.3
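The survival percentages in TABLE 2 can be summarized by the first observation day on which a strain falls to a given survival threshold. The Python sketch below only reads the tabulated percentages; it does not reproduce the reported χ.sup.2 statistic, which would require per-individual data that the table does not give. The name `first_day_at_or_below` is hypothetical.

```python
# Data transcribed from TABLE 2 (days after hatching, % alive).
DAYS = [0.0, 1.0, 1.9, 3.1, 4.2, 6.0, 8.1, 10.2, 12.0, 14.0, 16.1, 18.2, 20.0, 22.1, 24.2]
WT_N2 = [100, 100, 100, 100, 100, 96.7, 88.3, 81.7, 71.7, 50.0, 13.3, 3.3, 1.7, 1.7, 0.0]
MUTANT_HIL1 = [100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 98.3, 98.3, 98.3]


def first_day_at_or_below(days, percent_alive, threshold=50.0):
    """Return the first observation day with survival <= threshold, or None."""
    for day, alive in zip(days, percent_alive):
        if alive <= threshold:
            return day
    return None


print(first_day_at_or_below(DAYS, WT_N2))        # 14.0
print(first_day_at_or_below(DAYS, MUTANT_HIL1))  # None: never reached in the assay
```

A result of `None` for the mutant strain means only that the threshold was not reached within the 24.2 days covered by the table.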
[0133] For a person skilled in the art it would be obvious that, given the well-known signs of senescence in C. elegans observable shortly after the individual reaches its adult form, the increased lifespan in the hil-1 mutant strain with respect to the wild-type C. elegans (strain N2) also implies the hil-1 mutant strain is significantly resistant to senescence, thereby demonstrating the industrial applicability of the present invention in vivo.