BIOMARKER FOR PREDICTING AGE IN DAYS OF PIGS, AND PREDICTION METHOD
20230080372 · 2023-03-16
Inventors
- Zhonglin TANG (Shenzhen, CN)
- Yalan YANG (Shenzhen, CN)
- Xinhao FAN (Shenzhen, CN)
- Muya CHEN (Shenzhen, CN)
Cpc classification
G16B40/00
PHYSICS
C12Q1/6876
CHEMISTRY; METALLURGY
G16B5/00
PHYSICS
C12Q1/6883
CHEMISTRY; METALLURGY
Abstract
Provided are biomarkers and a method for predicting age in days of pigs. The biomarkers include one or more CpG sites with different methylation levels, and the different methylation levels of the CpG sites correspond to different ages in days of pigs. An Elastic Net linear regression model is constructed by using the methylation levels of the CpG sites and the weight corresponding to each CpG site, thereby predicting the age in days of a pig to be tested. The prediction method is accurate and reliable in detecting age in days of pigs, fills the gap in DNA methylation-based age prediction models for pigs, and provides an ideal model for investigating important scientific issues such as the development and aging of humans and animals.
Claims
1. Biomarkers for predicting age in days of a pig, comprising multiple CpG sites with different methylation levels, and the different methylation levels of the CpG sites correspond to different ages of pigs; wherein position information of the CpG sites is as follows: chr1:265469121, chr1:6993958, chr1:77278255, chr1:90279146, chr1:10222822, chr1:200765194, chr1:252703561, chr1:127811329, chr1:218682018, chr1:272166208, chr2:112726051, chr2:131821312, chr3:79519033, chr3:71354421, chr3:96708114, chr3:4786944, chr4:110707399, chr4:51236025, chr4:61693637, chr4:35277986, chr4:71941843, chr4:38392750, chr5:46167692, chr5:3442060, chr5:83823568, chr5:86678792, chr6:63915584, chr6:98241827, chr6:7667231, chr6:59654560, chr6:148902979, chr6:131779338, chr6:131779339, chr6:63915581, chr6:151183086, chr6:107410789, chr6:134649996, chr7:15916877, chr7:1722548, chr7:89164845, chr7:14846023, chr7:70113867, chr7:89164756, chr7:86102364, chr7:89164755, chr8:46226086, chr8:71696260, chr8:138571452, chr8:78759323, chr8:116621205, chr8:41380820, chr9:116669694, chr9:68467395, chr9:96069192, chr9:36094595, chr9:73739560, chr9:114311129, chr10:14130890, chr10:14130912, chr10:27158773, chr11:43923343, chr11:13802486, chr12:52792396, chr13:158289588, chr13:32034512, chr13:77838609, chr13:30455076, chr13:85584193, chr13:1535436, chr13:111038503, chr14:31839031, chr14:71122259, chr16:57712066, chr17:43961681, chr18:17893916; and weight information of the CpG sites is as follows:
Number (i)  CpG position information (β)  Weight (w)
1           chr1:265469121                −0.19791914
2           chr1:6993958                  −3.224485644
3           chr1:77278255                 −13.28624592
4           chr1:90279146                 −9.413975275
5           chr1:10222822                 −2.319516222
6           chr1:200765194                6.224564956
7           chr1:252703561                −10.29425473
8           chr1:127811329                −0.288286911
9           chr1:218682018                −8.74861671
10          chr1:272166208                −0.958636654
11          chr2:112726051                −0.00030695
12          chr2:131821312                −1.487907119
13          chr3:79519033                 −1.427572944
14          chr3:71354421                 −14.56809668
15          chr3:96708114                 −5.697719601
16          chr3:4786944                  −6.781267851
17          chr4:110707399                −0.007481015
18          chr4:51236025                 −1.595911641
19          chr4:61693637                 −1.027410147
20          chr4:35277986                 −0.049404384
21          chr4:71941843                 −13.62773853
22          chr4:38392750                 −0.043794313
23          chr5:46167692                 −2.61890723
24          chr5:3442060                  −14.13370338
25          chr5:83823568                 −1.940844913
26          chr5:86678792                 −8.038210429
27          chr6:63915584                 −6.430323147
28          chr6:98241827                 −19.83015838
29          chr6:7667231                  −0.115183771
30          chr6:59654560                 −0.010556261
31          chr6:148902979                −13.09889713
32          chr6:131779338                −0.016545453
33          chr6:131779339                −2.563888441
34          chr6:63915581                 −7.790688318
35          chr6:151183086                −2.317710899
36          chr6:107410789                −7.746859508
37          chr6:134649996                −42.41052359
38          chr7:15916877                 −5.765286814
39          chr7:1722548                  −1.232989258
40          chr7:89164845                 −1.78588923
41          chr7:14846023                 −1.915909405
42          chr7:70113867                 −5.225256985
43          chr7:89164756                 −0.102078131
44          chr7:86102364                 −1.624811107
45          chr7:89164755                 −4.012719139
46          chr8:46226086                 −3.368393933
47          chr8:71696260                 −17.09415973
48          chr8:138571452                −19.74938423
49          chr8:78759323                 −5.382316805
50          chr8:116621205                −4.395514047
51          chr8:41380820                 −0.033290161
52          chr9:116669694                −0.979621002
53          chr9:68467395                 −1.528021515
54          chr9:96069192                 −9.073121614
55          chr9:36094595                 −15.79167462
56          chr9:73739560                 −1.061762087
57          chr9:114311129                −0.276923385
58          chr10:14130890                −0.047930706
59          chr10:14130912                −0.872727299
60          chr10:27158773                −8.310078727
61          chr11:43923343                −5.381489916
62          chr11:13802486                −2.727387937
63          chr12:52792396                −6.930884723
64          chr13:158289588               −2.631225249
65          chr13:32034512                −0.311623607
66          chr13:77838609                1.844834596
67          chr13:30455076                −3.508163558
68          chr13:85584193                −0.540711444
69          chr13:1535436                 −4.226227735
70          chr13:111038503               −4.872094667
71          chr14:31839031                −3.157679713
72          chr14:71122259                −0.311791447
73          chr16:57712066                −0.895052703
74          chr17:43961681                −3.8209032
75          chr18:17893916                −3.998631584
wherein, a method for predicting age in days of a pig, comprising measuring methylation levels of the biomarker CpG sites in genomic DNA of a pig, and utilizing a statistical prediction algorithm to determine age in days of a pig; the statistical prediction algorithm comprises: (a) obtaining a linear combination of methylation levels of the biomarker CpG sites, the method for obtaining the linear combination comprises: the methylation levels of the CpG sites and the corresponding weights of each CpG site are used to construct an Elastic Net linear regression model; and (b) applying a transformation to the linear combination to determine age in days of a pig; and, the model is: $\text{age in days} = w_1\beta_1 + w_2\beta_2 + \dots + w_i\beta_i + \dots + w_{75}\beta_{75} + 383.90$, wherein $w_i$ is the weight of CpG site $i$ and $\beta_i$ is the methylation level of site $i$.
2. A reagent or a kit for predicting age in days of a pig, comprising a reagent capable of detecting the biomarkers according to claim 1, and optional instructions.
3. A method for predicting age in days of a pig, comprising measuring the methylation levels of the biomarker CpG sites of claim 1 in genomic DNA of a pig, and utilizing a statistical prediction algorithm to determine age in days of a pig; the statistical prediction algorithm comprises: (a) obtaining a linear combination of methylation levels of the biomarkers CpG sites, the method for obtaining the linear combination comprises: the methylation levels of the CpG sites and the corresponding weights of each CpG site are used to construct an Elastic Net linear regression model; and (b) applying a transformation to the linear combination to determine age in days of a pig.
4. The method according to claim 3, wherein the version of the pig reference genome used in the model is Sscrofa11.1.
5. The method according to claim 3, wherein the methylation levels of the biomarker CpG sites are measured by measuring the methylation levels of CpG sites in the genome of the biological sample, wherein the biological sample is a muscle, blood, saliva, epidermis, brain, kidney or liver sample of a pig.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0037]
DETAILED DESCRIPTION OF THE INVENTION
[0038] Unless otherwise defined, all scientific and technical terms used in this application have the same meaning as commonly understood by a person of ordinary skill in the art to which this application belongs.
[0039] The technical solutions of the examples according to the present application will be clearly and completely described below with reference to the accompanying drawings in the examples of the present application. Obviously, the described examples are only a part of the examples of the present application, but not all of the examples. Based on the examples in the present application, all other examples obtained by the ordinary skilled person in the art without creative efforts shall fall within the protection scope of the present application.
[0040] Unless otherwise specified, the materials, reagents, etc. used in the following examples are commercially available.
[0041] The present application will be described in detail below with reference to specific examples, which are used to understand rather than limit the present application.
[0042] As used herein, the term “biomarker” refers to a CpG site that may be methylated. Methylation typically occurs in a CpG-containing nucleic acid. A CpG-containing nucleic acid may be present, for example, in a CpG island, a CpG dinucleotide, a promoter, an intron, or an exon of a gene.
[0043] As used herein, the term “DNA methylation” refers to the addition of a methyl group to the 5-carbon of a cytosine residue in a CpG dinucleotide (i.e., forming 5-methylcytosine). DNA methylation can also occur at cytosines in other contexts, such as CHG and CHH, wherein H is adenine, cytosine, or thymine. Cytosine methylation can also be in the form of 5-hydroxymethylcytosine. DNA methylation can include non-cytosine methylation, such as N6-methyladenine.
[0044] As used herein, the term “genome” or “genomic” refers to all genetic material in the chromosomes of an organism. DNA derived from the genetic material in the chromosomes of a particular organism is genomic DNA.
[0045] As used herein, the term “gene” refers to a region of genomic DNA associated with a specified gene. For example, such a region can be defined by a specific gene (including its exons, introns, and associated expression control sequences) and its flanking sequences. It has also been recognized in the art that methylation in a specific region is often indicative of methylation status at a proximal genomic locus.
Example 1
[0046] A method for constructing a model for predicting age in days of pigs, including the following steps:
[0047] 1. Extraction of Pig Genomic DNA
[0048] The muscle tissues of the experimental pigs are sampled and lysed with 0.5 mL of lysis buffer (0.5 mol/L EDTA, 1 mol/L NaCl, 10% SDS, RNase stock), digested with 10 μL of proteinase K (5 mg/mL), and the DNA is extracted by the phenol-chloroform method. The specific steps are as follows:
[0049] (1) cutting the tissues into pieces and placing them in a 1.5 mL centrifuge tube, adding the lysis buffer and proteinase K to the tube, then placing it on a shaker (56° C., 5 h);
[0050] (2) adding an equal volume of Tris-saturated phenol (500 μL) and shaking (10 min);
[0051] (3) centrifuging at 12,000 rpm for 5 min, and transferring the upper layer (supernatant) to a new centrifuge tube;
[0052] (4) preparing a mixed solution of Tris-saturated phenol:chloroform:isoamyl alcohol=25:24:1;
[0053] (5) adding 0.45 mL of the mixed solution from step (4) to the new centrifuge tube containing the supernatant;
[0054] (6) centrifuging at 12,000 rpm for 5 min, transferring the supernatant to a new centrifuge tube, and adding an equal volume (0.4 mL) of a mixture of chloroform and isoamyl alcohol (chloroform:isoamyl alcohol=24:1);
[0055] (7) centrifuging at 12,000 rpm for 5 min, transferring the supernatant to a new centrifuge tube, adding 2.5 volumes of absolute ethanol pre-cooled to −20° C., and incubating at −20° C. overnight;
[0056] (8) centrifuging at 12,000 rpm for 5 min, discarding the supernatant while retaining the white precipitate, adding 0.4 mL of 75% ethanol, pipetting repeatedly, and centrifuging to remove the liquid;
[0057] (9) repeating step (8);
[0058] (10) adding ddH₂O to dissolve the DNA and complete the extraction.
[0059] 2. Whole-Genome Methylation Sequencing and Calculation of the Methylation Levels of CpG Sites
[0060] The whole-genome methylation sequencing reads are aligned to the reference genome to calculate the methylation levels of CpG sites. The specific methods are as follows:
[0061] (1) The genomic DNA extracted in the previous step is randomly fragmented to 200-300 bp using a Covaris S220; the DNA fragments are subjected to end repair and A-tailing, and are ligated to sequencing adapters in which all cytosines are methylated.
[0062] (2) The DNA is then treated with bisulfite using the EZ DNA Methylation-Gold Kit (Zymo Research); after the treatment, unmethylated cytosine (C) is converted to uracil (U) (which becomes thymine (T) after PCR amplification), while methylated C remains unchanged; PCR amplification is then performed to obtain the final DNA library.
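The conversion rule in step (2) can be illustrated in silico. The following is a minimal sketch only, assuming a single read strand and a known set of methylated positions; the function name `bisulfite_convert` and the example sequence are hypothetical and do not come from the application.

```python
def bisulfite_convert(sequence, methylated_positions):
    """Simulate bisulfite treatment followed by PCR on one strand.

    Unmethylated C is read as T after PCR; methylated C stays C.
    `methylated_positions` holds 0-based indices of methylated cytosines.
    """
    converted = []
    for index, base in enumerate(sequence.upper()):
        if base == "C" and index not in methylated_positions:
            converted.append("T")  # unmethylated C -> U -> T after PCR
        else:
            converted.append(base)  # methylated C and all other bases unchanged
    return "".join(converted)


# Hypothetical 8-base fragment with a methylated C at index 1 only.
print(bisulfite_convert("ACGTCCGA", {1}))  # ACGTTTGA
```

Comparing the converted read with the reference sequence is what allows a remaining C to be called methylated and a C-to-T change to be called unmethylated.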
[0063] (3) Illumina sequencing is performed on the DNA library, and the sequencing platform is HiSeq X Ten. The methylation sites are detected by Bismark, and the methylation levels of the identified methylation sites are calculated.
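As a rough illustration of how a per-site methylation level can be derived from read counts, the sketch below computes the fraction of methylated reads at one cytosine. It assumes a six-column, tab-separated coverage record (chromosome, start, end, methylation percentage, count methylated, count unmethylated), which is the layout of Bismark's coverage output; the application does not state which Bismark output file was used, and the function names and the example record are hypothetical.

```python
def methylation_level(count_methylated, count_unmethylated):
    """Fraction of reads supporting methylation at one cytosine.

    Returns None when the site has no coverage.
    """
    total = count_methylated + count_unmethylated
    if total == 0:
        return None
    return count_methylated / total


def parse_coverage_line(line):
    """Parse one tab-separated record with the assumed columns:
    chromosome, start, end, methylation %, count methylated, count unmethylated.
    """
    chrom, start, _end, _percent, meth, unmeth = line.rstrip("\n").split("\t")
    return f"{chrom}:{start}", methylation_level(int(meth), int(unmeth))


# Hypothetical record: 15 methylated and 5 unmethylated reads at chr1:6993958.
site, level = parse_coverage_line("chr1\t6993958\t6993958\t75\t15\t5")
print(site, level)  # chr1:6993958 0.75
```

Expressing the level as a fraction between 0 and 1 is an assumption made here for illustration; the application does not specify the scale of the methylation level.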
[0064] 3. Construction of a Linear Model for Predicting Age in Days of Pigs. The Model is as Follows:
[0065] $\text{Age in days} = w_1\beta_1 + w_2\beta_2 + \dots + w_i\beta_i + \dots + w_{75}\beta_{75} + 383.90$, wherein $w_i$ is the weight of CpG site $i$ and $\beta_i$ is the methylation level at site $i$.
[0066] See Table 1 for the CpG sites and weight information.
TABLE 1
Number (i)  CpG position information (β)  Weight (w)
1           chr1:265469121                −0.19791914
2           chr1:6993958                  −3.224485644
3           chr1:77278255                 −13.28624592
4           chr1:90279146                 −9.413975275
5           chr1:10222822                 −2.319516222
6           chr1:200765194                6.224564956
7           chr1:252703561                −10.29425473
8           chr1:127811329                −0.288286911
9           chr1:218682018                −8.74861671
10          chr1:272166208                −0.958636654
11          chr2:112726051                −0.00030695
12          chr2:131821312                −1.487907119
13          chr3:79519033                 −1.427572944
14          chr3:71354421                 −14.56809668
15          chr3:96708114                 −5.697719601
16          chr3:4786944                  −6.781267851
17          chr4:110707399                −0.007481015
18          chr4:51236025                 −1.595911641
19          chr4:61693637                 −1.027410147
20          chr4:35277986                 −0.049404384
21          chr4:71941843                 −13.62773853
22          chr4:38392750                 −0.043794313
23          chr5:46167692                 −2.61890723
24          chr5:3442060                  −14.13370338
25          chr5:83823568                 −1.940844913
26          chr5:86678792                 −8.038210429
27          chr6:63915584                 −6.430323147
28          chr6:98241827                 −19.83015838
29          chr6:7667231                  −0.115183771
30          chr6:59654560                 −0.010556261
31          chr6:148902979                −13.09889713
32          chr6:131779338                −0.016545453
33          chr6:131779339                −2.563888441
34          chr6:63915581                 −7.790688318
35          chr6:151183086                −2.317710899
36          chr6:107410789                −7.746859508
37          chr6:134649996                −42.41052359
38          chr7:15916877                 −5.765286814
39          chr7:1722548                  −1.232989258
40          chr7:89164845                 −1.78588923
41          chr7:14846023                 −1.915909405
42          chr7:70113867                 −5.225256985
43          chr7:89164756                 −0.102078131
44          chr7:86102364                 −1.624811107
45          chr7:89164755                 −4.012719139
46          chr8:46226086                 −3.368393933
47          chr8:71696260                 −17.09415973
48          chr8:138571452                −19.74938423
49          chr8:78759323                 −5.382316805
50          chr8:116621205                −4.395514047
51          chr8:41380820                 −0.033290161
52          chr9:116669694                −0.979621002
53          chr9:68467395                 −1.528021515
54          chr9:96069192                 −9.073121614
55          chr9:36094595                 −15.79167462
56          chr9:73739560                 −1.061762087
57          chr9:114311129                −0.276923385
58          chr10:14130890                −0.047930706
59          chr10:14130912                −0.872727299
60          chr10:27158773                −8.310078727
61          chr11:43923343                −5.381489916
62          chr11:13802486                −2.727387937
63          chr12:52792396                −6.930884723
64          chr13:158289588               −2.631225249
65          chr13:32034512                −0.311623607
66          chr13:77838609                1.844834596
67          chr13:30455076                −3.508163558
68          chr13:85584193                −0.540711444
69          chr13:1535436                 −4.226227735
70          chr13:111038503               −4.872094667
71          chr14:31839031                −3.157679713
72          chr14:71122259                −0.311791447
73          chr16:57712066                −0.895052703
74          chr17:43961681                −3.8209032
75          chr18:17893916                −3.998631584
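The linear model of this example can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not a definitive implementation: the function name `predict_age_days`, the dictionary layout, and the sample methylation levels are hypothetical; methylation levels are assumed to be fractions between 0 and 1, which the application does not specify; and only three of the 75 weights from Table 1 are included for brevity, so the printed value is not a real age estimate.

```python
INTERCEPT = 383.90  # constant term of the model as stated in the text


def predict_age_days(methylation, weights, intercept=INTERCEPT):
    """Return intercept + sum(w_i * beta_i) over every site in `weights`.

    methylation: dict mapping "chr:pos" -> methylation level beta_i
    weights:     dict mapping "chr:pos" -> weight w_i
    """
    missing = sorted(site for site in weights if site not in methylation)
    if missing:
        raise KeyError(f"no methylation level for: {', '.join(missing)}")
    return intercept + sum(w * methylation[site] for site, w in weights.items())


# Only rows 1-3 of Table 1; a real prediction needs all 75 rows.
WEIGHTS_SUBSET = {
    "chr1:265469121": -0.19791914,
    "chr1:6993958": -3.224485644,
    "chr1:77278255": -13.28624592,
}

# Hypothetical methylation levels, assumed to be fractions between 0 and 1.
sample = {"chr1:265469121": 0.80, "chr1:6993958": 0.50, "chr1:77278255": 0.25}

print(predict_age_days(sample, WEIGHTS_SUBSET))
```

Raising an error when a site is missing, rather than silently treating it as zero, is a design choice of this sketch: because most weights are negative, a missing site would otherwise inflate the predicted age.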
Example 2 Verification of the Accuracy of the CpG Sites and the Model in Example 1
[0067] 1. Extraction of Pig Genomic DNA, Whole-Genome Methylation Sequencing
[0068] Skeletal muscle tissues of the experimental pigs are sampled at 27 time points, with 3 replicates per time point, for a total of 81 samples; 80% of the samples (n=64) are randomly selected as training samples, and the remaining 20% (n=17) serve as test (verification) samples. Each sample is lysed with 0.5 mL of lysis buffer (0.5 mol/L EDTA, 1 mol/L NaCl, 10% SDS, RNase stock) and digested with 10 μL of proteinase K (5 mg/mL), and the DNA is extracted by the phenol-chloroform method. The specific steps are as follows:
[0069] (1) cutting the tissue into pieces and placing them in a 1.5 mL centrifuge tube, adding the lysis buffer and proteinase K to the tube, then placing it on a shaker (56° C., 5 h);
[0070] (2) adding an equal volume of Tris-saturated phenol (500 μL) and shaking (10 min);
[0071] (3) centrifuging at 12,000 rpm for 5 min, and transferring the upper layer (supernatant) to a new centrifuge tube;
[0072] (4) preparing a mixed solution of Tris-saturated phenol:chloroform:isoamyl alcohol=25:24:1;
[0073] (5) adding 0.45 mL of the mixed solution from step (4) to the new centrifuge tube containing the supernatant;
[0074] (6) centrifuging at 12,000 rpm for 5 min, transferring the supernatant to a new centrifuge tube, and adding an equal volume (0.4 mL) of a mixture of chloroform and isoamyl alcohol (chloroform:isoamyl alcohol=24:1);
[0075] (7) centrifuging at 12,000 rpm for 5 min, transferring the supernatant to a new centrifuge tube, adding 2.5 volumes of absolute ethanol pre-cooled to −20° C., and incubating at −20° C. overnight;
[0076] (8) centrifuging at 12,000 rpm for 5 min, discarding the supernatant while retaining the white precipitate, adding 0.4 mL of 75% ethanol, pipetting repeatedly, and centrifuging to remove the liquid;
[0077] (9) repeating step (8);
[0078] (10) adding ddH₂O to dissolve the DNA and complete the genomic DNA extraction.
[0079] 2. Whole-Genome Methylation Sequencing and Calculation of the Methylation Levels of CpG Sites
[0080] (1) The genomic DNA is randomly fragmented to 200-300 bp using a Covaris S220; the DNA fragments are subjected to end repair and A-tailing, and are ligated to sequencing adapters in which all cytosines are methylated.
[0081] (2) The DNA is then treated with bisulfite using the EZ DNA Methylation-Gold Kit (Zymo Research); after the treatment, unmethylated cytosine (C) is converted to uracil (U) (which becomes thymine (T) after PCR amplification), while methylated C remains unchanged; PCR amplification is then performed to obtain the final DNA library.
[0082] (3) Illumina sequencing is performed on the DNA library, and the sequencing platform is HiSeq X Ten. The methylation sites are detected by Bismark, and the methylation levels of the identified methylation sites are calculated.
[0083] (4) The methylation level data of the 64 randomly selected samples with different ages in days are used as training data to construct the model, and the data of the remaining 17 samples with different ages in days are used as verification data; the predicted ages in days are calculated according to the constructed model and compared with the actual ages in days (the comparison results are shown in
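The train-and-verify procedure of step (4) can be sketched as follows. The application names Elastic Net regression but gives no software, penalty strength, or mixing ratio, so everything here is an assumption for illustration: the coordinate-descent solver, the parameters `lam` and `l1_ratio`, the mean absolute error as the comparison metric, and the synthetic data with three hypothetical CpG sites. Only the sample counts (81 samples split 64/17) come from the text.

```python
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the L1 proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_elastic_net(X, y, lam=0.1, l1_ratio=0.5, n_sweeps=200):
    """Coordinate descent for the objective
    (1/2n)*||y - b - Xw||^2 + lam*(l1_ratio*||w||_1 + 0.5*(1-l1_ratio)*||w||^2).
    Returns (weights, intercept); the intercept is not penalised.
    """
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    # Centre features and target so the intercept can be recovered at the end.
    cols = [[row[j] - x_mean[j] for row in X] for j in range(p)]
    resid = [yi - y_mean for yi in y]
    w = [0.0] * p
    for _ in range(n_sweeps):
        for j, col in enumerate(cols):
            denom = sum(c * c for c in col) / n + lam * (1.0 - l1_ratio)
            if denom == 0.0:
                continue  # constant feature and no ridge term: keep weight at 0
            rho = sum(c * (r + c * w[j]) for c, r in zip(col, resid)) / n
            new_w = soft_threshold(rho, lam * l1_ratio) / denom
            delta = new_w - w[j]
            if delta != 0.0:
                resid = [r - c * delta for c, r in zip(col, resid)]
                w[j] = new_w
    intercept = y_mean - sum(wj * mj for wj, mj in zip(w, x_mean))
    return w, intercept


def predict(X, w, intercept):
    """Apply the fitted linear model to each row of X."""
    return [intercept + sum(wj * xj for wj, xj in zip(w, row)) for row in X]


def mean_absolute_error(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Synthetic stand-in data: 81 samples, 3 hypothetical CpG sites, known linear rule.
rng = random.Random(0)
X = [[rng.random() for _ in range(3)] for _ in range(81)]
y = [383.90 - 40.0 * row[0] + 6.0 * row[1] for row in X]  # third site is irrelevant

# Random 64 / 17 split, matching the sample counts in the text.
order = list(range(81))
rng.shuffle(order)
train, test = order[:64], order[64:]

w, b = fit_elastic_net([X[i] for i in train], [y[i] for i in train], lam=0.0)
mae = mean_absolute_error([y[i] for i in test], predict([X[i] for i in test], w, b))
print(len(train), len(test), mae)
```

With `lam=0.0` the fit reduces to ordinary least squares, which makes the sketch easy to check on noiseless data; a positive `lam` shrinks weights and, through the L1 term, sets some of them exactly to zero, which is how Elastic Net can reduce a genome-wide set of CpG sites to a short list.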
[0084] The above descriptions are only preferred examples of the present application, and are not intended to limit the present application. Any modifications, equivalent replacements, etc. made within the spirit and principles of the present application shall be encompassed in the protection scope of the present application.