A CHICKEN METHYLATION CLOCK

Abstract

The invention provides a method of establishing a chicken methylation clock comprising: (a) determining the methylation ratio and the read coverage of the genomic CpG sites of an age-correlated training sample of a specific chicken tissue; (b) defining a set of CpG sites having reliable methylation ratios in all training samples of step (a) using a cutoff value; and (c) performing a penalized regression using the methylation ratios of step (b) as input and the age correlated to the training sample as dependent variable, by applying a penalized regression model; thereby obtaining a set of CpG sites with corresponding weighting factors and intercept of the linear model equation as parameters defining the chicken methylation clock.

Claims

1. A method of establishing a chicken methylation clock, the method comprising: (a) determining a methylation ratio and a read coverage of genomic CpG sites of an age-correlated training sample of a specific chicken tissue; (b) defining a set of CpG sites having reliable methylation ratios in all training samples of the determining (a) using a cutoff value; and (c) performing a penalized regression using the methylation ratios of the defining (b) as input and the age correlated to the training sample as dependent variable, by applying a penalized regression model; thereby obtaining a set of CpG sites with corresponding weighting factors and intercept of a linear model equation as parameters defining the chicken methylation clock.

2. The method according to claim 1, wherein the methylation ratio and the read coverage of the genomic CpG sites are determined in the determining (a) using bisulfite sequencing.

3. The method according to claim 1, wherein the age-correlated training sample of the specific chicken tissue used in the determining (a) is selected from the group consisting of gut tissue, muscle tissue, organ tissue and skin tissue.

4. The method according to claim 1, wherein the coverage cutoff value defined in the defining (b) is 3.

5. The method according to claim 1, the method further comprising: (d) optimizing a fit of the regression model such that the number of CpG sites is reduced to below 100.

6. The method according to claim 5, wherein optimizing the fit of the regression model in the optimizing (d) comprises applying ridge regression in combination with lasso regression.

7. The method according to claim 6, wherein the methods of ridge regression and lasso regression are balanced using an alpha value of between 0 and 1.

8. An in vitro method for predicting the chronological age of a chicken, the method comprising: a) obtaining genomic chicken DNA from biological sample material deriving from a chicken subject or from a chicken population to be tested; b) determining a methylation level for the CpG sites indicated in Table 2 or, alternatively, for the CpG sites indicated in Table 3, in the genomic chicken DNA; c) comparing the methylation levels of the CpG sites in the genomic chicken DNA from the sample material to be tested with the methylation levels of the same CpG sites from an age-correlated reference sample material, and deducing therefrom the chronological age of the subject or the population to be tested.

9. A method for predicting the chronological age of a chicken tissue sample material, the method comprising: (a) obtaining genomic DNA from the tissue sample material deriving from a chicken subject or from a chicken population to be tested, (b) determining the methylation ratios for the CpG sites indicated in Table 2 or, alternatively, for the CpG sites indicated in Table 3, and multiplying same with their respective weighting factors to obtain weighted methylation ratios of those CpG sites, (c) computing a sum over the weighted methylation ratios obtained in the determining (b) and adding a respective intercept of a linear model equation, thus predicting the chronological age for the chicken tissue sample.

10. The method according to claim 9, wherein the tissue sample material deriving from the chicken subject or from the chicken population to be tested in the obtaining (a) is selected from the group consisting of gut tissue, muscle tissue, organ tissue and skin tissue.

11. The method according to claim 9, wherein the tissue sample material deriving from the chicken subject or from the chicken population to be tested used in the obtaining (a) is gut tissue, preferably isolated from fecal sample material.

12. The method according to claim, wherein the methylation ratios for the CpG sites in the determining (b) were determined using bisulfite sequencing.

13. A method for detecting accelerated aging in a chicken tissue sample material, the method comprising: (a) obtaining genomic DNA from the tissue sample material deriving from a chicken subject or from a chicken population to be tested, (b) determining the methylation ratios for the CpG sites indicated in Table 2 or, alternatively, for the CpG sites indicated in Table 3, and multiplying same with their respective weighting factors to obtain the weighted methylation ratios of those CpG sites, (c) computing a sum over the weighted methylation ratios obtained in the determining (b) and adding a respective intercept of a linear model equation, thus predicting the age for the chicken tissue sample (epigenetic age), and (d) comparing the predicted age of the computing (c) with the actual chronological age of the tissue sample, wherein a predicted age higher than the chronological age is indicative of accelerated aging in the chicken tissue sample.

Description

BRIEF DESCRIPTION OF THE FIGURES

[0114] FIG. 1. Mean squared error of a trained clock for given alpha at value of lambda leading to the minimal error.

[0115] FIG. 2. Number of CpGs for given alpha at value of lambda leading to the minimal error.

EXAMPLES

Methods

[0116] A broiler study was conducted with Ross 308 male broilers fed industry-standard, three-phase corn-soybean meal diets formulated to meet all nutrient requirements from day 1 to day 35 (Table 1).

TABLE 1

Ingredients, %                  Starter        Grower         Finisher
                                (day 1-14)     (day 15-28)    (day 29-35)
Corn                            54.38          62.93          63.24
Soybean Meal (48% CP)           35.00          26.83          25.48
Corn Gluten Meal (60% CP)       4.00           4.00           4.00
Soybean Oil                     2.66           2.45           3.80
Dicalcium phosphate 22          1.74           1.61           1.42
Limestone (CaCO3)               0.75           0.75           0.69
Salt (NaCl)                     0.36           0.37           0.34
Choline Chloride 60%            0.10           0.10           0.10
Vitamin Mineral Premix          0.50           0.50           0.50
DL-Methionine                   0.25           0.20           0.21
L-Lysine-HCl                    0.22           0.23           0.19
L-Threonine                     0.04           0.04           0.03
Total                           100            100            100

Nutrient composition, as is     Starter        Grower         Finisher
ME, kcal/kg                     3008           3086           3186
CP, %                           23.90          20.45          19.69
Ca                              0.90           0.84           0.76
Available Phosphorous           0.45           0.42           0.38
Lysine                          1.36           1.15           1.09
Methionine                      0.62           0.53           0.52
Methionine + Cysteine           1.00           0.86           0.84
Threonine                       0.9,           0.80           0.76
Tryptophan                      0.27           0.23           0.21
Arginine                        1.50           1.30           1.18
Isoleucine                      1.00           0.85           0.80
Leucine                         2.19           1.98           1.91
Valine                          1.10           0.95           0.90

[0117] Three physiologically healthy birds were euthanized at each of days 3, 15 and 35 to excise spleen, intestinal (ileum) and muscle (pectoralis major) samples for DNA extraction (using an Invitrogen PureLink genomic DNA isolation kit) and bisulfite sequencing.

Samples

[0118] Animals were stratified into three tissue (breast, ileum and spleen) and three age (3d, 15d, 34d) groups. From each of these 9 groups, DNA was prepared from three independent animals, resulting in 27 genomic DNA samples.

Whole Genome Bisulfite Sequencing

[0119] Whole-genome bisulfite sequencing was carried out as a sequencing service. Libraries were prepared using the Accel-NGS Methyl-Seq DNA Library Kit from Swift Biosciences. Two barcoded sequencing libraries were pooled per sequencing lane. Sequencing was performed on an Illumina HiSeq X platform using a standard paired-end sequencing protocol with a read length of 105 nucleotides.

Read Mapping

[0120] Reads were trimmed and mapped with BSMAP 2.5 (Xi Y, Li W. 2009. BSMAP: whole genome bisulfite sequence MAPping program. BMC Bioinformatics 10:232. doi:10.1186/1471-2105-10-232.) using the Gallus gallus genome assembly version 5.0 (https://www.ebi.ac.uk/ena/data/view/GCA_000002315.3) as a reference sequence. Duplicates were removed using the Picard tool (http://broadinstitute.github.io/picard). Methylation ratios were determined using a Python script (methratio.py) distributed together with the BSMAP package by dividing the number of reads having a methylated CpG at a certain genomic position by the number of all reads covering this position.
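The ratio described above can be illustrated with a minimal Python sketch. It is not the methratio.py script itself; the function name methylation_ratio and the read counts are hypothetical and serve only to show the division of methylated reads by all reads covering a position.

```python
def methylation_ratio(methylated_reads: int, total_reads: int) -> float:
    """Fraction of reads covering a CpG position that carry a methylated C."""
    if total_reads <= 0:
        raise ValueError("position has no read coverage")
    if not 0 <= methylated_reads <= total_reads:
        raise ValueError("methylated reads must lie between 0 and total reads")
    return methylated_reads / total_reads


# Hypothetical counts for one CpG position: 6 of 8 covering reads are methylated.
ratio = methylation_ratio(6, 8)
print(ratio)
```

A position without any covering read has no defined ratio, which is why the sketch raises an error instead of returning zero.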

Establishment of a Chicken DNA Methylation Clock

[0121] A penalized regression model (implemented in the R package glmnet [https://cran.r-project.org/web/packages/glmnet/]) was used to regress the chronological age on the CpG probes in the training set. For the genome-wide clock, we restricted the analysis to CpGs that showed a strand-specific coverage of greater than 3 in every sequenced sample, resulting in a set of 12,876,934 CpGs. For the LMR clock, we restricted the analysis to CpGs within low-methylated regions that showed a strand-specific coverage of greater than 3 in every sequenced sample, resulting in a set of 765,266 CpGs.
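The coverage restriction can be sketched as follows in Python. This is an illustration under assumptions, not the pipeline actually used: the function name reliable_cpgs, the dictionary layout and the toy coverage values are hypothetical. Only the rule itself (strand-specific coverage greater than 3 in every sample) is taken from the text.

```python
MIN_COVERAGE = 3  # cutoff from the text: coverage must be greater than 3


def reliable_cpgs(coverage_by_sample):
    """Return CpG keys whose coverage exceeds MIN_COVERAGE in every sample.

    coverage_by_sample maps a sample name to a dict of
    (chromosome, position, strand) -> read count (hypothetical layout).
    """
    samples = list(coverage_by_sample.values())
    if not samples:
        return set()
    # Only CpGs observed in all samples can pass the filter.
    shared = set(samples[0])
    for cov in samples[1:]:
        shared &= set(cov)
    return {cpg for cpg in shared
            if all(cov[cpg] > MIN_COVERAGE for cov in samples)}


# Toy coverage data for two samples and three CpG positions.
toy = {
    "sample_1": {("chrA", 100, "+"): 5, ("chrA", 200, "+"): 3, ("chrA", 300, "+"): 9},
    "sample_2": {("chrA", 100, "+"): 4, ("chrA", 200, "+"): 8},
}
kept = reliable_cpgs(toy)
print(sorted(kept))
```

In the toy data, only the position at 100 is kept: position 200 has a coverage of exactly 3 in one sample, and position 300 is missing from the second sample.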

Results

[0122] The alpha parameter of glmnet was varied between 0 and 1 and set to 0.54 (elastic net regression), because this value gave a fit close to the best fit while retaining a manageable number of CpGs. The lambda value was chosen by cross-validation on the training data as 0.668. This identified a set of 63 CpGs together with corresponding beta values, which define the weights for these CpGs used in the chicken methylation clock. The mean squared error of 9-fold cross-validation using the values of 0.54 for alpha and 0.668 for lambda was 2.912493 squared days. Taking the square root, this indicates that the age of a new sample can be predicted with an error of about 1.7 days. To apply the clock to a new sample, the methylation ratios of this sample at the 63 clock CpGs have to be provided, and the predict method of the package glmnet (predict.cv.glmnet for a cross-validated fit) has to be called with the trained clock.
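For orientation, the elastic net objective that glmnet minimizes for a Gaussian response, and the conversion of the cross-validated mean squared error into an error in days, can be written as below. The symbols are assumptions introduced here for illustration: N is the number of training samples, y_i the chronological age of sample i, x_i its vector of methylation ratios, beta_0 the intercept and beta the weighting factors.

```latex
\min_{\beta_0,\,\beta}\;
\frac{1}{2N}\sum_{i=1}^{N}\bigl(y_i-\beta_0-x_i^{\top}\beta\bigr)^2
+\lambda\Bigl[\tfrac{1-\alpha}{2}\,\lVert\beta\rVert_2^2+\alpha\,\lVert\beta\rVert_1\Bigr]

\mathrm{RMSE}=\sqrt{\mathrm{MSE}}=\sqrt{2.912493}\approx 1.71\ \text{days}
```

With alpha = 0 the penalty reduces to ridge regression and with alpha = 1 to lasso regression; the value of 0.54 weights the two penalties roughly equally.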

TABLE 2

Clock CpGs (genome-wide methylation, alpha = 0.54, lambda = 0.6688101, number of CpGs: 63).
Position: position of C (Gallus gallus genome assembly version 5.0).
Weighting factor: weighting factor of C in linear model equation, found by glmnet.

ID    Chromosome    Position      Weighting factor
1     chr1          25008830      0.1201470442
2     chr1          25111167      1.9877222333
3     chr1          25111203      0.8410870744
4     chr1          25215851      1.2141924707
5     chr1          25224869      0.0017297228
6     chr1          38151254      -1.1629356009
7     chr1          98225254      -0.6425407389
8     chr1          98232924      -0.0209456299
9     chr1          98449317      2.4306655726
10    chr1          120302870     -0.5266391056
11    chr1          122381098     0.2068133074
12    chr1          136726442     0.1064898370
13    chr1          136947007     0.7150939026
14    chr1          136947024     -0.1005845211
15    chr1          155719172     -0.3553818254
16    chr1          165952547     -1.0686961681
17    chr2          37156274      -0.7234326082
18    chr2          50274933      -1.9509594895
19    chr2          65836919      0.0012483399
20    chr2          84075642      0.0753794983
21    chr2          97679432      -0.8587247218
22    chr2          124393300     0.0008574341
23    chr2          130294650     -0.2675000430
24    chr3          37670532      0.2441000022
25    chr4          17114627      -0.0144030257
26    chr4          17988283      -0.8468332860
27    chr4          30020074      -0.8090270874
28    chr4          56349569      -0.2213015568
29    chr4          83974212      1.3688948067
30    chr5          12624758      -0.5512961131
31    chr5          41390309      -0.0005570298
32    chr5          41470627      0.4950833229
33    chr5          43219792      0.0102589180
34    chr5          43267010      -0.5840711563
35    chr5          43287871      -0.9018947773
36    chr5          52981453      -1.6852256551
37    chr5          52988321      -0.0034053300
38    chr5          54867941      0.9552754916
39    chr6          10832725      0.7998053520
40    chr8          21190795      -0.3469761054
41    chr8          24062272      -0.0340524553
42    chr8          27668980      -0.0117602447
43    chr10         9910081       0.8644423076
44    chr10         9910314       0.3696403999
45    chr10         9910365       0.5854942871
46    chr10         10336823      0.3635723481
47    chr10         10841972      0.3427303092
48    chr12         11080808      0.8281174497
49    chr12         13217820      -0.6668570734
50    chr13         10242508      -0.0068685759
51    chr14         5256102       0.4507668235
52    chr14         7431894       -1.2268127639
53    chr15         1894201       0.0047490047
54    chr15         7687079       -2.2098760578
55    chr15         9810477       0.3282647376
56    chr17         1680784       0.5952569372
57    chr17         4917324       0.0043603462
58    chr17         4920172       0.1375606500
59    chr18         1275141       -0.3368529858
60    chr21         5371614       -0.2070140738
61    chr23         1181275       -0.4956393650
62    chr24         3272445       -0.6063586075
63    chr24         3311935       -1.6906353118

Intercept of linear model equation found by glmnet: 20.6479265656

TABLE 3

Clock CpGs (LMR methylation, alpha = 0.54, lambda = 0.610, number of CpGs: 54).
Position: position of C (Gallus gallus genome assembly version 5.0).
Weighting factor: weighting factor of C in linear model equation, found by glmnet.

ID    Chromosome    Position      Weighting factor
1     chr1          25248650      0.193774594770605
2     chr1          40591436      -0.568643213730932
3     chr1          40591679      -0.638478225996738
4     chr1          63795100      0.318371926245549
5     chr1          117783796     0.461005791871399
6     chr1          123160777     -0.577270474407336
7     chr1          179633224     -0.628500576957
8     chr2          4113495       -0.826267487270121
9     chr2          7824662       1.64648733634692
10    chr2          16989601      -1.22334588133107
11    chr2          44987393      -3.44735070874419
12    chr2          54405445      -0.0214814175623492
13    chr4          12748089      1.70526212493669
14    chr4          20601950      -2.0957909954246
15    chr4          52871995      1.3114692051496
16    chr4          73741292      -1.12309056504113
17    chr4          81578474      1.26147815810531
18    chr4          84535022      4.3231052389549
19    chr5          12620145      -0.458731514322129
20    chr5          30814076      -0.698405329529931
21    chr5          56027840      -0.482584639740165
22    chr6          10260287      0.880413359385883
23    chr6          12120484      -2.10440924485703
24    chr6          26156764      -0.270864342181805
25    chr6          32055247      -2.69265299854456
26    chr6          33836727      1.15966877632569
27    chr7          31520228      0
28    chr8          22843445      -1.14886729390649
29    chr8          24062272      -2.78258737459703
30    chr8          25204975      -0.664256989487943
31    chr8          27279773      -2.4140540423127
32    chr9          23007215      1.52778880867768
33    chr10         17028547      -0.740108869993362
34    chr10         19606919      -2.61801343344312
35    chr10         19898184      0
36    chr11         8253242       0.920254116414776
37    chr11         16879868      -1.81033374964972
38    chr11         18940226      0
39    chr12         13161054      0
40    chr12         14733821      2.90096891915326
41    chr13         17346033      1.39473025721805
42    chr13         17381809      0
43    chr14         13938035      -1.58848264991683
44    chr15         3464122       0.722266869090579
45    chr15         6016063       -0.251682283409453
46    chr15         6041329       -1.30521623912729
47    chr17         3046991       -0.533407893149151
48    chr18         4397723       -1.46131922207642
49    chr18         4397729       -3.19274009960717
50    chr18         6084926       -2.47686681997711
51    chr24         2956007       -0.010441921766146
52    chr28         3202682       -2.22932163085087
53    chrZ          52701266      0.269214728598885
54    chrZ          56931170      0.22912910718348

Intercept of linear model equation found by glmnet: 27.1994178510791
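How a clock defined by weighting factors and an intercept is applied to a sample can be sketched in Python as a weighted sum. The function names predict_age and is_accelerated are hypothetical, and the weights, intercept and methylation ratios below are toy values chosen for easy checking, not entries from Table 2 or Table 3. In practice the weights dictionary would hold every CpG of the chosen table together with its intercept.

```python
def predict_age(methylation_ratios, weights, intercept):
    """Epigenetic age = intercept + sum(weighting factor * methylation ratio)."""
    missing = [cpg for cpg in weights if cpg not in methylation_ratios]
    if missing:
        raise KeyError(f"missing methylation ratios for: {missing}")
    return intercept + sum(w * methylation_ratios[cpg]
                           for cpg, w in weights.items())


def is_accelerated(predicted_age, chronological_age):
    """A predicted age above the chronological age indicates accelerated aging."""
    return predicted_age > chronological_age


# Toy clock with two CpGs, keyed by (chromosome, position); values are illustrative.
toy_weights = {("chrA", 100): 2.0, ("chrA", 200): -1.0}
toy_intercept = 20.0
toy_ratios = {("chrA", 100): 0.5, ("chrA", 200): 0.25}

age = predict_age(toy_ratios, toy_weights, toy_intercept)
print(age)                        # 20.75
print(is_accelerated(age, 15.0))  # True
```

The sketch refuses to predict when a clock CpG has no methylation ratio in the sample, since silently skipping a site would shift the predicted age.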