Genomic mating method for Huaxi cattle based on whole genome single nucleotide polymorphism information and application thereof
12426578 · 2025-09-30
Assignee
Inventors
- Xue Gao (Beijing, CN)
- Junya LI (Beijing, CN)
- Yuanqing WANG (Beijing, CN)
- Bo ZHU (Beijing, CN)
- Zezhao WANG (Beijing, CN)
- Yan CHEN (Beijing, CN)
- Lupei ZHANG (Beijing, CN)
- Lingyang XU (Beijing, CN)
Cpc classification
G16B20/20
PHYSICS
International classification
G01N33/50
PHYSICS
Abstract
Disclosed are a genomic mating method for Huaxi cattle based on whole genome single nucleotide polymorphism (SNP) information and an application thereof. The method includes the following specific steps: step 1, extracting deoxyribonucleic acid (DNA) from to-be-hybridized Huaxi cattle individuals for genotyping; step 2, performing genotype data imputation to obtain high-density chip data; step 3, calculating an additive genetic relationship matrix, utilizing genomic best linear unbiased prediction (GBLUP) to obtain genomic estimated breeding values of five important economic traits of a to-be-hybridized Huaxi cattle population, and calculating a comprehensive selection index of the individuals; and step 4, using a genetic algorithm to construct a population optimal mating combination list. With the present invention, the breeding cost is greatly reduced and the inbreeding level of offspring populations is lowered.
Claims
1. A mating method for Huaxi cattle based on whole genome single nucleotide polymorphism (SNP) information, comprising the following specific steps: step 1, extracting deoxyribonucleic acid (DNA) from to-be-hybridized Huaxi cattle individuals, utilizing a Cattle110K gene chip for genotyping, and performing data processing and quality control; step 2, performing genotype data imputation to obtain 770K high-density chip data, performing numerical processing on the genotype data after imputation, and utilizing a whole genome data analysis toolset to convert genotypes AA, Aa and aa into 0, 1 and 2, respectively; step 3, calculating an additive genetic relationship matrix according to a VanRaden algorithm, utilizing genomic best linear unbiased prediction (GBLUP) to obtain genomic estimated breeding values of five important economic traits of a to-be-hybridized Huaxi cattle population, and calculating a comprehensive selection index of the individuals according to the genomic estimated breeding value of each of the traits; and step 4, constructing a population optimal mating combination list by using a genetic algorithm, calculating, according to genotype data of bulls and cows of each mating pair and the additive genetic relationship matrix, an expected comprehensive selection index value and an inbreeding coefficient of an offspring population of each mating pair under the condition of considering mutations by using the genetic algorithm, optimizing a mating combination between to-be-hybridized cows and candidate bulls according to the expected comprehensive selection index value and the inbreeding coefficient, and finally providing a mating list of the to-be-hybridized cows with optimal candidate bulls; wherein the five important economic traits in step 3 comprise carcass weight, calving ease, weaning weight, average daily gain and a dressing percentage, the additive genetic relationship matrix is calculated according to the VanRaden algorithm, and the genomic 
estimated breeding value is calculated using a GBLUP model, the model being as follows:
$y = Xb + Za + e$, where $y$ represents a phenotypic observation value vector; $X$ is an $n \times f$ dimensional incidence matrix; $b$ is an $f$ dimensional fixed effect vector; $f$ is the number of fixed effects; $Z$ is a structural matrix associated with $a$; $a$ represents an additive effect vector and obeys the normal distribution $N(0, G\sigma_g^2)$, $G$ being an additive genome relationship matrix, and $\sigma_g^2$ being an additive genetic variance; and $e$ is a residual vector and obeys the normal distribution $N(0, I\sigma_e^2)$; wherein in step 4, the calculating an expected comprehensive selection index value and an inbreeding coefficient of an offspring population of each mating pair under the condition of considering mutations by using an optimized genetic algorithm model, optimizing a mating combination between to-be-hybridized cows and candidate bulls according to the expected comprehensive selection index value and the inbreeding coefficient, and finally providing a mating list of the to-be-hybridized cows with the optimal candidate bulls comprises the following optimized genetic algorithm model:
Inbreeding(P)=1.sub.N.sub.
Gain(P)=1.sub.N.sub.
2. The mating method for Huaxi cattle based on whole genome SNP information according to claim 1, wherein the quality control in step 1 is that only autosomal sites are retained, sites with a genotyping success rate of less than 90%, a minor allele frequency (MAF) of less than 0.05 or a Hardy-Weinberg (HW) equilibrium test P value of less than 0.000001 are eliminated, and the genotyping is performed using the Cattle110K chip.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
(1) For ease of explanation, the present disclosure is described in detail in the following detailed description and accompanying drawings.
DETAILED DESCRIPTION
(4) The specific embodiments of the present disclosure will be described below, and the technical solutions of the present disclosure will be further described with reference to the accompanying drawings; but the present disclosure is not limited to these embodiments. In the following description, specific details, such as specific configurations, are provided only to help fully understand the embodiments of the present disclosure. Accordingly, it is clear to those skilled in the art that various changes and modifications can be made to the embodiments described herein without departing from the scope and spirit of the present disclosure.
(5) Unless otherwise specified, the technical means used in the embodiments are conventional and well known to those skilled in the art. Chemical reagents used in the embodiments are commercially available.
(6) A genomic mating method for Huaxi cattle populations based on whole genome SNP information according to the present disclosure includes the following steps.
(8) I. Blood samples are collected from each cow of the to-be-hybridized Huaxi cattle populations and frozen for preservation, and DNA is extracted and genotyped using a Cattle110K gene chip. PLINK 1.0 software is used for processing and quality control of the genotype data. An SNP is retained only if: (1) it is located on an autosome; (2) its MAF is greater than 0.05; (3) its call rate is greater than 0.9; and (4) its HW equilibrium test gives $P > 1 \times 10^{-6}$.
(9) II. Using the 770K chip data of 3928 cattle in a previously established Huaxi cattle reference population, the 110K chip data of the to-be-hybridized population are imputed to 770K high-density chip data (774,660 SNPs) with Beagle software. The imputed genotype data are numerically processed using PLINK 1.0 to re-encode genotypes AA, Aa and aa as 0, 1 and 2, respectively.
(10) III. Using the data obtained in step II, an additive genetic relationship matrix is constructed with the VanRaden model through the A.mat function in the rrBLUP software package. Based on genotype and phenotype data of a Huaxi cattle reference population, the ASReml-R software package is employed to run GBLUP and obtain genomic estimated breeding values of five important economic traits (carcass weight, calving ease, weaning weight, average daily gain and dressing percentage) for the to-be-hybridized population, and a comprehensive selection index is calculated.
(12) IV. The top 10% of bulls and the top 90% of cows ranked by comprehensive selection index (GCBI) value are selected and retained, the corresponding genotype data are extracted using PLINK 1.0, and the expected GCBI value and inbreeding coefficient of the offspring are calculated for all possible pairs based on the optimized genetic algorithm model using the TrainSel software package. The optimal mating scheme is solved for so as to maximize the expected GCBI value of the offspring while minimizing their inbreeding coefficient.
(13) V. A mating list is provided for each to-be-hybridized cow with the optimal candidate bull.
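As a rough illustration of step III, the following Python sketch builds an additive genomic relationship matrix from 0/1/2 genotype codes with the first VanRaden formulation, G = ZZ' / (2 Σ p_j (1 − p_j)), where Z holds the genotype codes centered by twice the allele frequency. It is not the rrBLUP A.mat implementation. The function name vanraden_g, the toy genotypes, and the choice to estimate allele frequencies from the same sample are assumptions made only for this example.

```python
def vanraden_g(genotypes):
    """Genomic relationship matrix from 0/1/2 allele counts.

    genotypes: list of rows, one row per animal, one column per SNP.
    Allele frequencies are estimated from the sample itself (an assumption).
    """
    n_animals = len(genotypes)
    n_snps = len(genotypes[0])

    # Frequency of the counted allele at each SNP.
    freq = [sum(row[j] for row in genotypes) / (2 * n_animals)
            for j in range(n_snps)]

    # Center each genotype code by 2p so every column of Z sums to zero.
    z = [[row[j] - 2 * freq[j] for j in range(n_snps)] for row in genotypes]

    # Scaling factor 2 * sum(p * (1 - p)).
    denom = 2 * sum(p * (1 - p) for p in freq)
    if denom == 0:
        raise ValueError("all SNPs are monomorphic; G is undefined")

    return [[sum(z[a][j] * z[b][j] for j in range(n_snps)) / denom
             for b in range(n_animals)]
            for a in range(n_animals)]


# Toy data: 4 animals x 3 SNPs (hypothetical values).
toy = [[0, 1, 2],
       [1, 1, 0],
       [2, 0, 1],
       [1, 2, 1]]
G = vanraden_g(toy)
```

The diagonal of G relates to each animal's own genomic inbreeding, and the off-diagonal entries are the pairwise relationships that the mating step later uses to penalize closely related pairs.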
Embodiment 1 Genomic Mating of Huaxi Cattle Based on Whole Genome SNP Information
(14) Experimental materials: a total of 137 Huaxi breeding bulls from Tongliao Jingyuan Cattle Breeding Co. Ltd., Henan Dingyuan Cattle Breeding Co. Ltd., and other bull stations, as well as a total of 213 Huaxi cows from Jilin Allgenes Agriculture and Animal Husbandry Technology Development Co. Ltd., were selected.
(15) The specific steps are as follows:
(16) I. Blood samples were collected from a vein of each Huaxi animal and stored in 5 mL EDTA vacuum anticoagulant blood collection tubes (special blood collection tubes from MolBreeding) for frozen preservation. The samples were mailed to Shijiazhuang MolBreeding Biotechnology Co. Ltd. and registered through the MolBreeding sample delivery management system. Genotype information was obtained for the 110K genome SNP chip (112,180 SNPs).
(17) II. Before the analysis, quality control was performed on the genotype data, and PLINK 1.0 was employed to eliminate unqualified SNPs. The quality control criteria and command for this study were: plink --cow --file filename --geno 0.1 --maf 0.05 --hwe 0.000001 --recode --out filename. After quality control, 350 Huaxi cattle and 106,658 SNPs remained. A reference panel built from Illumina BovineHD 770K high-density chip data of 5099 Huaxi cattle was then used to impute the 110K chip data of the Huaxi cattle populations. First, the Beagle software was used to impute the SNP genotypes that were missing after quality control, with the running command: java -Xmx1000m -jar unphased=file.bgl out=output niterations=100. Next, the Beagle software was used to impute the 110K chip data to 770K high-density chip data (774,660 SNPs), with the running command: java -Xmx1000m -jar gt=filename.vcf ref=imputation_ref.vcf.gz out=filename. The resulting genotype files were re-encoded into the three genotype codes 0, 1 and 2 using PLINK 1.0, with the running command: plink --cow --vcf filename --recode A --out filename. The transformed genotype data were used for the subsequent genomic mating of Huaxi cattle.
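To make the three per-SNP thresholds concrete (--geno 0.1, --maf 0.05, --hwe 0.000001), here is a minimal Python sketch of the same filters applied to one SNP at a time. It is an illustration under stated assumptions, not a reimplementation of PLINK: the function name snp_passes_qc is hypothetical, genotypes are assumed to be already coded 0/1/2 with None for a missing call, and the Hardy-Weinberg check uses a one-degree-of-freedom chi-square approximation, whereas PLINK applies an exact test, so borderline SNPs may be classified differently.

```python
import math


def snp_passes_qc(genotypes, min_call_rate=0.9, min_maf=0.05, hwe_p_min=1e-6):
    """Return True if one SNP passes call-rate, MAF and HWE filters.

    genotypes: list of 0/1/2 allele counts, with None for a missing call.
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        return False

    # Call rate: share of animals with a genotype at this SNP.
    call_rate = len(called) / len(genotypes)
    if call_rate < min_call_rate:
        return False

    # Minor allele frequency from the allele counts.
    n = len(called)
    p = sum(called) / (2 * n)
    q = 1 - p
    if min(p, q) < min_maf:
        return False

    # Chi-square test of observed vs. expected genotype counts (1 d.f.).
    observed = [called.count(0), called.count(1), called.count(2)]
    expected = [n * q * q, 2 * n * p * q, n * p * p]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    p_value = math.erfc(math.sqrt(chi2 / 2))  # survival function, 1 d.f.
    return p_value >= hwe_p_min


# Hypothetical SNPs, 100 animals each.
in_equilibrium = [0] * 25 + [1] * 50 + [2] * 25
all_heterozygous = [1] * 100
rare_allele = [0] * 98 + [1] * 2
poorly_called = [None] * 20 + [0] * 20 + [1] * 40 + [2] * 20

kept = [snp_passes_qc(s) for s in
        (in_equilibrium, all_heterozygous, rare_allele, poorly_called)]
```

In this toy set only the first SNP survives: the second fails the equilibrium test, the third has a minor allele frequency of 0.01, and the fourth has a call rate of 0.8.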
(18) III. The VanRaden model was used to construct an additive genetic relationship matrix (G matrix) through the A.mat function in the rrBLUP software package. The ASReml-R software package was used for genetic evaluation, and the GBLUP model was adopted to calculate the genomic estimated breeding values for five traits (carcass weight, calving ease, weaning weight, average daily gain and dressing percentage) of the to-be-hybridized Huaxi cattle. The reference population was drawn from a total of 3928 Huaxi cattle in the reference population previously established by the Institute of Animal Sciences, Chinese Academy of Agricultural Sciences. From the calculated genomic estimated breeding values of the five traits of each individual in the to-be-hybridized population, the comprehensive selection index GCBI was calculated, and bulls and cows were ranked separately by it. The genetic parameters of the traits are shown in Table 1.
(19) TABLE 1 Genetic parameters of various traits

| Traits | Genetic variance | Environmental variance | Phenotypic variance | Heritability (h²) | Accuracy of genomic estimated breeding value |
| --- | --- | --- | --- | --- | --- |
| Calving ease | 1.69 | 5.99 | 7.68 | 0.22 | 0.51 |
| Weaning weight | 594.84 | 780 | 1374.84 | 0.43 | 0.56 |
| Daily gain in fattening period | 0.0121 | 0.0131 | 0.252 | 0.48 | 0.61 |
| Carcass weight | 268.69 | 328.4 | 597.09 | 0.45 | 0.64 |
| Dressing percentage | 0.0169 | 0.0394 | 0.0563 | 0.3 | 0.52 |
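The heritabilities in Table 1 can be checked against the variance components, assuming the usual definition h² = genetic variance / (genetic variance + environmental variance). The short Python sketch below recomputes h² for each trait and also tests whether the listed phenotypic variance equals the sum of the two components. The function name check_row is hypothetical. For daily gain the listed phenotypic variance (0.252) does not equal the component sum (0.0252), while h² computed from the components still rounds to the tabulated 0.48; this may indicate a misplaced decimal, though the source does not confirm it.

```python
import math

# (genetic variance, environmental variance, phenotypic variance, h2) from Table 1.
TABLE_1 = {
    "Calving ease": (1.69, 5.99, 7.68, 0.22),
    "Weaning weight": (594.84, 780.0, 1374.84, 0.43),
    "Daily gain in fattening period": (0.0121, 0.0131, 0.252, 0.48),
    "Carcass weight": (268.69, 328.4, 597.09, 0.45),
    "Dressing percentage": (0.0169, 0.0394, 0.0563, 0.30),
}


def check_row(var_g, var_e, var_p, h2_listed):
    """Recompute h2 from the components and test the variance sum."""
    h2_from_components = var_g / (var_g + var_e)
    h2_matches = round(h2_from_components, 2) == round(h2_listed, 2)
    sum_matches = math.isclose(var_g + var_e, var_p, rel_tol=1e-6)
    return h2_from_components, h2_matches, sum_matches


report = {trait: check_row(*values) for trait, values in TABLE_1.items()}

# Traits whose listed phenotypic variance is not the sum of its components.
inconsistent = [t for t, (_, _, sum_ok) in report.items() if not sum_ok]
```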
(20) IV. The Huaxi bulls and cows were selected and retained according to their GCBI values. The selection criteria were a bull GCBI greater than 150 and a cow GCBI greater than 80. The IDs and GCBI values of the selected and retained individuals are shown in Table 2.
(21) TABLE 2 Number and GCBI values of selected and retained bulls and cows

| Bull ID Number | Bull GCBI value | Cow ID Number | Cow GCBI value |
| --- | --- | --- | --- |
| 15217181 | 302.06 | 1429 | 200.77 |
| 15419623 | 254.81 | 1779 | 197.75 |
| 15421638 | 245.75 | 1492 | 169.50 |
| 15420645 | 214.85 | 2078 | 165.65 |
| 15420613 | 210.20 | 1230 | 164.22 |
| 15420616 | 197.65 | 1497 | 158.12 |
| 15420635 | 196.29 | 20120701 | 157.94 |
| 15419619 | 192.61 | 1236 | 155.92 |
| 15219124 | 190.54 | 1071 | 153.15 |
| 15420637 | 169.48 | 1753 | 152.54 |
| 15219174 | 155.06 | A051 | 150.73 |
| 15420611 | 154.98 | 2843 | 150.28 |
| 15217191 | 152.68 | . . . | . . . |
(22) V. A total of 13 breeding bulls and 200 cows were obtained for genomic mating. Genotype data of the selected and retained bulls and cows were extracted using PLINK 1.0, with the running commands: plink --cow --file filename --keep dam.txt --recode A --out hx_dam, and plink --cow --file filename --keep sire.txt --recode A --out hx_sire, in which dam.txt and sire.txt were the ID lists of the selected and retained cows and bulls, respectively.
(23) VI. The genotypes and corresponding GCBI values of the bulls and cows selected and retained in step IV were input into the TrainSel software package, and the optimized genetic algorithm was utilized for mating. The parameter settings of the genetic algorithm were: npop=200, nelite=10, mutprob=0.01, niterations=800, niterSANN=200, stepSANN=0.01, minitbefstop=200, tolconv=1e-7, nislands=1, mc.cores=1.
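The general idea of searching for a mating list with a genetic algorithm can be sketched in plain Python. This sketch does not call TrainSel and is not the optimized model of the present disclosure. It rests on several labeled assumptions: the expected offspring index is taken as the mid-parent GCBI, the expected offspring inbreeding as half the genomic relationship between sire and dam, and the two objectives are folded into one weighted score through a hypothetical weight lam. The parameter names npop, nelite, mutprob and niterations only echo the settings listed above, and the relationship values in the example are invented.

```python
import random


def fitness(assign, cow_gcbi, bull_gcbi, rel, lam):
    """Weighted score of a mating list: mean gain minus lam * mean inbreeding."""
    n = len(assign)
    gain = sum(0.5 * (cow_gcbi[c] + bull_gcbi[b])        # mid-parent index
               for c, b in enumerate(assign)) / n
    inbreeding = sum(0.5 * rel[c][b]                     # half the relationship
                     for c, b in enumerate(assign)) / n
    return gain - lam * inbreeding


def genetic_mating(cow_gcbi, bull_gcbi, rel, lam=1000.0, npop=50, nelite=5,
                   mutprob=0.05, niterations=100, seed=0):
    """Return a list giving, for each cow, the index of its assigned bull."""
    rng = random.Random(seed)
    n_cows, n_bulls = len(cow_gcbi), len(bull_gcbi)

    def score(assign):
        return fitness(assign, cow_gcbi, bull_gcbi, rel, lam)

    # Random initial population of candidate mating lists.
    pop = [[rng.randrange(n_bulls) for _ in range(n_cows)]
           for _ in range(npop)]

    for _ in range(niterations):
        pop.sort(key=score, reverse=True)
        elite = pop[:nelite]                 # elitism: best lists survive
        children = []
        while len(children) < npop - nelite:
            p1, p2 = rng.sample(elite, 2)
            # Uniform crossover: each cow inherits its bull from either parent.
            child = [rng.choice(pair) for pair in zip(p1, p2)]
            # Mutation: occasionally reassign a cow to a random bull.
            child = [rng.randrange(n_bulls) if rng.random() < mutprob else b
                     for b in child]
            children.append(child)
        pop = elite + children

    return max(pop, key=score)


# Three cows and two bulls with GCBI values taken from Table 2; the
# cow-by-bull relationship matrix is hypothetical.
cows = [200.77, 197.75, 169.50]
bulls = [302.06, 254.81]
relationship = [[0.5, 0.0],   # cow 0 is closely related to bull 0
                [0.0, 0.0],
                [0.0, 0.0]]
best = genetic_mating(cows, bulls, relationship)
```

In this toy case the highest-index bull is avoided for the cow that is closely related to him, which is the trade-off the method aims for. A practical version would likely also need a limit on the number of matings per bull, which this sketch leaves out.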
(24) VII. The scheme maximizing the expected GCBI value of the offspring was selected from the output of the genetic algorithm, and the mating list of the to-be-hybridized cows with the optimized candidate bulls was obtained (see Table 3).
(25) TABLE 3 Mating list of to-be-hybridized cows with the optimized candidate bulls

| Candidate cow number | Candidate cow GCBI value | Candidate bull number | Candidate bull GCBI value |
| --- | --- | --- | --- |
| 1429 | 200.77 | 15217191 | 152.68 |
| 1779 | 197.75 | 15419623 | 254.81 |
| 1492 | 169.50 | 15219124 | 190.54 |
| 2078 | 165.65 | 15217181 | 302.06 |
| 1230 | 164.22 | 15219174 | 155.06 |
| 1497 | 158.12 | 15419619 | 192.61 |
| 20120701 | 157.94 | 15420645 | 214.85 |
| 1236 | 155.92 | 15420611 | 154.98 |
| 1071 | 153.15 | 15219124 | 190.54 |
| 1753 | 152.54 | 15420611 | 154.98 |
| A051 | 150.73 | 15420613 | 210.20 |
| 2843 | 150.28 | 15217181 | 302.06 |
| A120 | 150.11 | 15217181 | 302.06 |
| A047 | 148.69 | 15420611 | 154.98 |
| 21010902 | 147.66 | 15219124 | 190.54 |
| 1469 | 147.47 | 15420616 | 197.65 |
| 1965 | 145.91 | 15420635 | 196.29 |
| 1374 | 145.65 | 15421638 | 245.75 |
(26) VIII. Genomic mating effect: to further illustrate the superiority of the genomic mating method, in this data set, the GCBI and inbreeding level of the offspring of genomic mating, homogeneous mating and random mating were evaluated. The results show that the genomic mating scheme has obtained the maximum expected GCBI value of offspring, with the greatest genetic gain (
(27) TABLE 4 Comparison of average GCBI and inbreeding levels of offspring populations from different mating schemes

| Mating scheme | Average GCBI (± standard deviation) | Average inbreeding coefficient (± standard deviation) |
| --- | --- | --- |
| Genomic mating | 156.663 ± 0.100 | 0.030 ± 0.000 |
| Homogeneous mating | 146.038 ± 0.100 | 0.032 ± 0.000 |
| Random mating | 145.786 ± 0.090 | 0.031 ± 0.000 |
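The size of the differences in Table 4 is easier to judge as relative changes. The following Python sketch expresses the genomic mating results as percentage changes against the two comparison schemes, using only the averages reported in the table; the function name relative_change is hypothetical.

```python
# Average GCBI and average inbreeding coefficient per scheme, from Table 4.
SCHEMES = {
    "Genomic mating": (156.663, 0.030),
    "Homogeneous mating": (146.038, 0.032),
    "Random mating": (145.786, 0.031),
}


def relative_change(new, base):
    """Percentage change of `new` relative to `base`."""
    return 100.0 * (new - base) / base


genomic_gcbi, genomic_f = SCHEMES["Genomic mating"]
comparison = {}
for name, (gcbi, inbreeding) in SCHEMES.items():
    if name == "Genomic mating":
        continue
    comparison[name] = (
        relative_change(genomic_gcbi, gcbi),      # change in average GCBI, %
        relative_change(genomic_f, inbreeding),   # change in inbreeding, %
    )
```

On these averages, genomic mating raises the expected offspring GCBI by roughly 7% over either alternative while giving a somewhat lower average inbreeding coefficient.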
(28) The present disclosure can fill the gap in genomic mating of beef cattle in China, solve the problem of how to pair and combine animals for mating after genomic selection in the beef cattle breeding process, provide technical means for efficient, high-quality beef cattle breeding, accelerate the beef cattle breeding process, and promote the rapid development of the beef cattle industry, and it has great application value and promotion prospects.
(29) Those skilled in the technical field to which the present application belongs can make various modifications or supplements to the described specific embodiments or substitute them in a similar way, without departing from the inventive concept of the present application or exceeding the scope defined by the appended claims.