Diagnostic test for skeletal atavism in horses

Abstract

The present invention relates to methods for detecting a genetic deletion at the SHOX locus of a horse, where the presence of such a genetic deletion indicates that the horse is a carrier of a disease-causing mutation that can lead to skeletal atavism. The invention further provides nucleic acid primers and probes for use in methods for detecting the presence or absence of a disease-causing genetic deletion at the SHOX locus of a horse.

Claims

1. A method for breeding a horse, said method comprising the steps of (i) extracting DNA from a biological sample obtained from said horse, (ii) determining in said DNA two copies of SEQ ID NO: 18, and (iii) breeding said horse.

2. The method according to claim 1, comprising the amplification of a nucleic acid segment by means of the polymerase chain reaction (PCR).

3. The method according to claim 1, comprising hybridizing a primer or primer pair under stringent conditions to the sequence SEQ ID NO: 18.

Description

LEGEND TO FIGURES

(1) FIG. 1: Depth of sequence read coverage observed in whole genome resequencing. Shown is the depth at the SHOX locus for six Skeletal Atavism cases (CG1-6) and for the pool of healthy control stallions. Boxes have been inserted to visualize approximate locations of the two deletions in the EquCab2 assembly context.

EXAMPLES

(2) Methods

(3) Illumina Sequencing and Sequence Analysis

(4) DNA samples from six Shetland ponies diagnosed by veterinarians with Skeletal Atavism (SA), individuals CG1, CG2, CG3, CG4, CG5 and CG6, as well as a control pool consisting of equimolar amounts of DNA from 22 stallions (who had never fathered Atavistic foals despite having fathered many foals) were prepared for sequencing. Illumina paired-end libraries were generated from these DNA samples with mean insert sizes of approximately 220 bp. The two libraries were sequenced using an Illumina HiSeq instrument as paired-end reads (2×100 bp). The reads were mapped to the horse reference genome assembly [5] using the software BWA [6], and PCR-duplicates were removed using the software Picard (http://picard.sourceforge.net). The average read depth obtained was approximately 7× for each SA individual and approximately 55× average depth for the control pool. SNPs and small insertions/deletions were called from the mapping data after subjecting the alignments to realignment around indels and then variant calling using the Genome Analysis Toolkit (GATK) [8]. The variant calls were subjected to recommended VariantFiltrationWalker filters for SNPs listed in the GATK wiki page (http://www.broadinstitute.org/gsa/wiki/index.php/The_Genome_Analysis_Toolkit).

(5) The software SAMtools [7] was used to determine the sequence read depth in one-kilobase windows across the whole genome, and candidate deletions and duplications were called from these depths. Furthermore, the mapping distances and strand orientations of paired reads were used to detect structural variation relative to the reference assembly and relative to the control pool.
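The windowed depth comparison described above can be illustrated with a minimal Python sketch. It assumes that per-base depths have already been exported (for example from SAMtools) into plain lists. The function names and the ratio thresholds (0.25 and 1.75) are hypothetical and are not taken from the original analysis; with these thresholds only windows that are almost entirely missing in the case are flagged, so a heterozygous deletion (ratio near 0.5) would need a tighter cut-off and deeper coverage.

```python
from statistics import median

WINDOW = 1000  # window size in bp, matching the one-kilobase windows in the text


def window_means(depths, window=WINDOW):
    """Mean depth in consecutive, non-overlapping windows."""
    means = []
    for start in range(0, len(depths), window):
        chunk = depths[start:start + window]
        means.append(sum(chunk) / len(chunk))
    return means


def call_candidates(case, control, low=0.25, high=1.75):
    """Label each window from the case/control depth ratio.

    Each sample is first normalised by its own median window depth so that
    a 7x case and a 55x control pool become comparable. The thresholds are
    illustrative assumptions, not values from the source.
    """
    case_med, ctrl_med = median(case), median(control)
    calls = []
    for c, k in zip(case, control):
        if k == 0 or case_med == 0:
            calls.append("no_data")  # nothing to compare against
            continue
        ratio = (c / case_med) / (k / ctrl_med)
        if ratio < low:
            calls.append("deletion")
        elif ratio > high:
            calls.append("duplication")
        else:
            calls.append("normal")
    return calls


# Toy data: a 7x case with a 2 kb gap, and a 55x control pool.
case_depth = [7] * 3000 + [0] * 2000 + [7] * 3000
pool_depth = [55] * 8000
calls = call_candidates(window_means(case_depth), window_means(pool_depth))
```

Normalising by the median rather than the mean keeps a large deletion from shifting the baseline of the affected sample.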

(6) Digital Droplet PCR (ddPCR)

(7) The ddPCR reaction mixtures consisted of 11 μl 2×ddPCR Supermix for probes (Bio-Rad), 1.1 μl of the primer/probe mix for one of the deletion assays, 1.1 μl of the RNase P reference gene primer/probe mix (900 nM final concentration of each primer, 250 nM of probe) and 1 μl of sample DNA (concentration 20 ng/μl) in a final volume of 22 μl (see Table 1 for primer and probe sequences). 20 μl of reaction mixture was loaded into a disposable plastic cartridge (DG8, Bio-Rad) together with 70 μl of droplet generation oil (DG oil, Bio-Rad) and placed in the QX100 droplet generator (Bio-Rad). The droplets generated from each sample were then transferred to a 96-well Twin Tec semi-skirted PCR plate (Eppendorf, Germany), which was heat-sealed with Easy Pierce Foil (Thermo).

(8) TABLE 1. Sequences of primers and probes used in digital droplet PCR to genotype deletions at the equine SHOX locus

Name             Target      Type    Sequence*             SEQ ID NO  5′-modification  3′-modification
EqD1_F           Deletion 1  primer  TCCCCGRGTGTGGAAAGTTA  1          None             None
EqD1_R           Deletion 1  primer  CCACAAAGCACATCCGTTTA  2          None             None
EqD1_probe       Deletion 1  probe   ACGGGAAGGAGGGGGCCC    3          FAM              MGB
EqD2_F           Deletion 2  primer  CCMGCTTTTGTCCCTTAAAC  4          None             None
EqD2_R           Deletion 2  primer  TCCAGGCGATTTCCAACTAA  5          None             None
EqD2_probe       Deletion 2  probe   CCAGCTCTGGGCTCGGCTCC  6          FAM              MGB
Eq_RNAseP_F      RNAse P     primer  GTTCCAAGCTCCGGCTAAG   7          None             None
Eq_RNAseP_R      RNAse P     primer  GGAGGTGGGTTCCCAGAG    8          None             None
Eq_RNAseP_probe  RNAse P     probe   TCTGCCCTCGCGCGGAGC    9          VIC              MGB

* Non-standard bases correspond to nucleic acid ambiguity codes and indicate positions where mixed bases have been incorporated in primers.
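As an illustration of the ambiguity codes mentioned in the footnote to Table 1, the following Python sketch expands a degenerate primer into the concrete sequences it represents, using the standard IUPAC nucleotide codes (for example R = A or G, M = A or C). The function name `expand` is hypothetical and the sketch is not part of the described method.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand(primer):
    """Return every concrete sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


# EqD1_F from Table 1 contains one mixed position (R).
variants = expand("TCCCCGRGTGTGGAAAGTTA")
# variants == ["TCCCCGAGTGTGGAAAGTTA", "TCCCCGGGTGTGGAAAGTTA"]
```

EqD1_F and EqD2_F each contain a single mixed position, so each corresponds to a mixture of two oligonucleotides.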

(9) PCR amplification was carried out on a T1000 Touch thermal cycler (Bio-Rad) using a thermal profile beginning at 95° C. for 10 min, followed by 40 cycles of 94° C. for 30 s and 60° C. for 60 s, 1 cycle of 98° C. for 10 min, and ending at 4° C. After amplification, the plate was loaded on the droplet reader (Bio-Rad) and the droplets from each well of the plate were read automatically. ddPCR data were analyzed using the QuantaSoft analysis software (Bio-Rad).
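For orientation, the copy-number arithmetic that underlies a duplexed ddPCR assay can be sketched in Python as follows. This is the generic Poisson model for droplet counts, not the QuantaSoft implementation. It assumes that the RNase P reference is present in two copies per diploid genome, and the function names and droplet counts are hypothetical.

```python
import math


def copies_per_droplet(positive, total):
    """Poisson-corrected mean number of template copies per droplet."""
    if total <= 0 or not 0 <= positive < total:
        raise ValueError("need 0 <= positive < total droplets")
    negative_fraction = (total - positive) / total
    return -math.log(negative_fraction)


def copy_number(target_positive, reference_positive, total, reference_copies=2):
    """Estimated target copies per genome relative to the reference assay.

    reference_copies=2 is an assumption: the reference gene is taken to be
    present in two copies per diploid genome.
    """
    target = copies_per_droplet(target_positive, total)
    reference = copies_per_droplet(reference_positive, total)
    return reference_copies * target / reference


# Hypothetical droplet counts for one well (FAM = deletion assay, VIC = RNase P).
estimate = copy_number(target_positive=1056, reference_positive=2000, total=10000)
```

An estimate close to 2 indicates that both copies of the assayed region are present, close to 1 that one copy is deleted, and close to 0 that both are deleted.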

(10) Identification of the Mutations Causing Skeletal Atavism

(11) To identify the mutation(s) causing skeletal atavism in Shetland ponies, whole genome resequencing of affected horses and controls was performed. DNA samples from six individual Swedish Shetland ponies diagnosed with skeletal atavism and a pool of 22 unaffected control stallions were sequenced using Illumina HiSeq technology, with resulting average sequence depths of 7× (each affected individual) and 55× (control pool). After the reads had been aligned to the reference genome assembly EquCab2 [5] (http://www.ncbi.nlm.nih.gov/assembly/GCA_000002305.1/) using the software BWA [6], SAMtools [7] was used to determine genome-wide depth of coverage for each sequenced sample, and GATK [8] was used to call polymorphisms, to determine genotypes in the samples and to estimate the allele frequencies in the pool.

(12) Next, single nucleotide polymorphisms (SNPs) were screened for sites at which every affected horse was homozygous for the variant allele and the reference allele frequency was 100% in the control pool. In total, 25 SNPs fulfilled this criterion, and 17 of these were located on a very fragmented unplaced scaffold that contains the short stature homeobox (SHOX) gene.
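The screening criterion above can be expressed as a short Python sketch. It assumes that genotype calls and the pool allele frequency have already been extracted from the variant calls into simple records. The `Site` record, the `1/1` genotype encoding and the scaffold names in the toy data are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Tuple

HOM_VARIANT = "1/1"  # assumed encoding for a homozygous variant genotype


@dataclass(frozen=True)
class Site:
    scaffold: str
    position: int
    case_genotypes: Tuple[str, ...]  # one genotype call per affected horse
    pool_ref_freq: float             # reference allele frequency in the control pool


def is_candidate(site):
    """True if all cases are homozygous variant and the pool is all reference."""
    return (
        len(site.case_genotypes) > 0
        and all(g == HOM_VARIANT for g in site.case_genotypes)
        and site.pool_ref_freq >= 1.0
    )


def candidates_per_scaffold(sites):
    """Count candidate SNPs per scaffold to reveal clustering."""
    return Counter(s.scaffold for s in sites if is_candidate(s))


# Toy records, invented for illustration only.
sites = [
    Site("scaffold_A", 100, ("1/1",) * 6, 1.0),
    Site("scaffold_A", 250, ("1/1",) * 6, 1.0),
    Site("scaffold_B", 900, ("1/1",) * 5 + ("0/1",), 1.0),
    Site("scaffold_C", 40, ("1/1",) * 6, 0.93),
]
counts = candidates_per_scaffold(sites)
```

Counting candidates per scaffold is what makes a cluster of candidate SNPs on a single scaffold visible.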

(13) An analysis of the depth of sequence read coverage revealed two separate, partially overlapping deletions estimated to be at least 116 kb and 34 kb, respectively (FIG. 1). The deletions involve the genome assembly contigs listed in Table 2.

(14) TABLE 2. Horse sequence contigs predicted to be part of the two identified deletions, based on depth of sequence coverage from the six cases and the control pool

Scaffold accession^  Contig accession^  Predicted to be part of deletion(s)*  SEQ ID NO
NW_001867655.1       AAWR02042945.1     D1 AND D2                             10
NW_001867655.1       AAWR02042946.1     D1 AND D2                             11
NW_001867655.1       AAWR02042947.1     D1 AND D2                             12
NW_001867655.1       AAWR02042948.1     D1 AND D2                             13
NW_001867655.1       AAWR02042949.1     D1 AND D2                             14
NW_001867655.1       AAWR02042950.1     D1 AND D2                             15
NW_001867655.1       AAWR02042951.1     D1 AND D2                             16
NW_001867655.1       AAWR02042952.1     D1 AND D2                             17
NW_001867655.1       AAWR02042953.1     D1 AND D2                             18
NW_001867655.1       AAWR02042954.1     D1 AND D2                             19
NW_001867655.1       AAWR02042955.1     D1 AND D2                             20
NW_001867655.1       AAWR02042956.1     D1 AND D2                             21
NW_001867655.1       AAWR02042957.1     D1 AND D2                             22
NW_001867655.1       AAWR02042958.1     D1 AND D2                             23
NW_001867655.1       AAWR02042959.1     D1 AND D2                             24
NW_001867655.1       AAWR02042960.1     D1 AND D2                             25
NW_001867655.1       AAWR02042961.1     D1 AND D2                             26
NW_001867655.1       AAWR02042962.1     D1 AND D2                             27
NW_001867655.1       AAWR02042963.1     D1 AND D2                             28
NW_001867655.1       AAWR02042964.1     D1 AND D2                             29
NW_001867655.1       AAWR02042965.1     D1 AND D2                             30
NW_001867655.1       AAWR02042966.1     D1 AND D2                             31
NW_001867655.1       AAWR02042967.1     D1 AND D2                             32
NW_001867655.1       AAWR02042968.1     D1 AND D2                             33
NW_001867655.1       AAWR02042969.1     D1 AND D2                             34
NW_001867655.1       AAWR02042970.1     D1 AND D2                             35
NW_001867655.1       AAWR02042971.1     D1 AND D2                             36
NW_001867655.1       AAWR02042972.1     D1 AND D2                             37
NW_001867655.1       AAWR02042973.1     D1 AND D2                             38
NW_001867655.1       AAWR02042974.1     D1 AND D2                             39
NW_001867655.1       AAWR02042975.1     D1 AND D2                             40
NW_001867655.1       AAWR02042976.1     D1 AND D2                             41
NW_001867655.1       AAWR02042977.1     D1 AND D2                             42
NW_001867655.1       AAWR02042978.1     D1 AND D2                             43
NW_001867655.1       AAWR02042979.1     D1 AND D2                             44
NW_001867655.1       AAWR02042980.1     D1 AND D2                             45
NW_001867655.1       AAWR02042981.1     D1 AND D2                             46
NW_001867655.1       AAWR02042982.1     D1 AND D2                             47
NW_001867655.1       AAWR02042983.1     D1 AND D2                             48
NW_001867655.1       AAWR02042984.1     D1 AND D2                             49
NW_001867655.1       AAWR02042985.1     D1 AND D2                             50
NW_001867655.1       AAWR02042986.1     D1 AND D2                             51
NW_001867655.1       AAWR02042987.1     D1 AND D2                             52
NW_001867655.1       AAWR02042988.1     D1 AND D2                             53
NW_001867655.1       AAWR02042989.1     D1 AND D2                             54
NW_001867655.1       AAWR02042990.1     D1 AND D2                             55
NW_001867655.1       AAWR02042991.1     D1 AND D2                             56
NW_001867655.1       AAWR02042992.1     D1 AND D2                             57
NW_001867655.1       AAWR02042993.1     D1 AND D2                             58
NW_001867809.1       AAWR02043090.1     D1                                    59
NW_001867809.1       AAWR02043091.1     D1                                    60
NW_001867809.1       AAWR02043092.1     D1                                    61
NW_001867809.1       AAWR02043093.1     D1                                    62
NW_001867809.1       AAWR02043094.1     D1                                    63
NW_001867809.1       AAWR02043095.1     D1                                    64
NW_001867809.1       AAWR02043096.1     D1                                    65
NW_001867809.1       AAWR02043097.1     D1                                    66
NW_001867809.1       AAWR02043098.1     D1                                    67
NW_001867809.1       AAWR02043099.1     D1                                    68
NW_001867809.1       AAWR02043100.1     D1                                    69
NW_001867809.1       AAWR02043101.1     D1                                    70
NW_001867809.1       AAWR02043102.1     D1                                    71
NW_001867809.1       AAWR02043103.1     D1                                    72
NW_001867809.1       AAWR02043104.1     D1                                    73
NW_001869437.1       AAWR02043982.1     D1                                    74
NW_001869437.1       AAWR02043983.1     D1                                    75
NW_001869437.1       AAWR02043984.1     D1                                    76
NW_001869437.1       AAWR02043985.1     D1                                    77
NW_001869437.1       AAWR02043986.1     D1                                    78
NW_001869437.1       AAWR02043987.1     D1                                    79
NW_001870009.1       AAWR02044192.1     D1                                    80
NW_001870009.1       AAWR02044193.1     D1                                    81
NW_001870009.1       AAWR02044194.1     D1                                    82
NW_001870009.1       AAWR02044195.1     D1                                    83
NW_001870009.1       AAWR02044196.1     D1                                    84
NW_001870009.1       AAWR02044197.1     D1                                    85
NW_001873507.1       AAWR02044981.1     D1                                    86
NW_001873507.1       AAWR02044982.1     D1                                    87
NW_001873507.1       AAWR02044983.1     D1                                    88
NW_001875146.1       AAWR02045249.1     D1                                    89
NW_001875146.1       AAWR02045250.1     D1                                    90
NW_001876884.1       AAWR02045517.1     D1 AND D2                             91
NW_001876884.1       AAWR02045518.1     D1 AND D2                             92
NW_001876884.1       AAWR02045519.1     D1 AND D2                             93
NW_001876884.1       AAWR02045520.1     D1 AND D2                             94
NW_001876884.1       AAWR02045521.1     D1 AND D2                             95
NW_001871185.1       AAWR02049699.1     D1                                    96
NW_001869338.1       AAWR02043946.1     D1 AND D2                             97
NW_001869338.1       AAWR02043947.1     D1 AND D2                             98
NW_001869338.1       AAWR02043948.1     D1 AND D2                             99
NW_001869338.1       AAWR02043949.1     D1 AND D2                             100
NW_001869338.1       AAWR02043950.1     D1 AND D2                             101
NW_001869338.1       AAWR02043951.1     D1 AND D2                             102
NW_001869338.1       AAWR02043952.1     D1 AND D2                             103
NW_001869338.1       AAWR02043953.1     D1 AND D2                             104
NW_001867532.1       AAWR02045716.1     D1 AND D2                             105
NW_001867532.1       AAWR02045717.1     D1 AND D2                             106
NW_001867532.1       AAWR02045718.1     D1 AND D2                             107
NW_001867532.1       AAWR02045719.1     D1 AND D2                             108
NW_001873348.1       AAWR02051849.1     D1 AND D2                             109
NW_001873348.1       AAWR02051850.1     D1 AND D2                             110

^ Scaffold and Contig accessions: Genbank accession numbers of the reference genome assembly contigs and Scaffolds.
* Deletion overlap: Presumed deletion(s) involving the contig.

(15) It was not possible to determine the exact size of the two deletions with confidence using this approach because this region is poorly assembled. The larger deletion (D1) spans the entire coding sequence of SHOX, while the other (D2) involves the region immediately downstream of the SHOX coding sequence (FIG. 1). SHOX has been mapped to the pseudo-autosomal region (PAR) of the X and Y chromosomes in other mammals, and it is very likely that it is located in the PAR in horses as well. In humans, mutation and haploinsufficiency of SHOX are associated with idiopathic growth retardation [9,10].

(16) Sequencing of Bacterial Artificial Chromosomes (BACs)

(17) In order to obtain additional sequence information, BACs were identified whose ends (BAC-ends) had previously been sequenced as part of the generation of the horse genome assembly (EquCab2) and that were predicted to reside in the pseudo-autosomal region close to the SHOX gene. 13 such BACs were found in the CHORI-241 BAC library (http://bacpac.chori.org/library.php?id=41), which was made from a Thoroughbred male horse (not carrying the deletions Del1 or Del2) and is available from the BACPAC resource at the Children's Hospital Oakland Research Institute (http://bacpac.chori.or&equine241.htm).

(18) TABLE 4. Sequenced BACs from SHOX region

BAC              Size (bp)  #scaffolds  Min (bp)  Max (bp)
CH241-087.2_E10  154 201    1           154 201   154 201
CH241-121_P22    218 546    1           218 546   218 546
CH241-231_N3     191 296    1           191 296   191 296
CH241-219B18      67 454    2               899    66 555
CH241-52P20       66 939    2               852    66 087
CH241-288L23     186 195    7               886    57 130
CH241-159K1       47 668    4             1 660    27 250
CH241-050_P17    147 467    1           147 467   147 467
CH241-194_E12    155 628    1           155 628   155 628
CH241-291B18     107 104    3            23 331    45 533
CH241-419P11      73 186    1            73 186    73 186
CH241-442L16      58 892    1            58 892    58 892
CH241-712C2      140 175    1           140 175   140 175
CH241-503B2       11 519    1            11 519    11 519

(19) DNA was prepared from the purchased BACs (Table 4) according to standard laboratory procedures and, for each BAC, purified BAC DNA was sequenced using the Pacific Biosciences DNA sequencing methodology, which is capable of generating long sequencing reads. The reads from each BAC were then assembled de novo into one or more contigs. The assembled contigs were subsequently used as templates for alignment of the short Illumina sequencing reads from Atavism individuals CG1, CG2, CG3, CG4, CG5 and CG6 as well as from the DNA pool comprising normal horses. This alignment information was used to determine sequencing read depth in windows and thereby to identify BAC contigs, or parts thereof, where the depth of coverage was consistent with the genotypes of the Atavistic horses. That is, CG1, CG5 and CG6 are of genotype Del1/Del1 and will therefore entirely lack high-confidence read alignments for BAC contig parts corresponding to Deletion 1. Individuals CG2, CG3 and CG4 (genotype Del1/Del2) will have approximately half the depth of coverage of the pool of DNA from normal horses in the BAC contig parts unique to Deletion 1 and will entirely lack coverage in the parts shared between Deletion 1 and Deletion 2.
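The depth expectations stated above can be written out as a small Python sketch. It assumes that depth in each BAC contig part has been normalised so that the pool of normal horses equals 1.0, and it considers only the two kinds of region named in the text (parts unique to Deletion 1 and parts shared between Deletion 1 and Deletion 2). The region labels, function names and the least-squares matching are hypothetical choices made for illustration.

```python
# Alleles that remove each kind of region, following the text:
# Del1 removes both regions, Del2 removes only the shared part.
DELETED_BY = {
    "d1_unique": {"Del1"},
    "shared": {"Del1", "Del2"},
}

GENOTYPES = ("WT/WT", "WT/Del1", "WT/Del2", "Del1/Del1", "Del1/Del2", "Del2/Del2")


def expected_relative_depth(genotype, region):
    """Expected depth relative to a two-copy control (1.0 = two intact copies)."""
    alleles = genotype.split("/")
    intact = sum(allele not in DELETED_BY[region] for allele in alleles)
    return intact / 2


def best_genotype(observed):
    """Pick the genotype whose expected depths are closest (least squares).

    `observed` maps each region label to the measured relative depth.
    """
    def error(genotype):
        return sum(
            (observed[region] - expected_relative_depth(genotype, region)) ** 2
            for region in DELETED_BY
        )
    return min(GENOTYPES, key=error)


# Hypothetical measurement: half depth in the Del1-unique part, none in the shared part.
call = best_genotype({"d1_unique": 0.45, "shared": 0.02})
# call == "Del1/Del2"
```

Under these assumptions the six genotypes give six distinct pairs of expected depths, which is why the two region types together are sufficient to tell them apart in principle.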

(20) TABLE 5. BAC sequences identified to contain Del1 and/or Del2 sequences

BAC: 194E12
Scaffold: 194E.scf012
Size (bp): 155 628
Comment: This scaffold comprises a breakpoint and part of Del1
Breakpoints between: Normal/Del1 breakpoint found between 133300-133500 bp
Del1 parts: bp 133300-end (SEQ ID NO: 111)
Del2 parts: No Del2 part

BAC: 50P17
Scaffold: 50P17.scf06
Size (bp): 147 467
Comment: This scaffold comprises a breakpoint and part of Del1
Breakpoints between: Normal/Del1 breakpoint found between 82200-82300 bp ?
Del1 parts: bp 82200-end (SEQ ID NO: 112)
Del2 parts: No Del2 part

BAC: 291B18
Scaffold: 291B18.scf718013
Size (bp): 37 613
Comment: This scaffold comprises a part of Del1
Breakpoints between: (none listed)
Del1 parts: 1-37613 bp (SEQ ID NO: 113)
Del2 parts: No Del2 part

BAC: 291B18
Scaffold: 291B18.scf014
Size (bp): 44 786
Comment: This scaffold comprises a part of Del1, a breakpoint, a part of Del2, and another breakpoint
Breakpoints between: Del1/Del2 breakpoint found between 5100-5500 bp (5100-5400 bp rich in repeats*); Del2/normal breakpoint found between 25500-25700 bp
Del1 parts: 1-5100 bp (SEQ ID NO: 114); 5100-25700 bp (SEQ ID NO: 115)
Del2 parts: 5100-25700 bp (SEQ ID NO: 115)

BAC: 52P20
Scaffold: 1698_contig
Size (bp): 66 087
Comment: This scaffold comprises a part of Del2 and a breakpoint. From 53700 bp contaminated with vector.
Breakpoints between: Normal/Del2 breakpoint found between 37800-38000 bp
Del1 parts: 37800-53700 bp (SEQ ID NO: 116)
Del2 parts: 37800-53700 bp (SEQ ID NO: 116)

BAC: 712C2
Scaffold: 712C2.scf702
Size (bp): 140 175
Comment: This scaffold comprises a part of Del2 and a breakpoint.
Breakpoints between: Del2/Normal breakpoint somewhere between 59400-59600 bp
Del1 parts: 1-59600 (SEQ ID NO: 117)
Del2 parts: 1-59600 bp (SEQ ID NO: 117)

* The Del1/Del2 breakpoint was found in a region rich in repeats, which made it impossible to define the exact position of the breakpoint.

Genotyping Using ddPCR

(21) Among the six affected horses, three were homozygous D1/D1 and three were D1/D2 compound heterozygotes. We used digital droplet PCR (ddPCR; Bio-Rad) to genotype 39 Swedish Shetland ponies (18 known carriers, 6 affected horses and 15 unaffected horses) for the two deletions (D1 and D2). The six affected horses were the same as those used for sequencing, and we confirmed that three of them were homozygous D1/D1 and three were heterozygous D1/D2 (Table 3).

(22) TABLE 3. Results of digital PCR analysis of the SHOX locus in horses with or without skeletal atavism. Three alleles occur at this locus: WT = wild type, D1 = Deletion 1, D2 = Deletion 2

                                  Genotype
Horse (a)   Failed genotyping  WT/WT  WT/D1  WT/D2  D1/D1  D1/D2  D2/D2  Total
Affected    0                  0      0      0      3      3      0      6
Carrier     4                  2      8      3      0      0      1      18
Unaffected  1                  12     2      0      0      0      0      15

(a) Affected horses show skeletal atavism, Carriers are heterozygous for a disease causing mutation while Unaffected may be homozygous wild type or heterozygous for a disease causing mutation.

(23) It was possible to trace the inheritance of these alleles from known carriers to affected offspring. All but two known carriers for which genotypes could be determined were heterozygous for one of the deletions (WT/D1 or WT/D2). All unaffected horses carried at least one copy of the WT allele. Thus, it was concluded that skeletal atavism is caused by two different deletion alleles associated with the SHOX locus. Affected horses may be homozygous D1/D1, heterozygous D1/D2 or possibly homozygous D2/D2. We have observed one carrier with genotype D2/D2, and this individual is not reported as affected, suggesting that the D2/D2 genotype may not be associated with skeletal atavism, at least not in all individuals with this genotype.

CONCLUSION

(24) In conclusion, two deletions in the SHOX gene causing skeletal atavism in horses have been identified. Methods for detecting the presence of these deletions can now be used to identify unaffected carriers of these mutations, and this information can be used to avoid matings that risk producing affected offspring. In matings between two carriers, 25% of the progeny are expected to show skeletal atavism. The deletions can be detected using digital PCR or quantitative PCR.
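The 25% expectation for carrier × carrier matings follows from simple Mendelian segregation and can be checked with a short Python sketch. It assumes that each parent transmits either allele with equal probability and treats only D1/D1 and D1/D2 as affected, because the text reports that D2/D2 may not always be associated with skeletal atavism. The names `AFFECTED`, `offspring` and `risk` are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Genotypes treated as affected in this sketch. D2/D2 is deliberately left out
# because the text describes its effect as uncertain.
AFFECTED = {("D1", "D1"), ("D1", "D2")}


def offspring(sire, dam):
    """Genotype probabilities for a mating, assuming equal allele transmission."""
    counts = Counter(tuple(sorted(pair)) for pair in product(sire, dam))
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


def risk(sire, dam):
    """Probability that a foal has one of the genotypes listed in AFFECTED."""
    return sum(
        (p for genotype, p in offspring(sire, dam).items() if genotype in AFFECTED),
        Fraction(0),
    )


carrier_d1 = ("WT", "D1")
carrier_d2 = ("WT", "D2")
same_deletion = risk(carrier_d1, carrier_d1)   # Fraction(1, 4)
mixed_deletion = risk(carrier_d1, carrier_d2)  # Fraction(1, 4)
```

Mating a carrier to a WT/WT horse gives a risk of zero in this model, which is the practical use of carrier testing described above.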

REFERENCES

(25)
1. J G Speed: A cause of malformation of the limbs of Shetland ponies with a note on its phylogenic significance. The British Veterinary Journal 1958:18-22.
2. W A Hermans: A hereditary anomaly in Shetland ponies. Neth J vet Sci. 1970, 3(1):55-63.
3. Shamis L D, Auer J: Complete ulnas and fibulas in a pony foal. J Am Vet Med Assoc 1985, 186:802-804.
4. Tyson R, Graham J P, Colahan P T, Berry C R: Skeletal atavism in a miniature horse. Vet Radiol Ultrasound 2004, 45:315-317.
5. Wade C M, Giulotto E, Sigurdsson S, Zoli M, Gnerre S, Imsland F, Lear T L, Adelson D L, Bailey E, Bellone R R, et al.: Genome sequence, comparative analysis, and population genetics of the domestic horse. Science 2009, 326:865-867.
6. Li H, Durbin R: Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 2009, 25:1754-1760.
7. Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R, Subgroup GPDP: The Sequence Alignment/Map format and SAMtools. Bioinformatics 2009, 25:2078-2079.
8. McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A, Garimella K, Altshuler D, Gabriel S, Daly M, et al.: The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res 2010, 20:1297-1303.
9. Blaschke R J, Rappold G: The pseudoautosomal regions, SHOX and disease. Curr Opin Genet Dev 2006, 16:233-239.
10. Rosilio M, Huber-Lequesne C, Sapin H, Carel J C, Blum W F, Cormier-Daire V: Genotypes and phenotypes of children with SHOX deficiency in France. J Clin Endocrinol Metab 2012, 97:E1257-1265.