Gene And Use Thereof

20170218461 · 2017-08-03

Abstract

The present invention relates to the field of biotechnology, in particular to genes and uses thereof. The present invention employs whole-genome re-sequencing of a large number of individuals of the honey bee Apis mellifera sinisxinyuan and obtains genes specific to A. m. sinisxinyuan. These genes play important roles in the differentiation of A. m. sinisxinyuan from honey bees of other regions and in its adaptive evolution to the local environment. The Foxo gene or the Ebony gene provided in the present invention can be used to distinguish A. m. sinisxinyuan from other subspecies, to study the genetic diversity of bee species resources, and to study cold resistance genes. This will fill the gap in research on A. m. sinisxinyuan by Chinese researchers.

Claims

1. A polynucleotide having: (I) the nucleotide sequence set forth in SEQ ID No. 1 or SEQ ID No. 2; or (II) a sequence complementary to the nucleotide sequence set forth in SEQ ID No. 1 or SEQ ID No. 2; or (III) a sequence which encodes the same protein as the nucleotide sequence of (I) or (II) but differs from the nucleotide sequence of (I) or (II) due to genetic codon degeneracy; or (IV) a nucleotide sequence obtained from the nucleotide sequence set forth in SEQ ID No. 1 or SEQ ID No. 2 by substitution, deletion or addition of a sequence of one or more nucleotides, and having the same or similar function as that of the nucleotide sequence set forth in SEQ ID No. 1 or SEQ ID No. 2.

2. The polynucleotide according to claim 1, wherein the sequence of more nucleotides has 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or 13 nucleotides.

3. A recombinant DNA comprising the polynucleotide according to claim 1.

4. A method for use in molecular marker-assisted breeding of Apis mellifera by using the polynucleotide according to claim 1.

5. A method for use in cold resistance by using the polynucleotide according to claim 1.

6. A method for identifying A. m. sinisxinyuan, comprising: step 1: obtaining the DNA of a species to be tested; step 2: by means of gene alignment, if the polynucleotide according to claim 1 is present, the species to be tested is A. m. sinisxinyuan; while if the polynucleotide according to claim 1 is absent, the species to be tested is not A. m. sinisxinyuan.

Description

DESCRIPTION OF DRAWINGS

[0026] The patent or application file contains at least one drawing executed in color. Copies of this patent or patent application publication with color drawing(s) will be provided by the office upon request and payment of the necessary fee.

[0027] In order to illustrate the examples of the present invention or the technical solutions in the prior art more clearly, the drawings which are required for use in the examples or the prior art descriptions will be briefly described below.

[0028] FIG. 1 shows the gene trees of the 2 genes in A. m. sinisxinyuan and other representative populations (European dark bee and African honey bee).

[0029] FIG. 2 shows the genomic DNA extraction results; large fragments of high-quality DNA were obtained from all the samples, and no significant degradation was observed in any of the samples; the “Standard” in the graph is the standard sample, loaded at 5 μl (10 ng/μl); M-1 is the Trans 2k plus DNA molecular weight standard, loaded at 2 μl; M-2 is the Trans 15k plus DNA molecular weight standard, loaded at 2 μl; the rest are the sample DNAs;

[0030] FIG. 3(A) shows that when several statistics, such as F_ST, Tajima's D and θ_π, were used to scan the two genes, clear signals of selection were detected in both, indicating that these genes were subjected to specific natural selection in A. m. sinisxinyuan;

[0031] FIG. 3(B) shows that the two genes have distinctive DNA sequences; the SNP sites in the gene regions show significant genotype differences as compared to the other subspecies.

DETAILED DESCRIPTION OF EMBODIMENTS

[0032] The present invention discloses a gene of A. m. sinisxinyuan and the use thereof. Those skilled in the art can refer to the content herein and adjust the process parameters appropriately to implement it. It should be particularly noted that all similar substitutions and alterations will be apparent to those skilled in the art and are deemed to be included in the present invention. The method and use of the present invention have been described by way of preferred embodiments, and it will be apparent to those concerned that the method and use described herein may be altered or appropriately modified and combined to implement and apply the technology of the present invention without departing from the content, spirit and scope of the present invention.

[0033] The method for obtaining and detecting the gene provided in the present invention is as follows:

[0034] 1. Sample collection. Collecting living honey bee samples and immediately placing them in 90% ethanol for storage. In further specific embodiments of the present invention, ethanol of higher purity, liquid nitrogen, dry ice or other low-temperature preservation methods can also be used for sample storage instead of 90% ethanol.

[0035] 2. Extracting high-quality DNA (deoxyribonucleic acid) from the samples.

[0036] 3. Subjecting the DNA samples to Illumina high-throughput sequencing to obtain the raw DNA sequence data. Any high-throughput sequencing platform can be used; in addition to the Illumina platform described above, the SOLiD platform, 454 sequencing and other second- or third-generation sequencing platforms can also be used.

[0037] 4. Filtering out the low-quality sequences. The rules for filtering include: 1) the number of terminal “N” bases should be less than or equal to 10% of the sequence length; 2) the number of bases with a sequencing quality lower than 5 should be no more than 50% of the sequence length.

[0038] 5. Performing sequence alignment using the BWA software, taking the western honey bee (Apis mellifera) genome in the NCBI public database as the reference genome (apiMel4.5), wherein the alignment parameters are “-t -k 32 -M -R”. The latest version of the reference genome is apiMel4.5; updated versions of the reference genome can be employed after their publication.

[0039] 6. Obtaining the SNP genotypes of a population using the mpileup program of SAMtools, and filtering the obtained genotypes to obtain the final results. The rules for filtering are: 1) the quality value should be no less than 20; 2) SNPs within 5 bases of a sequence gap should be filtered out; 3) the sequencing depth should be greater than or equal to 4 and less than or equal to 1000; 4) SNP sites with 3 or more genotypes are removed. SNP genotypes can be obtained with any program capable of variant detection, including the above-described SAMtools as well as GATK, CLC and other programs.
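The four filtering rules of step 6 can be sketched in Python as shown below. This is a minimal illustration under stated assumptions, not the pipeline actually used: the `Snp` record, its field names and the `passes_filters` helper are hypothetical, and the distance from each SNP to the nearest sequence gap is assumed to have been computed beforehand.

```python
from dataclasses import dataclass

# Thresholds taken from the filtering rules in step 6.
MIN_QUALITY = 20            # rule 1: quality value no less than 20
GAP_WINDOW = 5              # rule 2: drop SNPs within 5 bases of a gap
MIN_DEPTH, MAX_DEPTH = 4, 1000   # rule 3: depth between 4 and 1000 inclusive
MAX_GENOTYPES = 2           # rule 4: drop sites with 3 or more genotypes


@dataclass
class Snp:
    """Hypothetical per-site record; the field names are illustrative only."""
    quality: float
    depth: int
    gap_distance: int   # bases between the SNP and the nearest sequence gap
    n_genotypes: int    # number of distinct genotypes observed at the site


def passes_filters(snp: Snp) -> bool:
    """Return True if the SNP satisfies all four rules."""
    return (
        snp.quality >= MIN_QUALITY
        and snp.gap_distance > GAP_WINDOW
        and MIN_DEPTH <= snp.depth <= MAX_DEPTH
        and snp.n_genotypes <= MAX_GENOTYPES
    )


if __name__ == "__main__":
    candidate = Snp(quality=35.0, depth=8, gap_distance=12, n_genotypes=2)
    print(passes_filters(candidate))
```

Keeping the thresholds as named constants makes it straightforward to adjust them if a different variant caller or a different sequencing depth is used.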

[0040] All of the raw materials and reagents used for the gene of A. m. sinisxinyuan and the use thereof provided in the present invention are commercially available.

[0041] The present invention is further illustrated in combination with the following examples:

EXAMPLE 1

[0042] (1) Honey bee samples were collected and DNA was extracted; samples with a DNA OD value of 1.8-2.0 and a DNA content over 1.5 μg were considered qualified.

[0043] (2) Library Construction with the Qualified DNA Samples:

[0044] The DNA samples that passed inspection were randomly sheared into fragments with a length of 350 bp using a Covaris instrument. The TruSeq Library Construction Kit was employed to construct the library, strictly using the reagents and consumables recommended in the manual. The DNA fragments were subjected to end repair, tail addition, sequencing adaptor addition, purification, PCR amplification and other steps to complete the preparation of the whole library. The constructed library was sequenced on Illumina HiSeq.

[0045] (3) Library Inspection:

[0046] After the library was constructed, Qubit 2.0 was first used for preliminary quantification and the library was diluted to 1 ng/μl; subsequently, the insert size of the library was checked with an Agilent 2100. Once the insert size met expectations, the effective concentration of the library was accurately quantified by the Q-PCR method (effective concentration of the library >2 nM) to ensure the quality of the library.

[0047] (4) Sequencing on Machine:

[0048] Once the library passed inspection, Illumina HiSeq sequencing was conducted according to the effective concentration of the library and the required data output.

[0049] (5) Quality Control:

[0050] The sequenced reads, or raw reads, obtained by sequencing contain low-quality reads and reads with adaptors. In order to ensure the quality of the information analysis, the raw reads must be filtered to obtain clean reads, and all of the following analyses were based on the clean reads. The data processing steps are as follows:

[0051] a. removing paired-end reads with adaptors;

[0052] b. removing paired-end reads when the content of N in a single-end sequencing read exceeds 10% of the read length;

[0053] c. removing paired-end reads when the number of low-quality (Q<=5) bases in a single-end sequencing read exceeds 50% of the read length.
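The filtering steps a to c can be sketched in Python as follows. This is an illustrative sketch only, not the software used in this example: the function names and the `(sequence, qualities, has_adaptor)` tuple layout are hypothetical, adaptor detection is assumed to have been done upstream, and treating an empty read as failing is an added assumption.

```python
from typing import List, Tuple

LOW_QUALITY_CUTOFF = 5   # step c: a base is low-quality when Q <= 5

# Hypothetical read representation: (sequence, per-base qualities, has_adaptor)
Read = Tuple[str, List[int], bool]


def read_fails(seq: str, quals: List[int], has_adaptor: bool) -> bool:
    """Return True if a single read triggers any of steps a to c."""
    if has_adaptor:                      # step a
        return True
    length = len(seq)
    if length == 0:                      # assumption: empty reads are discarded
        return True
    n_count = seq.upper().count("N")
    if n_count * 10 > length:            # step b: N content exceeds 10%
        return True
    low_quality = sum(1 for q in quals if q <= LOW_QUALITY_CUTOFF)
    return low_quality * 2 > length      # step c: low-quality bases exceed 50%


def keep_pair(read1: Read, read2: Read) -> bool:
    """A pair is kept only when neither of its two reads fails."""
    return not (read_fails(*read1) or read_fails(*read2))


if __name__ == "__main__":
    good = ("ACGTACGTAC", [30] * 10, False)
    noisy = ("ACGTACGTNN", [30] * 10, False)
    print(keep_pair(good, good), keep_pair(good, noisy))
```

The comparisons use integer arithmetic (`n_count * 10 > length`) rather than floating-point fractions so that reads sitting exactly on the 10% or 50% boundary are handled exactly.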

[0054] A total of 179 million high-quality paired-end sequencing reads (read length 100 bp) were obtained by re-sequencing 10 A. m. sinisxinyuan individuals, with a total data volume of 17.9 Gb.
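The stated data volume follows directly from the read count and the read length. The short calculation below makes this explicit; it assumes that each of the 179 million sequences is a single 100 bp read, and the even split across the 10 individuals is a simplifying assumption used only to give an average.

```python
n_reads = 179_000_000      # high-quality reads reported above
read_length_bp = 100       # read length reported above
n_individuals = 10         # re-sequenced individuals

total_bases = n_reads * read_length_bp
total_gb = total_bases / 1e9

# Average per individual, assuming the data are spread evenly (an assumption).
average_bases_per_individual = total_bases // n_individuals

print(f"total: {total_gb:.1f} Gb")
print(f"average per individual: {average_bases_per_individual / 1e9:.2f} Gb")
```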

[0055] (6) Sequence Alignment:

[0056] The clean reads were aligned with the BWA software, and default values were adopted for all parameters except “-t -k 32 -M -R”. With Amel 4.5 (obtained from NCBI) taken as the reference genome, the BAM files obtained from the alignment were sorted with the SAMtools software and duplicate sequences were removed.
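As an illustration of how the alignment options above fit together, the sketch below assembles a BWA command line in Python without executing it. It assumes the `bwa mem` subcommand, which is not named in the text but is the BWA algorithm that accepts the `-t`, `-k`, `-M` and `-R` options. The values for the thread count (`-t`) and the read group (`-R`) are not given in the text, so they are left as function arguments; the function name and all file names are hypothetical.

```python
from typing import List


def bwa_mem_command(reference: str, fastq1: str, fastq2: str,
                    threads: int, read_group: str) -> List[str]:
    """Build (but do not run) a bwa mem command with the options named above.

    threads and read_group are caller-supplied because the text does not
    state the values that were used.
    """
    return [
        "bwa", "mem",
        "-t", str(threads),   # number of threads
        "-k", "32",           # minimum seed length, as stated in the text
        "-M",                 # mark shorter split hits as secondary
        "-R", read_group,     # read group header line
        reference, fastq1, fastq2,
    ]


if __name__ == "__main__":
    # Hypothetical file names and read group, for illustration only.
    cmd = bwa_mem_command("reference.fa", "sample_1.fq.gz", "sample_2.fq.gz",
                          threads=4, read_group="@RG\\tID:sample\\tSM:sample")
    print(" ".join(cmd))
```

Returning the command as a list of arguments lets it be passed to `subprocess.run` without shell quoting issues, should the command be executed.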

[0057] After sequence alignment, a sequencing depth of 8× was obtained, with a genome coverage rate of about 90%.

[0058] (7) SNP Detection:

[0059] After the BAM files were obtained, SNP detection was performed. SNP (single nucleotide polymorphism) mainly refers to DNA sequence polymorphism caused by a single-nucleotide variation at the genomic level, including transitions, transversions, etc. of a single base. SAMtools (mpileup -m 2 -F 0.002 -d 1000) was used for SNP detection in individuals. In order to reduce the error rate of SNP detection, the following criteria were applied for filtering:

[0060] a. the number of reads supporting the SNP is no less than 4;

[0061] b. the quality value (MQ) of the SNP is no less than 20.

[0062] A total of 1,409,113 SNP sites were detected in A. m. sinisxinyuan as compared with the reference genome.
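The transition and transversion categories mentioned in the SNP definition above can be told apart from the reference and variant bases alone: a transition exchanges two purines (A, G) or two pyrimidines (C, T), while a transversion exchanges a purine for a pyrimidine or vice versa. A minimal sketch follows; the `classify_snp` helper is hypothetical and is not part of SAMtools.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_snp(ref: str, alt: str) -> str:
    """Classify a single-base substitution as a transition or a transversion."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid or ref == alt:
        raise ValueError(f"not a single-base substitution: {ref!r} -> {alt!r}")
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


if __name__ == "__main__":
    for ref, alt in [("A", "G"), ("C", "T"), ("A", "C"), ("G", "T")]:
        print(ref, alt, classify_snp(ref, alt))
```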

[0063] (8) SNP Annotation:

[0064] ANNOVAR is an efficient software tool that uses up-to-date information to annotate genetic variants detected in multiple genomes. ANNOVAR can perform gene-based annotation, region-based annotation, filter-based annotation and other functions as long as the chromosome on which the variant is located, the start site, the stop site, the reference nucleotide and the variant nucleotide are given. In view of its powerful annotation capability and international acceptance, ANNOVAR was used to annotate the SNP detection results.

[0065] The annotation results show that among the 1,409,113 SNPs, 28,067 are located in the upstream regions of genes (within 1 Kb), 24,778 are located in the downstream regions of genes (within 1 Kb), 62,289 are located in exon regions, 657,772 are located in intron regions, 110 are located at splice sites, and 633,186 are located in the remaining non-gene regions.
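The annotation counts above can be tabulated to show the share of each region. The sketch below uses only the numbers stated in the text; the dictionary keys are illustrative labels. The listed categories add up to slightly fewer sites than the stated total, and the text does not say how the remainder is classified, so the sketch simply reports that difference rather than assigning it to a category.

```python
# Counts as stated in the annotation results; the labels are illustrative.
annotation_counts = {
    "upstream (within 1 Kb)": 28_067,
    "downstream (within 1 Kb)": 24_778,
    "exon": 62_289,
    "intron": 657_772,
    "splice/cleavage site": 110,
    "remaining non-gene regions": 633_186,
}
total_snps = 1_409_113

listed_total = sum(annotation_counts.values())
unlisted = total_snps - listed_total   # sites not covered by the listed categories

for region, count in annotation_counts.items():
    print(f"{region:28s} {count:>9,d} {100 * count / total_snps:6.2f}%")
print(f"{'not listed in the text':28s} {unlisted:>9,d}")
```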

[0066] (9) The corresponding gene sequences can be extracted with the GATK toolkit, using the reference genomic sequence and the detected SNP sequences.

[0067] The results are as shown in FIG. 1 to FIG. 3.

[0071] The foregoing are only preferred embodiments of the present invention. It should be noted that a number of improvements and modifications may be made thereto by a person of ordinary skill in the art without departing from the principles of the present invention, and these improvements and modifications should also be deemed to be within the protection scope of the present invention.