METHOD OF HAPLOTYPING
20220267836 · 2022-08-25
Cpc classification
C12Q2539/105
CHEMISTRY; METALLURGY
C12Q1/6809
CHEMISTRY; METALLURGY
C12Q1/6874
CHEMISTRY; METALLURGY
International classification
C12Q1/6809
CHEMISTRY; METALLURGY
Abstract
The present invention relates to detecting aberrant expression of genes which may be associated with a disease or disorder using haplotype phasing. In particular, the invention relates to a method of obtaining an indication of dysregulation between the expression levels of at least two alleles of a gene in a target eukaryotic cell. The method comprises, for a plurality of genes from one or more target eukaryotic cells, the steps of (a) obtaining pre-mRNAs of at least two alleles of the same gene; and (b) determining the ratios (R_{i,j}) between amounts of the pre-mRNAs of one or more pairs of alleles (i,j) of the same gene.
Claims
1. A method of identifying mutations in alleles of a gene which may be causative of dysregulation of the expression levels of the alleles of the gene in a target eukaryotic cell, the method comprising the steps of: for a plurality of genes from one or more target eukaryotic cells, (a) obtaining pre-mRNAs of at least two alleles of the genes; and (b) determining the ratios (R_{i,j}) between the amounts of pre-mRNAs of one or more pairs of alleles (i,j) of the genes; wherein when R_{i,j}≠1 for a pair of alleles (i,j) of a gene, or in response to determining that R_{i,j}≠1 for a pair of alleles (i,j) of a gene, the method additionally comprises the steps: (c) determining the nucleotide sequences of that pair of alleles; and (d) comparing the nucleotide sequences of that pair of alleles in order to identify differences between the nucleotide sequences of that pair of alleles; wherein one or more of the differences between the nucleotide sequences of the pair of alleles of the gene may be mutations which are causative of the dysregulation of the expression levels of the two alleles of that gene in the target eukaryotic cell, wherein the method is performed in a phased genome sequence, and wherein all sequence differences are then attributed to a specific allele to determine allelic skew across the whole gene and these sequence differences are linked with sequence variation outside the body of the gene.
2. The method as claimed in claim 1, wherein when R_{i,j}<0.9 or R_{i,j}>1.1 for a pair of alleles (i,j) of a gene, or in response to determining that R_{i,j}<0.9 or R_{i,j}>1.1 for a pair of alleles (i,j) of a gene, the method comprises the steps: (c) determining the nucleotide sequences of that pair of alleles; and (d) comparing the nucleotide sequences of that pair of alleles in order to identify differences between the nucleotide sequences of that pair of alleles; wherein one or more of the differences between the nucleotide sequences of the pair of alleles of the gene may be mutations which are causative of the dysregulation of the expression levels of the two alleles of that gene in the target eukaryotic cell.
3. The method as claimed in claim 1, wherein Step (c) is carried out using RNA-Seq.
4. The method as claimed in claim 3, wherein sequences deriving from specific alleles are identified and counted by identifying, within the RNA-Seq data, sequence changes in the introns, exons and downstream transcribed regions, known to be specific to that allele.
5. The method as claimed in claim 1, wherein if R_{i,j}<0.9 or R_{i,j}>1.1, then this provides an indication that there exists a change in the regulatory elements on one allele that control the expression of the gene.
6. The method as claimed in claim 3, wherein the method additionally comprises the further step of carrying out a sequence-based assay that measures the activity of regulatory elements to detect skew on the same allele of the genes found to be skewed using RNA-seq.
7-12. (canceled)
13. The method as claimed in claim 1, wherein the eukaryotic cells are human primary lymphoid cells or primary neuronal cells.
14. The method as claimed in claim 1, wherein the plurality of genes is 2-10, 10-100, 100-500, 500-1000, 1000-5000, 5000-10000, or 10000 or more genes.
15. The method as claimed in claim 1, wherein there are 2 alleles of the same gene in each target eukaryotic cell.
16. The method as claimed in claim 1, wherein the pre-mRNA is polyA+ mRNA.
17. The method as claimed in claim 1, wherein R_{i,j}≠1 means that R_{i,j} is less than 0.95, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05 or 0.01; or R_{i,j} is more than 1.05, 1.1, 1.2, 1.4, 1.6, 2, 2.5, 3, 5, 10, 20 or 100.
18. The method as claimed in claim 5, wherein the regulatory elements are ones which exist within the introns of the gene or outside of the body of the gene.
19. The method as claimed in claim 18, wherein the regulatory elements are ones which exist outside of the coding region of the gene.
20. The method as claimed in claim 6, wherein the sequence-based assay is ATAC-seq, DNase-seq or ChIP-seq.
21. The method as claimed in claim 16, wherein the pre-mRNA is polyA-mRNA obtained from total cellular RNA.
Description
BRIEF DESCRIPTION OF THE FIGURES
[0103]
[0104]
[0105]
EXAMPLES
[0106] The present invention is further illustrated by the following Examples, in which parts and percentages are by weight and degrees are Celsius, unless otherwise stated. It should be understood that these Examples, while indicating preferred embodiments of the invention, are given by way of illustration only. From the above discussion and these Examples, one skilled in the art can ascertain the essential characteristics of this invention, and without departing from the spirit and scope thereof, can make various changes and modifications of the invention to adapt it to various usages and conditions.
[0107] Thus, various modifications of the invention in addition to those shown and described herein will be apparent to those skilled in the art from the foregoing description. Such modifications are also intended to fall within the scope of the appended claims.
Example 1: Use of pre-mRNA in Phased Genomes to Detect Gene Dysregulation Associated with a Specific Haplotype
[0108]
[0109] In Haplotype A, this regulatory element contains a sequence change which alters its activity (shown as a lighter shade). The regulatory interactions between the regulatory element and the genes, mapped by chromosome conformation capture (3C) methods such as Capture-C, are shown as arced lines with arrows. The sequence changes which distinguish the source allele of the pre-mRNAs from both genes lie within the transcribed portions of the genes (for example, introns, exons and downstream regions). In this example the gene dysregulation is caused by a damaged regulatory element, but the same combination of phased sequence changes and pre-mRNA sequencing can be applied to any other mechanism (for example, gain-of-function caused by sequence variation or larger-scale structural variation).
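By way of illustration only, the allele-specific counting and ratio test described above might be sketched as follows in Python. This is a minimal sketch under stated assumptions, not the implementation used in the Examples: the input format (one tuple per base observed at a site in a pre-mRNA read), the phased-variant table, the haplotype labels "A" and "B", and all function names are hypothetical. The default thresholds of 0.9 and 1.1 are taken from claim 2. Counts are pooled over every phased site in the transcribed region of a gene, so a read spanning two sites is counted twice; a real analysis would need to handle this, along with sequencing error and mapping bias.

```python
from collections import Counter
from typing import Dict, Iterable, Optional, Tuple

# Hypothetical phased-variant table (an assumption for this sketch):
# (gene, position) -> (base on haplotype A, base on haplotype B)
PhasedVariants = Dict[Tuple[str, int], Tuple[str, str]]
# One observed base at one site in one pre-mRNA read: (gene, position, base)
Observation = Tuple[str, int, str]


def count_allelic_reads(
    observations: Iterable[Observation], phased: PhasedVariants
) -> Dict[str, Counter]:
    """Attribute each observed base to haplotype A or B and count per gene."""
    counts: Dict[str, Counter] = {}
    for gene, position, base in observations:
        alleles = phased.get((gene, position))
        if alleles is None:
            continue  # site is not a known phased variant; skip it
        hap_a_base, hap_b_base = alleles
        gene_counts = counts.setdefault(gene, Counter())
        if base == hap_a_base:
            gene_counts["A"] += 1
        elif base == hap_b_base:
            gene_counts["B"] += 1
        # a base matching neither haplotype is ignored
    return counts


def allelic_ratio(count_a: int, count_b: int) -> Optional[float]:
    """Return the ratio of haplotype A to haplotype B counts.

    Returns None when the denominator is zero, since the ratio is undefined.
    """
    if count_b == 0:
        return None
    return count_a / count_b


def is_skewed(ratio: Optional[float], low: float = 0.9, high: float = 1.1) -> bool:
    """Flag a gene when the ratio falls outside [low, high] (claim 2 defaults)."""
    return ratio is not None and (ratio < low or ratio > high)
```

A gene flagged by `is_skewed` would then be a candidate for steps (c) and (d) of claim 1, in which the sequences of the two alleles are compared. The fixed cut-off stands in for whatever statistical test a real analysis would apply to the counts.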
[0110]
[0111]
[0112]
[0113]
[0114]
[0115]
Example 2: Use of pre-mRNA in Phased Genomes to Detect Dysregulation of the IKZF1 Gene Associated with a Specific Sequence Variant in a Regulatory Element
[0116]
[0117]
[0118]
[0119]
Example 3: The Use of Allelic Skew in pre-mRNA in Phased Genomes Allows for Regulatory Variation to be Analysed at Unprecedented Scale in Primary Cells
[0120]
REFERENCES
[0121] Lee J C et al., Cell, vol. 155, 2013, “Human SNP Links Differential Outcomes in Inflammatory and Infectious Disease to a FOXO3-Regulated Pathway”, pages 57-69
[0122] Kowalczyk, M. S. et al. Intragenic enhancers act as alternative promoters. Mol Cell 45, 447-58 (2012).
[0123] Quinn E M, et al. (2013) Development of Strategies for SNP Detection in RNA-Seq Data: Application to Lymphoblastoid Cell Lines and Evaluation Using 1000 Genomes Data. PLoS ONE 8(3): e58815. https://doi.org/10.1371/journal.pone.0058815
[0124] Rainbow et al., BIOCHEMICAL SOCIETY TRANSACTIONS, vol. 36, 2008, “Commonality in the genetic control of Type 1 diabetes in humans and NOD mice: variants of genes in the IL-2 pathway are associated with autoimmune diabetes in both species”, page 312
[0125] Schwessinger R, et al. (2017) Sasquatch: predicting the impact of regulatory SNPs on transcription factor binding from cell- and tissue-specific DNase footprints. Genome Res. 2017 Oct;27(10):1730-1742. PMCID: PMC5630036.
[0126] Sigurdsson et al., HUMAN MOLECULAR GENETICS, vol. 17, 2008, “A risk haplotype of STAT4 for systemic lupus erythematosus is over-expressed, correlates with anti-dsDNA and shows additive effects with two risk alleles of IRF5”, pages 2868-2876
[0127] Thomas et al., EPIGENETICS & CHROMATIN, vol. 4, 2011, “Allele-specific transcriptional elongation regulates monoallelic expression of the IGF2BP1 gene”, page 14