POLYNUCLEOTIDE
20210251203 · 2021-08-19
Cpc classification
A01K2217/206
HUMAN NECESSITIES
C12N2830/008
CHEMISTRY; METALLURGY
C12N15/8509
CHEMISTRY; METALLURGY
A01K2217/15
HUMAN NECESSITIES
A01K2227/706
HUMAN NECESSITIES
C12N2800/30
CHEMISTRY; METALLURGY
Y02A50/30
GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
A01K2217/072
HUMAN NECESSITIES
Abstract
The invention relates to polynucleotides, and in particular to novel polynucleotides which represent promoter sequences. The invention is especially concerned with novel promoters for use in germline expression, in that they are substantially operative in only germline cells. In particular, the promoters initiate transcription of genes in the germline cells of an arthropod, and can be used in a gene drive. The invention is also concerned with vectors and gene drive constructs comprising the polynucleotides of the invention. The invention is also concerned with methods of producing arthropods comprising vectors containing such promoters.
Claims
1-20. (canceled)
21. A gene drive genetic construct comprising a polynucleotide comprising: a. a nucleic acid sequence substantially as set out in any one of SEQ ID NO: 1, 2 or 3, or a variant or fragment thereof having at least 50% sequence identity with SEQ ID NO: 1, 2 or 3; b. an expression cassette comprising the polynucleotide of (a); or c. a recombinant vector comprising the polynucleotide of (a) or the expression cassette of (b).
22. The gene drive genetic construct according to claim 21, wherein the polynucleotide sequence substantially restricts the activity of the gene drive genetic construct for germline expression of the construct in an arthropod.
23. The gene drive genetic construct according to claim 22, wherein the arthropod is an insect, optionally wherein the insect is a mosquito.
24. The gene drive genetic construct according to claim 22, wherein the arthropod is a mosquito and wherein the mosquito is of the subfamily Anophelinae, optionally wherein the mosquito is selected from a group consisting of: Anopheles gambiae; Anopheles coluzzii; Anopheles merus; Anopheles arabiensis; Anopheles quadriannulatus; Anopheles stephensi; Anopheles funestus; and Anopheles melas.
25. A method of producing a genetically modified host cell or arthropod comprising; (a) introducing, into a host cell, an expression cassette comprising a polynucleotide comprising a nucleic acid sequence substantially as set out in any one of SEQ ID No: 1, 2 or 3, or a variant or fragment thereof having at least 50% sequence identity with SEQ ID No: 1, 2 or 3, operably linked to a transgene; or (b) introducing, into an arthropod gene, a gene drive genetic construct comprising: (i) a polynucleotide comprising a nucleic acid sequence substantially as set out in any one of SEQ ID No: 1, 2 or 3, or a variant or fragment thereof having at least 50% sequence identity with SEQ ID No: 1, 2 or 3; (ii) an expression cassette comprising the polynucleotide of (i); or (iii) a recombinant vector comprising the polynucleotide of (i) or the expression cassette of (ii).
26-33. (canceled)
34. An expression cassette comprising a polynucleotide comprising a nucleic acid sequence substantially as set out in any one of SEQ ID No: 1, 2 or 3, or a variant or fragment thereof having at least 50% sequence identity with SEQ ID No: 1, 2 or 3, operably linked to a transgene.
35. The expression cassette according to claim 34, wherein the transgene is selected from the group consisting of: a CRISPR nuclease, a Zinc finger nuclease, a TALEN-derived nuclease, Cre recombinase, a piggyBac transposase; and a φC31 integrase.
36. The expression cassette according to claim 34, wherein the transgene is a CRISPR nuclease, optionally wherein the transgene is Cpf1 or Cas9.
37. An expression cassette according to claim 34, wherein the polynucleotide comprises or consists of a nucleic acid sequence having at least 80% or 90% sequence identity with SEQ ID No: 1, or a variant or fragment thereof.
38. An expression cassette according to claim 34, wherein the polynucleotide comprises or consists of a nucleic acid sequence having at least 95% or 99% sequence identity with SEQ ID No: 1, or a variant or fragment thereof.
39. An expression cassette according to claim 34, wherein the polynucleotide sequence comprises or consists of a nucleic acid sequence having at least 80% or 90% sequence identity with SEQ ID No: 2, or a variant or fragment thereof.
40. An expression cassette according to claim 34, wherein the polynucleotide sequence comprises or consists of a nucleic acid sequence having at least 95% or 99% sequence identity with SEQ ID No: 2, or a variant or fragment thereof.
41. An expression cassette according to claim 34, wherein the polynucleotide sequence comprises or consists of a nucleic acid sequence having at least 80% or 90% sequence identity with SEQ ID No: 3, or a variant or fragment thereof.
42. An expression cassette according to claim 34, wherein the polynucleotide sequence comprises or consists of a nucleic acid sequence having at least 95% or 99% sequence identity with SEQ ID No: 3, or a variant or fragment thereof.
43. The expression cassette according to claim 34, wherein the polynucleotide initiates gene expression of a coding sequence operatively connected thereto in the germline cells only.
44. The expression cassette according to claim 34, wherein the polynucleotide sequence is a promoter sequence that is substantially operative in only germline cells of an arthropod, optionally wherein the polynucleotide is a promoter sequence which is substantially operative in the male and female mosquito gonad cells at the time of meiosis.
Description
[0104] For a better understanding of the invention, and to show how embodiments of the same may be carried into effect, reference will now be made, by way of example, to the accompanying Figures.
EXAMPLES
[0122] The invention described herein relies on inserting site-specific nuclease genes into a locus of choice, in formations that both confer some trait of interest on an individual and lead to a biased inheritance of the trait. The approach relies on “homing” leading to suppression. The invention is focused on population suppression, whereby the gene drive construct is designed to insert within a target gene in such a way that the gene product, or a specific isoform thereof, is disrupted. To build the nuclease-based gene drive of the invention, the nuclease gene is inserted within its own recognition sequence in the genome such that a chromosome containing the nuclease gene cannot be cut, but chromosomes lacking it are cut. When an individual contains both a nuclease-carrying chromosome and an unmodified chromosome (i.e. heterozygous for the gene drive), the unmodified chromosome is cut by the nuclease. The broken chromosome is usually repaired using the nuclease-containing chromosome as a template and, by the process of homologous recombination, the nuclease is copied into the targeted chromosome. If this process, called “homing”, is allowed to proceed in the germline, then it results in a biased inheritance of the nuclease gene, and its associated disruption, because sperm or eggs produced in the germline can inherit the gene from either the original nuclease-carrying chromosome, or the newly modified chromosome.
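As a rough illustration of the biased inheritance described above, the following Python sketch computes the fraction of gametes from a drive heterozygote that carry the nuclease gene. The function name and the `homing_rate` parameter are illustrative assumptions and do not appear in the specification; repair by end-joining is ignored here.

```python
def drive_transmission(homing_rate: float) -> float:
    """Fraction of gametes from a drive/wild-type heterozygote that carry the drive.

    Half of the gametes carry the original nuclease-bearing chromosome. Of the
    other half, a fraction `homing_rate` is assumed to be converted by homing.
    This is a simplification: cut chromosomes repaired by end-joining are ignored.
    """
    if not 0.0 <= homing_rate <= 1.0:
        raise ValueError("homing_rate must lie between 0 and 1")
    return 0.5 + 0.5 * homing_rate


if __name__ == "__main__":
    for h in (0.0, 0.5, 1.0):
        print(f"homing rate {h:.1f}: drive in {drive_transmission(h):.0%} of gametes")
```

With no homing the function returns the Mendelian value of 0.5, and with complete homing every gamete carries the drive.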
[0123] Due to the negative reproductive load the gene drive imposes, selection can be expected to occur for resistant alleles. The most likely source of such resistance is sequence variation at the target site that prevents the nuclease cutting yet at the same time permits a functional product from the target gene. Such variation can pre-exist in a population or can be created by activity of the nuclease itself—a small proportion of cut chromosomes, rather than using the homologous chromosome as a template, can instead be repaired by end-joining (EJ), which can introduce small insertions or deletions (“indels”) or base substitutions during the repair of the target site. In-frame indels or conservative substitutions might be expected to show selection in the presence of a gene drive. The inventors have previously observed target site resistance in cage experiments (data not shown) and found that end-joining in chromosomes of the early embryo, due to parentally-deposited nuclease, was likely to be the predominant source of the resistant alleles at the target site.
[0124] To mitigate and prevent the emergence of resistant alleles, the strategy being investigated by the inventors involves reducing the embryonic source of end-joining mutations by expressing the nuclease from promoters that show tighter, germline-restricted expression and less maternal and paternal deposition, e.g. nanos (nos), zero population growth (zpg), and exuperantia (exu).
[0125] Materials and Methods
[0126] Pooled Amplicon Sequencing of Caged Experiments
[0127] Pooled amplicon sequencing was performed as described before in Hammond and Kyrou (2017).sup.6. Up to 600 adults were homogenized from the cage trial experiments at generations 0, 2, 5, and 8, and extracted in pooled groups using the Wizard Genomic DNA purification kit (Promega). A 332 bp locus spanning the target site was amplified from 90 ng of each genomic sample using the KAPA HiFi HotStart Ready Mix PCR kit (Kapa Biosystems) in 50 μL reactions. Primers were designed to include the Illumina Nextera Transposase Adapters (underlined) for downstream library preparation and sequencing:
7280-Illumina-F: TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGGGAGAAGGTAAATGCGCCAC (SEQ ID No: 63)
7280-Illumina-R: GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGGCGCTTCTACACTCGCTTCT (SEQ ID No: 64)
The primers were annealed at 68° C. for 20 seconds to minimize off-target amplification. In order to maintain an accurate representation of the allele frequencies at the target site, 25 μL of the PCR reaction was removed at 20 cycles, whilst the reaction was non-saturated, and stored at −20° C. The remaining 25 μL was run for an additional 20 cycles to verify the reaction on an agarose gel. The non-saturated samples were purified with AMPure XP beads (Beckman Coulter) and used in a second PCR reaction in which dual indices and Illumina sequencing adapters from the Nextera XT Index Kit were added according to the Illumina 16S Metagenomic Sequencing Library Preparation protocol (Part #15044223). The PCR product was purified again with AMPure XP beads and validated with an Agilent Bioanalyzer 2100. The normalized libraries were sequenced in a pooled reaction at a concentration of 10 pM on an Illumina Nano flowcell v2 using the Illumina MiSeq instrument with a 2×250 bp paired-end run.
[0128] Use of zpg Promoter to Drive Cas9 Expression in Gene Drive Constructs
[0129] The gene drive construct targeting dsxF is identical in design to that described in Hammond et al. except for the promoter and 3′ UTR surrounding the Cas9 gene: where previously these were from the ortholog of vasa (AGAP008578), in the current construct they are replaced by 1074 bp upstream and 1034 bp downstream of the germline-specific gene AGAP006241, the putative ortholog of zero population growth (zpg). The inventors compared the fertility and homing rates in individuals heterozygous for vasa- and zpg-driven CRISPR.sup.h constructs at the exact same target locus in AGAP007280, previously described in Hammond et al.
[0130] Probability of Stochastic Loss of the Drive as a Function of Initial Number of Male Drive Heterozygotes
[0131] To calculate the probability of stochastic loss of the drive in the cage experiment setup, for each initial number ($h_0$) of male drive heterozygous individuals, the inventors recorded the number of times, out of 1000 simulations of the stochastic cage model, that the drive was not present at 40 generations (and consequently population elimination did not occur). Each data point represents 1000 individual simulations of the stochastic cage model.
[0132] In Vitro Cleavage Assay Against Wild Type and SNP Variant Target Site
[0133] The inventors performed an in vitro cleavage assay to test the ability of the gRNA used in this study to cleave the target site that incorporates the SNP found in wild populations in Africa.
[0134] Amplification of Promoter and Terminator Sequences
[0135] The published Anopheles gambiae genome sequence provided in Vectorbase (Giraldo-Calderon et al, 2015) was used as a reference to design primers in order to amplify the promoters and terminators of the three Anopheles gambiae genes: AGAP006098 (nanos), AGAP006241 (zero population growth) and AGAP007365 (exuperantia).
[0136] Using the primers provided in Table 3 the inventors performed PCRs on 40 ng of genomic material extracted from wild type mosquitoes of the G3 strain using the Wizard Genomic DNA purification kit (Promega). The primers were modified to contain suitable Gibson assembly overhangs (underlined) for subsequent vector assembly. Promoter and terminator fragments were 2092 bp and 601 bp for nos, 1074 bp and 1034 bp for zpg, and 849 and 1173 bp for exu, respectively. The sequences of all regulatory fragments can be found in Table 4.
[0137] Generation of CRISPR.sup.h Drive Constructs
[0138] The inventors modified available template plasmids used previously in Hammond et al. (2016).sup.2 to replace and test alternative promoters and terminators for expressing the Cas9 protein in the germline of the mosquito. Plasmid p16501, which was used in that study, carried a human-optimised Cas9 (hCas9) under the control of the vas22 promoter and terminator, an RFP cassette under the control of the neuronal 3×P3 promoter and a U6:sgRNA cassette targeting the AGAP007280 gene in Anopheles gambiae.
[0139] The hCas9 fragment and backbone (sequence containing 3×P3::RFP and a U6::gRNA cassette), were excised from plasmid p16501 using the restriction enzymes XhoI+PacI and AscI+AgeI respectively. Gel electrophoresis fragments were then re-assembled with PCR amplified promoter and terminator sequences of zpg, nos or exu by Gibson assembly to create new CRISPR.sup.h vectors named p17301 (nos), p17401 (zpg) and p17501 (exu).
[0140] Transformation of Drive Constructs into Genome at AGAP007280
[0141] CRISPR.sup.h constructs containing Cas9 under control of the zpg, nos and exu promoters were inserted into an hdrGFP docking site previously generated at the target site in AGAP007280 (Hammond et al. 2016).
[0142] Anopheles gambiae mosquitoes of the hdrGFP-7280 strain were reared under standard conditions of 80% relative humidity and 28° C., and freshly laid embryos were used for microinjections as described before (Fuchs et al, 2013). Recombinase-mediated cassette exchange (RMCE) reactions were performed by injecting each of the new CRISPR.sup.h constructs into embryos of the hdrGFP docking line that was previously generated at the target site in AGAP007280 (Hammond et al. 2016). For each construct, embryos were injected with a solution containing CRISPR.sup.h (400 ng/μl) and a vas2::integrase helper plasmid (400 ng/μl) (Volohonsky et al, 2015). Surviving G.sub.0 larvae were crossed to wild type, and transformants were identified by a change from GFP (present in the hdrGFP docking site) to DsRed linked to the CRISPR.sup.h construct, which should indicate successful RMCE.
[0143] Molecular Confirmation of Gene Targeting and Cassette Integration
[0144] Successful RMCE integration of CRISPR.sup.h constructs into the genome at AGAP007280 was confirmed by PCR using genomic DNA extracted using the Wizard Genomic DNA purification kit (Promega). Primers binding the integrated cassette (hCas9-F7 and RFP2qF) were used with primers that bind the neighbouring genomic integration site in AGAP007280 (Seq-7280-F and Seq-7280-R) to verify both the presence and the orientation of the CRISPR.sup.h cassette. Primer sequences can be found in Supplementary Table S2.
[0145] Caged Experiments
[0146] The cage trials were performed following the same principle described before in Hammond et al. (2016). Briefly, heterozygous zpg-CRISPR.sup.h that had inherited the drive from a female parent were mixed with age-matched wild type at L1 at 10% or 50% frequency of heterozygotes. At the pupal stage, 600 were selected to initiate replicate cages for each initial release frequency. Adult mosquitoes were left to mate for 5 days before they were blood fed on anesthetized mice. Two days after, the mosquitoes were left to lay in a 300 ml egg bowl filled with water and lined with filter paper. Each generation, all eggs were allowed two days to hatch and 600 randomly selected larvae were screened to determine the transgenic rate by presence of DsRed and then used to seed the next generation. From generation 4 onwards, adults were blood-fed a second time and the entire egg output photographed and counted using JMicroVision V1.27. Larvae were reared in 2 L trays in 500 ml of water, allowing a density of 200 larvae per tray. After recovering progeny, the entire adult population was collected and entire samples from generation 0, 2, 5, and 8 were used for pooled amplicon sequence analysis.
[0147] Phenotypic Assays to Measure Fertility and Rates of Homing
[0148] Heterozygous CRISPR.sup.h/+ mosquitoes from each of the three new lines, zpg-CRISPR.sup.h, nos-CRISPR.sup.h and exu-CRISPR.sup.h, were mated to an equal number of wild type mosquitoes for 5 days in reciprocal male and female crosses. Females were blood fed on anesthetized mice on the sixth day and, after 3 days, a minimum of 40 were allowed to lay individually into a 25-ml cup filled with water and lined with filter paper. The entire larval progeny of each individual was counted and a minimum of 50 larvae were screened to determine the frequency of the DsRed that is linked to the CRISPR.sup.h allele by using a Nikon inverted fluorescence microscope (Eclipse TE200). Females that failed to give progeny and had no evidence of sperm in their spermathecae were excluded from the analysis. Statistical differences between genotypes were assessed using the Kruskal-Wallis test.
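The Kruskal-Wallis comparison mentioned above can be sketched in plain Python for the three-group case. This is a minimal sketch and not the inventors' analysis code: the function names are assumptions, the input numbers are toy values rather than experimental data, and the p-value uses the chi-square approximation, which has the closed form exp(−H/2) only for three groups (two degrees of freedom).

```python
import math
from collections import Counter


def average_ranks(values):
    """Rank values from 1 to N, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # ranks are 1-based
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


def kruskal_wallis_three_groups(a, b, c):
    """Return (H, p) for three independent samples, with tie correction."""
    groups = [list(a), list(b), list(c)]
    pooled = [x for g in groups for x in g]
    n_total = len(pooled)
    ranks = average_ranks(pooled)
    h, offset = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[offset:offset + len(g)])
        h += rank_sum ** 2 / len(g)
        offset += len(g)
    h = 12.0 / (n_total * (n_total + 1)) * h - 3 * (n_total + 1)
    ties = sum(t ** 3 - t for t in Counter(pooled).values())
    correction = 1 - ties / (n_total ** 3 - n_total)
    if correction == 0:  # all observations identical
        return 0.0, 1.0
    h /= correction
    return h, math.exp(-h / 2)  # chi-square tail, exact for 2 degrees of freedom


if __name__ == "__main__":
    # Toy numbers for illustration only, not measured clutch sizes.
    h, p = kruskal_wallis_three_groups([1, 2, 3], [4, 5, 6], [7, 8, 9])
    print(f"H = {h:.2f}, p = {p:.4f}")
```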
[0149] Population Genetics Model
[0150] To model the results of the cage experiments, the inventors used discrete-generation recursion equations for the genotype frequencies, treating males and females separately. $F_{ij}(t)$ and $M_{ij}(t)$ denote the frequency of females (or males) of genotype i/j in the total female (or male) population. The inventors considered three alleles, W (wildtype), D (driver) and R (non-functional resistant), and therefore six genotypes.
[0151] Homing
[0152] Adults of genotype W/D produce gametes at meiosis in the ratio W:D:R as follows:
[0153] $(1-d_f)(1-u_f) : d_f : (1-d_f)\,u_f$ in females
[0154] $(1-d_m)(1-u_m) : d_m : (1-d_m)\,u_m$ in males
[0155] Here, $d_f$ and $d_m$ are the rates of transmission of the driver allele in the two sexes and $u_f$ and $u_m$ are the fractions of non-drive gametes that are non-functional resistant (R alleles) from meiotic end-joining. In all other genotypes, inheritance is Mendelian.
Fitness
Let $w_{ij} \le 1$ represent the fitness of genotype i/j relative to $w_{WW} = 1$ for the wild-type homozygote. The inventors assume no fitness effects in males. Fitness effects in females are manifested as differences in the relative ability of genotypes to participate in mating and reproduction. The inventors assume the target gene is needed for female fertility, thus D/D, D/R and R/R females are sterile; there is no reduction in fitness in females with only one copy of the target gene (W/D, W/R).
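The W:D:R gamete ratios for W/D adults can be written as a small Python function. This is a sketch under stated assumptions: the function name and dictionary keys are illustrative, the drive rates are the Table 1 estimates (0.9985 in females, 0.9635 in males), and the single meiotic end-joining estimate in Table 1 (0.4685) is assumed to apply to both sexes.

```python
def heterozygote_gametes(d: float, u: float) -> dict:
    """Gamete fractions W:D:R produced by a W/D adult.

    d: rate of transmission of the driver allele.
    u: fraction of non-drive gametes that are non-functional resistant (R).
    """
    if not (0.0 <= d <= 1.0 and 0.0 <= u <= 1.0):
        raise ValueError("d and u must lie between 0 and 1")
    return {
        "W": (1 - d) * (1 - u),  # non-drive gametes that remain wild type
        "D": d,                  # gametes carrying the driver
        "R": (1 - d) * u,        # non-drive gametes that became resistant
    }


if __name__ == "__main__":
    U = 0.4685  # Table 1 estimate, assumed equal in both sexes
    for sex, d in (("female", 0.9985), ("male", 0.9635)):
        fractions = heterozygote_gametes(d, U)
        print(sex, {allele: round(value, 5) for allele, value in fractions.items()})
```

The three fractions always sum to one, because the non-drive share (1 − d) is split between W and R in the proportions (1 − u) and u.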
[0156] Parental Effects
[0157] The inventors consider that further cleavage of the W allele and repair can occur in the embryo if nuclease is present, due to one or both contributing gametes derived from a parent with one or two driver alleles. The presence of parental nuclease is assumed to affect somatic cells and therefore female fitness but has no effect in germline cells that would alter gene transmission. Previously, embryonic EJ effects (maternal only) were modelled as acting immediately in the zygote [1, 2]. Here, the inventors consider that experimental measurements of female individuals of different genotypes and origins show a range of fitnesses, suggesting that individuals may be mosaics with intermediate phenotypes. The inventors therefore model genotypes W/X (X=W, D, R) with parental nuclease as individuals with an intermediate reduced fitness $w_{WX}^{10}$, $w_{WX}^{01}$, or $w_{WX}^{11}$ depending on whether nuclease was derived from a transgenic mother, father, or both. The inventors assume that parental effects are the same whether the parent(s) had one or two drive alleles. For simplicity, a baseline reduced fitness of $w_{10}$, $w_{01}$, $w_{11}$ is assigned to all genotypes W/X (X=W, D, R) with maternal, paternal and maternal/paternal effects, with fitness estimated as the product of mean egg production values and hatching rates relative to wild-type in Table 1 in the deterministic model. In the stochastic version of the model, egg production from female individuals with different parentage is sampled with replacement from experimental values.
TABLE 1. Parameters for stochastic cage model
Parameter | Estimate | Method of estimation
Mating probability | 0.85 for heterozygotes; 0 for D/D, D/R and R/R homozygotes | Estimated from Hammond et al. 2017
Egg production from wildtype female (no parental nuclease) | Mean 137.4. Sampling with replacement of observed values (10, 61, 96, 98, 111, 111, 113, 127, 128, 129, 132, 132, 134, 135, 137, 138, 138, 139, 142, 142, 146, 146, 149, 152, 152, 152, 158, 160, 162, 164, 170, 179, 186, 189, 191) | From assays of mated females
Egg production from W/D heterozygote female (nuclease from ♀) | Mean 118.96. Sampling with replacement of observed values (12, 31, 76, 90, 96, 100, 106, 106, 107, 113, 117, 118, 119, 130, 133, 136, 136, 136, 137, 138, 139, 142, 143, 145, 146, 148, 157, 174) | From assays of mated females
Egg production from W/D heterozygote female (nuclease from ♂) | Mean 59.67. Sampling with replacement of observed values (0, 0, 0, 0, 0, 34, 47, 50, 65, 105, 113, 115, 115, 125, 126) | From assays of mated females
Hatching probability, wildtype female (no parental nuclease) | 0.941 | From assays of mated females
Hatching probability, W/D heterozygote female (nuclease from ♀) | 0.707 | From assays of mated females
Hatching probability, W/D heterozygote female (nuclease from ♂) | 0.47 | From assays of mated females
Probability of emergence from pupa (survival from larva) | 0.8708 | Average of observations over all generations and both cage experiments
Drive in W/D females | 0.9985 | Observed fraction transgenic from assays
Drive in W/D males | 0.9635 | Observed fraction transgenic from assays
Meiotic EJ parameter (fraction non-drive alleles that are resistant) | 0.4685 | Estimated from Hammond et al. 2016
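The two uses of Table 1 described in the text, a relative fitness for the deterministic model and sampling with replacement for the stochastic model, can be sketched as follows. The variable and function names are assumptions made for illustration. Table 1 gives no separate measurements for females with nuclease from both parents, so no value for that case is computed.

```python
import random
from statistics import mean

# Observed egg counts copied from Table 1.
EGGS_WT = [10, 61, 96, 98, 111, 111, 113, 127, 128, 129, 132, 132, 134, 135,
           137, 138, 138, 139, 142, 142, 146, 146, 149, 152, 152, 152, 158,
           160, 162, 164, 170, 179, 186, 189, 191]
EGGS_MATERNAL = [12, 31, 76, 90, 96, 100, 106, 106, 107, 113, 117, 118, 119,
                 130, 133, 136, 136, 136, 137, 138, 139, 142, 143, 145, 146,
                 148, 157, 174]
EGGS_PATERNAL = [0, 0, 0, 0, 0, 34, 47, 50, 65, 105, 113, 115, 115, 125, 126]

HATCH_WT, HATCH_MATERNAL, HATCH_PATERNAL = 0.941, 0.707, 0.47


def relative_fitness(eggs, hatch):
    """Mean egg production times hatching rate, relative to wild type."""
    return (mean(eggs) * hatch) / (mean(EGGS_WT) * HATCH_WT)


def sample_egg_production(eggs, rng):
    """Draw one egg count with replacement from the observed values."""
    return rng.choice(eggs)


if __name__ == "__main__":
    print("w_10 (nuclease from mother):", round(relative_fitness(EGGS_MATERNAL, HATCH_MATERNAL), 3))
    print("w_01 (nuclease from father):", round(relative_fitness(EGGS_PATERNAL, HATCH_PATERNAL), 3))
    rng = random.Random(0)  # fixed seed only to make the demonstration repeatable
    print("five sampled clutches:", [sample_egg_production(EGGS_PATERNAL, rng) for _ in range(5)])
```

The list means reproduce the means stated in Table 1 (137.4, 118.96 and 59.67), which is a useful check that the values were transcribed correctly.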
[0158] Recursion Equations
[0159] The inventors firstly considered the gamete contributions from each genotype, including parental effects on fitness. In addition to W and R gametes that are derived from parents that have no drive allele and therefore have no deposited nuclease, gametes from W/D females and W/D, D/R and D/D males carry nuclease that is transmitted to the zygote, and these are denoted as $W^*$, $D^*$, $R^*$. The proportions $e_i$ of type i alleles in eggs produced by females participating in reproduction are given in terms of male and female genotype frequencies below. Frequencies of mosaic individuals with parental effects (i.e., reduced fitness) due to nuclease from mothers, fathers or both are denoted by superscripts 10, 01 or 11.
$$e_W = \Bigl(F_{WW} + w_{WW}^{10}F_{WW}^{10} + w_{WW}^{01}F_{WW}^{01} + w_{WW}^{11}F_{WW}^{11} + \tfrac{1}{2}\bigl(F_{WR} + w_{WR}^{10}F_{WR}^{10} + w_{WR}^{01}F_{WR}^{01} + w_{WR}^{11}F_{WR}^{11}\bigr)\Bigr) \big/ [\ldots]$$
$$e_R = \tfrac{1}{2}\bigl(F_{WR} + w_{WR}^{10}F_{WR}^{10} + w_{WR}^{01}F_{WR}^{01} + w_{WR}^{11}F_{WR}^{11}\bigr) \big/ [\ldots]$$
$$e^*_W = (1-d_f)(1-u_f)\bigl(w_{WD}^{10}F_{WD}^{10} + w_{WD}^{01}F_{WD}^{01} + w_{WD}^{11}F_{WD}^{11}\bigr) \big/ [\ldots]$$
$$e^*_D = d_f\bigl(w_{WD}^{10}F_{WD}^{10} + w_{WD}^{01}F_{WD}^{01} + w_{WD}^{11}F_{WD}^{11}\bigr) \big/ [\ldots]$$
$$e^*_R = (1-d_f)\,u_f\bigl(w_{WD}^{10}F_{WD}^{10} + w_{WD}^{01}F_{WD}^{01} + w_{WD}^{11}F_{WD}^{11}\bigr) \big/ [\ldots]$$
[0160] The proportions $s_i$ of type i alleles in sperm are:
$$s_W = \Bigl(M_{WW} + M_{WW}^{10} + M_{WW}^{01} + M_{WW}^{11} + \tfrac{1}{2}\bigl(M_{WR} + M_{WR}^{10} + M_{WR}^{01} + M_{WR}^{11}\bigr)\Bigr) \big/ [\ldots]$$
$$s_R = \Bigl(M_{RR} + \tfrac{1}{2}\bigl(M_{WR} + M_{WR}^{10} + M_{WR}^{01} + M_{WR}^{11}\bigr)\Bigr) \big/ [\ldots]$$
$$s^*_W = (1-d_m)(1-u_m)\bigl(M_{WD}^{10} + M_{WD}^{01} + M_{WD}^{11}\bigr) \big/ [\ldots]$$
$$s^*_D = \Bigl(M_{DD} + \tfrac{1}{2}M_{DR} + d_m\bigl(M_{WD}^{10} + M_{WD}^{01} + M_{WD}^{11}\bigr)\Bigr) \big/ [\ldots]$$
$$s^*_R = \Bigl(\tfrac{1}{2}M_{DR} + (1-d_m)\,u_m\bigl(M_{WD}^{10} + M_{WD}^{01} + M_{WD}^{11}\bigr)\Bigr) \big/ [\ldots]$$
[0161] Above,
[0162] To model cage experiments, the inventors started with an equal number of males and females, with an initial frequency of wildtype females in the female population of $F_{WW} = 1$, wildtype males in the male population of $M_{WW} = 1/2$, and $M_{WD}^{01} = 1/2$ heterozygote drive males that inherited the drive from their fathers. Assuming a 50:50 ratio of males and females in progeny, after the starting generation, genotype frequencies of type i/j in the next generation (t+1) are the same in males and females, $F_{ij}(t+1) = M_{ij}(t+1)$. Both are given by $G_{ij}(t+1)$ in the following set of equations in terms of the gamete proportions in the previous generation, assuming random mating:
$$G_{WW}(t+1) = e_W s_W$$
$$G_{WW}^{10}(t+1) = e^*_W s_W$$
$$G_{WW}^{01}(t+1) = e_W s^*_W$$
$$G_{WW}^{11}(t+1) = e^*_W s^*_W$$
$$G_{WD}^{10}(t+1) = e^*_D s_W$$
$$G_{WD}^{01}(t+1) = e_W s^*_D$$
$$G_{WD}^{11}(t+1) = e^*_W s^*_D + e^*_D s^*_W$$
$$G_{WR}(t+1) = e_W s_R + e_R s_W$$
$$G_{WR}^{10}(t+1) = e^*_W s_R + e^*_R s_W$$
$$G_{WR}^{01}(t+1) = e_W s^*_R + e_R s^*_W$$
$$G_{WR}^{11}(t+1) = e^*_W s^*_R + e^*_R s^*_W$$
$$G_{DD}(t+1) = e^*_D s^*_D$$
$$G_{DR}(t+1) = (e_R + e^*_R)\,s^*_D + e^*_D\,(s_R + s^*_R)$$
$$G_{RR}(t+1) = (e_R + e^*_R)(s_R + s^*_R)$$
[0163] The frequency of transgenic individuals can be compared with experiment (fraction of RFP+ individuals):
$$f_{RFP+} = F_{WD}^{10} + F_{WD}^{01} + F_{WD}^{11} + F_{DD} + F_{DR} + M_{WD}^{10} + M_{WD}^{01} + M_{WD}^{11} + M_{DD} + M_{DR}$$
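One generation of the recursion can be sketched in Python as below. The inventors' calculations were carried out in Wolfram Mathematica, so this is an independent illustration and not their code. Several points are assumptions: the denominators of the gamete proportions are not legible in this text and are taken here to be the normalising sums that make the egg and sperm proportions each add up to one; the genotype keys such as "WD01" and all function names are illustrative; the fitness values in the usage example are rounded from Table 1, with the value for nuclease from both parents being a purely hypothetical placeholder. The transgenic fraction is computed within one sex, which is valid after the starting generation because male and female genotype frequencies are then equal.

```python
EFFECTS = ("10", "01", "11")  # parental nuclease from mother, father, or both


def _normalise(raw):
    """Scale gamete contributions so they sum to one (assumed denominator)."""
    total = sum(raw.values())
    if total <= 0:
        raise ValueError("no reproducing individuals of this sex")
    return {allele: value / total for allele, value in raw.items()}


def egg_proportions(F, w, d_f, u_f):
    """Egg allele proportions; W/X females with parental nuclease are weighted by w."""
    def weighted(genotype):
        return F.get(genotype, 0.0) + sum(w[k] * F.get(genotype + k, 0.0) for k in EFFECTS)
    ww, wr, wd = weighted("WW"), weighted("WR"), weighted("WD")
    return _normalise({
        "W": ww + wr / 2,
        "R": wr / 2,
        "W*": (1 - d_f) * (1 - u_f) * wd,
        "D*": d_f * wd,
        "R*": (1 - d_f) * u_f * wd,
    })


def sperm_proportions(M, d_m, u_m):
    """Sperm allele proportions; no fitness effects are assumed in males."""
    def pooled(genotype):
        return M.get(genotype, 0.0) + sum(M.get(genotype + k, 0.0) for k in EFFECTS)
    ww, wr, wd = pooled("WW"), pooled("WR"), pooled("WD")
    dd, dr, rr = (M.get(g, 0.0) for g in ("DD", "DR", "RR"))
    return _normalise({
        "W": ww + wr / 2,
        "R": rr + wr / 2,
        "W*": (1 - d_m) * (1 - u_m) * wd,
        "D*": dd + dr / 2 + d_m * wd,
        "R*": dr / 2 + (1 - d_m) * u_m * wd,
    })


def next_generation(e, s):
    """Genotype frequencies G_ij(t+1) under random mating."""
    e_r, s_r = e["R"] + e["R*"], s["R"] + s["R*"]
    return {
        "WW": e["W"] * s["W"], "WW10": e["W*"] * s["W"],
        "WW01": e["W"] * s["W*"], "WW11": e["W*"] * s["W*"],
        "WD10": e["D*"] * s["W"], "WD01": e["W"] * s["D*"],
        "WD11": e["W*"] * s["D*"] + e["D*"] * s["W*"],
        "WR": e["W"] * s["R"] + e["R"] * s["W"],
        "WR10": e["W*"] * s["R"] + e["R*"] * s["W"],
        "WR01": e["W"] * s["R*"] + e["R"] * s["W*"],
        "WR11": e["W*"] * s["R*"] + e["R*"] * s["W*"],
        "DD": e["D*"] * s["D*"],
        "DR": e_r * s["D*"] + e["D*"] * s_r,
        "RR": e_r * s_r,
    }


def rfp_fraction(G):
    """Fraction of individuals carrying at least one drive allele."""
    return sum(freq for genotype, freq in G.items() if "D" in genotype)


if __name__ == "__main__":
    w = {"10": 0.65, "01": 0.22, "11": 0.22}  # "11" is a hypothetical placeholder
    d_f, d_m, u = 0.9985, 0.9635, 0.4685      # Table 1 estimates, u assumed equal in both sexes
    F0 = {"WW": 1.0}
    M0 = {"WW": 0.5, "WD01": 0.5}
    G = next_generation(egg_proportions(F0, w, d_f, u), sperm_proportions(M0, d_m, u))
    print("transgenic fraction after one generation:", round(rfp_fraction(G), 5))
```

The fourteen products in `next_generation` cover all 25 egg and sperm combinations, so the genotype frequencies sum to one whenever the gamete proportions do.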
[0164] All calculations were carried out using Wolfram Mathematica.sup.23.
[0165] PCR
[0166] The PCR reactions were performed using Phusion High Fidelity Master Mix. Initial denaturation was performed at 98° C. for 30 seconds. Primer annealing was performed at a temperature in the range of 60-72° C. for 30 seconds and elongation was performed at 72° C. for 30 seconds per kb.
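As a worked example of the elongation rule of 30 seconds per kb, the following sketch computes elongation times for the promoter and terminator fragment lengths stated in the specification. The function and dictionary names are illustrative assumptions, and in practice the computed time would normally be rounded up.

```python
# Fragment lengths in bp as stated for the nos, zpg and exu regulatory regions.
FRAGMENTS_BP = {
    "nos promoter": 2092, "nos terminator": 601,
    "zpg promoter": 1074, "zpg terminator": 1034,
    "exu promoter": 849, "exu terminator": 1173,
}


def elongation_seconds(length_bp: int, seconds_per_kb: float = 30.0) -> float:
    """Elongation time for an amplicon at a fixed rate per kilobase."""
    if length_bp <= 0:
        raise ValueError("amplicon length must be positive")
    return length_bp / 1000.0 * seconds_per_kb


if __name__ == "__main__":
    for name, length in FRAGMENTS_BP.items():
        print(f"{name}: {length} bp, about {elongation_seconds(length):.0f} s elongation")
```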
TABLE 2. Primers used in this study
Primer | Sequence | SEQ ID No
dsxgRNA-F | TGCTGTTTAACACAGGTCAAGCGG | 4
dsxgRNA-R | AAACCCGCTTGACCTGTGTTAAAC | 5
dsxΦ31L-F | GCTCGAATTAACCATTGTGGACCGGTCTTGTGTTTAGCAGGCAGGGGA | 6
dsxΦ31L-R | TCCACCTCACCCATGGGACCCACGCGTGGTGCGGGTCACCGAGATGTTC | 7
dsxΦ31R-F | CACCAAGACAGTTAACGTATCCGTTACCTTGACCTGTGTTAAACATAAAT | 8
dsxΦ31R-R | GGTGGTAGTGCCACACAGAGAGCTTCGCGGTGGTCAACGAATACTCACG | 9
zpgprCRISPR-F | GCTCGAATTAACCATTGTGGACCGGTCAGCGCTGGCGGTGGGGA | 10
zpgprCRISPR-R | TCGTGGTCCTTATAGTCCATCTCGAGCTCGATGCTGTATTTGTTGT | 11
zpgteCRISPR-F | AGGCAAAAAAGAAAAAGTAATTAATTAAGAGGACGGCGAGAAGTAATCAT | 12
zpgteCRISPR-R | TTCAAGCGCACGCATACAAAGGCGCGCCTCGCATAATGAACGAACCAAAGG | 13
dsxin3-F | GGCCCTTCAACCCGAAGAAT | 14
dsxex6-R | CTTTTTGTACAGCGGTACAC | 15
GFP-F | GCCCTGAGCAAAGACCCCAA | 16
dsxex4-F | GCACACCAGCGGATCGACGAAG | 17
dsxex5-R | CCCACATACAAAGATACGGACAG | 18
dsxex6-R | GAATTTGGTGTCAAGGTTCAGG | 19
3xP3 | TATACTCCGGCGGTCGAGGGTT | 20
hCas9-F | CCAAGAGAGTGATCCTGGCCGA | 21
dsxex5-R1 | CTTATCGGCATCAGTTGCGCAC | 22
dsxin4-F | GGTGTTATGCCACGTTCACTGA | 23
RFP-R | CAAGTGGGAGCGCGTGATGAAC | 24
TABLE 3. Primers used to amplify the promoters
Primer | Sequence | SEQ ID No
nos-pr-F | GTGAACTTCCATGGAATTACGT | 67
nos-pr-R | CTTGCTTTCTAGAACAAAAGGATC | 68
nos-ter-F | GACAGAGTCGTTCGTTCATT | 69
nos-ter-R | GTAATTAGTGTTCATTTTAG | 70
zpg-pr-F | CAGCGCTGGCGGTGGGGA | 71
zpg-pr-R | CTCGATGCTGTATTTGTTGT | 72
zpg-ter-F | GAGGACGGCGAGAAGTAATCAT | 73
zpg-ter-R | TCGCATAATGAACGAACCAAAGG | 74
exu-pr-F | GGAAGGTGATTGCGATTCCATGT | 75
exu-pr-R | TTTGTACAAGCTACACAAGAGAAGG | 76
exu-ter-F | GCGTGAGCCGGAGAAAGC | 77
exu-ter-R | ACTGCTACTGTGCAACACATC | 78
TABLE 4. Primers used to assemble the vectors and verify the insertions
Primer | Sequence | SEQ ID No
nos-pr-CRISPR-F | GCTCGAATTAACCATTGTGGACCGGTGTGAACTTCCATGGAATTACGT | 79
nos-pr-CRISPR-R | TCGTGGTCCTTATAGTCCATCTCGAGCTTGCTTTCTAGAACAAAAGGATC | 80
nos-ter-CRISPR-F | GCCGGCCAGGCAAAAAAGAAAAAGTAATTAATTAAGACAGAGTCGTTCGTTCATT | 81
nos-ter-CRISPR-r | TCAACCCTTCAAGCGCACGCATACAAAGGCGCGCCGTAATTAGTGTTCATTTTAG | 82
zpg-pr-CRISPR-F | GCTCGAATTAACCATTGTGGACCGGTCAGCGCTGGCGGTGGGGA | 10
zpg-pr-CRISPR-R | TCGTGGTCCTTATAGTCCATCTCGAGCTCGATGCTGTATTTGTTGT | 11
zpg-ter-CRISPR-F | AGGCAAAAAAGAAAAAGTAATTAATTAAGAGGACGGCGAGAAGTAATCAT | 12
zpg-ter-CRISPR-R | TTCAAGCGCACGCATACAAAGGCGCGCCTCGCATAATGAACGAACCAAAGG | 13
exu-pr-CRISPR-F | GCTCGAATTAACCATTGTGGACCGGTGGAAGGTGATTGCGATTCCATGT | 83
exu-pr-CRISPR-R | TCGTGGTCCTTATAGTCCATCTCGAGTTTGTACAAGCTACACAAGAGAAGG | 84
exu-ter-CRISPR-F | AGGCAAAAAAGAAAAAGTAATTAATTAAGCGTGAGCCGGAGAAAGC | 85
exu-ter-CRISPR-r | TTCAAGCGCACGCATACAAAGGCGCGCCACTGCTACTGTGCAACACATC | 86
hCas9-F7 | CGGCGAACTGCAGAAGGGAA | 87
RFP2qF | GTGCTGAAGGGCGAGATCCACA | 88
Seq-7280-F | GCACAAATCCGATCGTGACA | 89
Seq-7280-R | CAGTGGCAGTTCCGTAGAGA | 90
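The relationship between the amplification primers (Table 3) and the assembly primers (Table 4) can be checked with a short script: each assembly primer should end with the corresponding amplification primer, and the remaining 5′ portion is the Gibson assembly overhang. The sketch below does this for the zpg primers and also reports which of the restriction sites used to open plasmid p16501 (AgeI, XhoI, PacI, AscI) appear in each overhang. The function names and dictionary layout are illustrative assumptions; the sequences are copied from the tables, and the recognition sequences are the standard ones for these enzymes.

```python
# zpg primers: amplification primer (Table 3) and assembly primer (Table 4).
PRIMER_PAIRS = {
    "zpg-pr-F": ("CAGCGCTGGCGGTGGGGA",
                 "GCTCGAATTAACCATTGTGGACCGGTCAGCGCTGGCGGTGGGGA"),
    "zpg-pr-R": ("CTCGATGCTGTATTTGTTGT",
                 "TCGTGGTCCTTATAGTCCATCTCGAGCTCGATGCTGTATTTGTTGT"),
    "zpg-ter-F": ("GAGGACGGCGAGAAGTAATCAT",
                  "AGGCAAAAAAGAAAAAGTAATTAATTAAGAGGACGGCGAGAAGTAATCAT"),
    "zpg-ter-R": ("TCGCATAATGAACGAACCAAAGG",
                  "TTCAAGCGCACGCATACAAAGGCGCGCCTCGCATAATGAACGAACCAAAGG"),
}

# Standard recognition sequences of the enzymes named in the text.
SITES = {"AgeI": "ACCGGT", "XhoI": "CTCGAG", "PacI": "TTAATTAA", "AscI": "GGCGCGCC"}


def overhang(base_primer: str, assembly_primer: str) -> str:
    """Return the 5' extension added to a gene-specific primer."""
    if not assembly_primer.endswith(base_primer):
        raise ValueError("assembly primer does not end with the base primer")
    return assembly_primer[: len(assembly_primer) - len(base_primer)]


def sites_in(sequence: str) -> list:
    """Names of the listed restriction sites found in a sequence."""
    return [name for name, site in SITES.items() if site in sequence]


if __name__ == "__main__":
    for name, (base, assembly) in PRIMER_PAIRS.items():
        tail = overhang(base, assembly)
        print(f"{name}: overhang {tail} ({len(tail)} nt), sites: {sites_in(tail)}")
```

The same check can be applied to the nos and exu primers by adding their sequences to `PRIMER_PAIRS`.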
[0167] Results
[0168] To investigate whether dsx represented a suitable target for a gene drive approach aimed at suppressing population reproductive capacity, the inventors disrupted the intron 4-exon 5 boundary of dsx with the aim of preventing the formation of functional AgdsxF while leaving the AgdsxM transcript unaffected. The inventors injected A. gambiae embryos with a source of Cas9 and gRNA designed to selectively cleave the intron 4-exon 5 boundary in combination with a template for homology directed repair (HDR) to insert an eGFP transcription unit.
[0169] The knock-in of the eGFP construct resulted in the complete disruption of the exon 5 (dsxF−) coding sequence and was confirmed by PCR and genomic sequencing of the chromosomal integration.
TABLE 4. Ratio of larvae recovered by intercrossing heterozygous dsx ΦC31-knock-in mosquitoes
GFP strong (dsxF−/−) | GFP weak (dsxF−/+) | no GFP (+/+) | Total
262 (24.9%) | 523 (49.7%) | 268 (25.5%) | 1053
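The larval counts in the table above can be compared with the 1:2:1 ratio expected from a Mendelian intercross of heterozygotes. The specification does not report such a test, so the following is an illustrative check only, with assumed names; the chi-square p-value uses the closed form exp(−χ²/2), which is exact for two degrees of freedom.

```python
import math

OBSERVED = {"GFP strong (-/-)": 262, "GFP weak (-/+)": 523, "no GFP (+/+)": 268}
EXPECTED_RATIO = {"GFP strong (-/-)": 1, "GFP weak (-/+)": 2, "no GFP (+/+)": 1}


def percentages(counts):
    """Percentage of the total for each class, rounded to one decimal place."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


def chi_square_three_classes(counts, ratio):
    """Chi-square statistic and p-value against an expected ratio (3 classes)."""
    if len(counts) != 3:
        raise ValueError("closed-form p-value only valid for three classes")
    total = sum(counts.values())
    ratio_sum = sum(ratio.values())
    chi2 = 0.0
    for name, observed in counts.items():
        expected = total * ratio[name] / ratio_sum
        chi2 += (observed - expected) ** 2 / expected
    return chi2, math.exp(-chi2 / 2)  # survival function for 2 degrees of freedom


if __name__ == "__main__":
    print(percentages(OBSERVED))
    chi2, p = chi_square_three_classes(OBSERVED, EXPECTED_RATIO)
    print(f"chi-square = {chi2:.3f}, p = {p:.3f}")
```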
[0170] Larvae heterozygous for the exon 5 disruption developed into adult male and female mosquitoes with a sex ratio close to 1:1. In contrast, half of the dsxF−/− individuals developed into normal males, whereas the other half showed the presence of both male and female morphological features as well as a number of developmental anomalies in the internal and external reproductive organs (intersex).
[0171] To establish the sex genotype of these dsxF−/− intersex individuals, the inventors introgressed the mutation into a line containing a Y-linked visible marker (RFP) and used the presence of this marker to unambiguously assign sex genotype among individuals heterozygous and homozygous for the null mutation. This approach revealed that the intersex phenotype was observed only in genotypic females that were homozygous for the null mutation. The inventors saw no effect in heterozygous mutants, suggesting that the female-specific isoform of dsx is haplosufficient.
[0172] Examination of external sexually dimorphic structures in dsxF−/− genotypic females showed several phenotypic abnormalities including: the development of dorsally rotated male claspers (and absent female cerci), longer flagellomeres associated with male-like plumose antennae (
[0173] Males carrying the dsxF− null mutation in heterozygosity or homozygosity showed wild type levels of fertility as measured by clutch size and larval hatching per mated female, as did heterozygous dsxF− female mosquitoes. By contrast, intersex XX dsxF−/− female mosquitoes, though attracted to anaesthetised mice, were unable to take a bloodmeal and failed to produce any eggs (
[0174] The surprisingly drastic phenotype of dsxF−/− in females demonstrates a key functional role for exon 5 of dsx in the poorly understood sex differentiation pathway of A. gambiae mosquitoes and suggests that its sequence could represent a suitable target for gene drive approaches aimed at population suppression.
[0175] The inventors employed recombinase-mediated cassette exchange (RMCE) to replace the 3×P3::GFP transcription unit with a dsxFCRISPRh gene drive construct that consists of an RFP marker gene, a transcription unit to express the gRNA targeting dsxF, and the Cas9 gene under the control of the germline promoter of zero population growth (zpg) and its terminator sequence (
[0176] The ability of the dsxFCRISPRh construct to home and bypass Mendelian inheritance was analysed by scoring the rates of RFP inheritance in the progeny of heterozygous parents (referred to as dsxFCRISPRh/+ hereafter) crossed to wild type mosquitoes.
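The scoring described above can be sketched in a few lines of Python. This is a minimal illustration only: the function names and the example counts are hypothetical and do not come from the experiments, and the conversion from transmission rate to homing rate assumes that any excess over the Mendelian 50% is due solely to homing in the germline of the heterozygous parent.

```python
def transmission_rate(rfp_positive: int, total: int) -> float:
    """Fraction of scored progeny carrying the RFP-marked drive allele."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= rfp_positive <= total:
        raise ValueError("rfp_positive must lie between 0 and total")
    return rfp_positive / total


def implied_homing_rate(rate: float) -> float:
    """Fraction of wild-type alleles converted to the drive allele.

    Assumption (not stated in the source): a heterozygous parent passes on
    the drive with probability 0.5 + 0.5 * h, where h is the homing rate,
    so h = 2 * rate - 1.
    """
    return 2.0 * rate - 1.0


# Hypothetical counts, for illustration only.
rate = transmission_rate(rfp_positive=95, total=100)
homing = implied_homing_rate(rate)
print(f"transmission = {rate:.2%}, implied homing = {homing:.2%}")
```

Under this assumption, Mendelian inheritance (50% RFP-positive progeny) corresponds to a homing rate of zero, and 100% transmission corresponds to complete conversion.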
[0177] Surprisingly, high dsxFCRISPRh transmission rates of up to 100% were observed in the progeny of both heterozygous dsxFCRISPRh/+ male and female mosquitoes (
[0178] Surprisingly, the inventors noticed a more severe reduction in the fertility of heterozygous females when the drive allele was inherited from their father (mean fecundity 21.7%+/−8.6%) rather than their mother (64.9%+/−6.9%) (
[0179] To test this hypothesis, caged wild type mosquito populations were mixed with individuals carrying the dsxFCRISPRh allele and subsequently monitored at each generation to assess the spread of the drive and quantify its effect on reproductive output. To mimic a hypothetical release scenario, the inventors started the experiment in two replicate cages putting together 300 wild type female mosquitoes with 150 wt male mosquitoes and 150 dsxFCRISPRh/+ male individuals and allowed them to mate. Eggs produced from the whole cage were counted and 650 eggs were randomly selected to seed the next generations. The larvae that hatched from the eggs were screened for the presence of the RFP marker to score the number of the progeny containing the dsxFCRISPRh allele in each generation. During the first three generations, the inventors observed in both caged populations an increase of the drive allele from 25% up to ˜69% and thereafter they diverged. In cage 2 the drive reached 100% frequency by generation 7; in the following generation no eggs were produced and the population collapsed. In cage 1 the drive allele reached 100% frequency at generation 11 after drifting around 65% for two generations. This cage population also failed to produce eggs in the next generation. Though the two cages showed some apparent differences in the dynamics of spreading both curves fall within the prediction of the model (
[0180] The inventors also monitored mutations at the target site at different generations to detect any nuclease-resistant functional variants. Amplicon sequencing of the target sequence from pooled population samples collected at generations 2, 3, 4 and 5 revealed the presence of several low frequency indels generated at the cleavage site, none of which appeared to encode a functional AgdsxF transcript (
[0181] Heterozygous and homozygous individuals for the dsxF− allele were separated based on the intensity of fluorescence afforded by the GFP transcription unit within the knockout allele. Homozygous mutants were distinguishable and were recovered at the expected Mendelian ratio of 1:2:1, suggesting that the disruption of the female-specific isoform of Agdsx is not lethal at the L1 larval stage.
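As an illustrative check of the 1:2:1 expectation, the larval counts reported for the intercross of heterozygous knock-in mosquitoes (262 GFP strong, 523 GFP weak and 268 without GFP, of 1053 larvae) can be compared with the Mendelian ratio using a chi-square goodness-of-fit test. The sketch below is not part of the original analysis; the function names are hypothetical, and the p-value uses the closed form exp(-x/2), which holds only for two degrees of freedom.

```python
import math


def chi_square_gof(observed, ratios):
    """Return (statistic, expected counts) for a goodness-of-fit test."""
    total = sum(observed)
    ratio_sum = sum(ratios)
    expected = [total * r / ratio_sum for r in ratios]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, expected


def p_value_df2(stat):
    """Upper-tail p-value of a chi-square statistic with 2 degrees of freedom."""
    return math.exp(-stat / 2.0)


# Counts from the larval ratio table: GFP strong, GFP weak, no GFP.
observed = [262, 523, 268]
stat, expected = chi_square_gof(observed, [1, 2, 1])
p = p_value_df2(stat)
print(f"expected = {expected}, chi2 = {stat:.3f}, p = {p:.3f}")
```

A statistic well below the 5% critical value of 5.99 for two degrees of freedom indicates no detectable departure from the 1:2:1 ratio.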
TABLE-US-00010 TABLE 5 Genetic females homozygous for the insertion carry male-specific characteristics
Characteristic | Genetic Males dsxF+/+ | Genetic Males dsxF+/− | Genetic Males dsxF−/− | Genetic Females dsxF+/+ | Genetic Females dsxF+/− | Genetic Females dsxF−/−
Pupal genital lobe | male | male | male | female | female | male
Claspers | ✓ | ✓ | ✓ | x | x | ✓
Cercus | x | x | x | ✓ | ✓ | x
Spermatheca | x | x | x | ✓ | ✓ | x
MAGs | ✓ | ✓ | ✓ | x | x | ✓
Feed on blood | x | x | x | ✓ | ✓ | x
Can lay eggs | x | x | x | ✓ | ✓ | x
Plumose antennae | ✓ | ✓ | ✓ | x | x | ✓
Pilose antennae | x | x | x | ✓ | ✓ | x
[0182] The inventors assume that parental effects on fitness (egg production and hatching rates) for non-drive (W/W, W/R) females with nuclease from one or both parents are the same as observed values for drive heterozygote (W/D) females with parental effects. For combined maternal and paternal effects (nuclease from both parents), the minimum of the observed values for maternal and paternal effect is assumed.
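The rule for combining parental effects can be written as a short function. This is a sketch under stated assumptions: the function name and arguments are hypothetical, fitness is expressed relative to a value of 1.0 for females that received no nuclease from either parent, and the example values of 0.649 (maternal) and 0.217 (paternal) are borrowed purely for illustration from the mean fecundities reported above for heterozygous females.

```python
def parental_effect_fitness(from_mother: bool, from_father: bool,
                            maternal: float, paternal: float) -> float:
    """Relative fitness of a female given which parents supplied nuclease.

    With nuclease from both parents, the smaller of the two observed
    values is used, as in the modelling assumption described above.
    """
    if from_mother and from_father:
        return min(maternal, paternal)
    if from_mother:
        return maternal
    if from_father:
        return paternal
    return 1.0  # assumed baseline: no parental nuclease, no fitness cost


# Illustrative values only (see the lead-in for their origin).
MATERNAL, PATERNAL = 0.649, 0.217
for mother, father in [(False, False), (True, False), (False, True), (True, True)]:
    value = parental_effect_fitness(mother, father, MATERNAL, PATERNAL)
    print(f"mother={mother}, father={father}: relative fitness {value}")
```

Taking the minimum is the more conservative choice: it never assigns a female with nuclease from both parents a higher fitness than either single-parent case.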
TABLE-US-00011 TABLE 6 Summary of values obtained from the cage trials
Cage Trial 1
Generation | Transgenic Rate (%) | Hatching Rate (%) | Egg Output (N) | Repr. Load (%)
G0 | 25 (150/600) | — | 27462 | —
G1 | 49.65 (268/576) | 88.62 (576/650) | 17405 | 36.62
G2 | 62.01 (302/487) | 74.92 (487/650) | 14957 | 45.54
G3 | 68.94 (344/499) | 76.77 (499/650) | 11249 | 59.04
G4 | 67.67 (316/467) | 71.85 (467/650) | 9170 | 66.61
G5 | 58.67 (264/450) | 69.23 (450/650) | 11364 | 58.62
G6 | 63.3 (288/455) | 70 (455/650) | 7727 | 71.86
G7 | 69.47 (355/511) | 78.62 (511/650) | 7785 | 71.65
G8 | 70.07 (323/461) | 70.92 (461/650) | 6293 | 77.08
G9 | 75.58 (325/430) | 66.15 (430/650) | 4107 | 85.04
G10 | 95.71 (357/373) | 57.38 (373/650) | 4146 | 84.90
G11 | 100 (374/253) | 57.54 (374/650) | 2645 | 90.37
G12 | 100 (253/253) | 38.92 (253/650) | 0 | 100
Cage Trial 2
Generation | Transgenic Rate (%) | Hatching Rate (%) | Egg Output (N) | Repr. Load (%)
G0 | 25 (150/600) | — | 26895 | —
G1 | 50 (280/560) | 86.15 (560/650) | 16578 | 38.36
G2 | 61.79 (325/526) | 80.92 (526/650) | 15565 | 42.13
G3 | 68.05 (328/482) | 74.15 (482/650) | 9376 | 65.14
G4 | 85.41 (398/466) | 71.69 (466/650) | 6514 | 75.78
G5 | 86.5 (346/400) | 61.54 (400/650) | 4805 | 81.13
G6 | 90.09 (309/343) | 52.77 (343/650) | 4210 | 84.35
G7 | 100 (363/363) | 55.85 (363/650) | 1668 | 93.8
G8 | 100 (278/278) | 42.77 (278/650) | 0 | 100
G9 | — | — | — | —
[0183] Transgenic rate, hatching rate, egg output and reproductive load at each generation during the cage experiment. The reproductive load indicates the suppression of egg production at each generation compared to the first generation.
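The reproductive load defined above can be reproduced from the egg counts. The sketch below assumes the load is one minus the ratio of egg output at a given generation to egg output at G0, expressed as a percentage; this formula is inferred from the definition and matches the tabulated Cage Trial 1 values for G1 to G3, but the function name is hypothetical.

```python
def reproductive_load(eggs: int, baseline_eggs: int) -> float:
    """Percentage reduction in egg output relative to the baseline generation."""
    if baseline_eggs <= 0:
        raise ValueError("baseline_eggs must be positive")
    return 100.0 * (1.0 - eggs / baseline_eggs)


# Egg output for Cage Trial 1, taken from Table 6.
cage1_eggs = {"G0": 27462, "G1": 17405, "G2": 14957, "G3": 11249}
baseline = cage1_eggs["G0"]
for generation, eggs in cage1_eggs.items():
    if generation != "G0":
        print(f"{generation}: {reproductive_load(eggs, baseline):.2f}%")
```

A load of 100% corresponds to a generation in which no eggs were produced, as recorded for the final generation of each cage.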
[0184] Phenotypic assays were performed to measure simultaneously the fertility and transmission rates for each of three drives (
[0185] Maternally or paternally deposited Cas9 can cause resistant mutations in the embryo that may reduce the rate of homing in the next generation (Hammond & Kyrou et al. 2017). To test this effect, the inventors separated male and female drive heterozygotes by whether they had inherited the drive from their mother or father and scored inheritance of the drive in their progeny (
[0186] Fertility assays were performed to measure the larval output in individual crosses of drive heterozygotes to wild-type (
[0187] Large differences between wild-type controls support this hypothesis. As such, the values above are used only as a rough estimate of fertility that serves to demonstrate the dramatic improvement over vas2.
[0188] To test the potential for zpg-CRISPR.sup.h to spread throughout naïve populations of malaria mosquitoes, two replicate cages were initiated at each of two starting frequencies of drive heterozygotes, 10% and 50%, and monitored for 16 generations. Remarkably, the drive spread to more than 97% of the population in all four replicates (
[0189] Resistant mutations arise when there is a change to the target site sequence that prevents further recognition or cleavage by the nuclease, but also encodes a gene product that can rescue against the sterile knock-out phenotype. Though these may be pre-existing in a population, they are overwhelmingly produced by the gene drive itself from error-prone non-homologous end-joining (NHEJ) or microhomology-mediated end-joining (MMEJ) in the small fraction of cleaved chromosomes that are not repaired by homing in the germline, or in the embryo following cleavage by maternally- or paternally-deposited nuclease (Hammond & Kyrou et al. 2017).
[0190] To investigate the nature and frequency of resistance in the zpg-CRISPRh release cages, the inventors performed amplicon sequencing across the target locus in samples of pooled individuals collected before, during and after the emergence of resistance at generations 0, 2, 5 and 8 (
CONCLUSIONS
[0191] The regulatory sequences of zpg, nos and exu described herein offer a clear advantage over and above the current best system (i.e. the vasa2 promoter) used for germline nuclease expression in gene drives designed for the malaria mosquito, showing:
[0192] 1) surprisingly high rates of biased transmission into the offspring of both male and female mosquitoes;
[0193] 2) substantially reduced fitness cost;
[0194] 3) reduced end-joining mutations that are the major cause of resistance to gene drive; and
[0195] 4) vastly improved spread in caged experiments in terms of speed, persistence and maximum frequency of the drive.
[0196] Gene drives based upon these promoter sequences are far superior to all previously tested gene drives and could be used for both population replacement and population suppression strategies. The improvements in gene drive efficacy can be attributed to vast improvements in spatio-temporal regulation of Cas9 nuclease expression that is brought about by the use of these novel regulatory sequences, specifically an improvement in restriction to the germline.
[0197] To illustrate the magnitude of improvement, the inventors observed a relative fitness in females of more than 80% compared to only 7% using the vasa2 promoter, as shown in
[0198] The inventors have demonstrated that gene drives built using these promoters require no further improvement to invade entire mosquito populations and meet the requirements for a gene drive system aimed at population replacement. The regulatory sequences described herein may be used for a range of technologies currently under development, including improvements to mosquito transformation, driving endonuclease genes, and other gene drive technologies that rely upon expression in the mosquito germline.
REFERENCES
[0199] 1. Gantz, V. M. et al. Highly efficient Cas9-mediated gene drive for population modification of the malaria vector mosquito Anopheles stephensi. Proc Natl Acad Sci USA 112, E6736-6743 (2015).
[0200] 2. Hammond, A. et al. A CRISPR-Cas9 gene drive system targeting female reproduction in the malaria mosquito vector Anopheles gambiae. Nat Biotechnol 34, 78-83 (2016).
[0201] 3. Burt, A. Site-specific selfish genes as tools for the control and genetic engineering of natural populations. Proc Biol Sci 270, 921-928 (2003).
[0202] 4. Deredec, A., Godfray, H. C. & Burt, A. Requirements for effective malaria control with homing endonuclease genes. Proc Natl Acad Sci USA 108, E874-880 (2011).
[0203] 5. Hamilton, W. D. Extraordinary sex ratios. A sex-ratio theory for sex linkage and inbreeding has new implications in cytogenetics and entomology. Science 156, 477-488 (1967).
[0204] 6. Galizi, R. et al. A synthetic sex ratio distortion system for the control of the human malaria mosquito. Nat Commun 5, 3977 (2014).
[0205] 7. Magnusson, K. et al. Demasculinization of the Anopheles gambiae X chromosome. BMC Evol Biol 12, 69 (2012).
[0206] 8. Champer, J. et al. Novel CRISPR/Cas9 gene drive constructs reveal insights into mechanisms of resistance allele formation and drive efficiency in genetically diverse populations. PLoS Genet 13, e1006796 (2017).
[0207] 9. Hammond, A. M. et al. The creation and selection of mutations resistant to a gene drive over multiple generations in the malaria mosquito. PLoS Genet 13, e1007039 (2017).
[0208] 10. Marshall, J. M., Buchman, A., Sanchez, C. H. & Akbari, O. S. Overcoming evolved resistance to population-suppressing homing-based gene drives. Sci Rep 7, 3776 (2017).
[0209] 11. Unckless, R. L., Clark, A. G. & Messer, P. W. Evolution of Resistance Against CRISPR/Cas9 Gene Drive. Genetics 205, 827-841 (2017).
[0210] 12. Burtis, K. C. & Baker, B. S. Drosophila doublesex gene controls somatic sexual differentiation by producing alternatively spliced mRNAs encoding related sex-specific polypeptides. Cell 56, 997-1010 (1989).
[0211] 13. Graham, P., Penn, J. K. & Schedl, P. Masters change, slaves remain. Bioessays 25, 1-4 (2003).
[0212] 14. Krzywinska, E., Dennison, N. J., Lycett, G. J. & Krzywinski, J. A maleness gene in the malaria mosquito Anopheles gambiae. Science 353, 67-69 (2016).
[0213] 15. Scali, C., Catteruccia, F., Li, Q. & Crisanti, A. Identification of sex-specific transcripts of the Anopheles gambiae doublesex gene. J Exp Biol 208, 3701-3709 (2005).
[0214] 16. Neafsey, D. E. et al. Mosquito genomics. Highly evolvable malaria vectors: the genomes of 16 Anopheles mosquitoes. Science 347, 1258522 (2015).
[0215] 17. Anopheles gambiae 1000 Genomes Consortium. Genetic diversity of the African malaria vector Anopheles gambiae. Nature 552, 96-100 (2017).
[0216] 18. Murray, S. M., Yang, S. Y. & Van Doren, M. Germ cell sex determination: a collaboration between soma and germline. Curr Opin Cell Biol 22, 722-729 (2010).
[0217] 19. Curtis, C. F. Possible use of translocations to fix desirable genes in insect pest populations. Nature 218, 368-369 (1968).
[0218] 20. National Academies of Sciences, Engineering, and Medicine. Gene Drives on the Horizon: Advancing Science, Navigating Uncertainty, and Aligning Research with Public Values. (The National Academies Press, Washington, D.C.; 2016).
[0219] 21. Papathanos, P. A., Windbichler, N., Menichelli, M., Burt, A. and Crisanti, A. The vasa regulatory region mediates germline expression and maternal transmission of proteins in the malaria mosquito Anopheles gambiae: a versatile tool for genetic control strategies. BMC Mol Biol 10, 65, (2009).
[0220] 22. Hammond, A. M. et al. The creation and selection of mutations resistant to a gene drive over multiple generations in the malaria mosquito. PLoS Genet 13, e1007039 (2017).
[0221] 23. Wolfram Research, Inc., 2017 Mathematica 11.2, Champaign, Ill.