REGULATORY NUCLEIC ACID MOLECULES FOR ENHANCING GENE EXPRESSION IN PLANTS
20230148071 · 2023-05-11
Inventors
Cpc classification
C12N15/1065
CHEMISTRY; METALLURGY
C12N15/8213
CHEMISTRY; METALLURGY
C12N15/1072
CHEMISTRY; METALLURGY
International classification
Abstract
The present invention is in the field of plant molecular biology and provides methods for production of high expressing promoters and the production of plants with enhanced expression of nucleic acids wherein nucleic acid expression enhancing nucleic acids (NEENAs) are functionally linked to said promoters and/or introduced into plants.
Claims
1. A method for enhancing expression derived from a plant promoter comprising functionally linking to a promoter one or more nucleic acid expression enhancing nucleic acid (NEENA) molecule heterologous to said promoter comprising i) a nucleic acid molecule having the sequence of any one of SEQ ID NOs: 16 to 21, or ii) a nucleic acid molecule having a sequence with an identity of at least 90% to any one of SEQ ID NOs: 16 to 21, which has expression enhancing activity as the corresponding nucleic acid molecule having the sequence of any one of SEQ ID NOs: 16 to 21 or iii) a nucleic acid molecule hybridizing under stringent conditions to a nucleic acid molecule having a sequence of any one of SEQ ID NOs: 16 to 21, or iv) a fragment of 30 or more consecutive bases of a nucleic acid molecule of i) to iii) which has expression enhancing activity as the corresponding nucleic acid molecule having the sequence of any one of SEQ ID NOs: 16 to 21, or v) a nucleic acid molecule which is the complement or reverse complement of any of the previously mentioned nucleic acid molecules under i) to iv).
2. A method for producing a plant or part thereof with, compared to a respective control plant or part thereof, enhanced expression of one or more nucleic acid molecule comprising the steps of a) introducing into the plant or part thereof one or more NEENA molecule comprising a nucleic acid molecule as defined in claim 1 i) to v) and b) functionally linking said one or more NEENA molecule to a promoter and to a nucleic acid molecule being under the control of said promoter, wherein the NEENA molecule is heterologous to said promoter.
3. The method of claim 1 comprising the steps of a) introducing the one or more NEENA molecule comprising a nucleic acid molecule as defined in claim 1 i) to v) into a plant or part thereof and b) integrating said one or more NEENA molecule into the genome of said plant or part thereof whereby said one or more NEENA molecule is functionally linked to an endogenous promoter heterologous to said one or more NEENA molecule and optionally c) regenerating a plant or part thereof comprising said one or more NEENA molecule from said transformed cell.
4. The method of claim 3 wherein the one or more NEENA molecule is integrated into the genome of a plant or part thereof by applying genome editing technologies.
5. The method of claim 4 wherein the genome editing technology comprises the introduction of single or double strand breaks at the position where the one or more NEENA molecule is to be integrated into the genome using nucleic acid guided nucleases, TALEN, homing endonucleases or zinc finger proteins and the introduction of a DNA repair template comprising the NEENA molecule and at its 3′ and 5′ end sequences essentially identical or complementary to the sequences upstream and downstream of the single or double strand break facilitating recombination at the position of the single or double strand break.
6. The method of claim 4 wherein the genome editing technology comprises introduction of point mutations in the genome of the plant or part thereof thereby introducing the sequence of the NEENA in the plant genome.
7. The method of claim 1 comprising the steps of a) providing an expression construct comprising one or more NEENA molecule comprising a nucleic acid molecule as defined in claim 1 i) to v) functionally linked to a promoter heterologous to said one or more NEENA molecule and b) integrating said expression construct comprising said one or more NEENA molecule into the genome of said plant or part thereof and optionally c) regenerating a plant or part thereof comprising said one or more expression construct from said transformed plant or part thereof.
8. The method of claim 1 wherein said one or more NEENA molecule is functionally linked to a promoter upstream or downstream of the translational start site of the nucleic acid molecule the expression of which is under the control of said promoter.
9. The method of claim 1 wherein said one or more NEENA molecule is functionally linked to a promoter within the 5′UTR of the nucleic acid molecule the expression of which is under the control of said promoter.
10. The method of claim 1 wherein said one or more NEENA molecule is functionally linked to a tissue specific, developmental specific or inducible promoter within the 5′UTR of the nucleic acid molecule the expression of which is under the control of said promoter.
11. A recombinant expression construct comprising a NEENA molecule selected from the group of i) the nucleic acid molecule having a sequence of any one of SEQ ID NOs: 16 to 21, and ii) a nucleic acid molecule having a sequence with an identity of at least 90% to any one of SEQ ID NOs: 16 to 21, which has expression enhancing activity as the corresponding nucleic acid molecule having the sequence of any one of SEQ ID NOs: 16 to 21 and iii) a nucleic acid molecule hybridizing under stringent conditions to a nucleic acid molecule having a sequence of any one of SEQ ID NOs: 16 to 21, and iv) a fragment of 30 or more consecutive bases of a nucleic acid molecule of i) to iii) which has expression enhancing activity as the corresponding nucleic acid molecule having the sequence of any one of SEQ ID NOs: 16 to 21, and v) a nucleic acid molecule which is the complement or reverse complement of any of the previously mentioned nucleic acid molecules under i) to iv), functionally linked to one or more promoter and one or more expressed nucleic acid molecule wherein the promoter is heterologous to said one or more NEENA molecule.
12. A recombinant expression vector comprising one or more recombinant expression construct of claim 11.
13. A transgenic cell or transgenic plant or part thereof comprising a recombinant expression construct of claim 11.
14. The transgenic cell, transgenic plant or part thereof of claim 13, selected or derived from the group consisting of bacteria, fungi, yeasts or plants.
15. A transgenic cell culture, transgenic seed, parts or propagation material derived from a transgenic cell or plant or part thereof of claim 14 comprising the recombinant expression construct.
16. (canceled)
17. (canceled)
Description
EXAMPLES
[0156] Chemicals and Common Methods
[0157] Unless indicated otherwise, cloning procedures carried out for the purposes of the present invention, including restriction digest, agarose gel electrophoresis, purification of nucleic acids, ligation of nucleic acids, transformation, selection and cultivation of bacterial cells, were performed as described (Sambrook et al., 1989). Sequence analyses of recombinant DNA were performed with a laser fluorescence DNA sequencer (Applied Biosystems, Foster City, Calif., USA) using the Sanger technology (Sanger et al., 1977). Unless described otherwise, chemicals and reagents were obtained from Sigma Aldrich (St. Louis, USA), Promega (Madison, Wis., USA), Duchefa (Haarlem, The Netherlands) or Invitrogen (Carlsbad, Calif., USA). Restriction endonucleases were from New England Biolabs (Ipswich, Mass., USA) or Roche Diagnostics GmbH (Penzberg, Germany). Oligonucleotides were synthesized by Eurofins MWG Operon (Ebersberg, Germany).
Example 1: Discovery of New Candidate Enhancer Sequences from the Wheat Genome
[0158] To identify novel wheat enhancer sequences, 500 genes that either are highly expressed (average CPM above 500) or have a medium expression level (average CPM between 100 and 500) with a low gene expression variability between wheat tissues (low coefficient of variation) were selected from the wheat genome (IWGSC version 1.0 2017). Gene expression levels were normalized using the TMM method (edgeR).
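The selection criteria above can be sketched as a small classifier over per-tissue CPM values. This is only an illustration: the function name `classify_gene` and the cutoff `max_cv` are hypothetical, since the text does not state what counts as a "low" coefficient of variation, and the sketch assumes that the low-variability requirement applies to the medium-expression group.

```python
from statistics import mean, pstdev


def classify_gene(cpm_by_tissue, max_cv=0.5):
    """Classify one gene from its TMM-normalized CPM values across tissues.

    Returns "high" (average CPM above 500), "medium-stable" (average CPM
    between 100 and 500 with a coefficient of variation <= max_cv), or None.
    max_cv is a hypothetical cutoff; the source gives no numeric value.
    """
    avg = mean(cpm_by_tissue)
    # Coefficient of variation = standard deviation / mean.
    cv = pstdev(cpm_by_tissue) / avg if avg > 0 else float("inf")
    if avg > 500:
        return "high"
    if 100 <= avg <= 500 and cv <= max_cv:
        return "medium-stable"
    return None
```

For example, `classify_gene([200, 210, 190])` falls into the medium, low-variability group, whereas a gene with CPM values of 10, 400 and 500 is rejected because its expression varies too much between tissues.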
[0159] From these 500 genes, sequences of the putative promoter, the first intron and the 3′UTR, if available in the genome sequence, were extracted based on the genome annotation and an improved annotation pipeline. For the introns the 5′ 10 nt and the 3′ 20 nt were excluded. Only those sequences that were at least 144 nt long were retained. In total, 1392 sequence features were retained.
[0160] These 1392 sequences were digested in silico at the NheI, XbaI, KpnI, PvuI and SfiI sites. The resulting promoter and intron sequences were split into 144-nt long fragments with a 20-nt overlap, except for the 2 most 3′ fragments, for which the overlap is such that the 3′ end of the last 144-nt fragment coincides with the 3′ end of the original sequence. The 3′UTR sequences were split in the same way into overlapping 139-nt long fragments, and CTAGC was added to the 5′ end of each 3′UTR fragment, resulting again in 144-nt long sequence fragments. This whole process resulted in 9919 sequences that are 144 nt in length. Ten control sequences (SEQ ID NO 1 to 10), whose impact on expression in wheat protoplasts was known from previous experiments, were added to the list. The P35S and the ALMT1 3′ sequences increase expression of a 35S minimal promoter >10- and 6-fold, respectively, whereas the lambda insulator and ALMT1 5′ sequences do not increase expression of this minimal promoter. The resulting 9929 sequences were cloned into an MPRA library (Melnikov et al. 2014, J. Vis. Exp. (90), e51719) to screen for sequences with promoter enhancing activity. Each sequence was linked to 5 different unique 11-nt long bar codes. The bar codes do not contain AATAAT, AATAAA, ATTTA, TTTTT or restriction sites for NheI, XbaI, KpnI, PvuI and SfiI, differ from each other by at least 2 nt, have no base repeats longer than 2 nt, and do not start with TC. 200-nt long oligos were synthesized containing each of the 49645 query sequence/bar code combinations (9929 sequences × 5 bar codes) and the sequences required for amplification and cloning of the library into the expression vectors (
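The tiling and bar code rules described above can be sketched as follows. This is an illustrative reading, not the procedure actually used: the function names are hypothetical, "no base repeats longer than 2 nt" is interpreted as "no run of 3 or more identical bases", and the 3′UTR fragments are assumed to use the same 20-nt overlap. The SfiI site (GGCCNNNNNGGCC) is 13 nt long and cannot fit inside an 11-nt bar code, so it is not checked here.

```python
import re

# Motifs excluded from bar codes, plus the NheI, XbaI, KpnI and PvuI sites.
FORBIDDEN = ("AATAAT", "AATAAA", "ATTTA", "TTTTT",
             "GCTAGC", "TCTAGA", "GGTACC", "CGATCG")


def tile(seq, frag_len=144, overlap=20):
    """Split seq into frag_len-nt fragments overlapping by `overlap` nt.

    The last fragment is anchored so that its 3' end coincides with the
    3' end of seq. Sequences shorter than frag_len yield no fragments.
    """
    if len(seq) < frag_len:
        return []
    step = frag_len - overlap
    last = len(seq) - frag_len
    starts = list(range(0, last + 1, step))
    if starts[-1] != last:
        starts.append(last)
    return [seq[s:s + frag_len] for s in starts]


def tile_utr(seq, overlap=20):
    """Tile a 3'UTR into 139-nt fragments and prepend CTAGC (144 nt total)."""
    return ["CTAGC" + frag for frag in tile(seq, 139, overlap)]


def barcode_ok(bc):
    """Check the per-bar-code rules for a single 11-nt bar code."""
    if len(bc) != 11 or bc.startswith("TC"):
        return False
    if any(motif in bc for motif in FORBIDDEN):
        return False
    # Reject runs of three or more identical bases.
    return re.search(r"(.)\1\1", bc) is None


def hamming(a, b):
    """Number of mismatching positions between two equal-length bar codes."""
    return sum(x != y for x, y in zip(a, b))
```

Under this reading, a 164-nt sequence yields exactly two fragments starting at positions 0 and 20, which matches the 20-nt shift reported for EN2161 and EN2162. A candidate bar code set would additionally require `hamming(a, b) >= 2` for every pair.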
[0161] The resulting plasmid library was transfected into wheat mesophyll protoplasts (4 transfections of 80 μg of plasmid DNA in one million protoplasts). Transfected protoplasts were incubated for 6 hrs and collected for RNA isolation. Total RNA was isolated using a Sigma plant RNA isolation kit. The isolated RNA was eluted in a total volume of 130 μl and had a concentration of 0.54 μg/μl. A DNase treatment was performed on this RNA using the Turbo DNase kit from Ambion applying 2 μl of DNase for 30 min at 37° C.
[0162] Following an RNA denaturation step (5 min at 65° C.), cDNA was synthesized with the Superscript III First Strand Synthesis kit from Thermofisher using oligo dT and 40 μl (=18 μg) of total RNA in a final volume of 100 μl. cDNA synthesis was performed for 50 min at 50° C. followed by 5 min at 80° C. Following cDNA synthesis, RNA was removed using RNase H for 20 min at 37° C.
[0163] In a next step, the bar code containing regions of the cDNA (15 μl RT reaction) and of the plasmid library DNA (1 ng) were amplified by PCR using primers MPRA_SfiI (SEQ ID No 13) and MPRA_R3 (SEQ ID No 14) and InFusion DNA polymerase in HF buffer (final volume of 60 μl).
[0164] PCR conditions:
[0165] 95° C. 2 min
[0166] 25 cycles of 98° C. 30 seconds, 55° C. 30 seconds, 72° C. 30 seconds
[0167] 72° C. 2 min
[0168] The PCR reactions were cleaned up using a 0.8× Agencourt AMPure bead cleanup and eluted in 30 μl water. Appropriate amounts of PCR product were loaded on the MiSeq for 26-bp single read sequencing. For each sample, more than 30 million reads were obtained. From these data, the frequency of each bar code within the RNA of the transfected protoplasts as well as within the transfected plasmid DNA library was deduced. The ratio of the bar code abundance in the RNA versus the abundance in the plasmid library DNA is a measure of the expression enhancing activity of the test sequence that is linked to the specific bar code. As each test sequence is linked to 5 different bar codes, each test sequence has 5 RNA/DNA ratios. The median value was used as a measure of the enhancer activity of the tested sequence. A paired t-test was used to test the significance (p<0.05) of the expression increase of specific sequences.
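The RNA/DNA ratio described above can be sketched as below. The function name `median_rna_dna_ratio` and the normalization of each count by the total read count of its sample are assumptions; the text only says that bar code frequencies were deduced from the reads. The paired t-test is not reproduced here.

```python
from statistics import median


def median_rna_dna_ratio(rna_counts, dna_counts, rna_total, dna_total):
    """Median RNA/DNA frequency ratio over the bar codes of one test sequence.

    rna_counts / dna_counts map bar code -> read count; rna_total / dna_total
    are the total read counts of the RNA and plasmid DNA samples (assumed
    normalization). Bar codes absent from the DNA sample are skipped.
    Returns None if no bar code has DNA reads.
    """
    ratios = []
    for barcode, dna in dna_counts.items():
        if dna == 0:
            continue
        rna = rna_counts.get(barcode, 0)
        # (rna / rna_total) / (dna / dna_total), rearranged to a single division.
        ratios.append((rna * dna_total) / (dna * rna_total))
    return median(ratios) if ratios else None
```

Using the median rather than the mean keeps one aberrant bar code, for instance one with very few plasmid reads, from dominating the activity estimate of a test sequence.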
TABLE 1. Comparison of known enhancer effect of control sequences with observed RNA/DNA ratios in the MPRA expression library.
Control sequence | Known enhancer activity | Observed RNA/DNA ratio
P35S −208 to −65 | >10-fold | 20.05
ALMT1 3′ | 6-fold | 5.31
ALMT1 5′ | none | 0.39
Lambda insulator | none | 0.106
[0169] Results from the control sequences (see Table 1) showed that the 35S enhancer and the ALMT1 3′ sequence had high RNA/DNA ratios, around 20 and 5 respectively, whereas the nonfunctional ALMT1 5′ and Lambda insulator sequences had RNA/DNA ratios that are well below 1. This showed that the RNA/DNA ratios are consistent with the known enhancer activity of these control sequences.
[0170] Table 2 shows 6 query sequences that had an RNA/DNA ratio that is above that of the wheat ALMT1 3′ enhancer and do not contain sequences that are present in high copy number in the wheat genome. These sequences include 2 highly overlapping fragments (EN2161 and EN2162) from the promoter of a low molecular weight glutenin subunit gene, that are shifted by only 20 nt and thus overlap by 124 nt.
Example 2: Validation of Enhancer Sequences in Wheat Protoplasts
[0171] The first 4 sequences listed in Table 2, plus the combined EN2161+EN2162 sequence (164 nt fragment), were cloned upstream of the minimal 35S promoter and the gus coding sequence in plasmid pBay01697 for validation in wheat protoplasts. The resulting plasmids were introduced into wheat mesophyll protoplasts, and after an overnight incubation of the protoplasts, protein was extracted and GUS activities were determined. To correct for differences in introduction efficiency, GUS activities of transfected wheat protoplasts were divided by the luciferase activities from a co-introduced control vector having the firefly luciferase gene under control of the maize ubiquitin promoter (pKA63, SEQ ID NO 15). Wheat protoplast preparation and PEG transfection of wheat protoplasts were performed according to Shang et al. (2014, Nature Protocols 9(10), 2395-2410).
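The luciferase correction described above amounts to a simple ratio, sketched here. The function names are hypothetical, and expressing the result as a fold change over a construct without enhancer is an assumption added for illustration; the text only specifies dividing GUS activity by luciferase activity.

```python
def normalized_gus(gus_activity, luc_activity):
    """GUS activity corrected for transfection efficiency (GUS / luciferase)."""
    if luc_activity <= 0:
        raise ValueError("luciferase activity must be positive")
    return gus_activity / luc_activity


def fold_over_control(sample, control):
    """Fold change of a construct over a control construct.

    sample and control are (gus_activity, luc_activity) pairs measured in
    the same experiment; the control is assumed to lack the enhancer.
    """
    return normalized_gus(*sample) / normalized_gus(*control)
```

For instance, a sample reading of (800.0, 100.0) against a control reading of (50.0, 50.0) gives normalized values of 8.0 and 1.0, i.e. an 8-fold increase.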
[0172] The resulting data show that all the candidate enhancers effectively increased expression from a minimal 35S promoter (
Example 3: Impact of the Wheat Enhancers on Wheat Promoter Activity
[0173] The same set of enhancers was tested with a 1-kb promoter fragment of the 7A trehalose-6-phosphate phosphatase (T6PP) gene (WO/2018/113702, SEQ ID NO. 22). The enhancer fragments were inserted 200 nt upstream of the translation start codon, which is the location in the promoter at which the highest expression increase was obtained when inserting the ALMT1B enhancer (EP 19173869.9). Transient expression of the resulting plasmids in wheat protoplasts showed that all the enhancer fragments increase expression from the wheat T6PP promoter between 6- and 10-fold, which is clearly higher than the increase obtained with the ALMT1B enhancer (
Example 4: Impact of the Orientation of the Wheat Enhancers on Wheat Promoter Activity
[0174] Three of the enhancers (EN3233, SEQ ID NO: 19; EN2516, SEQ ID NO: 18; and EN5128, SEQ ID NO: 16) were inserted in the opposite orientation in the promoter of the wheat T6PP gene. The enhancers remained functional in this orientation, i.e. they increased expression from the wheat promoter (
Example 5: Impact of Duplication of the Wheat Enhancers on Wheat Promoter Activity
[0175] Three of the enhancers (EN3233, SEQ ID NO: 19; EN5128, SEQ ID NO: 16; and EN2516, SEQ ID NO: 18) were each inserted in duplicate, i.e. as two copies of the enhancer at one location, in the promoter of the wheat T6PP gene. Duplication increased their activity significantly compared with a single copy of the enhancer inserted in the same promoter (
Example 6: MPRA Experiment to Map Functional Elements within the Enhancer Sequences
[0176] An MPRA library was synthesized that contains the selected enhancer sequences (EN3233 and EN5128) and each single-nt mutant thereof, together with 2 positive (35S enhancer and ALMT1 3′) and 2 negative (ALMT1 5′ and lambda insulator fragment) control sequences. Each sequence is linked to 19 different barcodes. These sequences were cloned in a plasmid library with the enhancer sequences upstream of the minimal 35S promoter and the bar codes downstream of a gus gene that is under the control of the minimal 35S promoter with linked enhancer. The plasmid library was transfected into wheat protoplasts and the frequency of the bar codes in the expressed RNA was compared to that in the plasmid library as a measure for the activity of the linked enhancer sequence. Based on these results, sequence motifs were selected from EN3233 and EN5128.
[0177] Motifs and mutations therein affecting EN3233 enhancer activity: Two motifs have been identified as comprising the most important positions for enhancer activity:
[0178] First motif: CAGGTTCAACGAACGC (SEQ ID NO: 23), nucleotides corresponding to the nucleotides at position 79 to 94 of SEQ ID NO: 19
[0179] Second motif: GTCCACCAGCGCCAGCCGCCT (SEQ ID NO: 24), nucleotides corresponding to the nucleotides at position 106 to 126 of SEQ ID NO: 19.
[0180] Therefore a fragment of SEQ ID NO: 19 is functional when it comprises the first motif of nucleotides corresponding to the nucleotides at position 79 to 94 of SEQ ID NO: 19, the second motif of nucleotides corresponding to the nucleotides at position 106 to 126 of SEQ ID NO: 19 or both motifs.
[0181] Some mutations had a negative effect on the enhancer activity. In the first motif the enhancer activity was reduced compared to the one of SEQ ID NO: 19 by replacing the nucleotide at position 79, 88 or 92 with an A or G nucleotide, replacing the nucleotide at position 80, 85, 87 or 91 with a T or G nucleotide, replacing the nucleotide at position 81, 83 or 84 with a C or A nucleotide, replacing the nucleotide at position 82 with a T or C nucleotide, replacing the nucleotide at position 86 or 90 with a G or C nucleotide, replacing the nucleotide at position 89 with a T or A nucleotide, replacing the nucleotide at position 93 with a T nucleotide, or replacing the nucleotide at position 94 with a G nucleotide. In the second motif the enhancer activity was reduced compared to the one of SEQ ID NO: 19 by replacing the nucleotide at position 106, 113 or 114 with a T or C nucleotide, replacing the nucleotide at position 107, 110 or 119 with a C or G nucleotide, replacing the nucleotide at position 108 with a G nucleotide, replacing the nucleotide at position 109, 115 or 116 with a T or A nucleotide, replacing the nucleotide at position 111, 121 or 125 with an A or G nucleotide, replacing the nucleotide at position 112, 117, 118, 122 or 124 with a G or T nucleotide, replacing the nucleotide at position 120 with any other nucleotide (A, T or C), replacing the nucleotide at position 123 with a T nucleotide, or replacing the nucleotide at position 126 with a C nucleotide.
[0182] Some mutations however demonstrated a positive effect on the enhancer activity. In the first motif the enhancer activity was increased compared to the one of SEQ ID NO: 19 by replacing the nucleotide at position 94 with a T nucleotide. In the second motif the enhancer activity was increased compared to the one of SEQ ID NO: 19 by replacing the nucleotide at position 106 with an A nucleotide, replacing the nucleotide at position 109 with a G nucleotide, or replacing the nucleotide at position 114 with an A nucleotide.
[0183] Motifs and mutations therein affecting EN5128 enhancer activity:
[0184] One weak motif has been identified as comprising the most important positions for enhancer activity: ATTGG, nucleotides corresponding to the nucleotides at position 135 to 139 of SEQ ID NO: 16.
[0185] Therefore a fragment of SEQ ID NO: 16 is functional when it comprises this motif of nucleotides corresponding to the nucleotides at position 135 to 139 of SEQ ID NO: 16.
[0186] Some mutations had a negative effect on the enhancer activity. The enhancer activity was reduced compared to the one of SEQ ID NO: 16 by replacing the nucleotide at position 138 with a T nucleotide.
[0187] Some mutations had a positive effect on the enhancer activity. The enhancer activity was increased compared to the one of SEQ ID NO: 16 by replacing the nucleotide at position 138 with a C nucleotide.
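The single-nucleotide effects listed above for EN3233 (SEQ ID NO: 19) and EN5128 (SEQ ID NO: 16) can be collected in a lookup table, sketched below. The positions, bases and motif sequences are transcribed from the text; the data layout and the function names `reference_base` and `mutation_effect` are hypothetical, and substitutions not mentioned in the text are reported as "not reported" rather than assumed neutral.

```python
# Wild-type motifs with their 1-based start positions, as given in the text.
MOTIFS = {
    "EN3233": [(79, "CAGGTTCAACGAACGC"), (106, "GTCCACCAGCGCCAGCCGCCT")],
    "EN5128": [(135, "ATTGG")],
}


def _expand(groups):
    """Turn [((pos, ...), "bases"), ...] into {pos: set_of_bases}."""
    return {pos: set(bases) for positions, bases in groups for pos in positions}


# Substitutions reported to reduce enhancer activity.
REDUCED = {
    "EN3233": _expand([
        ((79, 88, 92), "AG"), ((80, 85, 87, 91), "TG"), ((81, 83, 84), "CA"),
        ((82,), "TC"), ((86, 90), "GC"), ((89,), "TA"), ((93,), "T"),
        ((94,), "G"),
        ((106, 113, 114), "TC"), ((107, 110, 119), "CG"), ((108,), "G"),
        ((109, 115, 116), "TA"), ((111, 121, 125), "AG"),
        ((112, 117, 118, 122, 124), "GT"), ((120,), "ATC"), ((123,), "T"),
        ((126,), "C"),
    ]),
    "EN5128": _expand([((138,), "T")]),
}

# Substitutions reported to increase enhancer activity.
INCREASED = {
    "EN3233": {94: {"T"}, 106: {"A"}, 109: {"G"}, 114: {"A"}},
    "EN5128": {138: {"C"}},
}


def reference_base(enhancer, pos):
    """Wild-type base at a 1-based position inside a motif, else None."""
    for start, motif in MOTIFS[enhancer]:
        if start <= pos < start + len(motif):
            return motif[pos - start]
    return None


def mutation_effect(enhancer, pos, base):
    """Return "increased", "reduced" or "not reported" for a substitution."""
    if base in INCREASED[enhancer].get(pos, ()):
        return "increased"
    if base in REDUCED[enhancer].get(pos, ()):
        return "reduced"
    return "not reported"
```

As a usage example, `mutation_effect("EN3233", 94, "T")` reports an increase while `mutation_effect("EN3233", 94, "G")` reports a reduction, reflecting that different substitutions at the same position can have opposite effects.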
TABLE 2. List of MPRA query sequences that were validated for enhancer activity by transfection in wheat protoplasts. If elements originate from the same gene this is marked with a grey background. Sequence names include 3 fields, separated by _: field 1, prom: promoter; field 2, IWGSC genome annotation; field 3, the chromosome and coordinates of the sequence.
Gene ID | Gene annotation | Type | ratio | P value | Number | Seq ID | Sequence coordinates
CS6A01G101800 | calnexin homolog | promoter | 8.80 | 0.01574 | EN5128 | 16 | prom_TraesCS6A01G101800_chr6A:70569513-70569656
CS1B01G131300 | dihydrolipoyl dehydrogenase 1, mitochondrial-like | promoter | 8.41 | 0.02180 | EN3638 | 17 | prom_TraesCS1B01G131300_chr1B:163100404-163100547
CS2B01G386900 | remorin-like | promoter | 8.17 | 0.00603 | EN2516 | 18 | prom_TraesCS2B01G386900_chr2B:550039229-550039372
CS5D01G027600 | cysteine synthase | promoter | 12.05 | 0.05373 | EN3233 | 19 | prom_TraesCS5D01G027600_chr5D:25277779-25277922
CS1B01G011700 | low molecular weight glutenin subunit | promoter | 5.33 | 0.02681 | EN2161 | 20 | prom_TraesCS1B01G011700_chr1B:5688160-5688303
CS1B01G011700 | low molecular weight glutenin subunit | promoter | 9.95 | 0.10660 | EN2162 | 21 | prom_TraesCS1B01G011700_chr1B:5688180-5688323