Synthetic Promoters

20170002378 · 2017-01-05

    Abstract

    CHO cell-specific synthetic promoter constructs for expressing recombinant proteins, a library of promoter constructs thereof, and a method for producing the promoter constructs. The promoter constructs enable precise control of recombinant gene transcription over three orders of magnitude, with the top expressing promoters capable of double the transcriptional activity of the CMV promoter.

    Claims

    1. A CHO cell, comprising a synthetic promoter suitable for eliciting recombinant protein expression therein, said synthetic promoter comprising a promoter core and upstream thereof two or more transcription factor regulatory elements independently selected from the group consisting of NFκB-RE, E-box, AP1, CRE, GC-Box, E4F1, C/EBP-RE, OCT and RARE.

    2. The CHO cell according to claim 1, wherein the promoter core is selected from CMV, SV40, UbC, EF1A, PGK and CAGG.

    3. The CHO cell according to claim 1, wherein the synthetic promoter comprises 2 to 50 transcription factor regulatory elements.

    4. The CHO cell according to claim 1, wherein the transcription factor regulatory elements are all the same type.

    5. The CHO cell according to claim 1, wherein the transcription factor regulatory elements are a combination of different types, which are, optionally, independently selected from NFκB-RE, E-box, GC-Box, C/EBP-RE, CRE and E4F1.

    6. The CHO cell according to claim 1, wherein the transcription factor regulatory elements are arranged in tandem.

    7. (canceled)

    8. The CHO cell according to claim 1, wherein a) the synthetic promoter DNA sequence is 0.9 or less of the size of the full length CMV promoter sequence, and/or b) the synthetic promoter has a transcriptional activity per unit DNA sequence thereof which is greater than the transcriptional activity per unit DNA of the CMV promoter.

    9. (canceled)

    10. The CHO cell according to claim 1, wherein the CHO cell is selected from CHO-S, CHO-K1 and CHO-DG44.

    11. The CHO cell according to claim 1, wherein the activity of the transcription factor regulatory element YY1 is inhibited, for example by a block decoy specific to YY1.

    12. The CHO cell according to claim 1, wherein the cell further comprises a polynucleotide sequence encoding a recombinant protein under the control of the synthetic promoter, wherein, optionally, the recombinant protein is an antibody or antigen binding fragment thereof.

    13. (canceled)

    14. The CHO cell according to claim 1, wherein the promoter exhibits improved protein expression in comparison to the promoter core or the wild type promoter, wherein, optionally, the improved protein expression is a greater level of recombinant protein expression.

    15. (canceled)

    16. The CHO cell according to claim 1, wherein the synthetic promoter comprises a sequence given in any one of SEQ ID NOs: 30 to 169.

    17. The CHO cell according to claim 16, wherein the synthetic promoter comprises a) a sequence given in any one of SEQ ID NOs: 126 to 169, or b) the nucleotide sequence given in SEQ ID NO:30, SEQ ID NO:31, SEQ ID NO:32, SEQ ID NO:126, SEQ ID NO:128 or SEQ ID NO:144.

    18. (canceled)

    19. The CHO cell according to claim 1, wherein the synthetic promoter a) does not comprise any CpG islands, and/or b) has properties suited to the expression of the specific recombinant protein it is associated with.

    20. (canceled)

    21. A synthetic promoter suitable for promoting recombinant protein expression in a CHO cell, said synthetic promoter comprising a promoter core and upstream thereof a mixture of two or more transcription factor regulatory elements independently selected from the group consisting of NFκB-RE, E-box, AP1, CRE, GC-Box, E4F1, C/EBP-RE, OCT and RARE.

    22. The synthetic promoter according to claim 21, wherein the transcription factor regulatory elements are independently selected from the group consisting of NFκB-RE and CRE.

    23. A method of generating a synthetic promoter suitable for promoting recombinant protein expression in a given mammalian recombinant host cell comprising, a) identifying motifs of transcription factor regulatory elements, b) testing each transcription factor regulatory element identified in a) combined with a promoter core for activity in a chosen mammalian recombinant host cell line, c) selecting two or more transcription factor regulatory elements from (b) which are more active in the chosen mammalian recombinant host cell line than the promoter core alone, d) preparing one or more synthetic promoter constructs comprising two or more of those transcription factor regulatory elements independently selected from those selected in c), e) testing the synthetic promoter construct or constructs prepared in d) for activity in the chosen mammalian recombinant host cell, f) identifying the synthetic promoter construct or constructs that exhibit the same or improved protein expression compared to a wild type promoter wherein the method optionally additionally comprises g) selecting two or more of those transcription factor regulatory elements which are associated with the constructs identified in f), and h) preparing one or more synthetic promoter constructs comprising a TFRE construct comprising or consisting of those elements independently selected in g).

    24. (canceled)

    25. The method according to claim 24, wherein the TFRE constructs prepared in step h comprise transcription factor regulatory elements at a stoichiometry which reflects their relative abundance in the constructs identified in f).

    26. The method according to claim 23, wherein part (f) of the method further comprises identifying the transcription factor regulatory element or elements that are most frequently associated with promoter constructs which exhibit reduced protein expression compared to the wild type promoter and excluding these in (g).

    27. A method of identifying a synthetic promoter suitable for promoting recombinant protein expression in a given mammalian recombinant host cell at a desired level comprising the steps of: a) obtaining two or more synthetic promoter constructs defined in claim 21 or claim 22, b) testing the synthetic promoter constructs obtained in a) to determine the level of recombinant protein expression driven by each construct in the chosen mammalian recombinant host cell, c) selecting a synthetic promoter construct tested in (b) if it promotes recombinant protein expression at the desired level.

    28. The method according to claim 27, wherein the two or more synthetic promoters obtained in step (a) each comprises a sequence independently selected from any one of SEQ ID NOs: 30 to 169 or SEQ ID NOs 126-169.

    29. The method according to claim 27, wherein the desired level of protein expression is higher or lower than that achieved using a wild type promoter.

    30. (canceled)

    31. A method of constructing a transcription factor regulatory element construct library comprising the step of randomly ligating the transcription factor regulatory elements a) NFκB-RE and E-box at a ratio of 5:3, or b) NFκB-RE, E-box, GC-Box and C/EBP-RE at a ratio of 5:3:1:1.

    32. (canceled)

    33. A CHO cell wherein the activity of the transcription factor regulatory element YY1 is knocked down or knocked out by a block decoy specific to YY1.

    34. (canceled)

    35. The CHO cell according to claim 33, wherein the cell further comprises a polynucleotide sequence encoding a recombinant protein under the control of the synthetic promoter.

    Description

    DESCRIPTION OF FIGURES

    [0194] FIG. 1 shows a graph depicting the results of a reporter gene assay to identify transcription factor regulatory elements which are active in CHO cells. 28 transcription factor regulatory elements (TFREs) (see Table 1) were derived from informatics analysis of transcription factor binding sites in common viral promoters known to be active in mammalian cells. [0195] The 28 TFREs were assessed for their transcriptional activities in CHO-S cells (sequence list shown in Table 2) using SEAP/GFP reporter constructs. A schematic diagram of the general structure of the reporter construct is also shown in FIG. 1. Seven copies of each TFRE (as described in Table 1) were cloned in series upstream of a minimal CMV core promoter in reporter vectors encoding either GFP or SEAP reporters. [0196] CHO-S cells (2×10^5) in 24-well plates were transfected with 1 µg of SEAP (black bars) or GFP (white bars) TFRE reporter-vector. SEAP activity in cell culture supernatant and intracellular GFP were measured 24 h post-transfection. Data are expressed as a fold-change with respect to the activity of a vector containing only a minimal CMV core promoter (Core). A random 8 bp sequence with no known homology to TFRE sequences (8mer) was also used as a -ve control. Bars represent the mean+SD of three independent experiments each performed in triplicate, using three clonally derived plasmids for each TFRE-reporter construct.
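
    The fold-change normalization described for FIG. 1 can be illustrated with a minimal Python sketch. The function name and the readings are hypothetical placeholders and are not data from the specification; the sketch assumes only that each reporter reading is divided by the mean signal of the Core-only vector.

```python
from statistics import mean, stdev

def fold_change(sample_readings, core_readings):
    """Return (mean, SD) of sample readings expressed as fold-change over the Core vector."""
    core_mean = mean(core_readings)
    if core_mean <= 0:
        raise ValueError("Core reference signal must be positive")
    # Normalize every replicate to the mean Core signal.
    ratios = [reading / core_mean for reading in sample_readings]
    return mean(ratios), stdev(ratios)

# Hypothetical raw reporter readings (arbitrary units), for illustration only.
core = [10.0, 10.0, 10.0]
tfre = [40.0, 50.0, 60.0]
fc_mean, fc_sd = fold_change(tfre, core)
```

    With these placeholder values the TFRE reporter would be plotted at a five-fold change over Core.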

    [0197] FIG. 2 shows a graph which indicates some of the results of a reporter gene assay comparing the expression levels of a selection of Generation 1 synthetic promoters of the present invention (sequences shown in Table 3). [0198] First generation synthetic promoters were constructed by random ligation of NFκB, CRE, E-box, GC-box, E4F1 and C/EBP TFREs in equal proportion.

    [0199] Synthetic promoters were inserted upstream of a minimal CMV core promoter in SEAP reporter plasmids and transfected into CHO-S cells. FIG. 2 includes a schematic diagram of the general structure of the Generation 1 synthetic promoters. [0200] SEAP expression was quantified 24 h post-transfection. Data are expressed as a percentage of the production exhibited by promoter 1/01 (see Table 3 for sequence). SEAP production from the control CMV-SEAP reporter is shown as the black bar. Each bar represents the mean of two transfections; for each promoter, less than 10% variation in SEAP production was observed.

    [0201] FIG. 3A/B shows a series of graphs, each of which shows the results of an analysis of the abundance of a TFRE relative to the expression levels of the Generation 1 promoter constructs in which the TFRE is present. [0202] The number of each TFRE in each synthetic promoter is plotted against the relative activity of that promoter (A-F). In each case the linear regression line is shown, where the slope of the line indicates the extent to which each TFRE occurs in promoters of varying activity. [0203] Over-mean: higher than average expression level (i.e. high expressing constructs). [0204] Under-mean: lower than average expression level (i.e. low expressing constructs).
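
    The regression analysis described for FIG. 3 can be sketched as an ordinary least-squares fit of promoter activity against the copy number of one TFRE. This is an illustrative assumption about the fitting method (the specification only says a linear regression line is shown), and the data points are hypothetical.

```python
def regression_slope(copy_numbers, activities):
    """Ordinary least-squares slope and intercept of activity against TFRE copy number."""
    n = len(copy_numbers)
    if n != len(activities) or n < 2:
        raise ValueError("need two equal-length series with at least two points")
    mean_x = sum(copy_numbers) / n
    mean_y = sum(activities) / n
    sxx = sum((x - mean_x) ** 2 for x in copy_numbers)
    if sxx == 0:
        raise ValueError("copy numbers must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(copy_numbers, activities))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical data: copies of one TFRE per promoter and relative activity (%).
copies = [0, 1, 2, 3]
activity = [10.0, 30.0, 50.0, 70.0]
slope, intercept = regression_slope(copies, activity)
```

    A positive slope would indicate a TFRE that is more abundant in high expressing constructs; a negative slope would indicate one that is more abundant in low expressing constructs.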

    [0205] FIG. 4 shows a graph indicating some of the results of a reporter gene assay comparing the expression levels of a selection of Generation 2 synthetic promoter constructs of the present invention (sequences shown in Table 4). [0206] Second generation synthetic promoters were constructed by random ligation of NFκB, E-box, GC-box and C/EBP TFREs in the ratio 5:3:1:1. Synthetic promoters were inserted upstream of a minimal CMV core promoter in SEAP reporter plasmids and transfected into CHO-S cells. SEAP expression was quantified 24 h post-transfection. Data are expressed as a percentage of the production exhibited by the CMV control promoter (black bar). SEAP production from the most active promoter from the first generation library (1/01; FIG. 2) is shown as a checked bar. Otherwise, each bar represents the mean of two transfections; for each promoter, less than 10% variation in SEAP production was observed. [0207] Promoter construct 1/01 (hatched bar): the Generation 1 promoter which produced the highest level of expression.
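
    As a rough in silico analogy to random ligation at a 5:3:1:1 ratio, the following Python sketch draws TFREs with probability proportional to their mixing ratio and concatenates them in tandem. It is a simplified model under stated assumptions: the element count of 12 is arbitrary, the function name is hypothetical, and overhang or spacer sequences used in real ligations are not modeled. The element sequences are those listed in Table 1.

```python
import random

# TFRE sequences as listed in Table 1 of the specification.
TFRE_SEQUENCES = {
    "NFkB": "GGGACTTTCC",   # nuclear factor kappa B element
    "E-box": "CACGTG",
    "GC-box": "GGGGCGGGG",
    "C/EBP": "TTGCGCAA",
}

def random_tfre_order(ratios, n_elements, rng):
    """Draw n_elements TFRE names with probability proportional to their mixing ratio."""
    names = list(ratios)
    weights = [ratios[name] for name in names]
    return rng.choices(names, weights=weights, k=n_elements)

rng = random.Random(0)  # fixed seed so the sketch is reproducible
gen2_ratios = {"NFkB": 5, "E-box": 3, "GC-box": 1, "C/EBP": 1}
order = random_tfre_order(gen2_ratios, 12, rng)
# Concatenate the drawn elements in tandem (no spacers, an assumption).
tandem = "".join(TFRE_SEQUENCES[name] for name in order)
```

    Repeating the draw many times would give a simulated library in which the average element composition approaches the mixing ratio.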

    [0208] FIG. 5 shows a series of graphs, each of which shows the results of an analysis of the abundance of a TFRE relative to the expression levels of the Generation 2 promoter constructs in which the TFRE is present. [0209] The number of each TFRE in each synthetic promoter is plotted against the relative activity of that promoter (A-D). In each case the linear regression line is shown, where the slope of the line indicates the extent to which each TFRE occurs in promoters of varying activity. [0210] Over-mean: higher than average expression level (i.e. high expressing constructs). [0211] Under-mean: lower than average expression level (i.e. low expressing constructs).

    [0212] FIG. 6 shows the results of reporter gene assays in which the relative activity of seven synthetic promoters with differing activity was determined in CHO-S cells, CHO-K1 cells and CHO-DG44 cells. Cells (2×10^5) were transfected with 250 ng SEAP-reporter vector, and SEAP production was quantified 24 h post-transfection. Data are expressed as a percentage of the activity of the CMV promoter in each cell line. Values represent the mean+S.D. of three independent experiments performed in triplicate. [0213] Interestingly, the relative performance of each promoter was not the same in all cell lines tested.

    [0214] FIG. 7 shows the results of reporter gene assays involving longer term transient expression (i.e. performed over 7 days) in fed-batch culture of CHO-S cells using the same synthetic promoter constructs depicted in FIG. 6. [0215] CHO-S cells (6×10^6) were transfected with 7.5 µg of SEAP reporter-vectors, where SEAP expression was under the control of synthetic promoters with varying activity or the control CMV promoter. SEAP production and viable cell concentration were measured over the course of a 7-day fed-batch process in tube-spin bioreactors. The mean IVCD (integral of viable cell density) at Day 7 (white bars) and SEAP titer (black bars) are shown. SEAP data are expressed as a percentage of the control CMV promoter activity. Two independent transfections were performed in duplicate.
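
    The specification does not state how the IVCD was calculated. One common approach, assumed here purely for illustration, is trapezoidal integration of viable cell density over culture time. The function name and the sample values below are hypothetical.

```python
def ivcd(days, viable_cell_density):
    """Integral of viable cell density over time by the trapezoidal rule."""
    if len(days) != len(viable_cell_density) or len(days) < 2:
        raise ValueError("need two equal-length series with at least two time points")
    total = 0.0
    for i in range(1, len(days)):
        dt = days[i] - days[i - 1]
        if dt <= 0:
            raise ValueError("time points must be strictly increasing")
        # Area of the trapezoid between consecutive measurements.
        total += dt * (viable_cell_density[i] + viable_cell_density[i - 1]) / 2.0
    return total

# Hypothetical viable cell densities (10^6 cells/mL) on days 0, 1 and 2.
sample_days = [0, 1, 2]
sample_vcd = [1.0, 2.0, 4.0]
sample_ivcd = ivcd(sample_days, sample_vcd)  # units: 10^6 cells x day / mL
```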

    [0216] FIG. 8 shows various sequences of synthetic promoters according to the present disclosure.

    DETAILED DESCRIPTION OF INVENTION

    [0217] In tandem as employed herein refers to sequences in line one after the other.

    [0218] As used herein, the term transcription factor refers to any cellular factor, including proteins that bind to a cis-acting region and regulate either positively or negatively the expression of the gene, for example, a transcription factor may bind upstream of the coding sequence of a gene to either enhance or repress transcription of the gene by assisting or blocking RNA polymerase binding. Transcription factors or repressors or co-activators or co-repressors, and the like are encompassed within this definition.

    [0219] As used herein, the term transcription factor regulatory element, or TFRE, or regulatory element refers to a nucleotide sequence that is recognized and bound by a transcription factor.

    [0220] A TFRE comprises a nucleic acid sequence, suitably a double stranded DNA sequence. A TFRE may comprise a cis-acting region and may also comprise additional nucleic acids. The core six to eight nucleotides of promoter and enhancer elements may be sufficient for the binding of their corresponding transcription factors.

    Thus, a TFRE may consist of 6 to 8 nucleic acid bases. A TFRE of the invention may be 6 or more, 8 or more, 10 or more, 15 or more, 20 or more, 25 or more, or 30 or more bases in length. A TFRE of the invention may be 100 or less, 75 or less, 50 or less, 30 or less, 25 or less, 20 or less or 15 or less bases in length.

    [0221] The person of skill in the art will understand, however, that influence from other transcription factors, e.g. general transcription factors or transcription factors binding promiscuously to polynucleotides, is not always excluded. Thus, a synthetic promoter construct wherein at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, or at least 99% of transcription activity or repression activity is mediated by the transcription factors for which transcription factor regulatory elements have been incorporated into said promoter is to be considered to be specifically controlled by the transcription factors.

    [0222] A particularly suitable TFRE is one that is active in the cell or tissue of interest. Such a TFRE may be identified as being associated with a gene that is expressed in the cell or tissue of interest, for example, a TFRE may be associated with a gene that is differentially expressed in that cell or tissue, when compared with another cell or tissue. Differential expression of a gene may be seen by comparing the expression of the gene in two different cells or tissues, or in the same cells or tissues under different conditions.

    [0223] Expression in one cell or tissue type may be compared with that in a different, but related, tissue type, for example, where the cell or tissue of interest is a disease cell or tissue or has been artificially manipulated as described herein, the expression of genes in that cell or tissue may be compared with the expression of the same genes in an equivalent normal or untreated cell or tissue. This may allow the identification of genes that are differentially regulated between the two cell or tissue types.

    [0224] A TFRE that is associated with such a gene is generally located close to the coding sequence of the gene within the genome of the cell, for example, such a TFRE may be located in the region immediately upstream or downstream of that coding sequence. Such a TFRE may be located close to a promoter or other regulatory sequence that regulates expression of the gene. The location of a TFRE may be determined by the skilled person using a variety of known methods, such as those described in the present specification.

    [0225] Some suitable examples of transcription factor regulatory elements of the present invention are shown in Table 1 below.

    TABLE 1
    Transcription Factor Regulatory Elements: Ten viral promoters thought to exhibit activity in CHO cells were surveyed for the presence of discrete transcription factor regulatory elements (transcription factor binding sites) using Transcription Element Search System (TESS) and Transcription Affinity Prediction (TRAP) algorithms using stringent search parameters to minimize false positives. DNA sequences of single TFREs that occur in more than one viral promoter are listed. Measurement of their relative ability to activate transcription of recombinant reporter genes in CHO-S cells is shown in FIG. 1.

    Transcription Factor Regulatory Element       Sequence           SEQ ID NO:
    Activator protein 1 (AP1)                     TGACTCA            1
    CC(A/T)6GG element (CArG)                     CCAAATTTGG         2
    CCAAT displacement protein (CDP)              GGCCAATCT          3
    CCAAT-enhancer binding protein alpha (C/EBP)  TTGCGCAA           4
    Cellular myeloblastosis (cMyb)                TAACGG             5
    cAMP RE (CRE)                                 TGACGTCA           6
    Elongation factor 2 (E2F)                     TTTCGCGC           7
    E4F1                                          GTGACGTAAC         8
    Early growth response protein 1 (EGR1)        CGCCCCCGC          9
    Estrogen-related receptor alpha RE (ERRE)     AGGTCATTTTGACCT    10
    Enhancer box (E-box)                          CACGTG             11
    GATA-1 (GATA)                                 AGATAG             12
    GC-box                                        GGGGCGGGG          13
    Glucocorticoid RE (GRE)                       AGAACATTTTGTTCT    14
    Growth factor independence 1 (Gfi1)           AAAATCAAC          15
    Helios RE (HRE)                               AATAGGGACTT        16
    Hepatocyte nuclear factor 1 (HF)              GGGCCAAAGGTCT      17
    Insulin promoter factor 1 (IPF1)              CCCATTAGGGAC       18
    Interferon-stimulated RE (ISRE)               GAAAAGTGAAACC      19
    Myocyte enhancer factor 2 (MEF2)              CTAAAAATAG         20
    Msx homeobox (MSX)                            CGGTAAATG          21
    Nerve growth factor-induced gene-B RE (NBRE)  AAAGGTCA           22
    Nuclear factor 1 (NF1)                        TTGGCTATATGCCAA    23
    Nuclear factor of activated T cells (NFAT)    AGGAAATC           24
    Nuclear factor kappa B (NFκB)                 GGGACTTTCC         25
    Octamer motif (OCT)                           ATTAGCAT           26
    Retinoic acid RE (RARE)                       AGGTCATCAAGAGGTCA  27
    Yin yang 1 (YY1)                              CGCCATTTT          28
    Random 8mer (8mer) [-ve control]              TTTCTTTC           29
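
    To illustrate how the Table 1 sequences relate to the reporter constructs of FIG. 1, where seven copies of a TFRE are placed in series upstream of a core promoter, the following Python sketch builds such a tandem repeat. The function name and the optional spacer parameter are hypothetical, only a few Table 1 entries are included, and the core promoter sequence (SEQ ID NO:170) is not reproduced here.

```python
# A subset of the Table 1 TFRE sequences, keyed by abbreviation.
TABLE1_TFRES = {
    "AP1": "TGACTCA",
    "CRE": "TGACGTCA",
    "E4F1": "GTGACGTAAC",
    "E-box": "CACGTG",
    "GC-box": "GGGGCGGGG",
    "YY1": "CGCCATTTT",
}

def tandem_repeat(name, copies=7, spacer=""):
    """Return `copies` copies of the named TFRE joined in series, optionally with a spacer."""
    if copies < 1:
        raise ValueError("copies must be at least 1")
    sequence = TABLE1_TFRES[name]
    return spacer.join([sequence] * copies)

# Seven E-box copies in series, as a stand-in for the TFRE block of a reporter construct.
ebox_block = tandem_repeat("E-box")
```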

    [0226] It will be understood that the transcription regulatory elements of the invention are not limited to specific sequences referred to in the specification but also encompass their structural and functional analogs/homologues. Such analogs may contain truncations, deletions, insertions, as well as substitutions of one or more nucleotides introduced either by directed or by random mutagenesis. Truncations may be introduced to delete one or more binding sites for known transcriptional repressors.

    [0227] Additionally, such sequences may be derived from sequences found in nature that exhibit a high degree of identity to the sequences in the invention. A nucleic acid of about 20 nucleotides or more will be considered to have a high degree of identity to a transcription factor regulatory element of the invention if it hybridizes to the relevant transcription factor under stringent conditions. Alternatively, a nucleic acid will be considered to have a high degree of identity to a transcription factor regulatory element of the invention if it comprises a contiguous sequence of about 20 or more nucleotides, which has percent identity of at least 70%, 75%, 80%, 85%, 90%, 95%, or more as determined by standard alignment algorithms such as, for example, Basic Local Alignment Search Tool (BLAST) described in Altschul et al., J. Mol. Biol. 1990, 215: 403-410, the algorithm of Needleman et al., J. Mol. Biol. 1970, 48: 444-453, or the algorithm of Meyers et al., Comput. Appl. Biosci. 1988, 4: 11-17.
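
    The percent identity thresholds above can be illustrated with a deliberately simplified Python sketch that compares two equal-length sequences position by position. This is not BLAST or the Needleman algorithm: it assumes an ungapped, pre-aligned comparison, and the function name and example sequences are hypothetical.

```python
def percent_identity(seq_a, seq_b):
    """Percent of matching positions between two equal-length, ungapped sequences."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)

# Two hypothetical 20-nucleotide sequences differing at the last two positions.
reference = "AGGTCATTTTGACCTAGGTC"
candidate = "AGGTCATTTTGACCTAGGAA"
identity = percent_identity(reference, candidate)
meets_threshold = identity >= 90.0
```

    For real sequences of differing length or with insertions and deletions, an alignment tool such as those cited above would be needed before identity is calculated.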

    [0228] The transcription factor regulatory elements of the present invention may be chosen from transcription factor regulatory elements already known to be active in a target host cell or may be putative regulatory elements determined by in silico analysis of sequences upstream of core promoters, by using methods known to those of skill in the art.

    [0229] In one embodiment there is provided use of a combination of transcription factor regulatory elements selected from a group described herein, for example use to improve transcription, translation or expression in a host, in particular by incorporation into a synthetic promoter.

    [0230] In one embodiment there is provided use of a synthetic promoter, for example as described herein for driving expression in a mammalian host cell, such as a CHO cell.

    [0231] These in silico analyses typically operate by comparing non-coding regulatory sequences between the genomes of various organisms to enable the identification of conserved regions that are significantly enriched in promoters of candidate genes or from clusters identified by microarray analysis and can potentially function as transcription factor regulatory elements. Examples of these software suites include TRAFAC, CORG, CONSITE, CONFAC, VAMP and CisMols Analyser.

    [0232] As used herein, the term promoter or promoter construct refers to a DNA segment that contains components for an efficient transcription of a gene and includes one or more transcription factor regulatory elements; a core promoter region; and optionally, sequences from the 5'-untranslated region or introns.

    [0233] As used herein, the term synthetic promoter construct or synthetic promoter refers to an artificial or engineered or assembled promoter sequence, for example comprising two or more transcription factor regulatory elements, such as containing one or more transcription factor regulatory element constructs.

    [0234] As used herein, the term transcription factor regulatory element construct or TFRE construct refers to an assembled double stranded DNA molecule that comprises more than one transcription regulatory element sequence. The construct may be created by a number of different means that would be known to the skilled addressee; including the ligation of various double stranded transcription regulatory elements together in a random or directed fashion. The construct may comprise other nucleic acid sequences as well, e.g. spacers that do not mediate binding of a transcription factor but allow for a correct spatial arrangement of binding sites. The spacer region, for example may be a common overhang which allows different TFREs to be easily ligated to each other.

    [0235] The TFREs are usually in tandem with one another, but may be separated by an overhang sequence.

    [0236] The construct may comprise at least two transcription factor regulatory elements that are the same.

    [0237] The construct may comprise at least two different transcription factor regulatory elements for at least two different transcription factors.

    [0238] The construct may also comprise a plurality, for example, two, three, four, five, six, seven, eight, nine, or ten or more transcription factor regulatory elements. A number of these may be the same or they may all be different.

    [0239] When assembled within a construct, the transcription factor regulatory elements bind multiple transcription factors expressed in the target host cells and efficiently drive expression of a recombinant protein or reporter protein. However, the same combined transcription factor regulatory elements may be inactive in non-target host cells due to the lack of particular transcription factors required for binding to the sequence elements. Thus, the combinatorial nature of gene transcription is most effectively utilized, by knowing the exact profile of transcription factors and co-regulators that are active in the target host cells.

    [0240] As used herein, the term core promoter or promoter core refers to a short DNA segment which is the minimal portion of the promoter required to initiate transcription. Core promoter sequence can be derived from various different sources, including prokaryotic and eukaryotic genes. Examples of this are dopamine beta-hydroxylase gene minimum promoter and cytomegalovirus (CMV) immediate early gene promoter. In one example, the core promoter is derived from a promoter selected from the group consisting of CMV, SV40, UbC, EF1A, PGK and CAGG, such as CMV.

    [0241] A core promoter may be inducible, wherein transcription is initiated in response to an inducing agent or an increased level of transcription of an operatively linked expressible polynucleotide as compared to the level of transcription, if any, in the absence of an inducing agent.

    [0242] Alternatively, a promoter may be constitutive, wherein the transcription activity is not affected by the presence of an inducing agent.

    [0243] The term inducing agent is used to refer to a chemical, biological or physical agent that effects transcription from an inducible promoter. An inducing agent can be, for example, a stress condition to which a cell is exposed, for example, a heat or cold shock, a toxic agent such as a heavy metal ion, or a lack of a nutrient, hormone, growth factor, or the like; or can be exposure to a molecule that affects the growth or differentiation state of a cell such as a hormone or a growth factor.

    [0244] A core promoter may also be regulated in a tissue-specific or tissue-preferred manner, such that it is only active in transcribing the operable linked coding region in a specific tissue type.

    [0245] As used herein, the term wild type promoter refers to a promoter sequence as it occurs in nature.

    [0246] Unless stated otherwise, the term CMV promoter refers to the commonly acknowledged full length hCMV-IE1 promoter (i.e. the core and the proximal elements; GenBank accession number: M60321.1, bases 595-1193). The term CMV core or hCMV-IE1 core as used herein refers to the minimal hCMV-IE1 core promoter provided herein as SEQ ID NO:170.

    [0247] The term operatively linked as used herein refers to elements or structures in a nucleic acid sequence that are linked by operative function (i.e. able to influence the function or respond to the function of the other element) and not physical location. Hence, it is not necessary for elements or structures in a nucleic acid sequence to be in a tandem or adjacent order to be operatively linked.

    [0248] The term vector as used herein refers to any vehicle that delivers a nucleic acid into a cell or organism. An example of a vector is a plasmid, which is a circular double stranded DNA loop into which additional DNA segments may be ligated and which can replicate independently of chromosomal DNA. Plasmids occur in, or are derived from, mainly bacteria and sometimes other microorganisms. However, mitochondrial and chloroplast DNA, yeast killer and other cases are commonly excluded.

    [0249] Another type of vector is a viral vector, wherein additional DNA segments may be ligated into the viral sequence. Certain vectors are capable of autonomous replication in a host cell into which they are introduced (e.g., bacterial vectors having a bacterial origin of replication and episomal mammalian vectors). Other vectors (e.g., non-episomal mammalian vectors) can be integrated into the genome of a host cell, where they are subsequently replicated along with the host genome. In the present specification, the terms plasmid and vector may be used interchangeably as a plasmid is the most commonly used form of vector.

    [0250] General methods by which the vectors may be constructed, transfection methods and culture methods are well known to those skilled in the art. In this respect, reference is made to Current Protocols in Molecular Biology, 1999, F. M. Ausubel (ed), Wiley Interscience, New York and the Maniatis Manual produced by Cold Spring Harbor Publishing.

    The vectors of the present invention may comprise a selectable marker, which is a protein whose expression allows one to identify cells that have been transformed or transfected with a vector containing the marker gene.

    [0251] A wide range of selection markers are known in the art. For example, the selectable marker may be the gene for neomycin phosphotransferase (npt II), which expresses an enzyme conferring resistance to the antibiotic kanamycin and to the related antibiotics neomycin, paromomycin, gentamicin and G418, or the gene for hygromycin phosphotransferase (hpt), which expresses an enzyme conferring resistance to hygromycin.

    [0252] The term expression vector as used herein, refers to a vector encoding a recombinant protein that is to be expressed in a target host cell. A plurality of different expression vectors as described herein may be provided. These may form a library.

    As used herein, a host cell is a cell comprising one or more synthetic promoters, vectors or expression vectors of the present invention. The cell may be a mammalian cell. The host cell may be a cultured cell or a body cell. Suitable mammalian host cells include CHO, myeloma or hybridoma cells.

    [0253] Transfection of vectors into the target host cells of the present invention may be achieved using any suitable method. A variety of transfection methods are known in the art and the skilled person will be able to select a suitable method depending on the type of vector and type of host cell desired.

    [0254] The term transient expression as used herein refers to the capacity of a host cell to direct the transcription and translation of recombinant genetic sequences. This transcription and translation occurs soon after the genetic sequences are introduced into the host cell. Such expression can occur even when a plasmid which carries a reporter gene sequence operably linked to a functional promoter is introduced into a host cell incapable of replicating or propagating the plasmid.

    [0255] As used herein, a recombinant protein refers to a protein that is constructed or produced using recombinant DNA technology. The protein of interest may be an exogenous sequence identical to the endogenous protein or a mutated version thereof, for example with attenuated biological activity, or fragment thereof, expressed from an exogenous vector. Alternatively, the protein of interest may be a heterologous protein, not normally expressed by the host cell.

    [0256] The recombinant protein may be any suitable protein including therapeutic, prophylactic or diagnostic protein.

    [0257] The recombinant protein expressed under the control of a synthetic promoter according to the invention may, for example, be an immunogenic protein, a fusion protein comprising two heterologous proteins or an antibody. Antibodies for use as the recombinant protein include monoclonal, multi-valent, multi-specific, humanized, fully human or chimeric antibodies. The antibody may be a complete antibody molecule having full length heavy and light chains or a fragment thereof, e.g. VH, VL, VHH, Fab, modified Fab, Fab', F(ab')2, Fv or scFv fragment.

    [0258] After expression antibody fragments may be further processed, for example by conjugation to another entity or for example the antibody fragments may be PEGylated to generate a product with the required properties, for example similar to the whole antibodies, if required.

    [0259] Examples of antigens of interest bound by the antibodies or fragments thereof may include, but are not limited to, any medically relevant protein such as those proteins upregulated during disease or infection, for example receptors and/or their corresponding ligands. Particular examples of cell surface proteins include adhesion molecules, for example integrins such as β1 integrins e.g. VLA-4, E-selectin, P-selectin or L-selectin, CD2, CD3, CD4, CD5, CD7, CD8, CD11a, CD11b, CD18, CD19, CD20, CD23, CD25, CD33, CD38, CD40, CD45, CDW52, CD69, CD134 (OX40), ICOS, BCMP7, CD137, CD27L, CDCP1, DPCR1, dudulin2, FLJ20584, FLJ40787, HEK2, KIAA0634, KIAA0659, KIAA1246, KIAA1455, LTBP2, LTK, MAL2, MRP2, nectin-like2, NKCC1, PTK7, RAIG1, TCAM1, SC6, BCMP101, BCMP84, BCMP11, DTD, carcinoembryonic antigen (CEA), human milk fat globulin (HMFG1 and 2), MHC Class I and MHC Class II antigens, and VEGF, and where appropriate, receptors thereof.

    [0260] Soluble antigens include interleukins such as IL-1, IL-2, IL-3, IL-4, IL-5, IL-6, IL-8, IL-12, IL-16, IL-23, viral antigens for example respiratory syncytial virus or cytomegalovirus antigens, immunoglobulins, such as IgE, interferons such as interferon α, interferon β or interferon γ, tumor necrosis factor-α, colony stimulating factors such as G-CSF or GM-CSF, and platelet derived growth factors such as PDGF-α and PDGF-β, and where appropriate receptors thereof. Other antigens include bacterial cell surface antigens, bacterial toxins, viruses such as influenza, EBV, HepA, B and C, bioterrorism agents, radionuclides and heavy metals, and snake and spider venoms and toxins.

    [0261] The term reporter gene as used herein refers to a nucleic acid sequence encoding easily assayed proteins. Among the more commonly used reporter genes are those for the following reporter proteins: chloramphenicol acetyltransferase (CAT), secreted alkaline phosphatase (SEAP), β-galactosidase (GAL), β-glucuronidase (GUS), luciferase (LUC), and green fluorescent protein (GFP). One of ordinary skill in the art will be aware of other available reporter genes.

    [0262] The present invention also provides reporter gene assays as a method by which the transcriptional activity of a particular synthetic promoter construct within a cell can be analysed. These assays may comprise linking a reporter gene for visualizing the promoter activity, downstream of a promoter of interest to thereby obtain a reporter construct, introducing the reporter construct in a test cell, and quantifying the condition of promoter activation on the basis of the level of expressed reporter protein measured.

    [0263] The reporter gene may, for example encode a secretable reporter protein, which when transcribed and translated, will result in the secretable reporter protein being synthesized and secreted into the external culture medium. The presence of the reporter molecule is monitored by assaying the culture medium without requiring the destruction or rupture of the microorganism host cells. An aliquot of culture medium is evaluated by any means capable of detecting the reporter molecule. Such means may either be, for example, by immunoassay or by other means known to the art. The rate of accumulation of the reporter molecule in the external culture medium is therefore an indication of the transcription activity of any promoter construct which was present on the fragment cloned adjacent and preceding the reporter gene sequences on the plasmid/expression vector.

    [0264] Alternatively, if a visually identifiable reporter gene such as luciferase is used (which results in the emission of a photon in the presence of the substrate luciferin and ATP), the expression levels of the reporter protein can be easily monitored using a luminometer.

    [0265] In one embodiment the recombinant protein is not a reporter protein, i.e. the construct of the present disclosure does not comprise a reporter gene.

    In one embodiment the recombinant protein is not luciferase, i.e. the construct of the present disclosure does not comprise a gene encoding luciferase.

    [0266] The term transcriptional activity as used herein refers to the transcription of the information encoded in DNA into a molecule of RNA, or the translation of the information encoded in the nucleotides of a RNA molecule into a defined sequence of amino acids in a protein.

    [0267] A promoter construct with a high transcriptional activity refers to a construct which is able to express a recombinant protein or reporter protein at a high level of expression, defined as a level of expression that is higher than the mean level of expression obtained across a range of promoter constructs. As a reference point, synthetic promoters with the highest levels of activity exceed the activity of hCMV-IE1, which is widely acknowledged as one of the strongest promoters in CHO cells.

    [0268] Conversely, a promoter construct with a low transcriptional activity refers to a construct which expresses a recombinant protein or reporter protein at a low level of expression, defined as a level of expression that is lower than the mean level of expression obtained across a range of promoter constructs.

    [0269] Accordingly, the terms transcriptional activity and expression level are used interchangeably within the present specification.

    [0270] The terms oligonucleotide, polynucleotide or nucleotide sequence are used broadly herein to mean a sequence of two or more deoxyribonucleotides or ribonucleotides that are linked together by a phosphodiester bond. As such, the terms include RNA and DNA, which can be a gene or a portion thereof, a cDNA, a synthetic polydeoxyribonucleic acid sequence or polyribonucleic acid sequence, or the like, and can be single stranded or double stranded, as well as a DNA/RNA hybrid. Furthermore, the terms oligonucleotide, polynucleotide and nucleotide sequence include naturally occurring nucleic acid molecules, which can be isolated from a cell, as well as synthetic molecules, which can be prepared, for example, by methods of chemical synthesis or by enzymatic methods such as by the polymerase chain reaction (PCR).

    [0271] Synthetic methods for preparing a nucleotide sequence include, for example, the phosphotriester and phosphodiester methods (see Narang et al., Meth. Enzymol. 68:90, (1979); U.S. Pat. No. 4,356,270, U.S. Pat. No. 4,458,066, U.S. Pat. No. 4,416,988, U.S. Pat. No. 4,293,652; and Brown et al, Meth. Enzymol. 68:109, (1979), each of which is incorporated herein by reference). In various embodiments, an oligonucleotide of the invention or a polynucleotide useful in a method of the invention can contain nucleoside or nucleotide analogs, or a backbone bond other than a phosphodiester bond.

    [0272] The nucleotides comprising an oligonucleotide (polynucleotide) generally are naturally occurring deoxyribonucleotides, such as adenine, cytosine, guanine or thymine linked to 2-deoxyribose, or ribonucleotides such as adenine, cytosine, guanine or uracil linked to ribose. However, a polynucleotide also can contain nucleotide analogs, including non-naturally occurring synthetic nucleotides or modified naturally occurring nucleotides. Such nucleotide analogs are well known in the art and commercially available, as are polynucleotides containing such nucleotide analogs (Lin et al., Nucl. Acids Res. 22:5220-5234 (1994); Jellinek et al, Biochemistry 34:11363-11372 (1995); Pagratis et al., Nature Biotechnol. 15:68-73 (1997), each of which is incorporated herein by reference).

    [0273] The covalent bond linking the nucleotides of an oligonucleotide or polynucleotide generally is a phosphodiester bond. However, the covalent bond also can be any of numerous other bonds, including a thiodiester bond, a phosphorothioate bond, a peptide-like bond or any other bond known to those in the art as useful for linking nucleotides to produce synthetic polynucleotides (see, for example, Tam et al., Nucl. Acids Res. 22:977-986 (1994); Ecker and Crooke, BioTechnology 13:351-360 (1995), each of which is incorporated herein by reference). The incorporation of non-naturally occurring nucleotide analogs or bonds linking the nucleotides or analogs can be particularly useful where the nucleotide sequence is to be exposed to an environment that can contain a nucleolytic activity, including, for example, a tissue culture medium or upon administration to a living subject, since the modified nucleotide sequences can be less susceptible to degradation.

    [0274] A polynucleotide comprising naturally occurring nucleotides and phosphodiester bonds can be chemically synthesized or can be produced using recombinant DNA methods, using an appropriate polynucleotide as a template. In comparison, a polynucleotide comprising nucleotide analogs or covalent bonds other than phosphodiester bonds generally is chemically synthesized, although an enzyme such as T7 polymerase can incorporate certain types of nucleotide analogs into a polynucleotide and, therefore, can be used to produce such a polynucleotide recombinantly from an appropriate template (Jellinek et al., supra, 1995).

    [0275] The term library as used herein where the context dictates refers to two or more TFRE constructs, two or more synthetic promoters or two or more expression vectors of the present disclosure or two or more cells of the present disclosure. As described throughout the specification, the term library is used in its broadest sense and may also encompass sub-libraries that may or may not be combined to produce the libraries of the present disclosure. TFREs identified by the methods of the invention as being active in a cell or tissue type of interest may be used to target genes to that cell or tissue type. For example, where the methods of the invention show that a TFRE is active specifically in a particular cell type, but not in a control cell type, then that TFRE may be used to specifically direct expression in the cell type of interest. Thus, a TFRE of the invention may be combined with a gene that it is desired to express in a particular cell type.

    [0276] The term comprising, within the context of the present specification, is intended to mean including.

    [0277] The term about or approximately means within 20%, preferably within 10%, and more preferably within 5% of a given value or range.

    [0278] Where technically appropriate, embodiments of the invention may be combined.

    [0279] Embodiments are described herein as comprising certain features/elements. The disclosure also extends to separate embodiments consisting or consisting essentially of said features/elements.

    [0280] Technical references such as patents and applications are incorporated herein by reference.

    [0281] Any embodiments specifically and explicitly recited herein may form the basis of a disclaimer either alone or in combination with one or more further embodiments.

    [0282] The invention will now be described with reference to the following examples, which are merely illustrative and should not in any way be construed as limiting the scope of the present invention.

    REFERENCES

    [0283] Al-Fageeh M B, Marchant R J, Carden M J, Smales C M. 2006. Biotechnology and bioengineering 93(5):829-835.
    [0284] Blazeck J, Garg R, Reed B, Alper H S. 2012. Biotechnology and bioengineering 109(11):2884-2895.
    [0285] Brown A J, Mainwaring D O, Sweeney B, James D C. 2013. Analytical biochemistry 443(2):205-210.
    Dale L. 2006. BioProcess International 4:14-22.
    [0286] Daramola O, Stevenson J, Dean G, Hatton D, Pettman G, Holmes W, Field R (2013) Biotechnology Progress Volume 30, Issue 1, pages 132-141, January/February 2014.
    [0287] Datta P, Linhardt R J, Sharfstein S T. 2013. Biotechnology and Bioengineering Volume 110, Issue 5, pages 1255-1271, May 2013.
    [0288] Ferreira J P, Overton K W, Wang C L. 2013. Proceedings of the National Academy of Sciences 110(28):11284-11289.
    [0289] Ferreira J P, Peacock R W, Lawhorn I E, Wang C L. 2011. Systems and synthetic biology 5(3-4):131-138.
    [0290] Girod P A, Zahn-Zabal M, Mermod N. 2005. Biotechnology and bioengineering 91(1):1-11.
    [0291] Grabherr M G, Pontiller J, Mauceli E, Ernst W, Baumann M, Biagi T, Swofford R, Russell P, Zody M C, Di Palma F. 2011. Exploiting nucleotide composition to engineer promoters. PloS One 6(5):e20136.
    [0292] Hai T, Curran T. 1991. Proceedings of the National Academy of Sciences 88(9):3720-3724.
    [0293] Ho S C, Koh E Y, van Beers M, Mueller M, Wan C, Teo G, Song Z, Tong Y, Bardor M, Yang Y. 2013. Journal of biotechnology 165(3-4):157-166.
    [0294] Kim M, O'Callaghan P M, Droms K A, James D C. 2011. Biotechnology and bioengineering 108(10):2434-2446.
    [0295] A. M. Lanza, J. K. Cheng, and H. S. Alper, Current Opinion in Chemical Engineering (2012).
    [0296] Le H, Vishwanathan N, Kantardjieff A, Doo I, Srienc M, Zheng X, Somia N, Hu W-S. 2013. Metabolic Engineering. 2013 November; 20:212-20.
    [0297] Mader A, Prewein B, Zboray K, Casanova E, Kunert R. 2012. Applied microbiology and biotechnology:1-6.
    [0298] Manke T, Roider H G, Vingron M. 2008. Statistical modeling of transcription factor binding affinities predicts regulatory interactions. PLoS Comput Biol 4(3):e1000039.
    [0299] McLeod J, O'Callaghan P M, Pybus L P, Wilkinson S J, Root T, Racher A J, James D C. 2011. Biotechnology and bioengineering 108(9):2193-2204.
    [0300] O'Callaghan P M, McLeod J, Pybus L P, Lovelady C S, Wilkinson S J, Racher A J, Porter A, James D C. 2010. Biotechnology and bioengineering 106(6):938-951.
    [0301] Ogawa R, Kagiya G, Kodaki T, Fukuda S, Yamamoto K. 2007. BioTechniques 42(5):628.
    [0302] Pasotti L, Politi N, Zucca S, De Angelis M G C, Magni P. 2012. Bottom-up engineering of biological systems through standard bricks: A modularity study on basic parts and devices. PloS One 7(7):e39407.
    [0303] Prentice H, Tonkin C, Caamano L, Sisk W. 2007. Journal of biotechnology 128(1):50-60.
    [0304] Schlabach M R, Hu J K, Li M, Elledge S J. 2010. Proceedings of the National Academy of Sciences 107(6):2538.
    [0305] Schug J. 2008. Curr. Protoc. Bioinform. 21:2.6.1-2.6.15.
    [0306] Stadlmayr G, Mecklenbräuker A, Rothmüller M, Maurer M, Sauer M, Mattanovich D, Gasser B. 2010. Journal of biotechnology 150(4):519-529.
    [0307] Stinski M F, Isomura H. 2008. Medical microbiology and immunology 197(2):223-231.
    [0308] Tornøe J, Kusk P, Johansen T E, Jensen P R. 2002. Gene 297(1):21-32.
    [0309] Yim S S, An S J, Kang M, Lee J, Jeong K J. 2013. Biotechnology and bioengineering 110(11):2959-2969.
    [0310] Zhou H, Liu Z-g, Sun Z-w, Huang Y, Yu W-y. 2010. Journal of biotechnology 147(2):122-129.

    Example 1

    Identification of Transcription Factor Regulatory Elements that are Active in CHO-S Cells

    [0311] In Silico Analysis of Transcription Factor Regulatory Elements

    [0312] In order to identify discrete TFREs (transcription factor binding sites) capable of recombinant gene transactivation in CHO-S cells, the inventors surveyed for putative TFREs in ten viral promoters generally known to be active in CHO cells.

    [0313] The following promoter sequences were retrieved from GenBank: hCMV-IE1 (accession number M60321.1), mouse CMV-IE1 (M11788), rat CMV-IE1 (U62396), guinea pig CMV-IE1 (CS419275), mouse CMV-IE2 (L06816.1), simian virus 40 early promoter and enhancer (NC_001669.1), adenovirus major late promoter (KF268310), myeloproliferative sarcoma virus long terminal repeat (LTR) (K01683.1), Rous sarcoma virus LTR (J02025.1), and human immunodeficiency virus LTR (K03455.1).

    [0314] Promoters were analysed using the Transcription Element Search System (TESS: http://www.cbil.upenn.edu/cgi-bin/tess/tess) and the Transcription Affinity Prediction tool (TRAP: http://trap.molgen.mpg.de/cgi-bin/trap_form.cgi) according to the methods previously described by Schug (Schug 2008) and Manke et al (Manke et al. 2008). Stringent search parameters were used to minimize false positives.

    [0315] Across all ten viral promoter sequences, the TESS and TRAP searches identified 67 discrete TFREs as being present in one or more promoters. To reduce this pool (the design space), TFREs that did not occur in at least two promoters were filtered out. This in silico analysis left 28 transcription factor regulatory elements (TFREs) (see Table 2 below).

    TABLE-US-00002 TABLE 2 Transcription factor regulatory elements identified by bioinformatic survey of viral promoters: Ten viral promoters known to exhibit activity in CHO cells were surveyed for the presence of discrete transcription factor regulatory elements (transcription factor binding sites) using Transcription Element Search System (TESS) and Transcription Affinity Prediction (TRAP) algorithms using stringent search parameters to minimize false positives. 28 single TFREs that occur in more than one viral promoter are listed.

    Promoter: Transcription Factor Regulatory Elements
    Human Cytomegalovirus immediate early 1 (hCMV-IE1): AP1, CArG, C/EBP, CRE, E4F1, EGR1, GC-box, Gfi1, IPF1, NF1, NFκB, RARE, YY1
    Mouse Cytomegalovirus immediate early 1 (mCMV-IE1): AP1, CRE, E-box, E4F1, ERRE, Gfi1, HRE, IPF1, NF1, NFκB, NFAT, NBRE, RARE
    Rat Cytomegalovirus immediate early 1 (rCMV-IE1): AP1, E2F, ERRE, ISRE, NFκB, NFAT, NBRE, RARE
    Guinea pig Cytomegalovirus immediate early 1 (gpCMV-IE1): AP1, GATA, GC-box, GRE, HNF, MSX, NF1, NFκB, OCT, RARE, YY1
    Mouse Cytomegalovirus immediate early 2 (mCMV-IE2): CArG, GC-box, cMyb, E2F, EGR1, GATA, HRE, MSX, RARE
    Simian virus 40 early promoter and enhancer (SV40E): AP1, C/EBP, cMyb, E-box, GATA, GC-box, MSX, NFκB, OCT
    Adenovirus major late promoter (AdMLP): CDP, E-box, EGR1, GATA, GC-box, HNF, NF1, YY1
    Myeloproliferative sarcoma virus long terminal repeat (LTR) (MPSV LTR): CDP, cMyb, ERRE, GATA, GC-box, Gfi1, GRE, NF1, RARE, YY1
    Rous sarcoma virus LTR (RSV LTR): CArG, CDP, C/EBP, ISRE, OCT
    Human immunodeficiency virus LTR (HIV LTR): E-box, GC-box, GATA, HNF, NFκB, NF1
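    The filtering step described above (retaining only TFREs that occur in at least two promoters) can be sketched as follows. This is a minimal illustration, not the inventors' pipeline: the function name `shared_tfres` and the dictionary `PROMOTER_TFRES` are hypothetical, and only three of the ten promoters from Table 2 are used as input, so the output is not the full set of 28 TFREs.

```python
from collections import Counter

# Subset of Table 2 (three of the ten surveyed promoters), copied from the table.
PROMOTER_TFRES = {
    "AdMLP": ["CDP", "E-box", "EGR1", "GATA", "GC-box", "HNF", "NF1", "YY1"],
    "RSV LTR": ["CArG", "CDP", "C/EBP", "ISRE", "OCT"],
    "HIV LTR": ["E-box", "GC-box", "GATA", "HNF", "NFkB", "NF1"],
}


def shared_tfres(promoter_tfres, min_promoters=2):
    """Return TFREs found in at least `min_promoters` promoters, sorted by name."""
    counts = Counter()
    for tfres in promoter_tfres.values():
        counts.update(set(tfres))  # count each TFRE at most once per promoter
    return sorted(name for name, n in counts.items() if n >= min_promoters)


print(shared_tfres(PROMOTER_TFRES))
```

    With `min_promoters=1` the function returns every element seen in any promoter, which corresponds to the unfiltered pool.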

    TFRE-Reporter Vector Construction

    [0316] Previously described (Brown et al. 2013) promoter-less reporter-vectors (subcloned from pSEAP2 control (Clontech, Oxford, UK)) were utilized in this study. These plasmids contain a minimal hCMV-IE1 core promoter (5'-AGGTCTATATAAGCAGAGCTCGTTTAGTGAACCGTCAGATCGCCTAGATACGCCATCCACGCTGTTTTGACCTCCATAGAAGAC-3') (SEQ ID NO:170) upstream of either the secreted alkaline phosphatase (SEAP) or turbo green fluorescent protein (GFP) open reading frame (ORF).

    [0317] To create TFRE reporter plasmids, synthetic oligonucleotides containing 7 repeat copies of each of the TFRE consensus sequences in Table 1 were synthesized (Sigma), PCR amplified, and inserted into KpnI and XhoI sites upstream of the CMV core promoter. Three clonally derived plasmids for each TFRE reporter were purified using a Qiagen plasmid mini kit (Qiagen, Crawley, UK). The sequence of all the plasmid constructs was confirmed by DNA sequencing.

    Cell Culture and Transfection

    [0318] CHO-S and CHO-K1 cells were cultured in CD-CHO medium (Life Technologies) supplemented with 8 mM and 6 mM L-glutamine (Sigma) respectively. CHO-DG44 cells were cultured in CD-DG44 medium (Life Technologies) supplemented with 8 mM L-glutamine and 18 mL/L pluronic F68 (Life Technologies). All cells were routinely cultured at 37° C. in 5% (v/v) CO.sub.2 in vented Erlenmeyer flasks (Corning, UK), shaking at 140 rpm and subcultured every 3-4 days at a seeding density of 2×10.sup.5 cells/ml. Cell concentration and viability were determined by an automated Trypan Blue exclusion assay using a Vi-Cell cell viability analyser (Beckman-Coulter, High Wycombe, UK). Two hours prior to transfection, 2×10.sup.5 cells from a mid-exponential phase culture were seeded into individual wells of a 24 well plate (Nunc, UK). Cells were transfected with DNA-lipid complexes comprising DNA and Lipofectamine (Life Technologies), prepared according to the manufacturer's instructions. Transfected cells were incubated for 24 h prior to protein expression analysis.

    Quantification of Reporter Expression

    [0319] SEAP protein expression was quantified using the Sensolyte pNPP SEAP colorimetric reporter gene assay kit (Cambridge Biosciences, Cambridge, UK) according to the manufacturer's instructions. GFP protein expression was quantified using a Fluoroskan Ascent FL Fluorometer (Excitation filter: 485 nm, Emission filter: 520 nm). Background fluorescence/absorbance was determined in cells transfected with a promoter-less vector. The results of the reporter assays are shown in FIG. 1. Negligible basal activity was observed with the CMV core promoter (-34 to +48 relative to TSS, includes TATA box and initiator element).
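    By way of illustration only, a reporter reading may be converted to a relative activity as sketched below. The sketch assumes that the background reading from the promoter-less vector is subtracted from each signal and that the result is expressed as a percentage of a reference promoter; the specification does not state the exact calculation, and the function name and the readings are hypothetical.

```python
def relative_activity(signal, background, reference):
    """Background-subtracted signal as a percentage of the reference promoter.

    Assumption: background is subtracted from both the test signal and the
    reference signal before the ratio is taken.
    """
    reference_net = reference - background
    if reference_net <= 0:
        raise ValueError("reference signal must exceed background")
    return 100.0 * (signal - background) / reference_net


# Hypothetical absorbance readings (arbitrary units), for illustration only.
print(relative_activity(signal=0.75, background=0.25, reference=1.25))
```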

    [0320] As can be seen, the majority of the TFREs did not result in an increased expression level of SEAP/GFP compared to when the core promoter alone was used. However, 7 TFREs, i.e. NFκB-RE, E-box, AP1-RE, CRE, GC-box, E4F1-RE and C/EBP-RE, showed higher expression levels of SEAP/GFP compared to the CMV core promoter alone.

    [0321] These 7 TFREs were selected for incorporation into the Generation 1 synthetic promoters.

    Example 2

    Construction and Analysis of Generation 1 Promoters

    Synthetic Promoter Library Construction

    [0322] Synthetic promoter TFREs were constructed from complementary single stranded 5'-phosphorylated oligonucleotides (Sigma, Poole, UK), annealed in STE buffer (100 mM NaCl, 50 mM Tris-HCl, 1 mM EDTA, pH 7.8, Sigma) by heating at 95° C. for 5 min, prior to ramp cooling to 25° C. over 2 h. Oligonucleotides were designed such that the resulting double stranded blocks contained the specific TFRE (Table 1) and a 4 bp TCGA single stranded overhang at each 5' terminus. For example the sequences used for the NFκB-RE block were as follows (RE site underlined): 5'-TCGATGGGACTTTCCA-3' SEQ ID NO: 171 and 5'-TCGATGGAAAGTCCCA-3' SEQ ID NO: 172.
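    The block design described above can be checked with a short sketch: after the 4-nt TCGA overhang is set aside, the remainder of each oligonucleotide should be the reverse complement of the other. The helper names below are hypothetical; the two sequences are SEQ ID NO: 171 and SEQ ID NO: 172 as given above.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def anneals_with_overhangs(top, bottom, overhang="TCGA"):
    """True if the oligos pair fully apart from `overhang` at each 5' end."""
    n = len(overhang)
    if not (top.startswith(overhang) and bottom.startswith(overhang)):
        return False
    return top[n:] == reverse_complement(bottom[n:])


top = "TCGATGGGACTTTCCA"     # SEQ ID NO: 171
bottom = "TCGATGGAAAGTCCCA"  # SEQ ID NO: 172

print(anneals_with_overhangs(top, bottom))
print(reverse_complement("TCGA") == "TCGA")
```

    The second check shows that the TCGA overhang is its own reverse complement, so an overhang on one block can pair with the overhang on any other block regardless of which way round that block is, which is consistent with TFREs being found in both orientations in the assembled promoters.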

    [0323] In order to construct a first generation synthetic promoter library, six of the 7 TFREs identified as transcriptionally active in CHO-S cells were used: NFκB, CRE, E-box, GC-box, E4F1 and C/EBP. Oligonucleotide building blocks containing a single copy of each of these TFRE sequences were chemically synthesized. AP1 was omitted from the library due to previously observed functional redundancy between CRE and AP1 sites (Hai and Curran 1991). Transcription factor regulatory element blocks were synthesised by ligating TFREs at appropriate stoichiometric molar ratios with high concentration T4 DNA ligase (Life Technologies, Paisley, UK). A cloning-block containing KpnI and XhoI sites was included in ligation mixes at a 1:20 molar ratio to the TFREs. The ligated molecules were digested with KpnI and XhoI (Promega, Southampton, UK), gel extracted (Qiaquick gel extraction kit, Qiagen), and inserted upstream of the minimal CMV core promoter in the promoter-less SEAP reporter vector. Clonally derived plasmids were purified and sequenced. A control CMV promoter reporter plasmid was constructed using the hCMV-IE1 promoter (hereafter referred to as CMV) upstream of the SEAP ORF.

    [0324] Transient production of SEAP was employed to determine the relative activity of synthetic promoters as it both maximises throughput and provides a direct readout of synthetic promoter transactivation without potential interference from integration-specific effects or silencing. Whilst SEAP production is not a direct measurement of transcriptional activity, previous experiments in this laboratory have confirmed that SEAP activity in cell culture supernatant is linearly correlated with SEAP mRNA levels post-transfection. Moreover, assay conditions were optimized such that control CMV-SEAP reporter activity was in the centre of the linear assay range with respect to plasmid copy number (DNA load) and measured SEAP output (data not shown).

    [0325] Purified plasmid DNA from 110 transformed E. coli colonies picked at random was utilized for measurement of SEAP reporter production. SEAP production at 24 h post transfection was measured for each synthetic promoter, and each promoter was sequenced to reveal its TFRE-block composition. A small proportion (14) of reporter plasmids were found to be lacking a promoter insert and these were excluded from further analysis.

    [0326] The relative transcriptional activity of the remaining 96 promoters is shown in FIG. 2, and their TFRE-block compositions are listed in Table 3 below. These data show that generation 1 synthetic promoter activities spanned two orders of magnitude, where the most active synthetic promoter exhibited a 1.2-fold increase in SEAP production over that deriving from the CMV control vector.

    TABLE-US-00003 TABLE 3 Generation 1 Promoters: N = NFκB-RE, E = E-box, G = GC-box, B = C/EBP-RE, C = CRE, F = E4F1-RE. TFREs in the reverse orientation (i.e. 3' to 5' with respect to the SEAP reporter) are indicated by an apostrophe. All other TFREs are in the 5' to 3' orientation.

    Promoter Name  Relative Activity (%)  SEQ ID No  Promoter Sequence
    1/01           100                    30         NEENGBCE B NNEG N
    1/02           84.46                  31         E FF B EG N E BNN NN GN
    1/03           67.20                  32         EEG CN G GG C ENN B E BE
    1/04           58.3                   33         B C E GCB NN N GF BN
    1/05           56.87                  34         GFNGNB N BCE N
    1/06           54.89                  35         CEE G ENECB G C N G
    1/07           54.71                  36         B EB G NBBEC NN
    1/08           52.64                  37         EE N CB E EGBEN
    1/09           51.31                  38         BG NB C N EE C GE
    1/10           48.83                  39         B EN EE BC GGN
    1/11           47.64                  40         C E NB EG F
    1/12           47.18                  41         B NNEEG CEN B B
    1/13           44.7                   42         GBE N GENC B NE E CE
    1/14           38.43                  43         CGGNCBB N NC FE N
    1/15           35.94                  44         N BF C N BE NC N
    1/16           34.31                  45         CF NNB C EE F
    1/17           34.27                  46         G GGB ENBN B CB EG G
    1/18           34.03                  47         NN EB CG EB C
    1/19           33.77                  48         EECECNN FCNBN E
    1/20           33.35                  49         F GEEE N FEBG E E
    1/21           32.57                  50         NEN E NNN
    1/22           31.73                  51         FG B EN E B E GGB
    1/23           30.32                  52         C NE BCBNG E
    1/24           28.25                  53         CN BBE FNE GN C
    1/25           28.13                  54         GG B G B N CNNB E CCE B
    1/26           27.33                  55         NG B NEEE F NB N B
    1/27           26.87                  56         B CEF B EF NGNN
    1/28           25.72                  57         E FN NEECNE F NG NGE F
                                                     NG NE F
    1/29           25.69                  58         NNNNNN
    1/30           24.68                  59         CB NE E GN
    1/31           23.33                  60         B GBE F N CBN E BG
    1/32           20.09                  61         E FG NGEENFNCGB GGB
    1/33           19.46                  62         CN FNG G G N FFC NE N
    1/34           15.22                  63         GG N EEFCEB E C N FG
    1/35           15.16                  64         CBC EBG N B E GN GG
    1/36           14.48                  65         EC NGF FGFG NEBC
    1/37           14.44                  66         NF EEFF BNBNFGF CC GN
                                                     ECN
    1/38           13.75                  67         N CNCCE GCB NN
    1/39           13.59                  68         C E C E FBG G N GE E CC
    1/40           13.57                  69         E FGCEFF G GF GFGEGB
                                                     NB ENFBG
    1/41           12.38                  70         B B EG F G B F EN BC
    1/42           12.23                  71         F B G NBBF E EBC NBE NG
                                                     B CN G
    1/43           12.1                   72         GE FNCBC EE
    1/44           12.04                  73         G EEFC C B NE
    1/45           12.01                  74         F BN C BC GN E GEC
    1/46           11.56                  75         NGE C G EBE ECFC E EFEG
                                                     C
    1/47           10.78                  76         EEEEEEE
    1/48           9.11                   77         E G FG E CFG E GB E BBC
    1/49           8.94                   78         B E FG NFCGG B N B EC
    1/50           8.74                   79         FG E E C B
    1/51           8.53                   80         FC G NF CGNGC BEG
    1/52           8.53                   81         CN C NC G N E N FCCN B C
                                                     CNC N FG
    1/53           8.44                   82         GG F NB N F E B N
    1/54           8.43                   83         CCCCCCC
    1/55           8.03                   84         G CC CN EN
    1/56           7.38                   85         G B C NE GEF F
    1/57           6.89                   86         N B BG FG FNE
    1/58           7.74                   87         NGC C
    1/59           6.06                   88         C G GE E G BNBNF FEF
    1/60           6.03                   89         GGGGGGG
    1/61           5.94                   90         G B CCN B EEB
    1/62           4.7                    91         FFFFFF
    1/63           4.62                   92         E EGGC N GB CGF F BEEBF
    1/64           4.29                   93         EBBC F CBE G
    1/65           4.04                   94         F BG B CEE F B
    1/66           4                      95         E ECGNC F ECG EF NC
    1/67           3.94                   96         E G G NFE F NN GFF
    1/68           3.57                   97         G G N F N CB E C EBFE FN
                                                     GC GC NG CBCFN BEFB G
    1/69           3.43                   98         EGBCF BBGNC
    1/70           3.35                   99         E N GFNE G CFE C GGCB
    1/71           3.24                   100        E N FG CFEBC CN
    1/72           3.14                   101        FF CG C B C N CGBBF CN
                                                     FE EB F G GB
    1/73           3.13                   102        BBBBBBB
    1/74           3.1                    103        C G C B BEB C BBB
    1/75           2.57                   104        GGN N FGFC BG
    1/76           2.52                   105        BFGGGGFBC EF
    1/77           2.46                   106        EFE NFG NFNG FC FG G
    1/78           2.39                   107        F GGC GG FN B
    1/79           2.22                   108        FFF E F C FBC NBEB
    1/80           2.11                   109        E GCEF F B GE G
    1/81           2.02                   110        B GCB GFGF
    1/82           2.01                   111        B E B BG BF C B NC C G
    1/83           2                      112        F C F NBG NG
    1/84           1.94                   113        B CB F G GEBCGC
    1/85           1.87                   114        C EE EBGFGF FF EG
    1/86           1.71                   115        G E NFEB FC E F CC
    1/87           1.57                   116        F CB GFGGEF C E
    1/88           1.35                   117        C N GE C GCN GFF
    1/89           1.34                   118        C GGB CCB
    1/90           1.33                   119        FC G B FE F G FGC
    1/91           1.33                   120        E B CEF F FBBG B CCBC G
    1/92           1.26                   121        C E FCF F EF EB
    1/93           1.23                   122        BF GEG GNFCF
    1/94           0.66                   123        B EGN NFCGFNFC F
    1/95           0.38                   124        G G N F F E C B CC
    1/96           0.36                   125        EBE B EEB FGC FFB

    [0327] Analysis of synthetic promoter composition revealed that (i) synthetic promoter length varied between 7 and 31 TFREs (mean=11.9±4.2 blocks; 189±66 bp), although relative transcriptional activity was unrelated to promoter length; (ii) across the generation 1 library the relative abundance of the six TFREs was approximately equivalent; and (iii) individual TFREs could occur in either forward or reverse orientation [i.e. the consensus TF recognition sequence (see Table 1) could occur on either DNA strand] but this was not apparently related to synthetic promoter activity, either with respect to the general frequency of occurrence or with respect to the relative orientation of specific TFRE blocks.

    [0328] Thus, the inventors inferred that variation in synthetic promoter activity was a consequence of the differing relative abundance of specific TFREs within promoters and/or positional effects (i.e. that specific neighbouring or distal combinations of TFREs may affect promoter strength). Whilst the latter is computationally intractable given the size of the library, the inventors addressed the former by determination of the relative frequency with which individual TFREs occurred within synthetic promoters of varying activity. These data are shown in FIG. 3.

    [0329] Whilst no single TFRE exhibited an obviously dominant influence over synthetic promoter strength, individual TFREs were either relatively abundant in high transcriptional activity promoters (NFκB, E-box), equally distributed across promoters (C/EBP, GC-box) or relatively abundant in low activity promoters (E4F1, CRE). This bias was confirmed by multiple linear regression analysis, where either an all factor model (inclusion of all six TFREs, r.sup.2=0.57, p=1.7×10.sup.-14) or a parsimonious model excluding C/EBP and GC-box TFREs (as these do not improve model fit; r.sup.2=0.56, p=8.84×10.sup.-16) predicted the optimal stoichiometry of TFRE blocks to be NFκB 1.58: E-box 1.
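    One way the all factor model might be written is sketched below. The specification does not give the model equation, so the form is an assumption: the relative activity y_j of promoter j is taken to be regressed on the number of copies n_ij of each TFRE i in that promoter, and all symbols are illustrative.

```latex
y_j = \beta_0 + \sum_{i \in \{\mathrm{N},\,\mathrm{E},\,\mathrm{G},\,\mathrm{B},\,\mathrm{C},\,\mathrm{F}\}} \beta_i \, n_{ij} + \varepsilon_j
```

    Under the same assumption, the parsimonious model would drop the G (GC-box) and B (C/EBP) terms from the sum, leaving positive coefficients for N and E and negative coefficients for C and F.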

    [0330] The other TFREs were either neutral (C/EBP, GC-box) or negative effectors (E4F1, CRE). Analysis of specific promoter sequences throughout the library confirmed site designations as positive, neutral or negative. For example, the strongest promoter (1/01) contains the highest ratio of positive (NFκB, E-box): negative (E4F1, CRE) sites (9:1) in the library. Moreover, the three most active promoters (1/01-1/03) are the only promoters in the library containing more than 7 positive sites and fewer than 3 negative sites. There are also multiple examples where high numbers of positive sites are apparently counteracted by high numbers of negative sites to produce relatively weak promoters, for example promoters 1/37, 1/52 and 1/68 which have positive:negative ratios of 8:8, 8:8, and 9:11 respectively.
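    The positive:negative site counts discussed above can be reproduced from the block compositions in Table 3 with a short sketch such as the following. The function name `site_ratio` is hypothetical, block orientation is ignored, and the three compositions are copied from Table 3 for promoters 1/01, 1/37 and 1/68.

```python
POSITIVE = set("NE")  # NFkB-RE, E-box
NEGATIVE = set("CF")  # CRE, E4F1-RE


def site_ratio(composition):
    """Return (positive, negative) site counts; spaces are ignored."""
    blocks = composition.replace(" ", "")
    positive = sum(block in POSITIVE for block in blocks)
    negative = sum(block in NEGATIVE for block in blocks)
    return positive, negative


# Compositions copied from Table 3 (orientation marks not represented).
PROMOTERS = {
    "1/01": "NEENGBCE B NNEG N",
    "1/37": "NF EEFF BNBNFGF CC GN ECN",
    "1/68": "G G N F N CB E C EBFE FN GC GC NG CBCFN BEFB G",
}

for name, composition in PROMOTERS.items():
    print(name, site_ratio(composition))
```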

    [0331] Accordingly, this Example demonstrates that the synthetic promoters of the present invention have the potential to tailor the expression of a recombinant protein to a specific expression level depending on the promoter construct selected.

    Example 3

    Construction and Analysis of Generation 2 Synthetic Promoters

    [0332] Based on the above analysis of the Generation 1 synthetic promoters, the inventors sought to further improve synthetic promoter activity by creating a Generation 2 synthetic promoter library using random ligation of a mixture of TFREs at an optimal ratio derived from analysis of the composition of Generation 1 promoters.

    [0333] Four of the initial 7 TFREs identified (see FIG. 1) were utilised for construction of a Generation 2 library of synthetic promoter constructs at a stoichiometry quantitatively derived from their relative representation in active synthetic promoters in the Generation 1 synthetic promoter constructs. The stoichiometric ratios used were 5:3:1:1 (NFκB-RE : E-box : C/EBP-RE : GC-box).

    [0334] Specifically, of the initial 7 TFREs, the negative TFREs, E4F1 and CRE (see FIG. 3) were omitted (i.e. promoters which contained larger numbers of these TFREs were associated with lower reporter gene expression levels), whilst the neutral TFREs, C/EBP and GC-box were included based on the hypothesis that increased complexity could be advantageous. For example, the three most active synthetic promoters in the first generation library all contained at least two copies of both neutral TFREs (see Table 3) and thus they could contribute to unknown positional effects. The inventors expected that second generation promoters would contain the same average number of TFREs (12) as first generation promoters.

    [0335] The Generation 2 promoter constructs (see Table 4 below for sequences) were generated using the same construction method described above in Example 2.

    TABLE-US-00004 TABLE 4 Generation 2 Promoters: N = NFκB-RE, E = E-box, G = GC-box, B = C/EBP-RE. TFREs in the reverse orientation (i.e. 3′ to 5′ with respect to the SEAP reporter) are indicated by an apostrophe. All other TFREs are in the 5′ to 3′ orientation.

    Promoter                                     Relative Activity
    Name       Promoter Sequence                 (as a % of hCMV-IE1 activity)  SEQ ID NO:
    2/01       NGEE NE E NNNE ENE                216.66                         126
    2/02       EN N N EEE N E B N E NNE          175.94                         127
    2/03       EEN G EN E N N E N N NB           174.82                         128
    2/04       NENENN N EN                       169.04                         129
    2/05       N NGNB E E NBNE NN                166.6                          130
    2/06       N N NNE NG E N EB                 157.92                         131
    2/07       N N E NG N N N BN                 155.3                          132
    2/08       E E N B N N G NNN E               154.47                         133
    2/09       N N ENBNN E NGE E                 151.37                         134
    2/10       E NB N NE NBNNN G NNN             150.63                         135
    2/11       EN ENN N N NE N N N N BN E NE NE  150.17                         136
    2/12       N E NNB N NNE BE                  148.93                         137
    2/13       EBB EB NNE NN EEG NGNEBN          143.62                         138
    2/14       EN N NGNN G BN GN                 140.87                         139
    2/15       NG E N N NN E N                   140.87                         140
    2/16       NG ENN B NNG G                    140.77                         141
    2/17       E NG G E ENNNE NN G N B NE E NNN  138.95                         142
    2/18       N N NN G N NG E NBE               138.83                         143
    2/19       E N NB NNE GNNEB NE               137.6                          144
    2/20       EEN E E EN G E NE NG              137.53                         145
    2/21       NNE EN NGE N E                    132.61                         146
    2/22       NE G EN BE N EN                   117.77                         147
    2/23       N E N NGN NNGE                    117.22                         148
    2/24       B EN N EN G NE GNB N              113.38                         149
    2/25       NG NNNN N BN N                    108.35                         150
    2/26       BE G NG N EGE N G                 99.36                          151
    2/27       N NE N N EEB EG E                 98.83                          152
    2/28       EN N NN NN B                      97.9                           153
    2/29       B N G G E N E NEN BN              94.66                          154
    2/30       ENN NENBN N                       94.36                          155
    2/31       E G G N G EBN EN N                90.93                          156
    2/32       NNG EE NN G EE                    90.49                          157
    2/33       N EGNN EE B N                     87.38                          158
    2/34       BEG N NNGE                        82.11                          159
    2/35       N N GBG B NN NN B                 76.61                          160
    2/36       N EN GNNB EG                      75.11                          161
    2/37       G NNN NEEG G                      70.59                          162
    2/38       GEBNNE E N                        70.24                          163
    2/39       NNGE BE N                         63.19                          164
    2/40       G N NE GN N BG                    62.21                          165
    2/41       EB E EB N N E B                   57.26                          166
    2/42       N GN NEGB B                       41.65                          167
    2/43       GE B NG NGB N                     36.61                          168
    2/44       BEE EBE B BE N                    32.3                           169

    [0336] 50 transformed E. coli colonies were picked at random, synthetic promoters in purified plasmid DNA were sequenced and 44 reporter plasmids containing promoter sequences were utilized for measurement of SEAP reporter production. The relative transcriptional activity of second generation promoters is shown in FIG. 4.

    [0337] The Generation 2 synthetic promoters exhibited significantly increased activity compared to the Generation 1 promoters. In particular, the mean expression level (relative to CMV) shifted from 21.2% for first generation promoters to 116% for the second generation library. In fact, the results indicate that a large number of Generation 2 promoters have a higher activity than the top Generation 1 promoter. Twenty-five Generation 2 synthetic promoters (57% of the library) achieved a higher SEAP production than the CMV control, with the strongest promoter (2/01) exhibiting a 2.2-fold increase.
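    The summary figures quoted above can be cross-checked against the relative activities listed in Table 4. The following is a minimal sketch only: the helper name `summarize` and the variable names are hypothetical and are not part of the described method, and the CMV control is assumed to correspond to 100%.

```python
# Relative activities (% of hCMV-IE1 activity) for promoters 2/01 to 2/44,
# transcribed in order from Table 4.
ACTIVITY = [
    216.66, 175.94, 174.82, 169.04, 166.6, 157.92, 155.3, 154.47,
    151.37, 150.63, 150.17, 148.93, 143.62, 140.87, 140.87, 140.77,
    138.95, 138.83, 137.6, 137.53, 132.61, 117.77, 117.22, 113.38,
    108.35, 99.36, 98.83, 97.9, 94.66, 94.36, 90.93, 90.49,
    87.38, 82.11, 76.61, 75.11, 70.59, 70.24, 63.19, 62.21,
    57.26, 41.65, 36.61, 32.3,
]


def summarize(activity, control=100.0):
    """Summarise a library's activity relative to a control (assumed to be 100%)."""
    n = len(activity)
    above = sum(1 for value in activity if value > control)
    return {
        "n": n,
        "mean": sum(activity) / n,          # mean relative activity, in %
        "above_control": above,             # promoters stronger than the control
        "fraction_above": above / n,
        "max_fold": max(activity) / control,  # strongest promoter as fold of control
    }


summary = summarize(ACTIVITY)
for key, value in summary.items():
    print(key, round(value, 3))
```

    With the Table 4 values as transcribed, this gives a mean of roughly 116% and 25 of 44 promoters above the control, in line with the figures stated in the text.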

    [0338] FIG. 5 shows the results of an analysis of the relative abundance of each TFRE relative to the expression levels of the Generation 2 synthetic promoter constructs in which the TFREs are present.

    [0339] Analysis of the TFRE block composition of the second generation promoters revealed that the relative stoichiometry of TFREs across the library was approximately as designed (NFκB 5 : E-box 2.81 : GC-box 1.32 : C/EBP 1.15). As shown in FIG. 5, for second generation promoters the influence of GC-box and C/EBP is generally negative, whereas NFκB and E-box remain positive effectors.
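    One way to derive a library-wide stoichiometry of this kind is to count every block letter across the Table 4 sequences and rescale so that the NFκB-RE count equals 5. The sketch below is illustrative only: `library_stoichiometry` is a hypothetical helper, orientation markers are ignored, and the fragments printed after the SEQ ID NO for promoters 2/11 and 2/17 are assumed to be continuations of those sequences.

```python
from collections import Counter

# Block strings for promoters 2/01 to 2/44, transcribed from Table 4
# (N = NFkB-RE, E = E-box, G = GC-box, B = C/EBP-RE).
# Assumption: the trailing fragments of 2/11 and 2/17 are appended to their rows.
SEQUENCES = [
    "NGEE NE E NNNE ENE", "EN N N EEE N E B N E NNE",
    "EEN G EN E N N E N N NB", "NENENN N EN",
    "N NGNB E E NBNE NN", "N N NNE NG E N EB",
    "N N E NG N N N BN", "E E N B N N G NNN E",
    "N N ENBNN E NGE E", "E NB N NE NBNNN G NNN",
    "EN ENN N N NE N N N N BN E NE NE", "N E NNB N NNE BE",
    "EBB EB NNE NN EEG NGNEBN", "EN N NGNN G BN GN",
    "NG E N N NN E N", "NG ENN B NNG G",
    "E NG G E ENNNE NN G N B NE E NNN", "N N NN G N NG E NBE",
    "E N NB NNE GNNEB NE", "EEN E E EN G E NE NG",
    "NNE EN NGE N E", "NE G EN BE N EN",
    "N E N NGN NNGE", "B EN N EN G NE GNB N",
    "NG NNNN N BN N", "BE G NG N EGE N G",
    "N NE N N EEB EG E", "EN N NN NN B",
    "B N G G E N E NEN BN", "ENN NENBN N",
    "E G G N G EBN EN N", "NNG EE NN G EE",
    "N EGNN EE B N", "BEG N NNGE",
    "N N GBG B NN NN B", "N EN GNNB EG",
    "G NNN NEEG G", "GEBNNE E N",
    "NNGE BE N", "G N NE GN N BG",
    "EB E EB N N E B", "N GN NEGB B",
    "GE B NG NGB N", "BEE EBE B BE N",
]


def library_stoichiometry(sequences, reference="N", scale=5.0):
    """Count each block type and rescale so that `reference` equals `scale`."""
    counts = Counter(ch for seq in sequences for ch in seq if ch in "NEGB")
    factor = scale / counts[reference]
    ratios = {block: round(counts[block] * factor, 2) for block in "NEGB"}
    return counts, ratios


counts, ratios = library_stoichiometry(SEQUENCES)
print(dict(counts))
print(ratios)
print(round(sum(counts.values()) / len(SEQUENCES), 1), "blocks per promoter on average")
```

    The printed ratios can be compared directly with the stoichiometry reported above; small differences may arise from how the flattened table is transcribed.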

    [0340] However, considering the composition of second generation promoters (see sequences in Table 4), the results suggest that neither NFκB nor E-box TFRE blocks could support high transcriptional activity alone; a combination of both is necessary.

    [0341] The most powerful promoters (2/01-2/03) contain relatively high numbers of both TFREs in approximately equal proportion, with a correspondingly low number of negative GC-box and C/EBP blocks. Some lower activity promoters do contain relatively large numbers of NFκB or E-box blocks (e.g. 2/11, 2/13, 2/17) but (i) contain a sub-optimal ratio of NFκB:E-box (2/11, 2/17) or (ii) also contain relatively large numbers of GC-box and C/EBP blocks (2/13).
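    The per-promoter comparison described above can be illustrated by counting blocks in the six promoters named. This is a sketch under stated assumptions: `composition` is a hypothetical helper, block orientation is ignored, and the sequences for 2/11 and 2/17 include the fragments printed after their SEQ ID NO in Table 4.

```python
# Block strings transcribed from Table 4 for the promoters discussed in the text.
PROMOTERS = {
    "2/01": "NGEE NE E NNNE ENE",
    "2/02": "EN N N EEE N E B N E NNE",
    "2/03": "EEN G EN E N N E N N NB",
    "2/11": "EN ENN N N NE N N N N BN E NE NE",  # assumed to include wrapped fragment
    "2/13": "EBB EB NNE NN EEG NGNEBN",
    "2/17": "E NG G E ENNNE NN G N B NE E NNN",  # assumed to include wrapped fragment
}


def composition(sequence):
    """Return NFkB-RE and E-box counts, their ratio, and the combined GC-box/C/EBP count."""
    n = sequence.count("N")
    e = sequence.count("E")
    gb = sequence.count("G") + sequence.count("B")
    return {"N": n, "E": e, "N:E": round(n / e, 2), "G+B": gb}


for name, sequence in PROMOTERS.items():
    print(name, composition(sequence))
```

    On these transcriptions, 2/01 to 2/03 have N:E ratios near 1 with at most two GC-box or C/EBP blocks, 2/11 and 2/17 have roughly twice as many NFκB-RE as E-box blocks, and 2/13 carries six GC-box or C/EBP blocks, which is consistent with the interpretation given in the text.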

    [0342] Therefore, this Example demonstrates that the Generation 2 promoters are able to successfully extend the possible transcriptional activity range that can be achieved in CHO cells using Generation 1 promoter constructs.

    Example 4

    Analysis of Generation 1 and 2 Promoter Constructs in Different CHO Host Cell Types

    [0343] In order to determine if the synthetic promoters of the present invention performed robustly and predictably, the inventors evaluated their relative functional capability in different CHO host lines.

    [0344] A panel of seven promoters covering a broad range of promoter activity was selected from the first and second generation libraries (i.e. 1/51<1/17<1/04<1/02<2/19<2/03<2/01). These were compared to the activity of CMV.

    [0345] FIG. 6 shows transient SEAP production from all promoters in three commonly utilized host lines: CHO-S, CHO-DG44 and CHO-K1. The relative rank order of promoter activity is maintained in all three cell lines, with the exception that 2/03 outperforms 2/01 in CHO-K1; in contrast to the original screen (see Example 3), promoters 2/03 and 2/01 have approximately equivalent expression in CHO-DG44 and CHO-S.

    [0346] In each cell line, the top performing synthetic promoter drives significantly higher SEAP production than CMV: 3.1-fold, 1.9-fold and 1.7-fold in CHO-DG44, CHO-S and CHO-K1 cells, respectively. In general, CHO-DG44 cells exhibited significantly less reporter production than either CHO-S or CHO-K1 cells, presumably due to their reduced transfectability by lipofection.

    [0347] Nonetheless, this Example indicates that the rank order of the promoter constructs is generally preserved across different CHO cell types, and suggests that the promoter constructs will also be effective in CHO cell types other than the three tested. It further suggests that the synthetic promoters are likely to be functional in a range of transformed mammalian cell types and may have use in cancer-targeted gene therapeutic applications.

    Example 5

    Analysis of Generation 1 and 2 Promoter Constructs in Longer Term Transient Transfections

    [0348] Next, to determine synthetic promoter functionality in an industrially relevant production process, the same panel of promoters was evaluated in a fed-batch SEAP production process over a longer term transient transfection (7 days), utilizing CHO-S host cells.

    [0349] A transient system was employed (rather than stable) to ensure production variability was directly linked to differences in promoter activity rather than cell line specific, site-specific integration or promoter silencing (e.g. methylation, deletion) artefacts.

    [0350] Two hours prior to transfection, 6×10.sup.6 cells from a mid-exponential phase CHO-S culture were seeded into 50 mL CultiFlask bioreactors (Sartorius, Surrey, UK) at a working volume of 6 mL. Cells were transfected with DNA:lipofectamine complexes, prepared according to the manufacturer's instructions. Fed-batch cultures were maintained for seven days by nutrient supplementation with 10% v/v CHO CD Efficient Feed A (Life Technologies) on days 2, 4 and 6. SEAP expression and cell growth were measured at 24 h intervals.

    [0351] The results of the SEAP expression assays are shown in FIG. 7. As can be seen, the relative order of activity is mostly preserved compared to the short term transient expression results (i.e. the static microplate experiments from Example 4, see FIG. 6), indicating that robust and reproducible expression levels can be achieved long term using the synthetic promoter constructs of the present invention.

    In addition, the highest SEAP titer, driven by promoter 2/03, was over 1.65-fold that obtained by CMV-mediated expression.

    [0352] The IVCD results further show that cell viability is not significantly affected when the CHO cells are transfected with the promoter constructs of the present invention. This is important because it shows that the promoter constructs have the potential to be used in host cells without adversely affecting normal cellular function.