Site-specific integration
11326185 · 2022-05-10
Assignee
Inventors
- James Rance (Thame, GB)
- Robert Young (London, GB)
- Michael J. Agostino (Andover, MA)
- Mark Moffat (St. Louis, MO, US)
- Lin Zhang (Boxford, MA, US)
- Baohong Zhang (Madison, CT, US)
Cpc classification
C12N2800/22
CHEMISTRY; METALLURGY
C12N2800/30
CHEMISTRY; METALLURGY
C07K2317/14
CHEMISTRY; METALLURGY
C12N15/90
CHEMISTRY; METALLURGY
C07K16/00
CHEMISTRY; METALLURGY
C07K2317/24
CHEMISTRY; METALLURGY
International classification
C12N15/10
CHEMISTRY; METALLURGY
C07K16/00
CHEMISTRY; METALLURGY
Abstract
The present invention relates to stable and high-producing site-specific integration (SSI) host cells, e.g. Chinese hamster ovary (CHO)-derived host cells, and to methods to produce and to use them.
Claims
1. A site-specific integration (SSI) host cell comprising: an endogenous Fer1L4 gene; and an exogenous nucleotide sequence integrated in said Fer1L4 gene, the exogenous nucleotide sequence comprising at least two recombination target sites, wherein the exogenous nucleotide sequence is integrated in a region spanning and including exon 28 to exon 40 of the endogenous Fer1L4 gene.
2. The SSI host cell of claim 1, wherein the exogenous nucleotide sequence comprises a gene coding sequence of interest.
3. The SSI host cell of claim 2, wherein the at least two recombination target sites flank the gene coding sequence of interest, wherein an integration site of one of the recombination target sites that flank the gene coding sequence of interest is located between exon 39 and 40 of the endogenous Fer1L4 gene and an integration site of the other recombination target site that flanks the gene coding sequence of interest is located between exon 28 and 29 of the endogenous Fer1L4 gene.
4. The SSI host cell of claim 2, wherein the gene coding sequence of interest comprises one or more of a gene encoding a selection marker, a detectable protein, an antibody, a peptide antigen, an enzyme, a hormone, a growth factor, a receptor, a fusion protein or other biologically active protein.
5. The SSI host cell of claim 1, wherein the recombination target site is a FRT site or a lox site.
6. The SSI host cell of claim 4, wherein the selection marker is a glutamine synthase selection marker, a hygromycin selection marker, a puromycin selection marker or a thymidine kinase selection marker.
7. The SSI host cell of claim 1, wherein the host cell is a mouse cell, a human cell or a CHO host cell, a CHOK1 host cell or a CHOK1SV host cell.
8. The SSI host cell of claim 3, wherein the nucleotide sequence of the Fer1L4 gene flanking the integrated exogenous nucleotide sequence is selected from the group consisting of SEQ ID No. 7, 8, 9 and homologous sequences thereof.
9. The SSI host cell of claim 1, wherein the exogenous nucleotide sequence replaces a portion of the Fer1L4 gene.
10. The SSI host cell of claim 5, wherein the host cell comprises a FRT site which is a wild type FRT site or a mutant FRT site.
11. The SSI host cell of claim 1, wherein the recombination target sites comprise at least one wild type FRT site and at least one mutant FRT site.
Description
(1) The figures show:
(12) The Fer1L4 gene is depicted as an arrow whilst the genomic fragment is represented as a solid line. The interrupting dashed line represents a very large distance. 5′ and 3′ flanking sequences are represented by white and black pointers, respectively.
EXAMPLES
Example 1
(15) A) Materials and Methods
(16) 1. Vector Construction
(17) All vector sequences were synthesized and fully sequenced. Puromycin acetyl transferase (PAC), hygromycin phosphotransferase (Hyg) and mAb genes were all gene-optimized and adapted to the codon bias of Cricetulus griseus prior to gene synthesis. The majority of pRY17 (
(18) 2. Batch and Fed-Batch Shake Flask Analysis
(19) For the batch shake flask analysis, cells were seeded at 3×10.sup.5 viable cells/mL in 125 mL shake flasks in 30 mL of CD CHO supplemented with various selective agents (as described later) and incubated at 37° C. in a humidified 5% CO.sub.2 in air (v/v) orbital shaking incubator at 140 rpm. Conditioned medium was harvested at day 7 of the culture and the antibody concentration in the conditioned medium was determined by Protein A HPLC.
(20) For fed-batch shake flask analysis, cells were seeded at 3×10.sup.5 cells/mL in 500 mL shake flasks, each containing 100 mL of proprietary medium, and incubated at 37° C. in a humidified 5% CO.sub.2 in air (v/v) orbital shaking incubator at 140 rpm. Cells were fed starting on day 3 of the culture with a proprietary feed consisting of a mixture of amino acids and trace elements. Daily viabilities and viable cell concentrations were determined using a Vi-CELL™ automated cell viability analyzer. Antibody concentration in the medium was determined by Protein A HPLC starting on day 6 of the culture through to its harvest on day 14.
(21) 3. Stability Analysis
(22) Cells were sub-cultured alternately every 3 and 4 days in 125 mL shake flasks in 30 mL CD CHO supplemented with different selection agents (as described later). At different generation numbers (1 generation is equivalent to 1 population doubling), duplicate fed-batch shake flasks were set up as described above. Cell concentration, viability and mAb concentration measurements were collected as described above. A cell line is considered unstable if its productivity changes by more than 30% within 70 generations.
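The stability criterion above can be illustrated with a short sketch. It assumes that "changes by >30% within 70 generations" means comparing every titer measured within 70 generations of the first measurement against that first measurement; the text does not spell out the comparison, so this reading, the function names and the default parameters are illustrative assumptions. The example values are the fed-batch titers reported for cell line 11A7 in Table 3.

```python
from typing import Dict


def relative_change(baseline: float, value: float) -> float:
    """Fractional change of value relative to baseline (non-negative)."""
    return abs(value - baseline) / baseline


def is_stable(titers: Dict[int, float], window: int = 70,
              threshold: float = 0.30) -> bool:
    """Return True if no titer measured within `window` generations of the
    first measurement differs from it by more than `threshold`.

    Assumption: the first measurement is the baseline (not stated in the text).
    """
    generations = sorted(titers)
    start = generations[0]
    baseline = titers[start]
    for generation in generations:
        within_window = generation - start <= window
        if within_window and relative_change(baseline, titers[generation]) > threshold:
            return False
    return True


# Fed-batch titers (mg/L) for 11A7 at generations 10, 40 and 80 (Table 3).
titers_11a7 = {10: 2946.00, 40: 2776.82, 80: 2546.47}
print(is_stable(titers_11a7))
```

Measurements taken outside the 70-generation window are ignored by this sketch; a longer study would need the window to be applied from more than one starting point.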
(23) 4. Flow Cytometry for Single-Cell Cloning
(24) Single-cell cloning was performed on a FACS Aria II cell sorter equipped with FACSDiva v6.0 software with an air-cooled laser emitting at 488 nm. Dead cells were excluded in a FSC vs. SSC dot plot, and the doublets were excluded in a FSC width vs. area dot plot. The sorting gate for the live cells was a combination of the two dot plots.
(25) 5. Generation of SSI Host Cells
(26) Transfection of the parental pRY17-expressing cell line with the null vector, pRY37, comprising the second gene coding sequence of interest, namely the selection marker gene, was conducted using the FreeStyle™ MAX CHO system (Invitrogen). To this end, 24 h before the re-transfection for RMCE, a selected pRY17-transfected cell line (11A7) was first seeded in FreeStyle™ CHO Expression Medium at 5×10.sup.5 cells/mL in a 125 mL shake flask. On the day of transfection, approximately 3×10.sup.7 cells at a concentration of 1×10.sup.6 cells/mL were co-transfected with 33.75 μg of pOG44 plasmid (Invitrogen, gb:X52327) and 3.75 μg of pRY37 (9:1) with FreeStyle™ MAX reagent in a 125 mL shake flask according to the manufacturer's instructions. Post-transfection, cells were plated into 48-well plates containing proprietary chemical-defined medium supplemented with 25 μM MSX and 7 μg/mL puromycin. Three weeks post-plating, medium from each well containing viable cells was screened for antibody production on a ForteBio instrument using a Protein A biosensor. Cell lines with no detectable antibody in the medium were advanced to 125 mL shake flasks containing CD CHO medium supplemented with 25 μM MSX and 1 μg/mL puromycin.
(27) 6. Generation of Cell Lines Derived from the 10E9 Host
(28) RMCE experiments are divided into 3 distinct ‘rounds’, and are referred to both here and later in the results section below (rounds are defined in
(29) In rounds 1 and 2, transfection of the SSI host cell 10E9 with the SSI targeting vector, pRY21, was conducted using the FreeStyle™ MAX CHO system (Invitrogen). To this end, 24 h before the re-transfection for RMCE, 10E9 SSI host cells were first seeded at a concentration of 5×10.sup.5 cells/mL in a 125 mL shake flask containing FreeStyle™ CHO Expression Medium (Invitrogen) supplemented with 25 μM MSX and 1 μg/mL puromycin. On the day of transfection, approximately 3×10.sup.7 cells at a concentration of 1×10.sup.6 cells/mL were co-transfected with 33.75 μg of pOG44 plasmid (Invitrogen, gb:X52327) and 3.75 μg of pRY21 (9:1) with FreeStyle™ MAX reagent in a 125 mL shake flask according to the manufacturer's instructions. Post-transfection, cells were recovered in a 125 mL shake flask containing 30 mL of FreeStyle™ CHO Expression Medium (Invitrogen) supplemented with 25 μM MSX. At 48 h post-RMCE, transfectants from round 1 were plated onto 48-well plates containing proprietary chemical-defined medium supplemented with 25 μM MSX, 400 μg/mL hygromycin (positive selection) and 3 μM ganciclovir (negative selection). Four weeks later, the concentration of mAb in medium from wells containing visible foci was determined on a ForteBio Octet using a Protein A biosensor. Cells secreting mAb into medium were expanded and maintained in shake flasks containing CD CHO medium supplemented with 200 μg/mL hygromycin and 25 μM MSX. These cell pools were further evaluated for antibody productivity in batch shake flask analysis (as described earlier). For stability analysis, MSX was removed from the sub-culture for the condition specified in
(30) After recovery for 48 h in shake flasks, round 2 transfectants were seeded at a concentration of 5×10.sup.5 cells/mL in a 125 mL shake flask containing CD CHO medium supplemented with 400 μg/mL hygromycin (positive selection). This was followed by the addition of 3 μM ganciclovir 5 days later as the negative selection. Cells were passaged continuously every 3-4 days in the same medium for 3 weeks in the same shake flask. Surviving cells were single-cell cloned using a FACS Aria II into 96-well plates containing proprietary chemical-defined medium supplemented with 400 μg/mL hygromycin and 3 μM ganciclovir. Three weeks later, the mAb concentration in medium from wells with visible cell growth was determined on a ForteBio Octet using a Protein A biosensor. Clones secreting mAb into the culture medium were expanded and maintained in shake flasks containing CD CHO supplemented with 200 μg/mL hygromycin and 25 μM MSX. These clones were further evaluated for antibody productivity in batch shake flask analysis (as described earlier). For the stability analysis, MSX was removed from the sub-culture for the conditions as specified in the
(31) For round 3 transfections, 1×10.sup.7 10E9 SSI host cells were co-transfected by electroporation with 45 μg of pOG44 plasmid (Invitrogen, gb:X52327) and 5 μg of pRY21 at 900 μF, 300 V. Post-transfection, the cells were seeded into a T-75 flask containing 20 mL proprietary chemical-defined medium. After 48 h, either 200 or 400 μg/mL hygromycin was added to the medium, followed by the addition of 3 μM ganciclovir 6 days later. In some cases (as described in the
(32) 7. Data Analysis
(33) The time-integral of the area under the growth curve (the time-integral of the viable cell concentration (IVC); 10.sup.6 cells day/mL) was calculated using the method described by Renard et al. (Renard et al. 1988, Biotechnology Letters 10:91-96)
(34)
where Xv0=viable cell concentration at first sample (10.sup.6/mL), Xv1=viable cell concentration at second sample (10.sup.6/mL), t0=elapsed time at first sample (day), t1=elapsed time at second sample (day).
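The equation itself is not reproduced in this text. As a sketch only, the quantities defined above (Xv0, Xv1, t0, t1) are the inputs one would need for a trapezoidal estimate of the area under the growth curve between two consecutive samples. The code below assumes that trapezoidal form; it may differ from the exact expression given by Renard et al., and the function names and sample values are hypothetical.

```python
from typing import Sequence


def ivc_increment(xv0: float, xv1: float, t0: float, t1: float) -> float:
    """Trapezoidal area between two consecutive samples.

    xv0, xv1: viable cell concentrations (10^6 cells/mL) at the two samples.
    t0, t1:   elapsed times (days) at the two samples.
    Returns the increment in 10^6 cells day/mL.
    """
    return (xv0 + xv1) / 2.0 * (t1 - t0)


def cumulative_ivc(times: Sequence[float], densities: Sequence[float]) -> float:
    """Sum the trapezoidal increments over a whole growth curve."""
    if len(times) != len(densities):
        raise ValueError("times and densities must have the same length")
    total = 0.0
    for i in range(len(times) - 1):
        total += ivc_increment(densities[i], densities[i + 1],
                               times[i], times[i + 1])
    return total


# Hypothetical growth curve: days 0, 1, 2 with 0.3, 0.6 and 1.2 x 10^6 cells/mL.
print(cumulative_ivc([0.0, 1.0, 2.0], [0.3, 0.6, 1.2]))
```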
8. DNA Walking
(35) Seegene DNA Walking SpeedUp™ Kit II was used according to the manufacturer's instructions to provide 3′ genome flanking sequence data. Beta-lactamase (bla) gene-specific primers, specific for bla in the 3′ arm of the schematic of linearly-integrated pRY17 vector (bla R,
(36) 9. Genome Sequencing of the 10E9 Host
(37) 10E9 genomic DNA was fragmented and a paired-end library suitable for HiSeq platform sequencing was prepared using the TruSeq DNA Sample Preparation kit, following the manufacturer's instructions. The library generated was within the expected size range of 300 bp to 500 bp. QC analysis of the generated library using an Agilent 2100 Bioanalyzer indicated that the library was of acceptable quality, containing the expected fragment size and yield, for continued sample processing. The library generated was used in the cBot System for cluster generation, following the manufacturer's instructions. The flow cells containing amplified clusters were sequenced using 2×100 base pair paired-end sequencing on a HiSeq 2000. The reads were mapped to CHO-K1 contigs (Xu X et al. 2011, Nature Biotechnol. 29:735-742) using the Burrows-Wheeler Aligner (BWA) (Li H. and Durbin R. 2009, Bioinformatics 25:1754-1760).
(38) B) Results
(39) Phase I: Generation of the Parental mAb-Expressing Cell Lines
(40) The aim of phase I was to generate a high-producing mAb-expressing GS-CHOK1SV cell line, exhibiting favorable growth characteristics with stable productivity, containing only a single integration locus and the lowest possible number of vector integrants at this locus. A modified Lonza GS ‘double gene vector’, pRY17 (
(41) Importantly, the vector-derived GS gene was placed outside of the exchangeable cassette so that it would be retained in the genome of resulting cell lines after RMCE. By doing so, any potential perturbation of glutamine metabolism in any derivative cell line was avoided; the parental GS-CHOK1SV cell lines were selected in glutamine-free medium and 50 μM methionine sulfoximine (MSX), in the presence of both endogenous and vector-derived glutamine synthetase expression. A promoterless and translation initiation methionine-deficient (−ATG) hygromycin B phosphotransferase gene was placed between the sequence encoding the linker (Lnk) and the F recombination sequence (
(42) The pRY17 vector containing FRT recombination sequences was introduced into CHOK1SV cells by a conventional cell line development procedure, followed by an intensive screen conducted at key stages of the process to ensure that cell lines with the best combination of growth and productivity were isolated. Additionally, cB72.3 protein derived from the chosen cell lines had to exhibit similar product quality characteristics to a preparation derived from a previous GS-CHOK1SV cell line (Birch and Racher, 2006, Adv Drug Deliv Rev 58:671-685). The HC and LC gene copy numbers from candidate cell lines are preferred to be close to one for each. To this end, three independent electroporations were conducted, each with 50 μg of linearized pRY17 and 1×10.sup.7 CHOK1SV cells. Transfectants from all three electroporations were selected in medium containing 50 μM MSX and 1500 surviving cell pools were screened for antibody production at 3 weeks post-transfection. Eventually a total of 79 clones were evaluated in 7-day batch shake flask culture and the cB72.3 mAb concentration of all 79 clones was determined. All RMCE derivative cell lines were maintained in medium containing 25 μM MSX, except where stated. From the 79 clones evaluated, 38 were selected for further analysis in fed-batch shake flask culture. The top 6 best-performing clones based on productivity and growth characteristics were selected (table 1) for genetic characterization (see methods).
(43) TABLE-US-00001 TABLE 1 Growth and antibody production of the top 6 clonal cell lines from Phase 1 in batch and fed-batch shake flask (n = 2).

| Cell Line | Batch: Product Concentration (mg/L), day 7 | Fed-Batch: Estimate of IVCC (10.sup.6 cells day/mL) | Fed-Batch: Product Concentration (mg/L), day 14 | Fed-Batch: Specific Production Rate (pg/cell · day) | Fed-Batch: Peak Cell Density (10.sup.6 cells/mL) |
|---|---|---|---|---|---|
| 1G11 | 637.73 | 73.55 | 3385 | 47.3 | 9.00 |
| 6B5 | 273.73 | 97.89 | 2029 | 21.2 | 13.40 |
| 8F10 | 427.41 | 125.46 | 2976 | 25.0 | 15.32 |
| 11A7 | 457.88 | 149.6 | 3570 | 25.1 | 19.13 |
| 14D11 | 378.83 | 86.32 | 3251 | 39.2 | 10.33 |
| 18C11 | 480.27 | 90.45 | 2953 | 35.0 | 11.05 |
(44) In order to investigate the integration site of pRY17 in the CHOK1SV genome, metaphase chromosomes from the 6 clones were prepared and probed with DIG-labeled pRY17 (data not shown). Clones 1G11, 6B5, 8F10, 14D11 and 11A7 all appear to have only one integration locus at the telomeric region of an individual chromosome. Clone 18C11, on the other hand, seems to have two distinct integration sites and therefore was not selected for further study. To determine the gene copy numbers in each of the cell lines, sonicated genomic DNA was prepared from actively growing cells. For the qPCR analysis, GAPDH was included as the endogenous control and pRY17 was ‘spiked’ into host cell DNA as the positive control. The gene copy numbers per cell for both the HC and LC were calculated as the ratio of averaged copies to averaged GAPDH copies. As shown in table 2, out of the 5 clones analyzed, 11A7 has the lowest HC and LC gene copy numbers. Southern blot analysis of the genomic DNA revealed that both HC and LC can be detected in all 5 clones and the intensity of bands reflects the qPCR-determined gene copy numbers (data not shown).
(45) TABLE-US-00002 TABLE 2 Gene copy analysis of the cB72.3 HC and LC in 5 of the 6 clonal cell lines using qPCR.

| Cell Line | HC average copies/cell | LC average copies/cell |
|---|---|---|
| 1G11 | 15.58 | 13.95 |
| 6B5 | 44.46 | 47.91 |
| 8F10 | 40.46 | 38.41 |
| 11A7 | 6.57 | 3.57 |
| 14D11 | 9.75 | 9.31 |
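The copy-number calculation described for the qPCR analysis (ratio of averaged target copies to averaged GAPDH copies) can be sketched as follows. The function name and the replicate values are hypothetical and are not taken from the study; the sketch simply takes the ratio at face value as copies per cell, as the text does.

```python
from statistics import mean
from typing import Sequence


def copies_per_cell(target_copies: Sequence[float],
                    gapdh_copies: Sequence[float]) -> float:
    """Ratio of averaged target (HC or LC) copies to averaged GAPDH copies."""
    if not target_copies or not gapdh_copies:
        raise ValueError("replicate lists must not be empty")
    return mean(target_copies) / mean(gapdh_copies)


# Hypothetical replicate qPCR copy estimates for one clone.
hc_replicates = [6.0, 7.0]
gapdh_replicates = [1.0, 1.0]
print(copies_per_cell(hc_replicates, gapdh_replicates))
```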
(46) Out of the 5 clones that entered the present stability study (table 3), 11A7 maintained similar productivity over 7 months (220 generations), whereas other clones showed gradual productivity loss during the first 3 months of the study (80 generations). Taken together, 11A7 not only has one of the best combinations of good growth and productivity profiles, but also has the lowest gene copy number with a single integration site. Importantly, 11A7 is the most stable of the clones tested in terms of productivity. Most importantly, product quality was comparable after 220 generations. 11A7 was therefore chosen as the parental clone for the first round of RMCE in Phase II.
(47) TABLE-US-00003 TABLE 3 Stability analyses of the top 5 SSI clonal cell lines. The top 5 SSI clonal cell lines were continuously cultured in shake flasks for various numbers of generations in the presence of MSX. At different generation numbers as indicated, all 5 SSI clonal cell lines were analysed by fed-batch shake flask for antibody production. (n = 2). Antibody production (mg/L) at given generation number:

| Cell line | 10 | 40 | 80 | 100 | 120 | 140 | 160 | 180 | 200 | 220 |
|---|---|---|---|---|---|---|---|---|---|---|
| 1G11 | 2809.00 | 2977.00 | 2675.86 | 2553.18 | 2464.09 | 2047.00 | - | - | - | - |
| 11A7 | 2946.00 | 2776.82 | 2546.47 | 2464.99 | 2395.90 | 2421.00 | 2425.00 | 2498.00 | 2496.00 | 2676.00 |
| 6B5 | 1548.00 | 1791.94 | 1580.33 | - | - | - | - | - | - | - |
| 8F10 | 2433.00 | 2593.00 | 2760.20 | - | - | - | - | - | - | - |
| 14D11 | 2629.00 | 2521.08 | 2358.44 | - | - | - | - | - | - | - |
Phase II—Generation of SSI Host Cells
(48) Although it is entirely possible to design a targeting vector that could swap the original mAb transcription units in 11A7 for those of a new mAb, it is preferred that the original was completely excised from the genome. To do this an additional null targeting vector, pRY37 (
(49) Of the surviving cell line pools that were negative for mAb expression, one cell line, 136-A4, was chosen for further characterization by Southern blot analysis (data not shown). This confirmed the presence of TK in the 136-A4 genome. Restriction mapping indicated the presence of only two copies of pRY37 in the “hot-spot”, which was confirmed by subsequent genome sequencing of the daughter clone 10E9. The copy number is substantially lower than that of pRY17 found in 11A7 (table 2). To obtain a homogeneous SSI host, 136-A4 was single-cell cloned using a FACS Aria II and growth profiles of 26 clonal derivatives were obtained. Out of these, two clonal derivatives with the best growth profiles, 10E9 and 8C8, were selected for further characterization by northern blot analysis. Northern blot analysis of RNA from these daughter clones confirmed the absence of cB72.3 HC and LC mRNAs (data not shown). Taken together, these results show that 10E9 is a suitable candidate host cell line for RMCE for testing in phase III.
(50) Phase III—RMCE with Myo mAb Targeting Vector
(51) In order to demonstrate the utility of the new SSI host cell line 10E9, a targeting vector, pRY21 was designed (
(52) The productivity data obtained from different pools in round 1 is similar, suggesting that cell line members of each pool are likely to have similar productivities. The range of productivities from the pools is much narrower than that of either clonal or non-clonal cell lines from a random integration process (
(53) Initially, in phase I, a very stable GS-CHOK1SV cell line, 11A7, was isolated that is stable up to 220 generations. This stability trait could be inherited in derivative cell lines generated by RMCE in phase III. Accordingly, 3 cell pools from round 1 of phase III were evaluated in an extended stability study under two different conditions (Table 4).
(54) TABLE-US-00004 TABLE 4 Round 1: Three selected cell lines were expanded into shake flasks and cultured continuously at two different conditions; +hygromycin/+MSX (+/+) or −MSX/+hygromycin (−/+). At various time points, duplicate fed-batch shake flask cultures were setup from the continuous cultures of all 3 SSI cell lines in both conditions and the concentration of Myo mAb in medium was determined after 14 days. Antibody concentration (mg/L) in fed-batch cultures of cell lines:

| Generation Number | 70C2 (+)/(+) | 70C2 (−)/(+) | 72A3 (+)/(+) | 72A3 (−)/(+) | 74B5 (+)/(+) | 74B5 (−)/(+) |
|---|---|---|---|---|---|---|
| 10 | 1240 | 1140 | 1510 | 1430 | 1310 | 1170 |
| 40 | 1070 | 1140 | 1400 | 1250 | 1370 | 1190 |
| 70 | 1100 | 1170 | 1400 | 1200 | 1370 | 1240 |
| 100 | 1030 | 1270 | 1700 | 1490 | 1700 | 1400 |
(55) Regardless of the conditions, all three pools tested met the criteria for a stable cell line. Further, a total of 12 clonal cell lines, 6 each from rounds 2 and 3, were evaluated in the same type of stability study (tables 5 and 6, respectively). It was found that all 12 clonal cell lines retained the stability trait under selection. Interestingly, the 6 clonal cell lines from round 3 (table 6) were stable even without the presence of any selective agents. This has profound implications for manufacture of biopharmaceuticals.
(56) TABLE-US-00005 TABLE 5 Round 2: Six single-cell clones were continuously cultured at two different conditions; +hygromycin/+MSX (+/+) or −MSX/−hygromycin (−/−) and were periodically analyzed by fed-batch culture as described for Round 1 (table 4). Antibody concentration (mg/L) in fed-batch cultures of cell clones:

| Generation Number | 11F6 (+)/(+) | 11F6 (−)/(−) | 11F11 (+)/(+) | 11F11 (−)/(−) | 12C3 (+)/(+) | 12C3 (−)/(−) | 13F7 (+)/(+) | 13F7 (−)/(−) | 13G8 (+)/(+) | 13G8 (−)/(−) | 15E9 (+)/(+) | 15E9 (−)/(−) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 10 | 1470 | 1390 | 1920 | 1860 | 1480 | 1360 | 1960 | 1870 | 2020 | 2020 | 2150 | 2100 |
| 40 | 1300 | 1420 | 1870 | 1680 | 1770 | 1590 | 1850 | 1860 | 1940 | 2010 | 2020 | 2100 |
| 70 | 1350 | 1350 | 1870 | 1650 | 1770 | 1600 | 1850 | 1760 | 1840 | 1890 | 2060 | 2060 |
| 100 | 1200 | 1580 | 2180 | 1930 | 2160 | 1790 | 2094 | 1950 | 1970 | 2020 | 2500 | 2360 |
(57) TABLE-US-00006 TABLE 6 Round 3: Six single-cell clones were continuously cultured at two different conditions; −MSX/+hygromycin (−/+) or −MSX/−hygromycin (−/−) and were periodically analysed by fed-batch culture as described for Round 1 (table 4). Antibody concentration (mg/L) in fed-batch cultures of cell clones:

| Generation Number | 2H4 (+)/(+) | 2H4 (−)/(−) | 4H7 (+)/(+) | 4H7 (−)/(−) | A5B8 (+)/(+) | A5B8 (−)/(−) | A7D5 (+)/(+) | A7D5 (−)/(−) | A8A10 (+)/(+) | A8A10 (−)/(−) | A8G1 (+)/(+) | A8G1 (−)/(−) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 10 | 2400.0 | 2400.0 | 2200.0 | 2400.0 | 1900.0 | 2100.0 | 2200.0 | 2300.0 | 1300.0 | 1400.0 | 1400.0 | 1500.0 |
| 40 | 2460.0 | 2440.0 | 2330.0 | 2240.0 | 2100.0 | 2070.0 | 2320.0 | 2320.0 | 1580.0 | 1570.0 | 1490.0 | 1500.0 |
| 70 | 2080.0 | 2150.0 | 1980.0 | 2070.0 | 1910.0 | 1930.0 | 1990.0 | 2120.0 | 1490.0 | 1610.0 | 1360.0 | 1380.0 |
| 100 | 2389.0 | 2359.0 | 2655.0 | 2510.0 | 2215.5 | 2276.5 | 2343.0 | 2536.0 | 1760.0 | 2009.0 | 1637.5 | 1656.5 |
| 130 | 2477.5 | 1937.5 | 3401.0 | 2505.0 | 2392.0 | 1970.0 | 2496.0 | 2311.0 | 1760.0 | 1870.0 | 1552.0 | 1029.0 |
(58) Characterization of genomic sequences flanking the “hot-spot”
(59) 3′ Flanking Sequence
(60) A 500 bp 3′ flanking sequence (SEQ ID No. 7) derived from Seegene DNA walking from bla R (
(61) The 3′ flanking sequence, identified by Seegene DNA walking (see methods), was used to BLAST-search the contigs of a CHO-K1 genome sequencing project (Xu X et al. 2011, Nature Biotechnol. 29:735-742), publicly available in the NCBI databank. Using this data and Illumina HiSeq genome sequence data obtained from the 10E9 SSI host cell line, a unique region located on an unplaced genomic scaffold, scaffold1492 (accession number JH000254.1, identical to NW_003613833.1), was found. This region was found to be located within a predicted Fer1L4 (fer-1-like 4) gene (NCBI Gene ID: 100755848) in scaffold1492 on the minus strand (scaffold1492 nucleotide number 1,746,191 to 1,781,992; 35,802 nucleotides in total). The 5′ flanking sequence appears to be located between exons 39 and 40 whilst the 3′ flanking sequence appears to be located between exons 28 and 29 (see
(62) 5′ Flanking Sequence
(63) Illumina reads from 10E9 genomic DNA were mapped to pRY17 (SEQ ID No. 1) using the Burrows-Wheeler Aligner (BWA). Through inspection of the mapping, it was found that multiple unpaired reads (black arrows in the
Example 2
(64) A) Materials and Methods
(65) 1. Southern Blot
(66) 5-10 μg of genomic DNA, isolated from passages 2 and 4 of each clone and purified using the Blood & Cell Culture DNA Maxi Kit (Qiagen), was digested with restriction endonuclease(s) for 15 h at 37° C. The digested DNA was extracted twice with an equal volume of a phenol:chloroform:isoamyl alcohol mixture, pH 8.0 (1:1 v/v), followed by chloroform alone, and ethanol-precipitated prior to electrophoresis on a 0.7% (w/v) agarose gel run in either 0.5× TBE (50× TBE: Lonza) or 1× TAE (40 mM Tris, pH 7.7, 2.5 mM EDTA) buffer. The gel was transferred onto Hybond-N membrane (Amersham) using a vacuum manifold essentially according to the manufacturer's instructions (Appligene, Pharmacia). The Hybond-N membranes were UV-fixed and pre-hybridized in either hybridization buffer containing 5× Denhardt's prepared from 50× stock solution (Sigma), 6× SSC (1× SSC: 0.15 M sodium chloride, 15 mM sodium citrate), and 10% (w/v) SDS, or Rapid-hyb Buffer (GE Healthcare) alone.
(67) TK probes were generated in PCRs using the following primer sets:
(68) TABLE-US-00007
TK-forward (SEQ ID No. 10): 5′-AGATCACCATGGGCATGCCTTAC-3′
TK-reverse (SEQ ID No. 11): 5′-AACACGTTGTACAGGTCGCCGTT-3′
(69) The vector pRY37 was used as a template for the probe-generating PCR and the cycling conditions were: 15 ng template/50 μl reaction; Taq DNA Polymerase (Roche); 94° C. 2 min, 30 cycles of 94° C. for 30 s, 55° C. for 1 min and 72° C. for 30 s, final extension at 72° C. for 7 min. 25 ng of PCR product was labeled with [γ-32P] dCTP (111 TBq/mmol, Perkin Elmer) using the Megaprime Kit and purified on a nick-translation column (Amersham). Hybridizations were performed in the same pre-hybridization buffer for 2-20 h at 65° C. Post-hybridization, membranes were washed to a final stringency of 0.1× SSC, 0.1% (w/v) SDS at 65° C. Blots were exposed to a storage phosphor screen (Bio-Rad); exposed screens were imaged using a Personal Molecular Imager (PMI) System (Bio-Rad).
(70) 2. Mapping and Alignment
(71) The paired-end reads in FASTQ format are the input for mapping to genomic templates. Vector sequences and the CHOK1SV assembly are indexed as the templates to be mapped to. The paired-end reads are aligned to the templates using Bowtie2 (Langmead B. & Salzberg S. L., 2012, Nature Methods 9(4):357-9) with the preset parameters for very fast local alignment (-D 5 -R 1 -N 0 -L 25 -i S,1,2.00). Coverage is normalized as <raw coverage> × 500 M / <number of reads> in order to compare across different samples.
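As an illustration of the two steps in this paragraph, the sketch below builds a Bowtie2 command line and applies the coverage normalization. The parameter string quoted in the text matches Bowtie2's `--very-fast-local` preset, which is used here as a shorthand; the index prefix, file names and function names are hypothetical, and the command is only assembled, not executed.

```python
from typing import List


def bowtie2_command(index_prefix: str, reads_1: str, reads_2: str,
                    out_sam: str) -> List[str]:
    """Assemble a paired-end, very fast local Bowtie2 alignment command.

    --very-fast-local is Bowtie2's preset for -D 5 -R 1 -N 0 -L 25 -i S,1,2.00.
    """
    return ["bowtie2", "--very-fast-local",
            "-x", index_prefix,   # index built from vector / assembly templates
            "-1", reads_1,        # first mates (FASTQ)
            "-2", reads_2,        # second mates (FASTQ)
            "-S", out_sam]        # SAM output


def normalize_coverage(raw_coverage: float, number_of_reads: int,
                       scale: float = 500e6) -> float:
    """Normalized coverage = raw coverage * 500 M / number of reads."""
    if number_of_reads <= 0:
        raise ValueError("number_of_reads must be positive")
    return raw_coverage * scale / number_of_reads


# Hypothetical sample: raw coverage of 40 from a run of one billion reads.
print(normalize_coverage(40.0, 1_000_000_000))
print(" ".join(bowtie2_command("templates", "r1.fastq", "r2.fastq", "out.sam")))
```

Scaling every sample to the same nominal read count is what allows coverage values from libraries of different depth to be compared directly.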
(72) 3. Identification of Integration Sites
(73) 2×100 paired-end reads of the 10E9 SSI host strain were sequenced using an Illumina HiSeq 2000 at an average coverage of 40×. The sequence reads were mapped to vector pRY17, which is the first vector integrated into the CHOK1SV genome. Reads covering integration sites are termed chimeric reads because they contain sequence that maps both to the CHOK1SV genome and to integrated vector sequence. Because the mapping is performed by local alignment, the chimeric reads show a partial match to vector sequence with overhanging tails that could map to genomic sequence. In addition to the chimeric reads, there are other reads where one end of a paired read maps fully to vector sequence and the other end maps to genomic sequence. These read pairs are called discordant read pairs. The overhang tail sequences and the unmapped reads from discordant read pairs are collected and searched against the CHOK1SV genome assembly using BLAST to identify the flanking sequence of integration sites based on sequence similarity.
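The read classification described above can be sketched in terms of SAM CIGAR strings from the vector alignment. This is a simplified illustration, not the procedure actually used: it assumes a chimeric read is one whose local alignment to the vector leaves a soft-clipped overhang of at least a hypothetical minimum length, and that a discordant pair has one mate aligned to the vector without clipping while the other mate is unaligned (represented here as `None`). Hard clips are ignored.

```python
import re
from typing import Optional, Tuple

# One CIGAR operation: a length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def soft_clip_lengths(cigar: str) -> Tuple[int, int]:
    """Return (left, right) soft-clipped lengths of a CIGAR string."""
    ops = CIGAR_OP.findall(cigar)
    left = int(ops[0][0]) if ops and ops[0][1] == "S" else 0
    right = int(ops[-1][0]) if len(ops) > 1 and ops[-1][1] == "S" else 0
    return left, right


def is_chimeric(cigar: Optional[str], min_overhang: int = 20) -> bool:
    """Aligned to the vector with an overhanging (soft-clipped) tail."""
    if cigar is None:
        return False
    return max(soft_clip_lengths(cigar)) >= min_overhang


def is_fully_mapped(cigar: Optional[str]) -> bool:
    """Aligned to the vector with no soft-clipped tail."""
    return cigar is not None and max(soft_clip_lengths(cigar)) == 0


def is_discordant_pair(cigar_1: Optional[str], cigar_2: Optional[str]) -> bool:
    """One mate fully on the vector, the other mate not aligned to it."""
    return ((is_fully_mapped(cigar_1) and cigar_2 is None)
            or (cigar_1 is None and is_fully_mapped(cigar_2)))


print(is_chimeric("60M40S"))               # read with a 40-base overhang
print(is_discordant_pair("100M", None))    # one mate on vector, one unaligned
```

The soft-clipped tails and the unaligned mates selected this way are the sequences that would then be searched against the genome assembly.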
(74) 4. Landing Pad
(75) The structure of landing pad (exogenous sequences introduced into the hot-spot that contain recombination sites for the integration of expression cassettes of genes of interest by RMCE) was based on both the Southern blot analysis and whole-genome re-sequencing (WGRS) analysis data of the 10E9 cell line (
(76) 5. Quantification of RNA-Seq Analysis
(77) The template used to map the reads was constructed using a ‘one-copy’ model of the landing pad derived from whole genome resequencing (see above). RNA-seq sequencing reads were mapped to the template by BWA using default parameters. The read counts on LC and HC were normalized to the RPKM measure (reads per kilobase of transcript per million mapped reads) by the following formula:
(78)
(79) The number of reads overlapping with the exons is obtained for each interval using bedtools (Quinlan A. R. and Hall I. M., 2010, Bioinformatics 26(6):841-842) with the following command: bedtools coverage -abam <bam file> -b <intervals in bed>
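The RPKM formula is not reproduced in this text. The sketch below uses the standard definition of RPKM (read count scaled by feature length in kilobases and by total mapped reads in millions), which is assumed to be the one intended; the function name and the example numbers are hypothetical.

```python
def rpkm(read_count: int, feature_length_bp: int,
         total_mapped_reads: int) -> float:
    """Reads per kilobase of feature per million mapped reads.

    RPKM = 10^9 * C / (N * L), where C is the number of reads overlapping
    the feature, N the total number of mapped reads and L the feature
    length in base pairs.
    """
    if feature_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total mapped reads must be positive")
    return 1e9 * read_count / (total_mapped_reads * feature_length_bp)


# Hypothetical example: 500 reads on a 1,000 bp region, 10 million mapped reads.
print(rpkm(500, 1000, 10_000_000))
```

In this workflow, the read count per interval would come from the bedtools coverage output described above.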
(80) B) Results
(81) Structure of the Landing-Pad in 10E9 SSI Host Cell
(82) A model of the structure of the landing pad within the Fer1L4 hot-spot was inferred from the expected RMCE events occurring during the creation of cell line 10E9 from 11A7 using the null targeting vector pRY37 (
(83) Flanking Sequences in the RMCE Derivative Cell Lines
(84) Four 10E9-derived recombinant cell lines were created expressing an anti-Myostatin monoclonal antibody (Myo) by RMCE (using targeting vector pRY21) and the 5′ and 3′ flanking sequences in each were determined, using the methods previously described. During the process of RMCE, a consistent genomic rearrangement occurs generating a new 3′ flanking sequence in the derivative cell lines (
(85) Estimation of the Copy Number of Integrated Cassette in RMCE-Generated Cell Lines
(86) The sequencing reads from the genome of each of the four cell lines were mapped to a model in which one copy was integrated into the hot-spot. The number of Myo copies was derived from the mean coverage over the LC and HC regions. The mean coverage values for these four cell lines are 41, 34, 18, 27 on the LC region and 32, 27, 14, 19 on the HC region, respectively. The coverage data indicate that for high producers, there may be at least one more copy of HC and LC (table 7). The coverage data is shown graphically in
(87) TABLE-US-00008 TABLE 7 Sequence read coverage and specific production rate (qP) of Myo producing RMCE cell lines. Copy number is indicated in brackets next to the coverage value.

| Cell Line | qP, pg/(cell · day) | Coverage LC | Coverage HC |
|---|---|---|---|
| 1 | 20 | 41 (2) | 32 (2) |
| 2 | 19 | 34 (2) | 27 (2) |
| 3 | 10.2 | 18 (1) | 14 (1) |
| 4 | 10 | 27 (1) | 19 (1) |
(88) Based on this observation, a new model of the post-RMCE locus was generated in which an extra copy of Myo was included. This was achieved by inserting another copy of the fragment spanning from the beginning of the first wFRT site to the beginning of the second wFRT site (both indicated by asterisks in
(89) Further evidence for a ‘two-copy’ model in high qP Myo-producing cell lines was obtained from RNA-Seq data from each cell line. The RNA-Seq reads derived from each Myo cell line were mapped to the original ‘one-copy’ model (