INTERNAL STANDARD FOR CRISPR GUIDE RNA

20230121309 · 2023-04-20

    Abstract

    A nucleic acid including a sequence encoding a single guide RNA (sgRNA) of a CRISPR/Cas system is disclosed, wherein the sgRNA sequence is interrupted by a guide disruption sequence flanked by a first pair of recombinase recognition sites, and wherein the sgRNA sequence further includes a second pair of recombinase recognition sites that has a different recombinase recognition sequence than the first pair of recombinase recognition sites, wherein the guide disruption sequence is not flanked by the second pair of recombinase recognition sites and wherein the sequences flanked by the first and second recombinase recognition sites overlap; methods of using such a sgRNA, transgenic cells and kits.

    Claims

    1. A nucleic acid comprising: a sequence encoding a single guide RNA (sgRNA) of a CRISPR/Cas system, wherein the sgRNA sequence is interrupted by a guide disruption sequence flanked by a first pair of recombinase recognition sites; and wherein the sgRNA sequence further comprises a second pair of recombinase recognition sites that has a different recombinase recognition sequence than the first pair of recombinase recognition sites; wherein the guide disruption sequence is not flanked by the second pair of recombinase recognition sites and/or wherein the second pair of recombinase recognition sites flank a part of the sgRNA required to form an active sgRNA; and wherein the sequences flanked by the first and second recombinase recognition sites overlap.

    2. The nucleic acid of claim 1, wherein one recombinase recognition site of the second recombinase recognition site pair is located between the first pair of recombinase recognition sites and preferably downstream of the guide disruption sequence, and another recombinase recognition site of the second recombinase recognition site pair is located downstream of the first pair of recombinase recognition sites.

    3. The nucleic acid of claim 1, wherein one of the first recombinase recognition sites is located in a loop region of the sgRNA sequence, preferably wherein the sgRNA sequence comprises a crRNA part and a tracrRNA part and one of the first recombinase recognition sites is located in a crRNA-tracrRNA linker loop.

    4. The nucleic acid of claim 1, wherein the guide disruption sequence comprises a transcription disruption sequence or has sufficient length to prevent folding into an active sgRNA fold.

    5. The nucleic acid of claim 1, wherein the first and second pairs of recombinase recognition sites are activated by the same recombinase enzyme.

    6. The nucleic acid of claim 1, further comprising a selection marker sequence, which is located between the pairs of recombinase recognition sites.

    7. A method of expressing an sgRNA of the CRISPR/Cas system upon recombinase stimulation, comprising: A) providing a plurality of cells with a plurality of nucleic acids of claim 1; B) introducing or activating one or more recombinases in the cells that are capable of activating the first and second recombinase recognition site pairs; and C) wherein activation of the first recombinase recognition site pair and of the second recombinase recognition site pair are competing reactions, wherein activation of the first recombinase recognition site pair leads to expression of an active sgRNA and wherein activation of the second recombinase recognition site pair inactivates the sgRNA sequence.

    8. The method of claim 7, wherein the cells of the plurality have a single copy of the nucleic acid of claim 1 per cell.

    9. The method of claim 7, wherein the cells are multiplied after step A) and before step B), wherein the cells are multiplied to a number of at least 250 cells per number of different sgRNA sequences in the plurality of the nucleic acid.

    10. The method of claim 7, wherein cells with the inactive part of the sgRNA sequence are identified to detect the presence of a sgRNA sequence.

    11. The method of claim 7, wherein the cells further express a transgenic oncogene or have a suppressed tumor suppressor gene, the method further comprising: observing differences in tumorigenesis after activation in step C) as compared to cells without the activation in step C), thereby screening for a role of a gene targeted by the sgRNA during tumorigenesis; or wherein the cells are further treated with a candidate compound, the method further comprising: observing differences in cell activity or morphology after activation in step C) as compared to cells without the activation in step C), thereby screening for an activity of a gene targeted by the sgRNA under influence of the candidate compound.

    12. The method of claim 7, wherein the nucleic acid comprises a unique molecular identifier (UMI) sequence, wherein the UMI is used to identify the same sgRNA in different cells.

    13. The method of claim 7 wherein the cells comprise a nucleic acid sequence for expression of a recombinase, wherein said nucleic acid for expression of a recombinase preferably also comprises a selection marker.

    14. A cell comprising the nucleic acid of claim 1.

    15. A kit comprising i) the nucleic acid of claim 1 and ii) one or more nucleic acids for expression of one or more recombinases that is/are capable to activate both recombinase recognition site pairs of the nucleic acid.

    Description

    FIGURES

    [0063] FIG. 1: Distribution of log.sub.2 fold changes between barcodes before and after a pooled CRISPR screen in decreasing numbers of barcodes per guide in library.

    [0064] FIG. 2: A) Schematic illustration of 2D in vitro genetic screens without bottlenecks; with each split of the cell population, the representation is maintained above 500-1,000 cells/sgRNA to keep the complexity of the screen; B) Schematic illustration of complexity bottlenecks in genetic screens; after a bottleneck caused by infection efficiency, limited cells, engraftment efficiency and/or differentiation, cells recover differently, leading to reduced representation of cells/sgRNA. Independent of clone size or cellular heterogeneity, single cell derived clones are stochastically split into an experimental and a control population, depicted as the upper green double arrows (active sgRNAs) and lower red double arrows (inactive sgRNAs).

    [0065] FIG. 3: (A, B) Schematic representation of CRISPR-StAR vector encoding sgRNAs, stop cassette, selection cassette, tracrRNA and UMIs. Recombination leads to either an active (A) or an inactive (B) sgRNA.

    [0066] FIG. 4: Schematic illustration of the CRISPR-StAR construct series; StAR1 contains two sets of different lox sites. In comparison to StAR1, StAR3 contains an extra loxP site, a longer distance between the Lox5171 site and the stop cassette and a reduced distance between tracr and the second Lox5171 site. The removal of the extra loxP site resulted in construct StAR4.

    [0067] FIG. 5: Experimental outline to determine frequency of active to inactive recombination in CRISPR StAR constructs.

    [0068] FIG. 6: Schematic outline of proof of concept experiment.

    [0069] FIG. 7: Benchmarking of CRISPR StAR analysis; comparison with traditional day 0 reference.

    [0070] FIG. 8: Correlation of two biological replicates in high complexities using conventional (active vs day 0) and CRISPR-StAR analysis (active vs inactive). Each dot represents one sgRNA. Density plots and stacked histograms show guide distribution in each replicate. Essentials are shown in red, non-essentials in blue.

    [0071] FIG. 9: Correlation of two biological replicates in low complexities using conventional (active vs day 0) and CRISPR-StAR analysis (active vs inactive). Each dot represents one sgRNA. Density plots and stacked histograms show guide distribution in each replicate. Essentials are shown in red, non-essentials in blue. In addition to a dramatically increased spread of neutral (blue) sgRNAs, additional complete dropout is observed at very low representation. This is due to the fact that the sgRNAs were completely lost in the bottleneck. In contrast, CRISPR-StAR only scores sgRNAs that are found in inactive conformation and lost in active.

    [0072] FIG. 10: Area under the curve analysis of essentials (red) compared to non-essentials (blue) of two biological replicates in decreasing numbers of cells per guide in library.

    [0073] FIG. 11: Area under the receiver operating characteristic curve (AUROC) analysis of decreasing complexities in cell numbers compared to library. CRISPR StAR analysis (active vs inactive) in green, conventional analysis (active vs day 0) in black.

    [0074] FIG. 12: Pearson correlation, delta area under the curve (dAUC) and area under the receiver operating characteristic curve (AUROC) analysis of decreasing complexities in cell numbers compared to library. Black dots show values of individual replicates, bars show mean of two replicates.

    [0075] FIG. 13: Improved robustness of organoid screening. a) Correlation of two biological replicates determined by UMI. Density plots and stacked histograms show guide distribution in each replicate. b) The average number of guides targeting the same gene (y-axis) for genes correlated with the top sgRNAs (x-axis), sorted by rank. c) Volcano plots of conventional (active vs day 0) and CRISPR-StAR analysis (active vs inactive) in two biological replicates determined by UMI. Top genes are shown in blue. Genes that scored in the other replicate are shown in green.

    [0076] FIG. 14. Correlation plot of in vitro and in vivo CRISPR-StAR screening results. Each dot represents all sgRNAs for one gene, dot size represents the number of UMIs per gene in the in vivo samples. Stacked histograms show guide distribution in each sample. In vivo samples are two combined replicates. Essential genes are shown in red, non-essential genes in black. The majority of the essential genes show reduced representation both in vitro and in vivo.

    [0077] FIG. 15. Sleeping beauty transposon with an EGFP-P2A-FAH expression cassette under control of the EF1a promoter with the CRISPR-StAR construct. (Left) Liver from an FAH−/− mouse injected with only saline and maintained with NTBC, harvested 14 days post injection. (Right) Liver from FAH−/− mouse injected with transposon and transposase, harvested 25 days post injection. Nuclei were counterstained with DAPI (blue) and expanded cells containing the CRISPR-StAR construct were visualized with EGFP (green).

    [0078] FIG. 16. A sleeping beauty transposon with a KrasG12D-P2A-FAH expression cassette under the control of the EF1a promoter with the CRISPR-StAR construct. (Left) Liver from WT mouse injected with only the transposase, harvested 50 days post injection. (Right) Liver from WT mouse injected with the transposon and transposase, harvested 50 days post injection. Nuclei were counterstained with DAPI (blue) and expanded cells containing the CRISPR-StAR construct were visualized with EGFP (green).

    EXAMPLES

    Example 1: Material and Methods

    1.1 Material

    1.1.1 Cell Lines

    [0079] Tamoxifen-inducible Cre-ERT mouse embryonic stem cells AN3-12 (ESC)
    Platinum-E cells (Cell Biolabs RV-101)
    Vil-CreERT2; Rosa-LSL-Cas9-2A-eGFP mouse small intestinal organoid

    1.1.2 Cell Culture Media

    [0080] Mouse embryonic stem cell medium (ESCM):
    450 ml of DMEM, 75 ml of FCS (Sigma, 025M3347), 5.5 ml of penicillin-streptomycin (Sigma), 5.5 ml of NEAA (Gibco), 5.5 ml of L-glutamine (Gibco), 5.5 ml of sodium pyruvate (Sigma), 0.55 ml of β-mercaptoethanol (Merck), 7.5 μl of LIF (2 mg/ml)

    Organoid Complete Culture Medium:

    [0081] Advanced DMEM/F12, penicillin/streptomycin, 10 mmol/L HEPES, Glutamax, 1× N2, 1× B27 (all from Invitrogen), and 1 mmol/L N-acetylcysteine (Sigma), recombinant human Wnt-3A, murine EGF, murine noggin, human R-spondin-1, nicotinamide

    1.1.3 Buffers

    Laurylsarcosine Lysis Buffer:

    [0082] 10 mM Tris-HCl pH 7.5 (Sigma Aldrich), 10 mM EDTA (Sigma Aldrich), 10 mM NaCl (Sigma Aldrich), 0.5% N-laurylsarcosine (Sigma Aldrich), 1 mg/ml proteinase K (Thermo Fisher Scientific), 0.1 mg/ml RNase A (Qiagen)

    2×SDS Lysis Buffer:

    [0083] 10 mM Tris-HCl pH 8 (Sigma Aldrich), 1% SDS (in-house), 10 mM EDTA (Sigma Aldrich), 100 mM NaCl (Sigma Aldrich), 0.1 mg/ml RNase A (Qiagen)

    TABLE-US-00001
    1.1.4 Primers
    FW_G_CrSc_5   (SEQ ID NO: 1)   AATGATACGGCGACCACCGAGATCTACACAGATAACGAGGGCCTATTTCCCATGATTCCTTC
    FW_G_CrSc_6   (SEQ ID NO: 2)   AATGATACGGCGACCACCGAGATCTACACAGCTTGCGAGGGCCTATTTCCCATGATTCCTTC
    FW_G_CrSc_7   (SEQ ID NO: 3)   AATGATACGGCGACCACCGAGATCTACACAGGACACGAGGGCCTATTTCCCATGATTCCTTC
    FW_G_CrSc_10  (SEQ ID NO: 4)   AATGATACGGCGACCACCGAGATCTACACATCACTCGAGGGCCTATTTCCCATGATTCCTTC
    FW_G_CrSc_12  (SEQ ID NO: 5)   AATGATACGGCGACCACCGAGATCTACACCAACACCGAGGGCCTATTTCCCATGATTCCTTC
    FW_G_CrSc_13  (SEQ ID NO: 6)   AATGATACGGCGACCACCGAGATCTACACCACGCCCGAGGGCCTATTTCCCATGATTCCTTC
    FW_G_CrSc_15  (SEQ ID NO: 7)   AATGATACGGCGACCACCGAGATCTACACCATTACCGAGGGCCTATTTCCCATGATTCCTTC
    FW_G_CrSc_19  (SEQ ID NO: 8)   AATGATACGGCGACCACCGAGATCTACACCCCCAACGAGGGCCTATTTCCCATGATTCCTTC
    FW_G_CrSc_20  (SEQ ID NO: 9)   AATGATACGGCGACCACCGAGATCTACACCGTCATCGAGGGCCTATTTCCCATGATTCCTTC
    FW_G_CrSc_21  (SEQ ID NO: 10)  AATGATACGGCGACCACCGAGATCTACACCTATGCCGAGGGCCTATTTCCCATGATTCCTTC
    FW_G_CrSc_22  (SEQ ID NO: 11)  AATGATACGGCGACCACCGAGATCTACACCTCCGCCGAGGGCCTATTTCCCATGATTCCTTC
    FW_G_CrSc_39  (SEQ ID NO: 12)  AATGATACGGCGACCACCGAGATCTACACTGCCGACGAGGGCCTATTTCCCATGATTCCTTC
    FW_G_CrSc_41  (SEQ ID NO: 13)  AATGATACCGCGACCACCGAGATCTACACTGTAGACGAGGGCCTATTTCCCATGATTCCTTC
    FW_G_CrSc_42  (SEQ ID NO: 14)  AATGATACGGCGACCACCGAGATCTACACTTGCCACGAGGGCCTATTTCCCATGATTCCTTC
    RV_G_CrSc     (SEQ ID NO: 15)  CAAGCAGAAGACGGCATACGAGATACCGTTGATGAGTAG
    NGS_U6        (SEQ ID NO: 16)  CGATTTCTTGGCTTTATATATCTTGTGGAAAGGACGAAACACCG

    1.2 Methods

    1.2.1 Mouse Embryonic Stem Cell Culture

    [0084] Cells were cultured in ESCM, which was changed daily. When confluent, cells were trypsinized and split 1:10. For 4-hydroxytamoxifen (4OH) treatment, medium was supplemented every day with 0.5 μM 4OH (Sigma).

    1.2.2 Small Intestinal Organoid Culture

    [0085] Intestinal organoids were established from a Vil-CreERT2; Rosa-LSL-Cas9-2A-eGFP (homozygous) mouse. For organoid establishment, crypts were isolated from the mouse small intestinal epithelium after washing and dissociation. Isolated crypts were resuspended in Matrigel (Corning) at a density of 150-200 crypts per 20 μl droplet. Droplets were seeded in 48-well plates (Corning) and 250 μl of media was used in each well. For the first two passages, cells were cultured in complete organoid medium supplemented with Rho-kinase inhibitor (Y-27632, R&D Systems). Organoids were split every 5-7 days through mechanical pipetting in 1:5 to 1:6 ratios.

    1.2.3 Single Cell Derived Clones

    [0086] ESCs were trypsinized and counted. 500 cells were seeded on a 15 cm dish (Sigma Aldrich). ESCM was exchanged every 2 days. Colonies were allowed to grow for 10 days, then picked into 96-U well plates (Thermo Fisher), trypsinized and split onto 96-F well plates (Thermo Fisher). Cells were cultured until confluent and lysed with 75 μl Laurylsarcosine lysis buffer at 37° C. overnight. For amplification, 1 μl lysate was used in 25 μl PCR reactions (95° C. 3 min, [95° C. 20 sec, 65° C. (−0.3° C. per cycle) 20 sec, 72° C. 30 sec]×23, [95° C. 20 sec, 58° C. 20 sec, 72° C. 30 sec]×30, 72° C. 3 min, 12° C. ∞).

    1.2.4 Retroviral Vectors and ESC Infection

    [0087] The CRISPR-StAR library was packaged into Platinum-E cells according to the manufacturer's recommendations. 300 million ESCs were infected with a 1:10 dilution of virus-containing supernatant in the presence of 2 μg/ml polybrene. 24 hours after infection, selection for infected cells was started with blasticidin and puromycin at 1 μg/ml each. To estimate the multiplicity of infection, 10,000 cells were plated on 15 cm dishes and selected with G418. For comparison, an additional 1,000 cells were plated and were not exposed to G418 selection. On day 10, colonies were counted.

    1.2.5 Cell Culture Screen

    [0088] ESCs were infected with a retroviral CRISPR-StAR vector and selected for blasticidin and puromycin resistance for 3 days. To mimic bottlenecks, cells were carefully counted and seeded at densities of 1 cell/sgRNA (5,870 cells), 4 cells/sgRNA, 16 cells/sgRNA, 64 cells/sgRNA, 256 cells/sgRNA and 1,024 cells/sgRNA in the library. Over the course of 7 days, cells were grown to equal densities. To induce recombination, ESCs were treated with 5 μM 4OH for 3 days. They were maintained for another 14 days.

    1.2.6 Organoid Screen

    [0089] To prepare for the screen, organoids were expanded in 10 cm dish format (Corning). In each 10 cm dish, 50-55 droplets were seeded, each droplet containing around 100 organoids; in total, ten 10 cm dishes were used in the screen. Each dish was supplemented with 10 ml of complete medium, which was refreshed every two days. To prepare organoids for viral infection, organoids were first mechanically broken down into small pieces. After centrifugation (500 g × 5 min) and removal of the supernatant (which contains old Matrigel), cells were resuspended in TrypLE (Gibco) and dissociated to 5- to 8-cell clumps at 37° C. Cells were spun down at 300 g for 3 min. After removing the supernatant, cell pellets were resuspended in virus-containing media and dispensed into 48-well plates. The plate was sealed with parafilm and spinoculation was performed for 1 h at 37° C. After spinoculation, the parafilm was removed and the plate was incubated at 37° C. for 6 h. Afterwards, cells were transferred to Eppendorf tubes and spun down (300 g × 3 min). The cell pellet was resuspended in Matrigel and seeded onto 10 cm dishes. After 3 days of recovery, infected organoids were selected for blasticidin resistance for 8 days at 1 μg/ml. Subsequently, organoids were dissociated, and complete medium was substituted with 4OH for 6 h. Afterwards, organoids were kept in culture in complete medium for 12 days without splitting. Medium was refreshed every 3 days.

    1.2.7 DNA Harvest and NGS Sample Preparation

    [0090] 60 million cells per sample were collected and lysed in SDS lysis buffer plus 1 mg/ml Proteinase K and 0.1 mg/ml RNase A. Genomic DNA was extracted with phenol and chloroform and precipitated with 1 volume isopropanol. The integrated sgRNA construct is flanked by PacI restriction sites. Samples were digested with PacI for 48 h and co-digested with BbsI for the last 12 h. Each sample was PCR amplified in 96 individual 50 μl reactions with 1 μg DNA per reaction (95° C. 3 min, [95° C. 10 sec, 59° C. 20 sec, 72° C. 30 sec]×36, 72° C. 3 min, 4° C. ∞). Forward primers were unique for each sample and contained a 6 bp experimental index for demultiplexing after NGS (AATGATACGGCGACCACCGAGATCTACAC-NNNNNNCGAGGGCCTATTTCCCATGATTCCTTC (SEQ ID NO: 17), where the 6-bp NNNNNN sequence represents the sample-specific experimental index). The reverse primer was the same for each sample. PCR products were purified and size-separated by agarose gel electrophoresis. The two recombination products were excised separately, purified on a mini-elute column and mixed in equal amounts. This sample was sequenced on an Illumina HiSeqV4 SR100 dual-indexing sequencing run. sgRNAs were sequenced with a custom read primer. To distinguish active from inactive guides, the sequence downstream of the first lox site (either TCAGCATAGC for active or TTTTTTT for inactive) was used.
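The structure of the indexed forward primers can be illustrated with a short sketch. The two constant parts are taken from SEQ ID NO: 17; the function name `forward_primer` is hypothetical and is not part of the described protocol.

```python
# Minimal sketch: assemble an indexed forward primer from the constant parts
# of SEQ ID NO: 17. The function name is an illustrative assumption.
ADAPTER_5P = "AATGATACGGCGACCACCGAGATCTACAC"   # constant 5' part
PRIMING_3P = "CGAGGGCCTATTTCCCATGATTCCTTC"     # constant 3' part


def forward_primer(index: str) -> str:
    """Return the forward primer carrying a 6-bp experimental index."""
    index = index.upper()
    if len(index) != 6 or set(index) - set("ACGT"):
        raise ValueError("index must consist of exactly 6 bases (A, C, G, T)")
    return ADAPTER_5P + index + PRIMING_3P


# The index AGATAA reproduces primer FW_G_CrSc_5 (SEQ ID NO: 1).
primer = forward_primer("AGATAA")
```

Rejecting indices that are not exactly 6 bases keeps the index at a fixed position in every read, which is what demultiplexing relies on.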

    Example 2: Concept Overview

    [0091] In genetic screens, genome editing can have three major effects: it can give a growth benefit, give a growth disadvantage or have no effect on cells targeted with a specific sgRNA. A growth benefit will lead to enrichment within the population. A growth disadvantage will lead to depletion.

    [0092] Pooled CRISPR screens are usually kept at a complexity of 300-1,000 individually targeted cells per sgRNA. This allows a sufficient number of unique editing events to call a significant change in the population. However, it is not always possible to maintain this high level of complexity. When a system encounters a bottleneck caused by inefficient infection, limited cell numbers or differentiation, or if cells recover at different rates, the library representation decreases. To illustrate this, we calculated log.sub.2 fold changes (LFC) between read numbers of barcodes before and after a CRISPR screen. The numbers of barcodes represent the numbers of differently transformed cells, i.e. the numbers of barcodes per guide represent the numbers of cells/sgRNA.

    [0093] As complexity decreases, the distribution in LFC becomes broader because fewer barcodes are present and changes in the population have larger effects. When complexity further decreases, the distribution becomes bimodal with appearance of a second peak with strong LFC (FIG. 1). This peak is due to missing guides with 0 reads. In analysis, these guides will be mistaken for guides causing a strong depletion phenotype and therefore skew screening results. This means that with insufficient complexity, read numbers of guides before the screen are no longer comparable to read numbers after the screen and conventional analysis fails.

    [0094] The problems caused by insufficient library representation upon bottlenecks in CRISPR screens can be overcome by the invention (illustrated in FIG. 2).

    Example 3: sgRNA Constructs

    [0095] Due to two sets of interlaced lox sites, the CRISPR StAR system can give rise to two different recombination products: an inactive sgRNA or an active sgRNA. The vector contains an sgRNA (library), followed by two pairs of lox sites in the tracr region. Between the lox sites there is a blasticidin selection cassette to prevent premature activation due to e.g. Cre activity or recombination events during viral packaging. Lastly, it contains a stretch of random nucleotides acting as unique molecular identifiers (UMIs). Recombination of the loxP sites results in an active sgRNA (FIG. 3A), whereas recombination of the lox5171 sites results in termination and exclusion of the tracr. As a consequence, the sgRNA is inactive (FIG. 3B). The two recombination events are mutually exclusive.

    [0096] With this system, it is possible to compare active guides to an inactive internal control within the final population of a CRISPR screen. However, comparing read numbers of the two recombination products works best if the ratio of active to inactive recombination is fairly balanced. For most cases, the ratio of loxP (active) to lox5171 (inactive) recombination should be between 10:90 and 90:10.

    [0097] The probability of recombination at one or the other lox pair depends on several factors such as distance and DNA structure (primary, secondary, and tertiary) at the locus. It is therefore difficult to predict. Single cell quantification of recombination probabilities revealed that the original construct (StAR1) resulted in a recombination ratio of 33% active sgRNAs to 66% inactive sgRNAs. Such a ratio is well suited to screens that monitor relative enrichment of active over inactive sgRNAs, as it offers a large dynamic range. However, for the analysis of essential genes, it is preferable to start with an equal ratio of active to inactive sgRNAs or even a bias towards active sgRNAs. We therefore developed StAR3 and StAR4 by modification of relative distances and primary sequence, and by introduction of one additional loxP site (FIG. 4). In doing so, we successfully generated a series of constructs resulting in different recombination ratios:

    TABLE-US-00002
    Construct                 Active   Inactive
    StAR1 (SEQ ID NO: 18)     33%      66%
    StAR3 (SEQ ID NO: 19)     90%      10%
    StAR4 (SEQ ID NO: 20)     50%      50%

    Depending on the desired experiment, different setups will be ideal.

    [0098] To determine how efficiently either pair of lox sites recombines, sgRNA-infected cells were treated with 4OH for 3 days and subsequently seeded at clonal density (FIG. 5). At this point, recombination has happened and each clone expresses either an active or an inactive guide. To identify them, we performed PCR with primers flanking the guide construct. Recombination products are 580 bp for active or 542 bp for inactive. We counted the frequency of each band size. Most importantly, we found no unrecombined clones, which confirms stable Cre expression in our cell line. The above recombination frequencies were found. For StAR1, out of 288 total clones, recombination resulted in 97 active and 172 inactive sgRNAs. We found 21 double bands, which are due either to contaminated mixed clones or to double infections. They were counted for both events.
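As a worked example of the band counting, the sketch below converts the StAR1 band counts quoted above into active and inactive fractions and checks them against the 10:90 to 90:10 window of paragraph [0096]. The function names are hypothetical. Because double bands are counted for both events, the denominator is the number of scored bands, not the number of clones.

```python
# Minimal sketch: recombination fractions from counted PCR bands.
# Function names are illustrative assumptions, not part of the described method.

def recombination_fractions(active_bands: int, inactive_bands: int):
    """Return (active, inactive) fractions of all scored bands."""
    total = active_bands + inactive_bands
    if total == 0:
        raise ValueError("no bands were scored")
    return active_bands / total, inactive_bands / total


def within_window(active_fraction: float, low: float = 0.10, high: float = 0.90) -> bool:
    """Check the active fraction against the 10:90 to 90:10 window."""
    return low <= active_fraction <= high


# StAR1 band counts from the clone analysis: 97 active, 172 inactive.
active_frac, inactive_frac = recombination_fractions(97, 172)
star1_ok = within_window(active_frac)
```

With these counts, roughly one third of the bands are active and two thirds inactive, which lies inside the window.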

    Example 4: Cell Culture

    4.1 Experimental Design

    [0099] To confirm that CRISPR-StAR overcomes noise in bottleneck screens, we introduced controlled bottlenecks in a cell culture experiment. To this end, we infected mouse embryonic stem cells carrying stably integrated Cas9 and CreERT2 expression constructs with a retroviral sgRNA StAR1-type library of 5,870 sgRNAs targeting 1,245 genes (Table 1).

    TABLE-US-00003
    TABLE 1
    Library subpools    Genes    Guides
    Druggable genes     885      4,453
    Handpicked          360      1,417
    Sum                 1,245    5,870

    [0100] 15% of cells were infected to ensure single infections. After selection for viral integration, we counted and diluted the cells to introduce controlled bottlenecks. Complexity was reduced to 1 cell/sgRNA (5,870 cells), 4 cells/sgRNA, 16 cells/sgRNA, 64 cells/sgRNA, 256 cells/sgRNA and 1,024 cells/sgRNA. Cells were grown to equal densities of more than 1,000 cells/sgRNA over the course of 7 days. Subsequently, cells were treated with 4OH to induce Cre recombination and cells were maintained for another 14 days. The experiment was executed in 2 independent replicates (FIG. 6).
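The numbers behind this design can be sketched as follows. The cell numbers per bottleneck follow directly from the library size of 5,870 sgRNAs. The estimate of single integrations assumes that integrations per cell follow a Poisson distribution; this is a common modelling assumption and is not stated in the text. The function names are hypothetical.

```python
import math

LIBRARY_SIZE = 5870                      # sgRNAs in the library (Table 1)
COVERAGES = [1, 4, 16, 64, 256, 1024]    # cells per sgRNA at the bottleneck


def cells_to_seed(cells_per_sgrna: int, library_size: int = LIBRARY_SIZE) -> int:
    """Number of cells to seed for a given library representation."""
    return cells_per_sgrna * library_size


def single_integration_fraction(infected_fraction: float) -> float:
    """Fraction of infected cells carrying exactly one integration.

    Assumes Poisson-distributed integrations (an assumption of this sketch).
    """
    if not 0 < infected_fraction < 1:
        raise ValueError("infected_fraction must be between 0 and 1")
    moi = -math.log(1 - infected_fraction)   # from P(0 integrations) = exp(-moi)
    p_exactly_one = moi * math.exp(-moi)     # Poisson probability of one event
    return p_exactly_one / infected_fraction


seeding_plan = {c: cells_to_seed(c) for c in COVERAGES}
single_fraction = single_integration_fraction(0.15)
```

Under the Poisson assumption, an infection rate of 15% leaves about 92% of infected cells with a single integration.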

    [0101] After 14 days, genomic DNA was extracted and digested with PacI using cut sites flanking the construct. We then amplified the guide construct via PCR from the fragmented genome with primers containing experimental indices and Illumina adaptors for each sample, which allowed direct sequencing of the PCR product. We gel-extracted both recombination products separately and mixed them in a 1:1 ratio. This pool was then sequenced.

    4.2 Bioinformatic Pipeline

    [0102] After mapping NGS reads, we used the 10 bp stretch directly downstream of the first loxP site to bioinformatically distinguish active from inactive guides (either TCAGCATAGC for active or TTTTTTT for inactive). Although active and inactive recombination products were mixed in a 1:1 ratio before sequencing, we found twice as many reads from inactive as from active guides, which indicates that inactive constructs sequence better. Nevertheless, the analysis does not suffer from this imbalance.
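The classification step can be sketched in a few lines. The sketch assumes that the sequence directly downstream of the first loxP site has already been extracted from each read; how the lox site is located is not specified here. The function name and the "unassigned" label are hypothetical.

```python
from collections import Counter

ACTIVE_TAG = "TCAGCATAGC"   # downstream sequence of the active product
INACTIVE_TAG = "TTTTTTT"    # downstream sequence of the inactive product


def classify_downstream(sequence: str) -> str:
    """Classify a read by the sequence directly downstream of the first loxP site."""
    sequence = sequence.upper()
    if sequence.startswith(ACTIVE_TAG):
        return "active"
    if sequence.startswith(INACTIVE_TAG):
        return "inactive"
    return "unassigned"     # hypothetical label for reads matching neither tag


# Toy input: downstream sequences already extracted from four reads.
downstream = ["TCAGCATAGCGGA", "TTTTTTTGAA", "ttttttttcc", "GATTACAGATT"]
tally = Counter(classify_downstream(s) for s in downstream)
```

Exact prefix matching is used for simplicity; a real pipeline might tolerate mismatches, which this sketch does not attempt.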

    [0103] Each cell was infected with a single guide construct. Thus, every UMI represents one clone and the number of UMIs per guide is equal to the number of cells per guide, which in turn is a direct measure of how many cells per guide were infected. To check whether cell dilutions in the proof of concept experiment were sufficient, we calculated median number of UMIs per inactive guide in lowest complexity samples (1 cell per guide). However, instead of the theoretical 1 UMI per guide, we found much higher numbers. We hypothesized two reasons for this: First, most of these UMIs had only one or two reads, which is most likely due to base substitution errors in sequencing; second, when we calculated distribution of read numbers per UMI, we found a bimodal distribution. When we looked at the sgRNA-UMI combinations from the low read fraction of this distribution, we could find the same sgRNA-UMI combinations with high read numbers in different samples. This suggested index hopping, which is a known problem in Illumina based sequencing, where indices between neighboring clusters are assigned to the wrong sample. In higher complexity samples these issues are negligible because there are high numbers of true UMIs per guide, so overall, these errors have a very small impact. Therefore, this is only relevant in lower complexity samples (1-16 cells per sgRNA). Here, true reads have a distinct distribution with high read counts, while the errors have a distribution with low reads.

    [0104] To separate true reads from errors, we defined the local minimum in this bimodal read distribution for each low complexity sample as a threshold and discarded all reads below it. Since the read number of a UMI in an active guide can represent a phenotype, we only set a cutoff in inactive guides and mapped the sgRNA-UMI combination in the active guides, which further cleaned the dataset of non-existing UMIs.
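One possible way to implement this thresholding is sketched below. It is an illustration under stated assumptions, not the exact procedure used: read counts per UMI are binned by floor(log2), and the valley is taken as the bin that lies deepest below the peaks on both sides. The active guides are then restricted to sgRNA-UMI combinations that pass the cutoff in the inactive guides. All function names are hypothetical.

```python
from collections import Counter


def valley_threshold(read_counts):
    """Return a read-count cutoff at the valley of a bimodal distribution.

    Counts are binned by floor(log2(count)). Returns None if no valley exists.
    """
    bins = Counter(c.bit_length() - 1 for c in read_counts if c > 0)
    if not bins:
        return None
    lowest = min(bins)
    hist = [bins.get(b, 0) for b in range(lowest, max(bins) + 1)]
    best_bin, best_depth = None, 0
    for i in range(1, len(hist) - 1):
        # Depth of bin i relative to the highest peak on each side.
        depth = min(max(hist[:i]), max(hist[i + 1:])) - hist[i]
        if depth > best_depth:
            best_bin, best_depth = i, depth
    if best_bin is None:
        return None
    return 2 ** (lowest + best_bin)   # lower edge of the valley bin


def filter_by_inactive(inactive, active, threshold):
    """Keep sgRNA-UMI pairs that pass the cutoff in the inactive guides only."""
    trusted = {key for key, reads in inactive.items() if reads >= threshold}
    kept_inactive = {key: inactive[key] for key in trusted}
    kept_active = {key: reads for key, reads in active.items() if key in trusted}
    return kept_inactive, kept_active


# Toy data: a low-read error population and a high-read true population.
counts = [1] * 50 + [2] * 30 + [4] * 3 + [64] * 20 + [128] * 40 + [256] * 10
cutoff = valley_threshold(counts)
```

No read cutoff is applied to the active guides themselves, since a low read count there may reflect a genuine depletion phenotype.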

    Finally, to benchmark performance of CRISPR StAR against conventional CRISPR screen analysis, we calculated LFC for both methods: active guides versus day 0 for conventional analysis as well as active versus inactive guides for CRISPR-StAR analysis (FIG. 7).
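The difference between the two analyses can be illustrated with a small sketch. The pseudocount of 1 and the omission of read-depth normalization are simplifying assumptions of this example, and the function names are hypothetical. A guide that was lost in the bottleneck (no inactive reads) is scored as strongly depleted by the conventional comparison but is excluded by the active versus inactive comparison.

```python
import math


def lfc(numerator: float, denominator: float, pseudocount: float = 1.0) -> float:
    """log2 fold change with a pseudocount (the pseudocount is an assumption)."""
    return math.log2((numerator + pseudocount) / (denominator + pseudocount))


def conventional_lfc(active, day0):
    """Active guides versus the day 0 reference, for every guide in the reference."""
    return {g: lfc(active.get(g, 0), day0[g]) for g in day0}


def star_lfc(active, inactive):
    """Active versus inactive guides; guides without inactive reads are excluded."""
    return {g: lfc(active.get(g, 0), n) for g, n in inactive.items() if n > 0}


# Toy counts: g1 is neutral, g2 is depleted, g3 was lost in the bottleneck.
day0 = {"g1": 100, "g2": 100, "g3": 100}
active = {"g1": 100, "g2": 3}
inactive = {"g1": 100, "g2": 100}

conventional = conventional_lfc(active, day0)
star = star_lfc(active, inactive)
```

In this toy data, g3 receives a strongly negative conventional LFC although it has no phenotype, whereas it does not appear in the active versus inactive result at all.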

    4.3 Benchmarking

    [0105] To benchmark performance of CRISPR-StAR compared to conventional screening methodology, we calculated Pearson coefficients between replicates, delta area under the curve (dAUC) and area under receiver operating characteristic curves (AUROC).

    4.3.1 Replicate Correlation

    [0106] To test reproducibility of our results, we calculated correlation coefficients between two biological replicates on essential and non-essential guides. In order to do this, we defined essential genes (red) using data of two independent screens performed in the same cell line, with the same library at a high complexity. We calculated median depletion of each guide and defined guides with a LFC lower than −3 as essential. On the other hand, we defined non-essentials (blue) as the same number as essentials of the least depleting guides from the same dataset. We then correlated LFCs of guides in two independent replicates and determined Pearson coefficients based on essentials and non-essentials. To get a better understanding of data distribution, we calculated densities and ratios of essentials and remaining data for each replicate (FIGS. 8 and 9, side density plots). Lastly, we counted number of sgRNAs present in each replicate as well as overlap between both replicates.
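The replicate comparison can be sketched as follows: the Pearson coefficient is computed over the guides present in both replicates, and the size of the overlap is reported alongside it. The function names are hypothetical, and the toy values are for illustration only.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


def replicate_correlation(rep1, rep2):
    """Correlate LFCs of guides present in both replicates.

    Returns (pearson_r, number_of_shared_guides).
    """
    shared = sorted(set(rep1) & set(rep2))
    r = pearson([rep1[g] for g in shared], [rep2[g] for g in shared])
    return r, len(shared)


# Toy LFCs per guide; guide "e" is present in replicate 2 only.
rep1 = {"a": -4.0, "b": -3.0, "c": 0.0, "d": 0.5}
rep2 = {"a": -7.0, "b": -5.0, "c": 1.0, "d": 2.0, "e": 9.0}
r, n_shared = replicate_correlation(rep1, rep2)
```

Restricting the calculation to shared guides mirrors the counting of sgRNAs present in each replicate and of their overlap.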

    [0107] At high complexities of 64-1,024 cells per sgRNA, with both conventional and CRISPR StAR analysis, we found good correlation between replicates. Although distribution of data is slightly broader using conventional analysis than using CRISPR StAR, essentials can clearly be separated from non-essentials. Correlation coefficients range from 0.72 to 0.75 using conventional analysis and from 0.80 to 0.84 with CRISPR-StAR (FIG. 8). In this homogeneous system, 64 cells per sgRNA seems to be a sufficient complexity for CRISPR screening using conventional analysis.

    [0108] Using conventional analysis in lower complexities of 1-16 cells per sgRNA, we found an increased spread of both essential and non-essential guides. In the samples with 4 cells and 1 cell per sgRNA, the distribution of data becomes bimodal. This is due to sgRNAs with 0 reads in either one or both replicates that cause a strong depletion when compared to day 0. This depletion can either be due to a phenotype caused by a guide, or it can be due to the absence of the guide in the final population. Especially in systems that encounter bottlenecks, it is likely that guides get lost. With conventional analysis, it is not possible to distinguish a missing guide from a phenotype. In contrast, when using CRISPR StAR analysis, abundance of active guides is compared to abundance of inactive control guides within the final population. Therefore, guides that got lost due to the effect of a bottleneck will be excluded from analysis. The resulting guide population is smaller and LFCs are due to a phenotype caused by a guide. As a result, in the lowest complexity sample (1 cell per sgRNA), correlation decreases to 0.16 using conventional analysis, while with CRISPR StAR analysis it is 0.83, as high as in the most complex sample (FIG. 9).

    [0109] In conclusion, using conventional analysis we found poor reproducibility with decreasing complexities. This is due to an increased spread of data caused by missing guides. Using CRISPR StAR, missing guides are removed, and only present guides are considered. Therefore, results are highly reproducible even at low complexity.

    4.3.2 dAUC

    [0110] Calculating dAUC of defined categories within a population gives a measure of how well members of each category can be separated from one another. Using this, we benchmarked performance of CRISPR StAR against conventional analysis in separating essentials from non-essentials. For this, we subset essential and non-essential guides, as defined above, to a new list and ranked them by LFC from most depleting to most enriching. We then calculated the cumulative fraction for occurrence of each guide in a category throughout the ranked list. In other words, if an essential guide scores, the essential curve goes up. The same is true for non-essentials. If the guides have an effect, essentials must be ranked at the top of the list, which results in a rapid increase followed by a plateau where no essentials are scored. Non-essentials, on the other hand, are ranked at the end of the list, which is represented by a plateau followed by a rapid increase. Ideally, we would expect both categories to be clearly separated from one another. Therefore, the better method will show a better separation. To get a comparable measure, we calculated dAUC by subtracting the AUC of non-essentials from the AUC of essentials. An ideal score, if all essentials are separated from non-essentials, would be 0.5. A random sample would result in a diagonal line, and the dAUC score would be 0.
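
    A minimal sketch of the dAUC calculation follows. The sign convention (essentials minus non-essentials) is chosen so that perfect separation gives +0.5, matching the ideal score stated above. The trapezoidal integration, the normalisation of the rank axis to [0, 1] and all names are assumptions of this sketch, and ties in LFC are broken by input order.

```python
def cumulative_auc(ranked_labels, category):
    """Area under the cumulative-fraction curve of one category along a
    ranked list, with the rank axis normalised to [0, 1]."""
    total = sum(1 for label in ranked_labels if label == category)
    n = len(ranked_labels)
    seen, previous, area = 0, 0.0, 0.0
    for label in ranked_labels:
        if label == category:
            seen += 1
        current = seen / total
        area += (previous + current) / 2 / n  # trapezoid of width 1/n
        previous = current
    return area

def dauc(lfc, essentials, non_essentials):
    """AUC(essentials) - AUC(non-essentials) on a list ranked from most
    depleting to most enriching."""
    labelled = [(lfc[g], "ess") for g in essentials]
    labelled += [(lfc[g], "non") for g in non_essentials]
    labelled.sort(key=lambda pair: pair[0])
    labels = [label for _, label in labelled]
    return cumulative_auc(labels, "ess") - cumulative_auc(labels, "non")

# Hypothetical LFCs with perfect separation of the two categories.
lfc = {"e1": -5.0, "e2": -4.0, "n1": 0.1, "n2": -0.2}
score = dauc(lfc, ["e1", "e2"], ["n1", "n2"])  # 0.5
```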

    [0111] The dAUC for CRISPR-StAR analysis is stable, ranging from 0.45 to 0.47. Even in the two lowest complexity samples, dAUC values are 0.46 and 0.45, respectively. In contrast, using conventional analysis, essentials can no longer be cleanly separated from non-essentials as complexity decreases. As above, this is caused by a broad spread of both essentials and non-essentials (FIG. 9). dAUC drops to 0.14 and 0.09 in the two lowest complexity samples, respectively (FIG. 10). Therefore, CRISPR StAR analysis outperforms conventional analysis by clearly identifying essentials as essential and by separating them from non-essentials.

    4.3.3 AUROC

    [0112] In receiver operating characteristic (ROC) curves, true positive rates are compared to false positive rates. They quantify how well a method can classify data, in this case guides, into essentials or non-essentials. We defined essentials as above and categorized them as true positives. In the same manner, we categorized non-essentials as false positives. We calculated AUROC scores on a list of guides ranked by LFC for CRISPR StAR and conventional analysis using the pROC package in R. An ideal score would be 1; a random score would be 0.5.
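
    The AUROC values in this example were computed with pROC in R. For illustration only, the same statistic can be sketched in Python through its rank interpretation: the AUROC equals the probability that a randomly chosen essential guide has a lower LFC than a randomly chosen non-essential guide, with ties counted as one half. The function name and toy values are hypothetical, and this is not pROC's implementation.

```python
def auroc(essential_lfc, non_essential_lfc):
    """AUROC for classifying guides as essential by low LFC, computed as
    the fraction of (essential, non-essential) pairs ranked correctly."""
    wins = 0.0
    for e in essential_lfc:
        for n in non_essential_lfc:
            if e < n:
                wins += 1.0   # essential is more depleted: correct order
            elif e == n:
                wins += 0.5   # tie counts as half
    return wins / (len(essential_lfc) * len(non_essential_lfc))

perfect = auroc([-5.0, -4.0], [0.0, 1.0])   # 1.0
partial = auroc([-5.0, 0.5], [0.0, 1.0])    # 0.75
```

    The pairwise loop is quadratic in the number of guides, which is acceptable for a sketch; a rank-sum formulation would scale better for full libraries.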

    [0113] For conventional analysis, with decreasing complexity, AUROC drops from 0.94 to 0.44, which is close to a random score (FIG. 11). Non-essential guides that are absent from the final population appear depleted. This causes a large negative LFC, which wrongly scores them as essentials. In contrast, with CRISPR StAR analysis, the AUROC remains between 0.91 and 0.95. Therefore, even at the lowest complexity, true positives can clearly be distinguished from false positives.

    4.4 Summary

    [0114] We calculated Pearson coefficients, dAUC and AUROC to benchmark performance of CRISPR StAR against conventional CRISPR screen analysis. Using all three methods we found that with decreasing complexities CRISPR StAR clearly outperforms conventional analysis especially in the lowest complexity samples (FIG. 12).

    [0115] Taken together, the presented data confirm that CRISPR-StAR indeed overcomes the noise in genetic screens that is introduced by the loss of complexity after a bottleneck in the screening population.

    Example 5: Organoid Screen

    [0116] In homogeneous cell populations, conditions that support high resolution CRISPR screening can be easily controlled. In more heterogeneous systems such as organoids, this is a major difficulty. To specifically test the effect of clonal heterogeneity in a model, we tested CRISPR-StAR in intestinal organoids. First, our retroviral library delivery will only infect the stem cells in the crypt, which is a small subset of the whole cell population. Therefore, infection in organoids is very inefficient and usually represents the first bottleneck that needs to be overcome. Secondly, clonal outgrowth is very heterogeneous.

    [0117] We transduced organoids carrying CreERT2 and Cas9 transgenes with our sgRNA library. They were selected for blasticidin resistance for 8 days, treated with 4OH-tamoxifen to induce Cre recombination and kept in culture for another 12 days.

    [0118] To estimate the complexity of infection, we calculated the median number of UMIs per guide. Similar to the cell culture screen, we saw a bimodal read distribution caused by index swapping. We handled this in the same way as in the cell culture screen: to separate true reads from errors, we defined the local minimum in this bimodal read distribution as a threshold and discarded all reads below it. Since the read number of a UMI in an active guide can represent a phenotype, we only set this cutoff in inactive guides and mapped the sgRNA-UMI combination in the active guides, which further cleaned the dataset of non-existing UMIs. After the cutoff, we found that infection occurred at a complexity of 30 cells per sgRNA.
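
    One way to locate such a threshold is sketched below. The approach is an assumption of this sketch rather than the procedure used here: read counts of inactive-guide UMIs are binned by the integer part of their log2 value, the two highest local maxima are taken as the modes, and the lowest bin between them defines the cutoff. Real data would likely need smoothing or a finer bin width. All names and counts are hypothetical.

```python
import math
from collections import Counter

def bimodal_threshold(read_counts):
    """Return a read-count cutoff at the valley between the two modes of
    the log2 read-count histogram."""
    bins = Counter(int(math.log2(c)) for c in read_counts if c > 0)
    lowest, highest = min(bins), max(bins)
    hist = [bins.get(b, 0) for b in range(lowest, highest + 1)]
    last = len(hist) - 1
    # Local maxima: higher than the left neighbour, not lower than the right.
    peaks = [i for i in range(len(hist))
             if (i == 0 or hist[i] > hist[i - 1])
             and (i == last or hist[i] >= hist[i + 1])]
    if len(peaks) < 2:
        raise ValueError("distribution is not bimodal at this bin width")
    left, right = sorted(sorted(peaks, key=lambda i: hist[i], reverse=True)[:2])
    valley = min(range(left, right + 1), key=lambda i: hist[i])
    return 2 ** (lowest + valley)

# Hypothetical reads per UMI: a low mode (index swapping) and a high mode.
reads = [1, 1, 1, 1, 2, 2, 2, 60, 70, 100, 120, 130, 200]
threshold = bimodal_threshold(reads)
kept = [c for c in reads if c >= threshold]
```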

    [0119] UMIs on the guide construct allow for tracking of clonal outgrowth of individually marked cells; every UMI within the same guide therefore represents a biological replicate. Accordingly, we modified our dataset by splitting it into two groups according to the first letter of the UMI: UMIs starting with A or T in one group and UMIs starting with C or G in the other. These two groups were then used as biological replicates.
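
    The split into UMI-based replicates can be sketched as follows. The data layout (a mapping from sgRNA and UMI to read count) and the function name are hypothetical, and UMIs starting with any other character, such as N, are skipped in this sketch.

```python
from collections import defaultdict

def split_by_first_base(umi_reads):
    """Split reads into two pseudo-replicates by the first base of the UMI.
    umi_reads maps (sgRNA, UMI) to a read count; each returned dict maps
    sgRNA to the summed reads of its UMIs in that group."""
    group_at, group_cg = defaultdict(int), defaultdict(int)
    for (guide, umi), reads in umi_reads.items():
        first = umi[0].upper()
        if first in ("A", "T"):
            group_at[guide] += reads
        elif first in ("C", "G"):
            group_cg[guide] += reads
        # Any other first character (e.g. N) is ignored.
    return dict(group_at), dict(group_cg)

# Hypothetical UMI-level counts.
counts = {
    ("sgEgfr_1", "ACGTAC"): 10,
    ("sgEgfr_1", "TTGACA"): 5,
    ("sgEgfr_1", "CGATTA"): 7,
    ("sgNf2_1", "GGCATC"): 3,
    ("sgNf2_1", "NACGTA"): 9,
}
replicate_at, replicate_cg = split_by_first_base(counts)
```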

    5.1 Benchmarking

    [0120] To benchmark performance of CRISPR-StAR in organoids compared to conventional screening methodology, we calculated Pearson coefficients between replicates based on UMIs. Next, we analyzed guide reproducibility within a ranked list of guides by calculating the number of genes compared to the number of guides and scored correlation of two biological replicates determined by UMI. Lastly, we compared hit lists in both types of analysis within the same two replicates.

    5.2 Correlation

    [0121] To compare reproducibility of CRISPR StAR to conventional analysis, we calculated Pearson coefficients between these UMI-based biological replicates. To generate a day 0 sample for conventional analysis, we took both replicates of day 0 samples in the proof of concept screen and calculated mean read numbers of each guide. As we do not know the complete essentialome of organoids, we could not apply the same benchmarking procedure as for the cell culture screen (Example 2.2). Instead, we used core essentials as defined by Hart (Hart et al., Cell 163(6), 2015: 1515-1526) that should be depleting in every cell type.

    [0122] We found a poor reproducibility of screening results using conventional analysis (R=0.27) while CRISPR-StAR analysis of the same dataset generated more reproducible hit lists (R=0.53). Overall, the spread of data is larger when using conventional analysis. In contrast, there is a very sharp signal with CRISPR StAR analysis, after identifying 557 missing guides, which were lost in the bottleneck, and were therefore excluded from CRISPR StAR analysis (FIG. 13a).

    5.3 Guide Reproducibility

    [0123] To test guide reproducibility, we used the MAGeCK algorithm to generate a ranked list of guides. From this list, we calculated the average number of sgRNAs present per gene for all genes hit by the respective group of guides sorted by rank. For example, if 15 genes were hit within the top 30 sgRNAs, the value was 2; a value of 1 would be expected for a random data set. While conventional analysis leads to a close to random result, CRISPR-StAR shows higher reproducibility of scored genes (FIG. 13b).
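
    The guide reproducibility measure can be written down in a few lines. This is a sketch with hypothetical names: it takes a list of guides that is already ranked (for example by MAGeCK) and divides the number of top-ranked guides by the number of distinct genes they target.

```python
def guides_per_gene(ranked_guides, guide_to_gene, top_n):
    """Average number of sgRNAs per gene among the top_n ranked guides."""
    top = ranked_guides[:top_n]
    genes = {guide_to_gene[g] for g in top}
    return len(top) / len(genes)

# Hypothetical ranked list in which consecutive guides share a gene:
# 30 guides hitting 15 genes, as in the example in the text.
ranked = [f"guide_{i}" for i in range(30)]
mapping = {f"guide_{i}": f"gene_{i // 2}" for i in range(30)}
value = guides_per_gene(ranked, mapping, top_n=30)  # 2.0
```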

    5.4 Gene Reproducibility

    [0124] Lastly, for comparison at gene level, we used MAGeCK for both ways of analysis to combine guides and create a ranked list of genes. Not only could we call top hits with higher p-values compared to conventional analysis, but the scored genes were also more reproducible between replicates. Furthermore, using CRISPR-StAR analysis, we called 4 out of the top 10 depleting genes in both replicates, in contrast to only one commonly depleting gene using conventional analysis. These are hits that we expect to find since they are either core essential or specific to organoid growth (Egfr, Itgb1, Top2a, Rpl14). Among the top 5 enriching genes, we found 2 that were common between replicates (Nf2, Cdkn2a), while we did not find any common genes using conventional analysis (FIG. 13c). Furthermore, genes that scored in the respective other replicate also score highly in CRISPR-StAR analysis, while they are more widely distributed in conventional analysis.

    [0125] We conclude that CRISPR-StAR can identify screen hits robustly and thereby outperforms conventional analysis, allowing reproducible results even in heterogeneous systems such as intestinal organoids.

    Example 6: In Vitro Versus In Vivo Screening

    6.1 Material

    Cell Lines

    [0126] Yumm1.7 450R melanoma cells (received from the Obenauf Lab, IMP, Vienna).

    Lenti-X (Clontech 632180)

    Cell Culture Medium

    [0127] Yumm1.7 450R melanoma cells: DMEM/F12 supplemented with 10% FCS (Gibco), 1% L-Glutamine (Gibco), 1% penicillin-streptomycin (Sigma). Medium for YUMM1.7 450R(Cas9-Cre.sup.ERT2) contained additionally puromycin (1 μg/ml, Invivogen).
    Lenti-X cells: DMEM supplemented with 10% FCS (Gibco), 1% L-Glutamine (Gibco), 1% penicillin-streptomycin (Sigma), 1% non-essential amino acids (NEAA, Gibco), 1% sodium pyruvate (Sigma).

    Buffer

    2×SDS Lysis Buffer:

    [0128] 10 mM Tris-HCl pH 8 (Sigma Aldrich), 1% SDS (in-house), 10 mM EDTA (Sigma Aldrich), 100 mM NaCl (Sigma Aldrich), freshly added 1 mg/ml proteinase K (New England Biolabs).

    TABLE-US-00004 Primers
    FW_G_CrSc_2: (SEQ ID NO: 21) AATGATACGGCGACCACCGAGATCTACACACCGAACGAGGGCCTATTTCCCATGATTCCTTC
    FW_G_CrSc_15: (SEQ ID NO: 22) AATGATACGGCGACCACCGAGATCTACACCATTACCGAGGGCCTATTTCCCATGATTCCTTC
    FW_G_CrSc_20: (SEQ ID NO: 23) AATGATACGGCGACCACCGAGATCTACACCGTCATCGAGGGCCTATTTCCCATGATTCCTTC
    RV_G_CrSc: (SEQ ID NO: 24) CAAGCAGAAGACGGCATACGAGATACCGTTGATGAGTAG
    NGS_U6: (SEQ ID NO: 25) CGATTTCTTGGCTTTATATATCTTGTGGAAAGGACGAAACACCG
    NGS_customNextSeq_i2_primer: (SEQ ID NO: 26) GAAGGAATCATGGGAAATAGGCCCTCG

    6.2. Methods

    6.2.1 Generation of Cas9 and CreERT2-Expressing Single-Cell Derived Clones

    [0129] For in vivo and in vitro screening, we generated Yumm1.7 450R cells with Cas9 and CreERT2. First, cells were sequentially transduced with PX459 pSpCas9(BB)-2A-Puro and pMSCV-GFP-mir30-PGK-CreERT2. Bulk cell population was selected for puromycin resistance and single cell clones were derived by single cell fluorescence-activated cell sorting (FACS). Subsequently, clones were tested for Cas9 function and leaky creERT2 expression using CRISPR-Switch with an sgRNA for GFP (Chylinski et al, Nature Communications 10, 2019).

    6.2.2 Pooled Library Cloning

    [0130] To generate the lentiviral library containing the StAR construct with the drugged sgRNA library pool, 15,723 sgRNAs were PCR amplified and cloned into the StAR vector by Golden Gate cloning. Subsequently, the plasmid was electroporated into bacteria (Endura ElectroCompetent cells, Lucigen). After transformation, the bacteria were recovered for 1 h in LB medium at 37° C., plated in LB-agar plates containing ampicillin, and incubated over-night at 37° C. We confirmed a 3,000-fold coverage of each sgRNA in the library. Plasmid DNA was isolated and used to create lentivirus particles.

    6.2.3 In Vitro Screening

    [0131] The StAR construct containing the drugged sgRNA library pool (15,723 sgRNAs) was packaged into Lenti-X cells according to the manufacturer's recommendations. The monoclonal YUMM1.7 450R(Cas9-Cre.sup.ERT2) cells were transduced with lentiviral particles, followed by neomycin selection (Geneticin G-418, 500 μg/ml, Gibco) for 4 days. Cells were split into two groups, for in vitro and in vivo screening. The cells for in vitro screening were cultured, and creERT2 recombination was induced with 4OH-tamoxifen (0.5 μM) for 3 days. Cells were maintained for 21 days after induction.

    6.2.4 In Vivo Screening

    [0132] 1*10.sup.6 cells in 50 μl (PBS:Matrigel) were subcutaneously injected into the flanks of 6-12 week-old female mice. 7 days post cell injection we induced creERT2 recombination by intraperitoneal injection of 5 mg tamoxifen per 30 g. Every week, tumour size was measured, and mice were terminated when tumour size reached 2 cm.sup.3 (6-13 days post tamoxifen injection).

    6.2.5 Genomic DNA Extraction and NGS Library Preparation

    [0133] In vitro screened cells collected on day 21 were lysed at 55° C. for 24 h with lysis buffer. Tumours harvested from mice were lysed in 15-20 ml lysis buffer at 55° C. for 48-72 h. Both lysed cells and tumours were treated with 0.1 mg/ml RNase A (Qiagen) for 1 h at 37° C. gDNA was extracted with phenol and chloroform, followed by isopropanol and EtOH precipitation. To fragment the DNA, samples were digested with BsmBI for 48 h. Each sample was then PCR amplified in 48 individual 50 μl reactions with 1 μg DNA per reaction (95° C. 3 min, [95° C. 20 sec, 59° C. 20 sec, 72° C. 40 sec]×33, 72° C. 3 min, 4° C. ∞). Forward primers were unique for each sample and contained a 6 bp experimental index for demultiplexing after NGS (FW_G_CrSc_2, FW_G_CrSc_15 or FW_G_CrSc_20 primers in Material). The reverse primer was the same for each sample (RV_G_CrSc). PCR products were purified and size separated by agarose gel electrophoresis. The two recombination products were excised together and purified on a mini-elute column. This sample was sequenced on an Illumina NextSeq2000 with a P2 SR100 sequencing run. sgRNAs were sequenced with a custom read primer (Read 1, NGS_U6). Active and inactive sgRNA constructs can be distinguished by analysing the sequence of the vector 55 bp after the sgRNA. To determine the index, another custom primer was used (Index2, NGS_customNextSeq_i2_primer).
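
    The 6 bp experimental index can be read directly from the forward primer sequences (SEQ ID NOs: 21-23): in each primer it sits immediately upstream of the reverse complement of NGS_customNextSeq_i2_primer (SEQ ID NO: 26). The following sketch extracts it. The function names are hypothetical, and whether the sequencer reports the index in this orientation or as its reverse complement depends on the instrument chemistry, which is not assumed here.

```python
def reverse_complement(seq):
    """Reverse complement of an upper-case DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

# Primer sequences as listed in the Primers table (SEQ ID NOs: 21-23, 26).
FORWARD_PRIMERS = {
    "FW_G_CrSc_2": "AATGATACGGCGACCACCGAGATCTACACACCGAACGAGGGCCTATTTCCCATGATTCCTTC",
    "FW_G_CrSc_15": "AATGATACGGCGACCACCGAGATCTACACCATTACCGAGGGCCTATTTCCCATGATTCCTTC",
    "FW_G_CrSc_20": "AATGATACGGCGACCACCGAGATCTACACCGTCATCGAGGGCCTATTTCCCATGATTCCTTC",
}
I2_PRIMER = "GAAGGAATCATGGGAAATAGGCCCTCG"

def experimental_index(primer, i2_primer=I2_PRIMER, length=6):
    """Return the bases directly upstream of the index-read priming site."""
    site = reverse_complement(i2_primer)
    position = primer.find(site)
    if position < length:
        raise ValueError("priming site not found or index truncated")
    return primer[position - length:position]

indices = {name: experimental_index(seq) for name, seq in FORWARD_PRIMERS.items()}
```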

    6.3. Results & Discussion

    [0134] Major challenges must be overcome when performing in vivo screens. There are several technical bottlenecks in allograft screening, including infection and engraftment efficiency. Additionally, heterogeneity arises both from intrinsic factors that are cell (line) dependent and from extrinsic factors that depend on the location of a cell in vivo (e.g. close to a blood vessel versus the middle of a tumour). These problems lead to unequal sgRNA representation, which makes conventional screening analysis, where one compares sgRNA abundance on the first and last day of the screen, unsuitable. An example of this is the loss of some sgRNAs because the cells that harboured these sgRNAs could not engraft in the mouse. If the sgRNAs on the first and last day of the screen were compared, these sgRNAs would be identified as depleted, and the targeted gene would therefore be defined as essential for the outgrowth of the tumour, which would be a false positive result. CRISPR-StAR overcomes such challenges by comparing active and inactive sgRNAs present in engrafted cells at the end of the screen. This example can further elucidate genetic dependencies that differ between in vitro and in vivo conditions.

    [0135] This example describes a comparison between an in vivo screen and an in vitro screen. We used the monoclonal melanoma cell line YUMM1.7 450R containing Cas9 and Cre.sup.ERT2. Upon viral transduction with the StAR construct harbouring the drugged sgRNA library pool (15,723 sgRNAs), selected cells were screened either in vitro or in vivo. 4OH-tamoxifen was used to induce Cre recombination in vitro at the start of the screen, whereas intraperitoneal injection of tamoxifen 10 days post injection of the cells induced recombination in vivo. After a short screening time of 6-13 days in vivo (depending on tumour growth rate), DNA was extracted from tumours and in vitro screened cells, subjected to next generation sequencing, and bioinformatically analyzed.

    [0136] From this in vivo screen, we were able to retrieve reads from inactive and active sgRNA constructs, indicating successful Cre recombination in vivo in the StAR vector. Active sgRNAs targeting essential genes were depleted relative to the corresponding inactive sgRNA. The effect of the sgRNAs in vitro and in vivo is calculated by summing the reads of UMIs for the same sgRNA, calculating the Log.sub.e fold change (LFC) of each UMI and then calculating the median of the sum LFC for sgRNAs targeting the same gene (FIG. 14). Negative control genes (depicted in black) do not show an effect in vitro or in vivo. The majority of the essential genes (depicted in red) are depleted in vitro and in vivo. Dot size represents the number of UMIs per gene in the in vivo sample.
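
    The gene-level aggregation can be sketched as follows. The order of operations in the description leaves room for interpretation; this sketch adopts one reading, in which reads are summed over all UMIs of an sgRNA, a natural-log fold change of active over inactive reads is computed per sgRNA, and the median is taken over the sgRNAs of a gene. The pseudocount, the exclusion of sgRNAs without inactive reads, and all names are assumptions.

```python
import math
from collections import defaultdict
from statistics import median

def gene_lfc(umi_reads, guide_to_gene, pseudocount=1.0):
    """umi_reads: iterable of (sgRNA, UMI, state, reads) with state
    'active' or 'inactive'. Returns gene -> median sgRNA-level LFC."""
    sums = defaultdict(lambda: {"active": 0, "inactive": 0})
    for guide, _umi, state, reads in umi_reads:
        sums[guide][state] += reads  # sum reads over UMIs of the same sgRNA
    per_gene = defaultdict(list)
    for guide, s in sums.items():
        if s["inactive"] == 0:
            continue  # no internal control available for this sgRNA
        lfc = math.log((s["active"] + pseudocount) / (s["inactive"] + pseudocount))
        per_gene[guide_to_gene[guide]].append(lfc)
    return {gene: median(values) for gene, values in per_gene.items()}

# Hypothetical UMI-level reads for one essential and one control gene.
records = [
    ("a1", "ACGT", "active", 5), ("a1", "TTGA", "active", 5),
    ("a1", "ACGT", "inactive", 50), ("a1", "TTGA", "inactive", 49),
    ("a2", "CCGA", "inactive", 99),
    ("b1", "GATC", "active", 99), ("b1", "GATC", "inactive", 99),
]
genes = {"a1": "GeneA", "a2": "GeneA", "b1": "GeneB"}
result = gene_lfc(records, genes)
```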

    Example 7: CRISPR Screen in Mouse Liver

    [0137] To perform in vivo CRISPR screening in endogenous tissues, it is necessary to selectively expand library-carrying cells in vivo, similar to selecting cells in vitro with antibiotics. In this example, we demonstrate this expansion in the liver, as hepatocytes can proliferate to regenerate the liver following liver damage. In this case, only a few cells carrying the StAR library repopulate the liver, resulting in enough cells to retrieve the library and perform a screen by comparing the ratio between active and inactive sgRNAs. Liver repopulation in fumarylacetoacetate hydrolase (FAH) homozygous knock-out (FAH−/−) mice with healthy hepatocytes is an established method to study liver regeneration (Montini et al. (2002) Molecular Therapy, 6(6), 759-769; Wuestefeld et al. (2013) Cell, 153(2), 389-401; Zhu et al. (2019) Cell, 177(3), 608-621.e12). FAH metabolizes toxic fumarylacetoacetate (FAA) into fumarate and acetoacetate. Mice lacking a functional FAH enzyme die from liver failure. However, FAH−/− mice can be maintained by nitisinone (NTBC) treatment. NTBC inhibits 4-hydroxyphenylpyruvate dioxygenase (HPD), an upstream enzyme in this metabolic pathway, preventing accumulation of FAA. Hepatocytes carrying a functional FAH gene can repopulate an FAH−/− liver when NTBC is withdrawn.

    [0138] FIG. 15 shows the sleeping beauty transposon with an EGFP-P2A-FAH expression cassette under control of the EF1a promoter with the CRISPR-StAR construct. 25 μg of the transposon plasmid and 5 μg of sleeping beauty transposase SB100X plasmid in 0.9% NaCl saline were injected into FAH−/− mice, which were maintained with 1.8 mg of NTBC in 250 mL of drinking water. A volume corresponding to 10% of the total body weight was injected into the tail vein in 5 seconds. NTBC concentration was reduced to 20% of the original concentration one day post injection. 7 days post injection, NTBC was completely removed from the drinking water. The StAR construct is cloned on a sleeping beauty transposon containing the FAH expression cassette. In this way, the liver can be repopulated with cells carrying the StAR construct. The sleeping beauty transposon and transposase were delivered into the liver via hydrodynamic tail vein injection (Bell et al. (2007) Nature Protocols, 2(12), 3153-3165; Liu et al. (1999) Gene Therapy, 6(7), 1258-1266), and we confirmed that cells carrying the StAR construct repopulated the liver after NTBC withdrawal. Thus, we can repopulate the liver with healthy, StAR-containing cells to perform the CRISPR-StAR screen.
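
    The quantities in this protocol translate into simple arithmetic, sketched here for convenience. The 25 g body weight is a hypothetical example value, and the conversion of body weight to injection volume assumes a density of about 1 g per mL; neither is specified above.

```python
def injection_volume_ml(body_weight_g, fraction=0.10):
    """Injection volume equal to 10% of body weight (assumes 1 g ~ 1 mL)."""
    return body_weight_g * fraction

def ntbc_mg_per_litre(ntbc_mg=1.8, water_ml=250.0):
    """NTBC concentration in the drinking water, in mg per litre."""
    return ntbc_mg / water_ml * 1000.0

full_dose = ntbc_mg_per_litre()        # 1.8 mg in 250 mL = 7.2 mg/L
reduced_dose = 0.20 * full_dose        # 20% of the original = 1.44 mg/L
volume = injection_volume_ml(25.0)     # hypothetical 25 g mouse: 2.5 mL
```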

    [0139] Another example of expanding StAR-containing cells in the liver is by inducing liver cancer. Here, the StAR construct is cloned onto a sleeping beauty transposon with a KrasG12D expression cassette, a well-known cancer driver. We confirmed that StAR-containing cells expanded in the healthy liver. FIG. 16 shows the sleeping beauty transposon with a KrasG12D-P2A-FAH expression cassette under the control of the EF1a promoter with the CRISPR-StAR construct. 15 μg of the transposon plasmid and 3 μg of the sleeping beauty transposase SB100X plasmid in 0.9% NaCl saline were injected into WT mice. A volume corresponding to 10% of the total body weight was injected into the tail vein in 5 seconds. To accelerate this expansion, the transposon is injected into a liver conditionally depleted for p53, which is achieved by activating Alb-Cre.sup.ERT2 in a p53 fl/fl mouse (Ju et al. (2016) International Journal of Cancer, 138(7), 1601-1608).

    [0140] The in vivo liver screening would be done in mice carrying Cas9 and Alb-CreERT2 on an FAH−/− or p53 fl/fl background. These examples demonstrate two methods of expanding a CRISPR-StAR library in vivo prior to inducing recombination and performing the screen.