Systems, methods, and compositions for site-specific genetic engineering using programmable addition via site-specific targeting elements (PASTE)
11572556 · 2023-02-07
CPC classification
C12N2310/20
CHEMISTRY; METALLURGY
C12N15/111
CHEMISTRY; METALLURGY
C12N9/22
CHEMISTRY; METALLURGY
C12Y207/07049
CHEMISTRY; METALLURGY
A61K31/7105
HUMAN NECESSITIES
C12N15/113
CHEMISTRY; METALLURGY
C12N9/1276
CHEMISTRY; METALLURGY
International classification
C12N15/113
CHEMISTRY; METALLURGY
C12N15/90
CHEMISTRY; METALLURGY
C12N15/10
CHEMISTRY; METALLURGY
C12N15/11
CHEMISTRY; METALLURGY
A61K31/7105
HUMAN NECESSITIES
C12N9/12
CHEMISTRY; METALLURGY
Abstract
This disclosure provides systems, methods, and compositions for site-specific genetic engineering using Programmable Addition via Site-Specific Targeting Elements (PASTE). PASTE comprises the addition of an integration site into a target genome followed by the insertion of one or more genes of interest or one or more nucleic acid sequences of interest at the site. PASTE combines gene editing technologies and integrase technologies to achieve unidirectional incorporation of genes in a genome for the treatment of diseases and diagnosis of disease.
Claims
1. A method of site-specifically integrating an exogenous nucleic acid sequence into a mammalian cell genome or intracellular target nucleic acid, the method comprising: (a) incorporating at least one integration sequence at a specific target site in the cell genome or intracellular target nucleic acid by introducing ex vivo into a mammalian cell: (i) an expressible polynucleotide construct encoding an editing polypeptide, wherein the editing polypeptide comprises a DNA binding nuclease domain linked via a linker to a reverse transcriptase domain, wherein the DNA binding nuclease domain comprises a nickase activity; and (ii) a guide RNA (gRNA) comprising a targeting sequence, a primer binding sequence, and a complement of the at least one integration sequence, wherein the gRNA interacts with the expressed editing polypeptide to direct the editing polypeptide to the specific target site of the cell genome or intracellular target nucleic acid, wherein the DNA binding nuclease domain nicks a strand of the cell genome or intracellular target nucleic acid to form a nicked site, and wherein the reverse transcriptase domain reverse transcribes the complement of the at least one integration sequence within the gRNA and thereby incorporates the at least one integration sequence into the nicked site, thereby incorporating the at least one integration sequence at the specific target site of the cell genome or intracellular target nucleic acid; and (b) integrating an exogenous nucleic acid sequence into the cell genome or intracellular target nucleic acid by introducing into the cell: (i) the exogenous nucleic acid sequence linked to a sequence that is an integration cognate to the site-specifically incorporated integration sequence; and (ii) an expressible polynucleotide construct encoding an integration enzyme, wherein the integration enzyme integrates the exogenous nucleic acid sequence into the cell genome or the intracellular target nucleic acid at the at least one site-specifically incorporated integration sequence, thereby site-specifically integrating the exogenous nucleic acid sequence into the cell genome or the intracellular target nucleic acid, wherein the expressible polynucleotide encoding the editing polypeptide, the gRNA, the expressible polynucleotide construct encoding the integration enzyme, and the exogenous nucleic acid sequence are introduced into the mammalian cell concurrently.
2. The method of claim 1, wherein the gRNA, the expressible polynucleotide construct encoding the editing polypeptide, and the expressible polynucleotide construct encoding the integration enzyme are introduced into the mammalian cell using a virus, an RNP, an mRNA, a lipid, or a polymeric nanoparticle.
3. The method of claim 1, wherein the gRNA hybridizes to a strand of the mammalian cell genome.
4. The method of claim 1, wherein the exogenous nucleic acid is introduced into the mammalian cell as an adeno-associated virus (AAV) or an adenovirus (AdV).
5. The method of claim 1, wherein the exogenous nucleic acid is introduced into the mammalian cell as a minicircle, a plasmid, mRNA, or a linear DNA.
6. The method of claim 5, wherein the minicircle does not comprise a sequence of a bacterial origin.
7. The method of claim 1, wherein the linker is cleavable.
8. The method of claim 1, wherein the linker is non-cleavable.
9. The method of claim 1, wherein the linker is two associating binding domains of the DNA binding nuclease linked to a reverse transcriptase.
10. The method of claim 1, wherein the integration enzyme is selected from the group consisting of Dre, Vika, Bxb1, φC31, RDF, FLP, φBT1, R1, R2, R3, R4, R5, TP901-1, A118, φFC1, φC1, MR11, TG1, φ370.1, Wβ, BL3, SPBc, K38, Peaches, Veracruz, Rebeuca, Theia, Benedict, KSSJEB, PattyP, Doom, Scowl, Lockley, Switzer, Bob3, Troube, Abrogate, Anglerfish, Sarfire, SkiPole, ConceptII, Museum, Severus, Airmid, Benedict, Hinder, ICleared, Sheen, Mundrea, BxZ2, φRV, retrotransposases encoded by R2, L1, Tol2, Tc1, Tc3, Mariner Himar 1, Mariner mos 1, and Minos, and any mutants thereof.
11. The method of claim 1, wherein the integration sequence is an attB sequence, an attP sequence, an attL sequence, an attR sequence, a lox71 sequence, a Vox sequence, or a FRT sequence.
12. The method of claim 1, wherein the DNA binding nuclease domain comprising a nickase activity is selected from Cas9-D10A, Cas9-H840A, and Cas12a/b nickase.
13. The method of claim 1, wherein the reverse transcriptase domain comprises a mutation relative to a wild-type sequence.
14. The method of claim 1, wherein the reverse transcriptase domain is selected from the group consisting of Moloney Murine Leukemia Virus (M-MLV) reverse transcriptase domain, transcription xenopolymerase (RTX), avian myeloblastosis virus reverse transcriptase (AMV-RT), and Eubacterium rectale maturase RT.
15. The method of claim 14, wherein the M-MLV reverse transcriptase domain comprises one or more mutations selected from the group consisting of D200N, T306K, W313F, T330P, and L603W.
16. The method of claim 1, further comprising introducing a nicking guide RNA (ngRNA).
17. The method of claim 1, wherein: the exogenous nucleic acid is a reporter gene; the exogenous nucleic acid is a degradation tag for programmable knockdown of proteins in the presence of small molecules; the exogenous nucleic acid is a T-cell receptor (TCR), a chimeric antigen receptor (CAR), an interleukin, a cytokine, or an immune checkpoint gene and the mammalian cell is a T-cell or natural killer (NK) cell; the exogenous nucleic acid is a beta hemoglobin (HBB) gene and the cell is a hematopoietic stem cell (HSC); the exogenous nucleic acid is a metabolic gene; or the exogenous nucleic acid is a gene involved in an inherited disease or an inherited syndrome.
18. The method of claim 17, wherein the reporter gene is a fluorescent protein.
19. The method of claim 1, wherein the mammalian cell is a dividing cell or a non-dividing cell.
20. The method of claim 1, wherein: the exogenous nucleic acid is between 1000 bp and 36,000 bp; the exogenous nucleic acid is more than 36,000 bp; or the exogenous nucleic acid is less than 1000 bp.
21. The method of claim 17, wherein the inherited disease is cystic fibrosis, familial hypercholesterolemia, adenosine deaminase (ADA) deficiency, X-linked SCID (X-SCID), Wiskott-Aldrich syndrome (WAS), hemochromatosis, Tay-Sachs, fragile X syndrome, Huntington's disease, Marfan syndrome, phenylketonuria, or muscular dystrophy.
22. A method of site-specifically integrating an exogenous nucleic acid sequence into a mammalian cell genome or intracellular target nucleic acid, the method comprising: (a) incorporating at least one integration sequence at a specific target site in the cell genome or intracellular target nucleic acid by introducing ex vivo into a mammalian cell: (i) an expressible polynucleotide construct encoding an editing polypeptide, wherein the editing polypeptide comprises a DNA-binding nuclease domain linked via a linker to a reverse transcriptase domain, wherein the DNA-binding nuclease domain comprises a nickase activity; and (ii) an expressible polynucleotide construct encoding a guide RNA (gRNA) comprising a targeting sequence, a primer binding sequence, and a complement of the at least one integration sequence, wherein the gRNA interacts with the expressed editing polypeptide to direct the editing polypeptide to the specific target site of the cell genome or intracellular target nucleic acid, wherein the DNA-binding nuclease domain nicks a strand of the cell genome or intracellular target nucleic acid to form a nicked site, and wherein the reverse transcriptase domain reverse transcribes the complement of the at least one integration sequence and thereby incorporates the at least one integration sequence into the nicked site, thereby incorporating the at least one integration sequence at the specific target site of the cell genome or intracellular target nucleic acid; and (b) integrating the exogenous nucleic acid sequence into the cell genome or intracellular target nucleic acid by introducing into the mammalian cell: (i) an exogenous nucleic acid sequence linked to a sequence that is an integration cognate to the site-specifically incorporated integration sequence; and (ii) an expressible polynucleotide construct encoding an integration enzyme, wherein the integration enzyme integrates the exogenous nucleic acid into the cell genome or intracellular target nucleic acid at the at least one site-specifically incorporated integration sequence, thereby site-specifically integrating the exogenous nucleic acid into the cell genome or the intracellular target nucleic acid, wherein the integration sequence is an attB sequence, an attP sequence, an attL sequence, an attR sequence, a lox71 sequence, a Vox sequence, or a FRT sequence, wherein the integration sequence is longer than 38 basepairs, and wherein the expressible polynucleotide constructs encoding the editing polypeptide, the gRNA, and the integration enzyme, and the exogenous nucleic acid, are introduced into the mammalian cell concurrently.
23. The method of claim 22, wherein the integration sequence is 40, 42, 44, or 46 base pairs.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
(1) Aspects, features, benefits and advantages of the embodiments described herein will be apparent with regard to the following description, appended claims, and accompanying drawings where:
DETAILED DESCRIPTION
(168) It will be appreciated that for clarity, the following discussion will describe various aspects of embodiments of the applicant's teachings. It should be noted that the specific embodiments are not intended as an exhaustive description or as a limitation to the broader aspects discussed herein. One aspect described in conjunction with a particular embodiment is not necessarily limited to that embodiment and can be practiced with any other embodiment(s). Reference throughout this specification to “one embodiment,” “an embodiment,” or “an example embodiment” means that a particular feature, structure or characteristic described in connection with the embodiment is included in at least one embodiment of the present disclosure. Thus, appearances of the phrases “in one embodiment,” “in an embodiment,” or “an example embodiment” in various places throughout this specification are not necessarily all referring to the same embodiment, but may be. Furthermore, the particular features, structures or characteristics may be combined in any suitable manner, as would be apparent to a person skilled in the art from this disclosure, in one or more embodiments.
General Definitions
(169) Unless defined otherwise, technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure pertains. Definitions of common terms and techniques in molecular biology may be found in Molecular Cloning: A Laboratory Manual, 2nd edition (1989) (Sambrook, Fritsch, and Maniatis); Molecular Cloning: A Laboratory Manual, 4th edition (2012) (Green and Sambrook); Current Protocols in Molecular Biology (1987) (F. M. Ausubel et al. eds.); the series Methods in Enzymology (Academic Press, Inc.); PCR 2: A Practical Approach (1995) (M. J. MacPherson, B. D. Hames, and G. R. Taylor eds.); Antibodies, A Laboratory Manual (1988) (Harlow and Lane, eds.); Antibodies: A Laboratory Manual, 2nd edition (2013) (E. A. Greenfield ed.); Animal Cell Culture (1987) (R. I. Freshney, ed.); Benjamin Lewin, Genes IX, published by Jones and Bartlett, 2008 (ISBN 0763752223); Kendrew et al. (eds.), The Encyclopedia of Molecular Biology, published by Blackwell Science Ltd., 1994 (ISBN 0632021829); Robert A. Meyers (ed.), Molecular Biology and Biotechnology: a Comprehensive Desk Reference, published by VCH Publishers, Inc., 1995 (ISBN 9780471185710); Singleton et al., Dictionary of Microbiology and Molecular Biology 2nd ed., J. Wiley & Sons (New York, N.Y. 1994); March, Advanced Organic Chemistry Reactions, Mechanisms and Structure 4th ed., John Wiley & Sons (New York, N.Y. 1992); and Marten H. Hofker and Jan van Deursen, Transgenic Mouse Methods and Protocols, 2nd edition (2011).
(170) As used herein, the singular forms “a”, “an,” and “the” include both singular and plural referents unless the context clearly dictates otherwise. Thus, for example, reference to “a cell” includes a plurality of such cells.
(171) As used herein, the term “optional” or “optionally” means that the subsequently described event, circumstance or substituent may or may not occur, and that the description includes instances where the event or circumstance occurs and instances where it does not.
(172) The recitation of numerical ranges by endpoints includes all numbers and fractions subsumed within the respective ranges, as well as the recited endpoints.
(173) As used herein, the term “about” or “approximately,” when referring to a measurable value such as a parameter, an amount, a temporal duration, and the like, is meant to encompass variations of and from the specified value, such as variations of +/−10% or less, +/−5% or less, +/−1% or less, +/−0.5% or less, and +/−0.1% or less of and from the specified value, insofar as such variations are appropriate to perform in the disclosure. It is to be understood that the value to which the modifier “about” or “approximately” refers is itself also specifically, and preferably, disclosed.
(174) It is noted that all publications and references cited herein are expressly incorporated herein by reference in their entirety. The publications discussed herein are provided solely for their disclosure prior to the filing date of the present application. Nothing herein is to be construed as an admission that the present disclosure is not entitled to antedate such publication. Further, the dates of publication provided may be different from the actual publication dates which may need to be independently confirmed.
(175) Overview
(176) The embodiments disclosed herein provide non-naturally occurring or engineered systems, methods, and compositions for site-specific genetic engineering using Programmable Addition via Site-Specific Targeting Elements (PASTE). A schematic diagram illustrating the concept of PASTE is shown in
(177) An advantage of the non-naturally occurring or engineered systems, methods, and compositions for site-specific genetic engineering disclosed herein is programmable insertion of large elements without reliance on DNA damage responses.
(178) Another advantage of the non-naturally occurring or engineered systems, methods, and compositions for site-specific genetic engineering disclosed herein is facile multiplexing, enabling programmable insertion at multiple sites.
(179) Another advantage of the non-naturally occurring or engineered systems, methods, and compositions for site-specific genetic engineering disclosed herein is scalable production and delivery through minicircle templates.
(180) Prime Editing
(181) The present disclosure provides non-naturally occurring or engineered systems, methods, and compositions for site-specific genetic engineering using gene editing technologies, such as prime editing, to add an integration site into a target genome. Prime editing is discussed in more detail below.
(182) Prime editing is a versatile and precise genome editing method that directly writes new genetic information into a specified DNA site. A schematic diagram illustrating the concept of prime editing is shown in
(183) A prime editor refers to a Moloney Murine Leukemia Virus (M-MLV) reverse transcriptase (RT) fused to a Cas9 H840A nickase. Fusing the RT to the C-terminus of the Cas9 nickase may result in higher editing efficiency. Such a complex is called PE1. The Cas9(H840A) can also be linked to a non-M-MLV reverse transcriptase such as an AMV-RT or RTX (Cas9(H840A)-AMV-RT or RTX). In some embodiments, Cas9(H840A) can be replaced with Cas12a/b or Cas9(D10A). A Cas9 (wild type), Cas9(H840A), Cas9(D10A) or Cas12a/b nickase fused to a pentamutant of M-MLV RT (D200N/L603W/T330P/T306K/W313F), having up to about 45-fold higher efficiency, is called PE2. In some embodiments, the M-MLV RT comprises one or more of the mutations: Y8H, P51L, S56A, S67R, E69K, V129P, L139P, T197A, H204R, V223H, T246E, N249D, E286R, Q291I, E302K, E302R, F309N, M320L, P330E, L435G, L435R, N454K, D524A, D524G, D524N, E562Q, D583N, H594Q, E607K, D653N, and L671P. In some embodiments, the reverse transcriptase can also be a wild-type or modified transcription xenopolymerase (RTX), avian myeloblastosis virus reverse transcriptase (AMV-RT), Feline Immunodeficiency Virus reverse transcriptase (FIV-RT), FeLV-RT (Feline leukemia virus reverse transcriptase), HIV-RT (Human Immunodeficiency Virus reverse transcriptase), or Eubacterium rectale maturase RT (MarathonRT). PE3 involves nicking the non-edited strand, potentially causing the cell to remake that strand using the edited strand as the template to induce HR. The nicking of the non-edited strand can involve the use of a nicking guide RNA (ngRNA).
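The substitution notation used above (for example D200N, meaning aspartate at position 200 replaced by asparagine) can be handled programmatically. The following is a minimal sketch only: the names `Substitution`, `parse_substitutions`, and `apply_substitutions` are hypothetical, positions are assumed to be 1-based indices into whatever protein sequence is supplied, and the three-residue sequence in the usage note is a placeholder rather than any real reverse transcriptase.

```python
import re
from typing import List, NamedTuple


class Substitution(NamedTuple):
    wild_type: str   # one-letter code of the original residue
    position: int    # 1-based residue position (assumed numbering)
    mutant: str      # one-letter code of the replacement residue


_TOKEN = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitutions(spec: str) -> List[Substitution]:
    """Parse a slash-separated list such as 'D200N/L603W' into records."""
    parsed = []
    for token in spec.split("/"):
        match = _TOKEN.match(token.strip())
        if match is None:
            raise ValueError(f"not a substitution: {token!r}")
        wild_type, position, mutant = match.groups()
        parsed.append(Substitution(wild_type, int(position), mutant))
    return parsed


def apply_substitutions(protein: str, subs: List[Substitution]) -> str:
    """Apply substitutions after checking the expected wild-type residue."""
    residues = list(protein)
    for sub in subs:
        index = sub.position - 1
        if index >= len(residues) or residues[index] != sub.wild_type:
            raise ValueError(f"expected {sub.wild_type} at position {sub.position}")
        residues[index] = sub.mutant
    return "".join(residues)


# The pentamutant recited for PE2, parsed into structured records.
PENTAMUTANT = parse_substitutions("D200N/L603W/T330P/T306K/W313F")
```

For instance, `apply_substitutions("MDK", parse_substitutions("D2N"))` changes the placeholder sequence `MDK` at its second residue, while a mismatch between the stated wild-type residue and the supplied sequence raises an error instead of silently editing the wrong position.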
(184) Nicking the non-edited strand can increase editing efficiency. For example, nicking the non-edited strand can increase editing efficiency by about 1.1 fold, about 1.3 fold, about 1.5 fold, about 1.7 fold, about 1.9 fold, about 2.1 fold, about 2.3 fold, about 2.5 fold, about 2.7 fold, about 2.9 fold, about 3.1 fold, about 3.3 fold, about 3.5 fold, about 3.7 fold, about 3.9 fold, about 4.1 fold, about 4.3 fold, about 4.5 fold, about 4.7 fold, about 4.9 fold, or any range that is formed from any two of those values as endpoints.
(185) Although the optimal nicking position varies depending on the genomic site, nicks positioned 3′ of the edit about 40-90 bp from the pegRNA-induced nick can generally increase editing efficiency without excess indel formation. In practice, prime editing can start with non-edited strand nicks about 50 bp from the pegRNA-mediated nick, with alternative nick locations tested if indel frequencies exceed acceptable levels.
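The positional guidance above can be expressed as a simple filter over candidate nick sites. The sketch below is illustrative only: the function name `rank_nick_offsets`, the sign convention (positive offsets taken to mean 3′ of the edit), and the candidate list are assumptions, and the window and preferred starting distance are the approximate figures stated above rather than hard limits.

```python
from typing import Iterable, List, Tuple


def rank_nick_offsets(offsets: Iterable[int],
                      window: Tuple[int, int] = (40, 90),
                      preferred: int = 50) -> List[int]:
    """Keep candidate ngRNA nick offsets inside the window and rank them.

    Each offset is the distance in bp from the pegRNA-induced nick, with
    positive values assumed to lie 3' of the edit. Offsets closest to the
    preferred starting distance come first.
    """
    low, high = window
    in_window = [offset for offset in offsets if low <= offset <= high]
    return sorted(in_window, key=lambda offset: (abs(offset - preferred), offset))


# Hypothetical candidate nick positions for one target site.
candidates = [12, 48, 63, 95, -55, 41, 88]
ranked = rank_nick_offsets(candidates)
print(ranked)  # [48, 41, 63, 88]
```

The first entry is the natural starting point; the remaining entries are the alternative locations to test if indel frequencies at the first choice are too high.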
(186) As used herein, the term “guide RNA” (gRNA) and the like refer to an RNA that guides the insertion or deletion of one or more genes of interest or one or more nucleic acid sequences of interest into a target genome. The gRNA can also refer to a prime editing guide RNA (pegRNA), a nicking guide RNA (ngRNA), and a single guide RNA (sgRNA). In some embodiments, the term “gRNA molecule” refers to a nucleic acid encoding a gRNA. In some embodiments, the gRNA molecule is naturally occurring. In some embodiments, a gRNA molecule is non-naturally occurring. In some embodiments, a gRNA molecule is a synthetic gRNA molecule. A gRNA can target a nuclease or a nickase such as a Cas9, Cas12a/b, Cas9(H840A) or Cas9(D10A) molecule to a target nucleic acid or sequence in a genome. In some embodiments, the gRNA can bind to a DNA nickase bound to a reverse transcriptase domain. A “modified gRNA,” as used herein, refers to a gRNA molecule that has an improved half-life after being introduced into a cell as compared to a non-modified gRNA molecule after being introduced into a cell. In some embodiments, the guide RNA can facilitate the addition of the insertion site sequence for recognition by integrases, transposases, or recombinases.
(187) As used herein, the term “prime-editing guide RNA” (pegRNA) and the like refer to an extended single guide RNA (sgRNA) comprising a primer binding site (PBS), a reverse transcriptase (RT) template sequence, and an integration site sequence that can be recognized by recombinases, integrases, or transposases. Exemplary design parameters for pegRNA are shown in
(188) During genome editing, the primer binding site allows the 3′ end of the nicked DNA strand to hybridize to the pegRNA, while the RT template serves as a template for the synthesis of edited genetic information. The pegRNA is capable, for instance and without limitation, of (i) identifying the target nucleotide sequence to be edited and (ii) encoding new genetic information that replaces the targeted sequence. In some embodiments, the pegRNA is capable of (i) identifying the target nucleotide sequence to be edited and (ii) encoding an integration site that replaces the targeted sequence.
(189) As used herein, the term “nicking guide RNA” (ngRNA) and the like refer to an RNA sequence that can direct the nicking of a strand such as an edited strand and a non-edited strand. Exemplary design parameters for ngRNA are shown in
(190) The pegRNA-PE complex disclosed herein recognizes the target site in the genome, and the Cas9, for example, nicks the protospacer adjacent motif (PAM) strand. The primer binding site (PBS) in the pegRNA hybridizes to the PAM strand. The RT template, which is operably linked to the PBS and contains the edit sequence, directs reverse transcription of the edit into DNA at the target site. Equilibration between the edited 3′ flap and the unedited 5′ flap, cellular 5′ flap cleavage and ligation, and DNA repair result in stably edited DNA. To optimize prime editing, a Cas9 nickase can be used to nick the non-edited strand, thereby directing DNA repair to that strand, using the edited strand as a template.
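The sequence logic of the steps described above (nick, PBS hybridization, reverse transcription of the RT template, and flap resolution) can be sketched at the string level. This is a toy model under stated assumptions, not a design tool: the function names are hypothetical, DNA letters are used for the RNA extension for simplicity, the target sequence is randomly generated, the `landing_site` string is a placeholder rather than a real attB or attP sequence, and the default lengths of 13 and 29 are illustrative values only.

```python
import random

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def design_extension(pam_strand: str, nick: int, insert: str,
                     pbs_len: int = 13, homology_len: int = 29) -> str:
    """Build a toy pegRNA 3' extension (RT template + PBS), written 5'->3'.

    `nick` is the index of the first base 3' of the nick on the PAM strand.
    """
    if nick < pbs_len or nick + homology_len > len(pam_strand):
        raise ValueError("nick is too close to the end of the toy sequence")
    primer = pam_strand[nick - pbs_len:nick]         # 3' end of the nicked strand
    homology = pam_strand[nick:nick + homology_len]  # genomic sequence 3' of the nick
    rt_template = revcomp(insert + homology)         # copied into DNA by the RT
    pbs = revcomp(primer)                            # pairs with the primer
    return rt_template + pbs


def simulate_edit(pam_strand: str, nick: int, extension: str,
                  pbs_len: int = 13, homology_len: int = 29) -> str:
    """Toy outcome after reverse transcription and flap resolution."""
    rt_template, pbs = extension[:-pbs_len], extension[-pbs_len:]
    if revcomp(pbs) != pam_strand[nick - pbs_len:nick]:
        raise ValueError("PBS does not pair with the 3' end of the nicked strand")
    flap = revcomp(rt_template)                      # newly synthesized 3' flap
    if flap[-homology_len:] != pam_strand[nick:nick + homology_len]:
        raise ValueError("flap homology does not match the genome")
    # The edited 3' flap replaces the genomic stretch it is homologous to.
    return pam_strand[:nick] + flap + pam_strand[nick + homology_len:]


rng = random.Random(0)                               # fixed seed for repeatability
genome = "".join(rng.choice("ACGT") for _ in range(80))  # hypothetical target
landing_site = "ACGTTGCAACGTTGCA"                    # placeholder, not a real att site
nick_index = 30

extension = design_extension(genome, nick_index, landing_site)
edited = simulate_edit(genome, nick_index, extension)
```

In this model the edited strand equals the original strand with the placeholder landing site inserted exactly at the nick. The cellular steps (flap equilibration, 5′ flap cleavage, ligation, and repair of the opposite strand) are collapsed into a single string operation.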
(191) Integrase Technologies
(192) The present disclosure provides non-naturally occurring or engineered systems, methods, and compositions for site-specific genetic engineering using integrase technologies. Integrase technologies are discussed in more detail below.
(193) The integrase technologies used herein comprise proteins or nucleic acids encoding the proteins that direct integration of a gene of interest or nucleic acid sequence of interest into an integration site via a nuclease such as a prime editing nuclease. The protein directing the integration can be an enzyme such as an integration enzyme. The integration enzyme can be an integrase that incorporates the gene or nucleic acid of interest into the cell genome at the integration site by integration. The integration enzyme can be a recombinase that incorporates the gene or nucleic acid of interest into the cell genome at the integration site by recombination. The integration enzyme can be a reverse transcriptase that incorporates the gene or nucleic acid of interest into the cell genome at the integration site by reverse transcription. The integration enzyme can be a retrotransposase that incorporates the gene or nucleic acid of interest into the cell genome at the integration site by retrotransposition.
(194) As used herein, the term “integration enzyme” refers to an enzyme or protein used to integrate a gene of interest or nucleic acid sequence of interest into a desired location or at the integration site, in the genome of a cell, in a single reaction or multiple reactions. Examples of integration enzymes include, without limitation, Cre, Dre, Vika, Bxb1, φC31, RDF, FLP, φBT1, R1, R2, R3, R4, R5, TP901-1, A118, φFC1, φC1, MR11, TG1, φ370.1, Wβ, BL3, SPBc, K38, Peaches, Veracruz, Rebeuca, Theia, Benedict, KSSJEB, PattyP, Doom, Scowl, Lockley, Switzer, Bob3, Troube, Abrogate, Anglerfish, Sarfire, SkiPole, ConceptII, Museum, Severus, Airmid, Benedict, Hinder, ICleared, Sheen, Mundrea, BxZ2, φRV, and retrotransposases encoded by R2, L1, Tol2, Tc1, Tc3, Mariner (Himar 1), Mariner (mos 1), and Minos. In some embodiments, the term “integration enzyme” refers to a nucleic acid (DNA or RNA) encoding the above-mentioned enzymes. In some embodiments, the Cre recombinase is expressed from a Cre recombinase expression plasmid (SEQ ID NO: 71).
(195) Mammalian expression plasmids can be found in Table 1 below.
(196) TABLE 1
Name | Full Description | SEQ ID NO
PE2-Bxb1 Single Vector | pCMV-PE2-P2A-Bxb1 | SEQ ID NO: 381
PE2 prime editor | pCMV-PE2/Addgene #132775 | SEQ ID NO: 382
PE2*-Bxb1 Single Vector | New NLS pCMV-PE2-P2A-Bxb1 | SEQ ID NO: 383
PASTEv3 | pCMV-SpCas9-XTEN-RT(1-478)-Sto7d-GGGGS-BxbINT | SEQ ID NO: 384
ACTB pegRNA | ACTB N-term PBS 13 RT 29 attB 46 pegRNA | SEQ ID NO: 385
ACTB Nicking +48 | ACTB N-term Nicking guide 1 +48 guide | SEQ ID NO: 386
Bxb1 integrase | pCAG-NLS-HA-Bxb1 integrase/Addgene #51271 | SEQ ID NO: 387
TP901-1 Integrase | TP901-1 Integrase | SEQ ID NO: 388
PhiBT Integrase | PhiBT Integrase | SEQ ID NO: 389
HDR sgRNA guide | Minicircle U6-sgRNA EFS-SpCas9 | SEQ ID NO: 390
HDR EGFP cargo | Cas9 HDR template site with EGFP | SEQ ID NO: 391
AAV helper plasmid | PDF6 AAV helper plasmid | SEQ ID NO: 392
AAV EGFP donor | GFP AAV donor plasmid | SEQ ID NO: 393
AAV2/8 | AAV2/8 capsid protein | SEQ ID NO: 394
(197) Minicircle cargo gene maps can be found in Table 2 below.
(198) TABLE 2
Name | Full Description | SEQ ID NO
Cargo EGFP | Parent minicircle plasmid - Cargo EGFP with attP Bxb1 site | SEQ ID NO: 76
Cargo EGFP post cleavage | Cargo EGFP with attP Bxb1 site - post minicircle cleavage | SEQ ID NO: 395
Cargo EGFP for fusion | Parent minicircle plasmid - Cargo EGFP with attP Bxb1 site for fusion | SEQ ID NO: 396
mCherry Cargo post cleavage | Cargo mCherry with attP Bxb1 site - post minicircle cleavage | SEQ ID NO: 397
YFP Cargo post cleavage | Cargo YFP with attP Bxb1 site - post minicircle cleavage | SEQ ID NO: 398
SERPINA1 Cargo post cleavage | Cargo SERPINA1 with attP Bxb1 site - post minicircle cleavage | SEQ ID NO: 399
CPS1 Cargo post cleavage | Cargo CPS1 with attP Bxb1 site - post minicircle cleavage | SEQ ID NO: 400
CFTR Cargo | Parent minicircle plasmid - Cargo CFTR with attP Bxb1 site | SEQ ID NO: 401
NYESO TCR Cargo post cleavage | Cargo NYESO TCR with attP Bxb1 site - post minicircle cleavage | SEQ ID NO: 402
(199) In some embodiments, the serine integrase φC31 from φC31 phage is used as the integration enzyme. The integrase φC31 in combination with a pegRNA can be used to insert the pseudo attP integration site (SEQ ID NO: 78). A DNA minicircle containing a gene or nucleic acid of interest and an attB (SEQ ID NO: 3) site can be used to integrate the gene or nucleic acid of interest into the genome of a cell. This integration can be aided by a co-transfection of an expression vector having the φC31 integrase.
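The integration step described above, in which a circular donor carrying one attachment site recombines with the cognate site previously placed in the genome, can be illustrated with a symbolic model. The sketch below is an assumption-laden toy: the `AttSite` class and `integrate` function are hypothetical names, each site is reduced to two arm labels around a shared crossover core, and no real attB or attP sequence is represented.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class AttSite:
    left: str   # label of the left half-site
    core: str   # label of the shared crossover core
    right: str  # label of the right half-site


Segment = Union[str, AttSite]


def integrate(genome_left: str, genome_site: AttSite, genome_right: str,
              donor_site: AttSite, cargo: str) -> List[Segment]:
    """Toy model of integrating a circular donor (site + cargo) at a genomic site.

    Strand exchange at the shared core joins the genomic left arm to the donor
    right arm, linearizes the circle into the genome, and rejoins the donor
    left arm to the genomic right arm.
    """
    if genome_site.core != donor_site.core:
        raise ValueError("crossover cores differ; no recombination in this toy model")
    hybrid_left = AttSite(genome_site.left, genome_site.core, donor_site.right)
    hybrid_right = AttSite(donor_site.left, donor_site.core, genome_site.right)
    return [genome_left, hybrid_left, cargo, hybrid_right, genome_right]


# Genome carrying an attP-type landing site; minicircle carrying an attB-type site.
landing = AttSite("P", "O", "P'")
donor = AttSite("B", "O", "B'")
product = integrate("chr_left", landing, "chr_right", donor, "cargo_gene")
```

The cargo ends up flanked by two hybrid sites, each combining one arm from the genomic site and one from the donor site. Such hybrid sites (of the attL and attR type) differ from both starting sites, which is consistent with the unidirectional incorporation described in this disclosure.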
(200) As used herein, the term “integrase” refers to a bacteriophage derived integrase, including wild-type integrase and any of a variety of mutant or modified integrases. As used herein, the term “integrase complex” may refer to a complex comprising integrase and integration host factor (IHF). As used herein, the term “integrase complex” and the like may also refer to a complex comprising an integrase, an integration host factor, and a bacteriophage λ-derived excisionase (Xis).
(201) As used herein, the term “recombinase” and the like refer to a site-specific enzyme that mediates the recombination of DNA between recombinase recognition sequences, which results in the excision, integration, inversion, or exchange (e.g., translocation) of DNA fragments between the recombinase recognition sequences. Recombinases can be classified into two distinct families: serine recombinases (e.g., resolvases and invertases) and tyrosine recombinases (e.g., integrases). Examples of serine recombinases include, without limitation, Hin, Gin, Tn3, β-six, CinH, ParA, γδ, Bxb1, φC31, TP901, TG1, φBT1, R1, R2, R3, R4, R5, φRV1, φFC1, MR11, A118, U153, and gp29. Examples of serine recombinases also include, without limitation, recombinases Peaches, Veracruz, Rebeuca, Theia, Benedict, KSSJEB, PattyP, Doom, Scowl, Lockley, Switzer, Bob3, Troube, Abrogate, Anglerfish, Sarfire, SkiPole, ConceptII, Museum, Severus, Airmid, Benedict, Hinder, ICleared, Sheen, Mundrea, and BxZ2 from Mycobacterial phages. Examples of tyrosine recombinases include, without limitation, Cre, FLP, R, Lambda, HK101, HK022, and pSAM2. The serine and tyrosine recombinase names stem from the conserved nucleophilic amino acid residue that the recombinase uses to attack the DNA and which becomes covalently linked to the DNA during strand exchange.
(202) Recombinases have numerous applications, including the creation of gene knockouts/knock-ins and gene therapy applications. See, e.g., Brown et al., “Serine recombinases as tools for genome engineering.” Methods, 2011; 53(4):372-9; Hirano et al., “Site-specific recombinases as tools for heterologous gene integration.” Appl. Microbiol. Biotechnol. 2011; 92(2):227-39; Chavez and Calos, “Therapeutic applications of the ΦC31 integrase system.” Curr. Gene Ther. 2011; 11(5):375-81; Turan and Bode, “Site-specific recombinases: from tag-and-target- to tag-and-exchange-based genomic modifications.” FASEB J. 2011; 25(12):4088-107; Venken and Bellen, “Genome-wide manipulations of Drosophila melanogaster with transposons, Flp recombinase, and ΦC31 integrase.” Methods Mol. Biol. 2012; 859:203-28; Murphy, “Phage recombinases and their applications.” Adv. Virus Res. 2012; 83:367-414; Zhang et al., “Conditional gene manipulation: Creating a new biological era.” J. Zhejiang Univ. Sci. B. 2012; 13(7):511-24; Karpenshif and Bernstein, “From yeast to mammals: recent advances in genetic control of homologous recombination.” DNA Repair (Amst). 2012; 1; 11(10):781-8; the entire contents of each are hereby incorporated by reference in their entirety.
(203) The recombinases provided herein are not meant to be exclusive examples of recombinases that can be used in embodiments of the disclosure. The methods and compositions of the disclosure can be expanded by mining databases for new orthogonal recombinases or designing synthetic recombinases with defined DNA specificities (See, e.g., Groth et al., “Phage integrases: biology and applications.” J. Mol. Biol. 2004; 335, 667-678; Gordley et al., “Synthesis of programmable integrases.” Proc. Natl. Acad. Sci. USA. 2009; 106, 5053-5058; the entire contents of each are hereby incorporated by reference in their entirety).
(204) Other examples of recombinases that are useful in the systems, methods, and compositions described herein are known to those of skill in the art, and any new recombinase that is discovered or generated is expected to be able to be used in the different embodiments of the disclosure.
(205) As used herein, the term “retrotransposase” and the like refer to an enzyme, or combination of one or more enzymes, wherein at least one enzyme has a reverse transcriptase domain. Retrotransposases are capable of inserting long sequences (e.g., over 3000 nucleotides) of heterologous nucleic acid into a genome. Examples of retrotransposases include, without limitation, retrotransposases encoded by elements such as R2, L1, Tol2, Tc1, Tc3, Mariner (Himar 1), Mariner (mos 1), Minos, and any mutants thereof.
(206) In some embodiments, the one or more genes of interest or one or more nucleic acid sequences of interest are inserted into a desired location in a genome using an RNA fragment, such as a retrotransposon, encoding the nucleic acid linked to a complementary or associated integration site. The insertion of the nucleic acid of interest into the desired location in the genome using a retrotransposon is aided by a retrotransposase.
(207) The gene and nucleic acid sequence of interest disclosed herein can be any gene and nucleic acid sequence that are known in the art. The gene and nucleic acid sequence of interest can be for therapeutic and/or diagnostic uses. Examples of genes of interest include, without limitation, GBA, BTK, ADA, CNGB3, CNGA3, ATF6, GNAT2, ABCA1, ABCA7, APOE, CETP, LIPC, MMP9, PLTP, VTN, ABCA4, MFSD8, TLR3, TLR4, ERCC6, HMCN1, HTRA1, MCDR4, MCDR5, ARMS2, C2, C3, CFB, CFH, JAG1, NOTCH2, CACNA1F, SERPINA1, TTR, GSN, B2M, APOA2, APOA1, OSMR, ELP4, PAX6, ARG, ASL, PITX2, FOXC1, BBS1, BBS10, BBS2, BBS9, MKKS, MKS1, BBS4, BBS7, TTC8, ARL6, BBS5, BBS12, TRIM32, CEP290, ADIPOR1, BBIP1, CEP19, IFT27, LZTFL1, DMD, BEST1, HBB, CYP4V2, AMACR, CYP7B1, HSD3B7, AKR1D1, OPN1SW, NR2F1, RLBP1, RGS9, RGS9BP, PROM1, PRPH2, GUCY2D, CACD, CHM, ALAD, ASS1, SLC25A13, OTC, ACADVL, ETFDH, TMEM67, CC2D2A, RPGRIP1L, KCNV2, CRX, GUCA1A, CERKL, CDHR1, PDE6C, TTLL5, RPGR, CEP78, C21orf2, C8ORF37, RPGRIP1, ADAM9, POC1B, PITPNM3, RAB28, CACNA2D4, AIPL1, UNC119, PDE6H, OPN1LW, RIMS1, CNNM4, IFT81, RAX2, RDH5, SEMA4A, CORD17, PDE6B, GRK1, SAG, RHO, CABP4, GNB3, SLC24A1, GNAT1, GRM6, TRPM1, LRIT3, TGFBI, TACSTD2, KRT12, OVOL2, CPS1, UGT1A1, UGT1A9, UGT1A8, UGT1A7, UGT1A6, UGT1A5, UGT1A4, CFTR, DLD, EFEMP1, ABCC2, ZNF408, LRP5, FZD4, TSPAN12, EVR3, APOB, SLC2A2, LOC106627981, GBA1, NR2E3, OAT, SLC40A1, F8, F9, UROD, CPDX, HFE, JH, LDLR, EPHX1, TJP2, BAAT, NBAS, LARS1, HAMP, HJV, RS1, ADAMTS18, LRAT, RPE65, LCA5, MERTK, GDF6, RD3, CCT2, CLUAP1, DTHD1, NMNAT1, SPATA7, IFT140, IMPDH1, OTX2, RDH12, TULP1, CRB1, MT-ND4, MT-ND1, MT-ND6, BCKDHA, BCKDHB, DBT, MMAB, ARSB, GUSB, NAGS, NPC1, NPC2, NDP, OPA1, OPA3, OPA4, OPA5, RTN4IP1, TMEM126A, OPA6, OPA8, ACO2, PAH, PRKCSH, SEC63, GAA, UROS, PPDX, HPX, HMOX1, HMBS, MIR223, CYP1B1, LTBP2, AGXT, ATP8B1, ABCB11, ABCB4, FECH, ALAS2, PRPF31, RP1, EYS, TOPORS, USH2A, CNGA1, C2ORF71, RP2, KLHL7, ORF1, RP6, RP24, RP34, ROM1, ADGRA3, AGBL5, AHR, ARHGEF18, CA4, CLCC1, DHDDS, 
EMC1, FAM161A, HGSNAT, HK1, IDH3B, KIAA1549, KIZ, MAK, NEUROD1, NRL, PDE6A, PDE6G, PRCD, PRPF3, PRPF4, PRPF6, PRPF8, RBP3, REEP6, SAMD11, SLC7A14, SNRNP200, SPP2, ZNF513, NEK2, NEK4, NXNL1, OFD1, RP1L1, RP22, RP29, RP32, RP63, RP9, RGR, POMGNT1, DHX38, ARL3, COL2A1, SLCO1B1, SLCO1B3, KCNJ13, TIMP3, ELOVL4, TFR2, FAH, HPD, MYO7A, CDH23, PCDH15, DFNB31, GPR98, USH1C, USH1G, CIB2, CLRN1, HARS, ABHD12, ADGRV1, ARSG, CEP250, IMPG1, IMPG2, VCAN, G6PC1, ATP7B and any derivatives thereof.
(208) As used herein, the terms “retrotransposons,” “jumping genes,” “jumping nucleic acids,” and the like refer to cellular movable genetic elements dependent on reverse transcription. The retrotransposons are of non-replication competent cellular origin, and are capable of carrying a foreign nucleic acid sequence. The retrotransposons can act as parasites of retroviruses, retaining certain classical hallmarks, such as long terminal repeats (LTR), retroviral primer binding sites, and the like. However, the naturally occurring retrotransposons usually do not contain functional retroviral structural genes, which would normally be capable of recombining to yield replication competent viruses. Some retrotransposons are examples of so-called “selfish DNA”, or genetic information, which encodes nothing except the ability to replicate itself. The retrotransposon may do so by utilizing the occasional presence of a retrovirus or a retrotransposase within the host cell, efficiently packaging itself within the viral particle, which transports it to the new host genome, where it is expressed again as RNA. The information encoded within that RNA is potentially transported with the jumping gene. A transposon can be a DNA transposon or a retrotransposon, including an LTR retrotransposon or a non-LTR retrotransposon.
(209) Non-long terminal repeat (LTR) retrotransposons are a type of mobile genetic element that is widespread in eukaryotic genomes. They include two classes: the apurinic/apyrimidinic endonuclease (APE)-type and the restriction enzyme-like endonuclease (RLE)-type. The APE class retrotransposons are comprised of two functional domains: an endonuclease/DNA binding domain, and a reverse transcriptase domain. The RLE class are comprised of three functional domains: a DNA binding domain, a reverse transcription domain, and an endonuclease domain. The reverse transcriptase domain of a non-LTR retrotransposon functions by binding an RNA sequence template and reverse transcribing it into the host genome's target DNA. The RNA sequence template has a 3′ untranslated region which is specifically bound to the transposase, and a variable 5′ region generally having Open Reading Frame(s) (“ORF”) encoding transposase proteins. The RNA sequence template may also comprise a 5′ untranslated region which specifically binds the retrotransposase. In some embodiments, non-LTR retrotransposons can include a LINE retrotransposon, such as L1, and a SINE retrotransposon, such as an Alu sequence. Other examples include, without limitation, R1, R2, R3, R4, and R5 retrotransposons (Moss, W. N. et al., RNA Biol. 2011, 8(5), 714-718; and Burke, W. D. et al., Molecular Biology and Evolution 2003, 20(8), 1260-1270). The transposon can be autonomous or non-autonomous.
(210) LTR retrotransposons, which include retroviruses, make up a significant fraction of the typical mammalian genome, comprising about 8% of the human genome and 10% of the mouse genome. Lander et al., 2001, Nature 409, 860-921; Waterston et al., 2002, Nature 420, 520-562. LTR elements include retrotransposons, endogenous retroviruses (ERVs), and repeat elements with HERV origins, such as SINE-R. LTR retrotransposons include two LTR sequences that flank a region encoding two enzymes: integrase and retrotransposase.
(211) ERVs include human endogenous retroviruses (HERVs), the remnants of ancient germ-cell infections. While most HERV proviruses have undergone extensive deletions and mutations, some have retained ORFs coding for functional proteins, including the glycosylated env protein. The env gene confers the potential for LTR elements to spread between cells and individuals. Indeed, all three open reading frames (pol, gag, and env) have been identified in humans, and evidence suggests that ERVs are active in the germline. See, e.g., Wang et al., 2010, Genome Res. 20, 19-27. Moreover, a few families, including the HERV-K (HML-2) group, have been shown to form viral particles, and an apparently intact provirus has recently been discovered in a small fraction of the human population. See, e.g., Bannert and Kurth, 2006, Proc. Natl. Acad. Sci. USA 101, 14572-14579.
(212) LTR retrotransposons insert into new sites in the genome using the same steps of DNA cleavage and DNA strand-transfer observed in DNA transposons. In contrast to DNA transposons, however, recombination of LTR retrotransposons involves an RNA intermediate. LTR retrotransposons make up about 8% of the human genome. See, e.g., Lander et al., 2001, Nature 409, 860-921; Hua-Van et al., 2011, Biol. Dir. 6, 19.
(213) Integration Site
(214) The present disclosure provides non-naturally occurring or engineered systems, methods, and compositions for site-specific genetic engineering via the addition of an integration site into a target genome. The integration site is discussed in more detail below.
(215) As used herein, the term “integration site” refers to the site within the target genome where one or more genes of interest or one or more nucleic acid sequences of interest are inserted. Examples of integration sites include, without limitation, a lox71 site (SEQ ID NO: 1), attB sites (SEQ ID NO: 3 and SEQ ID NO: 43), attP sites (SEQ ID NO: 4 and SEQ ID NO: 44), an attL site (SEQ ID NO: 67), an attR site (SEQ ID NO: 68), a Vox site (SEQ ID NO: 69), a FRT site (SEQ ID NO: 70), or a pseudo attP site (SEQ ID NO: 78). The integration site can be inserted into the genome or a fragment thereof of a cell using a nuclease, a gRNA, and/or an integration enzyme. The integration site can be inserted into the genome of a cell using a prime editor such as, without limitation, PE1, PE2, and PE3, wherein the integration site is carried on a pegRNA. The pegRNA can target any site that is known in the art. Examples of sites targeted by the pegRNA include, without limitation, ACTB, SUPT16H, SRRM2, NOLC1, DEPDC4, NES, LMNB1, AAVS1 locus, CC10, CFTR, SERPINA1, ABCA4, and any derivatives thereof. The complementary integration site may be operably linked to a gene of interest or nucleic acid sequence of interest in an exogenous DNA or RNA. In some embodiments, one integration site is added to a target genome. In some embodiments, more than one integration site is added to a target genome.
(216) To insert multiple genes or nucleic acids of interest, two or more integration sites are added to a desired location. Multiple DNAs comprising nucleic acid sequences of interest are flanked by orthogonal integration sequences, such as, without limitation, attB and attP. An integration site is “orthogonal” when it does not significantly recognize the recognition site or nucleotide sequence of a recombinase. Thus, one attB site of a recombinase can be orthogonal to an attB site of a different recombinase. In addition, one pair of attB and attP sites of a recombinase can be orthogonal to another pair of attB and attP sites recognized by the same recombinase. A pair of recombinases are considered orthogonal to each other, as defined herein, when there is no significant recognition of each other's attB or attP site sequences.
(217) The lack of recognition of integration sites or pairs of sites by the same recombinase or a different recombinase can be less than about 30%. In some embodiments, the lack of recognition of integration sites or pairs of sites by the same recombinase or a different recombinase can be less than about 30%, less than about 28%, less than about 26%, less than about 24%, less than about 22%, less than about 20%, less than about 18%, less than about 16%, less than about 14%, less than about 12%, less than about 10%, less than about 8%, less than about 6%, less than about 4%, less than about 2%, about 1%, or any range that is formed from any two of those values as endpoints. The crosstalk can be less than about 30%. In some embodiments, the crosstalk is less than about 30%, less than about 28%, less than about 26%, less than about 24%, less than about 22%, less than about 20%, less than about 18%, less than about 16%, less than about 14%, less than about 12%, less than about 10%, less than about 8%, less than about 6%, less than about 4%, less than about 2%, less than about 1%, or any range that is formed from any two of those values as endpoints.
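The crosstalk threshold described above can be illustrated with a minimal Python sketch. It assumes crosstalk has already been measured as a fraction (0 to 1) for every ordered combination of site pairs. The function name `is_orthogonal_set`, the dictionary layout, and all numeric values are hypothetical and chosen only for illustration; they are not data from this disclosure.

```python
from itertools import combinations


def is_orthogonal_set(site_pairs, crosstalk, threshold=0.30):
    """Return True if every two site pairs cross-react below `threshold`.

    `crosstalk[(a, b)]` is the measured fraction of recombination between
    the attB of pair `a` and the attP of pair `b` (hypothetical layout).
    A missing measurement raises KeyError rather than being assumed zero.
    """
    for a, b in combinations(site_pairs, 2):
        worst = max(crosstalk[(a, b)], crosstalk[(b, a)])
        if worst >= threshold:
            return False
    return True


# Hypothetical crosstalk values, keyed by central dinucleotide of each pair.
measured = {
    ("GT", "GA"): 0.02, ("GA", "GT"): 0.03,
    ("GT", "CT"): 0.01, ("CT", "GT"): 0.01,
    ("GA", "CT"): 0.35, ("CT", "GA"): 0.04,
}

print(is_orthogonal_set(["GT", "GA"], measured))        # GT and GA only
print(is_orthogonal_set(["GT", "GA", "CT"], measured))  # GA/CT exceeds 30%
```

Lowering `threshold` (for example to 0.10 or 0.02) corresponds to the stricter embodiments listed above.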
(218) In some embodiments, the attB and/or attP site sequences comprise a central dinucleotide sequence. It has been shown that, for example, the central dinucleotide can be changed to GA from GT and that only GA containing attB/attP sites interact and will not cross react with GT containing sequences. In some embodiments, the central dinucleotide is selected from the group consisting of AG, AC, TG, TC, CA, CT, GA, AA, TT, CC, GG, AT, TA, GC, CG and GT.
(219) As used herein, the term “pair of an attB and attP site sequences” and the like refer to attB and attP site sequences that share the same central dinucleotide and can recombine. This means that in the presence of one serine integrase as many as six pairs of these orthogonal att sites can recombine (attP-TT will specifically recombine with attB-TT, attP-TC will specifically recombine with attB-TC, and so on).
(220) In some embodiments, the central dinucleotide is nonpalindromic. In some embodiments, the central dinucleotide is palindromic. In some embodiments, a pair of an attB site sequence and an attP site sequence are used in different DNA encoding genes of interest or nucleic acid sequences of interest for inducing directional integration of two or more different nucleic acids.
(221) Table 3 below shows examples of pairs of attB site sequences and attP site sequences with different central dinucleotides (CD).
(222) TABLE 3
Pair  attB            attP            CD
1     SEQ ID NO: 5    SEQ ID NO: 6    TT
2     SEQ ID NO: 7    SEQ ID NO: 8    AA
3     SEQ ID NO: 9    SEQ ID NO: 10   CC
4     SEQ ID NO: 11   SEQ ID NO: 12   GG
5     SEQ ID NO: 13   SEQ ID NO: 14   TG
6     SEQ ID NO: 15   SEQ ID NO: 16   GT
7     SEQ ID NO: 17   SEQ ID NO: 18   CT
8     SEQ ID NO: 19   SEQ ID NO: 20   CA
9     SEQ ID NO: 21   SEQ ID NO: 22   TC
10    SEQ ID NO: 23   SEQ ID NO: 24   GA
11    SEQ ID NO: 25   SEQ ID NO: 26   AG
12    SEQ ID NO: 27   SEQ ID NO: 28   AC
13    SEQ ID NO: 29   SEQ ID NO: 30   AT
14    SEQ ID NO: 31   SEQ ID NO: 32   GC
15    SEQ ID NO: 33   SEQ ID NO: 34   CG
16    SEQ ID NO: 35   SEQ ID NO: 36   TA
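The central-dinucleotide pairing rule can be sketched in Python as follows. The flanking sequences are taken from the attB-TT and attP-TT entries of Table 4 (SEQ ID NO: 5 and SEQ ID NO: 6); the function names are hypothetical, and the rule that only sites with identical central dinucleotides recombine is the simplified rule stated above, not a full model of integrase specificity.

```python
# Flanks shared by the attB/attP variants of Table 4 (SEQ ID NOs: 5-36);
# only the two central nucleotides differ between variants.
ATTB_LEFT, ATTB_RIGHT = "GGCTTGTCGACGACGGCG", "CTCCGTCGTCAGGATCAT"
ATTP_LEFT, ATTP_RIGHT = "GTGGTTTGTCTGGTCAACCACCGCG", "CTCAGTGGTGTACGGTACAAACCCA"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def make_att_pair(cd):
    """Build the (attB, attP) pair carrying central dinucleotide `cd`."""
    if len(cd) != 2 or set(cd) - set("ACGT"):
        raise ValueError(f"not a dinucleotide: {cd!r}")
    return ATTB_LEFT + cd + ATTB_RIGHT, ATTP_LEFT + cd + ATTP_RIGHT


def is_palindromic(cd):
    """A dinucleotide is palindromic if it equals its reverse complement."""
    return cd == reverse_complement(cd)


def can_recombine(attb, attp):
    """Simplified rule: the sites pair only if central dinucleotides match."""
    b_cd = attb[len(ATTB_LEFT):len(ATTB_LEFT) + 2]
    p_cd = attp[len(ATTP_LEFT):len(ATTP_LEFT) + 2]
    return b_cd == p_cd


attb_gt, attp_gt = make_att_pair("GT")
attb_ga, attp_ga = make_att_pair("GA")
print(can_recombine(attb_gt, attp_gt))  # matching central dinucleotides
print(can_recombine(attb_gt, attp_ga))  # mismatched central dinucleotides
print([cd for cd in ("AT", "GC", "CG", "TA", "GT") if is_palindromic(cd)])
```

Under this check, pairs 13 to 16 of Table 3 (AT, GC, CG, TA) are the palindromic central dinucleotides, and the remaining twelve are nonpalindromic.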
PASTE
(223) The present disclosure provides non-naturally occurring or engineered systems, methods, and compositions for site-specific genetic engineering using PASTE. PASTE is discussed in more detail below.
(224) The site-specific genetic engineering disclosed herein is for the insertion of one or more genes of interest or one or more nucleic acid sequences of interest into a genome of a cell. In some embodiments, the gene of interest is a mutated gene implicated in a genetic disease such as, without limitation, a metabolic disease, cystic fibrosis, muscular dystrophy, hemochromatosis, Tay-Sachs, Huntington disease, Congenital Deafness, Sickle cell anemia, Familial hypercholesterolemia, adenosine deaminase (ADA) deficiency, X-linked SCID (X-SCID), and Wiskott-Aldrich syndrome (WAS). In some embodiments, the gene of interest or nucleic acid sequence of interest can be a reporter gene upstream or downstream of a gene for genetic analyses such as, without limitation, for determining the expression of a gene. In some embodiments, the reporter gene is a GFP template (SEQ ID NO: 76) or a Gaussia Luciferase (G-Luciferase) template (SEQ ID NO: 77). In some embodiments, the gene of interest or nucleic acid sequence of interest can be used in plant genetics to insert genes to enhance drought tolerance, weather hardiness, yield, and herbicide resistance in plants. In some embodiments, the gene of interest or nucleic acid sequence of interest can be used for site-specific insertion of a protein (e.g., a lysosomal enzyme), a blood factor (e.g., Factor I, II, V, VII, X, XI, XII or XIII), a membrane protein, an exon, an intracellular protein (e.g., a cytoplasmic protein, a nuclear protein, an organellar protein such as a mitochondrial protein or lysosomal protein), an extracellular protein, a structural protein, a signaling protein, a regulatory protein, a transport protein, a sensory protein, a motor protein, a defense protein, a storage protein, or an anti-inflammatory signaling molecule into cells for treatment of immune diseases, including but not limited to arthritis, psoriasis, lupus, coeliac disease, glomerulonephritis, hepatitis, and inflammatory bowel disease.
(225) The size of the inserted gene or nucleic acid can vary from about 1 bp to about 50,000 bp. In some embodiments, the size of the inserted gene or nucleic acid can be about 1 bp, 10 bp, 50 bp, 100 bp, 150 bp, 200 bp, 250 bp, 300 bp, 350 bp, 400 bp, 600 bp, 800 bp, 1000 bp, 1200 bp, 1400 bp, 1600 bp, 1800 bp, 2000 bp, 2200 bp, 2400 bp, 2600 bp, 2800 bp, 3000 bp, 3200 bp, 3400 bp, 3600 bp, 3800 bp, 4000 bp, 4200 bp, 4400 bp, 4600 bp, 4800 bp, 5000 bp, 5200 bp, 5400 bp, 5600 bp, 5800 bp, 6000 bp, 6200 bp, 6400 bp, 6600 bp, 6800 bp, 7000 bp, 7200 bp, 7400 bp, 7600 bp, 7800 bp, 8000 bp, 8200 bp, 8400 bp, 8600 bp, 8800 bp, 9000 bp, 9200 bp, 9400 bp, 9600 bp, 9800 bp, 10,000 bp, 10,200 bp, 10,400 bp, 10,600 bp, 10,800 bp, 11,000 bp, 11,200 bp, 11,400 bp, 11,600 bp, 11,800 bp, 12,000 bp, 14,000 bp, 16,000 bp, 18,000 bp, 20,000 bp, 30,000 bp, 40,000 bp, 50,000 bp, or any range that is formed from any two of those values as endpoints.
(226) In some embodiments, the site-specific engineering using the gene of interest or nucleic acid sequence of interest disclosed herein is for the engineering of T cells and NK cells for tumor targeting or allogeneic generation. These can involve the use of a receptor or CAR for tumor specificity, anti-PD1 antibody, cytokines like IFN-gamma, TNF-alpha, IL-15, IL-12, IL-18, IL-21, and IL-10, and immune escape genes.
(227) In the present disclosure, the site-specific insertion of the gene of interest or nucleic acid of interest is performed through Programmable Addition via Site-Specific Targeting Elements (PASTE). Components for inserting a gene of interest or a nucleic acid of interest using PASTE include, without limitation, a nuclease, a gRNA adding the integration site, a DNA or RNA strand comprising the gene or nucleic acid linked to a sequence that is complementary or associated to the integration site, and an integration enzyme. Components for inserting a gene of interest or a nucleic acid of interest using PASTE can also include, without limitation, a prime editor expression construct, a pegRNA adding the integration site, a nicking guide RNA, an integration enzyme (Cre or a serine recombinase), and a transgene vector comprising the gene of interest or nucleic acid sequence of interest with an integration signal. The nuclease and prime editor integrate the integration site into the genome. The integration enzyme integrates the gene of interest into the integration site. In some embodiments, the transgene vector comprising the gene or nucleic acid sequence of interest with the integration signal is a DNA minicircle devoid of bacterial DNA sequences. In some embodiments, the transgenic vector is a eukaryotic or prokaryotic vector.
(228) As used herein, the term “vector” or “transgene vector” refers to a recombinant DNA molecule containing a desired coding sequence and appropriate nucleic acid sequences necessary for the expression of the operably linked coding sequence in a host organism. Nucleic acid sequences necessary for expression in prokaryotes usually include, without limitation, a promoter, an operator (optional), a ribosome binding site, and/or other sequences. Eukaryotic cells are generally known to utilize promoters (constitutive, inducible or tissue specific), enhancers, and termination and polyadenylation signals, although some elements may be deleted and other elements added without sacrificing the necessary expression. The transgenic vector may encode the PE and the integration enzyme, linked to each other via a linker. The linker can be a cleavable linker. For example, a transgenic vector encoding the PE and the integration enzyme linked to each other via a linker is pCMV PE2 P2A Cre, which comprises SEQ ID NO: 73. In some embodiments, the linker can be a non-cleavable linker. In some embodiments, the nuclease, prime editor, and/or integration enzyme can be encoded in different vectors.
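The two PASTE steps named above (placing the integration site, then integrating the cargo at that site) can be pictured with a toy string model in Python. This is only an illustrative sketch under stated assumptions: the genome is a plain string, the donor is treated as a circle of attP plus cargo, and strand exchange is assumed to occur at the central dinucleotide so that the product carries hybrid attL and attR sites flanking the cargo. The function names, the toy genome, and the toy cargo are hypothetical; the attB and attP sequences are the GT variants from Table 4 (SEQ ID NO: 15 and SEQ ID NO: 16).

```python
ATTB_GT = "GGCTTGTCGACGACGGCGGTCTCCGTCGTCAGGATCAT"                # SEQ ID NO: 15
ATTP_GT = "GTGGTTTGTCTGGTCAACCACCGCGGTCTCAGTGGTGTACGGTACAAACCCA"  # SEQ ID NO: 16


def split_site(site, cd_len=2):
    """Split an att site into (left arm, central dinucleotide, right arm)."""
    mid = (len(site) - cd_len) // 2
    return site[:mid], site[mid:mid + cd_len], site[mid + cd_len:]


def insert_landing_site(genome, position, attb):
    """Step 1 (toy): write the attB landing site at `position`."""
    return genome[:position] + attb + genome[position:]


def integrate_cargo(genome, attb, attp, cargo):
    """Step 2 (toy): integrate a circular donor (attP + cargo) at attB."""
    start = genome.find(attb)
    if start < 0:
        raise ValueError("landing site not found in genome")
    b_left, b_cd, b_right = split_site(attb)
    p_left, p_cd, p_right = split_site(attp)
    if b_cd != p_cd:
        raise ValueError("central dinucleotides do not match")
    att_l = b_left + b_cd + p_right   # hybrid site left of the cargo
    att_r = p_left + p_cd + b_right   # hybrid site right of the cargo
    return genome[:start] + att_l + cargo + att_r + genome[start + len(attb):]


genome = "ACGT" * 5            # hypothetical 20-nt target region
cargo = "ATGAAATAA"            # hypothetical 9-nt payload
edited = insert_landing_site(genome, 10, ATTB_GT)
final = integrate_cargo(edited, ATTB_GT, ATTP_GT, cargo)
print(len(genome), len(edited), len(final))
```

In this model the intact attB no longer exists after integration, which is one way to picture why the insertion is described as unidirectional.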
(229) A method of inserting multiple genes or nucleic acid sequences of interest into a single site according to embodiments of the present disclosure is illustrated in
(230) In some embodiments, multiplexing allows integration of, for example, a signaling cascade, over-expression of a protein of interest with its cofactor, insertion of multiple genes mutated in a neoplastic condition, or insertion of multiple CARs for treatment of cancer.
(231) In some embodiments, the integration sites may be inserted into the genome using non-prime editing methods such as rAAV mediated nucleic acid integration, TALENs, and ZFNs. A number of unique properties make AAV a promising vector for human gene therapy (Muzyczka, CURRENT TOPICS IN MICROBIOLOGY AND IMMUNOLOGY, 158:97-129 (1992)). Unlike other viral vectors, AAVs have not been shown to be associated with any known human disease and are generally not considered pathogenic. Wild type AAV is capable of integrating into host chromosomes in a site-specific manner (M. Kotin et al., PROC. NATL. ACAD. SCI. USA, 87:2211-2215 (1990); R. J. Samulski, EMBO J. 10(12):3941-3950 (1991)). Instead of creating a double-stranded DNA break, AAV stimulates endogenous homologous recombination to achieve the DNA modification. Further, transcription activator-like effector nucleases (TALENs) and zinc-finger nucleases (ZFNs) can be used for genome editing and for introducing targeted double-strand breaks (DSBs). The specificity of TALENs arises from two polymorphic amino acids, the so-called repeat variable diresidues (RVDs), located at positions 12 and 13 of a repeated unit. TALENs are linked to FokI nucleases, which cleave the DNA at the desired locations. ZFNs are artificial restriction enzymes for custom site-specific genome editing. Zinc fingers themselves are transcription factors, where each finger recognizes 3-4 bases. By mixing and matching these finger modules, researchers can customize which sequence to target.
(232) As used herein, the terms “administration,” “introducing,” or “delivery” into a cell, a tissue, or an organ of a plasmid, nucleic acids, or proteins for modification of the host genome refers to the transport for such administration, introduction, or delivery that can occur in vivo, in vitro, or ex vivo. Plasmids, DNA, or RNA for genetic modification can be introduced into cells by transfection, which is typically accomplished by chemical means (e.g., calcium phosphate transfection, polyethyleneimine (PEI) or lipofection), physical means (electroporation or microinjection), infection (this typically means the introduction of an infectious agent such as a virus (e.g., a baculovirus expressing the AAV Rep gene)), transduction (in microbiology, this refers to the stable infection of cells by viruses, or the transfer of genetic material from one microorganism to another by viral factors (e.g., bacteriophages)). Vectors for the expression of a recombinant polypeptide, protein or oligonucleotide may be introduced by physical means (e.g., calcium phosphate transfection, electroporation, microinjection, or lipofection) into a cell, a tissue, an organ or a subject. The vector can be delivered by preparing the vector in a pharmaceutically acceptable carrier for the in vitro, ex vivo, or in vivo delivery to the carrier.
(233) As used herein, the term “transfection” refers to the uptake of an exogenous nucleic acid molecule by a cell. A cell is “transfected” when an exogenous nucleic acid has been introduced inside the cell membrane. The transfection can be a single transfection, co-transfection, or multiple transfection. Numerous transfection techniques are generally known in the art. See, for example, Graham et al. (1973) Virology, 52: 456. Such techniques can be used to introduce one or more exogenous nucleic acid molecules into a suitable host cell.
(234) In some embodiments, the exogenous nucleic acid molecule and/or other components for gene editing are combined and delivered in a single transfection. In other embodiments, the exogenous nucleic acid molecule and/or other components for gene editing are not combined and delivered in a single transfection. In some embodiments, the exogenous nucleic acid molecule and/or other components for gene editing are combined and delivered in a single transfection to comprise, without limitation, a prime editing vector, a landing site such as a landing site containing pegRNA, a nicking guide such as a nicking guide for stimulating prime editing, an expression vector such as an expression vector for a corresponding integrase or recombinase, a minicircle DNA cargo such as a minicircle DNA cargo encoding green fluorescent protein (GFP), any derivatives thereof, and any combinations thereof. In some embodiments, the gene of interest or nucleic acid sequence of interest can be introduced using liposomes. In some embodiments, the gene of interest or nucleic acid sequence of interest can be delivered using suitable vectors, for instance, without limitation, plasmids and viral vectors. Examples of viral vectors include, without limitation, adeno-associated viruses (AAV), lentiviruses, adenoviruses, other viral vectors, derivatives thereof, or combinations thereof. The proteins and one or more guide RNAs can be packaged into one or more vectors, e.g., plasmids or viral vectors. In some embodiments, the delivery is via nanoparticles or exosomes. For example, exosomes can be particularly useful in delivering RNA.
(235) In some embodiments, the prime editing inserts the landing site with efficiencies of at least about 1%, at least about 5%, at least about 10%, at least about 15%, at least about 20%, at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 45%, or at least about 50%. In some embodiments, the prime editing inserts the landing site(s) with efficiencies of about 1%, about 2%, about 3%, about 4%, about 5%, about 6%, about 7%, about 8%, about 9%, about 10%, about 11%, about 12%, about 13%, about 14%, about 15%, about 16%, about 17%, about 18%, about 19%, about 20%, about 21%, about 22%, about 23%, about 24%, about 25%, about 26%, about 27%, about 28%, about 29%, about 30%, about 31%, about 32%, about 33%, about 34%, about 35%, about 36%, about 37%, about 38%, about 39%, about 40%, about 41%, about 42%, about 43%, about 44%, about 45%, about 46%, about 47%, about 48%, about 49%, about 50%, or any range that is formed from any two of those values as endpoints.
(236) Sequences
(237) Sequences of enzymes, guides, integration sites, and plasmids can be found in Table 4 below.
(238) TABLE 4
SEQ ID NO    SEQUENCE    DESCRIPTION (SOURCE)
SEQ ID NO: 1    ATAACTTCGTATAATGTATGCTATACGAACGGTA    Lox71 (Artificial Sequence)
SEQ ID NO: 2    TACCGTTCGTATAATGTATGCTATACGAAGTTAT    Lox66 (Artificial Sequence)
SEQ ID NO: 3    GGCCGGCTTGTCGACGACGGCGGTCTCCGTCGTCAGGATCATCCGG    attB (Artificial Sequence)
SEQ ID NO: 4    CCGGATGATCCTGACGACGGAGACCGCCGTCGTCGACAAGCCGGCC    attP (Artificial Sequence)
SEQ ID NO: 5    GGCTTGTCGACGACGGCGTTCTCCGTCGTCAGGATCAT    attB-TT (Artificial Sequence)
SEQ ID NO: 6    GTGGTTTGTCTGGTCAACCACCGCGTTCTCAGTGGTGTACGGTACAAACCCA    attP-TT (Artificial Sequence)
SEQ ID NO: 7    GGCTTGTCGACGACGGCGAACTCCGTCGTCAGGATCAT    attB-AA (Artificial Sequence)
SEQ ID NO: 8    GTGGTTTGTCTGGTCAACCACCGCGAACTCAGTGGTGTACGGTACAAACCCA    attP-AA (Artificial Sequence)
SEQ ID NO: 9    GGCTTGTCGACGACGGCGCCCTCCGTCGTCAGGATCAT    attB-CC (Artificial Sequence)
SEQ ID NO: 10    GTGGTTTGTCTGGTCAACCACCGCGCCCTCAGTGGTGTACGGTACAAACCCA    attP-CC (Artificial Sequence)
SEQ ID NO: 11    GGCTTGTCGACGACGGCGGGCTCCGTCGTCAGGATCAT    attB-GG (Artificial Sequence)
SEQ ID NO: 12    GTGGTTTGTCTGGTCAACCACCGCGGGCTCAGTGGTGTACGGTACAAACCCA    attP-GG (Artificial Sequence)
SEQ ID NO: 13    GGCTTGTCGACGACGGCGTGCTCCGTCGTCAGGATCAT    attB-TG (Artificial Sequence)
SEQ ID NO: 14    GTGGTTTGTCTGGTCAACCACCGCGTGCTCAGTGGTGTACGGTACAAACCCA    attP-TG (Artificial Sequence)
SEQ ID NO: 15    GGCTTGTCGACGACGGCGGTCTCCGTCGTCAGGATCAT    attB-GT (Artificial Sequence)
SEQ ID NO: 16    GTGGTTTGTCTGGTCAACCACCGCGGTCTCAGTGGTGTACGGTACAAACCCA    attP-GT (Artificial Sequence)
SEQ ID NO: 17    GGCTTGTCGACGACGGCGCTCTCCGTCGTCAGGATCAT    attB-CT (Artificial Sequence)
SEQ ID NO: 18    GTGGTTTGTCTGGTCAACCACCGCGCTCTCAGTGGTGTACGGTACAAACCCA    attP-CT (Artificial Sequence)
SEQ ID NO: 19    GGCTTGTCGACGACGGCGCACTCCGTCGTCAGGATCAT    attB-CA (Artificial Sequence)
SEQ ID NO: 20    GTGGTTTGTCTGGTCAACCACCGCGCACTCAGTGGTGTACGGTACAAACCCA    attP-CA (Artificial Sequence)
SEQ ID NO: 21    GGCTTGTCGACGACGGCGTCCTCCGTCGTCAGGATCAT    attB-TC (Artificial Sequence)
SEQ ID NO: 22    GTGGTTTGTCTGGTCAACCACCGCGTCCTCAGTGGTGTACGGTACAAACCCA    attP-TC (Artificial Sequence)
SEQ ID NO: 23    GGCTTGTCGACGACGGCGGACTCCGTCGTCAGGATCAT    attB-GA (Artificial Sequence)
SEQ ID NO: 24    GTGGTTTGTCTGGTCAACCACCGCGGACTCAGTGGTGTACGGTACAAACCCA    attP-GA (Artificial Sequence)
SEQ ID NO: 25    GGCTTGTCGACGACGGCGAGCTCCGTCGTCAGGATCAT    attB-AG (Artificial Sequence)
SEQ ID NO: 26    GTGGTTTGTCTGGTCAACCACCGCGAGCTCAGTGGTGTACGGTACAAACCCA    attP-AG (Artificial Sequence)
SEQ ID NO: 27    GGCTTGTCGACGACGGCGACCTCCGTCGTCAGGATCAT    attB-AC (Artificial Sequence)
SEQ ID NO: 28    GTGGTTTGTCTGGTCAACCACCGCGACCTCAGTGGTGTACGGTACAAACCCA    attP-AC (Artificial Sequence)
SEQ ID NO: 29    GGCTTGTCGACGACGGCGATCTCCGTCGTCAGGATCAT    attB-AT (Artificial Sequence)
SEQ ID NO: 30    GTGGTTTGTCTGGTCAACCACCGCGATCTCAGTGGTGTACGGTACAAACCCA    attP-AT (Artificial Sequence)
SEQ ID NO: 31    GGCTTGTCGACGACGGCGGCCTCCGTCGTCAGGATCAT    attB-GC (Artificial Sequence)
SEQ ID NO: 32    GTGGTTTGTCTGGTCAACCACCGCGGCCTCAGTGGTGTACGGTACAAACCCA    attP-GC (Artificial Sequence)
SEQ ID NO: 33    GGCTTGTCGACGACGGCGCGCTCCGTCGTCAGGATCAT    attB-CG (Artificial Sequence)
SEQ ID NO: 34    GTGGTTTGTCTGGTCAACCACCGCGCGCTCAGTGGTGTACGGTACAAACCCA    attP-CG (Artificial Sequence)
SEQ ID NO: 35    GGCTTGTCGACGACGGCGTACTCCGTCGTCAGGATCAT    attB-TA (Artificial Sequence)
SEQ ID NO: 36    GTGGTTTGTCTGGTCAACCACCGCGTACTCAGTGGTGTACGGTACAAACCCA    attP-TA (Artificial Sequence)
SEQ ID NO: 37    TGCGGGTGCCAGGGCGTGCCCTTGGGCTCCCCGGGCGCGTACTCC    C31-attB (Artificial Sequence)
SEQ ID NO: 38    GTGCCCCAACTGGGGTAACCTTTGAGTTCTCTCAGTTGGGGG    C31-attP (Artificial Sequence)
SEQ ID NO: 39    GCGCCCAAGTTGCCCATGACCATGCCGAAGCAGTGGTAGAAGGGCACCGGCAGACAC    R4-attB (Artificial Sequence)
SEQ ID NO: 40    AGGCATGTTCCCCAAAGCGATACCACTTGAAGCAGTGGTACTGCTTGTGGGTACACTCTGCGGGTGATGA    R4-attP (Artificial Sequence)
SEQ ID NO: 41    GTCCTTGACCAGGTTTTTGACGAAAGTGATCCAGATGATCCAGCTCCACACCCCGAACGC    BT1-attB (Artificial Sequence)
SEQ ID NO: 42    GGTGCTGGGTTGTTGTCTCTGGACAGTGATCCATGGGAAACTACTCAGCACCACCAATGTTCC    BT1-attP (Artificial Sequence)
SEQ ID NO: 43    TCGGCCGGCTTGTCGACGACGGCGGTCTCCGTCGTCAGGATCATCCGGGC    Bxb-attB (Artificial Sequence)
SEQ ID NO: 44    GTCGTGGTTTGTCTGGTCAACCACCGCGGTCTCAGTGGTGTACGGTACAAACCCCGAC    Bxb-attP (Artificial Sequence)
SEQ ID NO: 45    GATCAGCTCCGCGGGCAAGACCTTCTCCTTCACGGGGTGGAAGGTC    TG1-attB (Artificial Sequence)
SEQ ID NO: 46    TCAACCCCGTTCCAGCCCAACAGTGTTAGTCTTTGCTCTTACCCAGTTGGGCGGGATAGCCTGCCCG    TG1-attP (Artificial Sequence)
SEQ ID NO: 47    AACGATTTTCAAAGGATCACTGAATCAAAAGTATTGCTCATCCACGCGAAATTTTTC    C1-attB (Artificial Sequence)
SEQ ID NO: 48    AATATTTTAGGTATATGATTTTGTTTATTAGTGTAAATAACACTATGTACCTAAAAT    C1-attP (Artificial Sequence)
SEQ ID NO: 49    TGTAAAGGAGACTGATAATGGCATGTACAACTATACTCGTCGGTAAAAAGGCA    C370-attB (Artificial Sequence)
SEQ ID NO: 50    TAAAAAAATACAGCGTTTTTCATGTACAACTATACTAGTTGTAGTGCCTAAA    C370-attP (Artificial Sequence)
SEQ ID NO: 51    GAGCGCCGGATCAGGGAGTGGACGGCCTGGGAGCGCTACACGCTGTGGCTGCGGTC    K38-attB (Artificial Sequence)
SEQ ID NO: 52    CCCTAATACGCAAGTCGATAACTCTCCTGGGAGCGTTGACAACTTGCGCACCCTGA    K38-attP (Artificial Sequence)
SEQ ID NO: 53    TCTCGTGGTGGTGGAAGGTGTTGGTGCGGGGTTGGCCGTGGTCGAGGTGGGGTGGTGGTAGCCATTCG    RB-attB (Artificial Sequence)
SEQ ID NO: 54    GCACAGGTGTAGTGTATCTCACAGGTCCACGGTTGGCCGTGGACTGCTGAAGAACATTCCACGCCAGGA    RV-attP (Artificial Sequence)
SEQ ID NO: 55    AGTGCAGCATGTCATTAATATCAGTACAGATAAAGCTGTATCTCCTGTGAACACAATGGGTGCCA    SPBC-attB (Artificial Sequence)
SEQ ID NO: 56    AAAGTAGTAAGTATCTTAAAAAACAGATAAAGCTGTATATTAAGATACTTACTAC    SPBC-attP (Artificial Sequence)
SEQ ID NO: 57    TGATAATTGCCAACACAATTAACATCTCAATCAAGGTAAATGCTTTTTCGTTTT    TP901-attB (Artificial Sequence)
SEQ ID NO: 58    AATTGCGAGTTTTTATTTCGTTTATTTCAATTAAGGTAACTAAAAAACTCCTTT    TP901-attP (Artificial Sequence)
SEQ ID NO: 59    AAGGTAGCGTCAACGATAGGTGTAACTGTCGTGTTTGTAACGGTACTTCCAACAGCTGGCGTTTCAGT    Wβ-attB (Artificial Sequence)
SEQ ID NO: 60    TAGTTTTAAAGTTGGTTATTAGTTACTGTGATATTTATCACGGTACCCAATAACCAATGAATATTTGA    Wβ-attP (Artificial Sequence)
SEQ ID NO: 61    TGTAACTTTTTCGGATCAAGCTATGAAGGACGCAAAGAGGGAACTAAACACTTAATT    A118-attB (Artificial Sequence)
SEQ ID NO: 62    TTGTTTAGTTCCTCGTTTTCTCTCGTTGGAAGAAGAAGAAACGAGAAACTAAAATTA    A118-attP
(Artificial Sequence) SEQ ID NO: 63 CAACCTGTTGACATGTTTCCACAGACAACTCACGTGGAGGTAGTC BL3-attB ACGGCTTTTACGTTAGTT (Artificial Sequence) SEQ ID NO: 64 GAGAATACTGTTGAACAATGAAAAACTAGGCATGTAGAAGTTGTT BL3-attP TGTGCACTAACTTTAA (Artificial Sequence) SEQ ID NO: 65 ACAGGTCAACACATCGCAGTTATCGAACAATCTTCGAAAATGTAT MR11-attB GGAGGCACTTGTATCAATATAGGATGTATACCTTCGAAGACACTT (Artificial Sequence) GTACATGATGGATTAGAAGGCAAATCCTTT SEQ ID NO: 66 CAAAATAAAAAACATTGATTTTTATTAACTTCTTTTGTGCGGAACT MR11-attP ACGAACAGTTCATTAATACGAAGTGTACAAACTTCCATACAAAAA (Artificial Sequence) TAACCACGACAATTAAGACGTGGTTTCTA SEQ ID NO: 67 ATTATTTCTCACCCTGA attL (Artificial Sequence) SEQ ID NO: 68 ATCATCTCCCACCCGGA attR (Artificial Sequence) SEQ ID NO: 69 AATAGGTCTG AGAACGCCCA TTCTCAGACG TATT Vox (Artificial Sequence) SEQ ID NO: 70 GAAGTTCCTATAC TTTCTAGA GAATAGGAACTTC FRT (Artificial Sequence) SEQ ID NO: 71 GGTCGACATTGATTATTGACTAGTTATTAATAGTAATCAATTACGG Cre recombinase GGTCATTAGTTCATAGCCCATATATGGAGTTCCGCGTTACATAACT expression plasmid TACGGTAAATGGCCCGCCTGGCTGACCGCCCAACGACCCCCGCCC (Artificial Sequence) ATTGACGTCAATAATGACGTATGTTCCCATAGTAACGCCAATAGG GACTTTCCATTGACGTCAATGGGTGGAGTATTTACGGTAAACTGCC CACTTGGCAGTACATCAAGTGTATCATATGCCAAGTACGCCCCCTA TTGACGTCAATGACGGTAAATGGCCCGCCTGGCATTATGCCCAGT ACATGACCTTATGGGACTTTCCTACTTGGCAGTACATCTACGTATT AGTCATCGCTATTACCATGGTCGAGGTGAGCCCCACGTTCTGCTTC ACTCTCCCCATCTCCCCCCCCTCCCCACCCCCAATTTTGTATTTATT TATTTTTTAATTATTTTGTGCAGCGATGGGGGCGGGGGGGGGGGG GGCGCGCGCCAGGCGGGGGGGGGGGGGGGGGGGGGGGGGGGGG GGGGGGGCGGGGGGGGGCGGCGGCAGCCAATCAGAGCGGCGCGC TCCGAAAGTTTCCTTTTATGGCGAGGCGGCGGCGGCGGCGGCCCT ATAAAAAGCGAAGCGCGCGGCGGGCGGGAGTCGCTGCGCGCTGC CTTCGCCCCGTGCCCCGCTCCGCCGCCGCCTCGCGCCGCCCGCCCC GGCTCTGACTGACCGCGTTACTCCCACAGGTGAGCGGGCGGGACG GCCCTTCTCCTCCGGGCTGTAATTAGCGCTTGGTTTAATGACGGCT TGTTTCTTTTCTGTGGCTGCGTGAAAGCCTTGAGGGGCTCCGGGAG GGCCCTTTGTGCGGGGGGAGCGGCTCGGGGGGTGCGTGCGTGTGT GTGTGCGTGGGGAGCGCCGCGTGCGGCTCCGCGCTGCCCGGCGGC TGTGAGCGCTGCGGGCGCGGCGCGGGGCTTTGTGCGCTCCGCAGT GTGCGCGAGGGGAGCGCGGCCGGGGGCGGTGCCCCGCGGTGCGG 
GGGGGGCTGCGAGGGGAACAAAGGCTGCGTGCGGGGTGTGTGCG TGGGGGGGTGAGCAGGGGGTGTGGGCGCGTCGGTCGGGCTGCAA CCCCCCCTGCACCCCCCTCCCCGAGTTGCTGAGCACGGCCCGGCTT CGGGTGCGGGGCTCCGTACGGGGCGTGGCGCGGGGCTCGCCGTGC CGGGCGGGGGGTGGCGGCAGGTGGGGGTGCCGGGCGGGGCGGGG CCGCCTCGGGCCGGGGAGGGCTCGGGGGAGGGGCGCGGCGGCCC CCGGAGCGCCGGCGGCTGTCGAGGCGCGGCGAGCCGCAGCCATTG CCTTTTATGGTAATCGTGCGAGAGGGCGCAGGGACTTCCTTTGTCC CAAATCTGTGCGGAGCCGAAATCTGGGAGGCGCCGCCGCACCCCC TCTAGCGGGCGCGGGGCGAAGCGGTGCGGCGCCGGCAGGAAGGA AATGGGCGGGGAGGGCCTTCGTGCGTCGCCGCGCCGCCGTCCCCT TCTCCCTCTCCAGCCTCGGGGCTGTCCGCGGGGGGACGGCTGCCTT CGGGGGGGACGGGGCAGGGCGGGGTTCGGCTTCTGGCGTGTGACC GGCGGCTCTAGAGCCTCTGCTAACCATGTTCATGCCTTCTTCTTTTT CCTACAGCTCCTGGGCAACGTGCTGGTTATTGTGCTGTCTCATCAT TTTGGCAAAGAATTCTGAGCCGCCACCATGGCCAATTTACTGACC GTACACCAAAATTTGCCTGCATTACCGGTCGATGCAACGAGTGAT GAGGTTCGCAAGAACCTGATGGACATGTTCAGGGATCGCCAGGCG TTTTCTGAGCATACCTGGAAAATGCTTCTGTCCGTTTGCCGGTCGT GGGCGGCATGGTGCAAGTTGAATAACCGGAAATGGTTTCCCGCAG AACCTGAAGATGTTCGCGATTATCTTCTATATCTTCAGGCGCGCGG TCTGGCAGTAAAAACTATCCAGCAACATTTGGGCCAGCTAAACAT GCTTCATCGTCGGTCCGGGCTGCCACGACCAAGTGACAGCAATGC TGTTTCACTGGTTATGCGGCGGATCCGAAAAGAAAACGTTGATGC CGGTGAACGTGCAAAACAGGCTCTAGCGTTCGAACGCACTGATTT CGACCAGGTTCGTTCACTCATGGAAAATAGCGATCGCTGCCAGGA TATACGTAATCTGGCATTTCTGGGGATTGCTTATAACACCCTGTTA CGTATAGCCGAAATTGCCAGGATCAGGGTTAAAGATATCTCACGT ACTGACGGTGGGAGAATGTTAATCCATATTGGCAGAACGAAAACG CTGGTTAGCACCGCAGGTGTAGAGAAGGCACTTAGCCTGGGGGTA ACTAAACTGGTCGAGCGATGGATTTCCGTCTCTGGTGTAGCTGATG ATCCGAATAACTACCTGTTTTGCCGGGTCAGAAAAAATGGTGTTG CCGCGCCATCTGCCACCAGCCAGCTATCAACTCGCGCCCTGGAAG GGATTTTTGAAGCAACTCATCGATTGATTTACGGCGCTAAGGATG ACTCTGGTCAGAGATACCTGGCCTGGTCTGGACACAGTGCCCGTG TCGGAGCCGCGCGAGATATGGCCCGCGCTGGAGTTTCAATACCGG AGATCATGCAAGCTGGTGGCTGGACCAATGTAAATATTGTCATGA ACTATATCCGTAACCTGGATAGTGAAACAGGGGCAATGGTGCGCC TGCTGGAAGATGGCGATGGACCGGTGGAACAAAAACTTATTTCTG AAGAAGATCTGTGATAGCGGCCGCACTCCTCAGGTGCAGGCTGCC TATCAGAAGGTGGTGGCTGGTGTGGCCAATGCCCTGGCTCACAAA TACCACTGAGATCTTTTTCCCTCTGCCAAAAATTATGGGGACATCA TGAAGCCCCTTGAGCATCTGACTTCTGGCTAATAAAGGAAATTTAT 
TTTCATTGCAATAGTGTGTTGGAATTTTTTGTGTCTCTCACTCGGAA GGACATATGGGAGGGCAAATCATTTAAAACATCAGAATGAGTATT TGGTTTAGAGTTTGGCAACATATGCCCATATGCTGGCTGCCATGAA CAAAGGTTGGCTATAAAGAGGTCATCAGTATATGAAACAGCCCCC TGCTGTCCATTCCTTATTCCATAGAAAAGCCTTGACTTGAGGTTAG ATTTTTTTTATATTTTGTTTTGTGTTATTTTTTTCTTTAACATCCCTA AAATTTTCCTTACATGTTTTACTAGCCAGATTTTTCCTCCTCTCCTG ACTACTCCCAGTCATAGCTGTCCCTCTTCTCTTATGGAGATCCCTC GACCTGCAGCCCAAGCTTGGCGTAATCATGGTCATAGCTGTTTCCT GTGTGAAATTGTTATCCGCTCACAATTCCACACAACATACGAGCC GGAAGCATAAAGTGTAAAGCCTGGGGTGCCTAATGAGTGAGCTAA CTCACATTAATTGCGTTGCGCTCACTGCCCGCTTTCCAGTCGGGAA ACCTGTCGTGCCAGCGGATCCGCATCTCAATTAGTCAGCAACCAT AGTCCCGCCCCTAACTCCGCCCATCCCGCCCCTAACTCCGCCCAGT TCCGCCCATTCTCCGCCCCATGGCTGACTAATTTTTTTTATTTATGC AGAGGCCGAGGCCGCCTCGGCCTCTGAGCTATTCCAGAAGTAGTG AGGAGGCTTTTTTGGAGGCCTAGGCTTTTGCAAAAAGCTAACTTGT TTATTGCAGCTTATAATGGTTACAAATAAAGCAATAGCATCACAA ATTTCACAAATAAAGCATTTTTTTCACTGCATTCTAGTTGTGGTTTG TCCAAACTCATCAATGTATCTTATCATGTCTGGATCCGCTGCATTA ATGAATCGGCCAACGCGCGGGGAGAGGCGGTTTGCGTATTGGGCG CTCTTCCGCTTCCTCGCTCACTGACTCGCTGCGCTCGGTCGTTCGG CTGCGGCGAGCGGTATCAGCTCACTCAAAGGCGGTAATACGGTTA TCCACAGAATCAGGGGATAACGCAGGAAAGAACATGTGAGCAAA AGGCCAGCAAAAGGCCAGGAACCGTAAAAAGGCCGCGTTGCTGG CGTTTTTCCATAGGCTCCGCCCCCCTGACGAGCATCACAAAAATCG ACGCTCAAGTCAGAGGTGGCGAAACCCGACAGGACTATAAAGAT ACCAGGCGTTTCCCCCTGGAAGCTCCCTCGTGCGCTCTCCTGTTCC GACCCTGCCGCTTACCGGATACCTGTCCGCCTTTCTCCCTTCGGGA AGCGTGGCGCTTTCTCATAGCTCACGCTGTAGGTATCTCAGTTCGG TGTAGGTCGTTCGCTCCAAGCTGGGCTGTGTGCACGAACCCCCCGT TCAGCCCGACCGCTGCGCCTTATCCGGTAACTATCGTCTTGAGTCC AACCCGGTAAGACACGACTTATCGCCACTGGCAGCAGCCACTGGT AACAGGATTAGCAGAGCGAGGTATGTAGGCGGTGCTACAGAGTTC TTGAAGTGGTGGCCTAACTACGGCTACACTAGAAGAACAGTATTT GGTATCTGCGCTCTGCTGAAGCCAGTTACCTTCGGAAAAAGAGTT GGTAGCTCTTGATCCGGCAAACAAACCACCGCTGGTAGCGGTGGT TTTTTTGTTTGCAAGCAGCAGATTACGCGCAGAAAAAAAGGATCT CAAGAAGATCCTTTGATCTTTTCTACGGGGTCTGACGCTCAGTGGA ACGAAAACTCACGTTAAGGGATTTTGGTCATGAGATTATCAAAAA GGATCTTCACCTAGATCCTTTTAAATTAAAAATGAAGTTTTAAATC AATCTAAAGTATATATGAGTAAACTTGGTCTGACAGTTACCAATG 
CTTAATCAGTGAGGCACCTATCTCAGCGATCTGTCTATTTCGTTCA TCCATAGTTGCCTGACTCCCCGTCGTGTAGATAACTACGATACGGG AGGGCTTACCATCTGGCCCCAGTGCTGCAATGATACCGCGAGACC CACGCTCACCGGCTCCAGATTTATCAGCAATAAACCAGCCAGCCG GAAGGGCCGAGCGCAGAAGTGGTCCTGCAACTTTATCCGCCTCCA TCCAGTCTATTAATTGTTGCCGGGAAGCTAGAGTAAGTAGTTCGCC AGTTAATAGTTTGCGCAACGTTGTTGCCATTGCTACAGGCATCGTG GTGTCACGCTCGTCGTTTGGTATGGCTTCATTCAGCTCCGGTTCCC AACGATCAAGGCGAGTTACATGATCCCCCATGTTGTGCAAAAAAG CGGTTAGCTCCTTCGGTCCTCCGATCGTTGTCAGAAGTAAGTTGGC CGCAGTGTTATCACTCATGGTTATGGCAGCACTGCATAATTCTCTT ACTGTCATGCCATCCGTAAGATGCTTTTCTGTGACTGGTGAGTACT CAACCAAGTCATTCTGAGAATAGTGTATGCGGCGACCGAGTTGCT CTTGCCCGGCGTCAATACGGGATAATACCGCGCCACATAGCAGAA CTTTAAAAGTGCTCATCATTGGAAAACGTTCTTCGGGGCGAAAAC TCTCAAGGATCTTACCGCTGTTGAGATCCAGTTCGATGTAACCCAC TCGTGCACCCAACTGATCTTCAGCATCTTTTACTTTCACCAGCGTTT CTGGGTGAGCAAAAACAGGAAGGCAAAATGCCGCAAAAAAGGGA ATAAGGGCGACACGGAAATGTTGAATACTCATACTCTTCCTTTTTC AATATTATTGAAGCATTTATCAGGGTTATTGTCTCATGAGCGGATA CATATTTGAATGTATTTAGAAAAATAAACAAATAGGGGTTCCGCG CACATTTCCCCGAAAAGTGCCACCTG SEQ ID NO: 72 AGCTCTGATCAAGAGACAGGATGAGGATCGTTTCGCATGATTGAA GFP-Lox66 Cre CAAGATGGATTGCACGCAGGTTCTCCGGCCGCTTGGGTGGAGAGG expression plasmid CTATTCGGCTATGACTGGGCACAACAGACAATCGGCTGCTCTGAT (Artificial Sequence) GCCGCCGTGTTCCGGCTGTCAGCGCAGGGGCGCCCGGTTCTTTTTG TCAAGACCGACCTGTCCGGTGCCCTGAATGAACTGCAAGACGAGG CAGCGCGGCTATCGTGGCTGGCCACGACGGGCGTTCCTTGCGCAG CTGTGCTCGACGTTGTCACTGAAGCGGGAAGGGACTGGCTGCTAT TGGGCGAAGTGCCGGGGCAGGATCTCCATGTCATCTACACCTTGC TCCTGCCGAGAAAGTATCCATCATGGCTGATGCAATGCGGCGGCT GCATACGCTTGATCCGGCTACCTGCCCATTCGACCACCAAGCGAA ACATCGCATCGAGCGAGCACGTACTCGGATGGAAGCCGGTCTTGT CGATCAGGATGATCTGGACGAAGAGCATCAGGGGCTCGCGCCAGC CGAACTGTTCGCCAGGCTCAAGGCGAGCATGCCCGACGGCGAGGA TCTCGTCGTGACCCATGGCGATGCCTGCTTGCCGAATATCATGGTG GAAAATGGCCGCTTTTCTGGATTCATCGACTGTGGCCGGCTGGGTG TGGCGGACCGCTATCAGGACATAGCGTTGGCTACCCGTGATATTG CTGAAGAGCTTGGCGGCGAATGGGCTGACCGCTTCCTCGTGCTTTA CGGTATCGCCGCTCCCGATTCGCAGCGCATCGCCTTCTATCGCCTT CTTGACGAGTTCTTCTGAATTATTAACTCGAGATCCACTAGAGTGT 
GGCGGCCGCATTCTTATAATCAGCATCATGATGTGGTACCACATCA TGATGCTGATTACCCCCAACTGAGAGAACTCAAAGGTTACCCCAG TTGGGGCGGGCCCACAAATAAAGCAATAGCATCACAAATTTCACA AATAAAGCATTTTTTTCACTGCATTCTAGTTGTGGTTTGTCCAAAC TCATCGAGCTCGAGATCTGGCGAAGGCGATGGGGGTCTTGAAGGC GTGCTGGTACTCCACGATGCCCAGCTCGGTGTTGCTGTGCAGCTCC TCCACGCGGCGGAAGGCGAACATGGGGCCCCCGTTCTGCAGGATG CTGGGGTGGATGGCGCTCTTGAAGTGCATGTGGCTGTCCACCACG AAGCTGTAGTAGCCGCCGTCGCGCAGGCTGAAGGTGCGGGCGAAG CTGCCCACCAGCACGTTATCGCCCATGGGGTGCAGGTGCTCCACG GTGGCGTTGCTGCGGATGATCTTGTCGGTGAAGATCACGCTGTCCT CGGGGAAGCCGGTGCCCACCACCTTGAAGTCGCCGATCACGCGGC CGGCCTCGTAGCGGTAGCTGAAGCTCACGTGCAGCACGCCGCCGT CCTCGTACTTCTCGATGCGGGTGTTGGTGTAGCCGCCGTTGTTGAT GGCGTGCAGGAAGGGGTTCTCGTAGCCGCTGGGGTAGGTGCCGAA GTGGTAGAAGCCGTAGCCCATCACGTGGCTCAGCAGGTAGGGGCT GAAGGTCAGGGCGCCTTTGGTGCTCTTCATCTTGTTGGTCATGCGG CCCTGCTCGGGGGTGCCCTCTCCGCCGCCCACCAGCTCGAACTCCA CGCCGTTCAGGGTGCCGGTGATGCGGCACTCGATCTTCATGGCGG GCATGGTGGCGACCGGTAGCGCTAGCGGCTTCGGATAACTTCGTA TAGCATACATTATACGAACGGTAAGCGCTACCGCCGGCATACCCA AGTGAAGTTGCTCGCAGCTTATAGTCGCGCCCGGGGAGCCCAAGG GCACGCCCTGGCACCGCGGCCGCTGAGTCTCGACCATCATCATCA TCATCATTGAGTTTATCTGGGATAACAGGGTAATGTCATCTAGGGA TAACAGGGTATGTCATCTGGGATAACAGGGTAATGTATCTAGGGA TAACAGGGTAATGTCATCTGGGATAACAGGGTAATGTCATCTAGG GATAACAGGGTATGTCATCTGGGATAACAGGGTAATGTATCTAGG GATAACAGGGTAATGTCATCTGGGATAACAGGGTAATGTCATCTA GGGATAACAGGGTATGTCATCTGGGATAACAGGGTAATGTATCTA GGGATAACAGGGTAATGTCATCTGGGATAACAGGGTAATGTCATC TAGGGATAACAGGGTATGTCATCTGGGATAACAGGGTAATGTATC TAGGGATAACAGGGTAATGTCATCTGGGATAACAGGGTAATGTCA TCTAGGGATAACAGGGTATGTCATCTGGGATAACAGGGTAATGTA TCTAGGGATAACAGGGTAATGTCATCTGGGATAACAGGGTAATGT CATCTAGGGATAACAGGGTATGTCATCTGGGATAACAGGGTAATG TATCTAGGGATAACAGGGTAATGTCATCTGGGATAACAGGGTAAT GTCATCTAGGGATAACAGGGTAAATGTCATCTAGGGATAACAGGG TAATGTCATCTAGGGATAACAGGGTAATGTCATCTGGGATAACAG GGTAATGTCATCTAGGGATAACAGGGTAATGTATCGCCAGCGTCG CACAGCATGTTTGCTTGTCGCCGTCGCGTCTGTCACATCTTTTCCG CCAGCAGTTAGGGATTAGCGTCTTAAGCTGGCGCGAGGACCAACG TATCAGCCAGGCGAAGCTGCTTTTGAGCACCACCCGGATGCCTAT CGCCACCGTCGGTCGCAATGTTGGTTTTGACGATCAACTCTATTTC 
TCGCGGGTATTTAAAAAATGCACCGGGGCCAGCCCGAGCGAGTTC CGTGCCGGTTGTGAAGAAAAAGTGAATGATGTAGCCGTCAAGTTG TCATAATTGGTAACGAATCAGACAATTGACGGCTTGACGGAGTAG CATAGGGTTTGCAGAATCCCTGCTTCGTCCATTTGACAGGCACATT ATGCATGCCGCTTCGCCTTCGCGCGCGAATTGATCTGCTGCCTCGC GCGTTTCGGTGATGACGGTGAAAACCTCTGACACATGCAGCTCCC GGAGACGGTCACAGCTTGTCTGTAAGCGGATGCCGGGAGCAGACA AGCCCGTCAGGGCGCGTCAGCGGGTGTTGGCGGGTGTCGGGGCGC AGCCATGACCCAGTCACGTAGCGATAGCGGAGTGTATACTGGCTT AACTATGCGGCATCAGAGCAGATTGTACTGAGAGTGCACCATATG CGGTGTGAAATACCGCACAGATGCGTAAGGAGAAAATACCGCATC AGGCGCTCTTCCGCTTCCTCGCTCACTGACTCGCTGCGCTCGGTCG TTCGGCTGCGGCGAGCGGTATCAGCTCACTCAAAGGCGGTAATAC GGTTATCCACAGAATCAGGGGATAACGCAGGAAAGAACATGTGA GCAAAAGGCCAGCAAAAGGCCAGGAACCGTAAAAAGGCCGCGTT GCTGGCGTTTTTCCATAGGCTCCGCCCCCCTGACGAGCATCACAAA AATCGACGCTCAAGTCAGAGGTGGCGAAACCCGACAGGACTATAA AGATACCAGGCGTTTCCCCCTGGAAGCTCCCTCGTGCGCTCTCCTG TTCCGACCCTGCCGCTTACCGGATACCTGTCCGCCTTTCTCCCTTCG GGAAGCGTGGCGCTTTCTCATAGCTCACGCTGTAGGTATCTCAGTT CGGTGTAGGTCGTTCGCTCCAAGCTGGGCTGTGTGCACGAACCCC CCGTTCAGCCCGACCGCTGCGCCTTATCCGGTAACTATCGTCTTGA GTCCAACCCGGTAAGACACGACTTATCGCCACTGGCAGCAGCCAC TGGTAACAGGATTAGCAGAGCGAGGTATGTAGGCGGTGCTACAGA GTTCTTGAAGTGGTGGCCTAACTACGGCTACACTAGAAGGACAGT ATTTGGTATCTGCGCTCTGCTGAAGCCAGTTACCTTCGGAAAAAGA GTTGGTAGCTCTTGATCCGGCAAACAAACCACCGCTGGTAGCGGT GGTTTTTTTGTTTGCAAGCAGCAGATTACGCGCAGAAAAAAAGGA TCTCAAGAAGATCCTTTGATCTTTTCTACGGGGTCTGACGCTCAGT GGAACGAAAACTCACGTTAAGGGATTTTGGTCATGAGCGGATACA TATTTGAATGTATTTAGAAAAATAAACAAAAGAGTTTGTAGAAAC GCAAAAAGGCCATCCGTCAGGATGGCCTTCTGCTTAATTTGATGCC TGGCAGTTTATGGCGGGCGTCCTGCCCGCCACCCTCCGGGCCGTTG CTTCGCAACGTTCAAATCCGCTCCCGGCGGATTTGTCCTACTCAGG AGAGCGTTCACCGACAAACAACAGATAAAACGAAAGGCCCAGTC TTTCGACTGAGCCTTTCGTTTTATTTGATGCCTGGCAGTTCCCTACT CTCGCATGGGGAGACCCCACACTACCATCGGCGCTACGGCGTTTC ACTTCTGAGTTCGGCATGGGGTCAGGTGGGACCACCGCGCTACTG CCGCCAGGCAAATTCTGTTTTATCAGACCGCTTCTGCGTTCTGATT TAATCTGTATCAGGCTGAAAATCTTCTCTCATCCGCCAAAACAGCC AAGCTGGAGACCGTTTGGCCCCCCTCGAGCACGTAGAAAGCCAGT CCGCAGAAACGGTGCTGACCCCGGATGAATGTCAGCTACTGGGCT ATCTGGACAAGGGAAAACGCAAGCGCAAAGAGAAAGCAGGTAGC 
TTGCAGTGGGCTTACATGGCGATAGCTAGACTGGGCGGTTTTATG GACAGCAAGCGAACCGGAATTGCCAGCTGGGGCGCCCTCTGGTAA GGTTGGGAAGCCCTGCAAAGTAAACTGGATGGCTTTCTCGCCGCC AAGGATCTGATGGCGCAGGGGATCA SEQ ID NO: 73 ACGCGTTGACATTGATTATTGACTAGTTATTAATAGTAATCAATTA pCMV PE2 P2A Cre CGGGGTCATTAGTTCATAGCCCATATATGGAGTTCCGCGTTACATA plasmid ACTTACGGTAAATGGCCCGCCTGGCTGACCGCCCAACGACCCCCG (Artificial Sequence) CCCATTGACGTCAATAATGACGTATGTTCCCATAGTAACGCCAATA GGGACTTTCCATTGACGTCAATGGGTGGAGTATTTACGGTAAACT GCCCACTTGGCAGTACATCAAGTGTATCATATGCCAAGTACGCCC CCTATTGACGTCAATGACGGTAAATGGCCCGCCTGGCATTATGCCC AGTACATGACCTTATGGGACTTTCCTACTTGGCAGTACATCTACGT ATTAGTCATCGCTATTACCATGGTGATGCGGTTTTGGCAGTACATC AATGGGCGTGGATAGCGGTTTGACTCACGGGGATTTCCAAGTCTC CACCCCATTGACGTCAATGGGAGTTTGTTTTGGCACCAAAATCAAC GGGACTTTCCAAAATGTCGTAACAACTCCGCCCCATTGACGCAAA TGGGCGGTAGGCGTGTACGGTGGGAGGTCTATATAAGCAGAGCTG GTTTAGTGAACCGTCAGATCCGCTAGAGATCCGCGGCCGCTAATA CGACTCACTATAGGGAGAGCCGCCACCATGAAACGGACAGCCGAC GGAAGCGAGTTCGAGTCACCAAAGAAGAAGCGGAAAGTCGACAA GAAGTACAGCATCGGCCTGGACATCGGCACCAACTCTGTGGGCTG GGCCGTGATCACCGACGAGTACAAGGTGCCCAGCAAGAAATTCAA GGTGCTGGGCAACACCGACCGGCACAGCATCAAGAAGAACCTGAT CGGAGCCCTGCTGTTCGACAGCGGCGAAACAGCCGAGGCCACCCG GCTGAAGAGAACCGCCAGAAGAAGATACACCAGACGGAAGAACC GGATCTGCTATCTGCAAGAGATCTTCAGCAACGAGATGGCCAAGG TGGACGACAGCTTCTTCCACAGACTGGAAGAGTCCTTCCTGGTGG AAGAGGATAAGAAGCACGAGCGGCACCCCATCTTCGGCAACATCG TGGACGAGGTGGCCTACCACGAGAAGTACCCCACCATCTACCACC TGAGAAAGAAACTGGTGGACAGCACCGACAAGGCCGACCTGCGG CTGATCTATCTGGCCCTGGCCCACATGATCAAGTTCCGGGGCCACT TCCTGATCGAGGGCGACCTGAACCCCGACAACAGCGACGTGGACA AGCTGTTCATCCAGCTGGTGCAGACCTACAACCAGCTGTTCGAGG AAAACCCCATCAACGCCAGCGGCGTGGACGCCAAGGCCATCCTGT CTGCCAGACTGAGCAAGAGCAGACGGCTGGAAAATCTGATCGCCC AGCTGCCCGGCGAGAAGAAGAATGGCCTGTTCGGAAACCTGATTG CCCTGAGCCTGGGCCTGACCCCCAACTTCAAGAGCAACTTCGACC TGGCCGAGGATGCCAAACTGCAGCTGAGCAAGGACACCTACGACG ACGACCTGGACAACCTGCTGGCCCAGATCGGCGACCAGTACGCCG ACCTGTTTCTGGCCGCCAAGAACCTGTCCGACGCCATCCTGCTGAG CGACATCCTGAGAGTGAACACCGAGATCACCAAGGCCCCCCTGAG CGCCTCTATGATCAAGAGATACGACGAGCACCACCAGGACCTGAC 
CCTGCTGAAAGCTCTCGTGCGGCAGCAGCTGCCTGAGAAGTACAA AGAGATTTTCTTCGACCAGAGCAAGAACGGCTACGCCGGCTACAT TGACGGCGGAGCCAGCCAGGAAGAGTTCTACAAGTTCATCAAGCC CATCCTGGAAAAGATGGACGGCACCGAGGAACTGCTCGTGAAGCT GAACAGAGAGGACCTGCTGCGGAAGCAGCGGACCTTCGACAACG GCAGCATCCCCCACCAGATCCACCTGGGAGAGCTGCACGCCATTC TGCGGCGGCAGGAAGATTTTTACCCATTCCTGAAGGACAACCGGG AAAAGATCGAGAAGATCCTGACCTTCCGCATCCCCTACTACGTGG GCCCTCTGGCCAGGGGAAACAGCAGATTCGCCTGGATGACCAGAA AGAGCGAGGAAACCATCACCCCCTGGAACTTCGAGGAAGTGGTGG ACAAGGGCGCTTCCGCCCAGAGCTTCATCGAGCGGATGACCAACT TCGATAAGAACCTGCCCAACGAGAAGGTGCTGCCCAAGCACAGCC TGCTGTACGAGTACTTCACCGTGTATAACGAGCTGACCAAAGTGA AATACGTGACCGAGGGAATGAGAAAGCCCGCCTTCCTGAGCGGCG AGCAGAAAAAGGCCATCGTGGACCTGCTGTTCAAGACCAACCGGA AAGTGACCGTGAAGCAGCTGAAAGAGGACTACTTCAAGAAAATC GAGTGCTTCGACTCCGTGGAAATCTCCGGCGTGGAAGATCGGTTC AACGCCTCCCTGGGCACATACCACGATCTGCTGAAAATTATCAAG GACAAGGACTTCCTGGACAATGAGGAAAACGAGGACATTCTGGA AGATATCGTGCTGACCCTGACACTGTTTGAGGACAGAGAGATGAT CGAGGAACGGCTGAAAACCTATGCCCACCTGTTCGACGACAAAGT GATGAAGCAGCTGAAGCGGCGGAGATACACCGGCTGGGGCAGGC TGAGCCGGAAGCTGATCAACGGCATCCGGGACAAGCAGTCCGGCA AGACAATCCTGGATTTCCTGAAGTCCGACGGCTTCGCCAACAGAA ACTTCATGCAGCTGATCCACGACGACAGCCTGACCTTTAAAGAGG ACATCCAGAAAGCCCAGGTGTCCGGCCAGGGCGATAGCCTGCACG AGCACATTGCCAATCTGGCCGGCAGCCCCGCCATTAAGAAGGGCA TCCTGCAGACAGTGAAGGTGGTGGACGAGCTCGTGAAAGTGATGG GCCGGCACAAGCCCGAGAACATCGTGATCGAAATGGCCAGAGAG AACCAGACCACCCAGAAGGGACAGAAGAACAGCCGCGAGAGAAT GAAGCGGATCGAAGAGGGCATCAAAGAGCTGGGCAGCCAGATCC TGAAAGAACACCCCGTGGAAAACACCCAGCTGCAGAACGAGAAG CTGTACCTGTACTACCTGCAGAATGGGCGGGATATGTACGTGGAC CAGGAACTGGACATCAACCGGCTGTCCGACTACGATGTGGACGCT ATCGTGCCTCAGAGCTTTCTGAAGGACGACTCCATCGACAACAAG GTGCTGACCAGAAGCGACAAGAACCGGGGCAAGAGCGACAACGT GCCCTCCGAAGAGGTCGTGAAGAAGATGAAGAACTACTGGCGGC AGCTGCTGAACGCCAAGCTGATTACCCAGAGAAAGTTCGACAATC TGACCAAGGCCGAGAGAGGCGGCCTGAGCGAACTGGATAAGGCC GGCTTCATCAAGAGACAGCTGGTGGAAACCCGGCAGATCACAAAG CACGTGGCACAGATCCTGGACTCCCGGATGAACACTAAGTACGAC GAGAATGACAAGCTGATCCGGGAAGTGAAAGTGATCACCCTGAA GTCCAAGCTGGTGTCCGATTTCCGGAAGGATTTCCAGTTTTACAAA 
GTGCGCGAGATCAACAACTACCACCACGCCCACGACGCCTACCTG AACGCCGTCGTGGGAACCGCCCTGATCAAAAAGTACCCTAAGCTG GAAAGCGAGTTCGTGTACGGCGACTACAAGGTGTACGACGTGCGG AAGATGATCGCCAAGAGCGAGCAGGAAATCGGCAAGGCTACCGC CAAGTACTTCTTCTACAGCAACATCATGAACTTTTTCAAGACCGAG ATTACCCTGGCCAACGGCGAGATCCGGAAGCGGCCTCTGATCGAG ACAAACGGCGAAACCGGGGAGATCGTGTGGGATAAGGGCCGGGA TTTTGCCACCGTGCGGAAAGTGCTGAGCATGCCCCAAGTGAATAT CGTGAAAAAGACCGAGGTGCAGACAGGCGGCTTCAGCAAAGAGT CTATCCTGCCCAAGAGGAACAGCGATAAGCTGATCGCCAGAAAGA AGGACTGGGACCCTAAGAAGTACGGCGGCTTCGACAGCCCCACCG TGGCCTATTCTGTGCTGGTGGTGGCCAAAGTGGAAAAGGGCAAGT CCAAGAAACTGAAGAGTGTGAAAGAGCTGCTGGGGATCACCATCA TGGAAAGAAGCAGCTTCGAGAAGAATCCCATCGACTTTCTGGAAG CCAAGGGCTACAAAGAAGTGAAAAAGGACCTGATCATCAAGCTG CCTAAGTACTCCCTGTTCGAGCTGGAAAACGGCCGGAAGAGAATG CTGGCCTCTGCCGGCGAACTGCAGAAGGGAAACGAACTGGCCCTG CCCTCCAAATATGTGAACTTCCTGTACCTGGCCAGCCACTATGAGA AGCTGAAGGGCTCCCCCGAGGATAATGAGCAGAAACAGCTGTTTG TGGAACAGCACAAGCACTACCTGGACGAGATCATCGAGCAGATCA GCGAGTTCTCCAAGAGAGTGATCCTGGCCGACGCTAATCTGGACA AAGTGCTGTCCGCCTACAACAAGCACCGGGATAAGCCCATCAGAG AGCAGGCCGAGAATATCATCCACCTGTTTACCCTGACCAATCTGG GAGCCCCTGCCGCCTTCAAGTACTTTGACACCACCATCGACCGGA AGAGGTACACCAGCACCAAAGAGGTGCTGGACGCCACCCTGATCC ACCAGAGCATCACCGGCCTGTACGAGACACGGATCGACCTGTCTC AGCTGGGAGGTGACTCTGGAGGATCTAGCGGAGGATCCTCTGGCA GCGAGACACCAGGAACAAGCGAGTCAGCAACACCAGAGAGCAGT GGCGGCAGCAGCGGCGGCAGCAGCACCCTAAATATAGAAGATGA GTATCGGCTACATGAGACCTCAAAAGAGCCAGATGTTTCTCTAGG GTCCACATGGCTGTCTGATTTTCCTCAGGCCTGGGCGGAAACCGG GGGCATGGGACTGGCAGTTCGCCAAGCTCCTCTGATCATACCTCTG AAAGCAACCTCTACCCCCGTGTCCATAAAACAATACCCCATGTCA CAAGAAGCCAGACTGGGGATCAAGCCCCACATACAGAGACTGTTG GACCAGGGAATACTGGTACCCTGCCAGTCCCCCTGGAACACGCCC CTGCTACCCGTTAAGAAACCAGGGACTAATGATTATAGGCCTGTC CAGGATCTGAGAGAAGTCAACAAGCGGGTGGAAGACATCCACCC CACCGTGCCCAACCCTTACAACCTCTTGAGCGGGCTCCCACCGTCC CACCAGTGGTACACTGTGCTTGATTTAAAGGATGCCTTTTTCTGCC TGAGACTCCACCCCACCAGTCAGCCTCTCTTCGCCTTTGAGTGGAG AGATCCAGAGATGGGAATCTCAGGACAATTGACCTGGACCAGACT CCCACAGGGTTTCAAAAACAGTCCCACCCTGTTTAATGAGGCACT GCACAGAGACCTAGCAGACTTCCGGATCCAGCACCCAGACTTGAT 
CCTGCTACAGTACGTGGATGACTTACTGCTGGCCGCCACTTCTGAG CTAGACTGCCAACAAGGTACTCGGGCCCTGTTACAAACCCTAGGG AACCTCGGGTATCGOGCCTCGGCCAAGAAAGCCCAAATTTGCCAG AAACAGGTCAAGTATCTGGGGTATCTTCTAAAAGAGGGTCAGAGA TGGCTGACTGAGGCCAGAAAAGAGACTGTGATGGGGCAGCCTACT CCGAAGACCCCTCGACAACTAAGGGAGTTCCTAGGGAAGGCAGGC TTCTGTCGCCTCTTCATCCCTGGGTTTGCAGAAATGGCAGCCCCCC TGTACCCTCTCACCAAACCGGGGACTCTGTTTAATTGGGGCCCAGA CCAACAAAAGGCCTATCAAGAAATCAAGCAAGCTCTTCTAACTGC CCCAGCCCTGGGGTTGCCAGATTTGACTAAGCCCTTTGAACTCTTT GTCGACGAGAAGCAGGGCTACGCCAAAGGTGTCCTAACGCAAAA ACTGGGACCTTGGCGTCGGCCGGTGGCCTACCTGTCCAAAAAGCT AGACCCAGTAGCAGCTGGGTGGCCCCCTTGCCTACGGATGGTAGC AGCCATTGCCGTACTGACAAAGGATGCAGGCAAGCTAACCATGGG ACAGCCACTAGTCATTCTGGCCCCCCATGCAGTAGAGGCACTAGT CAAACAACCCCCCGACCGCTGGCTTTCCAACGCCCGGATGACTCA CTATCAGGCCTTGCTTTTGGACACGGACCGGGTCCAGTTCGGACCG GTGGTAGCCCTGAACCCGGCTACGCTGCTCCCACTGCCTGAGGAA GGGCTGCAACACAACTGCCTTGATATCCTGGCCGAAGCCCACGGA ACCCGACCCGACCTAACGGACCAGCCGCTCCCAGACGCCGACCAC ACCTGGTACACGGATGGAAGCAGTCTCTTACAAGAGGGACAGCGT AAGGCGGGAGCTGCGGTGACCACCGAGACCGAGGTAATCTGGGCT AAAGCCCTGCCAGCCGGGACATCCGCTCAGCGGGCTGAACTGATA GCACTCACCCAGGCCCTAAAGATGGCAGAAGGTAAGAAGCTAAAT GTTTATACTGATAGCCGTTATGCTTTTGCTACTGCCCATATCCATG GAGAAATATACAGAAGGCGTGGGTGGCTCACATCAGAAGGCAAA GAGATCAAAAATAAAGACGAGATCTTGGCCCTACTAAAAGCCCTC TTTCTGCCCAAAAGACTTAGCATAATCCATTGTCCAGGACATCAAA AGGGACACAGCGCCGAGGCTAGAGGCAACCGGATGGCTGACCAA GCGGCCCGAAAGGCAGCCATCACAGAGACTCCAGACACCTCTACC CTCCTCATAGAAAATTCATCACCCTCTGGCGGCTCAAAAAGAACC GCCGACGGCAGCGAATTCGAGCCCAAGAAGAAGAGGAAAGTCGG AAGCGGAGCTACTAACTTCAGCCTGCTGAAGCAGGCTGGCGACGT GGAGGAGAACCCTGGACCTAATTTACTGACCGTACACCAAAATTT GCCTGCATTACCGGTCGATGCAACGAGTGATGAGGTTCGCAAGAA CCTGATGGACATGTTCAGGGATCGCCAGGCGTTTTCTGAGCATACC TGGAAAATGCTTCTGTCCGTTTGCCGGTCGTGGGCGGCATGGTGCA AGTTGAATAACCGGAAATGGTTTCCCGCAGAACCTGAAGATGTTC GCGATTATCTTCTATATCTTCAGGCGCGCGGTCTGGCAGTAAAAAC TATCCAGCAACATTTGGGCCAGCTAAACATGCTTCATCGTCGGTCC GGGCTGCCACGACCAAGTGACAGCAATGCTGTTTCACTGGTTATG CGGCGGATCCGAAAAGAAAACGTTGATGCCGGTGAACGTGCAAA ACAGGCTCTAGCGTTCGAACGCACTGATTTCGACCAGGTTCGTTCA 
CTCATGGAAAATAGCGATCGCTGCCAGGATATACGTAATCTGGCA TTTCTGGGGATTGCTTATAACACCCTGTTACGTATAGCCGAAATTG CCAGGATCAGGGTTAAAGATATCTCACGTACTGACGGTGGGAGAA TGTTAATCCATATTGGCAGAACGAAAACGCTGGTTAGCACCGCAG GTGTAGAGAAGGCACTTAGCCTGGGGGTAACTAAACTGGTCGAGC GATGGATTTCCGTCTCTGGTGTAGCTGATGATCCGAATAACTACCT GTTTTGCCGGGTCAGAAAAAATGGTGTTGCCGCGCCATCTGCCAC CAGCCAGCTATCAACTCGCGCCCTGGAAGGGATTTTTGAAGCAAC TCATCGATTGATTTACGGCGCTAAGGATGACTCTGGTCAGAGATA CCTGGCCTGGTCTGGACACAGTGCCCGTGTCGGAGCCGCGCGAGA TATGGCCCGCGCTGGAGTTTCAATACCGGAGATCATGCAAGCTGG TGGCTGGACCAATGTAAATATTGTCATGAACTATATCCGTAACCTG GATAGTGAAACAGGGGCAATGGTGCGCCTGCTGGAAGATGGCGAT TAATTTAAACCCGCTGATCAGCCTCGACTGTGCCTTCTAGTTGCCA GCCATCTGTTGTTTGCCCCTCCCCCGTGCCTTCCTTGACCCTGGAA GGTGCCACTCCCACTGTCCTTTCCTAATAAAATGAGAAAATTGCAT CGCATTGTCTGAGTAGGTGTCATTCTATTCTGGGGGGTGGGGTGGG GCAGGACAGCAAGGGGGAGGATTGGGAAGACAATAGCAGGCATG CTGGGGATGCGGTGGGCTCTATGGCTTCTGAGGCGGAAAGAACCA GCTGGGGCTCGATACCGTCGACCTCTAGCTAGAGCTTGGCGTAAT CATGGTCATAGCTGTTTCCTGTGTGAAATTGTTATCCGCTCACAAT TCCACACAACATACGAGCCGGAAGCATAAAGTGTAAAGCCTAGGG TGCCTAATGAGTGAGCTAACTCACATTAATTGCGTTGCGCTCACTG CCCGCTTTCCAGTCGGGAAACCTGTCGTGCCAGCTGCATTAATGAA tcggccaacgcgcggggagaggcggtttgcgtattgggcgctctt CCGCTTCCTCGCTCACTGACTCGCTGCGCTCGGTCGTTCGGCTGCG GCGAGCGGTATCAGCTCACTCAAAGGCGGTAATACGGTTATCCAC AGAATCAGGGGATAACGCAGGAAAGAACATGTGAGCAAAAGGCC AGCAAAAGGCCAGGAACCGTAAAAAGGCCGCGTTGCTGGCGTTTT TCCATAGGCTCCGCCCCCCTGACGAGCATCACAAAAATCGACGCT CAAGTCAGAGGTGGCGAAACCCGACAGGACTATAAAGATACCAG GCGTTTCCCCCTGGAAGCTCCCTCGTGCGCTCTCCTGTTCCGACCC TGCCGCTTACCGGATACCTGTCCGCCTTTCTCCCTTCGGGAAGCGT GGCGCTTTCTCATAGCTCACGCTGTAGGTATCTCAGTTCGGTGTAG GTCGTTCGCTCCAAGCTGGGCTGTGTGCACGAACCCCCCGTTCAGC CCGACCGCTGCGCCTTATCCGGTAACTATCGTCTTGAGTCCAACCC GGTAAGACACGACTTATCGCCACTGGCAGCAGCCACTGGTAACAG GATTAGCAGAGCGAGGTATGTAGGCGGTGCTACAGAGTTCTTGAA GTGGTGGCCTAACTACGGCTACACTAGAAGAACAGTATTTGGTAT CTGCGCTCTGCTGAAGCCAGTTACCTTCGGAAAAAGAGTTGGTAG CTCTTGATCCGGCAAACAAACCACCGCTGGTAGCGGTGGTTTTTTT GTTTGCAAGCAGCAGATTACGCGCAGAAAAAAAGGATCTCAAGA AGATCCTTTGATCTTTTCTACGGGGTCTGACGCTCAGTGGAACGAA 
AACTCACGTTAAGGGATTTTGGTCATGAGATTATCAAAAAGGATC TTCACCTAGATCCTTTTAAATTAAAAATGAAGTTTTAAATCAATCT AAAGTATATATGAGTAAACTTGGTCTGACAGTTACCAATGCTTAAT CAGTGAGGCACCTATCTCAGCGATCTGTCTATTTCGTTCATCCATA GTTGCCTGACTCCCCGTCGTGTAGATAACTACGATACGGGAGGGC TTACCATCTGGCCCCAGTGCTGCAATGATACCGCGAGACCCACGC TCACCGGCTCCAGATTTATCAGCAATAAACCAGCCAGCCGGAAGG GCCGAGCGCAGAAGTGGTCCTGCAACTTTATCCGCCTCCATCCAGT CTATTAATTGTTGCCGGGAAGCTAGAGTAAGTAGTTCGCCAGTTA ATAGTTTGCGCAACGTTGTTGCCATTGCTACAGGCATCGTGGTGTC ACGCTCGTCGTTTGGTATGGCTTCATTCAGCTCCGGTTCCCAACGA TCAAGGCGAGTTACATGATCCCCCATGTTGTGCAAAAAAGCGGTT AGCTCCTTCGGTCCTCCGATCGTTGTCAGAAGTAAGTTGGCCGCAG TGTTATCACTCATGGTTATGGCAGCACTGCATAATTCTCTTACTGT CATGCCATCCGTAAGATGCTTTTCTGTGACTGGTGAGTACTCAACC AAGTCATTCTGAGAATAGTGTATGCGGCGACCGAGTTGCTCTTGCC CGGCGTCAATACGGGATAATACCGCGCCACATAGCAGAACTTTAA AAGTGCTCATCATTGGAAAACGTTCTTCGGGGCGAAAACTCTCAA GGATCTTACCGCTGTTGAGATCCAGTTCGATGTAACCCACTCGTGC ACCCAACTGATCTTCAGCATCTTTTACTTTCACCAGCGTTTCTGGG TGAGCAAAAACAGGAAGGCAAAATGCCGCAAAAAAGGGAATAAG GGCGACACGGAAATGTTGAATACTCATACTCTTCCTTTTTCAATAT TATTGAAGCATTTATCAGGGTTATTGTCTCATGAGCGGATACATAT TTGAATGTATTTAGAAAAATAAACAAATAGGGGTTCCGCGCACAT TTCCCCGAAAAGTGCCACCTGACGTCGACGGATCGGGAGATCGAT CTCCCGATCCCCTAGGGTCGACTCTCAGTACAATCTGCTCTGATGC CGCATAGTTAAGCCAGTATCTGCTCCCTGCTTGTGTGTTGGAGGTC GCTGAGTAGTGCGCGAGCAAAATTTAAGCTACAACAAGGCAAGGC TTGACCGACAATTGCATGAAGAATCTGCTTAGGGTTAGGCGTTTTG CGCTGCTTCGCGATGTACGGGCCAGATAT SEQ ID NO: 74 GTCAACCAGTATCCCGGTGC +90 ngRNA guide sequence (Artificial Sequence) SEQ ID NO: 75 GTCAACCAGTATCCCGGTGCGTTTTAGAGCTAGAAATAGCAAGTT +90 ngRNA AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGT (Artificial Sequence) CGGTGC SEQ ID NO: 76 TGATCCCCTGCGCCATCAGATCCTTGGCGGCGAGAAAGCCATCCA GFP minicircle GTTTACTTTGCAGGGCTTCCCAACCTTACCAGAGGGCGCCCCAGCT template (before GGCAATTCCGGTTCGCTTGCTGTCCATAAAACCGCCCAGTCTAGCT cleavage into a ATCGCCATGTAAGCCCACTGCAAGCTACCTGCTTTCTCTTTGCGCT minicircle) TGCGTTTTCCCTTGTCCAGATAGCCCAGTAGCTGACATTCATCCGG (Artificial Sequence) GGTCAGCACCGTTTCTGCGGACTGGCTTTCTACGTGCTCGAGGGGG 
GCCAAACGGTCTCCAGCTTGGCTGTTTTGGCGGATGAGAGAAGAT TTTCAGCCTGATACAGATTAAATCAGAACGCAGAAGCGGTCTGAT AAAACAGAATTTGCCTGGCGGCAGTAGCGCGGTGGTCCCACCTGA CCCCATGCCGAACTCAGAAGTGAAACGCCGTAGCGCCGATGGTAG TGTGGGGTCTCCCCATGCGAGAGTAGGGAACTGCCAGGCATCAAA TAAAACGAAAGGCTCAGTCGAAAGACTGGGCCTTTCGTTTTATCT GTTGTTTGTCGGTGAACGCTCTCCTGAGTAGGACAAATCCGCCGG GAGCGGATTTGAACGTTGCGAAGCAACGGCCCGGAGGGTGGCGG GCAGGACGCCCGCCATAAACTGCCAGGCATCAAATTAAGCAGAAG GCCATCCTGACGGATGGCCTTTTTGCGTTTCTACAAACTCTTTTGTT TATTTTTCTAAATACATTCAAATATGTATCCGCTCATGACCAAAAT CCCTTAACGTGAGTTTTCGTTCCACTGAGCGTCAGACCCCGTAGAA AAGATCAAAGGATCTTCTTGAGATCCTTTTTTTCTGCGCGTAATCT GCTGCTTGCAAACAAAAAAACCACCGCTACCAGCGGTGGTTTGTT TGCCGGATCAAGAGCTACCAACTCTTTTTCCGAAGGTAACTGGCTT CAGCAGAGCGCAGATACCAAATACTGTCCTTCTAGTGTAGCCGTA GTTAGGCCACCACTTCAAGAACTCTGTAGCACCGCCTACATACCTC GCTCTGCTAATCCTGTTACCAGTGGCTGCTGCCAGTGGCGATAAGT CGTGTCTTACCGGGTTGGACTCAAGACGATAGTTACCGGATAAGG CGCAGCGGTCGGGCTGAACGGGGGGTTCGTGCACACAGCCCAGCT TGGAGCGAACGACCTACACCGAACTGAGATACCTACAGCGTGAGC TATGAGAAAGCGCCACGCTTCCCGAAGGGAGAAAGGCGGACAGG TATCCGGTAAGCGGCAGGGTCGGAACAGGAGAGCGCACGAGGGA GCTTCCAGGGGGAAACGCCTGGTATCTTTATAGTCCTGTCGGGTTT CGCCACCTCTGACTTGAGCGTCGATTTTTGTGATGCTCGTCAGGGG GGCGGAGCCTATGGAAAAACGCCAGCAACGCGGCCTTTTTACGGT TCCTGGCCTTTTGCTGGCCTTTTGCTCACATGTTCTTTCCTGCGTTA TCCCCTGATTCTGTGGATAACCGTATTACCGCCTTTGAGTGAGCTG ATACCGCTCGCCGCAGCCGAACGACCGAGCGCAGCGAGTCAGTGA GCGAGGAAGCGGAAGAGCGCCTGATGCGGTATTTTCTCCTTACGC ATCTGTGCGGTATTTCACACCGCATATGGTGCACTCTCAGTACAAT CTGCTCTGATGCCGCATAGTTAAGCCAGTATACACTCCGCTATCGC TACGTGACTGGGTCATGGCTGCGCCCCGACACCCGCCAACACCCG CTGACGCGCCCTGACGGGCTTGTCTGCTCCCGGCATCCGCTTACAG ACAAGCTGTGACCGTCTCCGGGAGCTGCATGTGTCAGAGGTTTTC ACCGTCATCACCGAAACGCGCGAGGCAGCAGATCAATTCGCGCGC GAAGGCGAAGCGGCATGCATAATGTGCCTGTCAAATGGACGAAGC AGGGATTCTGCAAACCCTATGCTACTCCGTCAAGCCGTCAATTGTC TGATTCGTTACCAATTATGACAACTTGACGGCTACATCATTCACTT TTTCTTCACAACCGGCACGGAACTCGCTCGGGCTGGCCCCGGTGC ATTTTTTAAATACCCGCGAGAAATAGAGTTGATCGTCAAAACCAA CATTGCGACCGACGGTGGCGATAGGCATCCGGGTGGTGCTCAAAA GCAGCTTCGCCTGGCTGATACGTTGGTCCTCGCGCCAGCTTAAGAC 
GCTAATCCCTAACTGCTGGCGGAAAAGATGTGACAGACGCGACGG CGACAAGCAAACATGCTGTGCGACGCTGGCGATACATTACCCTGT TATCCCTAGATGACATTACCCTGTTATCCCAGATGACATTACCCTG TTATCCCTAGATGACATTACCCTGTTATCCCTAGATGACATTTACC CTGTTATCCCTAGATGACATTACCCTGTTATCCCAGATGACATTAC CCTGTTATCCCTAGATACATTACCCTGTTATCCCAGATGACATACC CTGTTATCCCTAGATGACATTACCCTGTTATCCCAGATGACATTAC CCTGTTATCCCTAGATACATTACCCTGTTATCCCAGATGACATACC CTGTTATCCCTAGATGACATTACCCTGTTATCCCAGATGACATTAC CCTGTTATCCCTAGATACATTACCCTGTTATCCCAGATGACATACC CTGTTATCCCTAGATGACATTACCCTGTTATCCCAGATGACATTAC CCTGTTATCCCTAGATACATTACCCTGTTATCCCAGATGACATACC CTGTTATCCCTAGATGACATTACCCTGTTATCCCAGATGACATTAC CCTGTTATCCCTAGATACATTACCCTGTTATCCCAGATGACATACC CTGTTATCCCTAGATGACATTACCCTGTTATCCCAGATGACATTAC CCTGTTATCCCTAGATACATTACCCTGTTATCCCAGATGACATACC CTGTTATCCCTAGATGACATTACCCTGTTATCCCAGATAAACTCAA TGATGATGATGATGATGGTCGAGACTCAGCGGCCGCGGTGCCAGG GCGTGCCCTTGGGCTCCCCGGGCGCGACTATAAGCTGCGAGCAAC TTCACTTGGGTATGCCGGCGGTAGCGCTTACCGTTCGTATAATGTA TGCTATACGAAGTTATCCGAAGCCGCTAGCGGTGGTTTGTCTGGTC AACCACCGCGGTCTCAGTGGTGTACGGTACAAACCCAGCTACCGG TCGCCACCATGCCCGCCATGAAGATCGAGTGCCGCATCACCGGCA CCCTGAACGGCGTGGAGTTCGAGCTGGTGGGCGGCGGAGAGGGC ACCCCCGAGCAGGGCCGCATGACCAACAAGATGAAGAGCACCAA AGGCGCCCTGACCTTCAGCCCCTACCTGCTGAGCCACGTGATGGG CTACGGCTTCTACCACTTCGGCACCTACCCCAGCGGCTACGAGAA CCCCTTCCTGCACGCCATCAACAACGGCGGCTACACCAACACCCG CATCGAGAAGTACGAGGACGGCGGCGTGCTGCACGTGAGCTTCAG CTACCGCTACGAGGCCGGCCGCGTGATCGGCGACTTCAAGGTGGT GGGCACCGGCTTCCCCGAGGACAGCGTGATCTTCACCGACAAGAT CATCCGCAGCAACGCCACCGTGGAGCACCTGCACCCCATGGGCGA TAACGTGCTGGTGGGCAGCTTCGCCCGCACCTTCAGCCTGCGCGA CGGCGGCTACTACAGCTTCGTGGTGGACAGCCACATGCACTTCAA GAGCGCCATCCACCCCAGCATCCTGCAGAACGGGGGCCCCATGTT CGCCTTCCGCCGCGTGGAGGAGCTGCACAGCAACACCGAGCTGGG CATCGTGGAGTACCAGCACGCCTTCAAGACCCCCATCGCCTTCGCC AGATCTCGAGCTCGATGAGTTTGGACAAACCACAACTAGAATGCA GTGAAAAAAATGCTTTATTTGTGAAATTTGTGATGCTATTGCTTTA TTTGTGGGCCCGCCCCAACTGGGGTAACCTTTGAGTTCTCTCAGTT GGGGGTAATCAGCATCATGATGTGGTACCACATCATGATGCTGAT TATAAGAATGCGGCCGCCACACTCTAGTGGATCTCGAGTTAATAA TTCAGAAGAACTCGTCAAGAAGGCGATAGAAGGCGATGCGCTGCG 
AATCGGGAGCGGCGATACCGTAAAGCACGAGGAAGCGGTCAGCC CATTCGCCGCCAAGCTCTTCAGCAATATCACGGGTAGCCAACGCT ATGTCCTGATAGCGGTCCGCCACACCCAGCCGGCCACAGTCGATG AATCCAGAAAAGCGGCCATTTTCCACCATGATATTCGGCAAGCAG GCATCGCCATGGGTCACGACGAGATCCTCGCCGTCGGGCATGCTC GCCTTGAGCCTGGCGAACAGTTCGGCTGGCGCGAGCCCCTGATGC TCTTCGTCCAGATCATCCTGATCGACAAGACCGGCTTCCATCCGAG TACGTGCTCGCTCGATGCGATGTTTCGCTTGGTGGTCGAATGGGCA GGTAGCCGGATCAAGCGTATGCAGCCGCCGCATTGCATCAGCCAT GATGGATACTTTCTCGGCAGGAGCAAGGTGTAGATGACATGGAGA TCCTGCCCCGGCACTTCGCCCAATAGCAGCCAGTCCCTTCCCGCTT CAGTGACAACGTCGAGCACAGCTGCGCAAGGAACGCCCGTCGTGG CCAGCCACGATAGCCGCGCTGCCTCGTCTTGCAGTTCATTCAGGGC ACCGGACAGGTCGGTCTTGACAAAAAGAACCGGGCGCCCCTGCGC TGACAGCCGGAACACGGCGGCATCAGAGCAGCCGATTGTCTGTTG TGCCCAGTCATAGCCGAATAGCCTCTCCACCCAAGCGGCCGGAGA ACCTGCGTGCAATCCATCTTGTTCAATCATGCGAAACGATCCTCAT CCTGTCTCTTGATCAGAGCT SEQ ID NO: 77 TGATCCCCTGCGCCATCAGATCCTTGGCGGCGAGAAAGCCATCCA Gaussia Luciferase GTTTACTTTGCAGGGCTTCCCAACCTTACCAGAGGGCGCCCCAGCT minicircle template GGCAATTCCGGTTCGCTTGCTGTCCATAAAACCGCCCAGTCTAGCT (Artificial Sequence) ATCGCCATGTAAGCCCACTGCAAGCTACCTGCTTTCTCTTTGCGCT TGCGTTTTCCCTTGTCCAGATAGCCCAGTAGCTGACATTCATCCGG GGTCAGCACCGTTTCTGCGGACTGGCTTTCTACGTGCTCGAGGGGG GCCAAACGGTCTCCAGCTTGGCTGTTTTGGCGGATGAGAGAAGAT TTTCAGCCTGATACAGATTAAATCAGAACGCAGAAGCGGTCTGAT AAAACAGAATTTGCCTGGCGGCAGTAGCGCGGTGGTCCCACCTGA CCCCATGCCGAACTCAGAAGTGAAACGCCGTAGCGCCGATGGTAG TGTGGGGTCTCCCCATGCGAGAGTAGGGAACTGCCAGGCATCAAA TAAAACGAAAGGCTCAGTCGAAAGACTGGGCCTTTCGTTTTATCT GTTGTTTGTCGGTGAACGCTCTCCTGAGTAGGACAAATCCGCCGG GAGCGGATTTGAACGTTGCGAAGCAACGGCCCGGAGGGTGGCGG GCAGGACGCCCGCCATAAACTGCCAGGCATCAAATTAAGCAGAAG GCCATCCTGACGGATGGCCTTTTTGCGTTTCTACAAACTCTTTTGTT TATTTTTCTAAATACATTCAAATATGTATCCGCTCATGACCAAAAT CCCTTAACGTGAGTTTTCGTTCCACTGAGCGTCAGACCCCGTAGAA AAGATCAAAGGATCTTCTTGAGATCCTTTTTTTCTGCGCGTAATCT GCTGCTTGCAAACAAAAAAACCACCGCTACCAGCGGTGGTTTGTT TGCCGGATCAAGAGCTACCAACTCTTTTTCCGAAGGTAACTGGCTT CAGCAGAGCGCAGATACCAAATACTGTCCTTCTAGTGTAGCCGTA GTTAGGCCACCACTTCAAGAACTCTGTAGCACCGCCTACATACCTC GCTCTGCTAATCCTGTTACCAGTGGCTGCTGCCAGTGGCGATAAGT 
CGTGTCTTACCGGGTTGGACTCAAGACGATAGTTACCGGATAAGG CGCAGCGGTCGGGCTGAACGGGGGGTTCGTGCACACAGCCCAGCT TGGAGCGAACGACCTACACCGAACTGAGATACCTACAGCGTGAGC TATGAGAAAGCGCCACGCTTCCCGAAGGGAGAAAGGCGGACAGG TATCCGGTAAGCGGCAGGGTCGGAACAGGAGAGCGCACGAGGGA GCTTCCAGGGGGAAACGCCTGGTATCTTTATAGTCCTGTCGGGTTT CGCCACCTCTGACTTGAGCGTCGATTTTTGTGATGCTCGTCAGGGG GGCGGAGCCTATGGAAAAACGCCAGCAACGCGGCCTTTTTACGGT TCCTGGCCTTTTGCTGGCCTTTTGCTCACATGTTCTTTCCTGCGTTA TCCCCTGATTCTGTGGATAACCGTATTACCGCCTTTGAGTGAGCTG ATACCGCTCGCCGCAGCCGAACGACCGAGCGCAGCGAGTCAGTGA GCGAGGAAGCGGAAGAGCGCCTGATGCGGTATTTTCTCCTTACGC ATCTGTGCGGTATTTCACACCGCATATGGTGCACTCTCAGTACAAT CTGCTCTGATGCCGCATAGTTAAGCCAGTATACACTCCGCTATCGC TACGTGACTGGGTCATGGCTGCGCCCCGACACCCGCCAACACCCG CTGACGCGCCCTGACGGGCTTGTCTGCTCCCGGCATCCGCTTACAG ACAAGCTGTGACCGTCTCCGGGAGCTGCATGTGTCAGAGGTTTTC ACCGTCATCACCGAAACGCGCGAGGCAGCAGATCAATTCGCGCGC GAAGGCGAAGCGGCATGCATAATGTGCCTGTCAAATGGACGAAGC AGGGATTCTGCAAACCCTATGCTACTCCGTCAAGCCGTCAATTGTC TGATTCGTTACCAATTATGACAACTTGACGGCTACATCATTCACTT TTTCTTCACAACCGGCACGGAACTCGCTCGGGCTGGCCCCGGTGC ATTTTTTAAATACCCGCGAGAAATAGAGTTGATCGTCAAAACCAA CATTGCGACCGACGGTGGCGATAGGCATCCGGGTGGTGCTCAAAA GCAGCTTCGCCTGGCTGATACGTTGGTCCTCGCGCCAGCTTAAGAC GCTAATCCCTAACTGCTGGCGGAAAAGATGTGACAGACGCGACGG CGACAAGCAAACATGCTGTGCGACGCTGGCGATACATTACCCTGT TATCCCTAGATGACATTACCCTGTTATCCCAGATGACATTACCCTG TTATCCCTAGATGACATTACCCTGTTATCCCTAGATGACATTTACC CTGTTATCCCTAGATGACATTACCCTGTTATCCCAGATGACATTAC CCTGTTATCCCTAGATACATTACCCTGTTATCCCAGATGACATACC CTGTTATCCCTAGATGACATTACCCTGTTATCCCAGATGACATTAC CCTGTTATCCCTAGATACATTACCCTGTTATCCCAGATGACATACC CTGTTATCCCTAGATGACATTACCCTGTTATCCCAGATGACATTAC CCTGTTATCCCTAGATACATTACCCTGTTATCCCAGATGACATACC CTGTTATCCCTAGATGACATTACCCTGTTATCCCAGATGACATTAC CCTGTTATCCCTAGATACATTACCCTGTTATCCCAGATGACATACC CTGTTATCCCTAGATGACATTACCCTGTTATCCCAGATGACATTAC CCTGTTATCCCTAGATACATTACCCTGTTATCCCAGATGACATACC CTGTTATCCCTAGATGACATTACCCTGTTATCCCAGATGACATTAC CCTGTTATCCCTAGATACATTACCCTGTTATCCCAGATGACATACC CTGTTATCCCTAGATGACATTACCCTGTTATCCCAGATAAACTCAA 
TGATGATGATGATGATGGTCGAGACTCAGCGGCCGCGGTGCCAGG GCGTGCCCTTGGGCTCCCCGGGCGCGACTATAAGCTGCGAGCAAC TTCACTTGGGTATGCCGGCGGTAGCGCTTACCGTTCGTATAATGTA TGCTATACGAAGTTATCCGAAGCCGCTAGCGGTGGTTTGTCTGGTC AACCACCGCGGTCTCAGTGGTGTACGGTACAAACCCACTACCGGT CGCCACCATGGGAGTCAAAGTTCTGTTTGCCCTGATCTGCATCGCT GTGGCCGAGGCCAAGCCCACCGAGAACAACGAAGACTTCAACATC GTGGCCGTGGCCAGCAACTTCGCGACCACGGATCTCGATGCTGAC CGCGGGAAGTTGCCCGGCAAGAAGCTGCCGCTGGAGGTGCTCAAA GAGATGGAAGCCAATGCCCGGAAAGCTGGCTGCACCAGGGGCTGT CTGATCTGCCTGTCCCACATCAAGTGCACGCCCAAGATGAAGAAG TTCATCCCAGGACGCTGCCACACCTACGAAGGCGACAAAGAGTCC GCACAGGGCGGCATAGGCGAGGCGATCGTCGACATTCCTGAGATT CCTGGGTTCAAGGACTTGGAGCCCATGGAGCAGTTCATCGCACAG GTCGATCTGTGTGTGGACTGCACAACTGGCTGCCTCAAAGGGCTT GCCAACGTGCAGTGTTCTGACCTGCTCAAGAAGTGGCTGCCGCAA CGCTGTGCGACCTTTGCCAGCAAGATCCAGGGCCAGGTGGACAAG ATCAAGGGGGCCGGTGGTGACTAAGCGGAGCTCGATGAGTTTGGA CAAACCACAACTAGAATGCAGTGAAAAAAATGCTTTATTTGTGAA ATTTGTGATGCTATTGCTTTATTTGTGGGCCCGCCCCAACTGGGGT AACCTTTGAGTTCTCTCAGTTGGGGGTAATCAGCATCATGATGTGG TACCACATCATGATGCTGATTATAAGAATGCGGCCGCCACACTCT AGTGGATCTCGAGTTAATAATTCAGAAGAACTCGTCAAGAAGGCG ATAGAAGGCGATGCGCTGCGAATCGGGAGCGGCGATACCGTAAA GCACGAGGAAGCGGTCAGCCCATTCGCCGCCAAGCTCTTCAGCAA TATCACGGGTAGCCAACGCTATGTCCTGATAGCGGTCCGCCACAC CCAGCCGGCCACAGTCGATGAATCCAGAAAAGCGGCCATTTTCCA CCATGATATTCGGCAAGCAGGCATCGCCATGGGTCACGACGAGAT CCTCGCCGTCGGGCATGCTCGCCTTGAGCCTGGCGAACAGTTCGG CTGGCGCGAGCCCCTGATGCTCTTCGTCCAGATCATCCTGATCGAC AAGACCGGCTTCCATCCGAGTACGTGCTCGCTCGATGCGATGTTTC GCTTGGTGGTCGAATGGGCAGGTAGCCGGATCAAGCGTATGCAGC CGCCGCATTGCATCAGCCATGATGGATACTTTCTCGGCAGGAGCA AGGTGTAGATGACATGGAGATCCTGCCCCGGCACTTCGCCCAATA GCAGCCAGTCCCTTCCCGCTTCAGTGACAACGTCGAGCACAGCTG CGCAAGGAACGCCCGTCGTGGCCAGCCACGATAGCCGCGCTGCCT CGTCTTGCAGTTCATTCAGGGCACCGGACAGGTCGGTCTTGACAA AAAGAACCGGGCGCCCCTGCGCTGACAGCCGGAACACGGCGGCA TCAGAGCAGCCGATTGTCTGTTGTGCCCAGTCATAGCCGAATAGC CTCTCCACCCAAGCGGCCGGAGAACCTGCGTGCAATCCATCTTGTT CAATCATGCGAAACGATCCTCATCCTGTCTCTTGATCAGAGCT SEQ ID NO: 78 CCCCAACTGGGGTAACCTTTGAGTTCTCTCAGTTGGGG pseudo attP site (Artificial sequence) SEQ ID NO: 79 
GACTGAAACTTCACAGAATAGTTTTAGAGCTAGAAATAGCAAGTT Albumin-pegRNA- AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGT SERPIN CGGTGCTTGGGATAGTTATGAATTCAATCTTCAACCCTATCCGGAT (Artificial Sequence) GATCCTGACGACGGAGACCGCCGTCGTCGACAAGCCGGCCTCTGT GAAGTTTCAGTCA SEQ ID NO: 80 GACTGAAACTTCACAGAATAGTTTTAGAGCTAGAAATAGCAAGTT Albumin-pegRNA- AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGT CPS1 CGGTGCTTGGGATAGTTATGAATTCAATCTTCAACCCTATCCGGAT (Artificial Sequence) GATCCTGACGACGGAGACCGCCGTCGTCGACAAGCCGGCCTCTGT GAAGTTTC SEQ ID NO: 81 GGCCCAGACTGAGCACGTGAGTTTTAGAGCTAGAAATAGCAAGTT 34 bp lox71 pegRNA AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGT (Artificial Sequence) CGGTGCTGGAGGAAGCAGGGCTTCCTTTCCTCTGCCATCATACCGT TCGTATAGCATACATTATACGAAGTTATCGTGCTCAGTCTG SEQ ID NO: 82 GGCCCAGACTGAGCACGTGAGTTTTAGAGCTAGAAATAGCAAGTT 34 bp lox66 pegRNA AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGT (Artificial Sequence) CGGTGCTGGAGGAAGCAGGGCTTCCTTTCCTCTGCCATCAATAACT TCGTATAGCATACATTATACGAACGGTACGTGCTCAGTCTG SEQ ID NO: 83 GGCCCAGACTGAGCACGTGA gRNA (Artificial Sequence) SEQ ID NO: 84 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 13 RT 29 attB 46 GGTGCGACGAGCGCGGCGATATCATCATCCATGGCCGGATGATCC (original length) TGACGACGGAGACCGCCGTCGTCGACAAGCCGGCCTGAGCTGCGA pegRNA GAA (Artificial Sequence) SEQ ID NO: 85 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTT ACTB N-term AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAG PBS_13_RT_29_with TCGGTGCGAGTCGGTGCGACGAGCGCGGCGATATCATCATCCAT TP901-1 minimal GGCACAATTAACATCTCAATCAAGGTAAATGCTTGAGCTGCGAG attB f pegRNA AA (Artificial Sequence) SEQ ID NO: 86 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTT ACTB N-term AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAG PBS_13_RT_29_with TCGGTGCGAGTCGGTGCGACGAGCGCGGCGATATCATCATCCAT TP901-1 minimal GGAGCATTTACCTTGATTGAGATGTTAATTGTGTGAGCTGCGAGA attB rc pegRNA A (Artificial Sequence) SEQ ID NO: 87 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTT ACTB N-term AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAG 
PBS_13_RT_29_with TCGGTGCGAGTCGGTGCGACGAGCGCGGCGATATCATCATCCAT PhiBT1 minimal GGCAGGTTTTTGACGAAAGTGATCCAGATGATCCAGTGAGCTGC attB f pegRNA GAGAA (Artificial Sequence) SEQ ID NO: 88 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTT ACTB N-term AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAG PBS 13 RT_29_with TCGGTGCGAGTCGGTGCGACGAGCGCGGCGATATCATCATCCAT PhiBT1 minimal GGCTGGATCATCTGGATCACTTTCGTCAAAAACCTGTGAGCTGCG attB rc pegRNA AGAA (Artificial Sequence) SEQ ID NO: 89 GAAGCCGGCCTTGCACATGCGTTTTAGAGCTAGAAATAGCAAGT ACTB N-term TAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGA Nicking guide 1 +48 GTCGGTGC guide (Artificial Sequence) SEQ ID NO: 90 GAAGCCGGCCTTGCACATGCGTTTTAGAGCTAGAAATAGCAAGT ACTB N-term TAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGA PBS_18_RT_16_with_ GTCGGTGCATATCATCATCCATGGTACCGTTCGTATAGCATACAT Lox71_Cre TATACGAAGTTATTGAGCTGCGAGAATAGCC pegRNA (Artificial Sequence) SEQ ID NO: 91 GAAGCCGGCCTTGCACATGCGTTTTAGAGCTAGAAATAGCAAGT ACTB N-term TAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGA PBS_13_RT_29_with_ GTCGGTGCGACGAGCGCGGCGATATCATCATCCATGGTACCGTT Lox71_Cre CGTATAGCATACATTATACGAAGTTATTGAGCTGCGAGAA pegRNA (Artificial Sequence) SEQ ID NO: 92 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 13 RT 34 pegRNA GGTGCTCGACGACGAGCGCGGCGATATCATCATCCATGGCCGGAT (Artificial Sequence) GATCCTGACGACGGAGACCGCCGTCGTCGACAAGCCGGCCTGAGC TGCGAGAA SEQ ID NO: 93 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 13 RT 26 pegRNA GGTGCGAGCGCGGCGATATCATCATCCATGGCCGGATGATCCTGA (Artificial Sequence) CGACGGAGACCGCCGTCGTCGACAAGCCGGCCTGAGCTGCGAGAA SEQ ID NO: 94 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 13 RT 23 pegRNA GGTGCCGCGGCGATATCATCATCCATGGCCGGATGATCCTGACGAC (Artificial Sequence) GGAGACCGCCGTCGTCGACAAGCCGGCCTGAGCTGCGAGAA SEQ ID NO: 95 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS 
AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 13 RT 20 pegRNA GGTGCGGCGATATCATCATCCATGGCCGGATGATCCTGACGACGG (Artificial Sequence) AGACCGCCGTCGTCGACAAGCCGGCCTGAGCTGCGAGAA SEQ ID NO: 96 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 13 RT 16 pegRNA GGTGCATATCATCATCCATGGCCGGATGATCCTGACGACGGAGAC (Artificial Sequence) CGCCGTCGTCGACAAGCCGGCCTGAGCTGCGAGAA SEQ ID NO: 97 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 18 RT 34 pegRNA GGTGCTCGACGACGAGCGCGGCGATATCATCATCCATGGCCGGAT (Artificial Sequence) GATCCTGACGACGGAGACCGCCGTCGTCGACAAGCCGGCCTGAGC TGCGAGAATAGCC SEQ ID NO: 98 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 18 RT 29 pegRNA GGTGCGACGAGCGCGGCGATATCATCATCCATGGCCGGATGATCC (Artificial Sequence) TGACGACGGAGACCGCCGTCGTCGACAAGCCGGCCTGAGCTGCGA GAATAGCC SEQ ID NO: 99 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 18 RT 16 pegRNA GGTGCATATCATCATCCATGGCCGGATGATCCTGACGACGGAGAC (Artificial Sequence) CGCCGTCGTCGACAAGCCGGCCTGAGCTGCGAGAATAGCC SEQ ID NO: 100 GCTGTCTCCGCCGCCCGCCAGTTTTAGAGCTAGAAATAGCAAGTT LMNB1 N-term PBS AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAG 13 RT 39 pegRNA TCGGTGCCTGCCCATCCGCGGCGGCACGGGGGTCGCAGTCGCCA (Artificial Sequence) TGCCGGATGATCCTGACGACGGAGACCGCCGTCGTCGACAAGCC GGCCCGGGCGGCGGAGA SEQ ID NO: 101 GCTGTCTCCGCCGCCCGCCAGTTTTAGAGCTAGAAATAGCAAGTT LMNB1 N-term PBS AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAG 13 RT 34 pegRNA TCGGTGCCATCCGCGGCGGCACGGGGGTCGCAGTCGCCATGCCG (Artificial Sequence) GATGATCCTGACGACGGAGACCGCCGTCGTCGACAAGCCGGCCC GGGCGGCGGAGA SEQ ID NO: 102 GCTGTCTCCGCCGCCCGCCAGTTTTAGAGCTAGAAATAGCAAGTT LMNB1 N-term PBS AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAG 13 RT 29 pegRNA TCGGTGCGCGGCGGCACGGGGGTCGCAGTCGCCATGCCGGATGA (Artificial Sequence) TCCTGACGACGGAGACCGCCGTCGTCGACAAGCCGGCCCGGGCG GCGGAGA SEQ ID NO: 103 
GCTGTCTCCGCCGCCCGCCAGTTTTAGAGCTAGAAATAGCAAGTT LMNB1 N-term PBS AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAG 13 RT 24 pegRNA TCGGTGCGGCACGGGGGTCGCAGTCGCCATGCCGGATGATCCTG (Artificial Sequence) ACGACGGAGACCGCCGTCGTCGACAAGCCGGCCCGGGCGGCGGA GA SEQ ID NO: 104 GCTGTCTCCGCCGCCCGCCAGTTTTAGAGCTAGAAATAGCAAGTT LMNB1 N-term PBS AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAG 13 RT 19 pegRNA TCGGTGCGGGGGTCGCAGTCGCCATGCCGGATGATCCTGACGAC (Artificial Sequence) GGAGACCGCCGTCGTCGACAAGCCGGCCCGGGCGGCGGAGA SEQ ID NO: 105 GCTGTCTCCGCCGCCCGCCAGTTTTAGAGCTAGAAATAGCAAGTT LMNB1 N-term PBS AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAG 18 RT 39 pegRNA TCGGTGCCTGCCCATCCGCGGCGGCACGGGGGTCGCAGTCGCCA (Artificial Sequence) TGCCGGATGATCCTGACGACGGAGACCGCCGTCGTCGACAAGCC GGCCCGGGCGGCGGAGACAGCG SEQ ID NO: 106 GCTGTCTCCGCCGCCCGCCAGTTTTAGAGCTAGAAATAGCAAGTT LMNB1 N-term PBS AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAG 18 RT 34 pegRNA TCGGTGCCATCCGCGGCGGCACGGGGGTCGCAGTCGCCATGCCG (Artificial Sequence) GATGATCCTGACGACGGAGACCGCCGTCGTCGACAAGCCGGCCC GGGCGGCGGAGACAGCG SEQ ID NO: 107 GCTGTCTCCGCCGCCCGCCAGTTTTAGAGCTAGAAATAGCAAGTT LMNB1 N-term PBS AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGT 18 RT 29 pegRNA CGGTGCGCGGCGGCACGGGGGTCGCAGTCGCCATGCCGGATGATC (Artificial Sequence) CTGACGACGGAGACCGCCGTCGTCGACAAGCCGGCCCGGGCGGCG GAGACAGCG SEQ ID NO: 108 GCTGTCTCCGCCGCCCGCCAGTTTTAGAGCTAGAAATAGCAAGTT LMNB1 N-term PBS AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAG 18 RT 24 pegRNA TCGGTGCGGCACGGGGGTCGCAGTCGCCATGCCGGATGATCCTG (Artificial Sequence) ACGACGGAGACCGCCGTCGTCGACAAGCCGGCCCGGGCGGCGGA GACAGCG SEQ ID NO: 109 GCTGTCTCCGCCGCCCGCCAGTTTTAGAGCTAGAAATAGCAAGTT LMNB1 N-term PBS AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAG 18 RT 19 pegRNA TCGGTGCGGGGGTCGCAGTCGCCATGCCGGATGATCCTGACGAC (Artificial Sequence) GGAGACCGCCGTCGTCGACAAGCCGGCCCGGGCGGCGGAGACAG CG SEQ ID NO: 110 GCGTGGTGGGGCCGCCAGCGGTTTTAGAGCTAGAAATAGCAAGT LMNB1 N-term TAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGA Nicking guide 1 +46 GTCGGTGC (Artificial Sequence) SEQ ID NO: 111 
GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 13 RT 29 attB 42 GGTGCGACGAGCGCGGCGATATCATCATCCATGGGGATGATCCTG pegRNA ACGACGGAGACCGCCGTCGTCGACAAGCCGGTGAGCTGCGAGAA (Artificial Sequence) SEQ ID NO: 112 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 13 RT 29 attB 40 GGTGCGACGAGCGCGGCGATATCATCATCCATGGGATGATCCTGA pegRNA CGACGGAGACCGCCGTCGTCGACAAGCCGTGAGCTGCGAGAA (Artificial Sequence) SEQ ID NO: 113 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 13 RT 29 attB 38 GGTGCGACGAGCGCGGCGATATCATCATCCATGGATGATCCTGAC pegRNA GACGGAGACCGCCGTCGTCGACAAGCCTGAGCTGCGAGAA (Artificial Sequence) SEQ ID NO: 114 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 13 RT 29 attB 36 GGTGCGACGAGCGCGGCGATATCATCATCCATGGTGATCCTGACG pegRNA ACGGAGACCGCCGTCGTCGACAAGCTGAGCTGCGAGAA (Artificial Sequence) SEQ ID NO: 115 GCTGTCTCCGCCGCCCGCCAGTTTTAGAGCTAGAAATAGCAAGTT LMNB1 N-term PBS AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGT 13 RT 29 attB 44 CGGTGCGCGGCGGCACGGGGGTCGCAGTCGCCATGCGGATGATCC pegRNA v2 TGACGACGGAGACCGCCGTCGTCGACAAGCCGGCCGGGCGGCGG (Artificial Sequence) AGA SEQ ID NO: 116 GCTGTCTCCGCCGCCCGCCAGTTTTAGAGCTAGAAATAGCAAGTT LMNB1 N-term PBS AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGT 13 RT 29 attB 42 CGGTGCGCGGCGGCACGGGGGTCGCAGTCGCCATGGGATGATCCT pegRNA v2 GACGACGGAGACCGCCGTCGTCGACAAGCCGGCGGGCGGCGGAG (Artificial Sequence) A SEQ ID NO: 117 GCTGTCTCCGCCGCCCGCCAGTTTTAGAGCTAGAAATAGCAAGTT LMNB1 N-term PBS AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGT 13 RT 29 attB 40 CGGTGCGCGGCGGCACGGGGGTCGCAGTCGCCATGGATGATCCTG pegRNA v2 ACGACGGAGACCGCCGTCGTCGACAAGCCGCGGGCGGCGGAGA (Artificial Sequence) SEQ ID NO: 118 GCTGTCTCCGCCGCCCGCCAGTTTTAGAGCTAGAAATAGCAAGTT LMNB1 N-term PBS AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGT 13 RT 29 attB 38 
CGGTGCGCGGCGGCACGGGGGTCGCAGTCGCCATGATGATCCTGA pegRNA v2 CGACGGAGACCGCCGTCGTCGACAAGCCCGGGCGGCGGAGA (Artificial Sequence) SEQ ID NO: 119 GCGTATTGCCTGGAGGATGGGTTTTAGAGCTAGAAATAGCAAGT NOLC1 N-term PBS TAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGA 18 RT 29 attB 46 GTCGGTGCGAACCACGCGGCGAATGCCGGCGTCCGCCCCGGATG pegRNA ATCCTGACGACGGAGACCGCCGTCGTCGACAAGCCGGCCTCCTC (Artificial Sequence) CAGGCAATACGCG SEQ ID NO: 120 GCGTATTGCCTGGAGGATGGGTTTTAGAGCTAGAAATAGCAAGTT NOLC1 N-term PBS AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGT 13 RT 29 attB 46 CGGTGCGAACCACGCGGCGAATGCCGGCGTCCGCCCCGGATGATC pegRNA CTGACGACGGAGACCGCCGTCGTCGACAAGCCGGCCTCCTCCAGG (Artificial Sequence) CAAT SEQ ID NO: 121 GCGTATTGCCTGGAGGATGGGTTTTAGAGCTAGAAATAGCAAGT NOLC1 N-term PBS TAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGA 13 RT 29 attB 44 GTCGGTGCGAACCACGCGGCGAATGCCGGCGTCCGCCCGGATGA pegRNA TCCTGACGACGGAGACCGCCGTCGTCGACAAGCCGGCTCCTCCA (Artificial Sequence) GGCAAT SEQ ID NO: 122 GCGTATTGCCTGGAGGATGGGTTTTAGAGCTAGAAATAGCAAGT NOLC1 N-term PBS TAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGA 13 RT 29 attB 42 GTCGGTGCGAACCACGCGGCGAATGCCGGCGTCCGCCGGATGAT pegRNA CCTGACGACGGAGACCGCCGTCGTCGACAAGCCGGTCCTCCAGG (Artificial Sequence) CAAT SEQ ID NO: 123 GCGTATTGCCTGGAGGATGGGTTTTAGAGCTAGAAATAGCAAGT NOLC1 N-term PBS TAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGA 13 RT 29 attB 40 GTCGGTGCGAACCACGCGGCGAATGCCGGCGTCCGCCGATGATC pegRNA CTGACGACGGAGACCGCCGTCGTCGACAAGCCGTCCTCCAGGCA (Artificial Sequence) AT SEQ ID NO: 124 GCGTATTGCCTGGAGGATGGGTTTTAGAGCTAGAAATAGCAAGTT NOLC1 N-term PBS AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAG 13 RT 29 attB 38 TCGGTGCGAACCACGCGGCGAATGCCGGCGTCCGCCATGATCCT pegRNA GACGACGGAGACCGCCGTCGTCGACAAGCCTCCTCCAGGCAAT (Artificial Sequence) SEQ ID NO: 125 GAGCCGAGCACGAGGGGATACGTTTTAGAGCTAGAAATAGCAAGT NOLC1 nicking TAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAG guide-43 TCGGTGC (Artificial Sequence) SEQ ID NO: 126 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 13 RT 20 
attB 38 GGTGCGGCGATATCATCATCCATGGATGATCCTGACGACGGAGAC pegRNA CGCCGTCGTCGACAAGCCTGAGCTGCGAGAA (Artificial Sequence) SEQ ID NO: 127 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 13 RT 15 attB 38 GGTGCTATCATCATCCATGGATGATCCTGACGACGGAGACCGCCG pegRNA TCGTCGACAAGCCTGAGCTGCGAGAA (Artificial Sequence) SEQ ID NO: 128 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 13 RT 10 attB 38 GGTGCTCATCCATGGATGATCCTGACGACGGAGACCGCCGTCGTC pegRNA GACAAGCCTGAGCTGCGAGAA (Artificial Sequence) SEQ ID NO: 129 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTT ACTB N-term PBS 9 AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAG RT 20 attB 38 TCGGTGCGGCGATATCATCATCCATGGATGATCCTGACGACGGAG pegRNA ACCGCCGTCGTCGACAAGCCTGAGCTGCG (Artificial Sequence) SEQ ID NO: 130 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS 9 AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC RT 15 attB 38 GGTGCTATCATCATCCATGGATGATCCTGACGACGGAGACCGCCG pegRNA TCGTCGACAAGCCTGAGCTGCG (Artificial Sequence) SEQ ID NO: 131 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS 9 AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC RT 10 attB 38 GGTGCTCATCCATGGATGATCCTGACGACGGAGACCGCCGTCGTC pegRNA GACAAGCCTGAGCTGCG (Artificial Sequence) SEQ ID NO: 132 GCTGTCTCCGCCGCCCGCCAGTTTTAGAGCTAGAAATAGCAAGTT LMNB1 N-term PBS AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAG 13 RT 20 attB 38 TCGGTGCCGGGGGTCGCAGTCGCCATGATGATCCTGACGACGGA pegRNA GACCGCCGTCGTCGACAAGCCCGGGCGGCGGAGA (Artificial Sequence) SEQ ID NO: 133 GCTGTCTCCGCCGCCCGCCAGTTTTAGAGCTAGAAATAGCAAGTT LMNB1 N-term PBS AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAG 13 RT 15 attB 38 TCGGTGCGTCGCAGTCGCCATGATGATCCTGACGACGGAGACCG pegRNA CCGTCGTCGACAAGCCCGGGCGGCGGAGA (Artificial Sequence) SEQ ID NO: 134 GCTGTCTCCGCCGCCCGCCAGTTTTAGAGCTAGAAATAGCAAGTT LMNB1 N-term PBS AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAG 13 RT 10 attB 38 TCGGTGCAGTCGCCATGATGATCCTGACGACGGAGACCGCCGTC pegRNA 
GTCGACAAGCCCGGGCGGCGGAGA (Artificial Sequence) SEQ ID NO: 135 GCTGTCTCCGCCGCCCGCCAGTTTTAGAGCTAGAAATAGCAAGTT LMNB1 N-term PBS AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGT 9 RT 20 attB 38 CGGTGCCGGGGGTCGCAGTCGCCATGATGATCCTGACGACGGAGA pegRNA CCGCCGTCGTCGACAAGCCCGGGCGGCG (Artificial Sequence) SEQ ID NO: 136 GCTGTCTCCGCCGCCCGCCAGTTTTAGAGCTAGAAATAGCAAGTT LMNB1 N-term PBS AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAG 9 RT 15 attB 38 TCGGTGCGTCGCAGTCGCCATGATGATCCTGACGACGGAGACCG pegRNA CCGTCGTCGACAAGCCCGGGCGGCG (Artificial Sequence) SEQ ID NO: 137 GCTGTCTCCGCCGCCCGCCAGTTTTAGAGCTAGAAATAGCAAGTT LMNB1 N-term PBS AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGT 9 RT 10 attB 38 CGGTGCAGTCGCCATGATGATCCTGACGACGGAGACCGCCGTCGT pegRNA CGACAAGCCCGGGCGGCG (Artificial Sequence) SEQ ID NO: 138 GAGAAGCGGCGTCCGGGGCTAGTTTTAGAGCTAGAAATAGCAAGT SUPT16H N-term TAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAG PBS 13 RT 24 Bxb1- TCGGTGCTCTTTGTCCAGAGTCACAGCCATACCGGATGATCCTGAC GT_Initial length GACGGAGACCGCCGTCGTCGACAAGCCGGCCCCCCGGACGCCGC (Artificial Sequence) SEQ ID NO: 139 GGGCACGGGGCCATGTACAAGTTTTAGAGCTAGAAATAGCAAGT SRRM2 N-term PBS TAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGA 13 RT 24 Bxb1 GTCGGTGCGGCGTCGGCAGCCCGATCCCGTTGCCGGATGATCCT Initial length GACGACGGAGACCGCCGTCGTCGACAAGCCGGCCTACATGGCCC (Artificial Sequence) CGT SEQ ID NO: 140 GTGTCAGGTGGGGCGGGGCTAGTTTTAGAGCTAGAAATAGCAAG DEPDC4 N-term TTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCG PBS 18 RT 24 Bxb1 AGTCGGTGCGCTGGCTCCTCCCCTGGCACCATACCGGATGATCCT Initial length GACGACGGAGACCGCCGTCGTCGACAAGCCGGCCCCCCGCCCCA (Artificial Sequence) CCTGACAC SEQ ID NO: 141 GAGTGGGTCAGACGAGCAGGAGTTTTAGAGCTAGAAATAGCAAGT NES N-term PBS 13 TAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAG RT 29 Bxb1 Initial TCGGTGCGATGGAGGGCTGCATGGGGGAGGAGTCGCCGGATGATC length CTGACGACGGAGACCGCCGTCGTCGACAAGCCGGCCTGCTCGTCT (Artificial Sequence) GACC SEQ ID NO: 142 GCAGCCACCCGCTCTCGGCCCGTTTTAGAGCTAGAAATAGCAAG SUPT16H nicking TTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCG guide-53 AGTCGGTGC (Artificial 
Sequence) SEQ ID NO: 143 GTGTAGTCAGGCCGCTCACCCGTTTTAGAGCTAGAAATAGCAAG SRRM2 N-term TTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCG nicking guide 1 +87 AGTCGGTGC (Artificial Sequence) SEQ ID NO: 144 GCTGACAAGTCTACGGAACCTGTTTTAGAGCTAGAAATAGCAAG DEPDC4 N-term TTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCG Nicking guide 1 +59 AGTCGGTGC (Artificial Sequence) SEQ ID NO: 145 GCTCCTCCAGCGCCTTGACCGTTTTAGAGCTAGAAATAGCAAGTTA NES N-term Nicking AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC guide 2 + 9 GGTGC (Artificial Sequence) SEQ ID NO: 146 GCTATTCTCGCAGCTCACCA HITI_ACTB_guide (Artificial Sequence) SEQ ID NO: 147 AGAAGCGGCGTCCGGGGCTA HITI_SUPTH16_guide (Artificial Sequence) SEQ ID NO: 148 GGGCACGGGGCCATGTACAA HITI_SRRM2_guide (Artificial Sequence) SEQ ID NO: 149 GCGTATTGCCTGGAGGATGG HITI_NOLCl_guide (Artificial Sequence) SEQ ID NO: 150 TGTCAGGTGGGGCGGGGCTA HITI_DEPDC4_guide (Artificial Sequence) SEQ ID NO: 151 AGTGGGTCAGACGAGCAGGA HITI_NES_guide (Artificial Sequence) SEQ ID NO: 152 GCTGTCTCCGCCGCCCGCCA HITI_LMNB1_guide (Artificial Sequence) SEQ ID NO: 153 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTT HDR Cas9 ACTB AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAG guide TCGGTGC (Artificial Sequence) SEQ ID NO: 154 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 13 RT 29 attB GGTGCGACGAGCGCGGCGATATCATCATCCATGGCCGGATGATCC original length TGACGACGGAGXXCGCCGTCGTCGACAAGCCGGCCTGAGCTGCGA pegRNAs for GAA dinucleotides XX: CG, GC, AT, TA, GG, TT, GA, AG, CC, TC, CT, AA, TG, GT, CA, or (Artificial Sequence) AC SEQ ID NO: 155 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 13 RT 29 pegRNA GGTGCGACGAGCGCGGCGATATCATCATCCATGCCGGATGATCCT with attB 46 GT for GACGACGGAGACCGCCGTCGTCGACAAGCCGGCCTGAGCTGCGAG fusion AA (Artificial Sequence) SEQ ID NO: 156 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 13 RT 29 pegRNA 
GGTGCGACGAGCGCGGCGATATCATCATCCATGCCGGATGATCCT with attB 46 CT for GACGACGGAGAGCGCCGTCGTCGACAAGCCGGCCTGAGCTGCGA multiplexing GAA (Artificial Sequence) SEQ ID NO: 157 GCGTATTGCCTGGAGGATGGGTTTTAGAGCTAGAAATAGCAAGTT NOLC1 N-term PBS AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGT 18 RT 29 pegRNA CGGTGCGAACCACGCGGCGAATGCCGGCGTCCGCCCCGGATGATC with attB 46 GA for CTGACGACGGAGTCCGCCGTCGTCGACAAGCCGGCCTCCTCCAGG multiplexing CAATACGCG (Artificial Sequence) SEQ ID NO: 158 GCTGTCTCCGCCGCCCGCCAGTTTTAGAGCTAGAAATAGCAAGTT LMNB1 N-term PBS AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGT 18 RT 29 pegRNA CGGTGCGCGGCGGCACGGGGGTCGCAGTCGCCATGCCGGATGATC with attB 46 AG for CTGACGACGGAGCTCGCCGTCGTCGACAAGCCGGCCCGGGCGGCG multiplexing GAGACAGCG (Artificial Sequence) SEQ ID NO: 159 GTCACCTCCAATGACTAGGG EMX1 Cas9 guide 1 (Artificial Sequence) SEQ ID NO: 160 GGGCAACCACAAACCCACGA EMX1 Cas9 guide 2 (Artificial Sequence) SEQ ID NO: 161 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 13 RT 29 attB 56 GA GGTGCGACGAGCGCGGCGATATCATCATCCATGGCTATGCCGGAT pegRNA GATCCTGACGACGGAGTCCGCCGTCGTCGACAAGCCGGCCCTAGC (Artificial Sequence) TGAGCTGCGAGAA SEQ ID NO: 162 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 13 RT 29 attB 51 GA GGTGCGACGAGCGCGGCGATATCATCATCCATGGTGCCGGATGAT pegRNA CCTGACGACGGAGTCCGCCGTCGTCGACAAGCCGGCCCTATGAGC (Artificial Sequence) TGCGAGAA SEQ ID NO: 163 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 13 RT 29 attB 46 GA GGTGCGACGAGCGCGGCGATATCATCATCCATGGCCGGATGATCC pegRNA TGACGACGGAGTCCGCCGTCGTCGACAAGCCGGCCTGAGCTGCGA (Artificial Sequence) GAA SEQ ID NO: 164 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 13 RT 29 attB 41 GA GGTGCGACGAGCGCGGCGATATCATCATCCATGGGGATGATCCTG pegRNA ACGACGGAGTCCGCCGTCGTCGACAAGCCGTGAGCTGCGAGAA (Artificial Sequence) SEQ ID NO: 165 
GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 13 RT 29 attB 36 GA GGTGCGACGAGCGCGGCGATATCATCATCCATGGTGATCCTGACG pegRNA ACGGAGTCCGCCGTCGTCGACAAGCTGAGCTGCGAGAA (Artificial Sequence) SEQ ID NO: 166 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 13 RT 29 attB 31 GA GGTGCGACGAGCGCGGCGATATCATCATCCATGGATCCTGACGAC pegRNA GGAGTCCGCCGTCGTCGACATGAGCTGCGAGAA (Artificial Sequence) SEQ ID NO: 167 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 13 RT 29 attB 26 GA GGTGCGACGAGCGCGGCGATATCATCATCCATGGCCTGACGACGG pegRNA AGTCCGCCGTCGTCGTGAGCTGCGAGAA (Artificial Sequence) SEQ ID NO: 168 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 13 RT 29 attB 21 GA GGTGCGACGAGCGCGGCGATATCATCATCCATGGTGACGACGGAG pegRNA TCCGCCGTCGTGAGCTGCGAGAA (Artificial Sequence) SEQ ID NO: 169 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 13 RT 29 attB 16 GA GGTGCGACGAGCGCGGCGATATCATCATCCATGGACGACGGAGTC pegRNA CGCCGTGAGCTGCGAGAA (Artificial Sequence) SEQ ID NO: 170 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 13 RT 29 attB 11 GA GGTGCGACGAGCGCGGCGATATCATCATCCATGGGACGGAGTCCG pegRNA TGAGCTGCGAGAA (Artificial Sequence) SEQ ID NO: 171 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 13 RT 29 attB 6 GA GGTGCGACGAGCGCGGCGATATCATCATCCATGGCGGAGTTGAGC pegRNA TGCGAGAA (Artificial Sequence) SEQ ID NO: 172 GAAGCCGGCCTTGCACATGCGTTTTAGAGCTAGAAATAGCAAGTT ACTB N-term AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGT PBS_18_RT_34_with_ CGGTGCTCGACGACGAGCGCGGCGATATCATCATCCATGGTACCG Lox71_Cre TTCGTATAGCATACATTATACGAAGTTATTGAGCTGCGAGAATAG pegRNA CC (Artificial Sequence) SEQ ID NO: 173 
GAAGCCGGCCTTGCACATGCGTTTTAGAGCTAGAAATAGCAAGTT ACTB N-term AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGT PBS_18_RT_29_with_ CGGTGCGACGAGCGCGGCGATATCATCATCCATGGTACCGTTCGT Lox71_Cre ATAGCATACATTATACGAAGTTATTGAGCTGCGAGAATAGCC pegRNA (Artificial Sequence) SEQ ID NO: 174 GAAGCCGGCCTTGCACATGCGTTTTAGAGCTAGAAATAGCAAGTT ACTB N-term AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGT PBS_13_RT_34_with_ CGGTGCTCGACGACGAGCGCGGCGATATCATCATCCATGGTACCG Lox71_Cre TTCGTATAGCATACATTATACGAAGTTATTGAGCTGCGAGAA pegRNA (Artificial Sequence) SEQ ID NO: 175 GAAGCCGGCCTTGCACATGCGTTTTAGAGCTAGAAATAGCAAGTT ACTB N-term AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGT PBS_13_RT_16_with_ CGGTGCATATCATCATCCATGGTACCGTTCGTATAGCATACATTAT Lox71_Cre ACGAAGTTATTGAGCTGCGAGAA pegRNA (Artificial Sequence) SEQ ID NO: 176 CCCCACGATGGAGGGGAAGAGTTTTAGAGCTAGAAATAGCAAGTT ACTB N-term AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGT Nicking guide 2 +93 CGGTGC guide (Artificial Sequence) SEQ ID NO: 177 CCTTCTCCTGGAGCCGCGACGTTTTAGAGCTAGAAATAGCAAGTT LMNB1 N-term AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGT Nicking guide 2 +87 CGGTGC guide (Artificial Sequence)
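The pegRNA entries in the listing above appear to share one layout, read 5′ to 3′: a 20-nt spacer, a common scaffold, the RT template, the integration site (for example attB), and the primer binding sequence (PBS). The following Python sketch splits the entry described as "ACTB N-term PBS 13 RT 20 attB 38 pegRNA" into those parts. The function name `split_pegrna`, the field names, and the layout itself are illustrative assumptions inferred from that entry's label and sequence; they are not stated in the listing.

```python
# Scaffold shared by the pegRNA entries (copied from the listing).
SCAFFOLD = (
    "GTTTTAGAGCTAGAAATAGCAAGTTA"
    "AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC"
    "GGTGC"
)

# Entry "ACTB N-term PBS 13 RT 20 attB 38 pegRNA" (copied from the listing).
ACTB_PEGRNA = (
    "GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA"
    "AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC"
    "GGTGCGGCGATATCATCATCCATGGATGATCCTGACGACGGAGAC"
    "CGCCGTCGTCGACAAGCCTGAGCTGCGAGAA"
)

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def split_pegrna(pegrna: str, pbs_len: int, rt_len: int, att_len: int,
                 spacer_len: int = 20) -> dict:
    """Split a pegRNA into spacer, RT template, integration site and PBS.

    Assumed layout (hypothetical, inferred from one entry):
    spacer + SCAFFOLD + RT template + integration site + PBS.
    """
    spacer = pegrna[:spacer_len]
    scaffold_end = spacer_len + len(SCAFFOLD)
    if pegrna[spacer_len:scaffold_end] != SCAFFOLD:
        raise ValueError("scaffold not found directly after the spacer")
    extension = pegrna[scaffold_end:]
    if len(extension) != rt_len + att_len + pbs_len:
        raise ValueError("3' extension length does not match the label")
    return {
        "spacer": spacer,
        "rt_template": extension[:rt_len],
        "att_site": extension[rt_len:rt_len + att_len],
        "pbs": extension[rt_len + att_len:],
    }


parts = split_pegrna(ACTB_PEGRNA, pbs_len=13, rt_len=20, att_len=38)
for name, seq in parts.items():
    print(f"{name:12s} {len(seq):3d} {seq}")

# Assumption: the nick falls 3 nt from the 3' end of the spacer, so the PBS
# should be the reverse complement of the 13 spacer bases just 5' of it.
pbs_matches_spacer = reverse_complement(parts["spacer"][17 - 13:17]) == parts["pbs"]
print("PBS complements spacer:", pbs_matches_spacer)
```

The same call with different `pbs_len`, `rt_len`, and `att_len` values could be used to sanity-check other entries against their labels, provided they follow the same assumed layout.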
(239) Sequences of insertion sites can be found in Table 4 below.
(240) TABLE-US-00005 TABLE 4
DESCRIPTION/SOURCE | SEQ ID NO | FORWARD SEQUENCE (5′-3′) | SEQ ID NO | REVERSE SEQUENCE (5′-3′)
Bxb1_attP_GT_original_site (Artificial Sequence) | 178 | GTGGTTTGTCTGGTCAACCACCGCGGTCTCAGTGGTGTACGGTACAAACCCA | 179 | TGGGTTTGTACCGTACACCACTGAGACCGCGGTGGTTGACCAGACAAACCAC
Bxb1_attP_CG_site (Artificial Sequence) | 180 | GTGGTTTGTCTGGTCAACCACCGCGCGCTCAGTGGTGTACGGTACAAACCCA | 181 | TGGGTTTGTACCGTACACCACTGAGCGCGCGGTGGTTGACCAGACAAACCAC
Bxb1_attP_GC_site (Artificial Sequence) | 182 | GTGGTTTGTCTGGTCAACCACCGCGGCCTCAGTGGTGTACGGTACAAACCCA | 183 | TGGGTTTGTACCGTACACCACTGAGGCCGCGGTGGTTGACCAGACAAACCAC
Bxb1_attP_AT_site (Artificial Sequence) | 184 | GTGGTTTGTCTGGTCAACCACCGCGATCTCAGTGGTGTACGGTACAAACCCA | 185 | TGGGTTTGTACCGTACACCACTGAGATCGCGGTGGTTGACCAGACAAACCAC
Bxb1_attP_TA_site (Artificial Sequence) | 186 | GTGGTTTGTCTGGTCAACCACCGCGTACTCAGTGGTGTACGGTACAAACCCA | 187 | TGGGTTTGTACCGTACACCACTGAGTACGCGGTGGTTGACCAGACAAACCAC
Bxb1_attP_GG_site (Artificial Sequence) | 188 | GTGGTTTGTCTGGTCAACCACCGCGGGCTCAGTGGTGTACGGTACAAACCCA | 189 | TGGGTTTGTACCGTACACCACTGAGCCCGCGGTGGTTGACCAGACAAACCAC
Bxb1_attP_TT_site (Artificial Sequence) | 190 | GTGGTTTGTCTGGTCAACCACCGCGTTCTCAGTGGTGTACGGTACAAACCCA | 191 | TGGGTTTGTACCGTACACCACTGAGAACGCGGTGGTTGACCAGACAAACCAC
Bxb1_attP_GA_site (Artificial Sequence) | 192 | GTGGTTTGTCTGGTCAACCACCGCGGACTCAGTGGTGTACGGTACAAACCCA | 193 | TGGGTTTGTACCGTACACCACTGAGTCCGCGGTGGTTGACCAGACAAACCAC
Bxb1_attP_AG_site (Artificial Sequence) | 194 | GTGGTTTGTCTGGTCAACCACCGCGAGCTCAGTGGTGTACGGTACAAACCCA | 195 | TGGGTTTGTACCGTACACCACTGAGCTCGCGGTGGTTGACCAGACAAACCAC
Bxb1_attP_CC_site (Artificial Sequence) | 196 | GTGGTTTGTCTGGTCAACCACCGCGCCCTCAGTGGTGTACGGTACAAACCCA | 197 | TGGGTTTGTACCGTACACCACTGAGGGCGCGGTGGTTGACCAGACAAACCAC
Bxb1_attP_TC_site (Artificial Sequence) | 198 | GTGGTTTGTCTGGTCAACCACCGCGTCCTCAGTGGTGTACGGTACAAACCCA | 199 | TGGGTTTGTACCGTACACCACTGAGGACGCGGTGGTTGACCAGACAAACCAC
Bxb1_attP_CT_site (Artificial Sequence) | 200 | GTGGTTTGTCTGGTCAACCACCGCGCTCTCAGTGGTGTACGGTACAAACCCA | 201 | TGGGTTTGTACCGTACACCACTGAGAGCGCGGTGGTTGACCAGACAAACCAC
Bxb1_attP_AA_site (Artificial Sequence) | 202 | GTGGTTTGTCTGGTCAACCACCGCGAACTCAGTGGTGTACGGTACAAACCCA | 203 | TGGGTTTGTACCGTACACCACTGAGTTCGCGGTGGTTGACCAGACAAACCAC
Bxb1_attP_CA_site (Artificial Sequence) | 204 | GTGGTTTGTCTGGTCAACCACCGCGCACTCAGTGGTGTACGGTACAAACCCA | 205 | TGGGTTTGTACCGTACACCACTGAGTGCGCGGTGGTTGACCAGACAAACCAC
Bxb1_attP_AC_site (Artificial Sequence) | 206 | GTGGTTTGTCTGGTCAACCACCGCGACCTCAGTGGTGTACGGTACAAACCCA | 207 | TGGGTTTGTACCGTACACCACTGAGGTCGCGGTGGTTGACCAGACAAACCAC
Bxb1_attP_TG_site (Artificial Sequence) | 208 | GTGGTTTGTCTGGTCAACCACCGCGTGCTCAGTGGTGTACGGTACAAACCCA | 209 | TGGGTTTGTACCGTACACCACTGAGCACGCGGTGGTTGACCAGACAAACCAC
Bxb1_attB_46_GT_original_site (Artificial Sequence) | 210 | GGCCGGCTTGTCGACGACGGCGGTCTCCGTCGTCAGGATCATCCGG | 211 | CCGGATGATCCTGACGACGGAGACCGCCGTCGTCGACAAGCCGGCC
Bxb1_attB_46_AA_site (Artificial Sequence) | 212 | GGCCGGCTTGTCGACGACGGCGAACTCCGTCGTCAGGATCATCCGG | 213 | CCGGATGATCCTGACGACGGAGTTCGCCGTCGTCGACAAGCCGGCC
Bxb1_attB_46_GA_site (Artificial Sequence) | 214 | GGCCGGCTTGTCGACGACGGCGGACTCCGTCGTCAGGATCATCCGG | 215 | CCGGATGATCCTGACGACGGAGTCCGCCGTCGTCGACAAGCCGGCC
Bxb1_attB_46_CA_site (Artificial Sequence) | 216 | GGCCGGCTTGTCGACGACGGCGCACTCCGTCGTCAGGATCATCCGG | 217 | CCGGATGATCCTGACGACGGAGTGCGCCGTCGTCGACAAGCCGGCC
Bxb1_attB_46_TA_site (Artificial Sequence) | 218 | GGCCGGCTTGTCGACGACGGCGTACTCCGTCGTCAGGATCATCCGG | 219 | CCGGATGATCCTGACGACGGAGTACGCCGTCGTCGACAAGCCGGCC
Bxb1_attB_46_AG_site (Artificial Sequence) | 220 | GGCCGGCTTGTCGACGACGGCGAGCTCCGTCGTCAGGATCATCCGG | 221 | CCGGATGATCCTGACGACGGAGCTCGCCGTCGTCGACAAGCCGGCC
Bxb1_attB_46_GG_site (Artificial Sequence) | 222 | GGCCGGCTTGTCGACGACGGCGGGCTCCGTCGTCAGGATCATCCGG | 223 | CCGGATGATCCTGACGACGGAGCCCGCCGTCGTCGACAAGCCGGCC
Bxb1_attB_46_CG_site (Artificial Sequence) | 224 | GGCCGGCTTGTCGACGACGGCGCGCTCCGTCGTCAGGATCATCCGG | 225 | CCGGATGATCCTGACGACGGAGCGCGCCGTCGTCGACAAGCCGGCC
Bxb1_attB_46_TG_site (Artificial Sequence) | 226 | GGCCGGCTTGTCGACGACGGCGTGCTCCGTCGTCAGGATCATCCGG | 227 | CCGGATGATCCTGACGACGGAGCACGCCGTCGTCGACAAGCCGGCC
Bxb1_attB_46_AC_site (Artificial Sequence) | 228 | GGCCGGCTTGTCGACGACGGCGACCTCCGTCGTCAGGATCATCCGG | 229 | CCGGATGATCCTGACGACGGAGGTCGCCGTCGTCGACAAGCCGGCC
Bxb1_attB_46_GC_site (Artificial Sequence) | 230 | GGCCGGCTTGTCGACGACGGCGGCCTCCGTCGTCAGGATCATCCGG | 231 | CCGGATGATCCTGACGACGGAGGCCGCCGTCGTCGACAAGCCGGCC
Bxb1_attB_46_CC_site (Artificial Sequence) | 232 | GGCCGGCTTGTCGACGACGGCGCCCTCCGTCGTCAGGATCATCCGG | 233 | CCGGATGATCCTGACGACGGAGGGCGCCGTCGTCGACAAGCCGGCC
Bxb1_attB_46_TC_site (Artificial Sequence) | 234 | GGCCGGCTTGTCGACGACGGCGTCCTCCGTCGTCAGGATCATCCGG | 235 | CCGGATGATCCTGACGACGGAGGACGCCGTCGTCGACAAGCCGGCC
Bxb1_attB_46_AT_site (Artificial Sequence) | 236 | GGCCGGCTTGTCGACGACGGCGATCTCCGTCGTCAGGATCATCCGG | 237 | CCGGATGATCCTGACGACGGAGATCGCCGTCGTCGACAAGCCGGCC
Bxb1_attB_46_CT_site (Artificial Sequence) | 238 | GGCCGGCTTGTCGACGACGGCGCTCTCCGTCGTCAGGATCATCCGG | 239 | CCGGATGATCCTGACGACGGAGAGCGCCGTCGTCGACAAGCCGGCC
Bxb1_attB_46_TT_site (Artificial Sequence) | 240 | GGCCGGCTTGTCGACGACGGCGTTCTCCGTCGTCAGGATCATCCGG | 241 | CCGGATGATCCTGACGACGGAGAACGCCGTCGTCGACAAGCCGGCC
Bxb1_attB_38_GT_site (Artificial Sequence) | 242 | GGCTTGTCGACGACGGCGGTCTCCGTCGTCAGGATCAT | 243 | ATGATCCTGACGACGGAGACCGCCGTCGTCGACAAGCC
Bxb1_attB_38_AA_site (Artificial Sequence) | 244 | GGCTTGTCGACGACGGCGAACTCCGTCGTCAGGATCAT | 245 | ATGATCCTGACGACGGAGTTCGCCGTCGTCGACAAGCC
Bxb1_attB_38_GA_site (Artificial Sequence) | 246 | GGCTTGTCGACGACGGCGGACTCCGTCGTCAGGATCAT | 247 | ATGATCCTGACGACGGAGTCCGCCGTCGTCGACAAGCC
Bxb1_attB_38_CA_site (Artificial Sequence) | 248 | GGCTTGTCGACGACGGCGCACTCCGTCGTCAGGATCAT | 249 | ATGATCCTGACGACGGAGTGCGCCGTCGTCGACAAGCC
Bxb1_attB_38_TA_site (Artificial Sequence) | 250 | GGCTTGTCGACGACGGCGTACTCCGTCGTCAGGATCAT | 251 | ATGATCCTGACGACGGAGTACGCCGTCGTCGACAAGCC
Bxb1_attB_38_AG_site (Artificial Sequence) | 252 | GGCTTGTCGACGACGGCGAGCTCCGTCGTCAGGATCAT | 253 | ATGATCCTGACGACGGAGCTCGCCGTCGTCGACAAGCC
Bxb1_attB_38_GG_site (Artificial Sequence) | 254 | GGCTTGTCGACGACGGCGGGCTCCGTCGTCAGGATCAT | 255 | ATGATCCTGACGACGGAGCCCGCCGTCGTCGACAAGCC
Bxb1_attB_38_CG_site (Artificial Sequence) | 256 | GGCTTGTCGACGACGGCGCGCTCCGTCGTCAGGATCAT | 257 | ATGATCCTGACGACGGAGCGCGCCGTCGTCGACAAGCC
Bxb1_attB_38_TG_site (Artificial Sequence) | 258 | GGCTTGTCGACGACGGCGTGCTCCGTCGTCAGGATCAT | 259 | ATGATCCTGACGACGGAGCACGCCGTCGTCGACAAGCC
Bxb1_attB_38_AC_site (Artificial Sequence) | 260 | GGCTTGTCGACGACGGCGACCTCCGTCGTCAGGATCAT | 261 | ATGATCCTGACGACGGAGGTCGCCGTCGTCGACAAGCC
Bxb1_attB_38_GC_site (Artificial Sequence) | 262 | GGCTTGTCGACGACGGCGGCCTCCGTCGTCAGGATCAT | 263 | ATGATCCTGACGACGGAGGCCGCCGTCGTCGACAAGCC
Bxb1_attB_38_CC_site (Artificial Sequence) | 264 | GGCTTGTCGACGACGGCGCCCTCCGTCGTCAGGATCAT | 265 | ATGATCCTGACGACGGAGGGCGCCGTCGTCGACAAGCC
Bxb1_attB_38_TC_site (Artificial Sequence) | 266 | GGCTTGTCGACGACGGCGTCCTCCGTCGTCAGGATCAT | 267 | ATGATCCTGACGACGGAGGACGCCGTCGTCGACAAGCC
Bxb1_attB_38_AT_site (Artificial Sequence) | 268 | GGCTTGTCGACGACGGCGATCTCCGTCGTCAGGATCAT | 269 | ATGATCCTGACGACGGAGATCGCCGTCGTCGACAAGCC
Bxb1_attB_38_CT_site (Artificial Sequence) | 270 | GGCTTGTCGACGACGGCGCTCTCCGTCGTCAGGATCAT | 271 | ATGATCCTGACGACGGAGAGCGCCGTCGTCGACAAGCC
Bxb1_attB_38_TT_site (Artificial Sequence) | 272 | GGCTTGTCGACGACGGCGTTCTCCGTCGTCAGGATCAT | 273 | ATGATCCTGACGACGGAGAACGCCGTCGTCGACAAGCC
Cre Lox 66 site (Artificial Sequence) | 274 | TACCGTTCGTATAATGTATGCTATACGAAGTTAT | 275 | ATAACTTCGTATAGCATACATTATACGAACGGTA
Cre Lox 71 site (Artificial Sequence) | 276 | ATAACTTCGTATAATGTATGCTATACGAACGGTA | 277 | TACCGTTCGTATAGCATACATTATACGAAGTTAT
TP901-1 minimal attB site (Artificial Sequence) | 278 | TTTACCTTGATTGAGATGTTAATTGTG | 279 | CACAATTAACATCTCAATCAAGGTAAA
TP901-1 minimal attP site (Artificial Sequence) | 280 | GCGAGTTTTTATTTCGTTTATTTCAATTAAGGTAACTAAAAAACTCCTTT | 281 | AAAGGAGTTTTTTAGTTACCTTAATTGAAATAAACGAAATAAAAACTCGC
PhiBT1 minimal attB site (Artificial Sequence) | 282 | CTGGATCATCTGGATCACTTTCGTCAAAAACCTG | 283 | CAGGTTTTTGACGAAAGTGATCCAGATGATCCAG
PhiBT1 minimal attP site (Artificial Sequence) | 284 | TTCGGGTGCTGGGTTGTTGTCTCTGGACAGTGATCCATGGGAAACTACTCAGCACCA | 285 | TGGTGCTGAGTAGTTTCCCATGGATCACTGTCCAGAGACAACAACCCAGCACCCGAA
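The Bxb1 entries in Table 4 differ only in a central dinucleotide, and each reverse sequence is the reverse complement of its forward sequence. As a minimal sketch of that relationship, the Python below rebuilds the 38-bp attB variants from the two flanks of the Bxb1_attB_38_GT_site entry. The names `ATTB38_LEFT`, `ATTB38_RIGHT`, and `attb38_variant` are hypothetical helpers introduced here for illustration only.

```python
from itertools import product

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


# Flanks of the 38-bp Bxb1 attB forward sequence, taken from the
# Bxb1_attB_38_GT_site entry with its central "GT" removed.
ATTB38_LEFT = "GGCTTGTCGACGACGGCG"
ATTB38_RIGHT = "CTCCGTCGTCAGGATCAT"


def attb38_variant(dinucleotide: str) -> tuple[str, str]:
    """Return (forward, reverse) 38-bp attB sequences for a central dinucleotide."""
    dinucleotide = dinucleotide.upper()
    if len(dinucleotide) != 2 or set(dinucleotide) - set("ACGT"):
        raise ValueError("expected two bases from A, C, G, T")
    forward = ATTB38_LEFT + dinucleotide + ATTB38_RIGHT
    return forward, reverse_complement(forward)


# All 16 central dinucleotides, keyed by the dinucleotide in the forward strand.
variants = {"".join(p): attb38_variant("".join(p)) for p in product("ACGT", repeat=2)}

for dn in ("GT", "AA", "GA"):
    fwd, rev = variants[dn]
    print(dn, fwd, rev)
```

Because the reverse strand carries the reverse complement of the dinucleotide, a forward "GA" site reads "TC" at the centre of its reverse sequence; this is the form in which the integration site appears inside the pegRNA sequences.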
(241) Sequences of Bxb1 and RT mutants can be found in Table 6 below.
(242) TABLE-US-00006 TABLE 6
SEQ ID NO | DESCRIPTION/SOURCE | FORWARD SEQUENCE (5′-3′)
SEQ ID NO: 286 | Bxb1_mut_V368A (Artificial Sequence) | AAAAGTGTGGGCTGCAGGATCTGA
SEQ ID NO: 287 | Bxb1_mut_E379A (Artificial Sequence) | GGAGCTGGCAGCTGTCAATGCC
SEQ ID NO: 288 | Bxb1_mut_E383A (Artificial Sequence) | AGTCAATGCCGCTCTCGTGGA
SEQ ID NO: 403 | RT_mut_L139P (Artificial Sequence) | TTGAGCGGGCCCCCACCGT
SEQ ID NO: 289 | RT_mut_E562Q (Artificial Sequence) | CAGCGGGCTCAGCTGATAGCA
SEQ ID NO: 290 | RT_mut_D653N (Artificial Sequence) | CGGATGGCTAACCAAGCGGCC
SEQ ID NO: 404 | RT(1-478)_Sto7d fusion | atgactcactatcaggccttgcttaggacacggaccgggtccagttcggaccggtggtagccctgaacccggctacgctgctcccactgcctgaggaagggctgcaacacaactgccttgatGGGACAGGTGGCGGTGGTGTCACCGTCAAGTTCAAGTACAAGGGTGAGGAACTTGAAGTTGATATTAGCAAAATCAAGAAGGTTTGGCGCGTTGGTAAAATGATATCTTTTACTTATGACGACAACGGCAAGACAGGTAGAGGGGCAGTGTCTGAGAAAGACGCCCCCAAGGAGCTGTTGCAAATGTTGGAAAAGTCTGGGAAAAAGtctggcggctcaaaaagaaccgccgacggcagcgaattcgagcccaagaagaagaggaaagtc
(243) Sequences of primers, probes and restriction enzymes used in ddPCR readout can be found in Table 7 below.
(244) TABLE-US-00007 TABLE 7 Restrict- SEQ Forward SEQ Reverse SEQ ion Locus Cargo ID NO: Primer ID NO: Primer Probe ID NO: Enzymes ACTB GFP 291 CCCG 292 GAAC /56- 405 Eco91I, (pDY0186) GCTTC TCCAC FAM/C HindIII CTTTG GCCG C GGC TCC TTCA TTG T/ZEN/ C GAC GAC GGC G/3IAB kFQ/ ACTB TP90-1 293 CCCG 294 AACC /56- 406 None GFP GCTTC ACAA FAM/T (pDY0333) CTTTG CTAG G CTA TCC AATG TTG CAGT C/ZEN/ GA T TTA TTT GTG GGC CCG /3IABk FQ/ ACTB TP90-1 295 CCCG 296 GAAC /56- 407 None rc GFP GCTTC TCCAC FAM/ (pDY0334) CTTTG GCCG CC TCC TTCA ATG AAG A/ZEN/ T CGA GTG CCG CAT CA/3I ABkF Q/ ACTB PhiBT1 297 CCCG 298 AACC /56- 406 None GFP GCTTC ACAA FAM/T (pDY0367) CTTTG CTAG G CTA TCC AATG TTG CAGT C/ZEN/ GA T TTA TTT GTG GGC CCG /3IABk FQ/ ACTB PhiBT1 299 CCCG 300 GAAC /56- 407 None rc GFP GCTTC TCCAC FAM/ (pDY0368) CTTTG GCCG CC TCC TTCA ATG AAG A/ZEN/ T CGA GTG CCG CAT CA/3I ABkF Q/ LMNB1 GFP 301 TCCTT 302 GAAC /56- 407 Eco91I, (pDY0186) ATCA TCCAC FAM/ HindIII CGGT GCCG CC CCCG TTCA ATG CTCG AAG A/ZEN/ T CGA GTG CCG CAT CA/3I ABkF Q/ NOLC1 GFP 303 CGTC 304 GAAC /56- 407 Eco91I, (pDY0186) GACA TCCAC FAM/ HindIII ACGG GCCG CC TAGT TTCA ATG G AAG A/ZEN/ T CGA GTG CCG CAT CA/3I ABkF Q/ SUPT1 GFP 305 TCGC 306 GAAC /56- 407 Eco91I, 6 H pDY0186) GTGA TCCA FAM/C HindIII TTCTC CGCC C ATG GGAA GTTC AAG C A A/ZEN/ T CGA GTG CCG CAT CA/31A BkFQ/ SRRM2 GFP 307 GGGC 308 GAAC /56- 407 Eco91I, (pDY0186) GGTA TCCAC FAM/ HindIII AGTG GCCG CC GTTA TTCA ATG GTTT AAG A/ZEN/ T CGA GTG CCG CAT CA/3I ABkF Q/ DEPDC4 GFP 309 AAGA 310 GAAC /56- 407 Eco91I, (pDY0186) GGCG TCCAC FAM/ HindIII GAGC GCCG CC CAGT TTCA ATG A AAG A/ZEN/ T CGA GTG CCG CAT CA/3I ABkF Q/ NES GFP 311 CTCCC 312 GAAC /56- 405 Eco91I, (pDY0186) TTCTC TCCAC FAM/C HindIII CCGG GCCG C GGC TGCCC TTCA TTG T/ZEN/ C GAC GAC GGC G/3IAB kFQ/ ACTB ACTB 313 CCCG 314 GAAC /56- 407 Eco91I HITI GCTTC TCCAC FAM/ template CTTTG GCCG CC GFP TCC TTCA ATG (pDY0219) AAG A/ZE N/T CGA GTG CCG CAT CA/3I ABkF Q/ SRRM2 SRRM2 315 GGGC 316 GAAC /56- 407 Eco91I HITI 
GGTA TCCAC FAM/ template AGTG GCCG CC GFP GTTA TTCA ATG (aRY0182_ GTTT AAG A2) A/ZE N/T CGA GTG CCG CAT CA/3I ABkF Q/ NOLC1 NOLC1 317 CGTC 318 GAAC /56- 407 Eco91I HITI GACA TCCAC FAM/ template ACGG GCCG CC GFP TAGT TTCA ATG (aRY0182_ G AAG A3) A/ZE N/T CGA GTG CCG CAT CA/3I ABkF Q/ DEPDC4 DEPDC4 319 AAGA 320 GAAC /56- 407 Eco91I HITI GGCG TCCAC FAM/ template GAGC GCCG CC GFP CAGT TTCA ATG (aRY0182_ A AAG A5) A/ZE N/T CGA GTG CCG CAT CA/3I ABkF Q/ NES NES 321 CTCCC 322 GAAC /56- 407 Eco91I HITI TTCTC TCCAC FAM/ template CCGG GCCG CC GFP TGCCC TTCA ATG (aRY0182_ AAG A7) A/ZEN/ T CGA GTG CCG CAT CA/3I ABkF Q/ LMNB1 LMNB1 323 TCCTT 324 GAAC /56- 407 Eco91I HITI ATCA TCCAC FAM/ template CGGT GCCG CC GFP CCCG TTCA ATG (aRY0182_ CTCG AAG A4) A/ZEN/ T CGA GTG CCG CAT CA/3I ABkF Q/ ACTB SERPINA 325 CCCG 326 GGCC /56- 405 EcoRI, (pDY0298) GCTTC TGCC FAM/ XhoI, CTTTG AGCA CC HindIII TCC GGAG GGC GA TTG T/ZEN/ C GAC GAC GGC G/3I ABkF Q/ ACTB CPS1 327 CCCG 328 GGTG /56- 408 XhoI, (pDY299) GCTTC TGCA FAM/ HindIII CTTTG GTCA AC TCC CATTG AGC GTAA TTT AGCC C/ZEN/ A AAG TGG TGA GGA CAC T/3IA BkFQ / ACTB CFTR 329 CCCG 330 GATG /56- 409 Eco91I, (pDY0373) GCTTC GGTCT FAM/ HindIII CTTTG AGTC TAC TCC CAGC GGT TAAA ACA/ G ZEN/ AAC CC ACC CGA GAG A/3I ABkF Q/ ACTB NYESO 331 CCCG 332 GAGA /56- 409 Eco47III, TRAC GCTTC GACA FAM/ HindIII (pDY0318) CTTTG AGGC TAC TCC TGCA GGT CA ACA/ ZEN/ AAC CC ACC CGA GAG A/3I ABkF Q/ NC_ GFP 333 CCAG 334 GAAC /56- 405 Eco91I, 000003 (pDY0186) GTGA TCCAC FAM/ HindIII GAGT GCCG CC CAGG TTCA GGC GTAG TTG TGTTC T/ZEN/ A C GAC GAC GGC G/3I ABkF Q/ NC_ GFP 335 AGGG 336 GAAC /56- 405 Eco91I, 000002 (pDY0186) ACCTT TCCAC FAM/ HindIII TGCCT GCCG CC GTGT TTCA GGC GAGT TTG C T/ZEN/ C GAC GAC GGC G/3I ABkF Q/ NC_ GFP 337 TCAG 338 GAAC /56- 405 Eco91I, 000009 (pDY0186) CTCTG TCCAC FAM/ HindIII TGCTG GCCG CC AGGC TTCA GGC GAA TTG T/ZEN/ C GAC GAC GGC G/3I ABkF Q/ chr6: GFP 339 AAGC 340 GAAC /56- 405 Eco91I, 149045959 (pDY0186) CATCT TCCAC FAM/ HindIII CCCA GCCG CC GAAT 
TTCA GGC ATCTG TTG CTTAG T/ZE AAAT N/C G GAC GAC GGC G/3I ABkF Q/ chr16: GFP 341 GAGA 342 GAAC /56- 405 Eco91I, 18607730 (pDY0186) GGAG TCCAC FAM/ HindIII CAAC GCCG CC AGTG TTCA GGC AGCA TTG TGAT T/ZE G N/C GAC GAC GGC G/3I ABkF Q/ chr6: ACTB 343 AAGC 344 GAAC /56- 405 Eco91I 149045959 HITI CATCT TCCAC FAM/ template CCCA GCCG CC GFP GAAT TTCA GGC (pDY0219) ATCTG TTG CTTAG T/ZE AAAT N/C G GAC GAC GGC G/3I ABkF Q/ chr16: ACTB 345 GAGA 346 GAAC /56- 405 Eco91I 18607730 HITI GGAG TCCAC FAM/ template CAAC GCCG CC GFP AGTG TTCA GGC (pDY0219) AGCA TTG TGAT T/ZE G N/C GAC GAC GGC G/3I ABkF Q/ ACTB CAG_ 347 CCCG 348 GGCT /56- 405 Eco91I, Kozak_ GCTTC ATGA FAM/ HindIII bGH_ CTTTG ACTA CC thera- TCC ATGA GGC peutic_ CCCC TTG genes GT T/ZE generic N/C minicircle GAC GAC GGC G/3I ABkF Q/ ACTB Hibit- 349 CCCG 350 GGCC /56- 405 EcoRI, SERPINA GCTTC TGCC FAM/ XhoI, (pDY045) CTTTG AGCA CC HindIII TCC GGAG GGC GA TTG T/ZE N/C GAC GAC GGC G/3I ABkF Q/ ACTB Hibit- 351 CCCG 352 GGTG /56- 408 XhoI, CPS1 GCTTC TGCA FAM/ HindIII (pDY406) CTTTG GTCA AC TCC CATTG AGC GTAA TTT AGCC C/ZE N/A AAG TGG TGA GGA CAC T/3IA BkFQ /
(245) Sequences of primers used for NGS readout can be found in Table 8 below.
(246) TABLE-US-00008 TABLE 8 SEQ ID NO/ DESCRIPTION/ SOURCE ID SEQUENCE (5′-3′) SEQ ID NO: 353 PD0966 ACACTCTTTCCCTACACGACGCTCTTCCGATCTCCGAC N-term ACTB Tn5 CTCGGC TCACAGCG readout F 1 (Artificial Sequence) SEQ ID NO: 354 PD0967 ACACTCTTTCCCTACACGACGCTCTTCCGATCTACCGA N-term ACTB Tn5 CCTCGG CTCACAGCG readout F 2 (Artificial Sequence) SEQ ID NO: 355 PD0968 ACACTCTTTCCCTACACGACGCTCTTCCGATCTGACCG N-term ACTB Tn5 ACCTCG GCTCACAGCG readout F 3 (Artificial Sequence) SEQ ID NO: 356 PD0969 ACACTCTTTCCCTACACGACGCTCTTCCGATCTTGACC N-term ACTB Tn5 GACCTC GGCTCACAGCG readout F 4 (Artificial Sequence) SEQ ID NO: 357 PD0970 ACACTCTTTCCCTACACGACGCTCTTCCGATCTCTGAC N-term ACTB Tn5 CGACCT CGGCTCACAGCG readout F 5 (Artificial Sequence) SEQ ID NO: 358 PD0971 ACACTCTTTCCCTACACGACGCTCTTCCGATCTACTGA N-term ACTB Tn5 CCGACC TCGGCTCACAGCG readout F 6 (Artificial Sequence) SEQ ID NO: 359 PD0972 ACACTCTTTCCCTACACGACGCTCTTCCGATCTTACTG N-term ACTB Tn5 ACCGAC CTCGGCTCACAGCG readout F 7 (Artificial Sequence) SEQ ID NO: 360 PD0973 ACACTCTTTCCCTACACGACGCTCTTCCGATCTGTACT N-term ACTB Tn5 GACCGA CCTCGGCTCACAGCG readout F 8 (Artificial Sequence) SEQ ID NO: 361 FP0952 GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTCCAC ACTB N-term NGS CCAGCC AGCTCCC R for Cas14 indels (Artificial Sequence) SEQ ID NO: 362 PD0313 ACACTCTTTCCCTACACGACGCTCTTCCGATCTCCGGT NGS EMX1 GGCGCAT TGCCAC Forward 1 (Artificial Sequence) SEQ ID NO: 363 PD0314 ACACTCTTTCCCTACACGACGCTCTTCCGATCTACCGG NGS EMX1 TGGCGCA TTGCCAC Forward 2 (Artificial Sequence) SEQ ID NO: 364 PD0315 ACACTCTTTCCCTACACGACGCTCTTCCGATCTGACCG NGS EMX1 GTGGCGC ATTGCCAC Forward 3 (Artificial Sequence) SEQ ID NO: 365 PD0316 ACACTCTTTCCCTACACGACGCTCTTCCGATCTTGACC NGS EMX1 GGTGGCG CATTGCCAC Forward 4 (Artificial Sequence) SEQ ID NO: 366 PD0317 ACACTCTTTCCCTACACGACGCTCTTCCGATCTCTGAC NGS EMX1 CGGTGGC GCATTGCCAC Forward 5 (Artificial Sequence) SEQ ID NO: 367 PD0318 ACACTCTTTCCCTACACGACGCTCTTCCGATCTACTGA NGS EMX1 CCGGTGG CGCATTGCCAC Forward 6 (Artificial Sequence) SEQ ID NO: 368 PD0319 
ACACTCTTTCCCTACACGACGCTCTTCCGATCTTACTG NGS EMX1 ACCGGTG GCGCATTGCCAC Forward 7 (Artificial Sequence) SEQ ID NO: 369 PD0320 ACACTCTTTCCCTACACGACGCTCTTCCGATCTGTACT NGS EMX1 GACCGG GGCGCATTGCCAC Forward 8 (Artificial Sequence) SEQ ID NO: 370 PD0321 GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTCAGA NGS EMX1 Reverse GTCCAGC TTGGGCCCA (Artificial Sequence)
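The eight forward primers listed for each locus in Table 8 appear to share one pattern: a common 5′ adapter, a stagger of 0 to 7 nt, and a locus-specific 3′ end. The following is a minimal sketch of that pattern, reconstructed from the ACTB entries (PD0966 to PD0973). The function name and the splitting of each primer into adapter, stagger and site are assumptions made for illustration; the table itself does not state the design rationale.

```python
# Hypothetical helper: rebuilds the staggered forward primers of Table 8
# from three parts read off the listed ACTB sequences.
ADAPTER = "ACACTCTTTCCCTACACGACGCTCTTCCGATCT"  # shared 5' portion of all forward primers
STAGGER = "GTACTGA"                            # full 7-nt stagger seen in "F 8"
ACTB_SITE = "CCGACCTCGGCTCACAGCG"              # locus-specific 3' portion (ACTB N-term)


def staggered_primers(site, adapter=ADAPTER, stagger=STAGGER):
    """Return primers F1..F8: the adapter, the last n bases of the stagger
    (n = 0..7), then the locus-specific site."""
    return [
        adapter + stagger[len(stagger) - n:] + site
        for n in range(len(stagger) + 1)
    ]


if __name__ == "__main__":
    for i, primer in enumerate(staggered_primers(ACTB_SITE), start=1):
        print(f"F{i}: {primer}")
```

Passing a different locus-specific sequence as `site` gives the corresponding set for another amplicon under the same assumed layout.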
(247) Sequences of off-target sites can be found in Table 9 below.
(248) TABLE-US-00009 TABLE 9 SEQ ID NO/ DESCRIPTION/ SOURCE SEQUENCE (5′-3′) SEQ ID NO: 371 GATATTTTCCCAGCTCACCA Cas9_chr6:149045959 (Artificial Sequence) SEQ ID NO: 372 TCTATTCTCCCAGCTCCCCA Cas9_chr16:18607730 (Artificial Sequence) SEQ ID NO: 373 AGCGGCTTCTGTCTCTGTGAGTGAGCTGGCGGTCTCCGTC Bxb1_NC_000002 (Artificial Sequence) SEQ ID NO: 374 GACTAGCCCACGCTCCGGTTCTGAGCCGCGACGGCGGTCTCCG Bxb1_NC_000003 (Artificial Sequence) SEQ ID NO: 375 CCCAGGGTCCCATGCGCTCCCCGGCCCTGACGGCGGTCTCC Bxb1_NC_000009 (Artificial Sequence)
(249) Linker sequences can be found in Table 10 below.
(250) TABLE-US-00010 TABLE 10
Description; Sequence (5′-3′); Amino acid sequence
A-P2A; GGAAGCGGAGCTACTAACTTCAGCCTGCTGAAGCAGGCTGGCGACGTGGAGGAGAACCCTGGACCT (SEQ ID NO: 410); GSGATNFSLLKQAGDVEENPGP (SEQ ID NO: 418)
B-(GGGS)3; GGGGGAGGAGGTTCTGGAGGCGGAGGCTCCGGAGGCGGAGGGTCA (SEQ ID NO: 411); GGGGSGGGGSGGGGS (SEQ ID NO: 419)
C-GGGGS; GGAGGTGGCGGGAGC (SEQ ID NO: 412); GGGGS (SEQ ID NO: 420)
D-PAPAP; CCCGCACCAGCGCCT (SEQ ID NO: 413); PAPAP (SEQ ID NO: 421)
E-(EAAAK)3; GAGGCAGCTGCCAAGGAAGCCGCTGCCAAGGAGGCGGCCGCAAAG (SEQ ID NO: 414); EAAAKEAAAKEAAAK (SEQ ID NO: 422)
F-XTEN; AGTGGGAGCGAGACCCCTGGGACTAGCGAGTCAGCTACACCCGAAAGC (SEQ ID NO: 415); SGSETPGTSESATPES (SEQ ID NO: 423)
G-(GGS)6; GGGGGGTCAGGTGGATCCGGCGGAAGTGGCGGATCCGGTGGATCTGGCGGCAGT (SEQ ID NO: 416); GGSGGSGGSGGSGGSGGS (SEQ ID NO: 424)
H-EAAAK; GAAGCTGCTGCTAAG (SEQ ID NO: 417); EAAAK (SEQ ID NO: 425)
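Each linker in Table 10 is listed as both a nucleotide and an amino acid sequence, so the two columns can be checked against each other by translating the nucleotide sequence. The following is a minimal sketch using the standard genetic code; the function and variable names are hypothetical, and only four of the eight linkers are included to keep the example short.

```python
# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna):
    """Translate a DNA sequence in frame 1; length must be a multiple of 3."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Linker name -> (nucleotide sequence, amino acid sequence) as listed in Table 10.
LINKERS = {
    "C-GGGGS": ("GGAGGTGGCGGGAGC", "GGGGS"),
    "D-PAPAP": ("CCCGCACCAGCGCCT", "PAPAP"),
    "F-XTEN": ("AGTGGGAGCGAGACCCCTGGGACTAGCGAGTCAGCTACACCCGAAAGC",
               "SGSETPGTSESATPES"),
    "H-EAAAK": ("GAAGCTGCTGCTAAG", "EAAAK"),
}

if __name__ == "__main__":
    for name, (dna, protein) in LINKERS.items():
        status = "consistent" if translate(dna) == protein else "MISMATCH"
        print(f"{name}: {status}")
```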
(251) Exemplary fusion sequences can be found in Table 11 below.
(252) TABLE-US-00011 Description Sequence SpCas9-XTEN- MKRTADGSEFESPKKKRKVDKKYSIGLDIGTNSVGWAVITDEYKVPS RT(1-478)-Sto7d- KKFKVLGNTDRHSIKKNLIGALLFDSGETAEATRLKRTARRRYTRR GGGGS-BxbINT KNRICYLQEIFSNEMAKVDDSFFHRLEESFLVEEDKKHERHPIFGNI Amino acid VDEVAYHEKYPTIYHLRKKLVDSTDKADLRLIYLALAHMIKFRGHF SEQ ID NO: 376 LIEGDLNPDNSDVDKLFIQLVQTYNQLFEENPINASGVDAKAILSAR LSKSRRLENLIAQLPGEKKNGLFGNLIALSLGLTPNFKSNFDLAEDA KLQLSKDTYDDDLDNLLAQIGDQYADLFLAAKNLSDAILLSDILRVN TEITKAPLSASMIKRYDEHHQDLTLLKALVRQQLPEKYKEIFFDQS KNGYAGYIDGGASQEEFYKFIKPILEKMDGTEELLVKLNREDLLRK QRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREKIEKILTFRIP YYVGPLARGNSRFAWMTRKSEETITPWNFEEVVDKGASAQSFIER MTNFDKNLPNEKVLPKIISLLYEYFTVYNELTKVKYVTEGMRKPAF LSGEQKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVED RFNASLGTYHDLLKIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIE ERLKTYAHLFDDKVMKQLICRRRYTGWGRLSRKLINGIRDKQSGK TILDFLKSDGFANRNFMQLIHDDSLTFKEDIQKAQVSGQGDSLHEHI ANLAGSPAIKKGILQTVKVVDELVKVMGRHKPENIVIEMARENQTT QKGQKNSRERMKRIEEGIKELGSQILKEHPVENTQLQNEKLYLYYL QNGRDMYVDQELDINRLSDYDVDAIVPQSFLKDDSIDNKVLTRSDK NRGKSDNVPSEEVVKKMKNYWRQLLNAKLITQRKFDNLTKAERG GLSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLIRE VKVITLKSKLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALI KKYPKLESEFVYGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMN FFKTEITLANGEIRKRPLIETNGETGEIVWDKGRDFATVRKVLSMP QVNIVKKTEVQTGGFSKESILPKRNSDKLIARKKDWDPKKYGGFDS PTVAYSVLVVAKVEKGKSKKLKSVKELLGITIMERSSFEKNPIDFLE AKGYKEVKKDLIIKLPKYSLFELENGRKRMLASAGELQKGNELAL PSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKHYLDEHEQISE FSKRVILADANLDKVLSAYNKHRDKPIREQAENIIHLFTLTNLGAPA AFKYFDTTIDRKRYTSTKEVLDATLIHQSITGLYETRIDLSQLGGDS GGSSGGSSGSETPGTSESATPESSGSETPGTSESATPESSGSETPGTSESAT PESSGGSSGGSSTLNIEDEYRLHETSKEPDVSLGSTWLSDFPQAWAET GGMGLAVRQAPLIIPLKATSTPVSIKQVPMSQEARLGIKPHIQRLLD QGILVPCQSPWNTPLLPVKKPGTNDYRPVQDLREVNKRVEDIHPTV PNPYNLLSGPPPSHQWYTVLDLKDAFFCLRLHPTSQPLFAFEWRDP EMGISGQLTWTRLPQGFKNSPTLFNEALHRDLADFRIQHPDLILLQ YVDDLLLAATSELDCQQGTRALLQTLGNLGYRASAKKAQKQKQV KYLGYLLKEGQRWLTEARKETVMGQPTPKTPRQLREFLGKAGFC RLFIPGFAEMAAPLYPLTKPGTLFNWGPDQQKAYQEIKQALLTAPA 
LGLPDLTKPFELFVDEKQGYAKGVLTQKLGPWRRPVAYLSKKLDP VAAGWPPCLRMVAAIAVLTKDAGKLTMGQPLVILAPHAVEALVK QPPDRWLSNARNITHYQALLLDTDRVQFGPVVALNPATLLPLPEEG LQIINCLDGTGGGGVTVKFKYKGEELEVDISKIKKVWRVGKNIISFT YDDNGKTGRGAVSEKDAPKELLQMLEKSGKKSGGSKRTADGSEFE PKKKRKVGGGGSPKKKRKVYPYDVPDYAGSRALVVIRLSRVTDATTS PERQLESCQQLCAQRGWDVVGVAEDLDVSGAVDPFDRKRRPNLAR WLAFEEQPFDVIVAYRVDRLTRSIRHLQQLVHWAEDHKKLVVSAT EAHFDTTTPFAAVVIALMGTVAQMELEAIKERNRSAAHFNIRAGKY RGSLPPWGYLPTRVDGEWRLVPDPVQRERILEVYHRVVDNHEPLH LVAHDLNRRGVLSPKDYFAQLQGREPQGREWSATALKRSMISEAM LGYATLNGKTVRDDDGAPLVRAEPILTREQLEALRAELVKTSRAKP AVSTPSLLLRVLFCAVCGEPAYKFAGGGRKHPRYRCRSMGFPKHC GNGTVAMAEWDAFCEEQVLDLLGDAERLEKVWVAGSDSAVELAE VNAELVDLTSLIGSPAYRAGSPQREALDARIAALAARQEELEGLEAR PSGWEWRETGQRFGDWWREQDTAAKNTWLRSMNVRLTFDVRGG LTRTIDFGDLQEYEQHLRLGSVVERLHTGMS SpCas9-XTEN- ATGAAACGGACAGCCGACGGAAGCGAGTTCGAGTCACCAAAGAAG RT(1-478)-Sto7d- AAGCGGAAAGTCGACAAGAAGTACAGCATCGGCCTGGACATCGGCA GGGGS-BxbINT CCAACTCTGTGGGCTGGGCCGTGATCACCGACGAGTACAAGGTGCC Nucleic acid CAGCAAGAAATTCAAGGTGCTGGGCAACACCGACCGGCACAGCATC SEQ ID NO: 377 AAGAAGAACCTGATCGGAGCCCTGCTGTTCGACAGCGGCGAAACAG CCGAGGCCACCCGGCTGAAGAGAACCGCCAGAAGAAGATACACCA GACGGAAGAACCGGATCTGCTATCTGCAAGAGATCTTCAGCAACGA GATGGCCAAGGTGGACGACAGCTTCTTCCACAGACTGGAAGAGTCC TTCCTGGTGGAAGAGGATAAGAAGCACGAGCGGCACCCCATCTTCG GCAACATCGTGGACGAGGTGGCCTACCACGAGAAGTACCCCACCAT CTACCACCTGAGAAAGAAACTGGTGGACAGCACCGACAAGGCCGAC CTGCGGCTGATCTATCTGGCCCTGGCCCACATGATCAAGTTCCGGGG CCACTTCCTGATCGAGGGCGACCTGAACCCCGACAACAGCGACGTG GACAAGCTGTTCATCCAGCTGGTGCAGACCTACAACCAGCTGTTCGA GGAAAACCCCATCAACGCCAGCGGCGTGGACGCCAAGGCCATCCTG TCTGCCAGACTGAGCAAGAGCAGACGGCTGGAAAATCTGATCGCCC AGCTGCCCGGCGAGAAGAAGAATGGCCTGTTCGGAAACCTGATTGC CCTGAGCCTGGGCCTGACCCCCAACTTCAAGAGCAACTTCGACCTGG CCGAGGATGCCAAACTGCAGCTGAGCAAGGACACCTACGACGACGA CCTGGACAACCTGCTGGCCCAGATCGGCGACCAGTACGCCGACCTG TTTCTGGCCGCCAAGAACCTGTCCGACGCCATCCTGCTGAGCGACAT CCTGAGAGTGAACACCGAGATCACCAAGGCCCCCCTGAGCGCCTCT ATGATCAAGAGATACGACGAGCACCACCAGGACCTGACCCTGCTGA AAGCTCTCGTGCGGCAGCAGCTGCCTGAGAAGTACAAAGAGATTTT 
CTTCGACCAGAGCAAGAACGGCTACGCCGGCTACATTGACGGCGGA GCCAGCCAGGAAGAGTTCTACAAGTTCATCAAGCCCATCCTGGAAA AGATGGACGGCACCGAGGAACTGCTCGTGAAGCTGAACAGAGAGG ACCTGCTGCGGAAGCAGCGGACCTTCGACAACGGCAGCATCCCCCA CCAGATCCACCTGGGAGAGCTGCACGCCATTCTGCGGCGGCAGGAA GATTTTTACCCATTCCTGAAGGACAACCGGGAAAAGATCGAGAAGA TCCTGACCTTCCGCATCCCCTACTACGTGGGCCCTCTGGCCAGGGGA AACAGCAGATTCGCCTGGATGACCAGAAAGAGCGAGGAAACCATCA CCCCCTGGAACTTCGAGGAAGTGGTGGACAAGGGCGCTTCCGCCCA GAGCTTCATCGAGCGGATGACCAACTTCGATAAGAACCTGCCCAAC GAGAAGGTGCTGCCCAAGCACAGCCTGCTGTACGAGTACTTCACCG TGTATAACGAGCTGACCAAAGTGAAATACGTGACCGAGGGAATGAG AAAGCCCGCCTTCCTGAGCGGCGAGCAGAAAAAGGCCATCGTGGAC CTGCTGTTCAAGACCAACCGGAAAGTGACCGTGAAGCAGCTGAAAG AGGACTACTTCAAGAAAATCGAGTGCTTCGACTCCGTGGAAATCTCC GGCGTGGAAGATCGGTTCAACGCCTCCCTGGGCACATACCACGATC TGCTGAAAATTATCAAGGACAAGGACTTCCTGGACAATGAGGAAAA CGAGGACATTCTGGAAGATATCGTGCTGACCCTGACACTGTTTGAGG ACAGAGAGATGATCGAGGAACGGCTGAAAACCTATGCCCACCTGTT CGACGACAAAGTGATGAAGCAGCTGAAGCGGCGGAGATACACCGG CTGGGGCAGGCTGAGCCGGAAGCTGATCAACGGCATCCGGGACAAG CAGTCCGGCAAGACAATCCTGGATTTCCTGAAGTCCGACGGCTTCGC CAACAGAAACTTCATGCAGCTGATCCACGACGACAGCCTGACCTTT AAAGAGGACATCCAGAAAGCCCAGGTGTCCGGCCAGGGCGATAGCC TGCACGAGCACATTGCCAATCTGGCCGGCAGCCCCGCCATTAAGAA GGGCATCCTGCAGACAGTGAAGGTGGTGGACGAGCTCGTGAAAGTG ATGGGCCGGCACAAGCCCGAGAACATCGTGATCGAAATGGCCAGAG AGAACCAGACCACCCAGAAGGGACAGAAGAACAGCCGCGAGAGAA TGAAGCGGATCGAAGAGGGCATCAAAGAGCTGGGCAGCCAGATCCT GAAAGAACACCCCGTGGAAAACACCCAGCTGCAGAACGAGAAGCT GTACCTGTACTACCTGCAGAATGGGCGGGATATGTACGTGGACCAG GAACTGGACATCAACCGGCTGTCCGACTACGATGTGGACGCTATCG TGCCTCAGAGCTTTCTGAAGGACGACTCCATCGACAACAAGGTGCT GACCAGAAGCGACAAGAACCGGGGCAAGAGCGACAACGTGCCCTC CGAAGAGGTCGTGAAGAAGATGAAGAACTACTGGCGGCAGCTGCTG AACGCCAAGCTGATTACCCAGAGAAAGTTCGACAATCTGACCAAGG CCGAGAGAGGCGGCCTGAGCGAACTGGATAAGGCCGGCTTCATCAA GAGACAGCTGGTGGAAACCCGGCAGATCACAAAGCACGTGGCACA GATCCTGGACTCCCGGATGAACACTAAGTACGACGAGAATGACAAG CTGATCCGGGAAGTGAAAGTGATCACCCTGAAGTCCAAGCTGGTGT CCGATTTCCGGAAGGATTTCCAGTTTTACAAAGTGCGCGAGATCAAC AACTACCACCACGCCCACGACGCCTACCTGAACGCCGTCGTGGGAA 
CCGCCCTGATCAAAAAGTACCCTAAGCTGGAAAGCGAGTTCGTGTA CGGCGACTACAAGGTGTACGACGTGCGGAAGATGATCGCCAAGAGC GAGCAGGAAATCGGCAAGGCTACCGCCAAGTACTTCTTCTACAGCA ACATCATGAACTTTTTCAAGACCGAGATTACCCTGGCCAACGGCGA GATCCGGAAGCGGCCTCTGATCGAGACAAACGGCGAAACCGGGGA GATCGTGTGGGATAAGGGCCGGGATTTTGCCACCGTGCGGAAAGTG CTGAGCATGCCCCAAGTGAATATCGTGAAAAAGACCGAGGTGCAGA CAGGCGGCTTCAGCAAAGAGTCTATCCTGCCCAAGAGGAACAGCGA TAAGCTGATCGCCAGAAAGAAGGACTGGGACCCTAAGAAGTACGGC GGCTTCGACAGCCCCACCGTGGCCTATTCTGTGCTGGTGGTGGCCAA AGTGGAAAAGGGCAAGTCCAAGAAACTGAAGAGTGTGAAAGAGCT GCTGGGGATCACCATCATGGAAAGAAGCAGCTTCGAGAAGAATCCC ATCGACTTTCTGGAAGCCAAGGGCTACAAAGAAGTGAAAAAGGACC TGATCATCAAGCTGCCTAAGTACTCCCTGTTCGAGCTGGAAAACGGC CGGAAGAGAATGCTGGCCTCTGCCGGCGAACTGCAGAAGGGAAACG AACTGGCCCTGCCCTCCAAATATGTGAACTTCCTGTACCTGGCCAGC CACTATGAGAAGCTGAAGGGCTCCCCCGAGGATAATGAGCAGAAAC AGCTGTTTGTGGAACAGCACAAGCACTACCTGGACGAGATCATCGA GCAGATCAGCGAGTTCTCCAAGAGAGTGATCCTGGCCGACGCTAAT CTGGACAAAGTGCTGTCCGCCTACAACAAGCACCGGGATAAGCCCA TCAGAGAGCAGGCCGAGAATATCATCCACCTGTTTACCCTGACCAAT CTGGGAGCCCCTGCCGCCTTCAAGTACTTTGACACCACCATCGACCG GAAGAGGTACACCAGCACCAAAGAGGTGCTGGACGCCACCCTGATC CACCAGAGCATCACCGGCCTGTACGAGACACGGATCGACCTGTCTC AGCTGGGAGGTGACTCTGGAGGATCTAGCGGAGGATCCTCTGGCAG CGAGACACCAGGAACAAGCGAGTCAGCAACACCAGAGAGCTCTGGT AGCGAGACACCCGGTACCAGTGAAAGCGCCACGCCAGAAAGCAGT GGGAGTGAGACTCCGGGTACATCTGAATCAGCGACACCGGAATCAA GTGGCGGCAGCAGCGGCGGCAGCAGCACCCTAAATATAGAAGATGA GTATCGGCTACATGAGACCTCAAAAGAGCCAGATGTTTCTCTAGGGT CCACATGGCTGTCTGATTTTCCTCAGGCCTGGGCGGAAACCGGGGGC ATGGGACTGGCAGTTCGCCAAGCTCCTCTGATCATACCTCTGAAAGC AACCTCTACCCCCGTGTCCATAAAACAATACCCCATGTCACAAGAA GCCAGACTGGGGATCAAGCCCCACATACAGAGACTGTTGGACCAGG GAATACTGGTACCCTGCCAGTCCCCCTGGAACACGCCCCTGCTACCC GTTAAGAAACCAGGGACTAATGATTATAGGCCTGTCCAGGATCTGA GAGAAGTCAACAAGCGGGTGGAAGACATCCACCCCACCGTGCCCAA CCCTTACAACCTCTTGAGCGGGCCCCCACCGTCCCACCAGTGGTACA CTGTGCTTGATTTAAAGGATGCCTTTTTCTGCCTGAGACTCCACCCC ACCAGTCAGCCTCTCTTCGCCTTTGAGTGGAGAGATCCAGAGATGGG AATCTCAGGACAATTGACCTGGACCAGACTCCCACAGGGTTTCAAA AACAGTCCCACCCTGTTTAATGAGGCACTGCACAGAGACCTAGCAG 
ACTTCCGGATCCAGCACCCAGACTTGATCCTGCTACAGTACGTGGAT GACTTACTGCTGGCCGCCACTTCTGAGCTAGACTGCCAACAAGGTAC TCGGGCCCTGTTACAAACCCTAGGGAACCTCGGGTATCGGGCCTCG GCCAAGAAAGCCCAAATTTGCCAGAAACAGGTCAAGTATCTGGGGT ATCTTCTAAAAGAGGGTCAGAGATGGCTGACTGAGGCCAGAAAAGA GACTGTGATGGGGCAGCCTACTCCGAAGACCCCTCGACAACTAAGG GAGTTCCTAGGGAAGGCAGGCTTCTGTCGCCTCTTCATCCCTGGGTT TGCAGAAATGGCAGCCCCCCTGTACCCTCTCACCAAACCGGGGACT CTGTTTAATTGGGGCCCAGACCAACAAAAGGCCTATCAAGAAATCA AGCAAGCTCTTCTAACTGCCCCAGCCCTGGGGTTGCCAGATTTGACT AAGCCCTTTGAACTCTTTGTCGACGAGAAGCAGGGCTACGCCAAAG GTGTCCTAACGCAAAAACTGGGACCTTGGCGTCGGCCGGTGGCCTA CCTGTCCAAAAAGCTAGACCCAGTAGCAGCTGGGTGGCCCCCTTGC CTACGGATGGTAGCAGCCATTGCCGTACTGACAAAGGATGCAGGCA AGCTAACCATGGGACAGCCACTAGTCATTCTGGCCCCCCATGCAGTA GAGGCACTAGTCAAACAACCCCCCGACCGCTGGCTTTCCAACGCCC GGATGACTCACTATCAGGCCTTGCTTTTGGACACGGACCGGGTCCAG TTCGGACCGGTGGTAGCCCTGAACCCGGCTACGCTGCTCCCACTGCC TGAGGAAGGGCTGCAACACAACTGCCTTGATGGGACAGGTGGCGGT GGTGTCACCGTCAAGTTCAAGTACAAGGGTGAGGAACTTGAAGTTG ATATTAGCAAAATCAAGAAGGTTTGGCGCGTTGGTAAAATGATATC TTTTACTTATGACGACAACGGCAAGACAGGTAGAGGGGCAGTGTCT GAGAAAGACGCCCCCAAGGAGCTGTTGCAAATGTTGGAAAAGTCTG GGAAAAAGTCTGGCGGCTCAAAAAGAACCGCCGACGGCAGCGAATT CGAGCCCAAGAAGAAGAGGAAAGTCGGAGGTGGCGGGAGCCCAAA AAAGAAAAGAAAAGTGTATCCCTATGATGTCCCCGATTATGCCGGT TCAAGAGCCCTGGTCGTGATTAGACTGAGCCGAGTGACAGACGCCA CCACAAGTCCCGAGAGACAGCTGGAATCATGCCAGCAGCTCTGTGC TCAGCGGGGTTGGGATGTGGTCGGCGTGGCAGAGGATCTGGACGTG AGCGGGGCCGTCGATCCATTCGACAGAAAGAGGAGGCCCAACCTGG CAAGATGGCTCGCTTTCGAGGAACAGCCCTTTGATGTGATCGTCGCC TACAGAGTGGACCGGCTGACCCGCTCAATTCGACATCTCCAGCAGCT GGTGCATTGGGCTGAGGACCACAAGAAACTGGTGGTCAGCGCAACA GAAGCCCACTTCGATACTACCACACCTTTTGCCGCTGTGGTCATCGC ACTGATGGGCACTGTGGCCCAGATGGAGCTCGAAGCTATCAAGGAG CGAAACAGGAGCGCAGCCCATTTCAATATTAGGGCCGGTAAATACA GAGGCTCCCTGCCCCCTTGGGGATATCTCCCTACCAGGGTGGATGGG GAGTGGAGACTGGTGCCAGACCCCGTCCAGAGAGAGCGGATTCTGG AAGTGTACCACAGAGTGGTCGATAACCACGAACCACTCCATCTGGT GGCACACGACCTGAATAGACGCGGCGTGCTCTCTCCAAAGGATTAT TTTGCTCAGCTGCAGGGAAGAGAGCCACAGGGAAGAGAATGGAGTG CTACTGCACTGAAGAGATCTATGATCAGTGAGGCTATGCTGGGTTAC 
GCAACACTCAATGGCAAAACTGTCCGGGACGATGACGGAGCCCCTC TGGTGAGGGCTGAGCCTATTCTCACCAGAGAGCAGCTCGAAGCTCT GCGGGCAGAACTGGTCAAGACTAGTCGCGCCAAACCTGCCGTGAGC ACCCCAAGCCTGCTCCTGAGGGTGCTGTTCTGCGCCGTCTGTGGAGA GCCAGCATACAAGTTTGCCGGCGGAGGGCGCAAACATCCCCGCTAT CGATGCAGGAGCATGGGGTTCCCTAAGCACTGTGGAAACGGGACAG TGGCCATGGCTGAGTGGGACGCCTTTTGCGAGGAACAGGTGCTGGA TCTCCTGGGTGACGCTGAGCGGCTGGAAAAAGTGTGGGTGGCAGGA TCTGACTCCGCTGTGGAGCTGGCAGAAGTCAATGCCGAGCTCGTGG ATCTGACTTCCCTCATCGGATCTCCTGCATATAGAGCTGGGTCCCCA CAGAGAGAAGCTCTGGACGCACGAATTGCTGCACTCGCTGCTAGAC AGGAGGAACTGGAGGGCCTGGAGGCCAGGCCCTCTGGATGGGAGTG GCGAGAAACCGGACAGAGGTTTGGGGATTGGTGGAGGGAGCAGGA CACCGCAGCCAAGAACACATGGCTGAGATCCATGAATGTCCGGCTC ACATTCGACGTGCGCGGTGGCCTGACTCGAACCATCGATTTTGGCGA CCTGCAGGAGTATGAACAGCACCTGAGACTGGGGTCCGTGGTCGAA AGACTGCACACTGGGATGTCC SpCas9 DKKYSIGLDIGTNSVGWAVITDEYKVPSKKFKVLGNTDRHSIKKNLIGA Amino acid LLFDSGETAEATRLKRTARRRYTRRKNRICYLQEIFSNEMAKVDDSFFH SEQ ID NO: 378 RLEESFLVEEDKKHERHPIFGNIVDEVAYHEKYPTIYHLRKKLVDSTDK ADLRLIYLALAHMIKFRGHFLIEGDLNPDNSDVDKLFIQLVQTYNQLFE ENPINASGVDAKAILSARLSKSRRLENLIAQLPGEKKNGLFGNLIALSLG LTPNFKSNFDLAEDAKLQLSKDTYDDDLDNLLAQIGDQYADLFLAAKN LSDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDLTLLKALVRQQLPE KYKEIFFDQSKNGYAGYIDGGASQEEFYKFIKPILEKMDGTEELLVKLN REDLLRKQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREKIEKILT FRIPYYVGPLARGNSRFAWMTRKSEETITPWNFEEVVDKGASAQSFIER MTNFDKNLPNEKVLPKHSLLYEYFTVYNELTKVKYVTEGMRKPAFLSG EQKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVEDRFNASL GTYHDLLKIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYAH LFDDKVMKQLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFA NRNFMQLIHDDSLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIKKGIL QTVKVVDELVKVMGRHKPENIVIEMARENQTTQKGQKNSRERMKRIE EGIKELGSQILKEHPVENTQLQNEKLYLYYLQNGRDMYVDQELDINRL SDYDVDAIVPQSFLKDDSIDNKVLTRSDKNRGKSDNVPSEEVVKKMKN YWRQLLNAKLITQRKFDNLTKAERGGLSELDKAGFIKRQLVETRQITK HVAQILDSRMNTKYDENDKLIREVKVITLKSKLVSDFRKDFQFYKVREI NNYHHAHDAYLNAVVGTALIKKYPKLESEFVYGDYKVYDVRKMIAKS EQEIGKATAKYFFYSNIMNFFKTEITLANGEIRKRPLIETNGETGEIVWD KGRDFATVRKVLSMPQVNIVKKTEVQTGGFSKESILPKRNSDKLIARKK 
DWDPKKYGGFDSPTVAYSVLVVAKVEKGKSKKLKSVKELLGITIMERS SFEKNPIDFLEAKGYKEVKKDLIIKLPKYSLFELENGRKRMLASAGELQ KGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKHYLDEII EQISEFSKRVILADANLDKVLSAYNKHRDKPIREQAENIIHLFTLTNLGA PAAFKYFDTTIDRKRYTSTKEVLDATLIHQSITGLYETRIDLSQLGGD RT(1-478)-Sto7d LNIEDEYRLHETSKEPDVSLGSTWLSDFPQAWAETGGMGLAVRQAPLII Amino acid PLKATSTPVSIKQYPMSQEARLGIKPHIQRLLDQGILVPCQSPWNTPLLP SEQ ID NO: 379 VKKPGTNDYRPVQDLREVNKRVEDIHPTVPNPYNLLSGPPPSHQWYTV LDLKDAFFCLRLHPTSQPLFAFEWRDPEMGISGQLTWTRLPQGFKNSPT LFNEALHRDLADFRIQHPDLILLQYVDDLLLAATSELDCQQGTRALLQT LGNLGYRASAKKAQICQKQVKYLGYLLKEGQRWLTEARKETVMGQPT PKTPRQLREFLGKAGFCRLFIPGFAEMAAPLYPLTKPGTLFNWGPDQQK AYQEIKQALLTAPALGLPDLTKPFELFVDEKQGYAKGVLTQKLGPWRR PVAYLSKKLDPVAAGWPPCLRMVAAIAVLTKDAGKLTMGQPLVILAP HAVEALVKQPPDRWLSNARMTHYQALLLDTDRVQFGPVVALNPATLL PLPEEGLQHNCLDGTGGGGVTVKFKYKGEELEVDISKIKKVWRVGKMI SFTYDDNGKTGRGAVSEKDAPKELLQMLEKSGKKSGGSKRTADGS BxbINT SRALVVIRLSRVTDATTSPERQLESCQQLCAQRGWDVVGVAEDLDVSG Amino acid AVDPFDRKRRPNLARWLAFEEQPFDVIVAYRVDRLTRSIRHLQQLVHW SEQ ID NO: 380 AEDHKKLVVSATEAHFDTTTPFAAVVIALMGTVAQMELEAIKERNRSA AHFNIRAGKYRGSLPPWGYLPTRVDGEWRLVPDPVQRERILEVYHRVV DNHEPLHLVAHDLNRRGVLSPKDYFAQLQGREPQGREWSATALKRSM ISEAMLGYATLNGKTVRDDDGAPLVRAEPILTREQLEALRAELVKTSRA KPAVSTPSLLLRVLFCAVCGEPAYKFAGGGRKHPRYRCRSMGFPKHCG NGTVAMAEWDAFCEEQVLDLLGDAERLEKVWVAGSDSAVELAEVNA ELVDLTSLIGSPAYRAGSPQREALDARIAALAARQEELEGLEARPSGWE WRETGQRFGDWWREQDTAAKNTWLRSMNVRLTFDVRGGLTRTIDFG DLQEYEQHLRLGSVVERLHTGMS
EXAMPLES
(253) While several experimental Examples are contemplated, these Examples are intended to be non-limiting.
Example 1
CRE Integration Efficiency
(254) The efficiency of Cre integration was tested. To test the efficacy of PASTE with GFP using the lox71/lox66/Cre recombinase system, a clonal HEK293FT cell line with the lox71 sequence (SEQ ID NO: 1) integrated into the genome using lentivirus was developed. The integration of GFP was tested by transfection of the modified HEK293FT cell line with: (1) plus/minus SEQ ID NO: 71 comprising a Cre recombinase expression plasmid, and (2) SEQ ID NO: 72 comprising a GFP template and a lox66 Cre site of SEQ ID NO: 2. After 72 hours, the percent integration of GFP into the lox71 site was probed.
Example 2
Programmable Addition Via Site-Specific Targeting Elements (PASTE) with Cre Recombinase—Addition of Lox Site
(255) The lox71 (SEQ ID NO: 1) or lox66 (SEQ ID NO: 2) sequence was inserted into the HEK293FT cell genome using prime editing to test integration of GFP into the HEK293FT genome. In order to insert lox71 or lox66 sequence into HEK293FT cell genome, a pegRNA with PBS length of 13 base pairs operably linked to RT region of varying lengths was used. The following plasmids were used in the transfection of HEK293FT cells. The cells were transfected with (1) prime editing construct (PE2) or PE2 with conditional Cre expression, (2) Lox71 or Lox66 pegRNA targeting the HEK3 locus, and (3) plus/minus +90 HEK3 nicking second guide RNA targeting the HEK3 locus (+90 ngRNA). After 72 hours, the percent editing of the HEK293FT genome at the HEK3 locus was probed for incorporation of various lengths of lox71 or lox66 (see
Example 3
PASTE with Cre Recombinase—Integration of Gene
(256) The lox71 or lox66 pegRNAs having a PBS length of 13 base pairs and an insert length of 34 base pairs were used to probe integration of GFP in the HEK293FT genome. The PE and Cre were delivered in inducible expression vectors and induced at day 2. The HEK293FT cells were transfected with the following plasmids: (1) prime editing construct (PE2 or PE2 with conditional Cre expression); (2) Lox71 pegRNA; (3) plus/minus +90 HEK3 nicking guide RNA; and (4) EGFP template with Lox66 site. After 72 hours, the percent editing of the lox71 site and the percent integration of GFP were probed with or without the lox66 site in the presence of various PE/Cre constructs.
Example 4
Bxb1 Integration Data Lenti Reporter
(257) The integration system was switched to an integrase system that could result in an integration of target genes into a genome with higher efficiency. Serine integrase Bxb1 has been shown to be more active than Cre recombinase and highly efficient in bacteria and mammalian cells for irreversible integration of target genes.
(258) To probe the efficiency of the Bxb1 integration system, a clonal HEK293FT cell line with an attB Bxb1 site (SEQ ID NO: 3) integrated using lentivirus was developed. The modified HEK293FT cell line was then transfected with the following plasmids: (1) plus/minus Bxb1 expression plasmid and (2) plus/minus GFP (SEQ ID NO: 76) or Gluc (SEQ ID NO: 77) minicircle template with an attP Bxb1 site. After 72 hours, the integration of GFP or Gluc into the attB site in the HEK293FT genome was probed. The percent integrations of GFP or Gluc into the attB locus are shown in
Example 5
Addition of Bxb1 Site to Human Genome Using PRIME
(259) The maximum length of attB that can be integrated into a HEK293FT cell line with the best efficiency was probed. To probe the best length of attB (SEQ ID NO: 3) or its reverse complement attP (SEQ ID NO: 4) for prime editing, pegRNAs having a PBS length of 13 nt with varying RT homology length were used. The following plasmids were transfected in HEK293FT cells: (1) prime expression plasmid; (2) HEK3 targeting pegRNA design; and (3) HEK3 +90 nicking guide. After 72 hours, the percent integration of each of the attB constructs was probed.
(260) Integration by PASTE was then tested by tagging cell-organelle marker proteins with GFP in HEK293FT cells. PASTE was used to tag SUPT16H, SRRM2, LMNB1, NOLC1 and DEPDC4 with GFP in separate cell-culture wells and to test the usefulness of PASTE for tracking protein localization within cells by microscopy.
(261) The transfection of the plasmids can be achieved using electroporation as illustrated in
Example 6
Programmable Integration of Genes with PASTE
(262) The efficiency of gene integration of Gluc or EGFP with PASTE was tested. To enable gene integration with PASTE, the following HEK3 targeting pegRNAs were used: (1) 44 pegRNA: PBS of 13 nt and RT homology of 44 nt; (2) 34 pegRNA: PBS of 13 nt and RT homology of 34 nt; and (3) 26 pegRNA: PBS of 13 nt and RT homology of 26 nt.
(263) HEK293FT cells were transfected with the following plasmids: (1) Prime expression plasmid; (2) Bxb1 expression plasmid; (3) HEK3 targeting pegRNA design; (4) HEK3 +90 nicking guide; and (5) EGFP or Gluc minicircle. After 72 hours, the percent integration of Gluc or EGFP was observed.
Example 7
PASTE for Integration of Multiple Genes
(264) Site-specific integration of multiple genes into a cell with PASTE is facilitated by the use of orthogonal attB and attP sites. The central dinucleotide can be changed from GT to GA, and GA-containing attB/attP sites interact only with each other and do not cross-react with GT-containing sequences. A screen of dinucleotide combinations can be performed to find orthogonal attB/attP pairs for multiplexed PASTE editing. It has been shown that many orthogonal dinucleotide combinations can be found using a Bxb1 reporter system.
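As a rough illustration of the central-dinucleotide idea, the following sketch builds 38-bp Bxb1 attB variants by swapping the two central bases between the fixed flanks seen in the Bxb1_attB_38 entries of Table 5, and applies the simple rule described above: an attB/attP pair is predicted to recombine only when both carry the same central dinucleotide. The function names are hypothetical, and the matching rule is a deliberate simplification; actual cross-reactivity between pairs has to be measured experimentally.

```python
# Flanking halves of the 38-bp Bxb1 attB site, read off the Table 5 entries.
LEFT_FLANK = "GGCTTGTCGACGACGGCG"
RIGHT_FLANK = "CTCCGTCGTCAGGATCAT"
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def attb_variant(dinucleotide):
    """Build the 38-bp attB sequence carrying the given central dinucleotide."""
    dinucleotide = dinucleotide.upper()
    if len(dinucleotide) != 2 or set(dinucleotide) - set("ACGT"):
        raise ValueError("central dinucleotide must be two bases from A, C, G, T")
    return LEFT_FLANK + dinucleotide + RIGHT_FLANK


def predicted_compatible(attb_dinucleotide, attp_dinucleotide):
    """Simplified rule (assumption): only identical central dinucleotides pair."""
    return attb_dinucleotide.upper() == attp_dinucleotide.upper()


if __name__ == "__main__":
    for dn in ("GT", "GA"):
        site = attb_variant(dn)
        print(dn, site, reverse_complement(site))
    print("GT x GA compatible:", predicted_compatible("GT", "GA"))
```

Under this rule a GT-bearing attB and a GA-bearing attP are predicted not to interact, which is the property that multiplexed editing relies on.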
(265) To test this, attB.sup.GT and attB.sup.GA dinucleotides for Bxb1 were added at an ACTB site by prime editing. An EGFP-attP.sup.GT DNA minicircle and an mCherry-attP.sup.GA DNA minicircle were introduced to test the percent EGFP and mCherry editing in the presence or absence of Bxb1. The results of EGFP and mCherry editing are shown in
(266) Orthogonal editing with the right GT-EGFP and GA-mCherry pairs was achieved demonstrating the ability for multiplexed PASTE editing in cells.
(267) Two genes were introduced in the same cell using multiplexed PASTE to tag two different genes in a single reaction. EGFP and mCherry were tagged into the loci of ACTB and NOLC1 in a x cell line, in a single reaction. Further, EGFP and mCherry were tagged into the loci of ACTB and LMNB1. The cells were visualized using fluorescence microscopy.
(268) The ability to multiplex with nine different attB and attP central dinucleotides (AA, GA, CA, AG, AC, CC, GT, CT and TT; SEQ ID NOs: 7, 8, 23, 24, 19, 20, 25, 26, 27, 28, 9, 10, 15, 16, 17, 18, 5 and 6) in a 9×9 cross of attB and attP was tested. The edits were probed using next-generation sequencing. The results of the 9×9 cross of attB and attP central dinucleotides (AA, GA, CA, AG, AC, CC, GT, CT and TT) are shown in
Example 8
Integration of Albumin and CPS1 Into Albumin Locus
(269) Twelve pegRNAs with an albumin guide were linked to PBS and reverse transcriptase sequences of variable length, and different nicking guide RNAs were used to transfect HEK293FT cells. The percent editing at the albumin locus was probed using next-generation sequencing. The results of prime editing at the albumin locus are shown in
Example 9
Engineering T-cells
(270) In order to engineer CD8+ T-cells, the efficiency of PASTE delivery and editing in T-cells can be evaluated (
(271) Five-vector, three-vector, and two-vector PASTE systems show that robust T-cell editing can be achieved, with maximal editing using the three-vector approach (
Example 10
PASTE for CFTR
(272) PASTE for the CFTR locus can be tested in HEK293FT cells to identify top-performing pegRNA and nicking designs for human cells. Neuro-2A cells can also be tested to identify top-performing pegRNA and nicking designs for mouse cells. The best constructs can be applied for testing in mouse air-liquid interface (ALI) organoids in vitro or for delivery in pre-clinical models of cystic fibrosis in mice. Table 12 shows the pegRNA, nicking guide and minicircle DNA characteristics for the CFTR gene modulation.
(273) TABLE-US-00012 TABLE 12
Variables: Characteristics
pegRNA: 38 bp shortened minimal attB and normal 46 bp attB sequence with: a. PBS of 17, 13, and 9 nt in length, and b. RT of 20, 15, and 10 nt in length
Nicking guides: Nicking guide 1, +64 bp; Nicking guide 2, +23 bp; Nicking guide 3, −60 bp; Nicking guide 4, −78 bp (distance is calculated from cut site of pegRNA)
Minicircle template: A. CFTR coding sequence alone (~4,454 bp in size); B. CFTR coding sequence plus 5′ and 3′ UTRs (~6,011 bp in size) (Both minicircles have an attP site on them for integration by Bxb1 and a bGH poly A signal)
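The variables in Table 12 define a small combinatorial design space. The following sketch enumerates it, assuming a full-factorial screen in which every attB length is combined with every PBS length, RT length and nicking guide; the table lists the variables but does not state that all combinations are tested, so the factorial layout and the names used here are illustrative assumptions.

```python
from itertools import product

# Values taken from Table 12; how they are combined is an assumption.
ATTB_LENGTHS_BP = (38, 46)
PBS_LENGTHS_NT = (17, 13, 9)
RT_LENGTHS_NT = (20, 15, 10)
# Nicking guide offsets in bp, relative to the pegRNA cut site.
NICKING_OFFSETS_BP = (64, 23, -60, -78)


def pegrna_designs():
    """All attB x PBS x RT combinations as dictionaries."""
    return [
        {"attB_bp": attb, "PBS_nt": pbs, "RT_nt": rt}
        for attb, pbs, rt in product(ATTB_LENGTHS_BP, PBS_LENGTHS_NT, RT_LENGTHS_NT)
    ]


def screening_conditions():
    """Each pegRNA design paired with each nicking guide offset."""
    return [
        {**design, "nick_offset_bp": offset}
        for design, offset in product(pegrna_designs(), NICKING_OFFSETS_BP)
    ]


if __name__ == "__main__":
    print(len(pegrna_designs()), "pegRNA designs")
    print(len(screening_conditions()), "pegRNA/nicking-guide conditions")
```

Under the full-factorial assumption this gives 2 × 3 × 3 = 18 pegRNA designs and 18 × 4 = 72 pegRNA/nicking-guide conditions, each of which could then be paired with either minicircle template.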
Example 11
AttB and EGFP Integration Using PASTE
(274) The efficiency of the integration of attB and EGFP at the ACTB locus was evaluated (
(275) To make the tool simpler to use, Bxb1 can be linked to the Cas9-RT fusion of the prime editor via a P2A linker, so that only a single plasmid, rather than two, is needed for PASTE protein expression. This optimization can maintain the same level of editing, making the tool easier to use and deliver (
Example 12
Programmable EGFP Integrations in Different Cell Types
(276) The programmable EGFP integration in liver hepatocellular carcinoma cell line HEPG2 (
Example 13
Mutagenesis of Bxb1 for Enhanced PASTE Activity
(277) The mutagenesis of Bxb1 for enhanced PASTE activity was evaluated (
Example 14
Effect of the pegRNA PBS and RT Lengths on the Prime Editing Integration Efficiency
(278) The effect of the pegRNA PBS and RT lengths on the prime editing integration efficiency was evaluated (
Example 15
Comparison of PASTE and HITI On-target and Off-target Activities
(279) The PASTE and HITI on-target and off-target activities were compared (
Example 16
Multiplexing with PASTE and Orthogonal Di-nucleotide attB and attP Sites
(280) Multiplexing with PASTE and orthogonal di-nucleotide attB and attP sites was evaluated (
Example 17
PASTE Multiplexed Integrations at Endogenous Sites
(281) PASTE multiplexed integrations at endogenous sites were evaluated (
Example 18
Combination of CRISPR-Based Genome Editing and Site-Specific Integration
(282) The combination of CRISPR-based genome editing and site-specific integration was evaluated.
(283) PegRNAs containing different attB length truncations were assessed (
Example 19
Impact of Prime Editing and Integrase Parameters on PRIME Editing
(284) The impact of prime editing and integrase parameters on the integration efficiency of PRIME editing was assessed.
(285) Relevant pegRNA parameters for PASTE include the primer binding site (PBS), reverse transcription template (RT), and attB site lengths, as well as the relative locations and efficacy of the pegRNA spacer and nicking guide (
(286) The length of the attB landing site must balance two conflicting factors: the higher efficiency of prime editing for smaller inserts and reduced efficiency of Bxb1 integration at shorter attB lengths. AttB lengths were evaluated at ACTB, LMNB1, and nucleolar phosphoprotein p130 (NOLC1), and the optimal attB length was found to be locus dependent. At the ACTB locus, long attB lengths could be inserted by prime editing (
(287) The PE3 version of prime editing combines PE2 and an additional nicking guide to bias resolution of the flap intermediate towards insertion. To test the importance of nicking guide selection on PASTE editing, editing at the ACTB and LMNB1 loci was tested with two nicking guide positions. Suboptimal nicking guide positions were found to reduce PASTE efficiency by up to 30% (
(288) Rational mutations were also introduced in both the Bxb1 integrase and reverse transcriptase domain of the PE2 construct to optimize PASTE further. While some of these mutations were well tolerated by PASTE (
(289) Short RT and PBS lengths can offer additional improvements for editing. A panel of shorter RT and PBS guides was tested at the ACTB and LMNB1 loci, and while shorter RT and PBS sequences did not increase editing at ACTB (
Example 20
PASTE Tagging at Multiple Endogenous Genes
(290) GFP insertion efficiency was measured at seven different gene loci (ACTB, SUPT16H, SRRM2, NOLC1, DEPDC4, NES, and LMNB1) to test the versatility of PASTE programming. A range of integration rates up to 22% was found (
(291) The use of precise PASTE insertions for in-frame protein tagging, or for expressing cargo without disrupting endogenous gene expression, was assessed. As Bxb1 leaves residual sequences in the genome (termed attL and attR) after cargo integration, these genomic scars can serve as protein linkers. The frame of the attR sequence was positioned through strategic placement of the attP on the minicircle cargo, achieving a suitable protein linker, GGLSGQPPRSPSSGSSG (SEQ ID NO: 427). Using this linker, four genes (ACTB, SRRM2, NOLC1, and LMNB1) were tagged with GFP using PASTE. To assess correct gene tagging, the subcellular location of GFP was compared with the tagged gene product by immunofluorescence. For all four targeted loci, GFP co-localized with the tagged gene product, indicating successful tagging (
Example 21
Orthogonal Sequence Preferences for Bxb1 Integration
(292) The central dinucleotide of Bxb1 is involved in the association of attB and attP sites for integration, and changing the matched central dinucleotide sequences can modify integrase activity and provide orthogonality for insertion of two genes. Expanding the set of attB/attP dinucleotides can enable multiplexed gene insertion with PASTE. The efficiency of GFP integration at the ACTB locus with PASTE across all 16 dinucleotide attB/attP sequence pairs was profiled to find optimal attB/attP dinucleotides for PASTE insertion. Several dinucleotides with integration efficiencies greater than the wild-type GT sequence were found (
(293) The specificity of matched and unmatched attB/attP dinucleotide interactions was then assessed. The interactions between all dinucleotide combinations were profiled in a scalable fashion using a pooled assay to compare attB/attP integration (
(294) GA, AG, AC, and CT dinucleotide pegRNAs were then tested for GFP integration at ACTB, either paired with their corresponding attP cargo or mispaired with the other three dinucleotide attP sequences. All four of the tested dinucleotides were found to integrate cargo efficiently only when the attB was paired with its corresponding attP, with no detectable integration across mispaired combinations (
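The matched and mispaired combinations described in this example can be laid out as a minimal sketch. The function and variable names are hypothetical, and the rule "integration is expected only when the attB and attP central dinucleotides are identical" is a simplification of the reported result, not a model of measured efficiencies.

```python
from itertools import product

BASES = "ACGT"

# All 16 possible central dinucleotides (the wild-type GT is among them).
ALL_DINUCLEOTIDES = ["".join(pair) for pair in product(BASES, repeat=2)]

# The four dinucleotides tested for orthogonality in paragraph (294).
TESTED_DINUCLEOTIDES = ["GA", "AG", "AC", "CT"]


def pairing_grid(dinucleotides):
    """Map each (attB dinucleotide, attP dinucleotide) pair to True if matched.

    Assumption: "matched" simply means the two dinucleotides are identical.
    """
    return {
        (attb, attp): attb == attp
        for attb, attp in product(dinucleotides, repeat=2)
    }


def split_grid(grid):
    """Separate a pairing grid into matched and mispaired combinations."""
    matched = [pair for pair, is_match in grid.items() if is_match]
    mispaired = [pair for pair, is_match in grid.items() if not is_match]
    return matched, mispaired


if __name__ == "__main__":
    matched, mispaired = split_grid(pairing_grid(TESTED_DINUCLEOTIDES))
    print(matched)         # the four matched attB/attP pairs
    print(len(mispaired))  # each attB mispaired with the other three attP cargos
```

For the four tested dinucleotides this gives 4 matched and 12 mispaired conditions; for the full set of 16 dinucleotides the same grid has 16 matched and 240 mispaired combinations.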
Example 22
Multiplex Gene Integration with PASTE
(295) Multiplexing in cells by using orthogonal pegRNAs that direct a matched attP cargo to a specific site in the genome was assessed (
(296) An application for multiplexed gene integration is for labeling different proteins to visualize intracellular localization and interactions within the same cell. PASTE was used to simultaneously tag ACTB (GFP) and NOLC1 (mCherry) or ACTB (GFP) and LMNB1 (mCherry) in the same cell. No overlap of GFP and mCherry fluorescence was observed and tagged genes were confirmed to be visible in their appropriate cellular compartments, based on the known subcellular localizations of the ACTB, NOLC1 and LMNB1 protein products (
Example 23
PASTE Efficiencies Compared With DSB-based Insertion Methods
(297) PASTE efficiencies were found to exceed comparable DSB-based insertion methods.
(298) PASTE editing was assessed alongside DSB-dependent gene integration using either NHEJ (i.e., homology-independent targeted integration, HITI) or HDR pathways. PASTE had equivalent or better gene insertion efficiencies than either HITI (
Example 24
Off-Target Characterization of PASTE and HITI Gene Integration
(299) Off-target editing is a consideration for genome editing technologies. The specificity of PASTE at specific sites was assessed based on off-targets generated by Bxb1 integration into pseudo-attB sites in the human genome and off-targets generated via guide- and Cas9-dependent editing in the human genome (
(300) Genome-wide off-targets due to either Cas9 or Bxb1 through tagging and PCR amplification of insert-genomic junctions were additionally assessed (
(301) Expression of reverse transcriptases and integrases involved in PASTE can have detrimental effects on cellular health. The complete PASTE system, the corresponding guides and cargo with only PE2, and the corresponding guides and cargo with only Bxb1 were transfected and compared to both GFP control transfections and guides without protein expression via transcriptome-wide RNA sequencing to determine the extent of these effects. While Bxb1 expression in the absence of Prime editing was found to have several significant off targets, the complete PASTE system had only one differentially regulated gene with more than a 1.5-fold change (
Example 25
PASTE Efficiency in Non-Dividing Cells
(302) PASTE activity in non-dividing cells was assessed. Cas9 and HDR templates or PASTE were transfected into HEK293FT cells and cell division was arrested via aphidicolin treatment (
Example 26
Production and Secretion of Therapeutic Transgene
(303) PASTE with larger transgenes and in additional cell lines was assessed.
(304) To evaluate the size limits for therapeutic transgenes, insertion of cargos up to 13.3 kb in length in both dividing and aphidicolin treated cells was assessed. Insertion efficiency greater than 10% was found (
(305) To improve the efficiency of PASTE, PE2* NLS was incorporated for prime editing and improved PASTE integration at multiple loci was found (
(306) Programmable gene integration provides a modality for expression of therapeutic protein products, and protein production was assessed for therapeutically relevant proteins Alpha-1 antitrypsin (encoded by SERPINA1) and Carbamoyl phosphate synthetase I (encoded by CPS1), involved in the diseases Alpha-1 antitrypsin deficiency and CPS1 deficiency, respectively. By tagging gene products with the luminescent protein subunit HiBiT, the transgene production and secretion were assessed independently in response to PASTE treatment (
Example 27
Optimized PASTE Constructs
(307) To optimize complex activity, a panel of protein modifications was screened, including alternative reverse transcriptase fusions and mutations, various linkers between the reverse transcriptase domain and integrase and between the Cas9 and reverse transcriptase domain, and reverse transcriptase and BxbINT domain mutants (
(308) Additionally, pegRNAs containing different AttB length truncations were tested, and prime editing was found to be capable of inserting sequences up to 56 bp at the beta-actin (ACTB) gene locus, with higher efficiency at lengths below 31 bp (
Example 28
Viral Delivery & In Vivo Editing
(309) In order to package the complete PASTE system in viral vectors, an AdV vector was utilized (
(310) To further demonstrate that PASTE would be amenable to in vivo delivery, an mRNA version of the PASTE protein components was developed, as well as chemically modified synthetic atgRNA and nicking guide against the LMNB1 target (
Example 29
Simultaneous Deletion & Insertion with PASTE
(311) The PASTE system was used to simultaneously delete one sequence and insert another. Deletions of 130 bp and 385 bp of the first exon of LMNB1, combined with insertion of an AttB nucleic acid sequence, were performed (
(312) A 130 bp deletion of the first exon of LMNB1 with combined insertion of a 967 bp cargo using the PASTE system was also performed.
(313) One of two attP sequences was inserted using the minicircle template that has a mutated AttP, as described above. This AttP mutant shows better integration kinetics and efficiency, especially for the shorter AttBs (38-44 bp). The LMNB1 AttB used in this experiment is 38 bp (