Modified strains for the production of recombinant silk
10647975 · 2020-05-12
Abstract
Disclosed herein are modified strains for reducing degradation of recombinantly expressed products secreted from a host organism and methods of using the modified strains. In some embodiments, to attenuate a protease activity in Pichia pastoris, the genes encoding proteases that degrade the secreted products are inactivated or mutated to reduce or eliminate activity. In preferred strains, the protease activity of proteases encoded by PAS_chr4_0584 (YPS1-1) and PAS_chr3_1157 (YPS1-2) (e.g., polypeptides comprising SEQ ID NO: 66 and 67) is attenuated.
Claims
1. A Pichia pastoris microorganism, in which the activity of a YPS1-1 protease comprising a polypeptide sequence at least 95% identical to SEQ ID NO: 67 and a YPS1-2 protease comprising a polypeptide sequence at least 95% identical to SEQ ID NO: 68 has been attenuated or eliminated, wherein said polypeptide sequence at least 95% identical to SEQ ID NO: 67 and said polypeptide sequence at least 95% identical to SEQ ID NO: 68 each have a protease activity before said attenuation or elimination, and wherein said microorganism expresses a recombinant protein.
2. The microorganism of claim 1, wherein said YPS1-1 protease comprises SEQ ID NO: 67.
3. The microorganism of claim 1, wherein said YPS1-1 protease is encoded by a YPS1-1 gene comprising a polynucleotide sequence at least 95% identical to SEQ ID NO: 1 and encoding a polypeptide having protease activity.
4. The microorganism of claim 3, wherein said YPS1-1 gene comprises SEQ ID NO: 1.
5. The microorganism of claim 1, wherein said YPS1-2 protease comprises SEQ ID NO: 68.
6. The microorganism of claim 1, wherein said YPS1-2 protease is encoded by a YPS1-2 gene comprising a polynucleotide sequence at least 95% identical to SEQ ID NO: 2 and encoding a polypeptide having protease activity.
7. The microorganism of claim 6, wherein said YPS1-2 gene comprises SEQ ID NO: 2.
8. The microorganism of claim 1, wherein said YPS1-1 protease is encoded by a YPS1-1 gene, wherein said YPS1-2 protease is encoded by a YPS1-2 gene, and wherein said YPS1-1 gene or said YPS1-2 gene, or both, has been mutated or knocked out.
9. The microorganism of claim 1, wherein said recombinant protein comprises one or more repeat sequences {GGY-[GPG-X1]n1-GPS-(A)n2}n3 (SEQ ID NO: 514), wherein
X1=SGGQQ (SEQ ID NO: 515), GAGQQ (SEQ ID NO: 516), GQGPY (SEQ ID NO: 517), AGQQ (SEQ ID NO: 518) or SQ; n1 is from 4 to 8; n2 is from 6 to 20; and n3 is from 2 to 20, wherein said one or more repeat sequences are a silk-like polypeptide.
10. The microorganism of claim 9, wherein said recombinant protein comprises SEQ ID NO: 463.
11. The microorganism of claim 1, wherein the activity of one or more additional proteases has been attenuated or eliminated.
12. A Pichia pastoris engineered microorganism comprising YPS1-1 and YPS1-2 activity reduced by a mutation or deletion of the YPS1-1 gene comprising SEQ ID NO: 1 and the YPS1-2 gene comprising SEQ ID NO: 2, wherein said microorganism further comprises a recombinantly expressed protein comprising a polypeptide sequence comprising SEQ ID NO: 463.
13. A cell culture comprising the microorganism of claim 1.
14. A cell culture comprising the microorganism of claim 1, wherein said recombinantly expressed protein is less degraded than a cell culture comprising an otherwise identical Pichia pastoris microorganism whose YPS1-1 and YPS1-2 activity has not been attenuated or eliminated.
15. A method of producing a recombinant protein with a reduced degradation, comprising: culturing the microorganism of claim 1 in a culture medium under conditions suitable for expression of the recombinantly expressed protein; and isolating the recombinant protein from the microorganism or the culture medium.
16. The method of claim 15, wherein said recombinant protein is secreted from said microorganism, and wherein isolating said recombinant protein comprises collecting a culture medium comprising said secreted recombinant protein.
17. The method of claim 15, wherein said recombinant protein has a decreased level of degradation as compared to said recombinant protein produced by an otherwise identical microorganism wherein said YPS1-1 and said YPS1-2 protease activity has not been attenuated or eliminated.
18. A method of making the Pichia pastoris microorganism of claim 1 comprising knocking out or mutating a gene encoding the YPS1-1 protein and a gene encoding the YPS1-2 protein.
19. The method of claim 18, wherein said recombinantly expressed protein comprises a polyA sequence comprising at least 2, 3, 4, 5, 6, 7, 8, 9, or 10 contiguous alanine residues (SEQ ID NO: 519).
20. The method of claim 18, wherein said recombinantly expressed protein comprises a silk-like polypeptide.
21. The method of claim 20, wherein said silk-like polypeptide comprises one or more repeat sequences {GGY-[GPG-X1]n1-GPS-(A)n2}n3 (SEQ ID NO: 514), wherein
X1=SGGQQ (SEQ ID NO: 515) or GAGQQ (SEQ ID NO: 516) or GQGPY (SEQ ID NO: 517) or AGQQ (SEQ ID NO: 518) or SQ; n1 is from 4 to 8; n2 is from 6 to 20; and n3 is from 2 to 20.
22. The method of claim 18, wherein said recombinantly expressed protein comprises a polypeptide sequence encoded by SEQ ID NO: 462.
23. A Pichia pastoris microorganism, in which the activity of a YPS1-1 protease comprising a polypeptide sequence at least 95% identical to SEQ ID NO: 67 and a YPS1-2 protease comprising a polypeptide sequence at least 95% identical to SEQ ID NO: 68 has been attenuated or eliminated, wherein said polypeptide sequence at least 95% identical to SEQ ID NO: 67 and said polypeptide sequence at least 95% identical to SEQ ID NO: 68 each have a protease activity before said attenuation or elimination.
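The repeat-sequence formula recited in claims 9 and 21 can be read as a pattern over one-letter amino acid codes. The following is a minimal Python sketch of that reading, not part of the claimed subject matter. It assumes that each GPG-X1 unit within a repeat may use a different X1 option and that the hyphens in the formula are only visual separators; the function name `contains_silk_repeat` is hypothetical.

```python
import re

# X1 options as recited in the claims.
X1_OPTIONS = ("SGGQQ", "GAGQQ", "GQGPY", "AGQQ", "SQ")

# {GGY-[GPG-X1]n1-GPS-(A)n2}n3 with n1 = 4..8, n2 = 6..20, n3 = 2..20.
# Assumption: X1 may differ between GPG-X1 units of the same repeat.
_X1 = "|".join(X1_OPTIONS)
SILK_REPEAT = re.compile(
    rf"(?:GGY(?:GPG(?:{_X1})){{4,8}}GPSA{{6,20}}){{2,20}}"
)


def contains_silk_repeat(protein: str) -> bool:
    """Return True if the sequence contains at least one run matching the formula."""
    return SILK_REPEAT.search(protein.upper()) is not None


if __name__ == "__main__":
    # Illustrative sequence only; it is not any SEQ ID NO of this document.
    unit = "GGY" + "GPGSGGQQ" * 4 + "GPS" + "A" * 6
    print(contains_silk_repeat(unit * 2))  # two consecutive units satisfy n3 >= 2
    print(contains_silk_repeat(unit))      # a single unit does not
```

Because the claims say the protein "comprises" the repeat, the sketch searches anywhere in the sequence rather than requiring the whole sequence to match.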
Description
BRIEF DESCRIPTION OF THE DRAWINGS
(1) The foregoing and other objects, features and advantages will be apparent from the following description of particular embodiments of the invention, as illustrated in the accompanying drawings in which like reference characters refer to the same parts throughout the different views. The drawings are not necessarily to scale, emphasis instead being placed upon illustrating the principles of various embodiments of the invention.
DETAILED DESCRIPTION
(8) The details of various embodiments of the invention are set forth in the description below. Other features, objects, and advantages of the invention will be apparent from the description and the drawings, and from the claims.
(9) Definitions
(10) Unless otherwise defined herein, scientific and technical terms used in connection with the present invention shall have the meanings that are commonly understood by those of ordinary skill in the art. Further, unless otherwise required by context, singular terms shall include the plural and plural terms shall include the singular. The terms "a" and "an" include plural references unless the context dictates otherwise. Generally, nomenclatures used in connection with, and techniques of, biochemistry, enzymology, molecular and cellular biology, microbiology, genetics and protein and nucleic acid chemistry and hybridization described herein are those well-known and commonly used in the art.
(11) The following terms, unless otherwise indicated, shall be understood to have the following meanings:
(12) The term polynucleotide or nucleic acid molecule refers to a polymeric form of nucleotides of at least 10 bases in length. The term includes DNA molecules (e.g., cDNA or genomic or synthetic DNA) and RNA molecules (e.g., mRNA or synthetic RNA), as well as analogs of DNA or RNA containing non-natural nucleotide analogs, non-native internucleoside bonds, or both. The nucleic acid can be in any topological conformation. For instance, the nucleic acid can be single-stranded, double-stranded, triple-stranded, quadruplexed, partially double-stranded, branched, hairpinned, circular, or in a padlocked conformation.
(13) Unless otherwise indicated, and as an example for all sequences described herein under the general format "SEQ ID NO:", "nucleic acid comprising SEQ ID NO: 1" refers to a nucleic acid, at least a portion of which has either (i) the sequence of SEQ ID NO: 1, or (ii) a sequence complementary to SEQ ID NO: 1. The choice between the two is dictated by the context. For instance, if the nucleic acid is used as a probe, the choice between the two is dictated by the requirement that the probe be complementary to the desired target.
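The two readings of "comprising" a sequence described above can be illustrated with a minimal Python sketch. It assumes DNA sequences written 5' to 3' and treats "complementary" as the reverse complement; the function names are hypothetical and the sequences in the example are illustrative only, not any SEQ ID NO of this document.

```python
# Watson-Crick complements for DNA, upper and lower case.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def comprises(nucleic_acid: str, reference: str) -> bool:
    """True if a portion of nucleic_acid has (i) the reference sequence
    or (ii) a sequence complementary to it (assumed: reverse complement)."""
    na = nucleic_acid.upper()
    ref = reference.upper()
    return ref in na or reverse_complement(ref) in na


if __name__ == "__main__":
    print(comprises("AAATGCCC", "ATGC"))    # case (i): direct match
    print(comprises("AAATGCCC", "GGGCAT"))  # case (ii): complementary strand
```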
(14) An isolated RNA, DNA or a mixed polymer is one which is substantially separated from other cellular components that naturally accompany the native polynucleotide in its natural host cell, e.g., ribosomes, polymerases and genomic sequences with which it is naturally associated.
(15) An isolated organic molecule (e.g., a silk protein) is one which is substantially separated from the cellular components (membrane lipids, chromosomes, proteins) of the host cell from which it originated, or from the medium in which the host cell was cultured. The term does not require that the biomolecule has been separated from all other chemicals, although certain isolated biomolecules may be purified to near homogeneity.
(16) The term recombinant refers to a biomolecule, e.g., a gene or protein, that (1) has been removed from its naturally occurring environment, (2) is not associated with all or a portion of a polynucleotide in which the gene is found in nature, (3) is operatively linked to a polynucleotide which it is not linked to in nature, or (4) does not occur in nature. The term recombinant can be used in reference to cloned DNA isolates, chemically synthesized polynucleotide analogs, or polynucleotide analogs that are biologically synthesized by heterologous systems, as well as proteins and/or mRNAs encoded by such nucleic acids.
(17) An endogenous nucleic acid sequence in the genome of an organism (or the encoded protein product of that sequence) is deemed recombinant herein if a heterologous sequence is placed adjacent to the endogenous nucleic acid sequence, such that the expression of this endogenous nucleic acid sequence is altered. In this context, a heterologous sequence is a sequence that is not naturally adjacent to the endogenous nucleic acid sequence, whether or not the heterologous sequence is itself endogenous (originating from the same host cell or progeny thereof) or exogenous (originating from a different host cell or progeny thereof). By way of example, a promoter sequence can be substituted (e.g., by homologous recombination) for the native promoter of a gene in the genome of a host cell, such that this gene has an altered expression pattern. This gene would now become recombinant because it is separated from at least some of the sequences that naturally flank it.
(18) A nucleic acid is also considered recombinant if it contains any modifications that do not naturally occur to the corresponding nucleic acid in a genome. For instance, an endogenous coding sequence is considered recombinant if it contains an insertion, deletion or a point mutation introduced artificially, e.g., by human intervention. A recombinant nucleic acid also includes a nucleic acid integrated into a host cell chromosome at a heterologous site and a nucleic acid construct present as an episome.
(19) As used herein, the phrase degenerate variant of a reference nucleic acid sequence encompasses nucleic acid sequences that can be translated, according to the standard genetic code, to provide an amino acid sequence identical to that translated from the reference nucleic acid sequence. The term degenerate oligonucleotide or degenerate primer is used to signify an oligonucleotide capable of hybridizing with target nucleic acid sequences that are not necessarily identical in sequence but that are homologous to one another within one or more particular segments.
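As a minimal Python sketch of the definition above, two coding sequences can be tested for being degenerate variants by translating both with the standard genetic code and comparing the results. The sketch assumes both sequences are in frame starting at their first base and ignores any trailing partial codon; the function names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order ('*' = stop).
_BASES = "TCAG"
_AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(_BASES, repeat=3), _AMINO)
}


def translate(seq: str) -> str:
    """Translate an in-frame DNA or RNA coding sequence codon by codon."""
    dna = seq.upper().replace("U", "T")
    usable = len(dna) - len(dna) % 3  # drop a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def is_degenerate_variant(candidate: str, reference: str) -> bool:
    """True if both sequences encode the identical amino acid sequence."""
    return translate(candidate) == translate(reference)


if __name__ == "__main__":
    # GCT/GCC both encode Ala and GGT/GGA both encode Gly.
    print(is_degenerate_variant("ATGGCCGGA", "ATGGCTGGT"))
```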
(20) The term percent sequence identity or identical in the context of nucleic acid sequences refers to the residues in the two sequences which are the same when aligned for maximum correspondence. The length of sequence identity comparison may be over a stretch of at least about nine nucleotides, usually at least about 20 nucleotides, more usually at least about 24 nucleotides, typically at least about 28 nucleotides, more typically at least about 32 nucleotides, and preferably at least about 36 or more nucleotides. There are a number of different algorithms known in the art which can be used to measure nucleotide sequence identity. For instance, polynucleotide sequences can be compared using FASTA, Gap or Bestfit, which are programs in Wisconsin Package Version 10.0, Genetics Computer Group (GCG), Madison, Wis. FASTA provides alignments and percent sequence identity of the regions of the best overlap between the query and search sequences. Pearson, Methods Enzymol. 183:63-98 (1990) (hereby incorporated by reference in its entirety). For instance, percent sequence identity between nucleic acid sequences can be determined using FASTA with its default parameters (a word size of 6 and the NOPAM factor for the scoring matrix) or using Gap with its default parameters as provided in GCG Version 6.1, herein incorporated by reference. Alternatively, sequences can be compared using the computer program, BLAST (Altschul et al., J. Mol. Biol. 215:403-410 (1990); Gish and States, Nature Genet. 3:266-272 (1993); Madden et al., Meth. Enzymol. 266:131-141 (1996); Altschul et al., Nucleic Acids Res. 25:3389-3402 (1997); Zhang and Madden, Genome Res. 7:649-656 (1997)), especially blastp or tblastn (Altschul et al., Nucleic Acids Res. 25:3389-3402 (1997)).
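For illustration only, percent sequence identity over an existing pairwise alignment can be computed as in the following Python sketch. It is not the FASTA, Gap, Bestfit or BLAST implementation. It assumes the two sequences have already been aligned to equal length with "-" as the gap character, and it counts every aligned column (including columns with a gap in one sequence) in the denominator; other conventions for the denominator exist and give different values.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of alignment columns in which both sequences carry the same residue.

    Columns that are gaps in both sequences are skipped; columns with a gap in
    only one sequence count as mismatches (an assumed convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap and y == gap:
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns in the alignment")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # 6 identical columns out of 8 aligned columns.
    print(percent_identity("ACGTAC-T", "ACGAACGT"))
```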
(21) The term substantial homology or substantial similarity, when referring to a nucleic acid or fragment thereof, indicates that, when optimally aligned with appropriate nucleotide insertions or deletions with another nucleic acid (or its complementary strand), there is nucleotide sequence identity in at least about 75%, 80%, 85%, preferably at least about 90%, and more preferably at least about 95%, 96%, 97%, 98% or 99% of the nucleotide bases, as measured by any well-known algorithm of sequence identity, such as FASTA, BLAST or Gap, as discussed above.
(22) Alternatively, substantial homology or similarity exists when a nucleic acid or fragment thereof hybridizes to another nucleic acid, to a strand of another nucleic acid, or to the complementary strand thereof, under stringent hybridization conditions. Stringent hybridization conditions and stringent wash conditions in the context of nucleic acid hybridization experiments depend upon a number of different physical parameters. Nucleic acid hybridization will be affected by such conditions as salt concentration, temperature, solvents, the base composition of the hybridizing species, length of the complementary regions, and the number of nucleotide base mismatches between the hybridizing nucleic acids, as will be readily appreciated by those skilled in the art. One having ordinary skill in the art knows how to vary these parameters to achieve a particular stringency of hybridization.
(23) In general, stringent hybridization is performed at about 25° C. below the thermal melting point (Tm) for the specific DNA hybrid under a particular set of conditions. Stringent washing is performed at temperatures about 5° C. lower than the Tm for the specific DNA hybrid under a particular set of conditions. The Tm is the temperature at which 50% of the target sequence hybridizes to a perfectly matched probe. See Sambrook et al., Molecular Cloning: A Laboratory Manual, 2d ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (1989), page 9.51, hereby incorporated by reference. For purposes herein, stringent conditions are defined for solution phase hybridization as aqueous hybridization (i.e., free of formamide) in 6×SSC (where 20×SSC contains 3.0 M NaCl and 0.3 M sodium citrate), 1% SDS at 65° C. for 8-12 hours, followed by two washes in 0.2×SSC, 0.1% SDS at 65° C. for 20 minutes. It will be appreciated by the skilled worker that hybridization at 65° C. will occur at different rates depending on a number of factors including the length and percent identity of the sequences which are hybridizing.
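The arithmetic behind these conditions can be sketched in Python as follows. The sketch scales the stated 20-fold SSC composition to the working strengths and applies the stated offsets of about 25° C. (hybridization) and about 5° C. (wash) below the melting point. The function names and the example melting point are hypothetical; no formula for estimating the melting point itself is given in this description, so it is taken as an input.

```python
# Composition of the 20-fold SSC stock as stated above, in mol/L.
SSC_20X = {"NaCl": 3.0, "sodium citrate": 0.3}


def ssc_molarity(fold: float) -> dict:
    """Molar concentrations of a diluted SSC buffer, e.g. fold=6 or fold=0.2."""
    return {solute: conc * fold / 20.0 for solute, conc in SSC_20X.items()}


def stringent_temperatures(tm_celsius: float) -> dict:
    """Approximate hybridization and wash temperatures from a given melting point."""
    return {
        "hybridization": tm_celsius - 25.0,  # about 25 degrees C below Tm
        "wash": tm_celsius - 5.0,            # about 5 degrees C below Tm
    }


if __name__ == "__main__":
    print(ssc_molarity(6))    # hybridization buffer: about 0.9 M NaCl, 0.09 M citrate
    print(ssc_molarity(0.2))  # wash buffer: about 0.03 M NaCl, 0.003 M citrate
    print(stringent_temperatures(80.0))  # 80.0 is an arbitrary example value
```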
(24) The nucleic acids (also referred to as polynucleotides) of the present invention may include both sense and antisense strands of RNA, cDNA, genomic DNA, and synthetic forms and mixed polymers of the above. They may be modified chemically or biochemically or may contain non-natural or derivatized nucleotide bases, as will be readily appreciated by those of skill in the art. Such modifications include, for example, labels, methylation, substitution of one or more of the naturally occurring nucleotides with an analog, internucleotide modifications such as uncharged linkages (e.g., methyl phosphonates, phosphotriesters, phosphoramidates, carbamates, etc.), charged linkages (e.g., phosphorothioates, phosphorodithioates, etc.), pendent moieties (e.g., polypeptides), intercalators (e.g., acridine, psoralen, etc.), chelators, alkylators, and modified linkages (e.g., alpha anomeric nucleic acids, etc.). Also included are synthetic molecules that mimic polynucleotides in their ability to bind to a designated sequence via hydrogen bonding and other chemical interactions. Such molecules are known in the art and include, for example, those in which peptide linkages substitute for phosphate linkages in the backbone of the molecule. Other modifications can include, for example, analogs in which the ribose ring contains a bridging moiety or other structure such as the modifications found in locked nucleic acids.
(25) The term mutated when applied to nucleic acid sequences means that nucleotides in a nucleic acid sequence may be inserted, deleted or changed compared to a reference nucleic acid sequence. A single alteration may be made at a locus (a point mutation) or multiple nucleotides may be inserted, deleted or changed at a single locus. In addition, one or more alterations may be made at any number of loci within a nucleic acid sequence. A nucleic acid sequence may be mutated by any method known in the art including but not limited to mutagenesis techniques such as error-prone PCR (a process for performing PCR under conditions where the copying fidelity of the DNA polymerase is low, such that a high rate of point mutations is obtained along the entire length of the PCR product; see, e.g., Leung et al., Technique, 1:11-15 (1989) and Caldwell and Joyce, PCR Methods Applic. 2:28-33 (1992)); and oligonucleotide-directed mutagenesis (a process which enables the generation of site-specific mutations in any cloned DNA segment of interest; see, e.g., Reidhaar-Olson and Sauer, Science 241:53-57 (1988)).
(26) The term attenuate as used herein generally refers to a functional deletion, including a mutation, partial or complete deletion, insertion, or other variation made to a gene sequence or a sequence controlling the transcription of a gene sequence, which reduces or inhibits production of the gene product, or renders the gene product non-functional. In some instances a functional deletion is described as a knockout mutation. Attenuation also includes amino acid sequence changes by altering the nucleic acid sequence, placing the gene under the control of a less active promoter, down-regulation, expressing interfering RNA, ribozymes or antisense sequences that target the gene of interest, or through any other technique known in the art. In one example, the sensitivity of a particular enzyme to feedback inhibition or inhibition caused by a composition that is not a product or a reactant (non-pathway specific feedback) is lessened such that the enzyme activity is not impacted by the presence of a compound. In other instances, an enzyme that has been altered to be less active can be referred to as attenuated.
(27) The term deletion as used herein refers to the removal of one or more nucleotides from a nucleic acid molecule or one or more amino acids from a protein, the regions on either side being joined together.
(28) The term knock-out as used herein is intended to refer to a gene whose level of expression or activity has been reduced to zero. In some examples, a gene is knocked-out via deletion of some or all of its coding sequence. In other examples, a gene is knocked-out via introduction of one or more nucleotides into its open reading frame, which results in translation of a non-sense or otherwise non-functional protein product.
(29) The term vector as used herein is intended to refer to a nucleic acid molecule capable of transporting another nucleic acid to which it has been linked. One type of vector is a plasmid, which generally refers to a circular double stranded DNA loop into which additional DNA segments may be ligated, but also includes linear double-stranded molecules such as those resulting from amplification by the polymerase chain reaction (PCR) or from treatment of a circular plasmid with a restriction enzyme. Other vectors include cosmids, bacterial artificial chromosomes (BAC) and yeast artificial chromosomes (YAC). Another type of vector is a viral vector, wherein additional DNA segments may be ligated into the viral genome (discussed in more detail below). Certain vectors are capable of autonomous replication in a host cell into which they are introduced (e.g., vectors having an origin of replication which functions in the host cell). Other vectors can be integrated into the genome of a host cell upon introduction into the host cell, and are thereby replicated along with the host genome. Moreover, certain preferred vectors are capable of directing the expression of genes to which they are operatively linked. Such vectors are referred to herein as recombinant expression vectors (or simply expression vectors).
(30) "Operatively linked" or "operably linked" expression control sequences refers to a linkage in which the expression control sequence is contiguous with the gene of interest to control the gene of interest, as well as expression control sequences that act in trans or at a distance to control the gene of interest.
(31) The term expression control sequence refers to polynucleotide sequences which are necessary to affect the expression of coding sequences to which they are operatively linked. Expression control sequences are sequences which control the transcription, post-transcriptional events and translation of nucleic acid sequences. Expression control sequences include appropriate transcription initiation, termination, promoter and enhancer sequences; efficient RNA processing signals such as splicing and polyadenylation signals; sequences that stabilize cytoplasmic mRNA; sequences that enhance translation efficiency (e.g., ribosome binding sites); sequences that enhance protein stability; and when desired, sequences that enhance protein secretion. The nature of such control sequences differs depending upon the host organism; in prokaryotes, such control sequences generally include promoter, ribosomal binding site, and transcription termination sequence. The term control sequences is intended to include, at a minimum, all components whose presence is essential for expression, and can also include additional components whose presence is advantageous, for example, leader sequences and fusion partner sequences.
(32) The term regulatory element refers to any element which affects transcription or translation of a nucleic acid molecule. These include, by way of example but not limitation: regulatory proteins (e.g., transcription factors), chaperones, signaling proteins, RNAi molecules, antisense RNA molecules, microRNAs and RNA aptamers. Regulatory elements may be endogenous to the host organism. Regulatory elements may also be exogenous to the host organism. Regulatory elements may be synthetically generated regulatory elements.
(33) The term promoter, promoter element, or promoter sequence as used herein, refers to a DNA sequence which when ligated to a nucleotide sequence of interest is capable of controlling the transcription of the nucleotide sequence of interest into mRNA. A promoter is typically, though not necessarily, located 5′ (i.e., upstream) of a nucleotide sequence of interest whose transcription into mRNA it controls, and provides a site for specific binding by RNA polymerase and other transcription factors for initiation of transcription. Promoters may be endogenous to the host organism. Promoters may also be exogenous to the host organism. Promoters may be synthetically generated regulatory elements.
(34) Promoters useful for expressing the recombinant genes described herein include both constitutive and inducible/repressible promoters. Where multiple recombinant genes are expressed in an engineered organism of the invention, the different genes can be controlled by different promoters or by identical promoters in separate operons, or the expression of two or more genes may be controlled by a single promoter as part of an operon.
(35) The term recombinant host cell (or simply host cell), as used herein, is intended to refer to a cell into which a recombinant vector has been introduced. It should be understood that such terms are intended to refer not only to the particular subject cell but to the progeny of such a cell. Because certain modifications may occur in succeeding generations due to either mutation or environmental influences, such progeny may not, in fact, be identical to the parent cell, but are still included within the scope of the term host cell as used herein. A recombinant host cell may be an isolated cell or cell line grown in culture or may be a cell which resides in a living tissue or organism.
(36) The term peptide as used herein refers to a short polypeptide, e.g., one that is typically less than about 50 amino acids long and more typically less than about 30 amino acids long. The term as used herein encompasses analogs and mimetics that mimic structural and thus biological function.
(37) The term polypeptide encompasses both naturally-occurring and non-naturally-occurring proteins, and fragments, mutants, derivatives and analogs thereof. A polypeptide may be monomeric or polymeric. Further, a polypeptide may comprise a number of different domains each of which has one or more distinct activities.
(38) The term isolated protein or isolated polypeptide is a protein or polypeptide that by virtue of its origin or source of derivation (1) is not associated with naturally associated components that accompany it in its native state, (2) exists in a purity not found in nature, where purity can be adjudged with respect to the presence of other cellular material (e.g., is free of other proteins from the same species) (3) is expressed by a cell from a different species, or (4) does not occur in nature (e.g., it is a fragment of a polypeptide found in nature or it includes amino acid analogs or derivatives not found in nature or linkages other than standard peptide bonds). Thus, a polypeptide that is chemically synthesized or synthesized in a cellular system different from the cell from which it naturally originates will be isolated from its naturally associated components. A polypeptide or protein may also be rendered substantially free of naturally associated components by isolation, using protein purification techniques well known in the art. As thus defined, isolated does not necessarily require that the protein, polypeptide, peptide or oligopeptide so described has been physically removed from its native environment.
(39) The term polypeptide fragment refers to a polypeptide that has a deletion, e.g., an amino-terminal and/or carboxy-terminal deletion compared to a full-length polypeptide. In a preferred embodiment, the polypeptide fragment is a contiguous sequence in which the amino acid sequence of the fragment is identical to the corresponding positions in the naturally-occurring sequence. Fragments typically are at least 5, 6, 7, 8, 9 or 10 amino acids long, preferably at least 12, 14, 16 or 18 amino acids long, more preferably at least 20 amino acids long, more preferably at least 25, 30, 35, 40 or 45, amino acids, even more preferably at least 50 or 60 amino acids long, and even more preferably at least 70 amino acids long.
(40) A protein has homology or is homologous to a second protein if the nucleic acid sequence that encodes the protein has a similar sequence to the nucleic acid sequence that encodes the second protein. Alternatively, a protein has homology to a second protein if the two proteins have similar amino acid sequences. (Thus, the term homologous proteins is defined to mean that the two proteins have similar amino acid sequences.) As used herein, homology between two regions of amino acid sequence (especially with respect to predicted structural similarities) is interpreted as implying similarity in function.
(41) When homologous is used in reference to proteins or peptides, it is recognized that residue positions that are not identical often differ by conservative amino acid substitutions. A conservative amino acid substitution is one in which an amino acid residue is substituted by another amino acid residue having a side chain (R group) with similar chemical properties (e.g., charge or hydrophobicity). In general, a conservative amino acid substitution will not substantially change the functional properties of a protein. In cases where two or more amino acid sequences differ from each other by conservative substitutions, the percent sequence identity or degree of homology may be adjusted upwards to correct for the conservative nature of the substitution. Means for making this adjustment are well known to those of skill in the art. See, e.g., Pearson, 1994, Methods Mol. Biol. 24:307-31 and 25:365-89 (herein incorporated by reference).
(42) The twenty conventional amino acids and their abbreviations follow conventional usage. See Immunology-A Synthesis (Golub and Gren eds., Sinauer Associates, Sunderland, Mass., 2nd ed. 1991), which is incorporated herein by reference. Stereoisomers (e.g., D-amino acids) of the twenty conventional amino acids, unnatural amino acids such as α,α-disubstituted amino acids, N-alkyl amino acids, and other unconventional amino acids may also be suitable components for polypeptides of the present invention. Examples of unconventional amino acids include: 4-hydroxyproline, γ-carboxyglutamate, ε-N,N,N-trimethyllysine, ε-N-acetyllysine, O-phosphoserine, N-acetylserine, N-formylmethionine, 3-methylhistidine, 5-hydroxylysine, N-methylarginine, and other similar amino acids and imino acids (e.g., 4-hydroxyproline). In the polypeptide notation used herein, the left-hand end corresponds to the amino terminal end and the right-hand end corresponds to the carboxy-terminal end, in accordance with standard usage and convention.
(43) The following six groups each contain amino acids that are conservative substitutions for one another: 1) Serine (S), Threonine (T); 2) Aspartic Acid (D), Glutamic Acid (E); 3) Asparagine (N), Glutamine (Q); 4) Arginine (R), Lysine (K); 5) Isoleucine (I), Leucine (L), Methionine (M), Alanine (A), Valine (V), and 6) Phenylalanine (F), Tyrosine (Y), Tryptophan (W).
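The six groups listed above can be encoded directly, as in the following minimal Python sketch, which checks whether replacing one residue with another counts as a conservative substitution under this grouping. The function name is hypothetical. Residues that appear in none of the six groups (for example G, P, C and H) are treated as having no conservative substitute, which is an assumption of the sketch rather than a statement of the description.

```python
# The six conservative-substitution groups, as one-letter codes.
CONSERVATIVE_GROUPS = (
    frozenset("ST"),     # 1) Serine, Threonine
    frozenset("DE"),     # 2) Aspartic Acid, Glutamic Acid
    frozenset("NQ"),     # 3) Asparagine, Glutamine
    frozenset("RK"),     # 4) Arginine, Lysine
    frozenset("ILMAV"),  # 5) Isoleucine, Leucine, Methionine, Alanine, Valine
    frozenset("FYW"),    # 6) Phenylalanine, Tyrosine, Tryptophan
)


def is_conservative_substitution(original: str, replacement: str) -> bool:
    """True if two different residues belong to the same group."""
    a, b = original.upper(), replacement.upper()
    if a == b:
        return False  # identical residues are not a substitution
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)


if __name__ == "__main__":
    print(is_conservative_substitution("I", "V"))  # same group (5)
    print(is_conservative_substitution("S", "D"))  # different groups
```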
(44) Sequence homology for polypeptides, which is sometimes also referred to as percent sequence identity, is typically measured using sequence analysis software. See, e.g., the Sequence Analysis Software Package of the Genetics Computer Group (GCG), University of Wisconsin Biotechnology Center, 910 University Avenue, Madison, Wis. 53705. Protein analysis software matches similar sequences using a measure of homology assigned to various substitutions, deletions and other modifications, including conservative amino acid substitutions. For instance, GCG contains programs such as Gap and Bestfit which can be used with default parameters to determine sequence homology or sequence identity between closely related polypeptides, such as homologous polypeptides from different species of organisms or between a wild-type protein and a mutein thereof. See, e.g., GCG Version 6.1.
(45) A useful algorithm when comparing a particular polypeptide sequence to a database containing a large number of sequences from different organisms is the computer program BLAST (Altschul et al., J. Mol. Biol. 215:403-410 (1990); Gish and States, Nature Genet. 3:266-272 (1993); Madden et al., Meth. Enzymol. 266:131-141 (1996); Altschul et al., Nucleic Acids Res. 25:3389-3402 (1997); Zhang and Madden, Genome Res. 7:649-656 (1997)), especially blastp or tblastn (Altschul et al., Nucleic Acids Res. 25:3389-3402 (1997)).
(46) Preferred parameters for BLASTp are: Expectation value: 10 (default); Filter: seg (default); Cost to open a gap: 11 (default); Cost to extend a gap: 1 (default); Max. alignments: 100 (default); Word size: 11 (default); No. of descriptions: 100 (default); Penalty Matrix: BLOSUM62.
(47) The length of polypeptide sequences compared for homology will generally be at least about 16 amino acid residues, usually at least about 20 residues, more usually at least about 24 residues, typically at least about 28 residues, and preferably more than about 35 residues. When searching a database containing sequences from a large number of different organisms, it is preferable to compare amino acid sequences. Database searching using amino acid sequences can be performed with algorithms other than blastp known in the art. For instance, polypeptide sequences can be compared using FASTA, a program in GCG Version 6.1. FASTA provides alignments and percent sequence identity of the regions of the best overlap between the query and search sequences. See Pearson, Methods Enzymol. 183:63-98 (1990) (incorporated by reference herein). For example, percent sequence identity between amino acid sequences can be determined using FASTA with its default parameters (a word size of 2 and the PAM250 scoring matrix), as provided in GCG Version 6.1, herein incorporated by reference.
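For orientation, percent sequence identity over an alignment that has already been produced (e.g., by one of the programs named above) can be computed as sketched below in Python. This is a minimal illustration and is not the BLAST, Gap, Bestfit, or FASTA algorithm. The function name percent_identity is hypothetical, and the choice of denominator (all alignment columns except those where both sequences have a gap) is an assumption, since different programs define the denominator differently.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity between two pre-aligned, equal-length sequences.

    Assumption: identity = identical residue columns / alignment columns,
    where columns that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    columns = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap and y == gap:
            continue  # column carries no information
        columns += 1
        if x == y and x != gap:
            matches += 1
    if columns == 0:
        raise ValueError("no aligned columns to compare")
    return 100.0 * matches / columns
```

Under these assumptions, a polypeptide would meet an "at least 95% identical" threshold when percent_identity returns a value of 95.0 or greater for its alignment against the reference sequence.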
(48) Throughout this specification and claims, the word "comprise" or variations such as "comprises" or "comprising" will be understood to imply the inclusion of a stated integer or group of integers but not the exclusion of any other integer or group of integers.
(49) Exemplary methods and materials are described below, although methods and materials similar or equivalent to those described herein can also be used in the practice of the present invention and will be apparent to those of skill in the art. All publications and other references mentioned herein are incorporated by reference in their entirety. In case of conflict, the present specification, including definitions, will control. The materials, methods, and examples are illustrative only and not intended to be limiting.
(50) Overview
(51) Provided herein are recombinant strains and methods of producing recombinant strains to increase production of a full-length desired product in a target cell, e.g., by reducing protease degradation.
(52) In some embodiments, to attenuate a protease activity in Pichia pastoris, the genes encoding these enzymes are inactivated or mutated to reduce or eliminate activity. This can be done through mutations or insertions into the gene itself or through modification of a gene regulatory element. This can be achieved through standard yeast genetics techniques. Examples of such techniques include gene replacement through double homologous recombination, in which homologous regions flanking the gene to be inactivated are cloned in a vector flanking a selectable marker gene (such as an antibiotic resistance gene or a gene complementing an auxotrophy of the yeast strain).
(53) Alternatively, the homologous regions can be PCR-amplified and linked through overlapping PCR to the selectable marker gene. Subsequently, such DNA fragments are transformed into Pichia pastoris through methods known in the art, e.g., electroporation. Transformants that then grow under selective conditions are analyzed for the gene disruption event through standard techniques, e.g., PCR on genomic DNA or Southern blot. In an alternative experiment, gene inactivation can be achieved through single homologous recombination, in which case, e.g., the 5′ end of the gene's ORF is cloned on a promoterless vector also containing a selectable marker gene. Upon linearization of such vector through digestion with a restriction enzyme only cutting the vector in the target-gene homologous fragment, such vector is transformed into Pichia pastoris. Integration at the target gene site is confirmed through PCR on genomic DNA or Southern blot. In this way, a duplication of the gene fragment cloned on the vector is achieved in the genome, resulting in two copies of the target gene locus: a first copy in which the ORF is incomplete, thus resulting in the expression (if at all) of a shortened, inactive protein, and a second copy which has no promoter to drive transcription.
(54) Alternatively, transposon mutagenesis is used to inactivate the target gene. A library of such mutants can be screened through PCR for insertion events in the target gene.
(55) The functional phenotype (i.e., deficiencies) of an engineered/knockout strain can be assessed using techniques known in the art. For example, a deficiency of an engineered strain in protease activity can be ascertained using any of a variety of methods known in the art, such as an assay of hydrolytic activity of chromogenic protease substrates, band shifts of substrate proteins for the selected protease, among others.
(56) Attenuation of a protease activity described herein can be achieved through mechanisms other than a knockout mutation. For example, a desired protease can be attenuated via amino acid sequence changes by altering the nucleic acid sequence, placing the gene under the control of a less active promoter, down-regulation, expressing interfering RNA, ribozymes or antisense sequences that target the gene of interest, or through any other technique known in the art. In preferred strains, the protease activity of proteases encoded at PAS_chr4_0584 (YPS1-1) and PAS_chr3_1157 (YPS1-2) (e.g., polypeptides comprising SEQ ID NO: 67 and 68) is attenuated by any of the methods described above. In some aspects, the invention is directed to methylotrophic yeast strains, especially Pichia pastoris strains, wherein a YPS1-1 and a YPS1-2 gene (e.g., as set forth in SEQ ID NO: 1 and SEQ ID NO: 2) have been inactivated. In some embodiments, additional protease encoding genes may also be knocked out in accordance with the methods provided herein to further reduce protease degradation of a desired protein product expressed by the strain.
(57) Production of Recombinant Strains
(58) Provided herein are methods of transforming a strain to reduce protease activity, e.g., using vectors to deliver recombinant genes or to knock out or otherwise attenuate endogenous genes as desired. These vectors can take the form of a vector backbone containing a replication origin and a selection marker (typically antibiotic resistance, although many other methods are possible), or a linear fragment that enables incorporation into the target cell's chromosome. The vectors should correspond to the organism and insertion method chosen.
(59) Once the elements of a vector are selected, construction of the vector can be performed in many different ways. In an embodiment, a DNA synthesis service or a method to individually make every vector may be used.
(60) Once the DNA for each vector (including the additional elements required for insertion and operation) is acquired, it must be assembled. There are many possible assembly methods including (but not limited to) restriction enzyme cloning, blunt-end ligation, and overlap assembly [see, e.g., Gibson, D. G., et al., Enzymatic assembly of DNA molecules up to several hundred kilobases. Nature methods, 6(5), 343-345 (2009), and GeneArt Kit (http://tools.invitrogen.com/content/sfs/manuals/geneart_seamless_cloning_and_assembly_man.pdf)]. Overlap assembly provides a method to ensure all of the elements get assembled in the correct position and do not introduce any undesired sequences.
(61) The vectors generated above can be inserted into target cells using standard molecular biology techniques, e.g., molecular cloning. In an embodiment, the target cells are already engineered or selected such that they already contain the genes required to make the desired product, although this may also be done during or after further vector insertion.
(62) Depending on the organism and library element type (plasmid or genomic insertion), several known methods of inserting the vector comprising DNA to incorporate into the cells may be used. These may include, for example, transformation of microorganisms able to take up and replicate DNA from the local environment, transformation by electroporation or chemical means, transduction with a virus or phage, mating of two or more cells, or conjugation from a different cell.
(63) Several methods are known in the art to introduce recombinant DNA in bacterial cells that include but are not limited to transformation, transduction, and electroporation, see Sambrook, et al., Molecular Cloning: A Laboratory Manual (1989), Second Edition, Cold Spring Harbor Press, Plainview, N.Y. Non-limiting examples of commercial kits and bacterial host cells for transformation include NovaBlue Singles (EMD Chemicals Inc., NJ, USA), Max Efficiency DH5α, One Shot BL21 (DE3) E. coli cells, One Shot BL21 (DE3) pLys E. coli cells (Invitrogen Corp., Carlsbad, Calif., USA), XL1-Blue competent cells (Stratagene, CA, USA). Non-limiting examples of commercial kits and bacterial host cells for electroporation include Zappers electrocompetent cells (EMD Chemicals Inc., NJ, USA), XL1-Blue Electroporation-competent cells (Stratagene, CA, USA), ElectroMAX A. tumefaciens LBA4404 Cells (Invitrogen Corp., Carlsbad, Calif., USA).
(64) Several methods are known in the art to introduce recombinant nucleic acid in eukaryotic cells. Exemplary methods include transfection, electroporation, liposome mediated delivery of nucleic acid, and microinjection into the host cell, see Sambrook, et al., Molecular Cloning: A Laboratory Manual (1989), Second Edition, Cold Spring Harbor Press, Plainview, N.Y. Non-limiting examples of commercial kits and reagents for transfection of recombinant nucleic acid to eukaryotic cells include Lipofectamine 2000, Optifect Reagent, Calcium Phosphate Transfection Kit (Invitrogen Corp., Carlsbad, Calif., USA), GeneJammer Transfection Reagent, LipoTAXI Transfection Reagent (Stratagene, CA, USA). Alternatively, recombinant nucleic acid may be introduced into insect cells (e.g., sf9, sf21, High Five) by using baculoviral vectors.
(65) Transformed cells are isolated so that each clone can be tested separately. In an embodiment, this is done by spreading the culture on one or more plates of culture media containing a selective agent (or lack of one) that will ensure that only transformed cells survive and reproduce. This specific agent may be an antibiotic (if the library contains an antibiotic resistance marker), a missing metabolite (for auxotroph complementation), or other means of selection. The cells are grown into individual colonies, each of which contains a single clone.
(66) Colonies are screened for desired production of a protein, metabolite, or other product, or for reduction in protease activity. In an embodiment, screening identifies recombinant cells having the highest (or high enough) product production titer or efficiency. This includes a decreased proportion of degradation products or an increased total amount of full-length desired polypeptides collected from a cell culture.
(67) This assay can be performed by growing individual clones, one per well, in multi-well culture plates. Once the cells have reached an appropriate biomass density, they are induced with methanol. After a period of time, typically 24-72 hours of induction, the cultures are harvested by spinning in a centrifuge to pellet the cells and removing the supernatant. The supernatant from each culture can then be tested for protease activity and/or protein degradation.
(68) Silk Sequences
(69) In some embodiments, the modified strains with reduced protease activity described herein recombinantly express a silk-like polypeptide sequence. In some embodiments, the silk-like polypeptide sequences are 1) block copolymer polypeptide compositions generated by mixing and matching repeat domains derived from silk polypeptide sequences and/or 2) recombinant expression of block copolymer polypeptides having sufficiently large size (approximately 40 kDa) to form useful fibers by secretion from an industrially scalable microorganism. Large (approximately 40 kDa to approximately 100 kDa) block copolymer polypeptides engineered from silk repeat domain fragments, including sequences from almost all published amino acid sequences of spider silk polypeptides, can be expressed in the modified microorganisms described herein. In some embodiments, silk polypeptide sequences are matched and designed to produce highly expressed and secreted polypeptides capable of fiber formation. In some embodiments, knock-out of protease genes or reduction of protease activity in the host modified strain reduces degradation of the silk-like polypeptides.
(70) Provided herein, in several embodiments, are compositions for expression and secretion of block copolymers engineered from a combinatorial mix of silk polypeptide domains across the silk polypeptide sequence space, wherein the block copolymers have minimal degradation. In some embodiments provided herein are methods of secreting block copolymers in scalable organisms (e.g., yeast, fungi, and gram positive bacteria) with minimal degradation. In some embodiments, the block copolymer polypeptide comprises 0 or more N-terminal domains (NTD), 1 or more repeat domains (REP), and 0 or more C-terminal domains (CTD). In some aspects of the embodiment, the block copolymer polypeptide is >100 amino acids of a single polypeptide chain. In some embodiments, the block copolymer polypeptide comprises a domain that is at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical to a sequence of a block copolymer polypeptide as disclosed in International Publication No. WO/2015/042164, Methods and Compositions for Synthesizing Improved Silk Fibers, incorporated by reference in its entirety.
(71) Several types of native spider silks have been identified. The mechanical properties of each natively spun silk type are believed to be closely connected to the molecular composition of that silk. See, e.g., Garb, J. E., et al., Untangling spider silk evolution with spidroin terminal domains, BMC Evol. Biol., 10:243 (2010); Bittencourt, D., et al., Protein families, natural history and biotechnological aspects of spider silk, Genet. Mol. Res., 11:3 (2012); Rising, A., et al., Spider silk proteins: recent advances in recombinant production, structure-function relationships and biomedical applications, Cell. Mol. Life Sci., 68:2, pg. 169-184 (2011); and Humenik, M., et al., Spider silk: understanding the structure-function relationship of a natural fiber, Prog. Mol. Biol. Transl. Sci., 103, pg. 131-85 (2011). For example:
(72) Aciniform (AcSp) silks tend to have high toughness, a result of moderately high strength coupled with moderately high extensibility. AcSp silks are characterized by large block (ensemble repeat) sizes that often incorporate motifs of poly serine and GPX.
Tubuliform (TuSp or Cylindrical) silks tend to have large diameters, with modest strength and high extensibility. TuSp silks are characterized by their poly serine and poly threonine content, and short tracts of poly alanine.
Major Ampullate (MaSp) silks tend to have high strength and modest extensibility. MaSp silks can be one of two subtypes: MaSp1 and MaSp2. MaSp1 silks are generally less extensible than MaSp2 silks, and are characterized by poly alanine, GX, and GGX motifs. MaSp2 silks are characterized by poly alanine, GGX, and GPX motifs.
Minor Ampullate (MiSp) silks tend to have modest strength and modest extensibility. MiSp silks are characterized by GGX, GA, and poly A motifs, and often contain spacer elements of approximately 100 amino acids.
Flagelliform (Flag) silks tend to have very high extensibility and modest strength. Flag silks are usually characterized by GPG, GGX, and short spacer motifs.
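As a rough illustration of how the motifs named above (e.g., GGX, GPX, GPG, and poly alanine) might be tallied in a candidate block sequence, the following Python sketch counts motif occurrences and the longest run of a given residue. It is an assumption-laden example rather than a method of the disclosure: the function names are hypothetical, "X" is treated as any single residue, and overlapping occurrences are counted.

```python
import re


def count_motif(sequence: str, motif: str) -> int:
    """Count occurrences of a motif, where 'X' in the motif matches any residue.

    Overlapping occurrences are counted (an assumption of this sketch).
    """
    pattern = "".join("." if ch == "X" else re.escape(ch) for ch in motif.upper())
    # A zero-width lookahead yields one match per start position.
    return len(re.findall(f"(?={pattern})", sequence.upper()))


def longest_run(sequence: str, residue: str) -> int:
    """Length of the longest uninterrupted run of a residue (e.g., 'A' for poly alanine)."""
    best = current = 0
    target = residue.upper()
    for ch in sequence.upper():
        current = current + 1 if ch == target else 0
        best = max(best, current)
    return best
```

Such counts give only a crude compositional profile; they do not by themselves assign a sequence to a silk type.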
(73) The properties of each silk type can vary from species to species, and spiders leading distinct lifestyles (e.g., sedentary web spinners vs. vagabond hunters) or that are evolutionarily older may produce silks that differ in properties from the above descriptions (for descriptions of spider diversity and classification, see Hormiga, G., and Griswold, C. E., Systematics, phylogeny, and evolution of orb-weaving spiders, Annu. Rev. Entomol. 59, pg. 487-512 (2014); and Blackledge, T. A. et al., Reconstructing web evolution and spider diversification in the molecular era, Proc. Natl. Acad. Sci. U.S.A., 106:13, pg. 5229-5234 (2009)). However, synthetic block copolymer polypeptides having sequence similarity and/or amino acid composition similarity to the repeat domains of native silk proteins can be used to manufacture on commercial scales consistent silk-like fibers that recapitulate the properties of corresponding natural silk fibers.
(74) In some embodiments, a list of putative silk sequences can be compiled by searching GenBank for relevant terms, e.g., "spidroin," "fibroin," and "MaSp," and those sequences can be pooled with additional sequences obtained through independent sequencing efforts. Sequences are then translated into amino acids, filtered for duplicate entries, and manually split into domains (NTD, REP, CTD). In some embodiments, candidate amino acid sequences are reverse translated into a DNA sequence optimized for expression in Pichia (Komagataella) pastoris. The DNA sequences are each cloned into an expression vector and transformed into Pichia (Komagataella) pastoris. In some embodiments, various silk domains demonstrating successful expression and secretion are subsequently assembled in combinatorial fashion to build silk molecules capable of fiber formation.
(75) Silk polypeptides are characteristically composed of a repeat domain (REP) flanked by non-repetitive regions (e.g., C-terminal and N-terminal domains). In an embodiment, both the C-terminal and N-terminal domains are between 75-350 amino acids in length. The repeat domain exhibits a hierarchical architecture. The repeat domain comprises a series of blocks (also called repeat units). The blocks are repeated, sometimes perfectly and sometimes imperfectly (making up a quasi-repeat domain), throughout the silk repeat domain. The length and composition of blocks varies among different silk types and across different species. Table 1 lists examples of block sequences from selected species and silk types, with further examples presented in Rising, A. et al., Spider silk proteins: recent advances in recombinant production, structure-function relationships and biomedical applications, Cell. Mol. Life Sci., 68:2, pg. 169-184 (2011); and Gatesy, J. et al., Extreme diversity, conservation, and convergence of spider silk fibroin sequences, Science, 291:5513, pg. 2603-2605 (2001). In some cases, blocks may be arranged in a regular pattern, forming larger macro-repeats that appear multiple times (usually 2-8) in the repeat domain of the silk sequence. Repeated blocks inside a repeat domain or macro-repeat, and repeated macro-repeats within the repeat domain, may be separated by spacing elements. In some embodiments, block sequences comprise a glycine rich region followed by a polyA region. In some embodiments, short (approximately 1-10) amino acid motifs appear multiple times inside of blocks. For the purpose of this invention, blocks from different natural silk polypeptides can be selected without reference to circular permutation (i.e., identified blocks that are otherwise similar between silk polypeptides may not align due to circular permutation). 
Thus, for example, a block of SGAGG (SEQ ID NO: 494) is, for the purposes of the present invention, the same as GSGAG (SEQ ID NO: 495) and the same as GGSGA (SEQ ID NO: 496); they are all just circular permutations of each other. The particular permutation selected for a given silk sequence can be dictated by convenience (usually starting with a G) more than anything else. Silk sequences obtained from the NCBI database can be partitioned into blocks and non-repetitive regions.
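The equivalence of circularly permuted blocks described above can be checked with a short routine. The following Python sketch is illustrative only; the function names are hypothetical, and the tie-breaking rule in canonical_rotation (prefer rotations starting with G, then take the alphabetically first) is an assumption made so that every permutation of a block maps to one representative.

```python
def is_circular_permutation(block_a: str, block_b: str) -> bool:
    """Return True if block_b is a rotation of block_a (e.g., SGAGG vs. GSGAG)."""
    a, b = block_a.upper(), block_b.upper()
    # Every rotation of a appears as a substring of a concatenated with itself.
    return len(a) == len(b) and b in (a + a)


def canonical_rotation(block: str, preferred_start: str = "G") -> str:
    """Choose one representative rotation of a block.

    Assumption: prefer rotations beginning with preferred_start (the text notes
    blocks usually start with a G), breaking ties alphabetically.
    """
    seq = block.upper()
    if not seq:
        return seq
    rotations = [seq[i:] + seq[:i] for i in range(len(seq))]
    preferred = [r for r in rotations if r.startswith(preferred_start)]
    return min(preferred or rotations)
```

With this sketch, SGAGG, GSGAG, and GGSGA are all recognized as permutations of one another and all map to the same representative rotation.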
(76) TABLE 1: Samples of Block Sequences
Species | Silk Type | Representative Block Amino Acid Sequence
Aliatypus gulosus | Fibroin 1 | GAASSSSTIITTKSASASAAADASAAATASAASRSSANAAASAFAQSFSSILLESGYFCSIFGSSISSSYAAAIASAASRAAAESNGYTTHAYACAKAVASAVERVTSGADAYAYAQAISDALSHALLYTGRLNTANANSLASAFAYAFANAAAQASASSASAGAASASGAASASGAGSAS (SEQ ID NO: 497)
Plectreurys tristis | Fibroin 1 | GAGAGAGAGAGAGAGAGSGASTSVSTSSSSGSGAGAGAGSGAGSGAGAGSGAGAGAGAGGAGAGFGSGLGLGYGVGLSSAQAQAQAQAAAQAQAQAQAQAYAAAQAQAQAQAQAQAAAAAAAAAAA (SEQ ID NO: 498)
Plectreurys tristis | Fibroin 4 | GAAQKQPSGESSVATASAAATSVTSGGAPVGKPGVPAPIFYPQGPLQQGPAPGPSNVQPGTSQQGPIGGVGGSNAFSSSFASALSLNRGFTEVISSASATAVASAFQKGLAPYGTAFALSAASAAADAYNSIGSGANAFAYAQAFARVLYPLVQQYGLSSSAKASAFASAIASSFSSGTSGQGPSIGQQQPPVTISAASASAGASAAAVGGGQVGQGPYGGQQQSTAASASAAAATATS (SEQ ID NO: 499)
Araneus gemmoides | TuSp | GNVGYQLGLKVANSLGLGNAQALASSLSQAVSAVGVGASSNAYANAVSNAVGQVLAGQGILNAANAGSLASSFASALSSSAASVASQSASQSQAASQSQAAASAFRQAASQSASQSDSRAGSQSSTKTTSTSTSGSQADSRSASSSASQASASAFAQQSSASLSSSSSFSSAFSSATSISAV (SEQ ID NO: 500)
Argiope aurantia | TuSp | GSLASSFASALSASAASVASSAAAQAASQSQAAASAFSRAASQSASQSAARSGAQSISTTTTTSTAGSQAASQSASSAASQASASSFARASSASLAASSSFSSAFSSANSLSALGNVGYQLGFNVANNLGIGNAAGLGNALSQAVSSVGVGASSSTYANAVSNAVGQFLAGQGILNAANA (SEQ ID NO: 501)
Deinopis spinosa | TuSp | GASASAYASAISNAVGPYLYGLGLFNQANAASFASSFASAVSSAVASASASAASSAYAQSAAAQAQAASSAFSQAAAQSAAAASAGASAGAGASAGAGAVAGAGAVAGAGAVAGASAAAASQAAASSSASAVASAFAQSASYALASSSAFANAFASATSAGYLGSLAYQLGLTTAYNLGLSNAQAFASTLSQAVTGVGL (SEQ ID NO: 502)
Nephila clavipes | TuSp | GATAASYGNALSTAAAQFFATAGLLNAGNASALASSFARAFSASAESQSFAQSQAFQQASAFQQAASRSASQSAAEAGSTSSSTTTTTSAARSQAASQSASSSYSSAFAQAASSSLATSSALSRAFSSVSSASAASSLAYSIGLSAARSLGIADAAGLAGVLARAAGALGQ (SEQ ID NO: 503)
Argiope trifasciata | Flag | GGAPGGGPGGAGPGGAGFGPGGGAGFGPGGGAGFGPGGAAGGPGGPGGPGGPGGAGGYGPGGAGGYGPGGVGPGGAGGYGPGGAGGYGPGGSGPGGAGPGGAGGEGPVTVDVDVTVGPEGVGGGPGGAGPGGAGFGPGGGAGFGPGGAPGAPGGPGGPGGPGGPGGPGGVGPGGAGGYGPGGAGGVGPAGTGGFGPGGAGGFGPGGAGGFGPGGAGGFGPAGAGGYGPGGVGPGGAGGFGPGGVGPGGSGPGGAGGEGPVTVDVDVSV (SEQ ID NO: 504)
Nephila clavipes | Flag | GVSYGPGGAGGPYGPGGPYGPGGEGPGGAGGPYGPGGVGPGGSGPGGYGPGGAGPGGYGPGGSGPGGYGPGGSGPGGYGPGGSGPGGYGPGGSGPGGYGPGGYGPGGSGPGGSGPGGSGPGGYGPGGTGPGGSGPGGYGPGGSGPGGSGPGGYGPGGSGPGGFGPGGSGPGGYGPGGSGPGGAGPGGVGPGGFGPGGAGPGGAAPGGAGPGGAGPGGAGPGGAGPGGAGPGGAGPGGAGGAGGAGGSGGAGGSGGTTIIEDLDITIDGADGPITISEELPISGAGGSGPGGAGPGGVGPGGSGPGGVGPGGSGPGGVGPGGSGPGGVGPGGAGGPYGPGGSGPGGAGGAGGPGGAYGPGGSYGPGGSGGPGGAGGPYGPGGEGPGGAGGPYGPGGAGGPYGPGGAGGPYGPGGEGGPYGP (SEQ ID NO: 505)
Latrodectus hesperus | AcSp | GINVDSDIGSVTSLILSGSTLQMTIPAGGDDLSGGYPGGFPAGAQPSGGAPVDFGGPSAGGDVAAKLARSLASTLASSGVFRAAFNSRVSTPVAVQLTDALVQKIASNLGLDYATASKLRKASQAVSKVRMGSDTNAYALAISSALAEVLSSSGKVADANINQIAPQLASGIVLGVSTTAPQFGVDLSSINVNLDISNVARNMQASIQGGPAPITAEGPDFGAGYPGGAPTDLSGLDMGAPSDGSRGGDATAKLLQALVPALLKSDVFRAIYKRGTRKQVVQYVTNSALQQAASSLGLDASTISQLQTKATQALSSVSADSDSTAYAKAFGLAIAQVLGTSGQVNDANVNQIGAKLATGILRGSSAVAPRLGIDLS (SEQ ID NO: 506)
Argiope trifasciata | AcSp | GAGYTGPSGPSTGPSGYPGPLGGGAPFGQSGFGGSAGPQGGFGATGGASAGLISRVANALANTSTLRTVLRTGVSQQIASSVVQRAAQSLASTLGVDGNNLARFAVQAVSRLPAGSDTSAYAQAFSSALFNAGVLNASNIDTLGSRVLSALLNGVSSAAQGLGINVDSGSVQSDISSSSSFLSTSSSSASYSQASASSTS (SEQ ID NO: 507)
Uloborus diversus | AcSp | GASAADIATAIAASVATSLQSNGVLTASNVSQLSNQLASYVSSGLSSTASSLGIQLGASLGAGFGASAGLSASTDISSSVEATSASTLSSSASSTSVVSSINAQLVPALAQTAVLNAAFSNINTQNAIRIAELLTQQVGRQYGLSGSDVATASSQIRSALYSVQQGSASSAYVSAIVGPLITALSSRGVVNASNSSQIASSLATAILQFTANVAPQFGISIPTSAVQSDLSTISQSLTAISSQTSSSVDSSTSAFGGISGPSGPSPYGPQPSGPTFGPGPSLSGLTGFTATFASSFKSTLASSTQFQLIAQSNLDVQTRSSLISKVLINALSSLGISASVASSIAASSSQSLLSVSA (SEQ ID NO: 508)
Euprosthenops australis | MaSp1 | GGQGGQGQGRYGQGAGSSAAAAAAAAAAAAAA (SEQ ID NO: 509)
Tetragnatha kauaiensis | MaSp1 | GGLGGGQGAGQGGQQGAGQGGYGSGLGGAGQGASAAAAAAAA (SEQ ID NO: 510)
Argiope aurantia | MaSp2 | GGYGPGAGQQGPGSQGPGSGGQQGPGGLGPYGPSAAAAAAAA (SEQ ID NO: 511)
Deinopis spinosa | MaSp2 | GPGGYGGPGQQGPGQGQYGPGTGQQGQGPSGQQGPAGAAAAAAAAA (SEQ ID NO: 512)
Nephila clavata | MaSp2 | GPGGYGLGQQGPGQQGPGQQGPAGYGPSGLSGPGGAAAAAAA (SEQ ID NO: 513)
(77) The construction of fiber-forming block copolymer polypeptides from the blocks and/or macro-repeat domains, according to certain embodiments of the invention, is described in International Publication No. WO/2015/042164, incorporated by reference. Natural silk sequences obtained from a protein database such as GenBank or through de novo sequencing are broken up by domain (N-terminal domain, repeat domain, and C-terminal domain). The N-terminal domain and C-terminal domain sequences selected for the purpose of synthesis and assembly into fibers include natural amino acid sequence information and other modifications described herein. The repeat domain is decomposed into repeat sequences containing representative blocks, usually 1-8 depending upon the type of silk, that capture critical amino acid information while reducing the size of the DNA encoding the amino acids into a readily synthesizable fragment. In some embodiments, a properly formed block copolymer polypeptide comprises at least one repeat domain comprising at least 1 repeat sequence, and is optionally flanked by an N-terminal domain and/or a C-terminal domain.
(78) In some embodiments, a repeat domain comprises at least one repeat sequence. In some embodiments, the repeat sequence is 150-300 amino acid residues. In some embodiments, the repeat sequence comprises a plurality of blocks. In some embodiments, the repeat sequence comprises a plurality of macro-repeats. In some embodiments, a block or a macro-repeat is split across multiple repeat sequences.
(79) In some embodiments, the repeat sequence starts with a glycine (G), and cannot end with phenylalanine (F), tyrosine (Y), tryptophan (W), cysteine (C), histidine (H), asparagine (N), methionine (M), or aspartic acid (D) to satisfy DNA assembly requirements. In some embodiments, some of the repeat sequences can be altered as compared to native sequences.
(80) In some embodiments, the repeat sequences can be altered such as by addition of a serine to the C terminus of the polypeptide (to avoid terminating in F, Y, W, C, H, N, M, or D). In some embodiments, the repeat sequence can be modified by filling in an incomplete block with homologous sequence from another block. In some embodiments, the repeat sequence can be modified by rearranging the order of blocks or macro-repeats.
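The terminal-residue constraints and the serine addition described above lend themselves to a simple check. The Python sketch below is a minimal illustration under the stated rules (start with G; do not end in F, Y, W, C, H, N, M, or D); the function and constant names are hypothetical and not taken from the disclosure.

```python
# Residues a repeat sequence may not end with, per the DNA assembly requirements above.
FORBIDDEN_C_TERMINAL = set("FYWCHNMD")


def satisfies_assembly_rules(repeat_sequence: str) -> bool:
    """True if the sequence starts with glycine and ends with a permitted residue."""
    seq = repeat_sequence.upper()
    return bool(seq) and seq[0] == "G" and seq[-1] not in FORBIDDEN_C_TERMINAL


def append_serine_if_needed(repeat_sequence: str) -> str:
    """Add a C-terminal serine when the sequence ends in a forbidden residue."""
    seq = repeat_sequence.upper()
    if seq and seq[-1] in FORBIDDEN_C_TERMINAL:
        return seq + "S"
    return seq
```

For instance, a repeat sequence ending in tyrosine fails the check as written but passes once append_serine_if_needed has added the terminal serine, provided it also begins with glycine.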
(81) In some embodiments, non-repetitive N- and C-terminal domains can be selected for synthesis. In some embodiments, N-terminal domains can be modified by removal of the leading signal sequence, e.g., as identified by SignalP (Petersen, T. N., et al., SignalP 4.0: discriminating signal peptides from transmembrane regions, Nat. Methods, 8:10, pg. 785-786 (2011)).
(82) In some embodiments, the N-terminal domain, repeat sequence, or C-terminal domain sequences can be derived from Agelenopsis aperta, Aliatypus gulosus, Aphonopelma seemanni, Aptostichus sp. AS217, Aptostichus sp. AS220, Araneus diadematus, Araneus gemmoides, Araneus ventricosus, Argiope amoena, Argiope argentata, Argiope bruennichi, Argiope trifasciata, Atypoides riversi, Avicularia juruensis, Bothriocyrtum californicum, Deinopis spinosa, Diguetia canities, Dolomedes tenebrosus, Euagrus chisoseus, Euprosthenops australis, Gasteracantha mammosa, Hypochilus thorelli, Kukulcania hibernalis, Latrodectus hesperus, Megahexura fulva, Metepeira grandiosa, Nephila antipodiana, Nephila clavata, Nephila clavipes, Nephila madagascariensis, Nephila pilipes, Nephilengys cruentata, Parawixia bistriata, Peucetia viridans, Plectreurys tristis, Poecilotheria regalis, Tetragnatha kauaiensis, or Uloborus diversus.
(83) In some embodiments, the silk polypeptide nucleotide coding sequence can be operatively linked to an alpha mating factor nucleotide coding sequence. In some embodiments, the silk polypeptide nucleotide coding sequence can be operatively linked to another endogenous or heterologous secretion signal coding sequence. In some embodiments, the silk polypeptide nucleotide coding sequence can be operatively linked to a 3× FLAG nucleotide coding sequence. In some embodiments, the silk polypeptide nucleotide coding sequence is operatively linked to other affinity tags such as 6-8 His residues (SEQ ID NO: 520).
(84) Silk-Like Polypeptides
(85) In some embodiments, the P. pastoris strains disclosed herein have been modified to express a silk-like polypeptide. Methods of manufacturing preferred embodiments of silk-like polypeptides are provided in WO 2015/042164, especially at Paragraphs 114-134, incorporated herein by reference. Disclosed therein are synthetic proteinaceous copolymers based on recombinant spider silk protein fragment sequences derived from MaSp2, such as from the species Argiope bruennichi. Silk-like polypeptides are described that include two to twenty repeat units, in which a molecular weight of each repeat unit is greater than about 20 kDa. Within each repeat unit of the copolymer are more than about 60 amino acid residues that are organized into a number of quasi-repeat units. In some embodiments, the repeat unit of a polypeptide described in this disclosure has at least 95% sequence identity to a MaSp2 dragline silk protein sequence.
(86) In some embodiments, each repeat unit of a silk-like polypeptide comprises from two to twenty quasi-repeat units (i.e., $n_3$ is from 2 to 20). Quasi-repeats do not have to be exact repeats. Each repeat can be made up of concatenated quasi-repeats. Equation 1 shows the composition of a repeat unit according to the present disclosure and that incorporated by reference from WO 2015/042164. Each silk-like polypeptide can have one or more repeat units as defined by Equation 1.
$\{\text{GGY-}[\text{GPG-}X_1]_{n_1}\text{-GPS-}(\text{A})_{n_2}\}_{n_3}$ (SEQ ID NO: 514) (Equation 1)
(87) The variable compositional element $X_1$ (termed a motif) is any one of the amino acid sequences shown in Equation 2, and $X_1$ varies randomly within each quasi-repeat unit.
$X_1$ = SGGQQ (SEQ ID NO: 515) or GAGQQ (SEQ ID NO: 516) or GQGPY (SEQ ID NO: 517) or AGQQ (SEQ ID NO: 518) or SQ (Equation 2)
(88) Referring again to Equation 1, the compositional element of a quasi-repeat unit represented by $\text{GGY-}[\text{GPG-}X_1]_{n_1}\text{-GPS}$ (SEQ ID NO: 521) in Equation 1 is referred to as a first region. A quasi-repeat unit is formed, in part, by repeating the first region from 4 to 8 times within the quasi-repeat unit. That is, the value of $n_1$ indicates the number of first region units that are repeated within a single quasi-repeat unit, the value of $n_1$ being any one of 4, 5, 6, 7 or 8. The compositional element represented by $(\text{A})_{n_2}$ (SEQ ID NO: 522) (i.e., a polyA sequence) is referred to as a second region and is formed by repeating the amino acid A $n_2$ times within each quasi-repeat unit (SEQ ID NO: 522). That is, the value of $n_2$ indicates the number of second region units that are repeated within a single quasi-repeat unit, the value of $n_2$ being any one of 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20. In some embodiments, the repeat unit of a polypeptide of this disclosure has at least 95% sequence identity to a sequence containing quasi-repeats described by Equations 1 and 2. In some embodiments, the repeat unit of a polypeptide of this disclosure has at least 80%, or at least 90%, or at least 95%, or at least 99% sequence identity to a sequence containing quasi-repeats described by Equations 1 and 2.
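To make the composition in Equations 1 and 2 concrete, the following Python sketch assembles a quasi-repeat unit and a repeat unit from the stated elements and ranges (4 to 8 GPG-X1 segments, 6 to 20 alanines, 2 to 20 quasi-repeat units). It is an illustrative reading of the equations only; the function names are hypothetical, and the caller chooses the X1 motifs explicitly rather than randomly.

```python
# The five X1 motifs of Equation 2.
X1_MOTIFS = ("SGGQQ", "GAGQQ", "GQGPY", "AGQQ", "SQ")


def build_quasi_repeat(motifs: list, n2: int) -> str:
    """Build GGY-[GPG-X1]n1-GPS-(A)n2, with n1 given by the number of motifs."""
    n1 = len(motifs)
    if not 4 <= n1 <= 8:
        raise ValueError("n1 must be between 4 and 8")
    if not 6 <= n2 <= 20:
        raise ValueError("n2 must be between 6 and 20")
    for motif in motifs:
        if motif not in X1_MOTIFS:
            raise ValueError(f"unknown X1 motif: {motif}")
    first_region = "GGY" + "".join("GPG" + motif for motif in motifs) + "GPS"
    second_region = "A" * n2  # the polyA region
    return first_region + second_region


def build_repeat_unit(quasi_repeats: list) -> str:
    """Concatenate n3 quasi-repeat units (n3 from 2 to 20) into one repeat unit."""
    n3 = len(quasi_repeats)
    if not 2 <= n3 <= 20:
        raise ValueError("n3 must be between 2 and 20")
    return "".join(quasi_repeats)
```

For example, build_quasi_repeat(["SGGQQ", "GAGQQ", "GQGPY", "AGQQ"], 6) yields a 43-residue quasi-repeat unit that begins with GGY and ends with six alanines.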
(89) In additional embodiments, 3 long quasi-repeat units are followed by 3 short quasi-repeat units. Short quasi-repeat units are those in which n.sub.1=4 or 5. Long quasi-repeat units are defined as those in which n.sub.1=6, 7 or 8. In some embodiments, all of the short quasi-repeats have the same X.sub.1 motifs in the same positions within each quasi-repeat unit of a repeat unit. In some embodiments, no more than 3 quasi-repeat units out of 6 share the same X.sub.1 motifs.
(90) In additional embodiments, a repeat unit is composed of quasi-repeat units that do not use the same X.sub.1 more than two occurrences in a row within a repeat unit. In additional embodiments, a repeat unit is composed of quasi-repeat units where at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19 or 20 of the quasi-repeats do not use the same X.sub.1 more than 2 times in a single quasi-repeat unit of the repeat unit.
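The length classes and the run-length restriction described in the two preceding paragraphs can be checked mechanically. The following Python sketch is one possible reading and is not a method from the disclosure. The helper names are hypothetical, and "more than two occurrences in a row" is interpreted as a run of three or more identical consecutive X.sub.1 motifs.

```python
def classify(n1):
    """Return 'short' for n1 = 4 or 5 and 'long' for n1 = 6, 7 or 8."""
    if n1 in (4, 5):
        return "short"
    if n1 in (6, 7, 8):
        return "long"
    raise ValueError("n1 must be from 4 to 8")


def is_three_long_then_three_short(n1_values):
    """True if the n1 values describe 3 long units followed by 3 short units."""
    return [classify(n) for n in n1_values] == ["long"] * 3 + ["short"] * 3


def longest_run(motifs):
    """Length of the longest run of identical consecutive X1 motifs."""
    longest = run = 0
    previous = None
    for motif in motifs:
        run = run + 1 if motif == previous else 1
        previous = motif
        longest = max(longest, run)
    return longest


def respects_run_limit(motifs, limit=2):
    """True if no X1 motif occurs more than `limit` times in a row."""
    return longest_run(motifs) <= limit
```

A sequence of motifs such as SQ, SQ, AGQQ, SQ passes the check, while three consecutive SQ motifs do not.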
(91) Thus, in some embodiments, provided herein are strains of yeast that recombinantly express silk-like polypeptides with reduced degradation to increase the amount of full-length polypeptides present in the isolated product from a cell culture. In some embodiments, the strain expressing a silk-like polypeptide is a P. pastoris strain comprising a PAS_chr4_0584 knock-out and a PAS_chr3_1157 knock-out.
(92) Equivalents and Scope
(93) Those skilled in the art will recognize, or be able to ascertain using no more than routine experimentation, many equivalents to the specific embodiments in accordance with the invention described herein. The scope of the present invention is not intended to be limited to the above Description, but rather is as set forth in the appended claims.
(94) In the claims, articles such as "a," "an," and "the" may mean one or more than one unless indicated to the contrary or otherwise evident from the context. Claims or descriptions that include "or" between one or more members of a group are considered satisfied if one, more than one, or all of the group members are present in, employed in, or otherwise relevant to a given product or process unless indicated to the contrary or otherwise evident from the context. The invention includes embodiments in which exactly one member of the group is present in, employed in, or otherwise relevant to a given product or process.
(95) The invention includes embodiments in which more than one, or all of the group members are present in, employed in, or otherwise relevant to a given product or process.
(96) It is also noted that the term "comprising" is intended to be open and permits but does not require the inclusion of additional elements or steps. When the term "comprising" is used herein, the term "consisting of" is thus also encompassed and disclosed.
(97) Where ranges are given, endpoints are included. Furthermore, it is to be understood that unless otherwise indicated or otherwise evident from the context and understanding of one of ordinary skill in the art, values that are expressed as ranges can assume any specific value or subrange within the stated ranges in different embodiments of the invention, to the tenth of the unit of the lower limit of the range, unless the context clearly dictates otherwise.
(98) All cited sources, for example, references, publications, databases, database entries, and art cited herein, are incorporated into this application by reference, even if not expressly stated in the citation. In case of conflicting statements of a cited source and the instant application, the statement in the instant application shall control.
(99) Section and table headings are not intended to be limiting.
EXAMPLES
(100) Below are examples of specific embodiments for carrying out the present invention. The examples are offered for illustrative purposes only, and are not intended to limit the scope of the present invention in any way. Efforts have been made to ensure accuracy with respect to numbers used (e.g., amounts, temperatures, etc.), but some experimental error and deviation should, of course, be allowed for.
(101) The practice of the present invention will employ, unless otherwise indicated, conventional methods of protein chemistry, biochemistry, recombinant DNA techniques and pharmacology, within the skill of the art. Such techniques are explained fully in the literature. See, e.g., T. E. Creighton, Proteins: Structures and Molecular Properties (W.H. Freeman and Company, 1993); A. L. Lehninger, Biochemistry (Worth Publishers, Inc., current edition); Sambrook, et al., Molecular Cloning: A Laboratory Manual (2nd Edition, 1989); Methods In Enzymology (S. Colowick and N. Kaplan eds., Academic Press, Inc.); Remington's Pharmaceutical Sciences, 18th Edition (Easton, Pa.: Mack Publishing Company, 1990); Carey and Sundberg, Advanced Organic Chemistry, 3rd Ed. (Plenum Press), Vols. A and B (1992).
Example 1: Production of Recombinant Yeast Expressing 18B
(102) First, we transformed a strain of P. pastoris to abrogate KU70 function to facilitate further editing and engineering. A HIS+ derivative of Pichia pastoris (Komagataella phaffii) strain GS115 (NRRL Y15851) was electroporated with a DNA cassette consisting of homology arms flanking a zeocin resistance marker and targeting the KU70 locus. A map of the cassette is shown in
(103) Then, we modified this strain to express a recombinant gene encoding a silk-like polypeptide. A HIS+ derivative of Pichia pastoris (Komagataella phaffii) strain GS115 (NRRL Y15851) was transformed with a recombinant vector (SEQ ID NO: 462) to cause expression and secretion of a silk-like polypeptide (18B) (SEQ ID NO: 463). Transformation was accomplished by electroporation as described in PMID 15679083, incorporated by reference herein.
(104) Each vector included an 18B expression cassette with the polynucleotide sequence encoding the silk-like protein in the recombinant vectors flanked by a promoter (pGCW14) and a terminator (tAOX1 pA signal). The recombinant vectors further comprised dominant resistance markers for selection of bacterial and yeast transformants, and a bacterial origin of replication. The first recombinant vector included targeting regions that directed integration of the 18B polynucleotide sequences immediately 3' of the AOX2 loci in the Pichia pastoris genome. The resistance marker in the first vector conferred resistance to G418 (aka geneticin). The second recombinant vector included targeting regions that directed integration of the 18B polynucleotide sequences immediately 3' of the TEF1 loci in the Pichia pastoris genome. The resistance marker in the second vector conferred resistance to Hygromycin B.
Example 2: Generating a Library of Single Protease KO Mutants
(105) After successful transformation and secretion of 18B in a recombinant Pichia pastoris strain, 65 open reading frames (ORFs) encoding proteases were individually targeted for deletion (Table 2). Cells were transformed with a vector comprising a DNA cassette with 1150 bp homology arms flanking a nourseothricin resistance marker. A plasmid map comprising the nourseothricin resistance marker is shown in
(106) Homology arms used for each target were amplified by the primers provided in Table 7, and inserted into the nourseothricin resistance plasmid. Homology arms were inserted into the nourseothricin plasmid to generate cassettes comprising a nourseothricin resistance marker flanked by 3' and 5' homology arms to the target protease as shown in
(107) The homology arms in each vector targeted one of the 65 desired protease loci as provided in Table 2. Transformants were plated on YPD agar plates supplemented with nourseothricin, and incubated for 48 hours at 30° C.
(108) TABLE-US-00002 TABLE 2 Proteases targeted for deletion in P. pastoris strain
Protease Gene Symbol        Protease ORF Sequence (SEQ ID NO:)    Protease polypeptide sequence (SEQ ID NO:)
PAS_chr4_0584 (YPS1-1)      1      67
PAS_chr3_1157 (YPS1-2)      2      68
PAS_chr3_0299 (YPS1-3)      3
PAS_chr3_0303               4
PAS_chr3_0866               5
PAS_chr3_0394               6
PAS_chr1-1_0379 (MCK7)      7
PAS_chr1-1_0174             8
PAS_chr1-1_0226             9
PAS_chr3_1087               10
PAS_chr3_0076               11
PAS_chr3_0691               12
PAS_chr3_0815               13
PAS_chr1-4_0164             14
PAS_chr3_0979               15
PAS_chr3_0803               16
PAS_chr2-1_0366             17
PAS_chr3_0842               18
PAS_chr1-3_0195             19
PAS_chr1-4_0052             20
PAS_chr2-2_0057             21
PAS_chr1-3_0150             22
PAS_chr1-3_0221             23
PAS_FragD_0022              24
PAS_chr2-1_0159             25
PAS_chr2-1_0326             26
PAS_chr1-4_0611             27
PAS_chr1-1_0274             28
PAS_chr4_0834               29
PAS_chr3_0896               30
PAS_chr3_0561               31
PAS_chr3_0633               32
PAS_chr4_0013               33
PAS_chr2-1_0172             34
PAS_chr1-4_0251             35
PAS_chr4_0874               36
PAS_chr3_0513               37
PAS_chr1-1_0127             38
PAS_chr4_0686               39
PAS_chr2-2_0056             40
PAS_chr2-2_0159             41
PAS_chr3_0388               42
PAS_chr3_0419               43
PAS_chr1-3_0258             44
PAS_chr4_0913               45
PAS_chr1-1_0066             46
PAS_chr2-2_0310             47
PAS_chr1-3_0261             48
PAS_chr2-1_0546             49
PAS_chr2-2_0398             50
PAS_chr4_0835               51
PAS_chr1-1_0491             52
PAS_chr2-1_0447             53
PAS_chr1-3_0053             54
PAS_chr3_0200               55
PAS_chr1-3_0105             56
PAS_chr3_0635               57
PAS_chr4_0503               58
PAS_chr2-1_0569             59
PAS_chr3_1223               60
PAS_chr2-1_0597             61
PAS_chr1-1_0327             62
PAS_chr2-2_0380             63
PAS_chr3_0928               64
PAS_chr1-3_0184             65
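The layout of a deletion cassette as described in Example 2 (a resistance marker flanked by two 1150 bp homology arms) can be sketched in Python as follows. This is an illustrative model only. It assumes that the arms are taken from the sequence immediately upstream and downstream of the target ORF, which the text does not state, and the function names and 0-based, end-exclusive coordinates are hypothetical conventions chosen for the example. In the Examples the arms were amplified with the primers of Table 7.

```python
ARM_LENGTH = 1150  # bp, homology arm length stated in Example 2


def homology_arms(chromosome, orf_start, orf_end, arm_length=ARM_LENGTH):
    """Return (upstream_arm, downstream_arm) flanking an ORF.

    `orf_start` and `orf_end` are 0-based, end-exclusive coordinates of the
    ORF on `chromosome` (an assumed convention for this sketch).
    """
    if orf_start - arm_length < 0 or orf_end + arm_length > len(chromosome):
        raise ValueError("ORF is too close to the end of the sequence")
    upstream = chromosome[orf_start - arm_length:orf_start]
    downstream = chromosome[orf_end:orf_end + arm_length]
    return upstream, downstream


def deletion_cassette(chromosome, orf_start, orf_end, marker,
                      arm_length=ARM_LENGTH):
    """Model a cassette as upstream arm + resistance marker + downstream arm."""
    upstream, downstream = homology_arms(chromosome, orf_start, orf_end,
                                         arm_length)
    return upstream + marker + downstream
```

Under this model, homologous recombination at both arms replaces the ORF between them with the marker, which is why the cassette omits the ORF sequence itself.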
Example 3: Testing Single Protease Knockout Clones for Reduced Protein Degradation
(109) Resulting clones were inoculated into 400 μL of Buffered Glycerol-complex Medium (BMGY) in 96-well blocks, and incubated for 48 hours at 30° C. with agitation at 1,000 rpm. Following the 48-hour incubation, 4 μL of each culture was used to inoculate 400 μL of BMGY in 96-well blocks, which were then incubated for 48 hours at 30° C. Guanidine thiocyanate was added to a final concentration of 2.5 M to the cell cultures to extract the recombinant protein. After a 5 minute incubation, solutions were centrifuged and the supernatant was sampled and analyzed by western blot.
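The volumes in this protocol imply two simple calculations, sketched below in Python with the culture volumes read as microliters. The text gives only the final guanidine thiocyanate concentration (2.5 M); the stock concentration is not stated, so any value passed as `stock_molar` is an assumption, and both function names are hypothetical.

```python
def dilution_factor(inoculum_ul, medium_ul):
    """Fold dilution when `inoculum_ul` of culture is added to `medium_ul`."""
    return (inoculum_ul + medium_ul) / inoculum_ul


def stock_volume_for_final(culture_volume_ul, stock_molar, final_molar):
    """Volume of stock to add so the mixture reaches `final_molar`.

    Derived from final = stock * v / (culture + v), which rearranges to
    v = final * culture / (stock - final). Assumes volumes are additive.
    """
    if stock_molar <= final_molar:
        raise ValueError("stock must be more concentrated than the target")
    return final_molar * culture_volume_ul / (stock_molar - final_molar)
```

Transferring 4 μL into 400 μL corresponds to `dilution_factor(4, 400)`, a 101-fold dilution. With a hypothetical 6 M stock, `stock_volume_for_final(400, 6.0, 2.5)` gives roughly 286 μL per 400 μL of culture.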
(110) Western blot data for a representative clone of each protease knock-out is shown in
Example 4: Generating a Library of Protease Double Knock-Outs
(111) In addition to the individual KOs, different pair-wise combinations of proteases were knocked out. These proteases were selected, in part, because they were paralogs that may have compensatory function with respect to each other.
(112) To generate double knockouts, nourseothricin resistance was eliminated from the single protease knock-out strains produced in Example 2, and a second protease was deleted by transformation with a second nourseothricin resistance cassette as provided in Example 2. Transformants were plated on YPD agar plates supplemented with nourseothricin, and incubated for 48 hours at 30° C. Double protease knock-outs tested are provided in Table 3.
(113) TABLE-US-00003 TABLE 3 Protease double KO strains of P. pastoris expressing silk-like polypeptide
Double KO Strain    Protease KO 1       ORF SEQ ID NO:    Protease KO 2       ORF SEQ ID NO:
1                   PAS_chr1-1_0379     7                 PAS_chr3_0299       3
2                   PAS_chr3_0394       6                 PAS_chr3_0303       4
3                   PAS_chr4_0584       1                 PAS_chr3_1157       2
4                   PAS_chr3_0076       11                PAS_chr1-4_0164     14
5                   PAS_chr4_0584       1                 PAS_chr3_0299       3
6                   PAS_chr1-3_0195     19                PAS_chr1-4_0289     66
7                   PAS_chr3_0896       30                PAS_chr2-2_0310     47
8                   PAS_chr3_0394       6                 PAS_chr3_1157       2
Example 5: Testing Double Protease Knockout Clones for Reduced Protein Degradation
(114) Resulting clones were inoculated into 400 μL of Buffered Glycerol-complex Medium (BMGY) in 96-well blocks, and incubated for 48 hours at 30° C. with agitation at 1,000 rpm. Following the 48-hour incubation, 4 μL of each culture was used to inoculate 400 μL of BMGY in 96-well blocks, which were then incubated for 48 hours at 30° C. Guanidine thiocyanate was added to a final concentration of 2.5 M to the cell cultures to extract the recombinant protein. After a 5 min incubation, solutions were centrifuged and the supernatant was sampled and analyzed by western blot.
Example 6: Additional Protease Knock-Out Strains
(116) As shown in Examples 4 and 5, a modified Pichia pastoris cell capable of producing a desired protein (e.g., 18B) was transformed to delete proteases at PAS_chr4_0584 and PAS_chr3_1157 to mitigate degradation of the desired protein. We further knocked out one or more additional proteases to enhance the production of full-length products and minimize degradation.
(117) For each additional knockout, an additional protease gene was deleted from a single protease KO (1KO), double protease KO (2KO), triple protease KO (3KO), or quadruple protease KO (4KO) by transformation with a nourseothricin resistance cassette with homology arms targeting the desired gene as provided in Example 2. The protease genes knocked out in each strain are shown in Table 4:
(118) TABLE-US-00004 TABLE 4 2X-5X KO Strains
KO Strain    Protease Genes Knocked Out
2X KO        PAS_chr4_0584 (YPS1-1), PAS_chr3_1157 (YPS1-2)
3X KO        PAS_chr4_0584 (YPS1-1), PAS_chr3_1157 (YPS1-2), PAS_chr3_0688 (YPS1-5)
4X KO        PAS_chr4_0584 (YPS1-1), PAS_chr3_1157 (YPS1-2), PAS_chr3_0688 (YPS1-5), PAS_chr1-1_0379 (MCK7)
5X KO        PAS_chr4_0584 (YPS1-1), PAS_chr3_1157 (YPS1-2), PAS_chr3_0688 (YPS1-5), PAS_chr1-1_0379 (MCK7), PAS_chr3_0299 (YPS1-3)
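Table 4 describes a lineage in which each strain carries the deletions of the preceding strain plus one more. As a bookkeeping illustration only, the Python sketch below records the table as data and derives which gene is added at each step. The data structure and the function name `added_genes` are hypothetical and form no part of the described method.

```python
# Locus tags knocked out in each strain, transcribed from Table 4.
KO_STRAINS = [
    ("2X KO", ["PAS_chr4_0584", "PAS_chr3_1157"]),
    ("3X KO", ["PAS_chr4_0584", "PAS_chr3_1157", "PAS_chr3_0688"]),
    ("4X KO", ["PAS_chr4_0584", "PAS_chr3_1157", "PAS_chr3_0688",
               "PAS_chr1-1_0379"]),
    ("5X KO", ["PAS_chr4_0584", "PAS_chr3_1157", "PAS_chr3_0688",
               "PAS_chr1-1_0379", "PAS_chr3_0299"]),
]


def added_genes(strains):
    """Return (strain, new_gene) pairs, checking each step adds one gene."""
    steps = []
    for (_, parent), (name, child) in zip(strains, strains[1:]):
        extra = set(child) - set(parent)
        missing = set(parent) - set(child)
        if missing or len(extra) != 1:
            raise ValueError(f"{name} does not extend its parent by one gene")
        steps.append((name, extra.pop()))
    return steps
```

Calling `added_genes(KO_STRAINS)` lists PAS_chr3_0688, PAS_chr1-1_0379 and PAS_chr3_0299 as the genes added in the 3X, 4X and 5X strains, respectively.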
(119) The resulting cells were isolated on selective media plates (by auxotrophy or antibiotic resistance marker) and individual clones were isolated for further testing. Individual clones were tested by liquid culture assay under product protein producing conditions as follows: Isolated colonies of each strain were inoculated into 400 μL of Buffered Glycerol-complex Medium (BMGY) in 96-well blocks, and incubated for 48 hours at 30° C. with agitation at 1,000 rpm. Following the 48-hour incubation, 4 μL of each culture was used to inoculate either 400 μL of BMGY or 400 μL of YPD (Yeast Extract Peptone Dextrose Medium) in 96-well blocks, which were then incubated for 48 hours at 30° C. with agitation at 1,000 rpm.
(120) Protein expressed by the cells was isolated and analyzed for degradation as follows: Guanidine thiocyanate was added to a final concentration of 2.5 M to the cell cultures to extract the recombinant protein. After a 5 min incubation, solutions were centrifuged and the supernatant was sampled and analyzed by western blot.
OTHER EMBODIMENTS
(122) It is to be understood that the words which have been used are words of description rather than limitation, and that changes may be made within the purview of the appended claims without departing from the true scope and spirit of the invention in its broader aspects.
(123) While the present invention has been described at some length and with some particularity with respect to the several described embodiments, it is not intended that it should be limited to any such particulars or embodiments or any particular embodiment, but it is to be construed with references to the appended claims so as to provide the broadest possible interpretation of such claims in view of the prior art and, therefore, to effectively encompass the intended scope of the invention.
(124) All publications, patent applications, patents, and other references mentioned herein are incorporated by reference in their entirety. In case of conflict, the present specification, including definitions, will control. In addition, section headings, the materials, methods, and examples are illustrative only and not intended to be limiting.
SEQUENCE LISTING
(125) TABLE-US-00005 TABLE 5 Open reading frame nucleotide sequence for proteases targeted for deletion in P. pastoris Protease Gene Symbol/Locus tag SEQ ID NO: Open reading frame nucleotide sequence (5to 3) PAS_chr4_0584 1 atgttgaaggatcagttcttgttatgggttgctttgatagcgagcgtaccggtttccggcgtgatggcagctcctagcgagtccgggcataa cacggttgaaaaacgagatgccaaaaacgttgttggcgttcaacagttggacttcagcgttctgaggggtgattccttcgaaagtgcctctt cagagaacgtgcctcggcttgtgaggagagatgacacgctagaagctgagctaatcaaccagcaatcattctacttgtcacgactgaaagtt ggatcacatcaagcggatattggaatcctagtggacacaggatcctctgatttatgggtaatggactcggtaaacccatactgcagtagccg ttcccgcgtgaagagagatatacacgatgagaagatcgccgaatgggatcccatcaatctcaagaaaaatgaaacttctcagaataaaaatt tttgggattggctcgttggaactagcactagttctccttccaccgccacggcaactggtagtggtagtggtagtggtagtggtagtggtagt ggtagtgctgccacagccgtatcggtaagttctgcacaggcaacattggattgctctacgtatggaacgtttgatcacgctgattcctcgac gttccatgacaataatacagactttttcatctcatacgctgataccacttttgcttcaggaatctggggttatgacgacgtcattatcgacg gcatagaggtgaaagaactttccttcgccgttgcagacatgaccaattcctctattggtgtgttaggtattggactgaaaggcctagaatcc acatatgctagtgcatcttcggtcagtgaaatgtatcagtatgacaatttgccagccaagatggtcaccgatgggttgatcaacaaaaatgc atactccttgtacttgaactccaaggacgcctcaagtggttccatcctctttggaggtgtggatcatgaaaaatattcgggacaattgttga cagttccagtcatcaacacactcgcttccagtggttacagagaggcaattcgtttacaaattactttaaatggaatagatgtgaaaaagggt tctgaccagggaactcttttacaagggagatttgctgcattattggactctggagctacgctaacgtatgctccttcttctgttttaaattc aattggccggaacctgggcggctcctatgattcgtcaagacaagcttataccattcgttgtgtttctgcatcagataccacttctctggtat tcaattttgggggtgctacagtggaagtttccctgtacgatctacagattgcaacatattacaccgggggaagtgccacgcaatgtcttatt ggaatattcagctctggaagtgatgagtttgtgctcggtgataccttcttgaggtcagcctacgtggtttacgatcttgatgggcttgaagt gtcgcttgcccaagccaacttcaacgaaaccgattctgatgttgaggctattacctccagtgtaccttccgctactcgtgcatccggataca gttctacatggtctggttctgccagcggtacagtttacacttcggttcagatggaatccggtgctgcttccagctccaactcttctggatcg aatatgggttcctcttcctcatcgtcctcttcatcgtcctcgacttccagtggagacgaagaaggagggagctccgccaacagggtcccctt 
cagctacctttctctctgtttggtagttattctcggcgtgtgtatagtatag PAS_chr3_1157 2 atgatcatcaaccacttggtattgacagccctcagcattgcactagcaagtgcgcaactccaatcgcctttcaaggctaacaagttgccatt caaaaagtttatcattccaacgacccaaaggaccgtttaattaagagagatgactacgagtccctcgacttgagacacatcggagtcttgta cactgcagagatccaaattggatctgacgaaactgaaattgaggtcattgtcgacactggttctgccgacttgtgggtcatcgattccgacg ctgccgtctgtgagttatcctacgatgagattgaggccaatagcttttcctcggcttctgccaaattcatggacaagatagctcctccatca caagagctcctggatgggctgagtgagtttggatttgctctcgatggtgaaatttctcaatacctagccgataaatctggacgtgtttcgaa aagagaggaaaatcaacaagatttcaacattaaccgtgacgagcctgtgtgtgaacagtttggttccttcgattctagttcttccgacactt tccaaagcaacaattcagcttttggtattgcttaccttgatggaaccactgctaacggaacttgggtcagggacacagtccgcatcggcgac tttgccatcagccaacagagttttgccttagtcaacatcacagataactacatgggaatcttgggtctcggtcctgctacccaacaaaccac caatagtaacccaattgcagcaaacagatttacttatgatggtgttgtggattcattgcggtcccaaggatttatcaattcagcatcgtttt ctgtttacttgtctccagatgaagataacgagcacgacgaattcagcgacggagaaattttatttggtgctattgatagggccaagatagac gggccatttagacttttcccatatgtcaatccttacaaaccagtttaccccgatcaatatacttcctacgttacagtgtccacaattgcggt gtcttcgtcagatgaaactctcattattgaaagacgtcctcgtttggcattaatcgatacaggtgccaccttctcctatttgccaacctacc cattgattcgtttagcgttttccatccatggaggctttgaatatgtttctcaattgggactatttgtcattcgtacaagttctctgtctgtt gctagaaataaggtgattgagttcaagtttggtgaagacgttgtgatccaatccccagtttctgatcatctattggacgtctcaggcctttt tactgatggccaacaatactccgcattaactgtacgtgaaagtcttgacggactttccattctaggtgatacattcatcaaatcggcctact tattctttgacaatgaaaacagccagctgggtattggtcagatcaacgtcactgatgacgaggatattgaggtggtcggtgatttcactatt gaacgagacccagcctactcctctacttggtctagcgatttacctcatgaaacacccactagggctttgagtactgcttcagggggaggcct tggtaccggaataaacacggccacaagtcgtgcaagttctcgttccacatctggctctacttcacgaacttcttctacatctggctctgctt ctggtacttcttcaggtgcatcttctgctactcaaaatgacgaaacatccactgatcttggagctccagctgcatctttaagtgcaacgcca tgtctttttgccatcttgctgctcatgttgtag PAS_chr3_0299 3 atgaaccctagcagcttaattctacttgcactcagcattggctactccattgctgagtcaaatttctctttcaaacccagcaagttacctct 
caaaaaacatcgtgattcttcttccccgcatgaacgatttcttaaacgagatggaccctatcatccgctagaagccgacgcttacttttact acactacgtctatattggttggatcagaagaagaaaaagttgaagtaacagttgatttaggaacctctgatttatgggtcgtcgattacaac accggtttatgtgatagatcctttgacgaaacctatcttaaacgtagtctggatacttctgaggaagattattctgctggagatcttggctc ctcagtcggtgtacgcagcgctagaaaattcttgcgcaaaagggacaccaatcaaactgaggttaatgaagctaactatggtgcttgtccaa attcgattaccttcaatccagaaaactcgtcttctttccagagtaatgatactgctttcaatatcagctactttgatggaaccagtgctagt ggtttttgggctactgatacaatttactttggtgaccttgaggtcagcgagcaattttttgggctggcaaacttaacaataagttatggagg agtcttaggtcttggcccttccaacctacaaacaaccaatgctaaccccaacggtgaggaattcatttacagcggagtcttagattccatgc gtgatcaagggcttatcaactcggcttctttctcaatctatctcaatccagagaatttcagagatgaagataactattctaatgaaggagcg attttgttcggagcaattgataatgcgaagattgacgggtcattgaagctgttaccatacgtgacttcaggtggacactctcagattgatgc taatttcacttacatcaccttgaataatattgccgtggctgacaatgatacagccctgatcgttgagaccaacccccaattggcaatgttga atccaaagtttatatacacctattttccaaacgaagtattgacccggctggtaaactctattgacaatctagaatatgatcctgttgagggg ttatataggataaggagaacaaacattagggatattaacaaaaaaatcatagagtttcaatttggtgacgagattgtgatacattctccctt atcaaattatctgtctgatacatgggttccaagcacaaactacacctatttggagattcaggatagcagagaggatttctttatccttggta atgcatttttcaagtctgcgtatttgttttttgacaatgataacagtgaagtcggtattggccaactaaaggttaccgataaggaggacatc gttccagttggtgaattttctttggatcaagattcagggtactcgtcaacctggtcaacgttctcctatgaaactggttcagctcccttggg tacgtcaactttcgaaacgagtacaaaaactagttcagatggagctgccccgtcggtgtctcacattaacactagttcctacttatttgcgt ttgtactacttttcctttag PAS_chr3_0303 4 atgttgcccatccgcttatccaaacttctgcttttgctctccttaaagttgaaattgggtacagctgaagaaaaataccaaaagttggattt aaaaagaattgacaaagactattatgccgtcgatgtcaaagtcggctccgatgagcaggagatcaaagaggtactaatagatacgggttcat ctgatttctggatcttggacaaatcgttctgtaattctccaacatcagaggaagaagagaacagtaacgggcgtagcaacaaggaaagctgt ggagtctatggctcgttcgactccaacaagtcagagacatttcaggcaactggccaagtatttgacgctgcttacggtgacaccacagccga gtcgacaggatcttcaggagttcgaggaattgatcagctacgggtaggagatattcatatagaagaactctattttggactagtgacaaaca 
ctacaagtttaccacccgttttaggaattgcccagctttccgaagagttcagcaacaactcttatcctaactttccataccagatgaaagag gaaggtctgattgatgttgttgcatactctctctccttgggccaaagtaaaggtgaactactgttcggggctatggaccactcaaaatataa tggaacactattgaaagcccctatattgcaggcgggcacaccaggaatgcaagttcttttaactggagtggcccttacaaatggttcatcaa gcgtcttcaatgagacagacaataaaggttttatctactttgacagtgggactactgcttccactctgccatcagagcactttgatgatctt ttcaaccatcacggatgggcgtacgatggtgatacattgacatattcgattcaatgcgatagtgagggagaaaaatctttacttgacttcac tttagaatataccattgctggtaatattgtcatcaaagtaccatttgaagacattattatgaagaatgaaaatgatggagaatgcctctcaa ccgtaatggtgtcgaaccagacttctttttcatattccgatgacacaccctttttcgttgctggagacgaagttctgttgaacgcttatgtt gtttacaacctagaaacacaagagctggccattgctccagcagtggataatccagaagatactgaagaagatattgagattatctccgcaga ctttgatatttcagaagccagagattatagcgttggattagagttcagaaataccacaattccagctacaactgattacttgccttcctcga tgtcgtcaggttcagtcagcgaagagactggttccaagtctgagagctctacttctgaggactttgctgcagccacgttgaaaccatttaca ttttggggtttcgtcctttttttctttcactttttgatttga PAS_chr3_0866 5 atgttagttgctgttgccctagtgttgttactgtctacaggctatgctggaatcgtcgccattgataccgaatatgagttcaccattggttt tcttagtacgatagaaatagggtttcccccacaaagcataacggctcaatgggatacaggatcgtctgacctcttggtcaattccgtgacaa attcacagtgtgctcaggacggatgtagctttggtgcgttcgccttcaacaaatccaccacttattccaatataacaaaccctaacaacctt catgttcagttctcctttgcaagcggcagcgtggttgatgacaaacttgtgagtgacactatttttgtagattccaaggtaatcccacggtt caactttgcactggtatcgaagggagacctgtatggtgataatatttttggtattggaccgagagggaaccagggaacattcgattccaatg gaactccagctttctatgatagctttccttatcacttgaaggccctcggtttaatcaaacgactggcttactcattttacactgggcccacc cagggaaaggtagtatttggaggggtggatcatggaaagtacgatgggtgcctggagaaactcgagattgtccatgacagtgctttttacac actgcttgaggcaattgatgctgatgatacttccgtcttggatgagcaaattcatgttttgtttgatactggtaccgccttgacactttttc ccagctttattgctgaacaactggctgattttttgaaagctacatattcggacgaatacaatacgtttgtagttccctgcgaccaagatttt gattttgaataccttcattttggttttcgaaacattaagttgtcggtgcgctttaaggatctgtttttagtcattgacgatagtgtttgtgc 
tgtggggtttgatcaaggggcagatgcaaacaagataacctttgggtcttcacttttaagaaactactacacgctttatgatctagattcca aagaaattttgattgctgacgtcaagcctgatggtccagacgatattgaaatattatcgggtccagttcaacgaatttgtgatgaaaagggt gtcagtagcacttcattatggagtagtctgagtatagagtccacgatagaaccagacacttttaccactaagccttctatttcccagacacg gtattcgactagctccattggacctcaaaacatttctaactctttaggtgaatatccttcagtttccgtcactctttctgaacaccataaca ctacttccatagcctcaaattcctcattagaagggaaaccagcaactccaactgttacagaccagtcgtaccagaataataagactacctct accgtaattgctgtgaatttgattacccattcaaccactcattcaaccactcattcacccacctattcaaccactcattctagtaatggatc acgctcaactttagagtacacttcaaccaaggaatcctcggtgaaaatgccctgtgcgttgatcatctccgacacaattccgtacaatgctt ccggtgggaatagtagttatggatcgttaatttcaacatctacggttaacaatgttgaagagaataattcaaacactgttagaccaagaaaa agacagaccttcgtttcgggaaccacttccacgatactactctattcctcaactacgacccaagcatatcagatgttgtcctcaacttcaat cccccgaccatccataaaagccagttcaaatgctggtagccgcaaaacttcaaagacattattaacatttatcatattgtatattttttag PAS_chr3_0394 6 atgtaccaggcgttgttggttttgtctctgatatgcttttcgtcggctaattttgttaagctgcgaagcaacgctggtatgttttatgatac tatggctggagttccacgttcagatgaagagttctggttgcgtttggatattaaccaaggtctctcttggactctggatagtagctactact cctgtaatggctcaaatgtttcgtcttccctgtgtttcaattctgctcaaaacgtttacgatgcttccaatagtccaactgcagatttcgtt gatgtctacgcaaacacaactgtaaacaatacagatgaggcatcggccgagagagtaaatcttacaaacaacttatttgctgatggcgttta tatggaagacaatttttacgtcacattgaataatggagcaagaatgactgctacagatctgaaatttttgaatgcccacaatagtagcgccg ctgtggggtctttggcgttggggagttacacctcacaggacgtgccaactttcttacaaagactccaaagcggtggtcttattgaatccaac tcgttttcattggcattaaacgaaatcgattcttcatatggagagctctatttggggacaataaactctaccaagtatgtcgagcctctggt agaattcgattttattccggtgtcagatcccaatggagtttttggattcgattgggaagatacattccctacagttccgatcagcggattaa gcatgtcttcgaatgacaaacagagaactgtctttttccccaatgagtggaacaacacggtcttaacgggaacatacccacttccaatgatg ttagattcaagaaacatctttatccatcttccattctcttcaatcatacatatagcagtgcagcttaatgcactgtatcttgatacacttca taaatgggccgtgaactgttctgttggtcaactggacgcaactttaaactttcacatgggtaaccttaccgttcatgctcctatcaaggagt 
tgatttatccagcataccaaggagacaaaaggctgagctttgctaatggagaagatgtttgtattcttgccatggctcctgatgtttacatt ggttatccactgctaggaaccccctttttaaggaatgcagtggttgccgttaatcatgattcaaaaaaggtcgccgttgccaatcttaatag agatagcattcctcccgcttcgaacgtttctgtttcggaatcaatgggagtttatgttcctccacctgtttcaacttcaagaacatcggaga gaccgtccacactagatgagactagtacagccaattttgacaaaagggaagagtctgcaatatcatcaagttcagtcactaacagctcgtct agaaattcttcaaccataacttcttcaggaactcaaaccgagcaaacatcaggcatagctaccatcgaaacagatagcataccaggagctct agggaataatttaactgattattcaacgctgactctaacaatatacaccaattccgaagtggacgaactcaatcctaacatagcaacagcat tcatttccaatggttctatttattcagagccttaccccttttccggaactgcagttgctgaatcattcagtgcatcaccttcacaggctgaa ggatcgaactcatcgtcctcaggatcttctttagttttgtgtttctttacatcattggccagtctgttgactgtgagctgtctactactgta a PAS_chr1- 7 atgtttgtgatccagctggcattcctatgtctaggcgtcagcctaaccactgcacaacctagttcacctttcaaggcaaataagtttccttt 1_0379 taaaaaggttcactactcatcaaaccctagcgatcgccttattaagcgagacaactataagaagcttgacttgagacatcttggcgtcttgt atactgcggaaattgaaattggttcaggcaaaactgaaatcgaagttattgttgacaccggatctgcagatttgtgggtaattgactcaaat gcagccgtatgcgattgtcctatcttgagatacaaggtacaagtgtttccacccttagtcaaactgccaacgtaacacccctatcaggtaaa cttttgaatggacttcaagaaattggcattgtaactgatggcaaaatttccaaaaagtttcaggaaaaccatcttttgaagagaaacgaggc cttgaattttgatgtcgatctgaataagcccatttgtgatcaatttggatccttcaatccacagtcatcaagaacttttcaaagcaacgaca cagcatttagtatcagatatctggacaactcttttgccaatggatcgtgggtgagggatacggtttatgttggtgattttgaaattgaccag caaagttttgcattggttgatatcacaaataactacatgggaattctgggccttggtccttctagtcagcagacaaccaatagtgatcctac agataacagtttcacttatcttggtattctggattctttgcgggcccaaggattcattaattcagcctcgtactcggtttatctggccccag atggtaagactgatgatactgatcacgatgatggtgagatcctgtttggtgctatcgacgaggctaaaattaatggacagttgaagttgttt ccatatgtcaatccttataaatcggtataccctgaccaatacgcttcatacatcaccgtttccagtattactgtagccagttattttagtag ccgcttggttgaaagaatccctcaattagctcttttagacactggtgccacattttcttacttgccaacttatacgctgatacgtctcgcct atgccatccatcctggttttgagtatgtccgacaactgggtttatttattatagagtcaaacgtactctccagtgcgagacaaagtaccatt 
gacttccggtttggcaaagacgtagtaattcgatccaatgtttcagaccatctactcgacgtatcacaatacttcacatctggacattatct tgcacttaccatccatgaaagtgtcgatgggcttctcattttgggtgacacgtttatcaagtccacctacttatttttcgacaatgataaca gtgaattgggtattggtcagatcaaaattaccaatgacgaggatattcaagaagttggtgaattcaccttagaacgcgattcagactattct tctacatggtccatttactcttatgaaacttctttggatcccttaagcactggcactggtacggggtcaacctattctcctactcgcagtac tacagctagaagcgaaccgactacgtctcgacgctccaccacccttcaacccagaacaactgtgattccttctattgacaggctttcattga acagcataactagtcatggttcctctactaacggaacctccccaactaatgagacttcttttgctgaggatggaggaactttgacacccgaa gaagcttctttgacaacttcactaaattctgctactatttctgagactacttttgtcgatgttgaaacttctactaccaatggtgcttcagt tgtatctttgagtgttggtccctgcattattgccttcctactactcatctcttaa PAS chr1-1 8 atgagcatgggagctactgtttcaaaggagtccactgtagacctaacactgccgctgttgcagctgagtccaagactgttgttcctgcctgg 0174 agttgtctacaagacgactttcaagttccaggagggggtcaacatcttgctacgttttagagacctgttcgatgagtctttttctgaaagaa atgacgttctaggtgatattgcccgctcgcagaaggaacaacaggaaaacgattatgaccatatcccttttttgagcagcaatgctaagaag agcataggtgtcctgaaagaccaacttgaacttggtgggtctgatgacaagtcacttccctgggttattgcctgtctccctgggttcgacca gtcagaccaggactccattgccactacaatttgtcagataactgaggtgtccgtcgttaaccaggatattgtactatccttcgaagcattaa ccagaggatctttaaaatccaaaaagaccatctccatgaatgaatcaaccatatctgtggaagtggatataccatttactgaggttgaccag accatcagtaacaagctcatcttgacaaatattgataagggtctgcaactactggagaatatcaaacagtttctagtcacctatcaaaatga catgatgaaccttgaagatactaccatggaaaagaactcccgtctaaagtctgcaatgatgattttggctccgttgtctcacttgatctacg ccactgtctcatctcaagaatccactcatgcttatactagactatccaaccagtacaagtccgctaagaaggaattagattcaaccaaaaac agaaagtctttactcaagaagattttgaaaactaatgatattctcacttcagtgttccccttcagtatggttcaaaaggtggatgtcttggg agctatttcaagttctacagacaggatccaaacaactatcgacgcgttggactttgccaatccacttttcgaaacatatttgaacgttgatt atgttctggagacatggaaagattttgacactaagaacggcaaaattgctgccaatttgaccaggtctcaattagtatctaaccacttgaag ggcctcagagtactgattgaagacatccaaggaacttcaagaaggcgggtcagtccttctcagagaactcgtttggcgccttcgcccaatac 
aaattctgcaaatcaggcaccgaaagctggagaatcagacgacgaaaataaagaattgcgtgattttatcaacaacctctccaaattgaaga tctcagaggatggaaagaggctcgttaccaaagatttcaacagaatgactcaaatgcaaccaagttcatcggagtaccaactgctcagaact tatttagagattattatggatatcccatgggaaacaaaaaatattgtaaaacaacaaatttttgatctagacaaggccaaagaaacactaga tcaggaccattacggaatggactccgtcaaagataggatcttagagtatttagcagttcttaaactccacgatcacattaaaacgtccaacc ccaagcaagaagacgaggaaatcaaagccagagcacccattctcttactaacaggtccacctggtgttggtaaaacttcgttaggaaaatct attgcaaaggctctgaacaaaaagttccagcgagtaagtcttggaggattgaaggatgagtccgaaattaagggacatcgcagaacttacgt tggagcaatgccaggactattgacccaagcactgaggaaatctcaatcttttgatccagtgatacttttggatgaaattgacaaggttgtcg atggatcccaaggccctggtagtcgtgtaaacggtgatccagctgctgctttgcttgaagtgttagacccagagcaaaattctaacttctct gaccattatatcgggttcccacttgacttgtctcgtgttgtttttatctgtacgtccaacgatatgagcatgatcagtgccccattaaggga tagaatggaggttattgaactgaatggctacaattatttcgaaaaagtggagattgttaaacaattcttattaccaaagcagatcaaaagaa acggactgcctacgaatgccgaatcaccatcggtggttattcctgacgaagtgattatgtacatcgctgtcaattatactcgggagccaggt attcgtaatttggaacggttaatagggagtatctgtcggggtaaggctattgaatactctagcttgatgagtagtactcaagctccaggcga aattccaaagggatacgtttccaaggtcacggtagataatctttcaaagtacattggaatacccccggaattgtctacaggcaagaatatga ggaatgattcagctatctctaaaaagtacggaatcgtgaacggcctcagttacaatagtagcggacatggaagtaccctagtctttgaaatg accggtatacctaatagtactaacactaacatgattacgaccggcagattgggtgatgttcttacagaaagtgtcaagatcgcaagaacaat tataagatcgatgtttagtcacaacttactacaattaaaggatgacgaaacttcaacttctggggatcttttgaagaggtttgacactactc aggttcacatgcatgtgcccgctggtgctattcaaaaagacggacccagtgctggaatcaccattacgctgtgccttctgtcggtgatgcta gagaaacctgtaccaagggatttggccatgactggagagattactttgagagggatggtactgccaattggaggtgttcatgagaagctact aggagcacatttaactggaaccgttaaaagggtgatccttccaagaagtaatcgaagagatgtcattcaagactttatctctaacttggaag ccaataacagaagttctagggataagctactggtagatcttatcaaagaggaggagtcattactgtccaactcaaataaatccgaacgaatt ggagtgttcgggcttcctgaaaaatgggttcaagagaagttgggacttcaagtgagctacgtggaagaattttgggatgttatccagattgt 
ctggaacgatcaggttgaaattgacagcaccaaattacacgagctagctactaaagagttcgcaaggctatga PAS chr1-1 9 atgcaattgcgtcattccgttggattggctatcttatctgccatagcagtccaaggattgctaattcctaacattgagtcattacccagcca 0226 gtttggtgctaatggtgacagtgaacaaggtgtattagcccaccatggtaaacatcctaaagttgatatggctcaccatggaaagcatccta aaatcgctaaggattccaagggacaccctaagctttgccctgaagctttgaagaagatgaaagaaggccacccttcggctccagtcattact acccattccgcttctaaaaacttaatcccttactcttatattatagtcttcaagaagggtgtcacttcagaggatatcgacttccaccgtga ccttatctccactcttcatgaagagtctgtgagcaaattaagagagtcagatccaaatcactcatttttcgtttctaatgagaatggcgaaa caggttacaccggtgacttctccgttggtgacttgctcaagggttacaccggatacttcacggatgacactttagagcttatcagtaagcat ccagcagttgctttcattgaaagggattcgagagtatttgccaccgattttgaaactcaaaacggtgctccttggggtttggccagagtctc tcacagaaagcctctttccctaggcagcttcaacaagtacttatatgatggagctggtggtgaaggtgttacttcctatgttatcgatacag gtatccacgtcactcacaaagaattccagggtagagcatcttggggtaagaccattccagctggagacgttgatgacgatggaaacggtcac ggaactcactgtgctggtaccattgcttctgaaagctacggtgttgccaagaaggctaatgttgttgccatcaaggtcttgagatctaatgg ttctggttcgatgtcagatgttctgaagggtgttgagtatgccacccaatcccacttggatgctgttaaaaagggcaacaagaaatttaagg gctctaccgctaacatgtcactgggtggtggtaaatctcctgctttggaccttgcagtcaatgctgctgttaagaatggtattcactttgcc gttgcagcaggtaacgaaaaccaagatgcttgtaacacctcgccagcagctgctgagaatgccatcaccgtcggtgcatcaaccttatcaga cgctagagcttacttttctaactacggtaaatgtgttgacattttcgctccaggtttaaacattctttctacctacactggttcggatgacg caactgctaccttgtctggtacttcaatggcctctcctcacattgctggtctgttgacttacttcctatcattgcagcctgctgctggatct ctgtactctaacggaggatctgagggtgtcacacctgctcaattgaaaaagaacctcctcaagtatgcatctgtcggagtattagaggatgt tccagaagacactccaaacctcttggtttacaatggtggtggacaaaacctttcttctttctggggaaaggagacagaagacaatgttgctt cctccgacgatactggtgagtttcactcttttgtgaacaagcttgaatcagctgttgaaaacttggcccaagagtttgcacattcagtgaag gagctggcttctgaacttatttag PAS_chr3_1087 10 atgatatttgacggtactacgatgtcaattgccattggtttgctctctactctaggtattggtgctgaagccaaagttcattctgctaagat acacaagcatccagtctcagaaactttaaaagaggccaattttgggcagtatgtctctgctctggaacataaatatgtttctctgttcaacg 
aacaaaatgctttgtccaagtcgaattttatgtctcagcaagatggttttgccgttgaagcttcgcatgatgctccacttacaaactatctt aacgctcagtattttactgaggtatcattaggtacccctccacaatcgttcaaggtgattcttgacacaggatcctccaatttatgggttcc tagcaaagattgtggatcattagcttgcttcttgcatgctaagtatgaccatgatgagtcttctacttataagaagaatggtagtagctttg aaattaggtatggatccggttccatggaagggtatgtttctcaggatgtgttgcaaattggggatttgaccattcccaaagttgattttgct gaggccacatcggagccggggttggccttcgcttttggcaaatttgacggaattttggggcttgcttatgattcaatatcagtaaataagat tgttcctccaatttacaaggctttggaattagatctccttgacgaaccaaaatttgccttctacttgggggatacggacaaagatgaatccg atggcggtttggccacatttggtggtgtggacaaatctaagtatgaaggaaagatcacctggttgcctgtcagaagaaaggcttactgggag gtctcttttgatggtgtaggtttgggatccgaatatgctgaattgcaaaaaactggtgcagccatcgacactggaacctcattgattgcttt gcccagtggcctagctgaaattctcaatgcagaaattggtgctaccaagggttggtctggtcaatacgctgtggactgtgacactagagact ctttgccagacttaactttaaccttcgccggttacaactttaccattactccatatgactatactttggaggtttctgggtcatgtattagt gctttcacccccatggactttcctgaaccaataggtcctttggcaatcattggtgactcgttcttgagaaaatattactcagtttatgacct aggcaaagatgcagtaggtttagccaagtctatttag PAS_chr3_0076 11 atgaagctctccaccaatttgattctagctattgcagcagcttccgccgttgtctcagctgctccagttgctccagccgaagaggcagcaaa ccacttgcacaagcgtgcttactacaccgacacaaccaagactcacactttcactgaggttgttactgtctaccgaactttgaaaccgggcg aaagtatcccaactgactctccaagccacggtggtaaaagtactaaaaagggtaagggtagtaccactcactctggtgctccaggagctacc tctggtgctccaactgacgacaccacttcgactagtggctcagtagggttaccaactagcgcaacttcagttacctcttctacctcctctgc aagtacaacaagcagtggaacttcagccactagcactggtaccggtactagcactagcactagcactggtactggtactggtactacaggca caggaaccactagttccagcactagctcttctgctacttcgactccaaccggttctatcgacgctatcagccagacacttctggatactcac aatgataagcgtgctttgcacggcgtcccagaccttacttggtctaccgaactcgctgactacgcccaaggttacgccgattcatacacttg tggctcttcattagaacacacaggtggaccatacggtgaaaatttggcctctggatactctcctgctggcagtgtagaagcatggtacaacg agatcagcgactacgatttctctaacccaggttattctgctggtaccggtcacttcacccaagttgtctggaaatcaactacacagctgggc 
tgtggatacaaggagtgcagtaccgacagatactacatcatctgcgaatacgcacctcgtggaaatattgtttctgccggctacttcgaaga caacgtcctgcctcctgtttga PAS_chr3_0691 12 Atgactgtgcaaattttgattgtagttaccagtgttgctaagtatgaaagcggaaagctgccaacaggcttgtggttaagtgagttgacaca tatgtatcatagtgcaaaagagaacggctatgatgtgacgattgcgagtccgcaaggcggaaacattccgcttgaccctgaaagcttgaaat caatgctgattgacaagctttcaaaggattatgagacaaaccaagactttatgaagttgttgcaaaacacaaaaagtttgggtgaagtcaca ggacaacagtttgacgttgtttatttggcaggtggacacggaacaatgtatgactttccgaacaacactgttttacaaaacatcatcaaaga acactatgaggcgggcaaaattgttgccgctgtatgtcacggagtttgtgggcttttgaacgtaaaactgtctgatggcgagtatctaatca aagacaaggccattacaggatttaattggtttgaagaagctatagcaggacgcagaaaagaagtaccgttcaaccttgaagcagaattgaat aaaaaaacttcaaaatacgagaaagcttttatcccaatgacgtcaaaagtggtcgtggacgggaacttaatcacaggacagaacccattcag ttcaaaagaaattgcgaaagtggtaatggaacaactgaagcaataa PAS_chr3_0815 13 atgattgatgagaagcaattgaatcaacccaaaaggagcgtcttaagacgtctccatatgctgtttctgccattactagctatctccttttt cctgatatatttaagtgatatcacacagcctctcttccgtgcccgaaaggaagacgaaaacccgttggaaatttacttgaaggcattggaaa cgaatgaagctcacaaatggtcaaaggtgtacacttcgcagcctcatttggccggaaccaactacggattggttgagtttactaagtccaaa tttgaagaatatggatttgaggccagtgtcgatgactacgatgtgtacctgagttaccctattgatcatagtttggaattgtatgagcattc tgaggataaaaatgacaagctcttgtataaggcttcgctgcaagaggacgttctctctgaagacccaactacttcaggcgacgacctgatcc ctaccttccttggttacggtgctaacggcaatgtatctgcagaatacatctacgctaactatggaaccaaagaggactttgaggatttggtg gcccgtggtgttccaatcaaggggaagatcgcagtcattagatatggtcaaatatttagaggcttaaaggtgaaatttgcccaagaatatgg cgcaatcggtgctgtcatatacagtgacccaggcgacgattatggtatcacccctgaaaatggttacaagccttaccctcatggtaaagcca gaaacccaagctctgtgcaaagaggttctgcccaatttttgtctgtttatcccggtgacccaaccacgccaggagttggatcgaagaaggga gtagaaagagttgatcctcatgctacaaccccttccattccagtcttgcctttgagtttcaaagatgccttgccaattttgaagaaacttaa taaggaaggattgtctgttcctgactcctggaagggaggtctcgagggagttgattacagtaccggcccagctaaaaacattcatttgaacc tttatagcgaacaaaactttactattacacctatttacaatgtctatggagagatcaaaggtgagaatgctgacgaagttatcattattggt 
aaccatcgtgacgcttggattaagggaggtgcttctgaccctaacagtggatctgctgctttgattgaacttagtagaggtttgcacgccct aaccaaaacaggatggaagccacaccgtactattgtactagcttcctgggatgctgaggaatatggcttgattggatctactgagtttggag aacagtttgagaagttccttcagaagaaggtcgttgcctatttgaacgttgacgttgctgtagctggaactcatcttcatttgggtgcctcg ccatctttgttcaaactattgaaggataatgccaaagaaatcactttcaagaattcaaccgagactttgtatgacaactatgttaaagatca tggcaacgacattatttcgaccttaggaagtggaagtgactacactgtctttttggatcatttgggaattccttcgcttgatattggtttca ttgctggaaaaggtgacccagtatatcactatcattcaaactatgattcgtaccactggatcagtactagtggtgatcctggatttgagtat cataatgtactggccaaatatttgggttcgttggttttgaatctctctgagagagaggtgttgtacctgaagcttcatgattatgctaccga attgctcaagtacctcttggaagcctacgcccaaatgccagaggaatgggacgatgaagtaattggtttcagatcttcctcgtgtcatcgtg cgaaagcatctcatcatggtaaggatcctcatcatgagggaagacgccatcacggaaaaggattccattctaaaggagggcctcatcatggg gaacgccatcacggaaaaggattccacgctgaagggggaccccaccatgagaaaggaccgcatcacgaaaaagggctccacgtcgaaggaga gccccatcatcagaaaggacctcactttgaaaaaggattccatcatgacatggagatgtaccataagaaattggctcatcacggtaaagaac ccaagacgaagctaaagcacttgaagaaacaagttgagagtttaatcatcgatttcgccaataccactcaaacatatgacgcttacactgac ttccttcagaagcaacatgagattagggattctctttcattctgggagaaaatcaagctacattttaagatcaaggcagctaacttcaaact taaatattttgagcgagttttccttcatgaaaatggcttaaagaacagagaatggttcaaacatattgtatatgctgcaggaaggaacactg gttacgccggacaaagactgcctggtcttgtggaagccattgaagacaagaatctgcatgatgcagtaaaatggcttcacatcctttccaag aagattgatagtctacagaagtcattagagtag PAS_chr1- 14 atgagattacttcacatttcattgctatcaattatctcagtattgaccaaggccaacgctgaatgttgttacaccaacacacatactaccac 4_0164 tgaagtctggtatactacagtatatgctcgagatgttagtgaagagacttcttccacactggctggtggaagtgcaactgtcagctcagaag tgagttcgacaattgaatctagcgttgccacttccgctaccaccgaatcttcaagtgagacatcagggtccacatctgggtccacatctgcc actgaatcatcaactggtagtagctcgctagcaaccagttcatcgataaccagttcagagtcttccaccattacacaaaccacaggacaaga gtcaacaagcccaaccccatcgtcctcagagacaggttcttctactactactccctacgatataagtccaacggcaagttccgactttgatg 
cttttaaatatcaaattcttgatgaacacaacataaaaagagctctacatggagttgacggattagagtgggatgaagaagtatatgctgcc gcccaagcatatgctgacgcatacacttgtgacggaaccttggttcactctggaaatagtctgtacggagaaaacttagcgtatggttactc aaccagagggactgttgatgcctggtacagtgaaattgaatattatgactttaataacccaggttataccccaggtgttggacatttcactc aagtagtttggaaaagcaccacaaagctcggctgcgctttcaagtactgcaatgactattacggagcctacgtggtatgcaactactcacca ccaggaaattatgtcaacgagggatacttcgaagccaatgtgttaccactggtagattaa PAS_chr3_0979 15 atgagttatcccctaggtctgggtcgtacagcttataggttcatcccgaggtcaatctgttcaagacgatccatctcatcccatgcattacc tccaacgccctccaactcaccaccagcaggagatttattcaccaaactgctgaacgaacgcatcatatatttagcaggaggcattgatgatg cgcaagcaacatctatcacggctcaattgctgtatctggaatcgcagtcaacgtcgaaacaaatcaacatttacatcaactcaccaggaggt tctgtcacggcagggctggccatctacgacacaatccagtatatccgagcgccagtttccacggtttgcttaggacaggcatgctccatggc atccctcttgcttgcaagcggaacgcatggcaaacgtttgatcttgccaaacgctaccataatggtgcatcaaccatcttcggcaaacggaa ttaagggacaggccactgatatcgagatatatgcccgtcatatcatcaataccaaacagaaattgcaaactttatacctaaaacacatgtct ccaaccatgacggtggatgaaatcactgcacttttggagagagatcggttcatggagccagaggaggcagtgtctcttggactggcggaccg tgtattagagaggaaacccccggttgtatctgactaa PAS_chr3_0803 16 atgacagataccaaggagttagccacgttgctggagaacttgttgaaattgcaaaaatcaggaagtcttggtgaaattgtgggtcaagcaca gcgcatttatcatgacatttctgacctctcagtcctatctggattatcaaccccagaagtgctctctcctcacacatctccagatgtccccg agagagttccatctgaagtcaacttagacaattccaatctggcaactgatgtcaacgaaaaggagaagtattttgacgattttgcaaatgac tacatcgagtttacctacaagaaccccaccacctaccatttggtgcaatctgtggcggaattgttgaagaaaagcggattcgaatatcttcc tgaagcagctgactggtccaaattattcgaccctgaaaagacgggagcgtatttcacaatccggaatggaacctctttagctgccttcacaa ttggtagtttctggtccccagccaagggagtaggagctatcggaagtcacatcgatgctctcacaactaagctgaagccagtctccaataag agtaaggttgatggctacgagttgttgggagtttccccctatgctggtgctttgtctgacgtctggtgggatagagatttgggtattggtgg aagagtaatttacaaaaatgaatcttccggcaagctttccaccactttggttaacagtacacctcatcctgttgctcatattccaactttgg cccctcattttggtactccctccaacggtccattcaacaaggaaacccaagcagttcccgttgtaggattttctgacggaaacgacgaggag 
aaacccactgaggatgaacaaaagtctcctttgattggtaagcattctttaaaactactccgctacatatctaagctagcaggagtgccagt gtcctccttgattgatttcgatttggacatattcgatgtccaaaaaggtactaggggcggtctttccaatgagttcatttacgccccaagag tggatgatcgtatttgttcttactctgctctacaagcgcttatcagacgtcacaaggatcccgaatcctttgtcacagacgactctttcaat cttgttgccctttatgacaacgaggagatcggatctctctccagacagggagccaagggtggtctacttgagtcgaccatttccagagcaat cgctgcattgaaaatttcagagccagggactctgcaaagactatatgcaaattcagtgattctttctgcagatgtcacacatttgttaaatc ccaatttcaccgaagtgtacttggagcaccacaagccactgccaaacacagggattgcacttgcgctggattcgaatggccatatggccaca gatttgttaggcaaggtcgttgttgagcagctggctaaactcaatgatgataaagtgcagtacttccagattcggaacgattcaaggtctgg agggaccattggacccagtatttccagtagtactggcgctagaaccattgatcttggaattccccaattgtccatgcacagtattcgtgcta ccgtgggatacaaagatgttggcctcgctgtcaagtttttccaagggttctttaaaaattggagaaaagttgtcgacggcattgaagagttt taa PAS_chr2- 17 atgacttcggtatttttgggtgtttatagagccctatttgattaccaagctcaaaatgacgaagaactaactgtgcatgagaatgatctact 1_0366 atacgtattggaaaagtccgaaattgatgactggtggaaagttaaacaacgagttatcggagttaatgtcgaggaaccaataggtctggtac ccagtacttatattgagcctgctacacctatcgggtcagctgttgcactgtatgattatgacagacaaacagaagaagaaattactttcaag gagaatgacacctttgacgtgtacgacaccgacgatcaggagtggatcttggttggcctgaacaatatccattttggtttcgtgcctgcaaa ctacatacaaatttctttgggtacgacggcacctgcttctaacaatccaccaatacttagtcccgccagcttccctccacctcctcaacgga tcaacaactcctctgttccctctctcaaagatgctgaaccagcaagaaatctagaggacgataatgcttatgaagaggaggaagatgtacct ccaccaatgccaacgcgaccaactgccactacagctacatctaatatctctgctcctcaggactctgaatccgaagaggaaccttctagtag tagcagaaggccaagtggccgttcaagggcggatgatgattttgtaaaaggagactatttcacttgggatgttcaggaaattaatggccgca aaaagaggaaagctgtcctgggtatcggaaatggtagtatttatgtccaagcagagggacattcttctaagaaatgggatatcaggaatttg acaaatttcagtaacgaaaaaaagcacgtcttttttgactttaccaacccctcggcatcctatgaacttcatgcaggctccaaggacgcagc agatgccatcctgtcaattgttggtgatttgaaaggtgcttcttcaatgcgtgctttgaaagaggtgaaggctgcatcttctgccccaaaaa ccaagactggtaaagtcagttacaacttcgatgctgaaagtcccgatgagttgtcgattagggagggtgatgttgtctacatattgaacgat 
aaagaatcctctgagtggtggatagttcaggacgttaatactaacaagaaaggtgttgttccagctagctacatagagttgattagcggggg tggatctactttagccagcattggctcttctatttccaaaggttctaagaaagcttttggatcctccagaaaacgtaaggaaaaagagcgta agcatttggaagagcaacgtgccgctaaaagagaaaccgaaagggaacgtcaaagacttcgatccaaggaagaaagggataggctaagaaag ttagatgaaaaggaaagaaggaaaaagcaaaaagctactccacaggatgaagaccaacccgagactagcaaacctaatcctcatagagtgcg tacctggattgacagttcaggatccttcaaagttgaagcagagtatttgggagttgttgacggtaagattcatctgcataaaacaaacggtg taaagattgccgtagcggctcctaagttgtcactagaggatttagagtatgtggaaagaatcactggaatgtcgttagaaaaatacaagcca aagccaaaatctagtggttcctattccagaccttccaaaaagccatcctctagagaatcttcaccaaaggagtccagccgctccggagttaa acaatcagttcccaagattgatcctcccaaagacccagattatgattggtttcaatttttcttgggttgcgatattgatccgaataattgtc agcgatacagtgtggttttcattaatgaacaactggatgagagtagtttgcaagacctcactccatccctactaagatcgctagggttaaga gaaggtgatattttgagagttcaaaaattcttggataacaagtttggtcgaaccaaagctcaagaatctgctaccaatggtggtttatttac caagagtgatggtacattgaagaacaataggtccactgatgttctaacaagtacagttgtaacgcgagaaactttaagtcctactaaggccg aggctaagagcaaaagaattgatgacgaagcatgggctctcaaacccgctgccgaatctagctctcaaatggatcaattctccagacctgtc agtgcaatgagcaaacaattgactggatccatacaagatctcgtcaacttgaaacctttgggggacaatgcaaacaacgcttcggtagccca caaagctgaaacaccaaacactacccaggacaaaccttctgctcctgtcttggaacctgtgaagactggagctgcaaggggacctgtgcaag cgcaaccaacaagtggtggtttcgtcactgcacaacctactggtgctctagttgcaatgcctacaggtttcatgcccattacgatggtgccc gtaaagacaggaggaactatagctcttcaacccactggtggattcgtttcgttgcaaagaactggtggggtacttccgcaggttacaggggg acttgttcccgttcagactggtgggttagtaatgcctcagacctcatttggtgtaactccaactttgcagccaacaggagggattctacctg ctcagaggacaggtggattggttcctgttcaaaggacgggggggctaattcccgtccaacaaactggaagattagttcctgttcaacaaact ggaggattgattcctgttcaaaggactggaggattagttcccgttcagagaactggaaacttacaacctgtacctacaacctcttttggaag tcaaccaacaggaacttttgtgcctcaatcttcctttggtaatcagttggccaccaatttgaataacccgcaaaccacattcggctctcaac caacaggaggtttccctcagacatcatttgcacaaaatcagtttagacaatcgacaggaggtttccagcagaccccaattgtgcaacaaaca 
gggggattcccccaatactccgctggacaacagacggtaggattccctcagaactcttttggacagcagacaggaggaattgcccaaaactc atttggacaacagacaggaggttatcaaacaggttttcaaggaaatggatcgattccaatgccccagtcctcattcggtgcttcaaatctgg gattcaatggtgctacgcagcagaactacaacattggcatgggccaatctttgccagcagcttctatccctccccttcaaccctcttacacc tcatcactcaatggaatgtcaaacatgcttcagaacgtaagcatctctcagcagccacaacaagcccagccaatgacgacttttggagcacc tgtggcccagcctccgttacaggctcaaccaactggctttggttttggtaactcgccctatggaggtcagaacccactccaatctcagccaa caggtaaaagagccaacttatcagcagctaccgcagacaacccattcggcttctag PAS_chr3_0842 18 atgaccaaccaatcaacagtggtggatttacgcctttcatccaagagagttgttggcaaaccagtcaagttgcccacagtcctagcgtgctc agggtcagattcttccggtggtgcagggatcgaagcagatatcaaatccatcacggcttttgggtgctatgcgctaacagcaattacatctt taactgcccagaataccaaaggtgtcaccagtatagaaaacaccgacccaaagtttttcgaagagattttagaggcaaattttgaggacatt gaaatcgatgtggtgaaaactggactgttaaaccctgagtcatctcgtttattgctgaaatttttagataaataccacaaaggaaagccatt tgtcctggatccggtcttagtggctacgtctggttcaatgcttgcagatcaacacgaattagggttcaccattgattctcattttaagaaag ctactatcattactccaaatttcgaagaggcatgtgtgatctactcttacttgaaaaagctgaagactgtagatgagttgggtgaaatagaa actttagaggatttgaaaggaatggccaagttcatccagcaaactacacattgcaactctgttcttcttaaaggtggccatattccctggaa tagaaacgagcagttggttaaaaaaaagggaggagatccagcatacattactgatattctttatcagggtcatttggataaattcacggtaa tcaagacagattacttgacaagttctggaactcatggttctgggtgtacgattgctgcctcaattgctgcaaacattgcccgttcgttgaag attgaggatgctgtaatttcttcgattagatacgttcatcaggcaatttttggagcagatgagacgctaggacaaggaaaaggccctttgaa tcatgtgtttcatatttctcctcccattaacggcacaagtgctgagaataactttcttccgttctatccaggtcacttcttagattacttac tggagcatcctttggtgagtcccatctggaagaactacatcaaccacccatttttagaaaacgtagcaacaaataagctggctaagaacaga ttcatccactacatttgtcaagattacgtgtatctagcttcttatgcccgtgtccacggcttagctgccggagttgcacctgatattgaaag cataaaggcagaagcccatataatcgactccatcatggaagaaatgcatagacataaagacgtattgaactctcgtggaattgtgaaactgg atgaattaagaccctccaaggcctgcaaacagtattccgactacctcctaaacattgcgaagacatcagactgggtggccataaaaatcgcc 
ttagcaccatgcatctttggctactattacgctgccatttatgctcggtcgtttatcaaggatgaagctgacgtggacgaagaattcttgaa ttggatcaatacgtataccggtgattggtacaaagatgctgttgacgaggccagacagtcgctagaaagccatatgcaagctgtttctcccg tccagttagcagagctagtcaagatctttgcagatgtctgtcaattggaggtgaacttctggacttcgccaatggaactaccagaacaagat ctatga PAS_chr1- 19 atgcctacagtggtgactaacgagtcctctctcttgcaaacaaccgtgagtgttgcaccattggtgcttttatctgttgttgatcactacga 3_0195 acgagtggtgcaggcacccaacgccccaactaattcaaacgacaaaagagtcgtgggggtcattttgggagacaatacaaacaagaacttga tcaaggtaaccaactcatttgccatcccgtttgaagaagacgaaaagaacagggatatttggtttttggatcacgacttcatcgaatcgatg atggaaatgttcaagaagattaatgccaaagaaagacttattggatggtaccactctggaccaaagttaaagtcatctgatctacaaatcaa cgagttattcaagagattcactccaaatcctttgcttttgattgtggatgtaaattccaccgatatagtcgatattcctacagactcatatt tggcaattgaagaaattagagacgatggctcaagtgcagaaaaaacgtttatccatttaccatccatcatccaggccgaagaagcagaagaa attggagtggagcatcttctgagggatatccgagaccaggcgtgcggaaatctgtccataagattgactaacaatttcaaatcgctgaagtc tttaaacgatcgcatagccaacattgtccaatatttgcgcaagattttaagtggagaattaccaataaataatgtaattcttggaaaattac aggacatattcaacttattgcccaacttggttgccgttcaaggtgatcccacaaaaccagccactgcaagtgctaaccaactagccacatca ttcaatgtgaagaccaatgatgaattaatgatggtttacatctccagtttagtaagatccatcttggctttccatgatttgatcgacaataa gatcgagaacaagaagaacaacgagaaagataaggaattcacaccaacagaggaagaaccccaacaagcggctatagaatcgaaataa PAS_chr1- 20 atgacaatgtcaaccgaagatatcatcgccaggcataggaaggagaaaagggaccaaattgcacttattacaaggatgaagaagcagagcac 4_0052 taagtcaaccaaaaaggaaatcatgaaacaatgctctctcttggaagaagagctacaggcaagacataagaaggagttaggtgagtgcaaga ctgaaaattccgtcgagagaagtagtgagcctactgacgaaaaatcaaatggtggagaacttttttcccctgaaaagttattatcaatgatg actttaaaacagcaaggaactccaagtgagaatcaaggaaacgcaactgttccaaagagaaaacgcaataggcagaaggacagattagctag aagggaagttgccattaaagagatgcaagcagcagcagcaaaagaggctaacctccaaacaaatttcaaagagatagaattgaacaacataa gccaactgtgccaagttgctcacctggaaccatatgatatccgacctgatgggcattgcttgtttgcatctataaaagatcagttggaggtt cggcacaaaattgaaaatataagtatacaagatcttcggtctctggctgcgagtcatattaaaaatgatcccgagacttatactcctttcct 
ttttgatgagaatactatgaaaatcagggacattgatgactatgcaaacgagctggaaaccacggctttatggggaggtgatatggaaattt tggcattgagcaaagagtttgattgtccaatcagtgtaatgattagtggaagacctattcatcttgtcaatgccgacggttctaaagaggag ttgaagttggtttattaccgtcatgcatatggcctaggtgagcattacaactctttaagagatagatcagagataagggagtcttgtatagt tgagcaagaggaaaaagaagcggtagacgatggaaaatcatcttcttga PAS_chr2- 21 atgagacttaagatcaagcgttcaaatgaacagcggctaataacattgcctgacggggctacagtatcggatttacttaatgaaattggatc 2_0057 agcttctatcaatataaaggttgggtttcctcctcagacaattgatatctcagataccagcaagttgcttactgatagtggaatcaagaatg gtgaaatgatcattgtcactgataccattgaaacagaagtgcctgtcaacaagaatgaggttgcaattgccactgtctcaaaccagaatgat gcgccctacgttcaaatagacgacatcttcctagtcttgcggaagattcccgatgataattcttgtttcttcaactctgtcggctactgtat atttggtcctgattcaatcaagtatccggattctcaacaagaactaagacaggccgtcgctaatgtaatcagagagaacaaccaaggtattt ataactccgccatcttgggtggaaagtcaatcacagagtattctcagtggatccaaagcagtaattcctggggaggagccatcgaagcacag atattggcagaataccttgatatcagtatctggacagtggatattgagtctcttcaagtctacaaatttaatgatgaaatggcttcaaggtt ttgcgttattatgtatagtggtattcattacgacgctatggctctcaagctggacacatcattagatgaggaggactcacaaatttgtgtgt ttgataagttcagtgagttggggactttgattgaagacaacgttctcaaattaaccaaccatcttaagaaccagggctattatacgaatact tccacattcatactccaatgtcaaatatgtctcgcaacattgcaaggagaaaaagaagcaaatagccacgcaaagaaaactggccacacaaa ttttggtgaagtcaattga PAS_chr1- 22 atgtcattgtctgatcctgaggacagcctaagacgtctacttgtgagtttaccctccaatgttaagtacgatgcggagtcttcggtattgaa 3_0150 aagccgactgaaccttgctctatatttctcgctgacaaagagaggtgaatatctgggttccttggtaacggacttgccaatggatttgccat catcttattccgaaatcttagaggctgaagatgattcctactcaagattggctgaatcaatgtacaaatgccctaactataagcatcatgga agaccttgtgcaaggcagttcaagcaaggagagccgatataccggtgctacgaatgtggttttgacgagacttgtgtaatgtgcatgcattg ttttaatagggagcaacatcgagaccacgaggtttccatttcaattgcttcgtcctccaacgatggtatctgtgattgtggagatcctcagg catggaatatcgaattacactgccagagtgaactggaacaagatgaccattcaagttcagaagttaatccagattttaaatctgctataagg gaaacaatggatattattttagattacattttggattgtactattcattctgcatctatgcttcctgctgttcaggacatgatgaaggaaga 
cccatccgactatgaaatggctattcaatatgcttcagatagttcttctctgcccattgaaagatatggagtggaagacacgaatgttcagt cctggaacgtagtcctgtggaacgacgaattccataattatgatgaggctattgattgcatccagcaagttagtagatgttcattgtctaaa ggacaagctgacgctcaaaagattaatgattttggattttccatcataagaagaagtgaatccttgcctttactgatagaaaggtgcgccaa ggttgaagaatccgggtttactattacgattctttctgatagagatgttacccgattgattattattgatactatttttgattggttattga ctctgttagaaatttcaaggccggaaattcagactgctattagagaaagtttgtgtgaatctcttttggaagagtttcatgccgacattcac gaaggagattttttctaccgggaagatgaatattcagacacacggggtttgctggatttcaaaaacagaattccagccccattggtggagga tgtaatgaacgagttgtctattgatgacttgaagaacagaaaactatccagttttcttaatgaacaaccttcagctctagtcggctcaagag tacagtatttcttctatatggatctgcggttctggaaaaaggcaagaaaatctttgaaattgctaacgacatctgttttggtttcaaacttg gaatacaaaaagactttttctgaacagtttgtgaaaatatactcgcatctgttgatattgatggcaaaggaagatagagagtggcttctcag caatgcgggcaatgctgtagtacaactctttacatgtcctaaaacatctctccatttattacaaccacaatatttcagaagcatcatcgtcc ccatcattttgttgttcgaatcttatactggaaaccatttgctgtggaaacgaccatatcaactcttatcacgtaagaaaggtctcaaattt ggtttaatgcgttctttaactgatctagtgacgttaatcaccactgcccatcaatcagaagaacatttggtactttttcagggtaagaactt catttacataatcatgctttttaggatgttccagagtgccctgacattggtcagaaaggaaggagaacatattaccagggaatccactgaat ttttaacctacctgcaaatatcttactaccttaatgatgtcatcaaaggtattgttgaaattgcgcaggttcctgaaatacgtaaacctgaa cattggaaagttgtggaaacaaacatacaaatattggccactttaatttcatcagaaccttataagtttcatatggtgcacgaaaaacaact tattgaccatgacgtaacaaagaaaccaacctctcttattaatccattgaatggattactgtctaacatgttaacaaccgtaagggccaatt ctttttcatttttaactcgtcaagtttctcagattaatttttggagtatcaatcccgaagtctcattttcagatgatttagactatctgaaa ctctcatcgaagagtttagaagcaattactttgagttcacagataaaaattggccactggattagaaatggatccatgactagtaaacaagc gcaattgtactgcacgaggttcactcaatatggttacatagccgacgttcatttgaaccaacttgctatactcgaagaacgcgacgatgatc gtctattattaaacattttggatagattcaatctaatagattggttctataacgatcaggacgtgcttggtactgttttcgaagaacgatct ttttacctaatgaatgaattggttaagtttctttataatatgttttcacacagagttaacttccagtttgaatcaaatttcacagagaaaac 
ccagtatgaggtaacgcaatacattttatacacgctttgtaaaggatctttgtcattttcagatctgacagccgactttcctatctccgtgg aagttactgtttttgacaagatccttgatgaggttgctgtttacgaagagcccaaaactatgaatgattctggaaagtattctatcaagaaa agttattacaaaaagatggatccaatgtctatttatgtggactcgggtgatttcgatgatgtatcaacagcgatagtaaaggaactttcaat tttaggaaaaataaaagaggagaatgttgtaattgaacctcagatcagtggaccgaatgaatccaacagccgtgtcttgagcagattgaaac ggttcttcattagcaaatctgtagtcaaactgttttataaattgttacaatctgctctttctgagagcaatgagacctacgtcattgaactt ttacatttgattcaagcagttttattagatgaacatgaattgtacagaatcgaagatccagtgcaatactttattcaaattcctgtgtgtga tctactgttatcagttgttgagcacaatgatttttcacgacctgtctgcaaaaaactgaagttctattgaattggttgatccagcgggacga gtcaatcattgactcattggttgattcttttggtgaaaagcacattgaaaactttaaaaaatctaagggatctcaagttctggagactaaac gagctaaacaaaagcgtttagccaaggagagacaagagaagatcaaatcacgatttgctaaacagcaaaagtctttcatgaagcagaatttg gacgcaaaaaagagtgcggaacatgtaactacacatttatccaaagacaatgaaggattaggtagttcctcccaggactcttttcatgagtg cattctttgtcaacgtgctcaggagggcaacgagatgtttggaatccctgcatatgttgaaaaagtttccacgttttgggattttcaaccta aggatgagtcaacctatacggaaagatgcttaacaaccattgaaaatcaaatgaaacaattgcatgaagaaacggatgccaacaatgaggtt agagaacatctttattatcaaaaagatactcctgtaaaaagcatggcaccgatatcttcaagacacattgttaagtcatgcgggcaccacat gcattataaatgtttttctgagttactagaaaacagcaggaagtttagcacttgtccgctttgtcgctctgccattaatgcttttgttccac aatttgccatgaaaaacgatgctagccctgcttttcaggaggctgcttcgaatattagtcactttgaaaagttgaatttgaatcaaattgta tcgaaatatcttctcaatgattccttcttgaaatttattgcggaagaaagtaaggaccagttcatgtatttgaatgagtttaaagacatttt gaaagacgccccagatgcttctgaccacatgttgagtgaagggttatttccctcatttttggccatgtcaacattattgggtaataccctag caaatactgaaattcgtctcagattatcccccgagaagattccccagaaaggaaacttgaagagaaaagattcggaattaataacctcatta cttcaatgtgtctcggttatctcaatcttattgaaacaatcttatcctgaagagcagtatctgtctccatttttgaataaaccaaattcatt aattattgattttgccatttcacttctacttggaaaagaagactcacttcaagaaactattgtgggcatttacaagcaaacaattctgcatt cattgaatttactattgactaacgttggagataatgagcatttcagaaggatgctgagcggtgcaaactctattattaatgattcagaactg 
gccattttcaaaaagtttgtgtcaacggccacttttacctctgatgtttcattcattacttgcaacgaacaattattggttggactgtatat tcttttggagaaaaccaccacagtgtatcttaaacagttgtttctgataatcagcatgtgcagacccttggacttatgcctaaatcgtgact acgagaattccaatgattacgaccactatttgtttggccaactgtgcaaattttttaacctttccagtataatcagttatttgggatctgga attcctggtggaaacctattggaggagcaaaatgatcttatattaaaaggacaatccactctcccttcaacaattgagtatccaggtctcgt ttatcttgtgaatttgcctagagaactgaacacttttactttttcaaaatatgacacccaagatgcagttaatctaaacttttctgtttgtt taacgtgtggcaaaagagtgaaacatagcggtgattctgaaaatgaaattgaaaacttccctgggtacaatggtgttcctcttactttgttt caccatcataagaattgtcctttctctggatatggagaagcacaatgtatcttcttaaccccaaagttgaataaattgactgccttactaaa gattcagcctccacgaggaatttctgatcgctcgctatatcacagtacatttgcattcccattgagcagcccatatctaaccacacatggag agtcacattctggtcatggaggcttgatacgcaaagcgttcctgaatagagatcgatttcgaaatctgaatgagctatggttggatggtgaa ctagctttgtatatttcccgaagccttggggattctcaaattgtagcggaaccaatcaaccctgttatgattacaatgccgggaggtattca ggaggcattaaatcttgcgttcaccactttcctcggtgaccaagaacccggggatgatgacttggaagattatgagtatgacatactgttaa atagatga PAS_chr1- 23 atgtctgcctttggtgtggttccgagtgtattaaacactggaaaccagatcaagcagaaaaacggaacgcttttcaagaaatcttctggagt 3_0221 ttacaataaacagcagcgggatcacaattccagggataaaaagcgatcagctcgtaaaacaaatacaccgccaacaccgactgagagtactt ccgcaaagaagtcatcaactcaatcagacgacaaagtgagtcctgatattttacaattgtcgcatattgagattcaatatgtgggcccactt ctttccaacccagaatctttgggatatgtgaaacaaaacaataataccaaaatcaagactccgaaatatttagtggatacagattcaaacct ggtttttggtcctgatacaactaataaatgggatattgagaaccagcacaaaatgatcgaaatggaatcttcccatcaaggtgactggcaag gtatttatgaacaatttcaagaaatgaataaagtggagcgtcaaaaaatggaagatctgggcttggtggcaaaagagggacaaagcatggac ctgacaaatgctatctcattcaaaggtagctgcgtggatatgtgtcccgtttatgatagagtcaagagggaggtacagagagatgttgatcc attggagagagatcctgccactggtaagatatctcgagagagagctttaaagaaatttgtgcgtccttcaggccaagcaccgcctcttcctt ctgacgtaagacctcctcatattctggtaaaaagtttaaactatattgtggataatttgctggataaattaccgcaaagtcattcattaatt tgggatagaacccgtagtatcagacaagattttacactacagagctactctggcttggaagcaattgagtgtaacgaaagaatttgtcgcat 
acatctactttgtgctcatataatgccgggttctgatcaatctgacttctccaagcagcaagaaattgaacaattcacaaaatcattgaaaa cattaacagacatatatgatgttgtcagatccaaaggaggaaaatgtgccaacgaagctgaattcagggcttataatttgctggtgcatttt cgggacccaaatctaattcatgaaatccagaacttacctactcgaattcttaaggacgaacgagttcaacttgctttaatgtttcgaagtct actattgaataataatttcaaagaataccagaggaacattcctggttgcttgggggtttttcagcagtttttcaatatgtgttttgatccag ccaccccattcttaatcggatgtgtgctggaacttaattttgaagagataagattttacgctttgaaatcgatctcacgttcttatcacaag aaatctgcccctctaacgacccagaagttagcatctatgctcggatttgattccgaggataagctcctaactttcactaattatttcaagac tcctacgtgtactaattctagaaatgaaacgtgcattgatatctcaaaacttagatacgagagttttacggatttggctgctccaaagcaga tttacacttcaagattagacaacaaattaaaaggattcacctataaggatgttgttgatcaaggattaaataacacatccttgcacatagct aatttgaaagaaacaatggctcagaatcaacatattgcagtggagaaattacccaatatctcatttccacaacatgctttgtcttctacccc tttcgaagtagaatcaaagtcagacatagtcagatcttcttccggatcggctccgccccagactttgatcccaccgattcaagaaaaagtaa taacttctcaaatacagccaccaataactcccgtcgttcccactgaagaaatccaaactcttccaaaaatagaggagcccaggttcaaagat cttccaaattttgaaaatgcatgcaaagaggtttcctctattttaatcaagaagactatatctcctttgattgctcccatagtgaacaatca gctagaagagtacaaccggcgacaaacggttttaagggatcaggagagacaaaatcaaagaagacaacttttgatttcatcccttcaggaag aattgtactctgcttttatacgagaacaagtgtatattcaagtggttgatactcaagccaaagagtgctttaacaagaatctgaaacggcga atatttcagaaattcatcgggggtttaattacattgaaaaacaaacaaatgaataagagaagaaaacttgatgaaattcaagtcttcaagaa taaggttgtttcctcaagtcaacttcggtattcagtttcaagaagtcaaacggaggacaattcaacgtcaaactcgagtgacgaggaagcat cagctgttcagatgaatattactctttcaccatctgtggatccactttggtcacccatagatattaagtttatattagactccaatttaaag ttgtttgaggataacaaggataaatactggaatttcatgtttgcgattgccgattggactattctaccaagcaaatggcttcgttacaaatt ccaacttcaaaaccccagtctcataaatactgttgaatcctcaaattacaaagccaaattacgggctctacccagtgacaaacttcttacaa gggaatacatggagcactgtcgatttttggtatttcaagtcggaaaggttgatgaatcatcaaacctgaaagaatctttgttcagagactca cagtttattaaccgattaatgaaatatgccaagaagtactcgcaataccagattggagtacttgtcttatattatcatgaggatgactcttt 
tgataaacagaaaattattgatcttttgttattagaacaatacacaaataagttagtcaactcactcgagatagttgacatgaacaaactca caaatgatgaactgataaaagcattgaccacgctagtccacaactataaggataaaggtatcaacaaatcggtaccaacatcttccaccaaa ggacacaccactagcattatggaacaggatatgacagtatacagctacagcacgtccaattccagggatgctaagcttaattatattttgaa gcaagcctacccccgcagggggtttcacttgaaacaatga PAS_FragD_0022 24 atgtcagaatggccctcagctttggaaaattttgtaagtcattgtttccagcgtgccaacattgagagctttccacccggcaaaaaaaaaga actccaaaaacagttgacgcaaatcatcaatttagcaattcttgaaaacaaacttaattctaataactggtccaaacaaaagctaccaatat ttggagaagcaagagagttagaattggagcagaaaatgggaaatgtttatccaattactgtttctagtcgaagaagtgacttgatgcatcaa gaggcagttcaaccatctgagcctttagttccctccgaaagccaacaaaagaaaaagtctagagaattgcgatttaagatcactaaaaaaag ttctgtatcacccgcaaataaaatacaagttgcttgtgacttgaattgtaaacttgtgggaactaacacctctatcgagaaagattattata gacttacatctcatccggatccttccatggtaagacctttgcctattttaaagaaatcgttgcagcatctttacgccaaatatcaaagtcta gaacgtttcaaagctctcagcaaggcagagtacagctattttttgaatcaactgaaatccctaaggcaagacctcacagtgcaagacattca gaatcagttcactgttaaagtttacgaatttaatactcaattggcgattcaaaatgaagattttggtgagcttaatcaatgtttgactcagc tggcgcaattgtacactgtatcaactatgggtcatacttattactattctgatactggcaaatacaaccaagagcacaactgttttcttgcc aaggatctttgtgaggatcgaaaccatatcaatatgttcaaatttacgagttatagaattttatattttcttctcatagacgccccctggga attgctaaaaataaggcaggatttattcaaccgtggtcaacagtatgcaattcgtcacaacaaatttcttttgaagtcattcaagctttcgg atctcataaccgccatggattatattcatatcaaggacgaatattcattcctcgtgaatatggactcagatgtctgcaatttaaggacagtg tttgatgacgaacatatgactttgaaccaagacgactggtttttctataagatactctaccataagattttcttacgagaacagctgaaggc cctgataactataagcaaatcttatcgacagatatccctctactacttgaaaaatctactgatggatttagtattcttggaaaagaataagt tatctcgtttcattgagaatggtgaggtatttaactgcacgagcgcaagatcattactgcttcaaatagagaagaagcagctatcaaagata gatatcaagggtcaggtatga PAS_chr2- 25 atggttgactcagagactatcaacaaattcatagaagtaacgggagcctctgccttccaagcaattcagtacctagaggagactgatgactt 1_0159 tgaagcggcagtcaatgattattattcctctcaactggagaatgagaagggcaagggtaaatcagaacgtccagtcaatcaaacaaaggctt 
ctgcagggcccaagatcagaactttcaacgacctaaatagcaactcaaatggggacaacaatcttttcacaggtggtgaaaagtccggtctt caagttgagaacccagacaaacgtggggacccttttgggttggtcaatgatcttttgaagaaagctgaggaaactggccaacaaccagatac aaggccccatgaagaagctcctgctagacaatttgttggaactggccacaagctgggcagtacggacagtccctccgaagttagtgtctgac cctgcctcaagaataagaagagctcagaaagtcagccgacagataacattttggaaggacggattccaagttggagacggagatttatacag atatgatgaccctgcaaacgcaagatatctagccgacttgaacgctggaagggcaccactggctcttctagatgtcgagattgggcaagagg tagatgtcacagtgcataaaaagatagaaaaaaatttcactcctcctaagaaagcccgagttggctttcaaggtaaaggtcagagattaggg tctccagtaccgggcgacataaagctcagtcaatctcctgaggtgcaacaagaaacacaagaggaagctgaggaggaaaagcaaaaggagga ggccgagcagctgggaactggggattctcccgttcagattagactcgccaatggtcagagaattgttcatagattcaattctactgattctg ttgctcaattatatgcatttgtcaatgaacatagtccctccgccagagaatttgtgctttctctagctttcccggtgaaacctattgagaac aatgaggacacactcaaggatgctggactcataaacgctgttgttgtccaaagatggaaataa PAS_chr2- 26 atgggcgtgatacttccagacgatggtaagcaatcgggaggccaaccaaatagaagggctaaagtcctgagccgatttttaccaccagaaca 1_0326 tcaaagaccttcaatcggcctcttcctgggaccttttactccagcagctgataatgagattgccctgtggacttgcattggcgctcagctct ttagtgggctggcattgcttagaatgagccgaagatttgttttttcgcccgatcaatctgtaagaaggtttctctttaagacttttcataat gtggtaggtgcagccctgatatttgggagcggattagaagggactaggatgcttctacctgaggatccttggaaagaagaagctagaaaagc aagaatattggcccaattgaaaggtgagcccgttagttggtggtatggacccaagagttttattccttctggaaggttagaatacacaaaac agatgcagtttcacaactttgaagtcatgcataaatcacccgaaaaaatagcccgagctctcatgattaaggacaaactcaaggaggaaaca aataccctttattcgtccattcatgagaaagcggaacaacagactattcgactctctaaagatctacagaacaacgttcccctcaaaggggt aacgtcatatgttcctcaatttagcacttcaaatacggacaccaagttatatttgaaaaatgttagcttgaagacccatgccgacctggaaa aggtctgggcagaacacaatccttgggacatcctggaagagaaaatttctccaatttccgtaattgcactgccaaagtttaacccaattata tctgaggttgaacctgacaagcagcaaccatctacgggtgatatcaaatacattagtgacagaaaataa PAS_chr1- 27 atgaaatatttgccactcgttgctaccctggcctcttcggccctcgctgctggcatcaacttcgcccaattactggaccagaagccactgga 4_0611 
cattgccgataatgttaaatgggaattgaagcctgaggtcgactctgctgctcttcaaagtgcagtcaatgagctagacttgaaaatcgaag ccagctatttgtttaaagttgcacatggttccgtctttgaatacggacatcctaccagagtcatcggttctcctggtcactggtccacaatc aaccatgtcctcgacacattacataacttcaaacactactacgacgttgacgttcagccatttgaagcctttaccggtatccttaagtcttt ctcattgaccattaacggagttgcaccaaagtctgcagaagctttagatttaactcctcctactcctggcggttttccagtgaccggtccag tcgttttagttgataattatggttgtcaagcttctgactatccattcaacgtgactaacggaattgccttaattcaaaggggttcttgttca ttcggtcaaaaatcagaacttgctggtctccgtggagccaaagccgctctcatttacaacaacgtgccaggtagtgctaagggaaccttagg tgccccaactcctcatcaggtaccatcgttgtcactttctcaggaagatggagaggccgtcaagcgtcagcttctgacttctggaagcgtaa ttgcaactgtcgctgtcgattcctacgttaagaagttcaaaaccaagaatgtgattgctaccactcgttacggtaatgatagcaacattgtg atgctaggtgcacattcagactctgttgctgctggaccaggtatcaatgacgatggttctggtaccatctctcttttgaacgtggccaaata cctaactaaattcaaagttaataacaaggttcgtttcgcttggtgggcagctgaagaagaaggattacttggatccgactactacgtttcaa agttaacccccaaggagaaatctcagattcgtttgtttatggactacgatatgatggcttcccctaactacgcctaccaggtctataatgcc actaacagcgagaacccagttggatctgaggagcttaagaatttatacattgactggtacgttgaacagggtctgaactacactctagttcc atttgatggccgatccgactatgatggattcatcaagagcggtattcccggaggtggtattgctaccggagcagaaggtttgaagaccgaag aggaggctgaactatttggtggtgaagctggagttgcatatgacccatgttaccactctctttgtgacgatttggccaaccctgactatgtt ccatgggttgtcaatactaaattaattgcccacagtgtcgccacttatgcaaagagcttggacggattcccattgcgtgaggagcctagccc attcaagatgactgcccagtcaaacttcaagtaccacggtccaaaacttgtcctttag PAS_chr1- 28 atgctcaaacactccttaaaaacagggttggtctttctcacttggataccggtgatttatacggtaaaggaacacctgatatacgttggaaa 1_0274 ggtggaaggatcctcaatgtcacccactttgaatcccgttaaaggttattctgactatgtgattttatggaagttaaacttcaaagagtcac tcaaagtgggagacgtggtttttataaggtctcctgtagatccagagaagttatatgctaaacgtataaaggctgttcaaggggataccgtg gtgactaggcatccataccccaaagacaaagtgtccattccaagaaaccatctttgggtagaaggagacaatatacacagcgtggatagtaa caactttggtccgatatcgttgggccttgtattaggaagagcaactcacgtaatttttcccctgaacaggataggtaatatctctggtgaag ggggtagagaagttagggaggattatttaagagcggaggacagtccgatgtaa 
PAS_chr4_0834 29 atggtttctgaaattcagcttagattagctgttattatttatgatatactctgttcggcgtcttatgttctagtcatccatttgagaccaac cagagcccttccgcatcaacccatagaccgtaacaatcctctaacgattaaagaaaggtgccagcgagccagtgtgttgactgctacacatg tattattattgcctattcttttaaaagtgttgagactgtcagaaattgcggaaactacggcgaaacttggaatagtggtgggatatcacaac cagagctggtctttctctaacctccaagatgatattgtcagcattttcaaagctttaggtttgaccatgattctcttttctggtcctattgt agattatttttactattcaaactcaacagaagtaatcaagcaagatctggcgtatgtcgttagcctcgagggtatgcgtgatctacttgtgg gacccatcactgaggaacttctttatcggtcatgttccatttcattaatgctagtagctaacgattacgccaacaaatttctgttcggccaa cactggttaataatggtatcatcactctacttcggtatagcacatcttcatcatgctgttgaactgtatcattgtaaaagatattcattaac taccataaccatatcaactgccttccaatggtcatatacaacgttatttggaatatatgcaagctttctatacttgcgaacaggatctgtat ggtcagcaatagttgttcattcattttgcaacatgatggggtttccccggttgacatttggacgtgatgaagcgagagattggaaagtgggt tactatgtgttgctcgctctaggttccgtcctattcaaaaagtttctttactctctaacagaatctaaccatacgcttcttctataa PAS_chr3_0896 30 atgtatcccgaacacaagtatcgggagtatcaacggagggtgcccttatggcagtactccctgttggtgattgtactgctatacgggtctca tttgcttatcagcaccatcaacttgatacactataaccacaaaaattatcatgcacacccagtcaatagtggtatcgttcttaatgagtttg ctgatgacgattcattctctttgaatggcactctgaacttggagaactggagaaatggtaccttttcccctaaatttcattccattcagtgg accgaaataggtcaggaagatgaccagggatattacattctctcttccaattcctcttacatagtaaagtctttatccgacccagactttga atctgttctattcaacgagtctacaatcacttacaacggtgaagaacatcatgtggaagacgtcatagtgtccaataatcttcaatatgcat tggtagttacggataagagacataattggcgccattctttttttgcgaattactggctgtataaagtcaacaatcctgaacaggttcagcct ttgtttgatacagatctatcgttgaatggtcttattagccttgtccattggtctccggattcttcccaagttgcatttgtgttggaaaataa catatatttgaagcatcttaacaacttttctgattcaaggattgatcaactaacttatgatggaggcgaaaacatattttatggcaaaccag attgggtttatgaagaagaagtgtttgaaagcaactctgctatgtggtggtctccaaatggaaagtttttatcaatattgcgaactaatgac acccaagtgcctgtctatcctattccatattttgttcagtctgatgctgaaacagctatcgatgaataccctcttctgaaacacataaaata cccaaaggcaggatttcccaatccagttgttgatgtgattgtatacgatgttcaacgccagcacatatctaggttacctgctggtgatcctt 
tctacaacgatgagaacattaccaatgaggacagacttatcactgagatcatctgggttggtgattcacggttcctgaccaagattacgaac agggaaagtgacttgttagcattttatctggtagacgctgaggctaacaatagtaagctggtaagattccaagatgctaagagcaccaagtc ttggtttgaaattgaacacaacacattgtatattcctaaggatacttcagtgggaagggcacaagatggctacatcgacaccatagatgtta acggctacaaccatttagcctatttctcaccaccagacaacccagaccccaaggtcattcttacgcgtggtgattgggaagtcgttgacagt ccatctgcatttgacttcaaaagaaatttggtttactttacagcaaccaagaaatcctcaatagaaagacatgtttattgtgttgggataga cgggaaacaattcaacaatgtaactgatgtttcatcagatggatactacagtacaagcttttcccctggagcaagatatgtattgctatcac accaaggtccccgtgtaccttatcaaaagatgatagatcttgtcaaaggcaccgaagaaataatcgaatctaacgaagatttgaaagactcc gttgctttatttgatttacctgatgtcaagtacggcgaaatcgagcttgaaaaaggtgtcaagtcaaactacgttgagatcaggcctaagaa cttcgatgaaagcaaaaagtatccggttttattttttgtgtatggggggccaggttcccaattggtaacaaagacattttctaagagtttcc agcatgttgtatcctctgagcttgacgtcattgttgtcacggtggatggaagagggactggatttaaaggtagaaaatatagatccatagtg cgggacaacttgggtcattatgaatccctggaccaaatcacggcaggaaaaatttgggcagcaaagccttacgttgatgagaatagactggc catttggggttggtcttatggaggttacatgacgctaaaggttttagaacaggataaaggtgaaacattcaaatatggaatgtctgttgccc ctgtgacgaattggaaattctatgattctatctacacagaaagatacatgcacactcctcaggacaatccaaactattataattcgtcaatc catgagattgataatttgaagggagtgaagaggttcttgctaatgcacggaactggtgacgacaatgttcacttccaaaatacactcaaagt tctagatttatttgatttacatggtcttgaaaactatgatatccacgtgttccctgatagtgatcacagtattagatatcacaacggtaatg ttatagtgtatgataagctattccattggattaggcgtgcattcaaggctggcaaataa PAS_chr3_0561 31 atgaaaccgtatcaccatgcaaaaagccgcccaataggcagctacctgtattttggggtgtttaccgtagcattgacatttctgacgtggct taaatatgacgcagagctgtttgctcagcaggttcactcgaaagacatttatgacccacagttcaacattacgttgccaattgatggcccaa catttaccccatcaaagaactattcaattagtgttcaaaatgcagcagtggcgtccgatatagaacaatgttcaaaattaggtgtatctatt ctgcagcaaggtggcaatgcggccgattcagcagtcaccgtggccctgtgtatcggaacaatcaattcgtattcgtccggtatagggggagg aggattcattgtctctaagttaattgataatcctaccgctctgagttttgattgtcgagaaatggctccttctaaaagtttcaaagaaatgt 
tcaactatcatgaggagaaggccagagtaggtggtttggctgtcgccattccaggagagttaaagggactctatgaactgtttcagcaccat ggttctggtaatgttgagtggaaagatttgattttgcccgttgctgagttggctgaggtgggatggactgtcgatccgctgttttctagtgc attgaaatctattgagcaccatatttacgagcattcatatgattgggcctttgcattgaatgaagacggaaaaattaaaaaaagaggtgact ggattaatcgtcccatgttggctactacgttgaggagaatagctgaaagtggcaacgttgatctattctatgacccagagagcgatatagta caaagcatggtgaatgctactagaaagtatggaggaatccttgaagcctcagactttgcaaaatatagagttcgaattgaagaatcgttgac attgcataactttacatctgacggccttacggtttatacgtccaatggggcatcctcagggttggtgctccttgctgggttgaagctcatgg acttattcgaagatttcaaggaatttcataatgatttcggggctgttgagtctcaaaggcttgttgaaacgatgaagtggatggcttcagta agaagcaaccttggagatttgaacatttactccaccaacgaaactgaaattgacgatcataggaagaggtacgacagatacaaatcagatga gtgggcaatagaaactcatgccaaaattaatgattcccacacacttccttcttggaaagattatgctccagcctttctacctaatgatcctc atggtacatctcatttcagtatcgttgaccaatacggtaatgcggtggctatgacaaccactgttaaccttggatttggatctaaaatacac gatcctatatcagggattattctaaatgatgaaatggacgatttttcagttccaacatcatctaatgcatttggtttgcatccatcaatcta taattgggtagagccttacaaaagacctctctcttcatgtgctcctaccgtaattgttgattctctgggagtacctcattttgtcatcgggg cagcaggagggtccaagatcactaccacagttttacaagcaattataagagtttaccattatcacctggatcttttagacgtcattgcatat ccacgctttcatcatcaactacttccggaagaagttcttctggagtttccacgagataataaactaatacgccatctaaaagaaagagggca tgatgttagagtccaagcaccaatatccaccatgaatggtatcctacgaaaaagaggtggaagcctgatagcagttagtgatcactggagaa agcttggtcgaccttggggcttttga PAS_chr3_0633 32 atgaaatcggttatttggagccttctatctttgctagcattgtcgcaggcattgactattccattgctggaagagcttcaacagcaaacatt ttttagcaagaaaaccgttcctcaacaagttgctgaattggtgggcacccattactctaaggatgagataatcagtctatggaaggacattg agctggatgtacccagggaaaagatccaagaggccttcgataagttcgtaaaacaatcaactgccacttcccccgttagaaatgaatttccc ttgtctcagcaagattgggtgacagtgaccaacaccaagtttgataattatcaattgagggttaaaaaatcccaccctgaaaagctaaacat tgataaggtaaagcaatcttcgggatacctggatatcattgatcaagataagcatcttttctattggttttttgaatcccgaaatgatccgt ccacagacccaatcatcctatggttgaatggtggacccggctgctcttctattacagggttgctattcgaaaagattggccccagttacatc 
accaaagagattaagccggaacataatccttattcatggaacaacaatgctagtgttatcttccttgagcaaccggttggagtaggattttc ttactcttctaagaaagtcggtgatactgcaactgctgccaaagatacatatgtgtttttggagcttttcttccaaaagtttcctcagttcc tgacctctaatctgcacattgctggggaatcgtatgctggccattatttgcccaagattgcttctgagattgtgtctcacgcagacaagacg tttgacctttcaggagtcatgatcggtaatggtcttactgatcctctaattcagtataagtactatcagccaatggcctgtggaaaaggtgg ctacaagcaggtcatttcggacgaggaatgtgatgaattggatagggtctatccaagatgtgaacgtttaacgcgggcatgttatgagttcc aaaattcagttacttgtgttccggcaacactttattgcgaccaaaagctactgaagccgtacactgacactggcttgaatgtctatgatatt cgtacaatgtgcgatgaagggactgatttgtgttacaaagaactggaatacgtggagaagtacatgaaccagcctgaagtgcaggaagccgt gggctctgaagtcagttcttacaaaggttgtgacgatgatgtcttcttaagatttttgtactctggcgatggatctaagcctttccaccagt atatcacggatgttctcaatgcaagtattccggttctgatttacgcaggtgataaagattatatctgtaattggctaggaaaccaagcttgg gtcaatgagctagaatggaacttgtctgaggaattccaggcaactccgattcgaccgtggttcactttggacaataacgattatgcaggaaa cgtacaaacttatggaaacttttcctttctaagagtatttgatgctggtcacatggttccttacaatcaaccagtcaacgcacttgacatgg ttgtcagatggacacacggtgatttctcatttggttattaa PAS_chr4_0013 33 atgactcaattagatgtcgaatcattgattcaagaactcacactaaatgaaaaggttcaacttctgtccggatcagacttttggcacaccac cccagttagacgtctaggaattccaaagatgagattatctgacggtcctaacggcgtccgaggaaccaagtttttcaatggagttccaaccg catgttttccttgtggtactggattaggtgccactttcgataaagaacttctaaaagaagctggctccttgatggcagacgaagctaaagca aaagctgcctcggtagttttgggtcctacagctaacattgctcgaggccccaacggaggaagaggcttcgaatcttttggagaggatccagt ggttaatggattatctagtgctgcaatgattaatggattgcaaggtaaatatattgcggctaccatgaaacattatgtttgtaacgatttag agatggatcgtaattgcattgatgcacaggtgtctcacagagctctaagagaagtgtaccttcttccattccaaattgcggtaagagatgca aatcctcgcgctatcatgactgcttataataaagcaaacggtgaacatgtatctcagtcaaagtttcttctagatgaggttttgagaaaaga atggggctgggatggtttgttaatgtccgattggttcggtgtgtacgatgcaaagtcttctatcactaatggtcttgacctggaaatgcctg gtccacctcagtgcagagtccattcggcaaccgatcatgccatcaattctggggagatacacataaatgatgtcgatgagcgggtgcgaagc 
ctcttaagtttaattaactattgtcaccagagtggcgtcactgaggaggatccggagacatccgataacaacaccccagagaccatcgaaaa actcagaaaaatcagtagagaatcaatcgtcttgctgaaggatgatgacaggaacagaagtatccttcctctgaagaagtcagataaaattg ccgtgattggaaacaatgctaagcaggctgcatattgcggaggaggttctgcttctgttctctcgtaccatactacaactcctttcgactct atcaaatcacgattggaagattcaaacactccagcttacaccatcggtgctgatgcttacaagaaccttccgcctttgggccctcagatgac agacagcgatggaaaaccggggttcgacgccaaattttttgttggctcgcctacatctaaagatagaaagctgattgatcactttcagttga ccaattcacaagtcttcctggttgactactataatgaacagatccctgaaaacaaagagttttacgtagacgttgaagggcaattcattcct gaggaagatggaacctataactttggcttgaccgtattcggaacgggaagattattcgtggatgataagctggtttccgatagtagccaaaa ccagacccctggagattccttttttggactagcagctcaagaggttatcgggtccattcatttggtcaagggtaaagcatataaaataaagg ttctttatggatccagtgtcaccagaacatatgaaattgcagccagtgttgcttttgaaggaggagcatttacttttggtgcagcaaaacaa agaaatgaagatgaagaaattgctagagctgtggaaattgctaaggcaaatgataaagtggtgttgtgcataggtctaaatcaagactttga aagtgagggattcgacaggccggatatcaaaattcctggagcaaccaacaagatggtaagtgctgttttgaaggctaaccctaacactgtga tcgtcaaccaaacaggaaccccagtcgagatgccatgggccagtgacgctccagtgatcttgcaggcttggtttggggggtctgaggcaggg accgctatagctgatgtactattcggtgactacaaccctagcggaaaactaacggttacttttcccttgagatttgaggataaccctgcata tctcaacttccaatccaataagcaagcatgttggtatggggaagacgtttatgtgggctacagatattacgagaccatagacaggcctgtgt tattcccatttggccacggattgtcattcaccgaatttgattttaccgacatgtttgtcaggcttgaagaagaaaaccttgaagttgaggtt gtagtcagaaacacaggaaagtatgatggtgctgaagttgtgcagttgtacgtagcaccagtatccccatccctgaaaaggcccatcaaaga actcaaggaatatgctaagattttcttagccagtggtgaggcaaaaacagttcacctgagcgttcctattaagtatgccacttcgttctttg acgaatatcagaagaaatggtgctccgagaaaggagagtacacaatcttactgggatccagctcagcagatattaaagtttcgcaatctatt actttagaaaaaacaactttttggaaaggtttatag PAS_chr2- 34 atgttcctcaaaagtctccttagttttgcgtctatcctaacgctttgcaaggcctgggatctggaagatgtacaagatgcaccaaagatcaa 1_0172 aggtaatgaagtacccggtcgctatatcattgagtatgaagaagcttccacttcagcatttgctacccaactgagagctgggggatatgact 
ttaacatccaatacgactactcaactggttcccttttcaacggagcatctgttcaaatcagcaacgataacaaaaccactttccaggatttg caaagtttgcgtgcagtcaaaaatgtttacccagctactctcattacattagatgaaacatttgagcttgctgacacgaagccatggaaccc tcatggaattaccggtgtcgattctttgcatgagcaaggatatactggtagtggtgttgttattgcagttatcgatactggtgttgactata cacaccctgctctgggtggtggtatcggagataatttccctatcaaagctggttatgatttgtcttccggtgatggtgtcatcacgaatgat cctatggattgtgacggtcatggtacctttgtatcctccatcattgttgcaaataacaaagatatggttggtgttgcaccagatgctcagat tgtcatgtacaaagtgttcccctgttctgatagtacttcgactgacatagttatggcgggtatgcaaaaggcctatgatgatggtcacaaga ttatttcgctatcactgggatctgactcggggttttccagtactccagcttccttaatggccagcaggattgctcaagacagagttgttttg gtggctgctggtaactctggagaacttggtccattctatgcctcctcccctgcttctgggaaacaagtcatttcagttggatctgttcaaaa cgaacaatggacaacctttccagtaacctttacctcttcaaacggtgaatcaagggtttttccttacctcgcttacaatggtgcacagattg gatttgatgccgagcttgaggttgattttaccgaagaaagaggatgcgtctatgaaccagagatctccgcagataatgcgaataaagctatt ttgttaagaaggggcgtcggctgtgttgaaaacttggaattcaatttattgtctgtggctggttacaaggcttacttcttgtacaactcatt ttcaagaccatggagtctcttgaatatttctccactgattgagctagacaacgcttactctcttgttgaagaggaagttggaatatgggtga aaacccaaatcgacgccggtaacaccgtcaagttaaaggtgagcacgagtgaccaaatgttgccatctgataaagagtatttgggagttgga aagatggattattactcctctcaaggacctgcttatgagcttgaatttttcccaacgatatccgctccaggtggagacagttggggcgcttg gcccggtgggcaatacggtgttgcctcaggaacaagttttgcttgcccctatgttgcaggtcttacagctctttatgaatcgcagtttggaa ttcaagatccccaggactatgtgagaaaattagtctccacagctaccgatcttcaattatttgactggaacgcagtgaaacttgagacctct atgaatgctccacttattcaacagggagctggtctagtgaacgctcttggtttgtttgagactaagactgtgatcgtgtctgctccttattt ggagctcaatgacaccatcaatagagccagtgagtataccattcaaattaagaatgagaactctgagactattacctatcaagttgttcacg ttccgggaactactgtctactctagatcagcttctgggaacatcccatacctggtcaatcaagattttgcaccttacggtgatagtgatgct gcgacagttgctctatccacagaagagttggttttgggaccaggagaagttggtgaagtcactgtgatcttctctacagaagaaattgatca agaaactgctccaattattcagggtaagattacattttatggtgatgtcataccgattgctgttccttatatgggagttgaagttgatattc 
attcctgggagcctctcattgagaggcctttatcagtgagaatgtatttggatgatggttccttagcatatgttgatgatgatcctgattat gagttcaatgtgtatgactgggattctcctagattttattttaacctgagatatgcaaccaaagaagtatcgattgacttggtgcaccctga ttatagcattgagaacgactacgaatggcctttagtttccggacacaacaactattatggtcccgtgggatacgactacgattatacctcgg gtcaagcctttttgcctcgttactttcaacaacgtattaacgaacttggatatctttctttttccagatttgctaacttttctgtagttcct gctggtgaatacaaagctctatttagagttttgctaccatatggagacttttggaacaaagaagactggcaattgtttgaatccccagtgtt taacgtcctcgctccaccgaatgaagaaaacactactgaagagccaactgaggaatccagcgaggagcctaccgaagagtcaacgtctgagt caactgaagagccctcttctgagtcaactgagaaatctagcgaggtgccaactgaagaaattactgaagatgcaacatccacaattgatgat gatgaagcatccaccgaaagctctactgaagaaccaagtgctcagcccaccggtccttactctgatttgactgtcggtgaggccattaccga cgttagtgtcaccagtttgaggacaactgaagcatttggatacacttccgactggttggttgtgtctttcactttcaacactactgacagag atattactctcccaccttacgctgttgtacaagtaactatcccaaatgaacttcaattcattgctcatccagaatacgccccataccttgag ccctcattgcaagttttctacactaagaatgaaagattaattatgactagtcagttcaactacgacaccagagtcatcgacttcaagtttga caatcgagaccaagtaataactcaagtggagggagttgtttatttcacgatgaaactagaacaagatttcatttctgcattggccccaggtg aatacgattttgaatttcatacatccgttgattcttatgcttcgacctttgactttattccattgattagatccgagccaatcaaattgata gcaggtgcaccagacgaagttgaatggtttattgatattccaagtgcatacagcgatttggcaacgatagatattagttctgatatcgatac taatgataatttgcagcagtacttctatgattgctcaaagctcaagtacactattggaaaagagtttgatcagtggggtaattttacagctg gatcagatggtaaccaatacagcaataccaccgatgggtatgttccaattactgattctaccggctctccagtagctgaagttcaatgttta atggaaagtatctcattgagtttcacaaatactcttgctgaggatgaagtattgagagttgttcttcactcttctgcgtttagacgtggttc attcaccatggccaacgtggtaaacgttgacattacagctggtggattggcaaaaagagaactcttctcttatatattggatgaaaattact atgctagtactggatctgaggggttggcatttgacgtatttgaagttgctgatcaggtcgaggagccaactgaggagtcaacctcagaggaa tctactgaacaggaaacttccaccgaggaacctaccgaggaatcaactgaacctactgaggaatctacccaggaacctactgaagagcccac cgacgagcctacttctgagtcaactgaggaaccttctgaggagccaacttctgacgatctctcaattgacccaactgctgtacctaccgatg 
aacctactgaagagccaactgaggagcctacttctgagtcaactgaggaaccttctgaggagccaacttctgacgatctctcaattgaccca actgctgtacctaccgatgaacctactgaagagccaactgaggagccgacctctgagactaccgatgatccatcgatagcacctactgctgt gccaacttccgacacatcttctggacaatcggtggttactcaaaacactacagtcactcagactaccatcacttcagtctgtaatgtttgtg ctgagacccctgtaacaatcacttacactgcaccagttgtgactaagccagtttcttacaccaccgttacttcagtttgccatgtatgtgca gagacaccaatcacagttaccttgacgttgccatgtgaaaccgaagacgtgacaaagactgccggccctaagactgtcacttacaccgaagt ttgcaactcctgtgctgacaagcctatcacttacacctacatcgctccagagtacactcaaggtgccgaacgtacaacagttacatcggttt gcaacgtttgtgctgagacacctgtaacgctaacatacactgcgccgaaagccagtcgtcatacagttccttcacaatattcaagtgccgga gagctcatttcatccaaggggatcacgattcctactgttcctgcccgtccaactggtacttatagtaagtctgttgacactagccaacgtac actcgctaccattacaaaatcttcagatgagtctaacactgttaccactactcaagccacacaagttttgagcggtgaatccagtggaattc aagctgcttcaaacagcacgagcatctcagctccaactgtcactacagctgggaacgagaactctggatctagattttcgtttgctggacta ttcacagttctgcctcttatcttgttcgttatataa PAS_chr1- 35 atgcagtttgcttccttactgcttctcttgtatattttcttggggcaaatttatcctactgaagcagcaaaatattttgttcgtctgaagaa 4_0251 gcctcacacactagacctcttgttcaaacaggatgaagcagatgcatctgctgagaaccgaatctctcttcatggtttaagggaccgaatca aaaaaaagatctcttttggaacgttcgaaggttttgttggtgaattcacaacagaacttgtagaaaaactaaaaaagaattcgttgattgca gacataactcctgacattatcgtctcatcttgcgatatcgaattgcagtcccccgctcctgatcacctggctaggttatccaaagaaggtgc cgtaagagcacaagatcgtcttcttggaccggaatttttctacgatggtgactggactggagaaggcgtcaatgtatacgtgatagacacgg gtatcagggtaaatctagatgaatttgagggcagagcatcatttggtgctgattttacaggcactgggaaagatgactctgttggtcatgga acccacgtagctggtcttattggctccaaaacttttggagtggccaaaaatatcaacttgatatccgtaaaagctctctctggtaatgggag cggttcgctttcagaggtcctacaggcgattgaattcgcagtcaagcatatgaaagccagtcgtaagccaggtgttgctaacttgtctctag gtgcaccaaaaaattcaatccttgaaaaagcgattgaagaggcattcaagaacggtttagtcatagtagcagcagctggcaatgccttcgtg gatgcctgtaacacatcccctgcaaactctccatatgcaatcaccgttggagctataggtgatcacaacgatgaaataactagattttccaa 
ctggggagcctgtgtcgatctttttgcaggaggggacacaattgtaagtgtaggacttctcaatggagtcgctgtccgcatgtctggaactt cgatgtctgctccaatagtcgcaggcttagccggaatattacttgaccagggtgtggccccagaagatgtaaaaggtaagttaatagagctc tcagatgaagggaagatcaacgataatactggaattctaaagccgggaactccaaaccgaatagccaacaatggaattcgaaaaagtgatta tgaagatcaaaaagaaaatgacaatgatgaagacgatgaagacggggaagacaatctagaagacattgaagaggacgaggattattgggatg aagagagaaggtatagggaatatgcggtatctagtttagtcttctaa PAS_chr4_0874 36 atgttcaacattatccaacggatacagagtttgagcaatttttatttaacggtttccattctattatgtattgttacaacagttgtctcaat tattagtatgttcttggatgaaacgtccagtattcctgcccaattaagcaatgttgtaatatcaacaaatttaaagtatagcagatcgtttg gttcagtcggtggtagacctaaagaaaactccaagattttatttgatcttgatatggatctggctccattattcaattggaatactaaacaa ctgtttgtacaattggtagcagagtaccctacctctgttgccgatgatggtgcgaaggtgacctattgggatagcataattactgagaaaaa gtacgcaagagtgcatgtcaataagcagaggggaaaatactcagtttgggacgtgtcggactcctttcaaggccgcaatgctacggttaaac tgaaatggaacttacagccctatgtcggctttctattctttggacaaactaagggagagattgaggtggcctatcctgcaacataa PAS_chr3_0513 37 atgagtgtcatagtgcatcctcttgcactattgacaataatcgacgagttccagagacgaggtcgcaacaacgattccataatattcggtgg gttacttggtaaacatgatgaatccaccaaccaaatatctgttgttaacagctttgtgataccattgatcgataatcagtttttgaataaag agtacttgcaggacatgctactcaaattttctatcattaattccaactttcgattcgtaggttactatcacgttcaatctttaaacggtacc gaaactcaacagtatgacttgaacgctattaacctagtatgccaagatgataataggccttcgtcctttgtccattggatagtaacagatcc aaaagagttcaaatcattctcgatgtattacttggatgattcaatggttcaactcgtcaattccaatattcaacattacatttctaaaccat tgccctatgaatttaaaaaccttctgtctgagaaaattgctatcgacacaatcctcaagcaatccaggctagaaaaagacttatccaccaaa aactcactgaagaaattaaacaatagttatatcgacattcattcctcactgaacgttctctataaatcagtcaataggcttattcgttacct caaaaaatgctcaaaatcagaagtttcaattgactatgacacagttcaggaaatgaatactgtaatactgaaaattgaaaggcttaaattga taccccaagtcaaggaggagtttgacttagtgactctttcactactggtagacaatcttgatcagatggatcatcttttgtatctccggaaa caagtggaacagtacaaaatatctgaatcaatgtatagttag PAS_chr1- 38 atgaaatttcactcgattgtcttcacattttcactcgttttgagttcactggcgttgtcgataccatgggtgtctgaccacatggtccagca 1_0127 
tctttttgccgacccttcaatcagtaaaggtcctgatgtagatctcgttgggctacataagcatttggtcagcatcaaatctctttcgggct atgaacaagaagtagtatcgtggttggccgattatctagccagtaggggtcttactgtggagttgaacaaggtcgaggacgaaactgaacgt tacaatttgtatgcttatttgggaaccacccgcaacactaaggttgtgctaacttctcacttagacacagttcccccttatcttccctacaa agttgaggaaggtggctatatctttggcagaggaagctgtgatgctaagggatcagttgcggcacaagtgattgccttcctaaatctcttgg aagagggctccatcaaagaaggtgatgtcagtcttttgtacgtcgttggtgaagagattggaggtgatggaatgcgcacagctagcaagacc ttgggtgctaaatgggacactgccatttttggagaacctaccgagaacaagcttgccattggacacaagggaattgcactgtttgacctgaa gattacaggaaaatcctgtcattctggataccctgagctgggaattgatgccgacgctatgttggtccagattttgcacaagttgctttttg agacttcttggcctgtcagtgatttgctgggaaactccacagtcaacgcgggacagatcaacggaggagtagctgctaatgttatttcttcg gaagcacatgccaaggttttaatccgcgtggctaaagacattgacgctgtagagaagctgatctacgaggccattgcccccttcgaggagta tacagacattacctttcactccaaagaagatgctactttcttggattacaaggttgaagggttcgagaactacattgcagcctacagtaccg atgtaccattcctagtgacgggctccaatttgaccagatatttgtacggaccaggaagcatcatggtggctcatgggcctgatgaaatggtc aaggtttcagacctgcaggatagtgttgacggatacaagcgattagtctccgtctcactttag PAS_chr4_0686 39 atgccagagaaaaagaaacaaaaaaaagagtcgacatctccattcaagggtaacctagttgggatctcattggtagctgtggcattgtttgc catctaccagtacctctacccaagctcgttttcctctcagcctgaaaccccagccccagttttcgatctgagcagtgaattagaagcattgt gtcccgtgtaccctgcagtcagatcttccgacttcgaaaaggatcgccccatcttagagagaattctgaacgatccctcatttagaatcgct tctgctcaaaaactgagtaaggctgttcagatcgatacccaagtgttcgacgaacaattggacgtggctcaagaccctgaagtttggaccaa attcgtcaagttccatgaatatttggaggcaactttccccaccgtttactcccaattgaaggtcgacaaaatcaacacctatggcttggttt tcacttgggaaggctcagaccctagtctgaaaccactcatgttcttggctcaccaagacgtggttccagtccagaaagatactcttcaggat tggtcatatccccctttcgaaggacgtatcgccgatgacagagtttggggacgtggatcagctgattgcaagagtttactgattgcattact ggaaaccgtagaattgctggtagatgaagggtactcaccaaagagaggtgtcatcctcgcatttggattcgacgaagaagcttcaggtacct acggtgctcacaatatctccaagtttttgcttgagaaatatgggccagatagtattgccctcattttggatgaaggtgaggctgtcagttac 
gtggacaagaaacaaactaccctcgttgcaaagattgctacgcaggaaaagggttaccttgacctagaggtcgcattgaccactgtaggagg ccattcttctgtcccccctaagcacactgcaattggccttatttccaagttggtcacacatatcgaagatcatccattggacccagaaatta gtaccagaaatcctctggtacagttttcgaactgtcttggtgcagctggggctttgagagatgacttcaagactgctcttgttgcatacagc aaggatccgtcgaacaacattgtcaaacaaggtgtgattaaaggtatttccaagattgcatttttcttcggttctttgattaccacaacaca agccaccgatcttattttcggtggagagaagatcaatgctttgcctgaaagtgctagagtagttatcaaccatagagtggacgttgagcgtg attcagcccaaatcatagacagattgattcacttccacgttgttcctattgccaaggagcacggtttcaaggtcacttacagtgactatggt agtgacaaagttgaaactgtctacgagccagaaggagttgcctcattgggagaattccacgtttctcctttctccagagtctgggagcctgc tccagaatctccatccgacgacaatgtctggtccatcatttctggtaccactcgtacgatatttgaggagtttgtggacccctcggctaaac ttattgcaagtccatacatgatgcctggtaacaccgacactcgacactactggccgctgacaaagaatatctatagatacgttccaggtatt gtagatatttacaaggctaagatacactcggtagatgaatctaccgaggttgatgcccacttgcaagttatagctttctaccacgagttcat caaggttgccagcgaatgggagctttga PAS_chr2- 40 atgaaatcctctaaagaactatacaaggaggctctcaactatgaatactcttccgcggtttctttcaaggcctgggttcgaagtgctcaaat 2_0056 cattttgcgacatgcccggcagtttgctgaacaaagatacatcagtgagtgctataagttgtctgttcgttttgtagacttgattgtgaaca agatggccacgcataaagagctcaagcaattgaagaaaataaatgcaccagtatatctcacctatttggatttggctacgaagaaagtccca gatgtcatcaaggaatgtgaggccttgaagacaattttggatgatgagtaccaaagctacctcaaactgcaacaattgaaacgacagaagca gaaagaccaattgatccatcatcagaatcaggctcaaacgcataaattacgtagatcttcatcaatattgaaagatcatatcaacgctgttg atgaaagagcgctgttgaaacaactacagcagttgacataccatgatcgtgaattcgcaaccgcaataacggagatgccaaattatccagag atcccccagctgagtatttcaacgaatcagaacactagatcagaggcacccccacttccaccaagagtatcgcaggaacagtcattagcacc agtatcactagattcatcacaggcagatttacaacacaaaactgttaacttcaccgaagctgggcaaccattacgaacagtatttatttcag atagactccaatctgagttccttagactagcggaaccaaacacgatacaaaagctagagacttgtggcatcctttgtggaaagctcgtcaga aatgcattcttcatcacccatttggttataccagatcaagagtcgacaccaaacacatgtaatacaagaaatgaggaaaagttattcgacac 
tatagatcagcttgatttatttgtccttggatggatacatacccacccaacacaatcatgcttcctgtcttccatagacttacatacacaga attcgtaccagatcatgttaagcgaagcaattgccattgtgtgtgcaccagcacctcagttttctcatcattcttttggatgttttcggcta acccatcctccgggaattccaaccattacacaatgcactaggacgggatttcatcctcatgaggaacccaatctgtatgtgacttgtaatcg aaagaacatgggcgacgtgcaaggcggacacgttgtgatcaagaatcatttaccgtttgaaaagcttgatctaagataa PAS_chr2- 41 atgactagttctgtagataaagtgagtcagaaggtcgctgacgtaaaactgggctcctccaagtcaacaaagaataacaagagcaaaggtaa 2_0159 aggaaaatccaacaagaatcaagtggttgaggatgatgatgaggatgattttgaaaaggccttggagcttgcaatgcaattagatgcacaaa aactagctcagaaaaaagctgatgatgtgcctcttgttgaagaagaagagaaaaaagttgaggaaaagattgaacagcaatatgaccccatt tccactttttaccctgatggaaactatccccaaggagaagttgtggattacaaagatgacaacttgtaccgtactactgatgaagaaaagcg agctttggatcgagagaagaataacaagtggaatgaatttcgtaaaggtgctgaaattcataggagagttcgaaaactggcaaaggatgaga tcaaaccgggaatgtcaatgatcgagatcgccgaactaatcgaaaacgcagttcgtggatatagtggtgaagacggactcaagggtggtatg ggatttccttgtggtctttctttgaaccattgtgctgcgcactattctcctaatgctaacgacaaacttgtcttaaattatgaagacgtcat gaaagtagattttggtgtccatgtgaacggtcacattatcgatagtgcattcacgttaacattcgatgacaaatatgatgatctgttgaaag ctgtcaaggatgctaccaatactggtattcgtgaagcaggtattgatgtgagattgaccgacattggtgaagccatccaagaagtaatggag tcctacgaagttactttagacggagaaacataccaagttaaacctatcaagaatctttgtggccataacatcggccagtatagaattcatgg tggtaagtctgttcccatagtgaagaattttgacaacaccaagatggaggaaggtgaaacctttgcaattgaaacctttggcagtacaggaa ggggtcatgtgataggacaaggtgaatgctctcactacgccaagaatccagatgcccccgccaatgctatctccagcattcgtgtgaaccgt gctaaacaattgctaaagactatcgatgagaactttggtactcttccattctgtcgtcgctacatagatcgtcttggagaagaaaagtactt attggcattgaaccagttggttaaatctggagttgttagcgattatccacccttggtagatgtcaaggggtcatacactgcccaatacgagc acaccatccttttgagacctaatgttaaggaagttgtatcccgcggtgaagactactag PAS_chr3_0388 42 atgattcacagctgtgctagtgctgagtgctcaaaagcgactgaatctaccttaaaatgtcccttgtgtctaaaacaaggtcagatccaata tttttgtaaccaaaaatgtttcaagaatggatggaagatccacaaagcggttcacgccaaagatggtgatatagatggttcgtacaacccct 
ttcccaactttgcctacaccggtgagctcagaccagcatatcccttgtctgtgagacgagaggttccagagaacattactctcccagattat gctcttgatggagtaccagtctcagaaatcaaaaataacagaatgaacaagatcaatttggtaacggagccagaagacctggccaagctaaa aaatgtttgccgtttagcacgagaggttctagatgctgcggctgcatctatcaaaccaggagttaccactgatgagatagatgaaatcgttc atagtgaaacaatcaagagagaagcatacccctcccctttaaattacttcaattttcccaaatctgtttgcacatccgttaatgaagtcatc tgccacggtatacctgatcgtagaccgctccaggatggtgacatcgtgaacctggatgttaccctttataaagatggatttcatgcagatct gaatgaaacgtactatgttggagagaaggccaagactaacaaagatctggtcaacctcgtcgagacaaccagagaagctcttgctgaagcta tccgtttagtgaaacccggcatgccgttccgtcaaattggtactgttatcgaaaactatgtgactgaaagaggctgtgaaactgttcgttct tacactggtcatggtatcaatactttgttccacactgaaccaaccattccgcattacgctcgtaacaaagctgttggagtagccaaaccagg agtggtattcactatcgaaccaatgttgactctgggcactcatcgtgacgtggtttggcccgacaactggaccgccgttaccgctgatggag gaccaagtgcccaatttgaacatacccttttggttacggaagatggtgtggagattctcactggcagaacggaaacttcgccaggcggtgcc atctcaagactataa PAS_chr3_0419 43 atgctctataagaccaccttgtcaatagcacacacgagtgtgatattgttgtcattgataaccgccataagttgctttgagttgcatcttcc tcagaaggtttctcatatagtagacagtttacaatatacttgcggccaatttttgcaaaagcagcagatctttgcactctataacaagcaaa atttcaccgaaatagtgaaccagaatatcaagggaatagaggagagagttttgtctgagttgcttgaagaaagattagagaatgaatcccag aatgattattataccgccaattctcaaaattggcctatcgacttggatcagtactcagaatcatttgtaataaggatcacatctgaagatga gtttatcaagtacttgatcttcaaggaagctaaagctttgcatatttccatatgggagcaatctgttggtttgatagatttgaaggttgacc gtgatcagatgcaccgcctactttacaacgtggagtcacgcatactggaacgaagaacgagaagtgttgacagtccagtttctgaatataaa gtacaattgatgattggagatcttccacagcgaatctacgaaacatatccttcgacaaaagtgacatctttgcaagccctaggagagttccc ttctttccagaacctaagtaatgctttttttgaggattttagaacgctggaaactatatacgactggttcgaagaaatacagaaggaatttc ctaagctagtgtcgatcaactggattgggcaaacttatgaaggtcgtgatctgaaggctcttcacgttagagggaagcactctggcaacaaa acagtagtcgttacaggtggaatgcatgcgcgtgaatggatatcagtaaccagtgcatgctatgccgttcacaaactgctccaaaactatgc tgacggacaccacaaggaagcgaaatacctggacaagttggactttttgtttgttccagttttgaatcctgatggatacgaatatagcttta 
acgaagacaggttgtggaggaagaacagacaagaaacttatatgccccgatgttttggtatagacattgaccattcatttgattatcatttc gtgaaatcagaagacttaccctgtggagaggaatattcgggtgagtcccctttcgaaagtatagaaagtgaagtgtggaataatttcctgaa cagaaccaaagaagaacataagatctacggctatatcgacttacactcgtattcgcaaacggtgctgtatccctatgcgtactcatgcgaaa tcttaccaagggacgaggaaaacctgattgagctaggttacggtattgcaagggccataagaaagagtacagggaaaaaatatcaagtgttg aaggcatgcgaagacagggatgcagatctattgcctgatttgggaggaggaaccgctttagattatatgtaccacaaccgtgcatactgggc gtttcagatcaaattgagggattccggtaatcatggctttctccttcccaaaaagtttatatacccagttggaacagaggtttatgcctcaa ttcagtacttttgttcttttgtgctgaatttagaaggctaa PAS_chr1- 44 atgaaattgaccataacattagcccataacgatcaaatcttggacattgatgtgtccagtgaaatgctactatctgacctcaaagtcctgtt 3_0258 ggagttggaaacttccgtacttaaaaacgaccaacaattattttacaataacaacctgctcactggagatgactcgccactggaagatttag gactcaaagataatgaactcataattctgagcaaagtcgaagcacatagtgatgtcaattcacacttgaactctgttagagaacagttgata caaaacccgctataccaggccagtttacctccaagtcttagagataagctcgacgaccctcaaggcttcaaagaagaagtggaaaaactaat ccaattggggcagtttggacaatacgggccttcccgtacttccgtccaacaggaattagacagactacaaagagatcctgacaatccacaaa atcagaaacgaattatggagctcattaacgaacaagctatagaggaaaatatgaatactgcttttgaaatctcacctgaatctttcgtttcc gtgaatatgctctatataaatgtggaaattaatggtgtccattgtaaagcattcgtcgatagtggagcccaaacgaccataatgtcccctaa actcgcagagaaatgcaaccttgcgaatctaattgataaaaggttccgaggagtcgcacagggtgtaggaagttctgaaatcattggtcgta tccattctgctcccataaaaatcgaagatattattgttccctgctcattcactgttttggataccaaggttgaccttctattcggacttgat atgttgagaagacatcagtgtgtgattgaccttaagaacaactgtttacaaattgcagacagaaagacagaatttttaggagaagcagacat cccaaaggaattctttaaccaaccaatggaagctccatccacagctcctgtcccaaaacctgtacaacctcctcaacaactcggtcagcggc cggctggaagccctccctccacaattcaaagaccagcagtacaaccgccacctgtggatatacctccagaaaaaatccagcagttgatcaac cttggattcggagaagaggagtcgaaagaagcacttattagatctagaggaaatgtggaagttgcagcggctttgttattcaactag PAS_chr4_0913 45 atgccaaaccttccttctagcttgaacaagatgactgctcaagccgtgaaatacgcaaacggtatgtcatctgccctctcccgtgtttgaga 
ctctatccactaactttagattttatcaccttcctgaacaattcacctactccataccatgctgtcgactccgtaaagtccaaattggtaga gtcggggtttaacgagctcagtgagagagttaattgggccggaaaagtcaagaagaatggcgcttactttgtgactcgtaacaattcgtcca ttatagccttcactgttggcgggcactggcagccaggtaacggagtgtcaattgttggagcccatactgattccccaaccttgagaatcaaa cccatatcccattcgactaaggagggatttaaccaagttggaattgaaacttatggtggaggcttgtggcatacgtggtttgacagagattt aggagtagctggacgagtgtttattgaagaagaagaatctggtaacattgtgtccaagttagtcaagatcgataaaccagtattgagaatcc ccacactagccatacaccttaccaaagagagagctaagtttgagtttaataaggaaactcaattccatccaatctcatcgcttgaaaactcc tctgaaaaggagaaaaacaaagatgaggaacatgacgcttgtgcaggagaagatttgactacggaggagtttaagtcaattcaatctgttgt ggagagacacaacaaacaattgcttgatctggtggctgcagatcttgattgctctatatcccagatagtggactttgaattgattcttttcg accacaacaaaccagtactcggaggtttgaatgaagaatttgtgttctcaggaagattggacaacctaacttcttgtttctgtgccactgaa gcgcttataaatgccagtaaagataccaacaggttagatctggatactaatattcaactgatctctctgtttgaccacgaagagattggatc agtttctgctcaaggagctgattcttcatttctacctgacatacttcagcgtataacaagactaactggtaatgaggttagcaccgatctgg aaggacaaccaaattctttctttttagagtcaatggccaaatctttcctactatcttcagatatggcacatggtgtgcatcccaactatggg gaagtctatgagaagctaaataggccaagaatcaacgagggaccagtgatcaaaataaacgctaatcaaaggtacagcaccaattccccagg tattgttttgctcaagaagattggtgagttgggaaaggtccccttgcaattgtttgttgttagaaacgactctccctgtgggtcaacaattg gtccaatgttgagtgctaaacttggacttcgaacgctggacctcgggaacccccagctctccatgcattctatcagagaaactggaggtgct cgtgacgttaaaaagttggtcgatcttttcgaaagctattttgagaattattacaccttggagcctaagattaaggtataa PAS_chr1- 46 atgaacaaaggtccgaaagaattggagggccgcaagtatccagcaagagcccatgcactgacggtcaaaaatcactttatccaaaagaaggc 1_0066 tgacatttcaagtcgttctgcaatctttattagtggcgaagatctcaagttgtatccttactgtgaccaaacagctcctctcagacagaatc gttatttcttttatctgtcaggttgtaatatccctggatcccatgtcctttttgacttggacgccgaattgttaattctggtgctaccagaa attgattgggatgatgtcatgtggagtgggatgcctctttcgattgaagatgcctacaagacgtttgatgtggacaaggtggtatatcttaa agatttgcaaggctttttgtcgtcgtttggaaaaatatatacaactgacatcaatgatgaaaattctaagtttggcaatctactaacagaga 
aagatcctgacttgttctgggctctggatgaatccagattgatcaaagacgactatgaactcactctaatgagacatgcgtcaaaaatttct gacaattcccattacgctgtcatgtcggctcttccaattgaaactgacgaaggccatattcacgctgagtttgtttatcattcgttaagaca gggatctaaatttcaaagttatgacccgatttgttgcagtggaccaaactgtagtacccttcattatgttaagaatgacgattctatggaga ataaacacaccgttctaatcgatgctggtgcagaatggaacaactatgctagtgacgttacaagatgttttcccatcaatggagattggacg aaagagcatcttgagatctataatgctgttttggatatgcaggaccaagttatgaagaagattaagcctgaagcccattgggatgagctaca ccttttggcacatcgtgttctcattaagcattttttgagcctcggcatatttcataacggaacagaggatgagatatttgagagtggagtct cagtatcattctttcctcatgggctgggtcaccttttaggaatggatactcatgatgttggtgggcaccccaactatgatgatccaaaccct ctattgagatacctaagattgagaagagtgttgaaagaaaatatggtagttacgaacgaacctggaatctacttctctccctatcttgttga attgggactgaaggatgataataaggcaaaatatgtcaacaaggatgtactggaaaagtattggtatgtcggaggtgtgagaattgaagacg atattcttgttacgaaagatgggtatgaaaacttcaccaagattactagcgaccccgaagaaatttccaaaatcgttaaaaaggggttggag aagggtaaagacgggttccataatgttgtatga PAS_chr2- 47 atgacatctcggacagctgagaacccgttcgatatagagcttcaagagaatctaagtccacgttcttccaattcgtccatattggaaaacat 2_0310 taatgagtatgctagaagacatcgcaatgattcgctttcccaagaatgtgataatgaagatgagaacgaaaatctcaattatactgataact tggccaagttttcaaagtctggagtatcaagaaagagctgtatgctaatatttggtatttgctttgttatctggctgtttctctttgacctt gtatgcgagggacaatcgattttccaatttgaacgagtacgttccagattcaaacagccacggaactgcttctgccaccacgtctaatcgtt gaaccaaaacagactgaattacctgaaagcaaagattctaacactgattatcaaaaaggagctaaattgagccttagcggctggagatcagg tctgtacaatgtctatccaaaactgatctctcgtggtgaagatgacatatactatgaacacagttttcatcgtatagatgaaaagaggatta cagactctcaacacggtcgaactgtatttaactatgagaaaattgaagtaaatggaatcacgtatacagtgtcatttgtcaccatttctcct tacgattctgccaaattcttagtcgcatgcgactatgaaaaacactggagacattctacgtttgcaaaatatttcatatatgataaggaaag cgaccaagaggatagctttgtacctgtctacgatgacaaggcattgagcttcgttgaatggtcgccctcaggtgatcatgtagtattcgttt ttgaaaacaatgtatacctcaaacaactctcaactttagaggttaagcaggtaacttttgatggtgatgagagtatttacaatggtaagcct 
gactggatctatgaagaggaagttttaagtagcgacagagccatatggtggaatgacgatggatcgtactttacgttcttgagacttgatga cagcaatgtcccaaccttcaacttgcagcatttttttgaagaaacaggctctgtgtcgaaatatccggtcattgatcgattgaaatatccaa aaccaggatttgacaaccccctggtttctttgtttagttacaacgttgccaagcaaaagttagaaaagctaaatattggagcagcagtttct ttgggagaagacttcgtgctttacagtttaaaatggatagacaattcttttttcttgtcgaagttcacagaccgcacttcgaaaaaaatgga agttactctagtggacattgaagccaattctgcttcggtggtgagaaaacatgatgcaactgagtataacggctggttcactggagaatttt ctgtttatcctgtcgttggagataccattggttacattgatgtaatctattatgaggactacgatcacttggcttattatccagactgcaca tccgataagtatattgtgcttacagatggttcatggaatgttgttggacctggagttttagaagtgcttgaagatagagtctactttatcgg caccaaagaatcatcaatggaacatcacttgtattatacatcattaacgggacccaaggttaaggctgttatggatatcaaagaacctgggt actttgatgtaaacattaagggaaaatatgctttactatcttacagaggccccaaactcccataccagaaatttattgatctttctgaccct agtacaacaagtcttgatgacattttatcgtctaatagaggaattgtcgaggttagtttagcaactcacagcgttcctgtttctacctatac taatgtaacacttgaggacggcgtcacactgaacatgattgaagtgttgcctgccaattttaatcctagcaagaagtacccactgttggtca acatttatggtggaccgggctcccagaagttagatgtgcagttcaacattgggtttgagcatattatttcttcgtcactggatgcaatagtg ctttacatagatccgagaggtactggaggtaaaagctgggcttttaaatcttacgctacagagaaaataggctactgggaaccacgagacat cactgcagtagtttccaagtggatttcagatcactcatttgtgaatcctgacaaaactgcgatatgggggtggtcttacggtgggttcacta cgcttaagacattggaatatgattctggagaggttttcaaatatggtatggctgttgctccagtaactaattggcttttgtatgactccatc tacactgaaagatacatgaaccttccaaaggacaatgttgaaggctacagtgaacacagcgtcattaagaaggtttccaattttaagaatgt aaaccgattcttggtttgtcacgggactactgatgataacgtgcattttcagaacacactaaccttactggaccagttcaatattaatggtg ttgtgaattacgatcttcaggtgtatcccgacagtgaacatagcattgcccatcacaacgcaaataaagtgatctacgagaggttattcaag tggttagagcgggcatttaacgatagatttttgtaa PAS_chr1- 48 atgacctgccaaagtgtagaagagctggatgctattgttgaatcaaagcttagggaggttgataataaagtttcgaacggaaatgttgactt 3_0261 catcaaacaatatctgattcaggcgatgaactattatgacaagtatagatctgaaatcaaaaaaattggacccacagaaaagaaccctaaat 
actattgttttcaagaggcagcgtatgttaactacaaagcttcccaagctttactaagagagagaatacccaagctgcctggctttggagga tataaatctgcgtattcaaaaatctatcgtgaactgatagaaatggtagaggggcaagaacatgagattgcccagataaaaagcggcttaag gaaaaacttttgtgatgatacattagttcttcgactgagaagtttaaaatcaccatctgctactcagcccaaaagtttaccggattctacac ccacttcacaatttaaaccaaaaccttcaaagccttttagtatcacaatcaatgaggaatacatttcggttgaccaattgtcacgccttctt aaaacgaacccgaatgacatactcctcattgatctacggtctcgtcaagagtacgacgtgtatcacattgaagatggctccggggtggacat gtcaatatgtatagaaccaatgagtatcagaaacggatacacagcagaggatctttatcaactttcaatggccgtcaatccagattatgaaa ggagattgttcaagaatcggtctcagtatgaactgttggtatgttatggtaattatgacaacgaggctactgttcaaatgttcatgactatc atgaataaagatacttccctcaagaggcggagcgtctatttgaaatccggaattaagggctggaatcaggatctgagttttcaagattcgaa accgaatgggtacttaactagtacgactgactacttcagtaacactccgaaacacacaattacgcccaaatcatcaaaatcaagttcaaaac ctactttaaaaactactgtcaactctgggcctgcccacactgttgggatcaataatctaggaaatacatgttacatgaattgcatacttcaa tgcctattagaaagtgataagtttgtttcattttttttacaaggcgattataagaaacatatcaatattaatagccgattaggctcgagagg tatattggctacaggatttcatttgttagtgctattaatatccagatcatctggtaaaacagtgactccttcttcatttgccaaagatgttt caacagtgaataagaattttaagttaggagagcaacaggattgttttgaatttttagattttctcctggatagtttacatgaagacctgaat gaatgtgggaatgaaccaccaatcgcagaactcacacctgaagaggaaaagcttagggaagctttacctatcaggattgcttcgaccattga atgggaaaggtatttaaaaaacaattttagcatagtagaagatgtgtttcaagggcagtacttctccagattggaatgtacagtctgtaaaa gcacttcaactacttataactcattcagttcactgtccttgccaatcccattagatcgacaaaatgtcacactagatgactgtttccaggct ttttgttctgtagaagaattgaacggagatgacagatggcattgtccaagctgtaaaaaaaagcaggtcgcttttaagaaacttggtatctc tagactaccaagtgttctgatcgttcactttaaaaggtttcaggtcaagtgggaaacaggtcatataatcaagatagacaagtttatcagtt atccgttcaagctatcaatggacaaatattggcccaaagctcaatcagaagaagaactaagaaacttggagaagctaccatcgagaaatcag aatccccctttcaattatcgattgacaggggtggctaatcattttgggaccagaacatcatctggtcactacacatcatatgttcaaaaagg tggccaatggtattactttgacgatagtgctgtgactagcaatgttgatcgtcataaaatcgtaaatgggaacgcctatgttttattttatc gacgtagttag PAS_chr2- 49 
atggaagccgtgaatttacaaattgaatggattagacaggtgcctccagttactgtggctcttgtagcatccatgtcaatgacctatttttt 1_0546 gcaacgcatagatgtattatcctcaaatatgttcgtgtttgaaagacatcgtgtgtttaatgagatggcctattctcgtttgatactaagtt tcttcttcagcgcccattcgtttgttggattcttttggacattgtacacattatttcagaattcacaggcactcgagctgacctatgaaaac tcaatcgattacctctactcattggtgataatagcaggtttgatcgtggcatgggcctcatacttggggggtccgttcatgctgggatgggt tctagctgacgtcttgagaaccatatggtgcaaacagaatcccaacgaaagaatgtctattttggggctagtttccttcaaggcaggatact ttccatttgtaatacttgccatttcatggctagaaggaagttcaagaaatcttctattaatgctaattagccaaactgtcagtcaggcttat atttttggacaccatatgatgcccgaactacacgggatcgatctgtttctgcctatatggaaattccagtgtttcagacgtcagagacaacc accaattcatcagcatcaagactaa PAS_chr2- 50 atgtcaaaggtggtggtattcctaaatggattattggcaataacctttacgtttgaacttctctctgttttaagcgtgccaatcaccaagca 2_0398 tatccaactttgttcttatcaaggatataagtttggcgtgtttggatattgcaccgagaataatatctgcacaacgataggaatcggttatc atcgaaattcaatagacgaattgagaggcttttcattaccaagtaatgcaagaagctctatatcaagcttgttggtggttcatttgattggc tgtgtttgcacctttattttatgggttctaagtctcatgttgaatatggatagatttcacagatcattatggttcttattaacgtgtctagt atggacttgtgctttctttttttttacattattctccttcctggtagacgtgttactatttgtgccacacgttgcgtttggaggttggttga tgttggtaagtactgtatttttggcatttacaggaaccattttttgcatcatgcgaagaactgtcagctcaagaaaaactcatttgaagaac tacaacgggggaagtacaagtttgatgcggctgcagacgtatatctccaatagctctagaggaagctctgtaaccaatgatgaatacgtctg gtttcaagaaactccattacaagacctctaccccccagacaatcccaattacgacgacatctacggaacgactgaacacgaactaacccgct tggacacaatatctcttgaaaggccaagaataggccttatcacaaacgaaaatgccagcggcgatggtggggtagtttccccaccacagaat gacagtacacttctggaatcttcgggcagaattaggaatgggccactgggagaccgaagtgaatttcccaacggatcaacaagcgaactttc tgcataa PAS_chr4_0835 51 atgaaatacagtgaccaattaatagaagagtacaaagaattatggttaacagcgacatctaatgagcttactagagaatggtgccagggaac tctccacctgagcaaattatacgtttacttgacacaagacttaaagtattttggggatggatttcgacttttaggcaaaaccatttcgttat gtcgccgtaggcaatcgcttgtgtcattaggcaaacatgtggggatgctcagtaatagtgagaacacgtacttcgtggattgtattaacgat 
cttactgaacagttattaagagatgggatgtacaatgctgaagaattagaagaaatcagtggtttaacgttacctgccgtggaaaggtacct tttattcatgagatcgatggtagagtcttctacaataacttatgcagaaatgattactgtgatgtttgtaatggaacaagtctatctggatt ggtcaaataatggactgagaagtaaacctgacaacttgcattggtggttcaatgaatggattgatatacatagtggggagaactttgaaagc tggtgccagtttttaaaggatgaggtagaccgctgtatacaggagttgaaggatgctaatagagatgatctcgtggcgagggttgaggagat ttttagagaaacattagaacttgaagtcgaattctttaaaagttgttacgatatcacggacgatgaatga PAS_chr1- 52 atgcactcgaaatttaggtgggtatgtgtcgatactcaattctgcacacaccaccaaaatctgtcgcctttctcttatatctccaacccgag 1_0491 tccaatgtcattttcttaccttgaaggcaacatcgattttaaaggacaggaacttgcaaacaggatcactaaaaaactaatcacatttggtg caattattagttttctggtaggatttttgagtgacaacatcttatacactgtatacactttcgcagcttttggtttattgactgcttctttg gttattcccccttttagcttctacaaaaagaaccctgtaacatggttaccaaagaaatccaaaatagagattcagcattga PAS_chr2- 53 atgacagactctgttaactctgatgattctgatctggaaatcatagaggtgactgagcctactccaaaagtggaccttttggcccccaatcc 1_0447 agcatttaattttactgcccccataagcaacagtaacggcacaactccaataaggagaaaacttgatgaccaatccaactccaattcttttg ccagactggaatcgttacgggaatcatcagtgaaaccacaagctagtacgttcaatagtagtaggttcatcccccaagccgaccaattttcc aataatcagaataatgaacttgataacaacaatggattcgccgactggatttctaagtcccaacctgaatttccctttccacttaatgatgg accaaaaaagtccagcaatcaacctacaaactcaaattttgaagagatcatcgatttaactgaagatatcgagataaatacatctgtccccg catctacatcatcttctaccccagttccctccagcacacagaatcagagccatcatatagccaacaacaacacagcacaagatgcgcatatc ttccaagggaaacgacctctccaatcatattcagatgatgaagacgaagatttgcaaattgtaggatccaatattgttcagcagcctctagg aattatgccaggaactttcaacgcccctgcaaacatactccattttgacggttcaaaccagaatgaacaagccagatggctggacttgcgga taaaagatttgttagataatcttcacaatcttcgagttcatgctcagtcgaatattatggagatcaataggttcatttccactttggggcat ttaaacagagaagtttcagagctcaatctaagatatcaatctatcgtgaacaatcctcaggcgaccgctaataatcaaggatacctcactca gcttttgaacaggattcaggagcttactaatgaaaaagcgcacatatttagagagatggatacatccaagataaaacagcaggagattcaca gaagaatccatgctctctcgtcaacaattgacaaactgaaaaaagatcgtgaacttatctttcgaaatgctcaaaatgcttttcacggtgat 
atgaagaatgaagttttggaaggccagtctttcatggatgcaattcatagggcaaatagcttgggttatgcttcaaatatttattctcgttc tgatgaagacgctggaagcttacaacggcttcttgaaaatatccagcccgatatggaggacaaagacgatgatgaattggctaaaactccga aggagttcaatattcaactgctgaagcatcagagagttgggttagattggctacttcggatggagaagtcaaccaacaaaggaggcatttta gcagatgccatgggcctgggaaaaaccatccaggctattagtattatttacgcaaacaaatggaaaacacaagaagaagccgaagaggaggc aaaacttgaagagaaggttagatccgaaaagtctacatcagaaacgaatggagaggtcagcaaaacgtcaacggcaaagtcggaaaagaaac ccatccaaggagacgaaggatatttcaaaactacgttaataatagcaccagtttctcttctacatcagtgggagtctgaaatcttgttaaag acgaaaccagaatacaggctaaaagttttcatttatcacaagcaaaaaatgtcctcgtttgaagagctccaacagtatgatatagtattaac atcgtatggaactctgtcttctcaaatgaagaagcattttgaagaggcaattaaggaggcagacctacagcccaactcttcatccataccag cagaagactctggaggcatatctttcaagtcaccattttttgcaaaagaaacaaagtttcttcgagtcattctagacgaagcccataagatc aaaggaaaaaatacaatcacttcgaaggcagtcgctttggtgaagtctaaatacagatggtgtttaacgggcacaccgctacaaaataaaat tgaagaactatggcctctacttcgattcttgagaattaagccatattatgatgaaaagcgatttagaactggcatagtattacctataaaga gttccatgtcaggcaaatatgattccacagacaagaagattgctatgaggaaacttcatgccctacttaaagcaatcttgttgaaacgaaac aaagattcgaagattgatggagagcccattctcaagttacccaagaagcatatcattgacacattcatagaaatggaagcaaaagagttaga cttttacaaggatctggaaggacagacagccaaaaaagccgaaaagatgctaaacgctggaaagggacaaggaaatcattattctggtattc ttatcttgctattgagactgagacaaacttgttgccaccatttcctcgtgaagttatctgagatgaagcaagaagccaaattgaaacaggaa gttgctaccaagatgccacaattggccacacaactatctcctgctgtggtaaggagaattaacattgaagcagaggccggatttacgtgtcc tatatgtttggataacatcataaatgagaatgcttgtatattatacaaatgtggacatgttgtttgtcaagattgcaaagacgatttcttca ccaattatcaagagaatgaaactgatgacggtcttagagtgtccaaatgtgtgacctgtcgtttgcctgtcaacgaaagcaatgtaatcagt ttcccagtctacgacaagattgtgaaccagcatatttcagtgatggatatagttaaaagtgagtctccagtgttgtcaaaaattgaaatgat tcaacaactgatccgggagaacaaaggcgtcttcgaatcgtctgccaagatcgataaagcagtggaaatgatacaagagttactgagagaca atccaggggagaagatcatagtttttagtcaattcacaactctcttcgatgtcatagaggtaatactcaaagagaacaacattaaattcatt 
agatatgacgggtcaatgtctcttagcaatagagatgctgccattcaagagttttatgagagtacggagaaaaacgtaatgcttctttcttt gaaagcagggaacgtggggttgacattgacttgcgcctcccgtgtcataataatggacccattttggaacccatatgtggaagaccaggcca tggatagagcccatagaattggccagttaagagaagttttcgtctatcgaatgttgatcaagaacaccgtcgaagatagaattttgaccatt caaaatacgaaaagagaaatagttgaaaacgctctggataaccagagtttgaatacgatatccaagcttggcaggaacgagttggctttctt atttggtatcggcaattga PAS_chr1- 54 atggagtgtaaaaaagtcaaagatcgcctagtcacggaatacttaaagattgaatgtagtcgacttaaccgaaggatacgctccctgaaaaa 3_0053 tccaaaagttgagcaagccctactgcaattcaagaactcacgtttggctcacatgagaaaggctcatctggatggaataagaaacccacagt atacggatgacgccatctttcaggcattggaaaccatggatttggaccacatatttgagaaggcaggtagtctttacaactcacagcaacaa gatgaatcaaaaaaagattccctggatgaaacagatttcaccgtggtggcgttgctagattggttcaagaatgacttcttcaaatgggtaaa caagccaccttgtcctgtttgccatagtgaagatgaaagccgcataagaatggtcggatctgcaaggcccactagtgaagaattgtcgtacg gagcaggggtcgtagaggtgtttaattgtgaccattgtagctgtgcaatcagatttccaagatataacgaccctaagaagctcctgagaact agagctggacgatgtggggaatggaataactgttttctgttgtgtctaaaagccttgggtctgaaagctagatgtgtgaggaatgtggaaga tcatgtatggagtgaatactactcggaacatctcaagcggtgggtccatctggatagttgcgagaatgcctttgatcaaccagaactatact gcaaaggttgggggaaaaagatgagctattgttttgcttttgatgacactctcatagaagatgtgagtgccaagtacattactcaaggtaga ctgcctaaaatgctagacgacgaaaccatcagaatatgcttgtattttttcaaccaggaagctcttaagatggtgagtgaaaatccagaggc attctactccgctttggttaagtatcacagatgtctgtctgcgaatagaaaagagagcgggtcaaaatcacgagccgtgaatgctagtttga cttcattgttaccacgacaatctggtagcgcatcctggacgtctgagagaggcgaaaacggactttag PAS_chr3_0200 55 atgcctataaaggggcggttcaccaaaaagaagccaaaaaggaaagatgagccaaatcgaccgtcccccacccagttcatcaaaaaaatagc ctcattgaaaaagcagaccaggagagatgaggccctggatgtgctacacgaactagcagttgttgtgtcacctttgatgaaagagaacggtt tcactgttggattattatgcgaaatgttcccgaagaatgcctctttattggggctgaatgtgaatatgggttcaaagatcatgatccgattg agacctagccacaacatgaacttgtttttgccaaaaagagagatcatcggtacaatgctccatgagttaacccataatcgcttttcggccca tgatgtaaggttttatgactttcttgagggtctcaagagcaggttttttgagattcaggtgaaaggatctttacaaactacagggtatgtta 
actttagtgaagttctatctggtaatgcggcgagagggcaactgattcaaaaggaaaaagagaaaggacaaagattgggtggtaataagcat gcaaaacctatgagagtcctaatcttggaggcggccgagaagagaatgatagactctaaatggtgcggaggagctagcaatgaagtaggcct tccaaaaattgaagatctaatggacgatgaagaagctcaacactctgaactaaaggaagagaatacaaagaaggtcagaaaaattgttcaac ctagcaaaaagaaaattgtagatttggaaaacctaccgaatggcaagtccattattattgatctaactaatgacgatgactaa PAS_chr1- 56 atggaacacaattgtctgaaagtcaatgaattggcgctccagttggctcaatcactgcagaacagcaaagtcagcacagctgatcctctaaa 3_0105 gaagaggacaagcagctacagaggcctgagtagcgagcctataatcacagaggaagaaccaacaatcaagggcgactataatagattttaca gtcagtcttcagataagcaagtattggacaataaaccatggttgcaggatggaaactatttcaagactgtatacatttcaacgatagcacta ctgaagatgatgtctcatgcccggtccggtggttcaattgagattatgggcatgctgacaggtaaggtgtttgccaacacattagtcgtaat ggattgctacttacttccggttgaaggtacagagacacgagtgaatgctcaagcggaaggatatgagttcatggtctcttatttggataact taaaggaaatcaagcataacgagaatatcataggatggtatcactctcatcctggttatgggtgctggttgagtggaattgatgttgccact cagaatttaaaccaaaagtttcaagatccctacctggcgatagtgattgatcctgaaagatcagtcagacaaggatttgttgagattggagc attcagaacgtttgctgagccagccgttggaagatcgtcgtcgtcagtttcctctgcaagtggtgcaggaattagtgatgttgcgttttctt ccggtagaaacagtgcatctggaatgtcctcagttctgagtgcaagtaatattagcattgccgaagagctaagcaaacaatcgatcacccaa aatgtttttgacagaactactacaaagattcccaagggcaaaatgactgattttggagctcattcaggaaaatattactcgctagaggttaa ggttttcagatctccactggaggagaaactactggatacgtttggttctaaaacctggattaaaggtttaacgaactactccaacgttgtta atgccgaggaaactcaagtggagttaatgcataaaataatggaagccacggagaacttacggaaggaatctccttctaaattgccatctttg gtgatggggaacctgatttattcaggtgcctctcaaggaacaacagggaaccgcaagcgctcaatgtccaaatcttctatttattcgggttt acaagcttcatcgggtatacccagttctaggtatcctacgaagggaaaaaatatgagtggatctcaattcaatgatgacccgctagcaagat cactggataaaataccgccagatagtccagatcaacagtacgatggcgcattatccattcaacaaccgaaaagagcatataatacacatact tctagagcaggtgggttggccagcgttctgtcctctgggagtatggatcctcaaagttactccatggtaggacgaatgagtctaactaatca atcgccggggacagctctgagaggcctaaatacacctcccaacaaacgaccgcagagaaaccctggtcatacaagctcaggtcaaggaggaa 
cgcctggaggagtcagtcggtccaaagagaaaattaacaagccaataggtataagcatgattagcaaggatttcaaggttgtcatctcacaa caggtcaaccagatgctacgtcgtcacgtccagaatgacctttttggatccaatagtccctaa PAS_chr3_0635 57 atggatcatgcccaacgattgctagaactaagtttttacaatcaaagtctgggcaaatcagtgatagcaaagaaatacagaatagaatcctc tcgatatttgaatgaacaactggacaagtccttgacaagagataatgatctgattggattatgccgtatagcattagacaacaagttgacca tatcagataagattatatggatgagctctcaagttgaagacaacttctttccgccagtttttcaaggcttgaagacgtatattgatagcgac gagatttatcaagagaaacttttaagcgtaccagcggattttgaaccaatagttgaatggaagagttgcacagagttgcccaatgaatggtc aaacaatggtgtggacaatttatttcaggattctttagatgactgttcgtttgtagcttcatttctatcctgcaacaatattggtatccctc tcatggataaagtcattccccacaaaaactcgttcaaatatgcggttagactgactttcaatggttgcgaaaggttggtgtttattgatagc cgtttgcctttgcttaggaatacttccaagactttacgagtgtcaagtttttctaacaaagatctcttatggcctagcatcatcgaaaaagc tttcctgaaaatgtgtgatgatgggtacaagttttcaggatcaaattcagccattgcaaactatgctttgactggctggatccctgaagtca ttaaaacttcttcatgtacaatagcagatattagccgattgcatgaggattttcggaacggaaacgtagtactatgcttgggaacgggcaat ctgaccgagcgagaatgcaaacagtatggattgatccccaatcatgactatgctgtcactaaactatcatttacgaatgattcagaatacaa gtttgacattcgtaatccgtggactaaagggcagaaagcagtgacaattacagatctttcaacctttgaagttatctacgcaaacagaaatc ctataatgttttcgcacatgaaccagctaagcggtatctgtcaaagtcaggttaatgaagagttcatagatctaattcttaaccattcgcag tataccctaggcaatgacggtaattctacaattgatgtgattcttttctttgaaagacattcgttaagaaagaaaatcagtgcagagtctcg tattgagattttccaatcagaaggcgaaagactaatctccagaagaaataaagcaagcaaggaatgtgtttctaataataccaactttcatt tcataacaatcgaactgaaaccgttagaaaaggtaactgtggtaatagatatcggcgagtcttcgattcgaagccatccatttactctaaag gcttttgccaatgattcaactataactttgaacaaagcactttctagacctggttgtttcaagcaaatggacctagagctaacgcccttaaa ctctggtgggaattgggataattatgcttattacaaaaatccacaactcatagtcactcttcacggagattcaacggatgaagctccatttg aatctgctgttttcagcaaaagtgataagaccctatttacgtatacagtgttttggaaaagtgacgatccagactttcctttcatcactgac gcaagcaagaacaagctcgtaagcacagacaataagtataaatacagatcatgtacaagatcaagagttgtttcttgcgacaaaagctattt 
gttcgtgctgagctcctacgaacctgatgcaattgagtctttcaaagtattttttcaatgttcccacgatttttctatagagtgggctgaga cgtcgcttgggcttttcacaaaggaagaaactttctcctggaaggaccaattagtcaaggagttcattattcaagtctataacccttcaaag ttgaaagttcacgcagtaaacaccaacaacaaacgcagatcaaaactaaattgctctctctcattccaaaacacattaatcagctctttgca agactacacagacaatctctatggatgctttattagcgggaacttggagattcccggcaagtatctattacaagttcataaaaacattatat ctaacgaagaatgtttggtcgaaattggatctagttcgtcatttgagttatgggaacatcattaa PAS_chr4_0503 58 atgttgaaaactcgatttcattccagaaagggttttgtaatctacagtggagatgatgaagagagtgacgaagagagtaaacaatggatgtt tcccgagtcgacctttgtaaccaatgggtttgaccaattgttcaaggtgagaaatgtcaataccattaatgacgacgatgacggctaccaat cgttcgatcaaccggattgggcgcaagatttaaccgcagatactcagtatcttgctttaggtgacgaaggggagaatcatcgttcacaacaa gagataggcaacaggaaaagagccaacaaaaagcaaaagaagccaactaaagcaaagacaaaacgtcaacaaagacgcacagccaaaaatga tcaatccacggaacgatctgccatttcacaaccttctaacttaagtacactgaactccttactcaaatctgttcggtctgaactttccaatt ctgatgggagtccccacacattctacgatgtatctctctatgaagaagatctgaacaacctagctgatgacgaatggttgaacgataataac gtctcgtttatctacgagtacattgaaagattttacattacccgttgtttgagcgacaagcttcaattttcatcaaagaagatggtcaattc tcaaataatactcctccgaccttctatggtttttttgctggcacattcaactccaaaagatatccaggattttctcccaccgttggataagt ctggctttatattccttcctctgaacgacaatgatgatctggaaatggctgaaggtggatcccattggtgtcttttagttgtagctgttcac gataacaaatgtttcctctatgactcattagagaatgccaatctcacagagtctgttgcgcttgtgtctaagctgtccactctgctaaacag gcgaatacaactcgttgaaaatacacattgtcctcaacaactcaatggcagtgattgtggagtaatcacaacccaaattacagcactactgg tatcccgactgctttgtgttttgccgggacatcctataaatttggatcttcaaaatgtagctatcaacgcaataagcgggagaatcttcatg ttaaaactcctccaacatgttctgaacaattaa PAS_chr2- 59 atggcaccaccagtccctgtatatacgagagatgaagtcaagatgcaatttccacagtacatgatgaaatttttgccttcaaactgtgagct 1_0569 gtactccatcatccagaaccaatgtaccttctctgctgacgagataatatgtgtgcccttcaagagggtgtttgccaaatgccggaggggaa accaagaagccaagaggaacataataccagagaatggaggactgaatttaactggaaagaaactaatcccaagagaatacacagtcattgaa gttacggactccctaacgaacaagtacgacaatagtagcctcatggacagattttttgaggcagaaagagatttaatgataaggtttcaaga 
atatgaggaacggaacagtaaggaaggagaaataaagtag PAS_chr3_1223 60 atgctcagacagtttgctggaagggagttcaagcgtcggttttctacgggaatcaagacgatgccaacaaagcttaccaaactgccaaatgg tattcgtgtcgtaacggacgaagctccgggccattttagtgccatgggcattttcgttgatgctggttcaagatatgagagccagtttccag aattaaccggccactctcacatcatcgatagacttgcattcaaatcaacatccaaattcgatgggaaatctatggtagaaaacaccaatcat ttaggtggcaactttatgtgtgcctcttcaagagagtcattgatataccaggcttcagtgttcaacaaagatgtggacaagatggctgaaat cctcagttctacagtcaaagaacctttatttactgaggaggaagtttctaatcagatagcaacagcagattatgagttggatgagttatggc tgcaacctgacctaattcttcccgaattgtctcaacaggtagcttatggatcaaaaaatttgggttccccgctgctctgtccgaaggagtct ttagcaaacatctcaagagaatcccttttgaagtatcgtgaaatattttttagacctgagaacttggtcgttgctatgttgggagttcccca cgagaaggccttggaacttgttgataaaaatttaggcgatatgaaatctgtcggttccagtccagtggtcaaagaacctgctaaatatacag gaggagaactttctttgcctccagttcctcctatgggtgggcttcccgagtttcatcacatatatcttacatttgaaggtgtccccgtggac tctgacgatgtctactcactggctactttgcagatgctcgtcggtggtggtggatctttctctgctggtggtccaggaaaaggaatgtatgc cagagcatacacgcgagttctgaatcagtacggttttattgaaagttgcaattcatatatacacaatttctcagactcggggctgtttggtc tctcaatttcaagcattccgcaggcaaataaagttgttgcagaactcttaggtcatgaactgagctgcttgttttctgaaaatccgggcaaa ggtgctcttaccaatgccgaagtaaaccgtgccaaaaatcagctacggtcttctttgttgatgaacttggagagcaagatggttcaattaga agaactaggaagacacattcaagtttatggcagaaaagttgatgtcacagagatgtgtgataaaatcagcaaagttacaaaggaagatctag ttgcaattgcaaagaaagtcttgaccggaagcaacccgactatagttgttcaaggtgacagagaatcttatggagacattgagggtactttg gcatcttttggagttggtttagatgccgcttccaaagcttcaaagaaaaaaacgagaggttggttctaa PAS_chr2- 61 atggcaattatcaagttcaacgcaggcaaagtcaagattgacgaggaaaccaagctttgtacacccttggcaacaagaggagaaataatcgt 1_0597 ccaattgtcggctgagggcgaagagttttatgatttcaaatgggtccctactgagaacacagctggtgaaggtaaccagtcagagacattct tggtcattccgggcgatgtgacgtggaaacacgtcaaaagttgtaaagatggtagagttttcaaattgacatttttgagtagtggggcaaag agtttgttctggatgcaagatgataatggaaacgaggatgacccatcagagttgacaaccaaagataaggaaattagtgaaaaaattaccaa gttgttcgacgaagaagagtga PAS_chr1- 62 
atgaaacacttggctgtccataagtacaaggtaggagccatcgcagctggcttggttgtctcctataaaatctttgcctaccgcgctgcgtc 1_0327 ttcctcctcctcaaacgtcatcaacttgaccaatatggcaaaaactccaatcactttaaaaccccctcaggctccactccgctgggaccata ctccagagcagatccttgccgaaactgataagtatatatctaccagtcaagaggttgacgattgggtggcaaacagctttgccactgccaat gtggacaccatcaagaaaatagccgccgctgagaatgaacaatacttgccactgtgtcaattgagtttttatcaacatatctcggataacca ggacgttcgtaatgccagtactgttagtgaggagaaaattgataagttctccatcgaatccaaccttagagaagatgtgttcaaaacagtga acaaagtgttcaaacaggttcaagaagattcggaactccaaaagaccttggacccagaatttaggcgtttactagaaaaattgaacctaggt tacgtgagatctggtttagatttatcccaggagaagagagaccaagtcaagagtttgaaacaagaactatcaaccatttcaatcaagtttaa taagaacttgggagaggaaactgaacacatttggttcaccactgaggagttaaaaggtgttccagaatcagttgttgagcagtttgaaacta agaatgagaatgatgttacttaccacaagatgacatacaagtatcctgacctgttcccggtactaaaatatgccgttaatccagctacgaga caaagagcttttgtcggggatcaaaacaagatacctgaaaattcaggattacttgtgaaagccgtcaatttgagaaacgaacttgcaaaagt tttgggttatgatacctatgctgactatatcctggaagtgaagatggccaagaactccaagaatgtttttgaatttcttgatgatgtaaggg aaaaactcagacctctcggagagaaggaactgcaaagaatgttgactctcaaggctaacgacccaaatgctgttgataaggaaaattactac gtctgggatcatcgttactatgataacaagcttcttgaatctgaatacaaagtggatgagcaaaagctggctgaatactttccaatggagtc caccattgaaaaaatgcttgccatttacgagcacttgttcaatttgcagtttcaacaagttgacgattcggagaaacaagtttggcatccag atgtaaaacaattctccgtttggaaaatcgataaccctgattctcctgaatttgtgggctggatctattttgatttgcatccaagagaagga aaatacggtcacgctgctaattttggaatcggtcctagttacatcaaagaagatgggagtaaaaattatcctgtcactgctttggtttgcaa cttttctaaaccatcaaaggataagccatccctattgaagcacaatgaagtcactacattcttccatgagctaggacatggtatccatgatt taattgggcaaactaggtatgctcgtttccatggtacttcagttgctcgtgatttcgttgaatgtccttcacagattctagagtactggacc tggactagagatcaactcaagtctctttcccaacattacaagacaggagaagccctctccgatgaactcattgattcgctagtcaagtccaa gcatgtcaatggcgccattttcaatcttaggcagttacactttggtctctttgacatgaaactacatactgccaaagagcctgaatctttag atgtgacaaggttgtggaacgaattacgtgaggaagtcgctctggttaagaatggtgaccaaattacgaaaggatacggttcatttggacac 
ctaatgggcggttatgctgctggttactacggatacctgtattctcaagtgtttgccagtgacatttattacacctttttcaaagctgatcc aatgagtacagctcaaggtatcaagtaccgtgatatcattcttgccagaggtggatcaagagaggagctagataatctcaaggaattacttg gaagagagcctacatctgatgcctttatgactgagcttggagtagaaaatggtgcgtccaagttgtaa PAS_chr2- 63 atgcgttttttggtctcatcctttcggcccttcagacatacaatttcgtcgcatatctcaatgggccaggctctgtctgccattcgtgtatt 2_0380 tcataaaaattctcactcacgtacccaaggtttaaggcgccactctcactactgttgccaccgcaagatagatatgagtacttctactaaac ttccagagcgtcaattgctaccagccaatgttaggcctaccaaatatgatttgacattggagcccttattttctaccttcaagtttaacgga gaagagactatacatttagatgttcaggaggactccagttctattacgctacacgctctagacatcgatctccaagattcactattgataac ttcaaacaagtctaagactcccccgcttcatgtgacaagcaatgatgatgaccaatcgctcacttttcaattcaaagagggtactctagtaa agggagataaggtgcagctgcagttgaaatttgttggtgaattgaatgataagatggccggtttttaccgctcttcatatgaagagaatgga gaaactaaatatttggcaactacccagatggagccaacagattgtcgtcgtgctttcccttcctttgatgagccatcgctaaaagccgtatt caacattgccctcattgctgatcagaaacttacttgtctctcaaacatggacgtgaaagaggaacaatctctcggagatagaaggaagaagg tgatattcaatcccactccactaatttctacttacctaattgcttttattgttggtgatttaaaatatattgaagccgactataactatcgc attcctgtcagagtttatgccacccctggtttagagaagcagggtcgtttttctgtcgagcttgctgctaaaacattagaattctttgagca acagtttgatattgattatcctcttccaaagatggacatggtggcgattcatgatttcagtgcaggagctatggaaaactttgggcttgtta cctatagagttgttgatttgctgtacgatgaaaaaaattcaaatttggctactaagcaacgtgttgcagaagttgtccaacacgaattggcg catcagtggtttggtaatcttgtcacaatggagtggtgggagggcctttggctgaatgaaggctttgctacatggatgtcttggtactcttg tgacaagtttttccctgattggaaagtatgggaacaatatgttacagattctttacaacaggctctggctctggacgctctacgtgcttctc accctattgaagttcctgtgaaaagggccgacgagatcaatcaaatttttgacgcaatttcctattctaaaggatcctccttgctaaaaatg atctccaaatggctcggagaggatgtgttcattaagggagtctccagttatttaaaaaagcacaggtatggtaatacgaaaaccaccgattt gtgggaatcgctttctgaggtgtctggaaaagatgtggtcaaagttatgagtatctggactggtaaaattggatttccaatcatctcagtaa ctgaaaatgcaaaccgtatcacttttactcagaacagatatttaactactggtgatgtaactcctgaagaggatacgacgatttatcctgtt 
tttttgggactcaaaacagaaagctcaactgatgagtcgctggtccttgactcaaggtcaatgtcagtagatatccagaattctgacttttt caaagttaatgctgaacaagccggtatttacaggaccaattatgcaccagagagatggatcaaacttggaaagcaacctcaccttctaagtg tagaagaccgtgctggtttggttgcggatgcgggcgctctggctagttctggtcactcatctacaaggaactttttgaaccttgtaaattca tggaaagatgagtctagctttgttgtctgggacgaaataacttcccgtgttgcagctttaaaagcagcttggttatttgaatcccaatctga cattgacgccctgaatgctttcgtaagagaccttatttctacgaagatcaaaagtatcggatggtcattcaatgataatgaaccattccttg aacaaagactaaagagccttctatatgctactgctgctggtgcaaaagtaccaggagtagttaaatcagcattgataaactttcaaaaatac gttgctggtgataagactgccattcaccctaacataaaggcagttacgtttcaaactgttgcggcccaaggatctgaaaaggaatgggatca gttactcgacatctacaagaaccctgtatctattgatgagaaaattattgctcttaggtctctcggaaggtttgaagatcccatcttgatcg caaagaccctggcactgttatttgatggttccgtaaggtcacaagatatttacgtaccaatgcaaggccttcgtgcgactaagataggagta gagtcacttttcaagtggttgactcttaattgggacaagatttataaattgcttccacctggtctgtcaatgcttggttctgtggttactat cagtacttctgggttcacttccttggatgatcaaaagcgtgtcaaagatttctttgcatcaaaggataccaaaggcttcgaccagggtttgg cccaggcgttagacaccatccaatccaaggcaagttgggtacaacgtgactctaggaatgtatccgattggctacgtgagcagggatacaaa aaatag PAS_chr3_0928 64 atgataaggatatccttgctgaaaagagcactgtttccctacgggcgactaccaatgcataatggtaggtggtattcagacataggtggcgg aaattcaaggaatcggaacgaacagaaaccaaaattgcctgtaccaactagtaatgaagttaaggacaatgagtcaaacccggacttcttta ttaaaaacggctttagatcagctgatattgcagagacatcctttgtgaaagacaagggtgctacagtcgaagaggaacgtaatacatcggac agttcacacgaatctcctcaacttaattttaaggaaaccaacgacgaaacgaattcaacgatccaaccaccagtggcaaaattacccacccc aaagcaattgaaacaatacctggataggttcatcgtgggacaagagaagtgcaagaagataatgtcggtcgcagtttacactcattatgttc gaataaataaccaggctcagaaacggaatcagaaggtcgattcctctgaagaaaatgttgagaatgggtttccaaatgttactaaagaattt gaggacgaaaatgacccagattatgttccggatttggagaaatcaaatgttcttttgctgggaccgtctggatcaggcaagaccctgattgc taagactctcgctaaatgtctgcaggttccatttataattcaagattgtacctccttgacccaggctggttatgttggcgaggatattgaga gctgtattgaaaagttgctaattgattcagactacgatattgaaaggtgtgaaaagggaattattgtgctggatgaaatagacaagttggcc 
aagccctctgtctatacaggaaccaaagatattgcaggagagggtgttcaacaaggccttttaaaactggttgaaggtactacagttacggt tcaatgcaagaggagcaatgctcctgatcataatcagttcggattgaatggcaaagctacaaatcaggacaaggaaaattatatcgttgaca ctacaaatatcttatttttaaccctgggagcgtttgtgaacctagataagattgttgcttataggctgaagcagaactctattggattcgat actgatgagtcgaaagatatttctgaaacagactcagtttccgacaaatctacattagaatatgttacacttccagatggatcaaaagtttc agctctggaacttgtgtcttctacggatctacagaattatgggttgattccagaactgatcggcaggcttccgattgtatcttcactttctc ctttaacagttgatgatcttgtggctgtcctgactgagcccaggaactcgatactaaagcaatatgtgcatttctttgacactgtcaatgtc aaacttgctatcacttccaaggcaatcagaaggatagccgagatctcgatcaagaatggtacaggtgcaagaggtctcagagccattttgga gaaactgctactcaatgccaagtatgattgccctggtagtagtatttcatttgtgttagttgatacagatgttataagtaagtctatcgatg agaataaggaaacgggggaattcgtcttcaaagatggtgagccaaagtattactcgcgtggagaattattttcctttttcaatgagttatca aaagaagacgaaaaactcaagacatcaattgaaaagatgtgccaaataccactttccaagaatcgcatagtttactccgaagaggagcaggc aaggttggattcttctaaacctctcgccgtgaagcactatgaacctttcatttga PAS_chr1- 65 atgagcttcaacctgctaagtgttcctttacgaacgtcaaagccgataccgttaggcgaaagcctaaaagagcttatcaacaatcagtacta 3_0184 ccagacatctgctgcgttcaaatcggatatcgaagagatcgaccaactaagaaatgatgtcctatcaatagaaccaaacaatgatggacttg cattgctcaagagatactatgtacagttagccagcattagccaaaaactccctgattattttatggagtatccctggtttggaacattagga taccaagtaactggccccgtagctctaaaatccctctatttcgaaagaatcaatatagcgtacaacatcgcagcgacgtattcaatcatagg tttaaacgagcccagagctacaggagaaggcttgaaaaaatcatgcatttattttcagtatagtagtggggcattcgaaagtgtactgaagc tagtggagcaaaaaccgaaagagctgacacttcccattgatcttagtgttaacattatgaaaaccctggctaaactcatgctggctcaggcc caggaatgtttttggcaaaaggctgtttctaacactttaaaagataacgttattgcaaggttggcctttcaagtatctcaattttacgatga agctctgtctatggcttacaagtgcgatattttaaagtctgaatggatagaacatatgagttgcaagaagctgcattttaaggctgcggccc aatttagacttgcttgtgtggcagtcgctgcttctagacatggagaggaaatagcaagattaaggattgcaaataccatttgcgaaacagca tctagagaagccaagtatcaccttccctctgtatcttccgatttggagagtctttcgaagataatcaaagactctttaagaagaagtgaacg 
tgataatgatctaatatatctgcaggaagttcctaatgaatcagatcttcctccaattgttgcagcatctatggttgaacctaagccaatag ttgagttaaattcagctgaatgtgcgaaagatacaaagaaatacggcaaaatccttttccatgatcttatgccatacttagtgattgaaatt gcacaggcatttagagagaggcaggattcttatgttgtaaagcatatcaaggagcccatggagatgctgacaaagattcttcacacaatcct tgctgaaaatggacttccggcgttgatagataccatacaaaggcctcaaagattgccaaccaacatccttgaacattgtcaaatactcaatg aaaggggtggcatggacaaacttaaggtatttttcgaagatatcagcaagctaagacacaaaagtgagcaagttctccaaaactgtgtcgaa ttgctacaaatggaagagtccgaaaatgaggaaatgagaaggaagcatggatcacagaggtggaattttgctgactctagggaggcatcagc agatgtcaggaaaagtgtacaggcactagagggctatttgaaacaggcccatgatggtgatcaagtgatctggaatgacttcgaacaattga agccactactaagcatgatgagtgctcctaattcaactaaattactggaagaatttgtaccaaattcaaaattcgtcagacttcctccagaa ttgaaccgaatcgttaacgaattaagagctgatgttaatcaggtcaaaaagctcgcatcgcaaagggaaacttttattaatacagttaaagt aaaaagcaccgacctgtccatattgcccttggtagtttcccattataagaaattacaacaaaacaacattaatacgatcacgacggaattgt tcgaagaagtgttcagacgacaggttagcaacttcgattctgatatcagatttgttcaaaaacacagggacaaccaaatcgagttagagaag catattaaatctttggtccaacaattcaatcagcttagagggaatatagatgcctcgcaagaacgccaaaatgcacttcagttgttggacga tgcctataacggataccttgatttggtaaacaacctcacacagggacttagtttttacaatgatttcactggaaaggcaaatgatgtctatt tgagatgtcaagaattctacaactttcgtaaacaagaagccatgaagctggagcaggaaatatatgctgtatttgaacaaggtaaatctcct cagaaaaaacaactagaagatcaggtttcagatcaaccaaaaagtgaagtcaagtcttcaaagggttattctaatgagctgtggaaccccga cgttggaattaaatttggctag PAS_chr1- 66 atggtggcctctcttcacattgtcaatccgaatttggcctccgctttcagtttgcctcccaggtcaaacactttgagcgtttccatacacgc 4_0286 ttcggctttgttacagatcctggaatcaagttacttcgaccagaataagaatggtcgtatcataggaaccctcctaggttctaggtctgaag agacaacggaggttcaagtcaaagactctttcatagtttcccacacggaggacggagacgagtttaccattgattcttctcaacgtgaattt gtcgccatccacaagaagtctagcccaagagactcagtcgtaggatggttttccattaactctaaggtcgacagctttatcggactggtcca tgactttttctcaaagggtccagatagcacacacccgtaccctgccatatatttgagtatccagttatgtgacgagagcggatccttcgtag agccagttttcaaggcgtacgttgcctccccagtgggatgttatggagctctggcaagtcacttagaccttgaaaaagctggctcttttgtc 
ttctctgaagtcccaaccaaggtcatatactctgctaacgaaaaaagtctgctggctcatttcaagaacaacgttgtggaacccaaagttcc aataccacaaaacgacacaaatcaactaatttcacaactcaacaaactcgacgtttccattgaccagttaatagactacgttgacaaagtca tttcaggatctctggatagaaatgatgtgaagaatgatgagattggccgtttcctgttgaccaacttagtttcccttccaacttctccttca aaggaagagctttcatcttccataagctctcatatccaggactcactgatgatcgactacttggcctccgccgtgaaaactcaattagatgt tagctccaaattaatgaacctggtacaagatgataaatag
(126) TABLE-US-00006 TABLE 6 Polypeptide sequences of targeted proteases
Protease Gene Symbol/Locus tag   SEQ ID NO:   Polypeptide sequence
PAS_chr4_0584   67
1 MLKDQFLLWVALIASVPVSGVMAAPSESGHNTVEKRDAKNVVGVQQLDFSVLRGDSFESA
61 SSENVPRLVRRDDTLEAELINQQSFYLSRLKVGSHQADIGILVDTGSSDLWVMDSVNPYC
121 SSRSRVKRDIHDEKIAEWDPINLKKNETSQNKNFWDWLVGTSTSSPSTATATGSGSGSGS
181 GSGSGSAATAVSVSSAQATLDCSTYGTFDHADSSTFHDNNTDFFISYADTTFASGIWGYD
241 DVIIDGIEVKELSFAVADMTNSSIGVLGIGLKGLESTYASASSVSEMYQYDNLPAKMVTD
301 GLINKNAYSLYLNSKDASSGSILFGGVDHEKYSGQLLTVPVINTLASSGYREAIRLQITL
361 NGIDVKKGSDQGTLLQGRFAALLDSGATLTYAPSSVLNSIGRNLGGSYDSSRQAYTIRCV
421 SASDTTSLVFNFGGATVEVSLYDLQIATYYTGGSATQCLIGIFSSGSDEFVLGDTFLRSA
481 YVVYDLDGLEVSLAQANFNETDSDVEAITSSVPSATRASGYSSTWSGSASGTVYTSVQME
541 SGAASSSNSSGSNMGSSSSSSSSSSSTSSGDEEGGSSANRVPFSYLSLCLVVILGVCIV
PAS_chr3_1157   68
1 MIINHLVLTALSIALANDYESLDLRHIGVLYTAEIQIGSDETEIEVIVDTGSADLWVIDS
61 DAAVCELSYDEIEANSFSSASAKFMDKIAPPSQELLDGLSEFGFALDGEISQYLADKSGR
121 VSKREENQQDFNINRDEPVCEQFGSFDSSSSDTFQSNNSAFGIAYLDGTTANGTWVRDTV
181 RIGDFAISQQSFALVNITDNYMGILGLGPATQQTTNSNPIAANRFTYDGVVDSLRSQGFI
241 NSASFSVYLSPDEDNEHDEFSDGEILFGAIDRAKIDGPFRLFPYVNPYKPVYPDQYTSYV
301 TVSTIAVSSSDETLIIERRPRLALIDTGATFSYLPTYPLIRLAFSIHGGFEYVSQLGLFV
361 IRTSSLSVARNKVIEFKFGEDVVIQSPVSDHLLDVSGLFTDGQQYSALTVRESLDGLSIL
421 GDTFIKSAYLFFDNENSQLGIGQINVTDDEDIEVVGDFTIERDPAYSSTWSSDLPHETPT
481 RALSTASGGGLGTGINTATSRASSRSTSGSTSRTSSTSGSASGTSSGASSATQNDETSTD
541 LGAPAASLSATPCLFAILLLML
PAS_chr1-4_0289   69
1 MVASHVNNASASRSNTSVSHASASSYDNKNGRGTGSRSTTVVKDSVSHTDGDTDSSRVAH
61 KKSSRDSVVGWSNSKVDSGVHDSKGDSTHYAYSCDSGSVVKAYVASVGCYGAASHDKAGS
121 VSVTKVYSANKSAHKNNVVKVNDTNSNKDVSDDYVDKVSGSDRNDVKNDGRTNVSTSSKS
181 SSSSHDSMDYASAVKTDVSSKMNVDDK
(127) TABLE-US-00007 TABLE 7 Forward (F) and Reverse (R) Primers for 5' and 3' homology arms (HA) targeting protease ORF
Description   SEQ ID NO:   5' to 3' Sequence
PASchr1-10174 5' HA F   70   ACCTATTGTTTACCTTCCTG
PASchr1-10174 5' HA R   71   GAATTCTCTCACTTAATCTTTAGCTCCCATGCTCATCTTG
PASchr1-10174 3' HA F   72   GCGGCCGCaagaagttgattGTTTATTTGTAGGCGGTGCC
PASchr1-10174 3' HA R   73   GGGCTATCCGCCTTATCTTG
PASchr1-10226 5' HA F   74   AATAACTTCATGACTGCATT
PASchr1-10226 5' HA R   75   GAATTCTCTCACTTAATCTTAGTTTAAATAATATGGAGAT
PASchr1-10226 3' HA F   76   GCGGCCGCaagaagttgattATTGGAGAAAAGGAATACAC
PASchr1-10226 3' HA R   77   GGCATCTCCGTCTGGTGCAG
KO_PAS_chr3_1087 5' HA F   78   CAAGGTTCGAAACTGCAGCT
KO_PAS_chr3_1087 5' HA R   79   CTCACTTAATCTTCTGTACTCTGAAGAGAGAGCAAACCAATGGCAA
KO_PAS_chr3_1087 3' HA F   80   AGAAGTTGATTGAGACTTTCAACGAGGGTCCTTTGGCAATCATTGGT
KO_PAS_chr3_1087 3' HA R   81   ACCCCAGGACCAGGTATTTC
KO_PAS_chr4_0584 5' HA F   82   TACTACAGGCTGGCTGTTCC
KO_PAS_chr4_0584 5' HA R   83   CTCACTTAATCTTCTGTACTCTGAAGAAGTCCAACTGTTGAACGCC
KO_PAS_chr4_0584 3' HA F   84   AGAAGTTGATTGAGACTTTCAACGAGGGTCCCCTTCAGCTACCTTT
KO_PAS_chr4_0584 3' HA R   85   TCCCTGCTAAGCCCTAATCG
KO_PAS_chr3_0076 5' HA F   86   AAGTTGTATGGCCGTCCTCA
KO_PAS_chr3_0076 5' HA R   87   CTCACTTAATCTTCTGTACTCTGAAGTGAGTCTTGGTTGTGTCGGT
KO_PAS_chr3_0076 3' HA F   88   AGAAGTTGATTGAGACTTTCAACGAGGCCTCCTGTTTGATCGGTTC
KO_PAS_chr3_0076 3' HA R   89   GTGCCATGGTGACGTTACAG
KO_PAS_chr3_0691 5' HA F   90   CGGAGTTATAGGGGACGCTT
KO_PAS_chr3_0691 5' HA R   91   CTCACTTAATCTTCTGTACTCTGAAGCGTCACATCATAGCCGTTCTC
KO_PAS_chr3_0691 3' HA F   92   AGAAGTTGATTGAGACTTTCAACGAGCGTCAAAAGTGGTCGTGGAC
KO_PAS_chr3_0691 3' HA R   93   TGGCCCAGTTACACGGAATA
KO_PAS_chr3_0303 5' HA F   94   GTCGATCGTTGGTGTGTGAC
KO_PAS_chr3_0303 5' HA R   95   CTCACTTAATCTTCTGTACTCTGAAGGAGCCGACTTTGACATCGAC
KO_PAS_chr3_0303 3' HA F   96   AGAAGTTGATTGAGACTTTCAACGAGAGCGAAGAGACTGGTTCCAA
KO_PAS_chr3_0303 3' HA R   97   AGCTGTTCTAACCGTCCTCA
KO_PAS_chr3_0815 5' HA F   98   CTTGGAATATCTGTGGGCGC
KO_PAS_chr3_0815 5' HA R   99   CTCACTTAATCTTCTGTACTCTGAAGTCATGACCAGCAGTTGTTCA
KO_PAS_chr3_0815 3' HA F   100   AGAAGTTGATTGAGACTTTCAACGAGATGCTGCAGGAAGGAACACT
KO_PAS_chr3_0815 3' HA R   101   CAAACTCTGCACCTCCAAGC
KO_PAS_chr3_1157 5' HA F   102   CTCTGATTGCACGAGAAGGC
KO_PAS_chr3_1157 5' HA R   103   CTCACTTAATCTTCTGTACTCTGAAGTGAAAGGCGATTGGAGTTGC
KO_PAS_chr3_1157 3' HA F   104   AGAAGTTGATTGAGACTTTCAACGAGCTGGCTCTGCTTCTGGTACT
KO_PAS_chr3_1157 3' HA R   105   GATGTTGAGGCGGGCATAAG
KO_PAS_chr1-4_0164 5' HA F   106   TTTCAACGGGGTTCTACGGA
KO_PAS_chr1-4_0164 5' HA R   107   CTCACTTAATCTTCTGTACTCTGAAGGTGGTAGTATGTGTGTTGGTGT
KO_PAS_chr1-4_0164 3' HA F   108   AGAAGTTGATTGAGACTTTCAACGAGCTGCGCTTTCAAGTACTGCA
KO_PAS_chr1-4_0164 3' HA R   109   TGTCTTCCTCGTCTTCCTCG
KO_PAS_chr3_0979 5' HA F   110   CGGGCAATAATCAGTGGAGC
KO_PAS_chr3_0979 5' HA R   111   CTCACTTAATCTTCTGTACTCTGAAGCGTTGGAGGTAATGCATGGG
KO_PAS_chr3_0979 3' HA F   112   AGAAGTTGATTGAGACTTTCAACGAGGGCGGACCGTGTATTAGAGA
KO_PAS_chr3_0979 3' HA R   113   TCAGAGAAGCCAGTGGAAGG
KO_PAS_chr3_0803 5' HA F   114   TTCCTCGGCCTCTTTATGCT
KO_PAS_chr3_0803 5' HA R   115   CTCACTTAATCTTCTGTACTCTGAAGCAACGTGGCTAACTCCTTGG
KO_PAS_chr3_0803 3' HA F   116   AGAAGTTGATTGAGACTTTCAACGAGGTTGTCGACGGCATTGAAGA
KO_PAS_chr3_0803 3' HA R   117   TCGGTTCAAAGCCCCTAAGT
KO_PAS_chr3_0394 5' HA F   118   AGGTGTGAAATGCGCTGATC
KO_PAS_chr3_0394 5' HA R   119   CTCACTTAATCTTCTGTACTCTGAAGAAACCAACAACGCCTGGTAC
KO_PAS_chr3_0394 3' HA F   120   AGAAGTTGATTGAGACTTTCAACGAGTCACAGGCTGAAGGATCGAA
KO_PAS_chr3_0394 3' HA R   121   CCATGGTGTGTTTTCCGGTT
KO_PAS_chr2-1_0366 5' HA F   122   TGAGGGACAAAGTAATGGGGT
KO_PAS_chr2-1_0366 5' HA R   123   CTCACTTAATCTTCTGTACTCTGAAGACCGAAGTCATGGTTGGAAA
KO_PAS_chr2-1_0366 3' HA F   124   AGAAGTTGATTGAGACTTTCAACGAGCTACCGCAGACAACCCATTC
KO_PAS_chr2-1_0366 3' HA R   125   CGCTCCCTCATCGAGTACTT
KO_PAS_chr3_0842 5' HA F   126   CAGACATCGTGGAAACTGCC
KO_PAS_chr3_0842 5' HA R   127   CTCACTTAATCTTCTGTACTCTGAAGTATCTGCTTCGATCCCTGCA
KO_PAS_chr3_0842 3' HA F   128   AGAAGTTGATTGAGACTTTCAACGAGTTCTCCCGTCCAGTTAGCAG
KO_PAS_chr3_0842 3' HA R   129   ATTTCAGAAGCTCCGCATCC
KO_PAS_chr1-3_0195 5' HA F   130   ACAAAAGCACGCGATTGAGA
KO_PAS_chr1-3_0195 5' HA R   131   CTCACTTAATCTTCTGTACTCTGAAGACACTCACGGTTGTTTGCAA
KO_PAS_chr1-3_0195 3' HA F   132   AGAAGTTGATTGAGACTTTCAACGAGAACCCCAACAAGCGGCTATA
KO_PAS_chr1-3_0195 3' HA R   133   ACCCGGATCTGCTAGTGAAG
KO_PAS_chr1-4_0052 5' HA F   134   CGTATGCTCGTGTGACTGTG
KO_PAS_chr1-4_0052 5' HA R   135   CTCACTTAATCTTCTGTACTCTGAAGTTCCTATGCCTGGCGATGAT
KO_PAS_chr1-4_0052 3' HA F   136   AGAAGTTGATTGAGACTTTCAACGAGAGGGAGTCTTGTATAGTTGAGCA
KO_PAS_chr1-4_0052 3' HA R   137   AGCAGGGGTATTTTCACGGA
KO_PAS_chr2-2_0057 5' HA F   138   AGCATGATTGTGTTGGGTGG
KO_PAS_chr2-2_0057 5' HA R   139   CTCACTTAATCTTCTGTACTCTGAAGAATCCGATACTGTAGCCCCG
KO_PAS_chr2-2_0057 3' HA F   140   AGAAGTTGATTGAGACTTTCAACGAGGCAAAGAAAACTGGCCACAC
KO_PAS_chr2-2_0057 3' HA R   141   GGAAGGCCCTATTCACGACT
KO_PAS_chr1-3_0150 5' HA F   142   CACCATTTCCCTGCTGTGTC
KO_PAS_chr1-3_0150 5' HA R   143   CTCACTTAATCTTCTGTACTCTGAAGTCAATACCGAAGACTCCGCA
KO_PAS_chr1-3_0150 3' HA F   144   AGAAGTTGATTGAGACTTTCAACGAGGGGAGGTATTCAGGAGGCAT
KO_PAS_chr1-3_0150 3' HA R   145   GCTCGATCAGATATTGTCCGC
KO_PAS_chr1-3_0221 5' HA F   146   AGCAGCTCTCCAATCAGTGT
KO_PAS_chr1-3_0221 5' HA R   147   CTCACTTAATCTTCTGTACTCTGAAGCTGGAATTGTGATCCCGCTG
KO_PAS_chr1-3_0221 3' HA F   148   AGAAGTTGATTGAGACTTTCAACGAGTTTTGAAGCAAGCCTACCCC
KO_PAS_chr1-3_0221 3' HA R   149   CAGGATCCAGCCGCTAAAAC
KO_PAS_FragD_0022 5' HA F   150   TGAACAAGCAGCCACATCAC
KO_PAS_FragD_0022 5' HA R   151   CTCACTTAATCTTCTGTACTCTGAAGTGAGGGCCATTCTGACATACT
KO_PAS_FragD_0022 3' HA F   152   AGAAGTTGATTGAGACTTTCAACGAGGTGAGGTATTTAACTGCACGAG
KO_PAS_FragD_0022 3' HA R   153   TCGCCTACATAGTCTGCACA
KO_PAS_chr2-1_0159 5' HA F   154   ACCTCATGCCATGTCTGTCA
KO_PAS_chr2-1_0159 5' HA R   155   CTCACTTAATCTTCTGTACTCTGAAGTTGACTGCCGCTTCAAAGTC
KO_PAS_chr2-1_0159 3' HA F   156   AGAAGTTGATTGAGACTTTCAACGAGCCGCCAGAGAATTTGTGCTT
KO_PAS_chr2-1_0159 3' HA R   157   TAGAGGTGAACGTTTGGCCT
KO_PAS_chr2-1_0326 5' HA F   158   AATCCATCACCTCCACCCAG
KO_PAS_chr2-1_0326 5' HA R   159   CTCACTTAATCTTCTGTACTCTGAAGGCTGCTGGAGTAAAAGGTCC
KO_PAS_chr2-1_0326 3' HA F   160   AGAAGTTGATTGAGACTTTCAACGAGCAAGCAGCAACCATCTACGG
KO_PAS_chr2-1_0326 3' HA R   161   AACCTCATCCACTGTCAGCA
KO_PAS_chr2-2_0056 5' HA F   162   GGAAGACAAAGTTCGCTCCG
KO_PAS_chr2-2_0056 5' HA R   163   CTCACTTAATCTTCTGTACTCTGAAGTCATAGTTGAGAGCCTCCTTGT
KO_PAS_chr2-2_0056 3' HA F   164   AGAAGTTGATTGAGACTTTCAACGAGACAATGCACTAGGACGGGAT
KO_PAS_chr2-2_0056 3' HA R   165   CTTGAATCAGGCGACGTACC
KO_PAS_chr1-4_0611 5' HA F   166   CCCAGCTCTCTTTCACTCCA
KO_PAS_chr1-4_06115 HAR 167 CTCACTTAATCTTCTGTACTCTGAAGTTGAAGAGCAGCAGAGTCGA KO_PAS_chr1-4_06113 HAF 168 AGAAGTTGATTGAGACTTTCAACGAGTTAATTGCCCACAGTGTCGC KO_PAS_chr1-4_06113 HAR 169 ACCTTCCACAGTCGACGAAT KO_PAS_chr1-1_02745 HAF 170 ACAAACAGTCAAATGCACGGA KO_PAS_chr1-1_02745 HAR 171 CTCACTTAATCTTCTGTACTCTGAAGTCCTTCCACCTTTCCAACGT KO_PAS_chr1-1_02743 HAF 172 AGAAGTTGATTGAGACTTTCAACGAGGGGGTAGAGAAGTTAGGGAGG KO_PAS_chr1-1_02743 HAR 173 GGAACTACAACTGGAGGCCT KO_PAS_chr4_08345 HAF 174 TAGTGCCGGTTCCATGGATT KO_PAS_chr4_08345 HAR 175 CTCACTTAATCTTCTGTACTCTGAAGGGTCTATGGGTTGATGCGGA KO_PAS_chr4_08343 HAF 176 AGAAGTTGATTGAGACTTTCAACGAGATGTGTTGCTCGCTCTAGGT KO_PAS_chr4_08343 HAR 177 CGACAAACACACCAAGGTCC KO_PAS_chr3_08965 HAF 178 GTTGTTGGAGTGAGCGATGG KO_PAS_chr3_08965 HAR 179 CTCACTTAATCTTCTGTACTCTGAAGCCTCCGTTGATACTCCCGAT KO_PAS_chr3_08963 HAF 180 AGAAGTTGATTGAGACTTTCAACGAGTGCATTCAAGGCTGGCAAAT KO_PAS_chr3_08963 HAR 181 GCATATGGAGTGGTGTGCAG KO_PAS_chr3_05615 HAF 182 CGGGTAGCATTGAACGTACG KO_PAS_chr3_05615 HAR 183 CTCACTTAATCTTCTGTACTCTGAAGATGCTACGGTAAACACCCCA KO_PAS_chr3_05613 HAF 184 AGAAGTTGATTGAGACTTTCAACGAGACTGGAGAAAGCTTGGTCGA KO_PAS_chr3_05613 HAR 185 AGGCACCAGAAGAAAGAGCT KO_PAS_chr3_06335 HAF 186 GGACACGTTTGGAGCTTCTT KO_PAS_chr3_06335 HAR 187 CTCACTTAATCTTCTGTACTCTGAAGGCCCACCAATTCAGCAACTT KO_PAS_chr3_06333 HAF 188 AGAAGTTGATTGAGACTTTCAACGAGGATGCTGGTCACATGGTTCC KO_PAS_chr3_06333 HAR 189 AACCGCCAATAGTTTCAGCC KO_PAS_chr4_00135 HAF 190 GGATGAGAAAGCGGCTTCTG KO_PAS_chr4_00135 HAR 191 CTCACTTAATCTTCTGTACTCTGAAGGTGCCAAAAGTCTGATCCGG KO_PAS_chr4_00133 HAF 192 AGAAGTTGATTGAGACTTTCAACGAGTGCCACTTCGTTCTTTGACG KO_PAS_chr4_00133 HAR 193 ACGGATCAGTGATGGCGTAT KO_PAS_chr1-1_03795 HAF 194 ATGGGATCTGGACGACGTTT KO_PAS_chr1-1_03795 HAR 195 CTCACTTAATCTTCTGTACTCTGAAGAGCTGGATCACAAACATTCGG KO_PAS_chr1-1_03793 HAF 196 AGAAGTTGATTGAGACTTTCAACGAGCTTTGAGTGTTGGTCCCTGC KO_PAS_chr1-1_03793 HAR 197 CGGCTACCAAGTCAGACCTT KO_PAS_chr2-1_01725 HAF 198 GTTGCCCATTACGTCCTGTG KO_PAS_chr2-1_01725 HAR 199 
CTCACTTAATCTTCTGTACTCTGAAGCCTTTGATCTTTGGTGCATCTTG KO_PAS_chr2-1_01723 HAF 200 AGAAGTTGATTGAGACTTTCAACGAGCACTACAGCTGGGAACGAGA KO_PAS_chr2-1_01723 HAR 201 ACGGGTTGGAAAAGTTGAGC KO_PAS_chr3_08665 HAF 202 AGTGGGGTTGGAGATTGGAG KO_PAS_chr3_08665 HAR 203 CTCACTTAATCTTCTGTACTCTGAAGACGATTCCAGCATAGCCTGT KO_PAS_chr3_08663 HAF 204 AGAAGTTGATTGAGACTTTCAACGAGCTGGTAGCCGCAAAACTTCA KO_PAS_chr3_08663 HAR 205 GCGTTGAATCCTCCTCGTTC KO_PAS_chr3_02995 HAF 206 CTGTGGGGTCTGAACATCCT KO_PAS_chr3_02995 HAR 207 CTCACTTAATCTTCTGTACTCTGAAGAGCTGCTAGGGTTCATTGAGT KO_PAS_chr3_02993 HAF 208 AGAAGTTGATTGAGACTTTCAACGAGCTCCCTTGGGTACGTCAACT KO_PAS_chr3_02993 HAR 209 TGGCAGTCTTCACATGTCCT KO_PAS_chr1-4_02515 HAF 210 AGCTGGTCAAGTCTGGTACC KO_PAS_chr1-4_02515 HAR 211 CTCACTTAATCTTCTGTACTCTGAAGGAGGTCTAGTGTGTGAGGCT KO_PAS_chr1-4_02513 HAF 212 AGAAGTTGATTGAGACTTTCAACGAGAGAAGGTATAGGGAATATGCGGT KO_PAS_chr1-4_02513 HAR 213 TAGCCACAACCCTGATGACG KO_PAS_chr4_08745 HAF 214 TACACTGGGACGCAGATGTT KO_PAS_chr4_08745 HAR 215 CTCACTTAATCTTCTGTACTCTGAAGTGCTCAAACTCTGTATCCGTTG KO_PAS_chr4_08743 HAF 216 AGAAGTTGATTGAGACTTTCAACGAGCTTTCAAGGCCGCAATGCTA KO_PAS_chr4_08743 HAR 217 CTTCCTTTGCAGTTGGTGGT KO_PAS_chr3_05135 HAF 218 GGGTCTTTGGCTTTGGTGAG KO_PAS_chr3_05135 HAR 219 CTCACTTAATCTTCTGTACTCTGAAGCGTCTCTGGAACTCGTCGAT KO_PAS_chr3_05133 HAF 220 AGAAGTTGATTGAGACTTTCAACGAGCCCCAAGTCAAGGAGGAGTT KO_PAS_chr3_05133 HAR 221 GAGTCCAATCACGGCCAATC KO_PAS_chr1-1_01275 HAF 222 TGCTTCTTCGGACAGATCGT KO_PAS_chr1-1_01275 HAR 223 CTCACTTAATCTTCTGTACTCTGAAGTACTGATTGAAGGGTCGGCA KO_PAS_chr1-1_01273 HAF 224 AGAAGTTGATTGAGACTTTCAACGAGTTGTACGGACCAGGAAGCAT KO_PAS_chr1-1_01273 HAR 225 TTCCTCTGCCTCTTCCTTGG KO_PAS_chr4_06865 HAF 226 AGCATGCAAACACGAGGTAC KO_PAS_chr4_06865 HAR 227 CTCACTTAATCTTCTGTACTCTGAAGAGAGGAAAACGAGCTTGGGT KO_PAS_chr4_06863 HAF 228 AGAAGTTGATTGAGACTTTCAACGAGATCAAGGTTGCCAGCGAATG KO_PAS_chr4_06863 HAR 229 ACCCTACAGAACCGCAATGA KO_PAS_chr2-2_01595 HAF 230 ACAGCCCAAATAGAGACGCA KO_PAS_chr2-2_01595 HAR 231 CTCACTTAATCTTCTGTACTCTGAAGAGGAGCCCAGTTTTACGTCA 
KO_PAS_chr2-2_01593 HAF 232 AGAAGTTGATTGAGACTTTCAACGAGTATCCCGCGGTGAAGACTAC KO_PAS_chr2-2_01593 HAR 233 GTGTTGCTAAGCCTGTGGAC KO_PAS_chr3_03885 HAF 234 TCCTCCTTTCGACGCTTCTT KO_PAS_chr3_03885 HAR 235 CTCACTTAATCTTCTGTACTCTGAAGACAGCTGTGAATCATGAAGTTTT KO_PAS_chr3_03883 HAF 236 AGAAGTTGATTGAGACTTTCAACGAGATTCTCACTGGCAGAACGGA KO_PAS_chr3_03883 HAR 237 TTTTCACGTTGAGGCCACTG KO_PAS_chr3_04195 HAF 238 AGCTCCGCAGTAACAGGAAT KO_PAS_chr3_04195 HAR 239 CTCACTTAATCTTCTGTACTCTGAAGTCAAAGCAACTTATGGCGGT KO_PAS_chr3_04193 HAF 240 AGAAGTTGATTGAGACTTTCAACGAGCTCTTCGCAGCACCAGAAAG KO_PAS_chr3_04193 HAR 241 TCGTTGTTGCTGGTGTTCTG KO_PAS_chr1-3_02585 HAF 242 AGTTTGAAGGCACGTTGGTC KO_PAS_chr1-3_02585 HAR 243 CTCACTTAATCTTCTGTACTCTGAAGACTCCAACAGGACTTTGAGGT KO_PAS_chr1-3_02583 HAF 244 AGAAGTTGATTGAGACTTTCAACGAGAAATGTGGAAGTTGCAGCGG KO_PAS_chr1-3_02583 HAR 245 AGGTTGATCGCCGTCTTGTA KO_PAS_chr4_09135 HAF 246 TCTTCATGAGGTGGTAGGCG KO_PAS_chr4_09135 HAR 247 CTCACTTAATCTTCTGTACTCTGAAGAGAGGGCAGATGACATACCG KO_PAS_chr4_09133 HAF 248 AGAAGTTGATTGAGACTTTCAACGAGGAGAAACTGGAGGTGCTCGT KO_PAS_chr4_09133 HAR 249 CAAGGCATTCAGTTGACCGT KO_PAS_chr1-1_00665 HAF 250 ACCAACGAGCCTTACAGACA KO_PAS_chr1-1_00665 HAR 251 CTCACTTAATCTTCTGTACTCTGAAGTTTTGACCGTCAGTGCATGG KO_PAS_chr1-1_00663 HAF 252 AGAAGTTGATTGAGACTTTCAACGAGGTCGGAGGTGTGAGAATTGA KO_PAS_chr1-1_00663 HAR 253 TGGGAACTATGTGGCTCCTC KO_PAS_chr2-2_03105 HAF 254 CGAGCTATCAGTACTCCCGG KO_PAS_chr2-2_03105 HAR 255 CTCACTTAATCTTCTGTACTCTGAAGGGTTCTCAGCTGTCCGAGAT KO_PAS_chr2-2_03103 HAF 256 AGAAGTTGATTGAGACTTTCAACGAGTAGCATTGCCCATCACAACG KO_PAS_chr2-2_03103 HAR 257 GTGGGAAGACTATTGATGCGA KO_PAS_chr1-3_02615 HAF 258 GGGAAATCGCTGAGGTGTAC KO_PAS_chr1-3_02615 HAR 259 CTCACTTAATCTTCTGTACTCTGAAGAGGTCATCTGGAAGCTTTGC KO_PAS_chr1-3_02613 HAF 260 AGAAGTTGATTGAGACTTTCAACGAGGGTGGCCAATGGTATTACTTTGA KO_PAS_chr1-3_02613 HAR 261 ATAAGAGCCCCGATACAGGC KO_PAS_chr2-1_05465 HAF 262 CTTGACACACTTTGCTCCTGA KO_PAS_chr2-1_05465 HAR 263 CTCACTTAATCTTCTGTACTCTGAAGAGTAGCTGACCTGTTGTGCC KO_PAS_chr2-1_05463 HAF 264 
AGAAGTTGATTGAGACTTTCAACGAGGGACACCATATGATGCCCGA KO_PAS_chr2-1_05463 HAR 265 CAGATCAAGTCCAAGTCCGC KO_PAS_chr2-2_03985 HAF 266 AGAGACTTTGCGAGAGTCCC KO_PAS_chr2-2_03985 HAR 267 CTCACTTAATCTTCTGTACTCTGAAGTGCAATATCCAAACACGCCA KO_PAS_chr2-2_03983 HAF 268 AGAAGTTGATTGAGACTTTCAACGAGACTTCTGGAATCTTCGGGCA KO_PAS_chr2-2_03983 HAR 269 GGATGTTTGGGCCATTGTGA KO_PAS_chr4_08355 HAF 270 CAATCTCTCGCTTCATCACG KO_PAS_chr4_08355 HAR 271 CTCACTTAATCTTCTGTACTCTGAAGTCGCTGTTAACCATAATTCTTTG KO_PAS_chr4_08353 HAF 272 AGAAGTTGATTGAGACTTTCAACGAGGCGAGGGTTGAGGAGATTTT KO_PAS_chr4_08353 HAR 273 GGCCATGGCACTATTTTGTT KO_PAS_chr1-1_04915 HAF 274 ACGTACTTCCCGCCCAATAA KO_PAS_chr1-1_04915 HAR 275 CTCACTTAATCTTCTGTACTCTGAAGCCCACCTAAATTTCGAGTGCA KO_PAS_chr1-1_04913 HAF 276 AGAAGTTGATTGAGACTTTCAACGAGACACTTTCGCAGCTTTTGGT KO_PAS_chr1-1_04913 HAR 277 TCCTCCTTGCCATGAAGAGG KO_PAS_chr2-1_04475 HAF 278 GCCTGATGAAGATGATGCCG KO_PAS_chr2-1_04475 HAR 279 CTCACTTAATCTTCTGTACTCTGAAGAGGCTCAGTCACCTCTATGA KO_PAS_chr2-1_04473 HAF 280 AGAAGTTGATTGAGACTTTCAACGAGTGATCAAGAACACCGTCGAAG KO_PAS_chr2-1_04473 HAR 281 TCCCTTTGTTGGTCGTACGA KO_PAS_chr1-3_00535 HAF 282 TGGTTCAACTTGTAGCGCAT KO_PAS_chr1-3_00535 HAR 283 CTCACTTAATCTTCTGTACTCTGAAGGGGCTTGCTCAACTTTTGGA KO_PAS_chr1-3_00533 HAF 284 AGAAGTTGATTGAGACTTTCAACGAGCGACAATCTGGTAGCGCATC KO_PAS_chr1-3_00533 HAR 285 ATGCTCGTACAAAGACCCCA KO_PAS_chr3_02005 HAF 286 TGAGATCTCCAAGTGCAGCA KO_PAS_chr3_02005 HAR 287 CTCACTTAATCTTCTGTACTCTGAAGGACGGTCGATTTGGCTCATC KO_PAS_chr3_02003 HAF 288 AGAAGTTGATTGAGACTTTCAACGAGTGAAGAAGCTCAACACTCTGAAC KO_PAS_chr3_02003 HAR 289 TGATTGACGGCACCCTGTAT KO_PAS_chr1-3_01055 HAF 290 CAATAATTCAGCTGCGCCCT KO_PAS_chr1-3_01055 HAR 291 CTCACTTAATCTTCTGTACTCTGAAGCCTCTGTAGCTGCTTGTCCT KO_PAS_chr1-3_01053 HAF 292 AGAAGTTGATTGAGACTTTCAACGAGAGGAGTCAGTCGGTCCAAAG KO_PAS_chr1-3_01053 HAR 293 TGTGGGCTGGGATGTGTAAT KO_PAS_chr3_06355 HAF 294 AGCACGGTCAAGTAAATCGC KO_PAS_chr3_06355 HAR 295 CTCACTTAATCTTCTGTACTCTGAAGTGCTATCACTGATTTGCCCA KO_PAS_chr3_06353 HAF 296 
AGAAGTTGATTGAGACTTTCAACGAGGGAGATTCCCGGCAAGTATC KO_PAS_chr3_06353 HAR 297 GGCTTTCTGACTACCTGGGT KO_PAS_chr4_05035 HAF 298 AAAGGGAAGAAGGGTGCAGT KO_PAS_chr4_05035 HAR 299 CTCACTTAATCTTCTGTACTCTGAAGAAGGTCGACTCGGGAAACAT KO_PAS_chr4_05033 HAF 300 AGAAGTTGATTGAGACTTTCAACGAGTGGTATCCCGACTGCTTTGT KO_PAS_chr4_05033 HAR 301 TGGAATGGCTCGAGAATGGT KO_PAS_chr2-1_05695 HAF 302 ACCAACAGGCTGAACACTAGA KO_PAS_chr2-1_05695 HAR 303 CTCACTTAATCTTCTGTACTCTGAAGTCGTCAGCAGAGAAGGTACA KO_PAS_chr2-1_05693 HAF 304 AGAAGTTGATTGAGACTTTCAACGAGACGGACTCCCTAACGAACAA KO_PAS_chr2-1_05693 HAR 305 TCTGATGGTTGGCTTTGCTT KO_PAS_chr3_12235 HAF 306 CGGTTTGTGGCCCATCTATG KO_PAS_chr3_12235 HAR 307 CTCACTTAATCTTCTGTACTCTGAAGAAAACCGACGCTTGAACTCC KO_PAS_chr3_12233 HAF 308 AGAAGTTGATTGAGACTTTCAACGAGAAGTCTTGACCGGAAGCAAC KO_PAS_chr3_12233 HAR 309 GGGCCTTAACAAACACCACA KO_PAS_chr2-1_05975 HAF 310 TAGAGGCGGAAAGGAACGAG KO_PAS_chr2-1_05975 HAR 311 CTCACTTAATCTTCTGTACTCTGAAGTTGCCAAGGGTGTACAAAGC KO_PAS_chr2-1_05973 HAF 312 AGAAGTTGATTGAGACTTTCAACGAGACCAAGTTGTTCGACGAAGA KO_PAS_chr2-1_05973 HAR 313 CAACACATACCAGGCGAAGG KO_PAS_chr1-1_03275 HAF 314 CCCTCCTCCGCCATCATTAT KO_PAS_chr1-1_03275 HAR 315 CTCACTTAATCTTCTGTACTCTGAAGTAGGAGACAACCAAGCCAGC KO_PAS_chr1-1_03273 HAF 316 AGAAGTTGATTGAGACTTTCAACGAGGGAGTAGAAAATGGTGCGTCC KO_PAS_chr1-1_03273 HAR 317 AATGGCTCCAAATCACAGGC KO_PAS_chr2-2_03805 HAF 318 GCTTTGAGGAATGCGTGAAGA KO_PAS_chr2-2_03805 HAR 319 CTCACTTAATCTTCTGTACTCTGAAGGTAGTGAGAGTGGCGCCTTA KO_PAS_chr2-2_03803 HAF 320 AGAAGTTGATTGAGACTTTCAACGAGTGGGTACAACGTGACTCTAGG KO_PAS_chr2-2_03803 HAR 321 ACACTCTTAAGGCTCGTCGT KO_PAS_chr3_09285 HAF 322 CTCCTCCACTTCAGTATCCGT KO_PAS_chr3_09285 HAR 323 CTCACTTAATCTTCTGTACTCTGAAGTTCCTTGAATTTCCGCCACC KO_PAS_chr3_09283 HAF 324 AGAAGTTGATTGAGACTTTCAACGAGGAGCAGGCAAGGTTGGATTC KO_PAS_chr3_09283 HAR 325 CTGGGCAGCAAATAACGGTT PAS_chr1-3_01845 HAF 326 CCAAAGTTGGCTCCGAGTAG PAS_chr1-3_01845 HAR 327 CTCACTTAATCTTCTGTACTCTGAAGCCTAACGGTATCGGCTTTGA PAS_chr1-3_01843 HAF 328 
AGAAGTTGATTGAGACTTTCAACGAGGGCAAAATCCTTTTCCATGA PAS_chr1-3_01843 HAR 329 GAAGAAGGCCAAGTGTGATA KO_PAS_chr1-4_02895 HAF 330 GACGAGACGCTGTTCCTTTC KO_PAS_chr1-4_02895 HAR 331 CTCACTTAATCTTCTGTACTCTGAAGTGTGAAGAGAGGCCACCATT KO_PAS_chr1-4_02893 HAF 332 AGAAGTTGATTGAGACTTTCAACGAGTGATCGACTACTTGGCCTCC KO_PAS_chr1-4_02893 HAR 333 AACAACATTCAAGCTGCCGT
(128) TABLE-US-00008 TABLE 8 Forward and reverse primers for amplifying modified sequences
Description SEQ ID NO: Sequence (5' to 3')
KO_PAS_chr3_1087VerificationF 334 ATCGGCAAAGATGAAGCGAC KO_PAS_chr3_1087VerificationR 335 GCTGGACACTTCTGAGCTCA KO_PAS_chr4_0584VerificationF 336 ACTTGTCAGGACGATACGGA KO_PAS_chr4_0584VerificationR 337 CCGGTCTCCCTGGAAATAGA KO_PAS_chr3_0076VerificationF 338 GCGAGGTCCTTGTCAATGAG KO_PAS_chr3_0076VerificationR 339 ACAAGAACTCGGGCTCCTTT KO_PAS_chr3_0691VerificationF 340 TTGCAGCGCTCCATAATGTC KO_PAS_chr3_0691VerificationR 341 GCTGATTCTGAGAACGCTGG KO_PAS_chr3_0303VerificationF 342 GCCATTCTTCGGTGCAGTAG KO_PAS_chr3_0303VerificationR 343 TAGAGTTGTCCCAAACGGCA KO_PAS_chr3_0815VerificationF 344 CGTGGTTCTCGAGGCTCTAT KO_PAS_chr3_0815VerificationR 345 GGAGTTGGAACGTCGTAGGA KO_PAS_chr3_1157VerificationF 346 AGTTGTCCGTCATTAGCCCT KO_PAS_chr3_1157VerificationR 347 TGTTCCCTTTCGGCTAGACA KO_PAS_chr1-4_0164VerificationF 348 ACGGTTGAGGGCATTACGTA KO_PAS_chr1-4_0164VerificationR 349 TTGTCTTCCACCCCTTCGTT KO_PAS_chr3_0979VerificationF 350 GGTTGGCCTTGGACATTGTT KO_PAS_chr3_0979VerificationR 351 TGCTCTTCGGTACTCATGCT KO_PAS_chr3_0803VerificationF 352 TTTGGCCATGCTGAGCTTTT KO_PAS_chr3_0803VerificationR 353 AAGCCCGATCACTTGCATTT KO_PAS_chr3_0394VerificationF 354 CACCTAATGTTTGGCACCCC KO_PAS_chr3_0394VerificationR 355 ATCCCAGACTGACATCGCAA KO_PAS_chr2-1_0366VerificationF 356 CCGCCAGAAATTCATGCCAT KO_PAS_chr2-1_0366VerificationR 357 TCGTTTCACTGTACCATGCA KO_PAS_chr3_0842VerificationF 358 ACCAGTCCGCATTTTCACTG KO_PAS_chr3_0842VerificationR 359 GTGGACAGCTGCAATCGTAG KO_PAS_chr1-3_0195VerificationF 360 CAACTGGGAAGCCTGCATTT KO_PAS_chr1-3_0195VerificationR 361 CCTTGCATATCCGTTTGCCA KO_PAS_chr1-4_0052VerificationF 362 GGAGGTTCAGGAGCAGGAAT KO_PAS_chr1-4_0052VerificationR 363 CGGTTTCATCTGTTGCCTCC KO_PAS_chr2-2_0057VerificationF 364 GTCGCCCATGTTCTTTCGAT KO_PAS_chr2-2_0057VerificationR 365 CAAACAGGCTGGAAACCACA KO_PAS_chr1-3_0150VerificationF 366 AATCTCCACGTTCAGTTGCG KO_PAS_chr1-3_0150VerificationR 367 
TCATCCCTTGAAAACCCCGA KO_PAS_chr1-3_0221VerificationF 368 TTGTGGAGGGAGATTCAGGC KO_PAS_chr1-3_0221VerificationR 369 AAGGTAAGGAACGTGCTTGC KO_PAS_FragD_0022VerificationF 370 GTTCTACTGTTCACGTGCTCT KO_PAS_FragD_0022VerificationR 371 ACCGGTTAGAATACATGCTGC KO_PAS_chr2-1_0159VerificationF 372 CGAAAAGAAGCTGGACTCCG KO_PAS_chr2-1_0159VerificationR 373 TTCCATCGTACGACCAGTGT KO_PAS_chr2-1_0326VerificationF 374 AGCGATGAGGCCAACAGTAT KO_PAS_chr2-1_0326VerificationR 375 TGTCCAGCCCAAAAGACTGA KO_PAS_chr2-2_0056VerificationF 376 CTCCTGGGGCTCGTACTAAG KO_PAS_chr2-2_0056VerificationR 377 CCTCAATAACGACGGCCTTG KO_PAS_chr1-4_0611VerificationF 378 CCTTTTCCTGATCAGTGGGG KO_PAS_chr1-4_0611VerificationR 379 TGTTGGGGAATGAAACACGA KO_PAS_chr1-1_0274VerificationF 380 GAAGGACGAGTAGGGTTGCT KO_PAS_chr1-1_0274VerificationR 381 TCCTGATCTGGCTCGTTTGT KO_PAS_chr4_0834VerificationF 382 ACCTCCAACTCCTGAAAGCA KO_PAS_chr4_0834VerificationR 383 CCTCGAGTCTGGGCTTTACA KO_PAS_chr3_0896VerificationF 384 GGAGAGATGCCAGACCAAGT KO_PAS_chr3_0896VerificationR 385 AGCCTGTTCTACTGCATACGT KO_PAS_chr3_0561VerificationF 386 CCATTTCTTGTACCCTGGGC KO_PAS_chr3_0561VerificationR 387 GCAGAAAAGGCGCGAATTTC KO_PAS_chr3_0633VerificationF 388 GGGAAAGGATGTGGACCAAC KO_PAS_chr3_0633VerificationR 389 TGGCCAAGAGTGTCCAATTG KO_PAS_chr4_0013VerificationF 390 TAACAGATGGCGCACGTAGA KO_PAS_chr4_0013VerificationR 391 CCTTGCGTTCCCAGGTAAAG KO_PAS_chr1-1_0379VerificationF 392 TGTGGTATGGTTTGGGGCTA KO_PAS_chr1-1_0379VerificationR 393 ACTCCCGTTCCTCCATGTTC KO_PAS_chr2-1_0172VerificationF 394 ACGGTACAAAAGGCGTTTCA KO_PAS_chr2-1_0172VerificationR 395 AGTCAAACTCGGTGGTAGGT KO_PAS_chr3_0866VerificationF 396 CGGTTATCATGTGCCTGCTC KO_PAS_chr3_0866VerificationR 397 ATGTTGCTGCTCCGAAATCC KO_PAS_chr3_0299VerificationF 398 GATCTGCTGGCCTTGAGAGT KO_PAS_chr3_0299VerificationR 399 CTATGTCCTGGTGTTTGCCG KO_PAS_chr1-4_0251VerificationF 400 GCCAATGATGATCTCGCAGG KO_PAS_chr1-4_0251VerificationR 401 GCCTTTGATATGCCGTCGTT KO_PAS_chr4_0874VerificationF 402 TCGAGTAATGCTTCCCACCA 
KO_PAS_chr4_0874VerificationR 403 AGCTTTCACAACAGCGATCG KO_PAS_chr3_0513VerificationF 404 TGATTGCTTCTGGGTTGCTG KO_PAS_chr3_0513VerificationR 405 CAAAACCGGCGTAAAATGGC KO_PAS_chr1-1_0127VerificationF 406 TTGTGCTGCATCTGTGTGAG KO_PAS_chr1-1_0127VerificationR 407 AGCCTACAAGTGGTTACAGGT KO_PAS_chr4_0686VerificationF 408 GGAAACCGACCAGCCTAAAG KO_PAS_chr4_0686VerificationR 409 AGTCGCACCAGGTTATCACA KO_PAS_chr2-2_0159VerificationF 410 GGAAAGCTGCCCAGAAACTC KO_PAS_chr2-2_0159VerificationR 411 TGAGAGGATTCGTTGTGGCT KO_PAS_chr3_0388VerificationF 412 CTATGTCGAAGTAGCGGTGC KO_PAS_chr3_0388VerificationR 413 AGAGTGGCACTGCTATCGAA KO_PAS_chr3_0419VerificationF 414 CGTACAAACTTGGCAGCTGT KO_PAS_chr3_0419VerificationR 415 GCTGTGTTGTAAATTCCGGC KO_PAS_chr1-3_0258VerificationF 416 ACAACCCGGAAGACAACTCT KO_PAS_chr1-3_0258VerificationR 417 TGTCGTTGCCTTCCCGATAT KO_PAS_chr4_0913VerificationF 418 GAAGATGGGAGAGGGTGCTT KO_PAS_chr4_0913VerificationR 419 CTTGTTGACGACGGTAGCAG KO_PAS_chr1-1_0066VerificationF 420 CCCTAGTCTCGTTCGAAGGG KO_PAS_chr1-1_0066VerificationR 421 GGCACAGCAGGTTTTCGTAT KO_PAS_chr2-2_0310VerificationF 422 GGAGATTCTGATGCTACCCCA KO_PAS_chr2-2_0310VerificationR 423 TGGAGCCATCAGATCAGGAC KO_PAS_chr1-3_0261VerificationF 424 CCTGTTCTTGCAAGCCTTCA KO_PAS_chr1-3_0261VerificationR 425 TAAGACATGCGACCACCAGA KO_PAS_chr2-1_0546VerificationF 426 CATGGCCAATGTCGAACTGT KO_PAS_chr2-1_0546VerificationR 427 AGCTGGCTGAAAAGGTGTTG KO_PAS_chr2-2_0398VerificationF 428 CTCAGTGTTGGAAAGCACCC KO_PAS_chr2-2_0398VerificationR 429 TAGGGAATCTTTGGTGGCGT KO_PAS_chr4_0835VerificationF 430 GGAACCTAGAGCGAGCAACA KO_PAS_chr4_0835VerificationR 431 CAGGCTCTATTGTCGACGTG KO_PAS_chr1-1_0491VerificationF 432 GGAGGTGATGACAATGCCAC KO_PAS_chr1-1_0491VerificationR 433 CTGTGAAGCTCCTCCTACGT KO_PAS_chr2-1_0447VerificationF 434 GGACACTGCTGGACAAGAGA KO_PAS_chr2-1_0447VerificationR 435 TACTGACGCCGAAGAGCTAG KO_PAS_chr1-3_0053VerificationF 436 CCGATCGCAAAATAGTGGCA KO_PAS_chr1-3_0053VerificationR 437 GTTGTGGTTGTATGCGGTCA 
KO_PAS_chr3_0200VerificationF 438 CAATAACTCCACTGGTGCCG KO_PAS_chr3_0200VerificationR 439 TCGTTATACTCCAGCGTGCT KO_PAS_chr1-3_0105VerificationF 440 GGGCTCAAAATCTGGAACCA KO_PAS_chr1-3_0105VerificationR 441 CAATGCAGTACTCACCGGTG KO_PAS_chr3_0635VerificationF 442 AAGCTGACGACCCCTTAGAC KO_PAS_chr3_0635VerificationR 443 CTATCGTGTCTGGGCTGCTA KO_PAS_chr4_0503VerificationF 444 AAGGAGATTGCCGCAACTCT KO_PAS_chr4_0503VerificationR 445 GTGGAGTCAGAGTCGAGAGG KO_PAS_chr2-1_0569VerificationF 446 CCCAGCTTTTATACGGCTTGG KO_PAS_chr2-1_0569VerificationR 447 CAGCAAAAGCTCGTGATCCA KO_PAS_chr3_1223VerificationF 448 TGCGGGTAGTCGATTGATGT KO_PAS_chr3_1223VerificationR 449 TCACGTATCTCAGCAACAGGA KO_PAS_chr2-1_0597VerificationF 450 GGACCTAGGAAATACGCCCA KO_PAS_chr2-1_0597VerificationR 451 ACTCCAGTTCCACAAGTCCA KO_PAS_chr1-1_0327VerificationF 452 ACTGCCAACCGTTTACTCCA KO_PAS_chr1-1_0327VerificationR 453 GCGCGGAAGATTAAAGTCGT KO_PAS_chr2-2_0380VerificationF 454 TTGGACTCGATCGATGAGGG KO_PAS_chr2-2_0380VerificationR 455 TGATGACTTCCAAGATGCGC KO_PAS_chr3_0928VerificationF 456 TCACCTGGAGCAACTGATGT KO_PAS_chr3_0928VerificationR 457 GTTTGGTACGCTTGTAGGCC PAS_chr1-3_0184VerificationF 458 GATGAGCAAGCATCCATTCA PAS_chr1-3_0184VerificationR 459 AAAGACAGGAGCGTGAGCAT KO_PAS_chr1-4_0289VerificationF 460 CTCAACTTCGCTTGCCCTTT KO_PAS_chr1-4_0289VerificationR 461 TGGGAAACAGAACGATGAACT
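Because the primer table above is reproduced as flattened text, it can help to split it back into records before use. The following is a minimal Python sketch and not part of the disclosure: the function name `parse_primers` and the regular expression are illustrative assumptions, and the sample rows are the verification primers for PAS_chr4_0584 and PAS_chr3_1157 copied from Table 8.

```python
import re

# Hypothetical pattern: a name ending in "VerificationF"/"VerificationR",
# then the SEQ ID NO, then the primer sequence (uppercase A/C/G/T only).
ROW = re.compile(r"(\S+Verification[FR])\s+(\d+)\s+([ACGT]+)")


def parse_primers(text):
    """Return (description, seq_id_no, sequence) tuples from flattened table text."""
    return [(name, int(seq_id), seq) for name, seq_id, seq in ROW.findall(text)]


# Four rows copied from Table 8 (SEQ ID NOs: 336, 337, 346, 347).
sample = (
    "KO_PAS_chr4_0584VerificationF 336 ACTTGTCAGGACGATACGGA "
    "KO_PAS_chr4_0584VerificationR 337 CCGGTCTCCCTGGAAATAGA "
    "KO_PAS_chr3_1157VerificationF 346 AGTTGTCCGTCATTAGCCCT "
    "KO_PAS_chr3_1157VerificationR 347 TGTTCCCTTTCGGCTAGACA"
)

records = parse_primers(sample)
for name, seq_id, seq in records:
    print(f"{seq_id}\t{name}\t{seq}")
```

The pattern is deliberately narrow: it only matches the naming scheme used in Table 8 and would need adjusting for the homology-arm primer names used in the preceding table.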
(129) TABLE-US-00009 TABLE 9 18B Vector
Description SEQ ID NO: Sequence (5' to 3')
18B silk-like polypeptide encoding sequence 462 ggtggttacggtccaggcgctggtcaacaaggtccaggaagtggtggtcaacaaggacct 60 ggcggtcaaggaccctacggtagtggccaacaaggtccaggtggagcaggacagcagggt 120 ccgggaggccaaggaccttacggaccaggtgctgctgctgccgccgctgccgctgccgga 180 ggttacggtccaggagccggacaacagggtccaggtggagctggacaacaaggtccagga 240 tcacaaggtcctggtggacaaggtccatacggtcctggtgctggtcaacagggaccaggt 300 agtcaaggacctggttcaggtggtcagcagggtccaggaggacagggtccttacggccct 360 tctgccgctgcagcagcagccgctgccgcaggaggatacggacctggtgctggacaacga 420 tctcaaggaccaggaggacaaggtccttatggacctggcgctggccaacaaggacctggt 480 tctcagggtccaggttcaggaggccaacaaggcccaggaggtcaaggaccatacggacca 540 tccgctgcggcagctgcagctgctgcaggtggatatggcccaggagccggacaacagggt 600 cctggttcacaaggtccaggatctggtggtcaacagggaccaggcggccagggaccttat 660 ggtccaggagccgctgcagcagcagcagctgttggaggttacggccctggtgccggtcaa 720 caaggcccaggatctcagggtcctggatctggaggacaacaaggtcctggaggtcagggt 780 ccatacggaccttcagcagcagctgctgctgcagccgctggtggttatggacctggtgct 840 ggtcaacaaggaccgggttctcagggtccgggttcaggaggtcagcagggccctggtgga 900 caaggaccttatggacctagtgcggctgcagcagctgccgccgcaggtggttacggtcca 960 ggcgctggtcaacaaggtccaggaagtggtggtcaacaaggacctggcggtcaaggaccc 1020 tacggtagtggccaacaaggtccaggtggagcaggacagcagggtccgggaggccaagga 1080 ccttacggaccaggtgctgctgctgccgccgctgccgctgccggaggttacggtccagga 1140 gccggacaacagggtccaggtggagctggacaacaaggtccaggatcacaaggtcctggt 1200 ggacaaggtccatacggtcctggtgctggtcaacagggaccaggtagtcaaggacctggt 1260 tcaggtggtcagcagggtccaggaggacagggtccttacggcccttctgccgctgcagca 1320 gcagccgctgccgcaggaggatacggacctggtgctggacaacgatctcaaggaccagga 1380 ggacaaggtccttatggacctggcgctggccaacaaggacctggttctcagggtccaggt 1440 tcaggaggccaacaaggcccaggaggtcaaggaccatacggaccatccgctgcggcagct 1500 gcagctgctgcaggtggatatggcccaggagccggacaacagggtcctggttcacaaggt 1560 ccaggatctggtggtcaacagggaccaggcggccagggaccttatggtccaggagccgct 1620 gcagcagcagcagctgttggaggttacggccctggtgccggtcaacaaggcccaggatct 1680 
cagggtcctggatctggaggacaacaaggtcctggaggtcagggtccatacggaccttca 1740 gcagcagctgctgctgcagccgctggtggttatggacctggtgctggtcaacaaggaccg 1800 ggttctcagggtccgggttcaggaggtcagcagggccctggtggacaaggaccttatgga 1860 cctagtgcggctgcagcagctgccgccgcaggtggttacggtccaggcgctggtcaacaa 1920 ggtccaggaagtggtggtcaacaaggacctggcggtcaaggaccctacggtagtggccaa 1980 caaggtccaggtggagcaggacagcagggtccgggaggccaaggaccttacggaccaggt 2040 gctgctgctgccgccgctgccgctgccggaggttacggtccaggagccggacaacagggt 2100 ccaggtggagctggacaacaaggtccaggatcacaaggtcctggtggacaaggtccatac 2160 ggtcctggtgctggtcaacagggaccaggtagtcaaggacctggttcaggtggtcagcag 2220 ggtccaggaggacagggtccttacggcccttctgccgctgcagcagcagccgctgccgca 2280 ggaggatacggacctggtgctggacaacgatctcaaggaccaggaggacaaggtccttat 2340 ggacctggcgctggccaacaaggacctggttctcagggtccaggttcaggaggccaacaa 2400 ggcccaggaggtcaaggaccatacggaccatccgctgcggcagctgcagctgctgcaggt 2460 ggatatggcccaggagccggacaacagggtcctggttcacaaggtccaggatctggtggt 2520 caacagggaccaggcggccagggaccttatggtccaggagccgctgcagcagcagcagct 2580 gttggaggttacggccctggtgccggtcaacaaggcccaggatctcagggtcctggatct 2640 ggaggacaacaaggtcctggaggtcagggtccatacggaccttcagcagcagctgctgct 2700 gcagccgctggtggttatggacctggtgctggtcaacaaggaccgggttctcagggtccg 2760 ggttcaggaggtcagcagggccctggtggacaaggaccttatggacctagtgcggctgca 2820 gcagctgccgccgca 2835 18B 463 GGYGPGAGQQGPGSGGQQGPGGQGPYGSGQQGPGGAGQQGPGGQGPYGPGAAAAAAAAAG polypeptide GYGPGAGQQGPGGAGQQGPGSQGPGGQGPYGPGAGQQGPGSQGPGSGGQQGPGGQGPYGP sequence SAAAAAAAAAGGYGPGAGQRSQGPGGQGPYGPGAGQQGPGSQGPGSGGQQGPGGQGPYGP SAAAAAAAAGGYGPGAGQQGPGSQGPGSGGQQGPGGQGPYGPGAAAAAAAVGGYGPGAGQ QGPGSQGPGSGGQQGPGGQGPYGPSAAAAAAAAGGYGPGAGQQGPGSQGPGSGGQQGPGG QGPYGPSAAAAAAAAGGYGPGAGQQGPGSGGQQGPGGQGPYGSGQQGPGGAGQQGPGGQG PYGPGAAAAAAAAAGGYGPGAGQQGPGGAGQQGPGSQGPGGQGPYGPGAGQQGPGSQGPG SGGQQGPGGQGPYGPSAAAAAAAAAGGYGPGAGQRSQGPGGQGPYGPGAGQQGPGSQGPG SGGQQGPGGQGPYGPSAAAAAAAAGGYGPGAGQQGPGSQGPGSGGQQGPGGQGPYGPGAA AAAAAVGGYGPGAGQQGPGSQGPGSGGQQGPGGQGPYGPSAAAAAAAAGGYGPGAGQQGP GSQGPGSGGQQGPGGQGPYGPSAAAAAAAAGGYGPGAGQQGPGSGGQQGPGGQGPYGSGQ 
QGPGGAGQQGPGGQGPYGPGAAAAAAAAAGGYGPGAGQQGPGGAGQQGPGSQGPGGQGPY GPGAGQQGPGSQGPGSGGQQGPGGQGPYGPSAAAAAAAAAGGYGPGAGQRSQGPGGQGPY GPGAGQQGPGSQGPGSGGQQGPGGQGPYGPSAAAAAAAAGGYGPGAGQQGPGSQGPGSGG QQGPGGQGPYGPGAAAAAAAVGGYGPGAGQQGPGSQGPGSGGQQGPGGQGPYGPSAAAAA AAAGGYGPGAGQQGPGSQGPGSGGQQGPGGQGPYGPSAAAAAAAA Repeat sequence of a silk-like polypeptide 464 GGYGPGAGQQGPGSGGQQGPGGQGPYGSGQQGPGGAGQQGPGGQGPYGPGAAAAAAAAAG GYGPGAGQQGPGGAGQQGPGSQGPGGQGPYGPGAGQQGPGSQGPGSGGQQGPGGQGPYGP SAAAAAAAAAGGYGPGAGQRSQGPGGQGPYGPGAGQQGPGSQGPGSGGQQGPGGQGPYGP SAAAAAAAAGGYGPGAGQQGPGSQGPGSGGQQGPGGQGPYGPGAAAAAAAVGGYGPGAGQ QGPGSQGPGSGGQQGPGGQGPYGPSAAAAAAAAGGYGPGAGQQGPGSQGPGSGGQQGPGG QGPYGPSAAAAAAAA
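The relationship between the 18B coding sequence (SEQ ID NO: 462) and the 18B polypeptide sequence (SEQ ID NO: 463) in Table 9 can be spot-checked by translation. The Python sketch below is illustrative and not part of the disclosure: it assumes the standard genetic code, the helper name `translate` is hypothetical, and only the first 60 nucleotides and first 20 residues from Table 9 are used.

```python
# Standard genetic code, built from the conventional T/C/A/G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate a DNA string in frame 1; trailing partial codons are ignored."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# First 60 nt of SEQ ID NO: 462 and first 20 residues of SEQ ID NO: 463 (Table 9).
first_60_nt = "ggtggttacggtccaggcgctggtcaacaaggtccaggaagtggtggtcaacaaggacct"
first_20_aa = "GGYGPGAGQQGPGSGGQQGP"

translated = translate(first_60_nt)
print(translated == first_20_aa)
```

The same function could be applied to the full 2835-nucleotide sequence, which corresponds to 945 codons, after removing the position counters (60, 120, and so on) that are interleaved with the sequence in the table.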
(130) TABLE-US-00010 TABLE 10 Zeocin Cassette with HA arms for KU70 deletion in P. pastoris
Description SEQ ID NO: Sequence (5' to 3')
Plasmid sequence 465 ggagttgaatcacatcttactggatagcgagctttttgacgaagtgaaaatttctaattttaaacaagaggaaggggtca aaaacggagatatcttatacttggaaaaagagatgacaatcagtgatttcatcaattttgtatctagttggccttctgtg ttttcgtggaagcagcaacgaggaaaggagggtatcctagatgatttttacaacgaactgaacgactgctttgagggggg taacatgaaagtaatatggaactccgtcctagtatttgccaggaggaagcaaagggttgtataggctttagtacttatag aggaaacggggttacgtgcaagcgcgcatgcctgagctttgaggggggggactttcacatctcttcttctcacacttagc cctaacacagagaataataaaaagcattgcaagatgagtgttgtcagcaagcaatacgacatccacgaaggcattatctt tgtaattgaattgaccccggagcttcacgcgccggcttcagaagggaaatctcagctccagatcatcttagagaatgtca gtgaggttatttctgagctaatcattaccttgcccggtacaggaatagggtgttaccttattaattacgacggtggtcaa aacgacgaaatttaccccatttttgagttacaagacctgaatttggaaatgatgaaacaattgtaccaagtcttggagga ccatgtaagtgggcttaatcctctcgagaagcaattcccaattgaacacagtaaaccgttatcagccactctgttctttc acttaaggtctcttttttacatggcgaagactcataagcgtactggaagacattacaacttgaaaaagattttcttgttc actaataacgataaaccttacaatggaaactctcagctgagagttcccttgaagaaaaccctggctgattacaatgacgt agacattactttgattccgtttcttctgaacaagccttcaggtgtcaagtttgacaagacggaatactcagaaattttgt tctatgataaagatgcttgttcgatgtcaattgaggagatccgccaacgaatttctagacataaggagatcaagcgggtt tacttcacctgtcctttgaaaatcgcaaataacttgtgcatttctgtgaaaggttattctatgttttatcatgaaactcc aaggaagatcaaatttgtcgtcaatgagggttcaactttcaaagatgtggagacaaaatctcagtttgtcgatccaacat ccggaaaagagttttccagtgaacagctgatcaaagcatatcctctaggtgccgatgcttacattcctttaaactcagag caagtcaaaacaataaatcgatttaatgatatcatcaatatcccctctttggaaattctaggtttcagggatatatctaa ttggttgccacagtatcagtttggcaaagcatcgtttttatcccctaataactatggtgattttacacattcgcagagaa catttagttgtcttcagtaatgtcttgtttcttttgttgcagtggtgagccattttgacttcgtgaaagtttctttagaa tagttgtttccagaggccaaacattccacccgtagtaaagtgcaagcgtaggaagaccaagactggcataaatcaggtat aagtgtcgagcactggcaggtgatcttctgaaagtttctactagcagataagatccagtagtcatgcatatggcaacaat 
gtaccgtgtggatctaagaacgcgtcctactaaccttcgcattcgttggtccagtttgttgttatcgatcaacgtgacaa ggttgtcgattccgcgtaagcatgcatacccaaggacgcctgttgcaattccaagtgagccagttccaacaatctttgta atattagagcacttcattgtgttgcgcttgaaagtaaaatgcgaacaaattaagagataatctcgaaaccgcgacttcaa acgccaatatgatgtgcggcacacaataagcgttcatatccgctgggtgactttctcgctttaaaaaattatccgaaaaa atttttgacggctagctcagtcctaggtacgctagcattaaagaggagaaaatggctaaactgacctctgctgttccggt tctgaccgctcgtgacgttgctggtgctgttgagttctggaccgaccgtctgggtttctctcgtgacttcgttgaagacg acttcgctggtgttgttcgtgacgacgttaccctgttcatctctgctgttcaggaccaggttgttccggacaacaccctg gcttgggtttgggttcgtggtctggacgaactgtacgctgaatggtctgaagttgtttctaccaacttccgtgacgcttc tggtccggctatgaccgaaatcggtgaacagccgtggggtcgtgagttcgctctgcgtgacccggctggtaactgcgttc acttcgttgctgaagaacaggactaacacgtccgacggcggcccacgggtcccaggcctcggagatccgtcccccttttc ctttgtcgatatcatgtaattagttatgtcacgcttacattcacgccctccccccacatccgctctaaccgaaaaggaag gagttagacaacctgaagtctaggtccctatttatttttttatagttatgttagtattaagaacgttatttatatttcaa atttttcttttttttctgtacagacgcgtgtacgcatgtaacattatactgaaaaccttgcttgagaaggttttgggacg ctcgaaggctttaatttgcaagctgtattagtttcacttttcagcaacctggtcggaaagatccacatcaagaatggata ccaaccccaagagtatgaaaatccttccctacaatggcacttcaaaatgttacgtgacgattaccttcaattggaacacg atatcgacatcagtgacccccttgagaaacaaaagtacataaacagcctcgatgagacaaaaaccaagatcatgaaacta cgggactatgtcaaggaaactgccgatgatgacgacccttcacggcttgccaacactctcaaagagctcaaccaagagct gaacaaaatttccaactttgatatcatcgccaataagaagccaaagacccccacgacagtagaccctgttcctactgatg atgacatcatcaacgcctggaaggcaggaactctgaacggtttcaaggtggatcaattacgaaaatacgtaaggtcacga aacaactttctggagacggcctccaaaaaggcagatctcatcgccaacattgacaagtactttcagcagaagttcaaaga gactaaggcctgattcgtgttccttactttttcctcgcaacgtgtttttttcccaccacattgcctatgttgtaatgcaa tgcagatgctggcccagtttttgacgattctcgaaaattggcattttcgtcgatgccattggccaaactgaaaattcaag acaaaatagattggattttatctgcaacgtcttccacctacacaaccactctacaaacttcagacaaacatgtttataaa agcagctactagatccaaaatgacaagttcgttattctctactacgtttgttgtggcatttggattggtggctagcaaca 
acctcttgccatgtcctgttgaccactctatgaataacgagactccgcaagaattgaaaccattgcaggctgaatcttct actagaaagttgaactcttccgcttaagtcaaataaaactactgacacagatgatgcacagaaacaacggatcacgctct tgactgattagtcccgtcattttggttctcattttcttcacagtcacctatcaatgtatgatcacctggaaggatttccc tacgatacttcaaatcttttacttgataatattactcattatggctcaggaatgcagactgcctgattcaagacgctgct cttcttatttaacacttgtacactaaccccatggaagccagggaagggaataaccatctctctggtaataaatcggtctt tatttatgcatagaaaaggaatctattatatttcgttcatttggcactctgctaactgtagattaacgggtctcgtaaat tcaaaatcttcttccgatcaaaccggggtgaaatattacttctcgtgcatagctaattttcaaataaccgtcctaaaatg aacggtcatttacctggactctcttgccaaatgggcaacaaaacataaagctgatcagaacgtaactagtctctcggaat ccat HAF 466 ggagttgaatcacatcttactg KU70HA1 467 gacaactaaatgttctctgcgaatgtgtaaaatcaccatagttattaggggataaaaacgatgctttgccaaactgatac tgtggcaaccaattagatatatccctgaaacctagaatttccaaagaggggatattgatgatatcattaaatcgatttat tgttttgacttgctctgagtttaaaggaatgtaagcatcggcacctagaggatatgctttgatcagctgttcactggaaa actcttttccggatgttggatcgacaaactgagattttgtctccacatctttgaaagttgaaccctcattgacgacaaat ttgatcttccttggagtttcatgataaaacatagaataacctttcacagaaatgcacaagttatttgcgattttcaaagg acaggtgaagtaaacccgcttgatctccttatgtctagaaattcgttggcggatctcctcaattgacatcgaacaagcat ctttatcatagaacaaaatttctgagtattccgtcttgtcaaacttgacacctgaaggcttgttcagaagaaacggaatc aaagtaatgtctacgtcattgtaatcagccagggttttcttcaagggaactctcagctgagagtttccattgtaaggttt atcgttattagtgaacaagaaaatctttttcaagttgtaatgtcttccagtacgcttatgagtcttcgccatgtaaaaaa gagaccttaagtgaaagaacagagtggctgataacggtttactgtgttcaattgggaattgcttctcgagaggattaagc ccacttacatggtcctccaagacttggtacaattgtttcatcatttccaaattcaggtcttgtaactcaaaaatggggta aatttcgtcgttttgaccaccgtcgtaattaataaggtaacaccctattcctgtaccgggcaaggtaatgattagctcag aaataacctcactgacattctctaagatgatctggagctgagatttcccttctgaagccggcgcgtgaagctccggggtc aattcaattacaaagataatgccttcgtggatgtcgtattgcttgctgacaacactcat KU70HA2 468 tcaggccttagtctctttgaacttctgctgaaagtacttgtcaatgttggcgatgagatctgcctttttggaggccgtct ccagaaagttgtttcgtgaccttacgtattttcgtaattgatccaccttgaaaccgttcagagttcctgccttccaggcg 
ttgatgatgtcatcatcagtaggaacagggtctactgtcgtgggggtctttggcttcttattggcgatgatatcaaagtt ggaaattttgttcagctcttggttgagctctttgagagtgttggcaagccgtgaagggtcgtcatcatcggcagtttcct tgacatagtcccgtagtttcatgatcttggtttttgtctcatcgaggctgtttatgtacttttgtttctcaagggggtca ctgatgtcgatatcgtgttccaattgaaggtaatcgtcacgtaacattttgaagtgccattgtagggaaggattttcata ctcttggggttggtatccattcttgatgtggatctttccgaccaggttgctgaaaagtgaaactaatac pILV5 469 ttcagtaatgtcttgtttcttttgttgcagtggtgagccattttgacttcgtgaaagtttctttagaatagttgtttcca gaggccaaacattccacccgtagtaaagtgcaagcgtaggaagaccaagactggcataaatcaggtataagtgtcgagca ctggcaggtgatcttctgaaagtttctactagcagataagatccagtagtcatgcatatggcaacaatgtaccgtgtgga tctaagaacgcgtcctactaaccttcgcattcgttggtccagtttgttgttatcgatcaacgtgacaaggttgtcgattc cgcgtaagcatgcatacccaaggacgcctgttgcaattccaagtgagccagttccaacaatctttgtaatattagagcac ttcattgtgttgcgcttgaaagtaaaatgcgaacaaattaagagataatctcgaaaccgcgacttcaaacgccaatatga tgtgcggcacacaataagcgttcatatccgctgggtgactttctcgctttaaaaaattatccgaaaaaattt RM2734; test R 470 cagaggccaaacattccacc pproRBS 471 ttaaagaggagaaa Sh ble (codon optimized) 472 atggctaaactgacctctgctgttccggttctgaccgctcgtgacgttgctggtgctgttgagttctggaccgaccgtct gggtttctctcgtgacttcgttgaagacgacttcgctggtgttgttcgtgacgacgttaccctgttcatctctgctgttc aggaccaggttgttccggacaacaccctggcttgggtttgggttcgtggtctggacgaactgtacgctgaatggtctgaa gttgtttctaccaacttccgtgacgcttctggtccggctatgaccgaaatcggtgaacagccgtggggtcgtgagttcgc tctgcgtgacccggctggtaactgcgttcacttcgttgctgaagaacaggactaa CYC1 terminator 473 cacgtccgacggcggcccacgggtcccaggcctcggagatccgtcccccttttcctttgtcgatatcatgtaattagtta tgtcacgcttacattcacgccctccccccacatccgctctaaccgaaaaggaaggagttagacaacctgaagtctaggtc cctatttatttttttatagttatgttagtattaagaacgttatttatatttcaaatttttcttttttttctgtacagacg cgtgtacgcatgtaacattatactgaaaaccttgcttgagaaggttttgggacgctcgaaggctttaatttgcaagct Rm3386; F test oligo 474 aggagttagacaacctgaag HAR 475 gtaactagtctctcggaatccat
(131) TABLE-US-00011 TABLE 11 Nourseothricin Cassette for protease deletion in P. pastoris
Description SEQ ID NO: Sequence (5' to 3')
Plasmid sequence 476 cttcagagtacagaagattaagtgagagaattctaccgttcgtatagcatacattatacgaagttatttcagtaatgtct tgtttcttttgttgcagtggtgagccattttgacttcgtgaaagtttctttagaatagttgtttccagaggccaaacatt ccacccgtagtaaagtgcaagcgtaggaagaccaagactggcataaatcaggtataagtgtcgagcactggcaggtgatc ttctgaaagtttctactagcagataagatccagtagtcatgcatatggcaacaatgtaccgtgtggatctaagaacgcgt cctactaaccttcgcattcgttggtccagtttgttgttatcgatcaacgtgacaaggttgtcgattccgcgtaagcatgc atacccaaggacgcctgttgcaattccaagtgagccagttccaacaatctttgtaatattagagcacttcattgtgttgc gcttgaaagtaaaatgcgaacaaattaagagataatctcgaaaccgcgacttcaaacgccaatatgatgtgcggcacaca ataagcgttcatatccgctgggtgactttctcgctttaaaaaattatccgaaaaaatttttgacggctagctcagtccta ggtacgctagcattaaagaggagaaaatgactactcttgatgacacagcctacagatataggacatcagttccgggtgac gcagaggctatcgaagccttggacggttcattcactactgatacggtgtttagagtcaccgctacaggtgatggcttcac cttgagagaggttcctgtagacccacccttaacgaaagttttccctgatgacgaatcggatgacgagtctgatgctggtg aggacggtgaccctgattccagaacatttgtcgcatacggagatgatggtgacctggctggctttgttgtggtgtcctac agcggatggaatcgtagactcacagttgaggacatcgaagttgcacctgaacatcgtggtcacggtgttggtcgtgcact gatgggactggcaacagagtttgctagagaaagaggagccggacatttgtggttagaagtgaccaatgtcaacgctcctg ctattcacgcatataggcgaatgggtttcactttgtgcggtcttgatactgctttgtatgacggaactgcttctgatggt gaacaagctctttacatgagtatgccatgtccatagcacgtccgacggcggcccacgggtcccaggcctcggagatccgt cccccttttcctttgtcgatatcatgtaattagttatgtcacgcttacattcacgccctccccccacatccgctctaacc gaaaaggaaggagttagacaacctgaagtctaggtccctatttatttttttatagttatgttagtattaagaacgttatt tatatttcaaatttttcttttttttctgtacagacgcgtgtacgcatgtaacattatactgaaaaccttgcttgagaagg ttttgggacgctcgaaggctttaatttgcaagctataacttcgtatagcatacattataccttgttatgcggccgcaaga agttgattgagactttcaacgag AOX1 pA terminator 477 cttcagagtacagaagattaagtgaga Lox71F 478 taccgttcgtatagcatacattatacgaagttat pILV5 479 ttcagtaatgtcttgtttcttttgttgcagtggtgagccattttgacttcgtgaaagtttctttagaatagttgtttcca 
gaggccaaacattccacccgtagtaaagtgcaagcgtaggaagaccaagactggcataaatcaggtataagtgtcgagca ctggcaggtgatcttctgaaagtttctactagcagataagatccagtagtcatgcatatggcaacaatgtaccgtgtgga tctaagaacgcgtcctactaaccttcgcattcgttggtccagtttgttgttatcgatcaacgtgacaaggttgtcgattc cgcgtaagcatgcatacccaaggacgcctgttgcaattccaagtgagccagttccaacaatctttgtaatattagagcac ttcattgtgttgcgcttgaaagtaaaatgcgaacaaattaagagataatctcgaaaccgcgacttcaaacgccaatatga tgtgcggcacacaataagcgttcatatccgctgggtgactttctcgctttaaaaaattatccgaaaaaattt pproRBS 480 ttaaagaggagaaa nat 481 atgactactcttgatgacacagcctacagatataggacatcagttccgggtgacgcagaggctatcgaagccttggacgg (Nourseothricin ttcattcactactgatacggtgtttagagtcaccgctacaggtgatggcttcaccttgagagaggttcctgtagacccac resistance) ccttaacgaaagttttccctgatgacgaatcggatgacgagtctgatgctggtgaggacggtgaccctgattccagaaca tttgtcgcatacggagatgatggtgacctggctggctttgttgtggtgtcctacagcggatggaatcgtagactcacagt tgaggacatcgaagttgcacctgaacatcgtggtcacggtgttggtcgtgcactgatgggactggcaacagagtttgcta gagaaagaggagccggacatttgtggttagaagtgaccaatgtcaacgctcctgctattcacgcatataggcgaatgggt ttcactttgtgcggtcttgatactgctttgtatgacggaactgcttctgatggtgaacaagctctttacatgagtatgcc atgtccatag CYC1 482 cacgtccgacggcggcccacgggtcccaggcctcggagatccgtcccccttttcctttgtcgatatcatgtaattagtta terminator tgtcacgcttacattcacgccctccccccacatccgctctaaccgaaaaggaaggagttagacaacctgaagtctaggtc cctatttatttttttatagttatgttagtattaagaacgttatttatatttcaaatttttcttttttttctgtacagacg cgtgtacgcatgtaacattatactgaaaaccttgcttgagaaggttttgggacgctcgaaggctttaatttgcaagct LoxKR3F 483 ataacttcgtatagcatacattataccttgttat HSP82 484 gcggccgcaagaagttgattgagactttcaacgag
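The 5' HAR and 3' HAF homology-arm primers listed before Table 8 each begin with a fixed 5' tail. The Python sketch below, which is illustrative and not part of the disclosure, checks that these tails correspond to the two ends of the nourseothricin cassette in Table 11 (SEQ ID NO: 477 at the 5' end and SEQ ID NO: 484 at the 3' end). Reading the tails as overlaps for joining homology arms to the cassette is an assumption; this section does not itself describe the assembly method. The helper names `revcomp` and `common_prefix` are hypothetical.

```python
def revcomp(seq):
    """Reverse complement of a DNA string (A/C/G/T, either case)."""
    return seq.translate(str.maketrans("ACGTacgt", "TGCAtgca"))[::-1]


def common_prefix(seqs):
    """Longest prefix shared by every sequence in seqs."""
    prefix = seqs[0]
    for s in seqs[1:]:
        while not s.startswith(prefix):
            prefix = prefix[:-1]
    return prefix


# 5' HAR primers (SEQ ID NOs: 103, 107, 111) and 3' HAF primers
# (SEQ ID NOs: 104, 108, 112), copied from the homology-arm primer table.
har_primers = [
    "CTCACTTAATCTTCTGTACTCTGAAGTGAAAGGCGATTGGAGTTGC",
    "CTCACTTAATCTTCTGTACTCTGAAGGTGGTAGTATGTGTGTTGGTGT",
    "CTCACTTAATCTTCTGTACTCTGAAGCGTTGGAGGTAATGCATGGG",
]
haf_primers = [
    "AGAAGTTGATTGAGACTTTCAACGAGCTGGCTCTGCTTCTGGTACT",
    "AGAAGTTGATTGAGACTTTCAACGAGCTGCGCTTTCAAGTACTGCA",
    "AGAAGTTGATTGAGACTTTCAACGAGGGCGGACCGTGTATTAGAGA",
]

# Cassette ends from Table 11.
cassette_5p = "cttcagagtacagaagattaagtgaga"          # SEQ ID NO: 477
cassette_3p = "gcggccgcaagaagttgattgagactttcaacgag"  # SEQ ID NO: 484

har_tail = common_prefix(har_primers)
haf_tail = common_prefix(haf_primers)

# The HAR tail is compared as a reverse complement, the HAF tail directly.
five_prime_ok = cassette_5p.upper().startswith(revcomp(har_tail))
three_prime_ok = cassette_3p.upper().endswith(haf_tail)
print(len(har_tail), len(haf_tail), five_prime_ok, three_prime_ok)
```

Only three primers of each type are used here to keep the example short; the shared tail found this way depends on which primers are supplied, so a fuller check would pass in every HAR or HAF primer from the table.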
(132) TABLE-US-00012 TABLE12 ExemplarynourseothricincassetteswithHAarmsforproteasedeletioninP.pastoris SEQ ID Description NO: 5 to3 Sequence Nourseothricin 485 tactacaggctggctgttcctcgcatggtgtttaatgtcctgactgggttttcgtttatcggtattaccggagccaccttg cassettewith actgtaagggaacgatactggactaagagagtaatgcgaaaggcaacagcgtttctggcgaacctaatcaatgacggttac homologyarms gagtttactactcctaaagccagtcttattttgctagagcgagtcaacgcttacttaaagggccagggacctaattatgac targeting atcgattttgacgagcaggaggcgttcattaaagaaatggaggagttgaggacctctggtggatatgagaacagatactca PAS_chr4_0584 tattcaggaaccgatgaaacacccagagatccgggttgcctgtttcttcccattgctttaaataaatggcactttgatgtg ctagactgcctgaggatatacggtactcaggaagatctggaatctaaattattaagtgttcagcaattggtgttacaatgt tgcatgaagcacagtggcatgactccagacatggtctttgcaacggaagtagctcagaagccgaccttcgaagacgacata gtttgtgatgatattgacgcttatgcccaggggggtgattgtctagattattgttacacgccaagcaattactccagaact ttagaaattcatggcaagattgctaccttacaacgagagctggggctatgctataatattctcggaattttggaccgtttt tccgattaaggtttttagctccattgcgccaacccccgctctccagactccttcgttatccagcattcagcatggacaggt tcaaaaaataaaatttcttgatatgggtccacttcaaacatgcgcctacctgtaggaaaaaaaaagagaacataaatatgc cgcgaacagaaaacgtaatgtactgttctatatataaactgttcagatcaatcataaattctcagtttcaaactttccgct cagccagattttattcgtaaagaacgcatcattggctctatgttgaaggatcagttcttgttatgggttgctttgatagcg agcgtaccggtttccggcgtgatggcagctcctagcgagtccgggcataacacggttgaaaaacgagatgccaaaaacgtt gttggcgttcaacagttggacttcttcagagtacagaagattaagtgagagaattctaccgttcgtatagcatacattata cgaagttatttcagtaatgtcttgtttcttttgttgcagtggtgagccattttgacttcgtgaaagtttctttagaatagt tgtttccagaggccaaacattccacccgtagtaaagtgcaagcgtaggaagaccaagactggcataaatcaggtataagtg tcgagcactggcaggtgatcttctgaaagtttctactagcagataagatccagtagtcatgcatatggcaacaatgtaccg tgtggatctaagaacgcgtcctactaaccttcgcattcgttggtccagtttgttgttatcgatcaacgtgacaaggttgtc gattccgcgtaagcatgcatacccaaggacgcctgttgcaattccaagtgagccagttccaacaatctttgtaatattaga gcacttcattgtgttgcgcttgaaagtaaaatgcgaacaaattaagagataatctcgaaaccgcgacttcaaacgccaata 
tgatgtgcggcacacaataagcgttcatatccgctgggtgactttctcgctttaaaaaattatccgaaaaaatttttgacg gctagctcagtcctaggtacgctagcattaaagaggagaaaatgactactcttgatgacacagcctacagatataggacat cagttccgggtgacgcagaggctatcgaagccttggacggttcattcactactgatacggtgtttagagtcaccgctacag gtgatggcttcaccttgagagaggttcctgtagacccacccttaacgaaagttttccctgatgacgaatcggatgacgagt ctgatgctggtgaggacggtgaccctgattccagaacatttgtcgcatacggagatgatggtgacctggctggctttgttg tggtgtcctacagcggatggaatcgtagactcacagttgaggacatcgaagttgcacctgaacatcgtggtcacggtgttg gtcgtgcactgatgggactggcaacagagtttgctagagaaagaggagccggacatttgtggttagaagtgaccaatgtca acgctcctgctattcacgcatataggcgaatgggtttcactttgtgcggtcttgatactgctttgtatgacggaactgctt ctgatggtgaacaagctctttacatgagtatgccatgtccatagcacgtccgacggcggcccacgggtcccaggcctcgga gatccgtcccccttttcctttgtcgatatcatgtaattagttatgtcacgcttacattcacgccctccccccacatccgct ctaaccgaaaaggaaggagttagacaacctgaagtctaggtccctatttatttttttatagttatgttagtattaagaacg ttatttatatttcaaatttttcttttttttctgtacagacgcgtgtacgcatgtaacattatactgaaaaccttgcttgag aaggttttgggacgctcgaaggctttaatttgcaagctataacttcgtatagcatacattataccttgttatgcggccgca agaagttgattgagactttcaacgagggtccccttcagctacctttctctctgtttggtagttattctcggcgtgtgtata gtatagtataaaagggcctacattggataggcttcaacattcctcaataaacaaacatccaacatcgcgcattccgcattt cgcatttcacatttcgcgcctgccttcctttaggttctttgaatcatcatcaatcgtcgccgtctacatcagagcaggact tatctttgccttccccaaaaattgccactccgtcaaatagattcttttgaatccttgactatttttgcctaaataggtttt tgttagtttttcttcaaagcccaaaagaaactctatttagattcatccagaaacaatctttttctcaccccatttcgaagt gccgtggagcacagacataaaaagatgactaccgttcaacctacagggccagacaggctcaccctgccgcatattctactg gaattcaacgatggctcctcgcagcatgcagtgatcgagctaagcatgaacgaggggattaatatatccacccatgagtgg aatccatccactaatgagcaatcgccacgggaagagagagcaccaccccaacaatccaatccatcgcatcatccagaatca tcgaacatagctactcaaagtcccgctcaggaaaccgagactcagcccggcattccaggactagataggcctgcctttgat acctcggcaacggggtcgtcagaacaggttgacccagtacagggaaggatcctggatgatattataggccaatcattaagg acttccgaagaagacgataccgaatcccgccagagaccacgagaccagaagaacattatgatcaccgtgaattacttgtac 
gcagacgacacaaattccagaagtgctaatacaaacaaccagacgcccaataacacttctagaacttccgacagtgaacgt gtgggctccttatcgttgcacgttccggatctaccagataatgccgacgattactatatcgatgtactcattaaactaacc acaagcattgccctcagcgtcatcacgtccatgatcaagaaacgattagggcttagcaggga PAS_chr4_0584 486 tactacaggctggctgttcctcgcatggtgtttaatgtcctgactgggttttcgtttatcggtattaccggagccaccttg HomologyArm1 actgtaagggaacgatactggactaagagagtaatgcgaaaggcaacagcgtttctggcgaacctaatcaatgacggttac gagtttactactcctaaagccagtcttattttgctagagcgagtcaacgcttacttaaagggccagggacctaattatgac atcgattttgacgagcaggaggcgttcattaaagaaatggaggagttgaggacctctggtggatatgagaacagatactca tattcaggaaccgatgaaacacccagagatccgggttgcctgtttcttcccattgctttaaataaatggcactttgatgtg ctagactgcctgaggatatacggtactcaggaagatctggaatctaaattattaagtgttcagcaattggtgttacaatgt tgcatgaagcacagtggcatgactccagacatggtctttgcaacggaagtagctcagaagccgaccttcgaagacgacata gtttgtgatgatattgacgcttatgcccaggggggtgattgtctagattattgttacacgccaagcaattactccagaact ttagaaattcatggcaagattgctaccttacaacgagagctggggctatgctataatattctcggaattttggaccgtttt tccgattaaggtttttagctccattgcgccaacccccgctctccagactccttcgttatccagcattcagcatggacaggt tcaaaaaataaaatttcttgatatgggtccacttcaaacatgcgcctacctgtaggaaaaaaaaagagaacataaatatgc cgcgaacagaaaacgtaatgtactgttctatatataaactgttcagatcaatcataaattctcagtttcaaactttccgct cagccagattttattcgtaaagaacgcatcattggctctatgttgaaggatcagttcttgttatgggttgctttgatagcg agcgtaccggtttccggcgtgatggcagctcctagcgagtccgggcataacacggttgaaaaacgagatgccaaaaacgtt gttggcgttcaacagttggactt PAS_chr4_0584 487 ggtccccttcagctacctttctctctgtttggtagttattctcggcgtgtgtatagtatagtataaaagggcctacattgg HomologyArm2 ataggcttcaacattcctcaataaacaaacatccaacatcgcgcattccgcatttcgcatttcacatttcgcgcctgcctt cctttaggttctttgaatcatcatcaatcgtcgccgtctacatcagagcaggacttatctttgccttccccaaaaattgcc actccgtcaaatagattcttttgaatccttgactatttttgcctaaataggtttttgttagtttttcttcaaagcccaaaa gaaactctatttagattcatccagaaacaatctttttctcaccccatttcgaagtgccgtggagcacagacataaaaagat gactaccgttcaacctacagggccagacaggctcaccctgccgcatattctactggaattcaacgatggctcctcgcagca 
tgcagtgatcgagctaagcatgaacgaggggattaatatatccacccatgagtggaatccatccactaatgagcaatcgcc acgggaagagagagcaccaccccaacaatccaatccatcgcatcatccagaatcatcgaacatagctactcaaagtcccgc tcaggaaaccgagactcagcccggcattccaggactagataggcctgcctttgatacctcggcaacggggtcgtcagaaca ggttgacccagtacagggaaggatcctggatgatattataggccaatcattaaggacttccgaagaagacgataccgaatc ccgccagagaccacgagaccagaagaacattatgatcaccgtgaattacttgtacgcagacgacacaaattccagaagtgc taatacaaacaaccagacgcccaataacacttctagaacttccgacagtgaacgtgtgggctccttatcgttgcacgttcc ggatctaccagataatgccgacgattactatatcgatgtactcattaaactaaccacaagcattgccctcagcgtcatcac gtccatgatcaagaaacgattagggcttagcaggga Nourseothricin 488 gccttctcgtgcaatcagagctgttgaaagagagaagagggcacacggaagctgctgttcaattgtgtgaattgaccggat cassettewith tacaacctgctggagtgataggagagctggttcgtgacgaggacggctctatgatgcgattagacgactgtgttcagtttg homologyarms gtctccgccacaacgtaaaaattatcaaccttgaccagatcattgaatacatggattccaagaacagctagatacgatgga targeting taggaatacagagatatcatgattgaggaacgtaagagctttttcgaaagtgtgagtttgtggtgagggccaggcggtggg PAS_chr3_1157 gaggtggtggggagcctccttggtcgaatgtagatatagtaagcaagacacaagagcgcgcgaagtcttcaacgaggcggc gttgggtcttgtacgcaacgtaatgactacacagttgagcttgtcgcgaaccggtcgacattttgatcatgcatactatgt tgagacaccatctcgtactattgcggcaaccagctgtaaatttgactaattaaagctgatgaaggatgcagggcgtcgtca attttttgattgattgcatttaattgtttgagccattcaaggctgaatgcccggcaccctagacccttcttgtgagtacta taaacccgcaggcagggtacccttggccttctgcgagactaccagtcataacgtatatccacaatgtactagtaatagccc cggaaaactctaatcccacagaacgtctaacgcctcctatgtcatcgatacccattcgcactactgccatggcccccctta cgtgatcatttcacttactcccgcctaagcttcgcccacatgcctgcgttttgccaagatttactgacgagtttggtttac tcatcctctatttataactactagactttcaccattcttcaccaccctcgtgccaatgatcatcaaccacttggtattgac agccctcagcattgcactagcaagtgcgcaactccaatcgcctttcacttcagagtacagaagattaagtgagagaattct accgttcgtatagcatacattatacgaagttatttcagtaatgtcttgtttcttttgttgcagtggtgagccattttgact tcgtgaaagtttctttagaatagttgtttccagaggccaaacattccacccgtagtaaagtgcaagcgtaggaagaccaag actggcataaatcaggtataagtgtcgagcactggcaggtgatcttctgaaagtttctactagcagataagatccagtagt 
catgcatatggcaacaatgtaccgtgtggatctaagaacgcgtcctactaaccttcgcattcgttggtccagtttgttgtt atcgatcaacgtgacaaggttgtcgattccgcgtaagcatgcatacccaaggacgcctgttgcaattccaagtgagccagt tccaacaatctttgtaatattagagcacttcattgtgttgcgcttgaaagtaaaatgcgaacaaattaagagataatctcg aaaccgcgacttcaaacgccaatatgatgtgcggcacacaataagcgttcatatccgctgggtgactttctcgctttaaaa aattatccgaaaaaatttttgacggctagctcagtcctaggtacgctagcattaaagaggagaaaatgactactcttgatg acacagcctacagatataggacatcagttccgggtgacgcagaggctatcgaagccttggacggttcattcactactgata cggtgtttagagtcaccgctacaggtgatggcttcaccttgagagaggttcctgtagacccacccttaacgaaagttttcc ctgatgacgaatcggatgacgagtctgatgctggtgaggacggtgaccctgattccagaacatttgtcgcatacggagatg atggtgacctggctggctttgttgtggtgtcctacagcggatggaatcgtagactcacagttgaggacatcgaagttgcac ctgaacatcgtggtcacggtgttggtcgtgcactgatgggactggcaacagagtttgctagagaaagaggagccggacatt tgtggttagaagtgaccaatgtcaacgctcctgctattcacgcatataggcgaatgggtttcactttgtgcggtcttgata ctgctttgtatgacggaactgcttctgatggtgaacaagctctttacatgagtatgccatgtccatagcacgtccgacggc ggcccacgggtcccaggcctcggagatccgtcccccttttcctttgtcgatatcatgtaattagttatgtcacgcttacat tcacgccctccccccacatccgctctaaccgaaaaggaaggagttagacaacctgaagtctaggtccctatttattttttt atagttatgttagtattaagaacgttatttatatttcaaatttttcttttttttctgtacagacgcgtgtacgcatgtaac attatactgaaaaccttgcttgagaaggttttgggacgctcgaaggctttaatttgcaagctataacttcgtatagcatac attataccttgttatgcggccgcaagaagttgattgagactttcaacgagctggctctgcttctggtacttcttcaggtgc atcttctgctactcaaaatgacgaaacatccactgatcttggagctccagctgcatctttaagtgcaacgccatgtctttt tgccatcttgctgctcatgttgtagtagactttttttttcactgagtttttatgtactactgattacattgtgtaggtgta atgatgtgcactataatactaatatagtcaaaatgctacagaggaaagtgcaggttgcctgtggtggtttttcttattagc accctctgaacactctttacctctaacatcctcagccatgctaatcgcgcataaaataaatcttcgaacttttttccattt tatgctcataaagcttccttactgtcaccttatcaaaagagcttttgccactaaagtagtcacacccagaattgctcccga atatcgtccaacaatgctaggatctgtggaaagtttgacaaataatttgaacaccttgagcttgaagcttcctgaagttaa tatccaaggctcctttccagaaagtaacccagtggaccttttgagaaactacatcactcaagaacttagtaaaatttctgg 
agttgacaaagaattgattttcccagccttggaatggggtaccacactggaaaaaggtgatcttttgatcccagttcctcg tctgagaataaagggtgctaatcctaaagatttagccgaacaatgggctgctgcattcccaaagggtggatatcttaaaga cgttattgcgcaaggacctttcttgcagttcttttttaacacatcggttctgtacaagttggtgatatctgatgctctgga gagaggcgatgactttggtgcacttcctctaggaaagggacaaaaagttatagtggagttttcttctccaaatattgccaa acctttccacgctggccatcttagaagtacaatcatcggtggttttatttccaatctgtatgaaaagctgggtcatgaagt tatgaggatgaattatttgggagactggggaaaacaatttggtgttcttgcagtaggatttgagcgttacggtgatgaggc aaaattaaagactgatccaatcaaccatttgtttgaggtctatgttaaaatcaaccaagatattaaggctcaatcagagtc tactgaggagattgcagaagggcaatcattagatgaccaggcaagagcttttttcaagaaaatggaaaatggcgacgaatc ggctgtaagcttgtggaaaagattccgtgagttatccattgagaagtacattgatacttatgcccgcctcaacatc PAS_chr3_1157 489 gccttctcgtgcaatcagagctgttgaaagagagaagagggcacacggaagctgctgttcaattgtgtgaattgaccggat HomologyArm1 tacaacctgctggagtgataggagagctggttcgtgacgaggacggctctatgatgcgattagacgactgtgttcagtttg gtctccgccacaacgtaaaaattatcaaccttgaccagatcattgaatacatggattccaagaacagctagatacgatgga taggaatacagagatatcatgattgaggaacgtaagagctttttcgaaagtgtgagtttgtggtgagggccaggcggtggg gaggtggtggggagcctccttggtcgaatgtagatatagtaagcaagacacaagagcgcgcgaagtcttcaacgaggcggc gttgggtcttgtacgcaacgtaatgactacacagttgagcttgtcgcgaaccggtcgacattttgatcatgcatactatgt tgagacaccatctcgtactattgcggcaaccagctgtaaatttgactaattaaagctgatgaaggatgcagggcgtcgtca attttttgattgattgcatttaattgtttgagccattcaaggctgaatgcccggcaccctagacccttcttgtgagtacta taaacccgcaggcagggtacccttggccttctgcgagactaccagtcataacgtatatccacaatgtactagtaatagccc cggaaaactctaatcccacagaacgtctaacgcctcctatgtcatcgatacccattcgcactactgccatggcccccctta cgtgatcatttcacttactcccgcctaagcttcgcccacatgcctgcgttttgccaagatttactgacgagtttggtttac tcatcctctatttataactactagactttcaccattcttcaccaccctcgtgccaatgatcatcaaccacttggtattgac agccctcagcattgcactagcaagtgcgcaactccaatcgcctttca PAS_chr3_1157 490 ctggctctgcttctggtacttcttcaggtgcatcttctgctactcaaaatgacgaaacatccactgatcttggagctccag HomologyArm2 ctgcatctttaagtgcaacgccatgtctttttgccatcttgctgctcatgttgtagtagactttttttttcactgagtttt 
tatgtactactgattacattgtgtaggtgtaatgatgtgcactataatactaatatagtcaaaatgctacagaggaaagtg caggttgcctgtggtggtttttcttattagcaccctctgaacactctttacctctaacatcctcagccatgctaatcgcgc ataaaataaatcttcgaacttttttccattttatgctcataaagcttccttactgtcaccttatcaaaagagcttttgcca ctaaagtagtcacacccagaattgctcccgaatatcgtccaacaatgctaggatctgtggaaagtttgacaaataatttga acaccttgagcttgaagcttcctgaagttaatatccaaggctcctttccagaaagtaacccagtggaccttttgagaaact acatcactcaagaacttagtaaaatttctggagttgacaaagaattgattttcccagccttggaatggggtaccacactgg aaaaaggtgatcttttgatcccagttcctcgtctgagaataaagggtgctaatcctaaagatttagccgaacaatgggctg ctgcattcccaaagggtggatatcttaaagacgttattgcgcaaggacctttcttgcagttcttttttaacacatcggttc tgtacaagttggtgatatctgatgctctggagagaggcgatgactttggtgcacttcctctaggaaagggacaaaaagtta tagtggagttttcttctccaaatattgccaaacctttccacgctggccatcttagaagtacaatcatcggtggttttattt ccaatctgtatgaaaagctgggtcatgaagttatgaggatgaattatttgggagactggggaaaacaatttggtgttcttg cagtaggatttgagcgttacggtgatgaggcaaaattaaagactgatccaatcaaccatttgtttgaggtctatgttaaaa tcaaccaagatattaaggctcaatcagagtctactgaggagattgcagaagggcaatcattagatgaccaggcaagagctt ttttcaagaaaatggaaaatggcgacgaatcggctgtaagcttgtggaaaagattccgtgagttatccattgagaagtaca ttgatacttatgcccgcctcaacatc Nourseothricin 491 gacgagacgctgttcctttcaacttgtccacttggactgacaagtcaacacctgttactaattcttttgtcatctctcagt cassettewith atgaagacacgcgtgttcctcaatcagccaccagttctacacatccaaacatacctaaacacgccaaagagtatccgttag homologyarms caaatgggccacctgggtggtgttggaattcccattccagtatgtcgacagaccaaccaatatatccaggacaccaatatc targeting caccaccgcttcagcagcactaccactttgcttcacccaggcaactatcaaactctagctctgggacgtcatccgttcctt PAS_chr1-4_0289 tccaaccaccccctgctggtcaattacaaccacaaggtaattctatgttcatacacatgccattttcgctaaatggcccac cagctgctggacagcaattgataccaccccaaggactagcctcaatacctgtcggccccggcaacaacagttccctattgg ttagccaaggtgcacctggcggctattctttagcttcaccagcgttgtcaccggtagatgcgaccttcgaagatcccgtca agagactgcccaaaaagcggacaaaaactggatgtctcacttgccgtaagagacgaatcaaatgtgacgaacgcaagccgt tctgtttcaactgtgaaaaaagcaaaaaggtgtgtactggttttacgcatctattcaaagatccccctagcaaatcctacc 
ctcccagttcagatggtgcctcccctgttgccaatgaccaccctgtccccccaaggcaaaactttggtgaattgaggggca gtctgaattacatcatcaactagaagaatgcttattccttttctctactgtataatcacgacgttatgtcctttaatataa gaaacgacaattaaaccactttaggtggacataatccatttctggatgctgttcgatgtgtagtgtctaaaccgatactga gatttctctttctctttctcttttttttttttttcctaccatttccttcaagaaaatacacctttcgacagatcatcataa atggtggcctctcttcacacttcagagtacagaagattaagtgagagaattctaccgttcgtatagcatacattatacgaa gttatttcagtaatgtcttgtttcttttgttgcagtggtgagccattttgacttcgtgaaagtttctttagaatagttgtt tccagaggccaaacattccacccgtagtaaagtgcaagcgtaggaagaccaagactggcataaatcaggtataagtgtcga gcactggcaggtgatcttctgaaagtttctactagcagataagatccagtagtcatgcatatggcaacaatgtaccgtgtg gatctaagaacgcgtcctactaaccttcgcattcgttggtccagtttgttgttatcgatcaacgtgacaaggttgtcgatt ccgcgtaagcatgcatacccaaggacgcctgttgcaattccaagtgagccagttccaacaatctttgtaatattagagcac ttcattgtgttgcgcttgaaagtaaaatgcgaacaaattaagagataatctcgaaaccgcgacttcaaacgccaatatgat gtgcggcacacaataagcgttcatatccgctgggtgactttctcgctttaaaaaattatccgaaaaaatttttgacggcta gctcagtcctaggtacgctagcattaaagaggagaaaatgactactcttgatgacacagcctacagatataggacatcagt tccgggtgacgcagaggctatcgaagccttggacggttcattcactactgatacggtgtttagagtcaccgctacaggtga tggcttcaccttgagagaggttcctgtagacccacccttaacgaaagttttccctgatgacgaatcggatgacgagtctga tgctggtgaggacggtgaccctgattccagaacatttgtcgcatacggagatgatggtgacctggctggctttgttgtggt gtcctacagcggatggaatcgtagactcacagttgaggacatcgaagttgcacctgaacatcgtggtcacggtgttggtcg tgcactgatgggactggcaacagagtttgctagagaaagaggagccggacatttgtggttagaagtgaccaatgtcaacgc tcctgctattcacgcatataggcgaatgggtttcactttgtgcggtcttgatactgctttgtatgacggaactgcttctga tggtgaacaagctctttacatgagtatgccatgtccatagcacgtccgacggcggcccacgggtcccaggcctcggagatc cgtcccccttttcctttgtcgatatcatgtaattagttatgtcacgcttacattcacgccctccccccacatccgctctaa ccgaaaaggaaggagttagacaacctgaagtctaggtccctatttatttttttatagttatgttagtattaagaacgttat ttatatttcaaatttttcttttttttctgtacagacgcgtgtacgcatgtaacattatactgaaaaccttgcttgagaagg ttttgggacgctcgaaggctttaatttgcaagctataacttcgtatagcatacattataccttgttatgcggccgcaagaa 
gttgattgagactttcaacgagtgatcgactacttggcctccgccgtgaaaactcaattagatgttagctccaaattaatg aacctggtacaagatgataaataggaactcaaatacaaagcctaccattaatgactgttttatttttatactaaagtagct aaagggtgattatcaaggagtggttaacgatctattcctagcagggcactcagctcatcgatctttccaatatcggcgtat aacgcttccacttctatcaacgtatcttcgttaaaaagaccacctctggtgggaactaatccttctgctgccgcctctgct aaactctgtcttcgaatccgtttcttactaacatcagcttcgacagataagccactcttctttatctttttcttagatcct gttttgaatctcagggactttactggtgccataacaacttcctgttccagtaccttgttcttcttactcttttttggtatt aaagaatgtcccgccttgagtcctcgatcatccttggccatactcaatcgtctagtagtgctgttgaaatgctgtaaagaa gaggaatatcttcttaaatggttggtatctttttcagcaaccacacctttgtttcggaaagcggataatggcacattgctt ggattgatagaagaagctataaaagcccatcctgcgtttggagcagtttgattgctctgagttactatgttcaactgtgta ttggcaaaagccttagagtcgctgtctgattcgcttatattgagtaaatcatccaggtccaatagaggaacagaaccagtc tgcttcccttttggttttgtacgatccctaattgcacccttcacagaaagttctacccgtttggactttatactgtctttg ttctctgatactgatcgcattgaaaacccatcaataatctcaaagggtttgccacagtccgaggtggtccaaattccaatc actggagggataggatccactttggaagatgccagaacttcttttgcaattttggtaccaatttttttattggatgttttg ggaagagcttcatcttcatcagtggagttgctgctttcgttgtcatctactttttggtcatcttctagttcgtcgtcgtct gaagcaatagcatctgaggaggacgcatctccttcacctttgaaaaagtaattaaataggtaggagtcatcatcagaatct tgttcttggtctgatcccctttcgacggcagcttgaatgttgtt PAS_chr1-4_0289 492 gacgagacgctgttcctttcaacttgtccacttggactgacaagtcaacacctgttactaattcttttgtcatctctcagt HomologyArm1 atgaagacacgcgtgttcctcaatcagccaccagttctacacatccaaacatacctaaacacgccaaagagtatccgttag caaatgggccacctgggtggtgttggaattcccattccagtatgtcgacagaccaaccaatatatccaggacaccaatatc caccaccgcttcagcagcactaccactttgcttcacccaggcaactatcaaactctagctctgggacgtcatccgttcctt tccaaccaccccctgctggtcaattacaaccacaaggtaattctatgttcatacacatgccattttcgctaaatggcccac cagctgctggacagcaattgataccaccccaaggactagcctcaatacctgtcggccccggcaacaacagttccctattgg ttagccaaggtgcacctggcggctattctttagcttcaccagcgttgtcaccggtagatgcgaccttcgaagatcccgtca agagactgcccaaaaagcggacaaaaactggatgtctcacttgccgtaagagacgaatcaaatgtgacgaacgcaagccgt 
tctgtttcaactgtgaaaaaagcaaaaaggtgtgtactggttttacgcatctattcaaagatccccctagcaaatcctacc ctcccagttcagatggtgcctcccctgttgccaatgaccaccctgtccccccaaggcaaaactttggtgaattgaggggca gtctgaattacatcatcaactagaagaatgcttattccttttctctactgtataatcacgacgttatgtcctttaatataa gaaacgacaattaaaccactttaggtggacataatccatttctggatgctgttcgatgtgtagtgtctaaaccgatactga gatttctctttctctttctcttttttttttttttcctaccatttccttcaagaaaatacacctttcgacagatcatcataa atggtggcctctcttcaca PAS_chr1-4_0289 493 tgatcgactacttggcctccgccgtgaaaactcaattagatgttagctccaaattaatgaacctggtacaagatgataaat HomologyArm2 aggaactcaaatacaaagcctaccattaatgactgttttatttttatactaaagtagctaaagggtgattatcaaggagtg gttaacgatctattcctagcagggcactcagctcatcgatctttccaatatcggcgtataacgcttccacttctatcaacg tatcttcgttaaaaagaccacctctggtgggaactaatccttctgctgccgcctctgctaaactctgtcttcgaatccgtt tcttactaacatcagcttcgacagataagccactcttctttatctttttcttagatcctgttttgaatctcagggacttta ctggtgccataacaacttcctgttccagtaccttgttcttcttactcttttttggtattaaagaatgtcccgccttgagtc ctcgatcatccttggccatactcaatcgtctagtagtgctgttgaaatgctgtaaagaagaggaatatcttcttaaatggt tggtatctttttcagcaaccacacctttgtttcggaaagcggataatggcacattgcttggattgatagaagaagctataa aagcccatcctgcgtttggagcagtttgattgctctgagttactatgttcaactgtgtattggcaaaagccttagagtcgc tgtctgattcgcttatattgagtaaatcatccaggtccaatagaggaacagaaccagtctgcttcccttttggttttgtac gatccctaattgcacccttcacagaaagttctacccgtttggactttatactgtctttgttctctgatactgatcgcattg aaaacccatcaataatctcaaagggtttgccacagtccgaggtggtccaaattccaatcactggagggataggatccactt tggaagatgccagaacttcttttgcaattttggtaccaatttttttattggatgttttgggaagagcttcatcttcatcag tggagttgctgctttcgttgtcatctactttttggtcatcttctagttcgtcgtcgtctgaagcaatagcatctgaggagg acgcatctccttcacctttgaaaaagtaattaaataggtaggagtcatcatcagaatcttgttcttggtctgatccccttt cgacggcagcttgaatgttgtt
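In each group of three rows above, the targeting construct (SEQ ID NOs: 485, 488 and 491) reads as Homology Arm 1, followed by the nourseothricin cassette of Table 11 (SEQ ID NO: 476), followed by Homology Arm 2. The following Python sketch illustrates that layout. The function names are hypothetical, and the sequences in the demonstration are short stand-ins: the end of SEQ ID NO: 486, the first and last parts of the cassette, and the start of SEQ ID NO: 487, not the full-length sequences.

```python
"""Assemble and split a homology-arm targeting construct (illustrative sketch)."""


def build_targeting_construct(arm1, cassette, arm2):
    """Join the 5' arm, the marker cassette and the 3' arm, in that order."""
    return arm1 + cassette + arm2


def split_targeting_construct(construct, cassette):
    """Return (arm1, arm2): the sequence flanking the cassette on each side.

    Uses the first occurrence of the cassette and assumes it occurs only
    once. Raises ValueError if the cassette is absent.
    """
    start = construct.find(cassette)
    if start < 0:
        raise ValueError("cassette not found in construct")
    return construct[:start], construct[start + len(cassette):]


if __name__ == "__main__":
    # Stand-in fragments only; full-length sequences are listed in the tables.
    arm1_tail = "gttggcgttcaacagttggactt"        # end of SEQ ID NO: 486
    cassette_ends = (
        "cttcagagtacagaagattaagtgaga"            # AOX1pA terminator (SEQ ID NO: 477)
        + "gcggccgcaagaagttgattgagactttcaacgag"  # HSP82 (SEQ ID NO: 484)
    )
    arm2_head = "ggtccccttcagctacctttctc"        # start of SEQ ID NO: 487

    construct = build_targeting_construct(arm1_tail, cassette_ends, arm2_head)
    print(split_targeting_construct(construct, cassette_ends))
```

With the full-length sequences substituted, splitting SEQ ID NO: 485 around SEQ ID NO: 476 would be expected to return SEQ ID NOs: 486 and 487 if the listing is internally consistent. The same check applies to the PAS_chr3_1157 and PAS_chr1-4_0289 constructs.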