Patent classifications
G16B30/10
ALIGNMENT FREE FILTERING FOR IDENTIFYING FUSIONS
Cell free nucleic acids from a test sample obtained from an individual are analyzed to identify possible fusion events. Cell free nucleic acids are sequenced and processed to generate fragments. Fragments are decomposed into kmers and the kmers are either analyzed de novo or compared to targeted nucleic acid sequences that are known to be associated with fusion gene pairs of interest. Thus, kmers that may have originated from a fusion event can be identified. These kmers are consolidated to generate gene ranges from various genes that match sequences in the fragment. A candidate fusion event can be called given the spanning of one or more gene ranges across the fragment.
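The alignment-free flow in this abstract (decompose a fragment into kmers, look them up against targeted gene sequences, consolidate hits into gene ranges, then call a candidate fusion) can be sketched roughly as below. This is a minimal illustration, not the patented method: the function names, the exact-match kmer index, the `min_span` cutoff, and the rule that two distinct genes must each cover at least `min_span` bases are all assumptions introduced here.

```python
from collections import defaultdict


def kmers(seq, k):
    """Yield (offset, kmer) for every k-length window in seq."""
    for i in range(len(seq) - k + 1):
        yield i, seq[i:i + k]


def build_index(targets, k):
    """Map each kmer to the set of gene names whose target sequence contains it."""
    index = defaultdict(set)
    for gene, seq in targets.items():
        for _, km in kmers(seq, k):
            index[km].add(gene)
    return index


def gene_ranges(fragment, index, k):
    """Consolidate runs of consecutive matching kmers into (gene, start, end) ranges."""
    ranges = []
    open_ranges = {}  # gene -> [start, end) of the run currently being extended
    for i, km in kmers(fragment, k):
        hits = index.get(km, set())
        # Close any run whose gene does not match the current kmer.
        for gene in list(open_ranges):
            if gene not in hits:
                start, end = open_ranges.pop(gene)
                ranges.append((gene, start, end))
        # Extend or open a run for every gene that matches.
        for gene in hits:
            if gene in open_ranges:
                open_ranges[gene][1] = i + k
            else:
                open_ranges[gene] = [i, i + k]
    for gene, (start, end) in open_ranges.items():
        ranges.append((gene, start, end))
    return sorted(ranges, key=lambda r: r[1])


def call_candidate_fusion(fragment, index, k, min_span):
    """Return the first two distinct genes with a range of at least min_span bases, else None."""
    genes = []
    for gene, start, end in gene_ranges(fragment, index, k):
        if end - start >= min_span and gene not in genes:
            genes.append(gene)
    return tuple(genes[:2]) if len(genes) >= 2 else None


# Hypothetical toy targets; real targets would be sequences of known fusion partners.
targets = {"GENE_A": "ACGTACGGTCA", "GENE_B": "TTGACCATGGC"}
index = build_index(targets, k=4)
print(gene_ranges("ACGTACGGACCATGGC", index, k=4))
print(call_candidate_fusion("ACGTACGGACCATGGC", index, k=4, min_span=6))
```

A fragment drawn entirely from one target yields a single gene range, so no candidate is called for it under this rule.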
METHODS AND REAGENTS FOR CHARACTERIZING GENOMIC EDITING, CLONAL EXPANSION, AND ASSOCIATED APPLICATIONS
Methods for characterizing genome editing and clonal expansion, and associated reagents for use in such methods, are disclosed herein. Some embodiments of the technology are directed to characterizing a population of cells following an engineered genomic editing event, which in some embodiments includes characterizing genomic alterations occurring at both intended and unintended genomic loci within the genome of the population of cells. Other embodiments are directed to utilizing Duplex Sequencing for assessing clonal selection in mixed cell populations and/or cell populations following a genomic editing event. Further examples of the present technology are directed to methods for detecting and assessing clonal expansion of cells following a genomic editing event.
GENERATING PROTEIN SEQUENCES USING MACHINE LEARNING TECHNIQUES BASED ON TEMPLATE PROTEIN SEQUENCES
Systems and techniques are described to generate amino acid sequences of target proteins based on amino acid sequences of template proteins using machine learning techniques. The amino acid sequences of the target proteins can be generated based on data that constrains the modifications that can be made to the amino acid sequences of the template proteins. In illustrative examples, the template proteins can include antibodies produced by a non-human mammal that bind to an antigen and the target proteins can correspond to human antibodies with a region having at least a threshold amount of identity with the binding region of the template antibody. Generative adversarial networks can be used to produce the amino acid sequences of the target proteins.
Non-invasive prenatal diagnosis of fetal genetic condition using cellular DNA and cell free DNA
Disclosed are methods for determining at least one sequence of interest of a fetus of a pregnant mother. In various embodiments, the method can determine one or more sequences of interest in a test sample that comprises a mixture of fetal cellular DNA and mother-and-fetus cfDNA. In some embodiments, methods are provided for determining whether the fetus has a genetic disease. In some embodiments, methods are provided for determining whether the fetus is homozygous for a disease-causing allele when the mother is heterozygous for the same allele. In some embodiments, methods are provided for determining whether the fetus has a copy number variation (CNV) or a non-CNV genetic sequence anomaly.
Systems and methods for paired end sequencing
Systems and methods for analyzing overlapping sequence information can obtain first and second overlapping sequence information for a polynucleotide, align the first and second sequence information, determine a degree of agreement between the first and second sequence information for a location along the polynucleotide, and determine a base call and a quality value for the location.
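One way to picture the per-location agreement step is the sketch below. It assumes the two reads have already been aligned over their overlap (same length, same orientation) and carry Phred-style quality values. The scoring rule is a common heuristic chosen here for illustration, not something the abstract specifies: agreeing calls have their qualities summed up to a cap, and disagreeing calls keep the higher-quality base with the difference as the new quality. The function name and the `max_qual` cap are hypothetical.

```python
def merge_overlap(bases1, quals1, bases2, quals2, max_qual=60):
    """Combine two pre-aligned overlapping reads into (base, quality) calls per location."""
    if not (len(bases1) == len(quals1) == len(bases2) == len(quals2)):
        raise ValueError("reads and quality lists must all have the same length")
    merged = []
    for b1, q1, b2, q2 in zip(bases1, quals1, bases2, quals2):
        if b1 == b2:
            # Agreement: confidence increases, capped at max_qual.
            merged.append((b1, min(q1 + q2, max_qual)))
        elif q1 > q2:
            # Disagreement: keep the better-supported base with reduced confidence.
            merged.append((b1, q1 - q2))
        elif q2 > q1:
            merged.append((b2, q2 - q1))
        else:
            # Equal-quality disagreement: no basis to choose, so emit an ambiguous call.
            merged.append(("N", 0))
    return merged


print(merge_overlap("ACGT", [30, 30, 10, 20], "ACTA", [30, 35, 25, 20]))
```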
Processing of sequencing data streams
This disclosure relates to methods and systems for processing of sequencing data streams. The system receives sequences from a sequencer and stores them as data records in a database. Each sequence is associated with a counter indicating the number of times that sequence has been sequenced. The system progressively receives a further sequence as streaming data from the sequencer. While receiving the further sequence, the system matches the streaming data against the stored sequences to determine a matching score. When the matching score exceeds a matching threshold for one of the sequences in the database, the system selects that sequence based on the matching score and stores the further sequence in non-volatile memory if the counter value associated with the selected sequence is below a saturation threshold. If the counter value is above the saturation threshold, the system instead terminates the receiving.
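The control flow described here can be sketched as follows. Several details are assumptions made for illustration only: the matching score is a simple count of agreeing positions between the received prefix and each stored sequence, a counter equal to the saturation threshold is treated as saturated, the counter is incremented when a sequence is stored, and a plain list (`saved`) stands in for non-volatile memory. The abstract does not say what happens to a stream that never matches; the sketch simply reports it as unmatched.

```python
def prefix_score(prefix, stored):
    """Count positions where the received prefix agrees with a stored sequence."""
    return sum(a == b for a, b in zip(prefix, stored))


def process_stream(chunks, records, match_threshold, saturation_threshold, saved):
    """Consume streamed chunks, matching against records (dict: sequence -> counter).

    Returns (status, received) where status is "stored", "terminated", or "unmatched".
    """
    received = ""
    selected = None
    for chunk in chunks:
        received += chunk
        if selected is None:
            best = max(records, key=lambda s: prefix_score(received, s), default=None)
            if best is not None and prefix_score(received, best) > match_threshold:
                selected = best
                if records[selected] >= saturation_threshold:
                    # Saturated: stop reading the stream early.
                    return "terminated", received
    if selected is None:
        return "unmatched", received
    saved.append(received)   # stand-in for writing to non-volatile memory
    records[selected] += 1   # assumption: storing counts as one more observation
    return "stored", received


records = {"ACGTACGT": 1, "TTTTGGGG": 5}
saved = []
print(process_stream(["ACGT", "ACGT"], records, 3, 3, saved))
print(process_stream(["TTTT", "GGGG"], records, 3, 3, saved))
```

In the second call the stored counter is already at or above the saturation threshold, so the function returns after the first chunk and the remaining chunk is never read.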