Patent classifications
G06F19/22
METHODS FOR GENERATING ENGINEERED ENZYMES
Provided are improved methods for identifying the substrate recognition specificity or activity of a protease, convertase (sortase), or kinase. In some embodiments, methods are provided for identifying the endogenous protease or convertase cleaving patterns (e.g., “cleaveOme”) inside the secretory pathway of a living cell. Select embodiments involve aspects of yeast endoplasmic reticulum sequestration screening and next generation sequencing. Methods of producing polypeptides in Kex2 knockout yeast are also provided.
METHODS AND SYSTEMS FOR DETECTION OF ABNORMAL KARYOTYPES
Methods and systems for detecting abnormal karyotypes are disclosed. An example method can comprise determining read coverage data, allele balance distributions of heterozygous SNPs, and chromosomal segments where heterozygosity is not observed. The methods and systems can then determine one or more metrics which can be indicative of abnormal karyotype(s).
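The abstract does not define its metrics, so the following is only a minimal sketch of two plausible ones, under stated assumptions: a per-chromosome read coverage ratio relative to the median chromosome, and the mean deviation of heterozygous-SNP allele balance from the 0.5 expected for a balanced diploid segment. The function names, the input shapes, and the choice of the median as baseline are hypothetical and are not taken from the patent.

```python
from statistics import median

def coverage_ratios(mean_depth_by_chrom):
    """Ratio of each chromosome's mean read depth to the median depth.

    Assumption: the median chromosome is a reasonable diploid baseline.
    """
    baseline = median(mean_depth_by_chrom.values())
    return {chrom: depth / baseline for chrom, depth in mean_depth_by_chrom.items()}

def allele_balance(ref_reads, alt_reads):
    """Fraction of reads at a heterozygous SNP that carry the alternate allele."""
    return alt_reads / (ref_reads + alt_reads)

def balance_deviation(het_snps):
    """Mean |allele balance - 0.5| over (ref_reads, alt_reads) pairs."""
    balances = [allele_balance(ref, alt) for ref, alt in het_snps]
    return sum(abs(b - 0.5) for b in balances) / len(balances)

# Hypothetical input: chr21 carries about 1.5x the typical depth.
ratios = coverage_ratios({"chr1": 30, "chr2": 30, "chr21": 45})
# Hypothetical input: allele balances near 1/3 and 2/3 instead of 1/2.
deviation = balance_deviation([(20, 10), (10, 20)])
```

In this toy input, a coverage ratio near 1.5 together with allele balances clustered around 1/3 and 2/3 is the kind of joint signal that a gained chromosome copy would be expected to produce.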
METHOD AND SYSTEM FOR CHARACTERIZATION OF CLOSTRIDIUM DIFFICILE ASSOCIATED CONDITIONS
An embodiment of a system and method for characterizing a Clostridium-associated condition in relation to a user includes: a handling network operable to receive containers including material from a set of users, the handling network including a sequencing system operable to determine microbiome sequences from sequencing the material; a processing system operable to generate a microbiome composition dataset and a microbiome functional diversity dataset based on the microbiome sequences, receive a supplementary dataset associated with the Clostridium-associated condition for the set of users, and transform the supplementary dataset and features extracted from the microbiome composition dataset and the microbiome functional diversity dataset into a characterization model for the Clostridium-associated condition; and a therapy system operable to promote a therapy to the user based on characterizing the user in relation to the Clostridium-associated condition using the characterization model.
SYSTEMS, METHODS, AND MEDIA FOR DE NOVO ASSEMBLY OF WHOLE GENOME SEQUENCE DATA
Described are computer-implemented methods, systems, and media for de novo phased diploid assembly of nucleic acid sequence data generated from a nucleic acid sample of an individual utilizing nucleic acid tags to preserve long-range sequence context for the individual such that a subset of short-read sequence data derived from a common starting sequence shares a common tag. The phased diploid assembly is achieved without alignment to a reference sequence derived from organisms other than the individual. The methods, systems, and media described are computer-resource efficient, allowing scale-up.
Genomic features associated with epigenetic control regions and transgenerational inheritance of epimutations
CpG densities and sequence motifs that are characteristic of regions of DNA associated with epimutations and control of epimutations are provided. Such regions include, within approximately 400 (or fewer) base pairs, at least one, usually two, and preferably all three of the following features: i) a CpG density of 15% or less; ii) the presence of the sequence motif ATTTGTTTTTTCTTTTnT (SEQ ID NO: 1) where n is A, T, C or G, and statistically relevant variants thereof; and iii) the presence of the sequence motif GGGGGnGGGG (SEQ ID NO: 2), where n is A, T, C or G, and statistically relevant variants thereof.
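The three features listed above can be checked with a simple sliding-window scan. The sketch below makes several assumptions: CpG density is taken to mean CpG dinucleotides per 100 bp, only exact matches to the two motifs are tested (the "statistically relevant variants" are not modelled), and the function names, the step size, and the two-of-three default are illustrative choices rather than anything the abstract specifies.

```python
import re

# Motifs copied from the abstract; "n" stands for any of A, T, C or G.
SEQ_ID_1 = "ATTTGTTTTTTCTTTTnT"
SEQ_ID_2 = "GGGGGnGGGG"
MOTIF_1 = re.compile(SEQ_ID_1.replace("n", "[ACGT]"))
MOTIF_2 = re.compile(SEQ_ID_2.replace("n", "[ACGT]"))

def cpg_density(seq):
    """CpG dinucleotides per 100 bp (assumed definition of 'CpG density')."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return 100.0 * seq.count("CG") / len(seq)

def window_features(seq, max_density=15.0):
    """Report which of the three features are present in one window."""
    seq = seq.upper()
    return {
        "low_cpg": cpg_density(seq) <= max_density,
        "motif_1": MOTIF_1.search(seq) is not None,
        "motif_2": MOTIF_2.search(seq) is not None,
    }

def scan(seq, window=400, step=100, min_features=2):
    """Return (start, features) for windows showing at least min_features."""
    hits = []
    last_start = max(len(seq) - window, 0)
    for start in range(0, last_start + 1, step):
        feats = window_features(seq[start:start + window])
        if sum(feats.values()) >= min_features:
            hits.append((start, feats))
    return hits
```

Calling `scan(sequence, min_features=3)` would restrict the result to windows that show all three features, the case the abstract describes as preferred.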
Non-invasive determination of methylome of fetus or tumor from plasma
Systems, methods, and apparatuses can determine and use methylation profiles of various tissues and samples. Examples are provided. A methylation profile can be deduced for fetal/tumor tissue based on a comparison of plasma methylation (or other sample with cell-free DNA) to a methylation profile of the mother/patient. A methylation profile can be determined for fetal/tumor tissue using tissue-specific alleles to identify DNA from the fetus/tumor when the sample has a mixture of DNA. A methylation profile can be used to determine copy number variations in the genome of a fetus/tumor. Methylation markers for a fetus have been identified via various techniques. The methylation profile can be determined by determining a size parameter of a size distribution of DNA fragments, where reference values for the size parameter can be used to determine methylation levels. Additionally, a methylation level can be used to determine a level of cancer.
Hardware acceleration of short read mapping for genomic and other types of analyses
A scalable FPGA-based solution to the short read mapping problem in DNA sequencing is disclosed which greatly accelerates the task of aligning short length reads to a known reference genome. A representative system comprises one or more memory circuits storing a plurality of short reads and a reference genome sequence; and one or more field programmable gate arrays configured to select a short read; to extract a plurality of seeds from the short read, each seed comprising a genetic subsequence of the short read; for each seed, to determine at least one candidate alignment location (CAL) in the reference genome sequence to form a plurality of CALs; for each CAL, to determine a likelihood of the short read matching the reference genome sequence in the vicinity of the CAL; and to select one or more CALs having the currently greater likelihood of the short read matching the reference genome sequence.
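The sequence of steps in this abstract (extract seeds, look up candidate alignment locations, score each one, keep the best) can be illustrated in software. This is a minimal sketch and not the FPGA design: the k-mer hash index, the function names, and the use of a simple count of matching bases as a stand-in for the likelihood are all assumptions made for illustration.

```python
from collections import defaultdict

def build_index(reference, k):
    """Map every k-mer in the reference to the positions where it occurs."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return index

def candidate_locations(read, index, k):
    """Collect candidate alignment locations (CALs) from every seed of the read."""
    cals = set()
    for offset in range(len(read) - k + 1):
        seed = read[offset:offset + k]
        for pos in index.get(seed, ()):
            start = pos - offset  # where the read would start in the reference
            if start >= 0:
                cals.add(start)
    return cals

def score(read, reference, start):
    """Stand-in for the likelihood: number of matching bases at this CAL."""
    window = reference[start:start + len(read)]
    if len(window) < len(read):
        return -1  # read would run off the end of the reference
    return sum(a == b for a, b in zip(read, window))

def map_read(read, reference, index, k):
    """Return (best_cal, best_score), or None if no seed hits the reference."""
    scored = [(score(read, reference, cal), cal)
              for cal in candidate_locations(read, index, k)]
    scored = [(s, cal) for s, cal in scored if s >= 0]
    if not scored:
        return None
    # Highest score wins; ties go to the leftmost location.
    best_score, best_cal = max(scored, key=lambda sc: (sc[0], -sc[1]))
    return best_cal, best_score
```

A real mapper would score each CAL with a gapped alignment (for example Smith-Waterman) so that insertions and deletions are tolerated; the mismatch count here only handles substitutions.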
FINDING RELATIVES IN A DATABASE
Determining relative relationships of people who share a common ancestor within at least a threshold number of generations includes: receiving recombinable deoxyribonucleic acid (DNA) sequence information of a first user and recombinable DNA sequence information of a plurality of users; processing, using one or more computer processors, the recombinable DNA sequence information of the plurality of users in parallel; determining, based at least in part on a result of processing the recombinable DNA sequence information of the plurality of users in parallel, a predicted degree of relationship between the first user and a second user among the plurality of users, the predicted degree of relationship corresponding to a number of generations within which the first user and the second user share a common ancestor.
SYSTEMS AND METHODS FOR IDENTIFYING AND FLAGGING SAMPLES OF CONCERN
The present disclosure describes systems and methods for determining and flagging sequences that deviate from one or more reference sequences. Phylogenetic methods are used for determining the evolutionary history and evolutionary distances of sample isolates. The evolutionary distances of sample isolates may be compared to each other and/or reference isolates. Based on a comparison of the evolutionary distances, a determination of deviance is made for a sample sequence. The sample sequence is flagged for further analysis to determine the cause of deviation.
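The compare-and-flag step can be sketched as follows. The abstract refers to phylogenetic methods without naming a distance measure, so this example swaps in the simplest one, the p-distance (the proportion of differing sites between two aligned sequences), and flags any sample whose nearest reference is farther away than a threshold. The function names, input format, and threshold are hypothetical.

```python
def p_distance(a, b):
    """Proportion of sites that differ between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        return 0.0
    return sum(x != y for x, y in zip(a, b)) / len(a)

def flag_samples(samples, references, threshold):
    """Return (name, distance) for samples farther than threshold from every reference."""
    flagged = []
    for name, seq in samples.items():
        nearest = min(p_distance(seq, ref) for ref in references.values())
        if nearest > threshold:
            flagged.append((name, nearest))
    return flagged

# Hypothetical aligned isolates, 10 sites each.
references = {"ref": "ACGTACGTAC"}
samples = {
    "s1": "ACGTACGTAC",  # identical to the reference
    "s2": "ACGTTCGTAC",  # one differing site
    "s3": "TTTTACGTAA",  # four differing sites
}
flagged = flag_samples(samples, references, threshold=0.2)
```

With these inputs only `s3` exceeds the threshold. A model-based evolutionary distance or a tree-derived distance could replace `p_distance` without changing the flagging logic.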
SYSTEM AND METHOD FOR PROCESS CONTROL OF GENE SEQUENCING
Systems, methods and computer-readable media are provided for determining the amount of sequencing required to achieve a target sequencing quality of a genetic sample to be sequenced. The method comprises receiving a genetic sample and sequencing a portion of the genetic sample. A sequencing quality metric belonging to a category of sequencing quality metrics is generated from the sequencing. The amount of sequencing of the genetic sample required to achieve the target sequencing quality is determined by inputting the sequencing quality metric into a trained model. A system is also disclosed for genetic sequencing. Corresponding methods and computer-readable media are also provided.