G06F19/22

METHODS OF HUMAN LEUKOCYTE ANTIGEN TYPING
20170342479 · 2017-11-30 ·

Described herein are methods, systems, and media for HLA typing an individual from nucleic acid or protein sequences. The methodology disclosed herein represents significant improvements over current methods of HLA typing.

SYSTEMS AND METHODS FOR IDENTIFYING SEQUENCE VARIATION

Systems and methods for determining variants can receive mapped reads and align flow space information to a flow space representation of a corresponding portion of the reference. Reads spanning a position with a potential variant can be evaluated in a context specific manner. A list of probable variants can be provided.
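As a rough illustration of what a flow space representation can look like, the sketch below converts a base sequence into per-flow homopolymer counts and compares a read against a reference flow by flow. The flow order "TACG", the function names, and the position-by-position comparison (no realignment of flows) are assumptions made for this example and are not taken from the abstract.

```python
def to_flow_space(bases: str, flow_order: str = "TACG", max_flows: int = 400) -> list[int]:
    """Convert a base sequence into homopolymer counts, one per nucleotide flow.

    flow_order and max_flows are illustrative assumptions.
    """
    flows = []
    i = 0
    while i < len(bases) and len(flows) < max_flows:
        # Nucleotide delivered in this flow; the flow order repeats cyclically.
        nuc = flow_order[len(flows) % len(flow_order)]
        run = 0
        # Consume every consecutive base that matches the flowed nucleotide.
        while i < len(bases) and bases[i] == nuc:
            run += 1
            i += 1
        flows.append(run)
    if i < len(bases):
        # Reached max_flows without consuming the sequence (e.g. an 'N' base).
        raise ValueError(f"could not encode base {bases[i]!r} at index {i}")
    return flows


def flow_differences(read_flows: list[int], ref_flows: list[int]) -> list[tuple[int, int, int]]:
    """Return (flow index, read count, reference count) wherever the counts differ.

    Flows are compared position by position over the shorter of the two lists.
    """
    return [
        (idx, r, f)
        for idx, (r, f) in enumerate(zip(read_flows, ref_flows))
        if r != f
    ]
```

For example, a read "TTACG" against a reference "TACG" differs only in the first flow, where the read has a homopolymer of length 2 and the reference a length of 1.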

Methods and systems for nucleic acid sequence analysis
09824180 · 2017-11-21 ·

Disclosed are new and improved methods and systems for nucleic acid sequence analysis that can analyze data indicative of natural by-products of nucleotide incorporation events without the need for exogenous labels or dyes to identify nucleic acid sequences of interest. In particular, the methods and systems of the present teachings can process such data and various forms thereof to align fragments of the nucleic acid(s) of interest, particularly those analyzed using an addition sequencing technique, for example, as occurs with the use of nucleotide flows.

Sequential sequencing

Described herein are improved methods, compositions and kits for next generation sequencing (NGS). The methods, compositions and kits described herein enable phasing of two or more nucleic acid sequences in a sample, i.e. determining whether the nucleic acid sequences (which can comprise regions of sequence variation) are located on the same chromosome and/or the same chromosomal fragment. Phasing information can be obtained by performing multiple, successive sequencing reactions from the same immobilized nucleic acid template. The methods, compositions and kits provided herein can be useful, for example, for haplotyping, SNP phasing, or for determining downstream exons in RNA-seq.

Methods and Compositions for Characterizing Drug Resistant Bacteria From Formalin-Fixed Paraffin-Embedded Biological Samples
20170327873 · 2017-11-16 ·

The invention provides methods and compositions generally useful for polymerase chain reaction (PCR) amplification of trace DNA sequences from formalin-fixed paraffin-embedded (FFPE) biopsy samples, and specifically relevant to identifying multi-drug resistant H. pylori in such biopsy samples.

METHODS OF DETERMINING GENOMIC HEALTH RISK
20170329893 · 2017-11-16 ·

Described herein are genomic health risk metrics that offer significant advantages for the health care industry. The likelihood that any given GSV will be deleterious is relatively small. Since every sequenced human genome may yield several million GSVs, a genomic health risk metric such as a tolerability score, an n-mer score, a context dependent tolerance score, or a protein tolerability score allows clinicians to focus on and prioritize deleterious mutations.

DISPLAY OF ESTIMATED PARENTAL CONTRIBUTION TO ANCESTRY

Estimating parental contribution of ancestry includes: obtaining a set of ancestry assignment data associated with an individual's genotype data, at least some of the ancestry assignment data indicating that one or more segments of the individual's genotype data is deemed to be associated with a specific ancestry; determining whether in the individual's genotype data there is at least one confirmed region of overlapping ancestry assignment associated with the specific ancestry; in the event that it is determined that there is at least one confirmed region of overlapping ancestry assignment associated with the specific ancestry: specifying that parental contribution of the specific ancestry is made by both parents of the individual; in the event that it is determined that there is no confirmed region of overlapping ancestry assignment associated with the specific ancestry: statistically determining whether the parental contribution to the specific ancestry is made by only one parent of the individual or by both parents of the individual, the determination being based at least in part on one or more lengths of the one or more segments deemed to be associated with the specific ancestry; and outputting information pertaining to the parental contribution to the specific ancestry.
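The decision flow in this abstract can be sketched roughly as follows. The Segment type, the modelling of "overlapping ancestry assignment" as overlapping intervals on the two homologous copies of a chromosome, and the return labels are all assumptions made for illustration. The abstract does not specify the statistical test used when no overlap is found, so that step is left to a caller-supplied function rather than guessed at.

```python
from typing import Callable, NamedTuple


class Segment(NamedTuple):
    """A stretch of genotype data assigned to the ancestry of interest (hypothetical type)."""
    chrom: str
    start: int
    end: int    # exclusive
    copy: int   # 0 or 1: which homologous copy carries the assignment (assumption)


def has_overlap(segments: list[Segment]) -> bool:
    """True if both copies of some chromosome carry the ancestry over a shared interval."""
    by_chrom: dict[str, list[Segment]] = {}
    for seg in segments:
        by_chrom.setdefault(seg.chrom, []).append(seg)
    for segs in by_chrom.values():
        first = [s for s in segs if s.copy == 0]
        second = [s for s in segs if s.copy == 1]
        for a in first:
            for b in second:
                # Half-open intervals overlap when each starts before the other ends.
                if a.start < b.end and b.start < a.end:
                    return True
    return False


def parental_contribution(
    segments: list[Segment],
    decide_from_lengths: Callable[[list[int]], str],
) -> str:
    """Return "none", "one", or "both" for a single ancestry.

    decide_from_lengths stands in for the unspecified statistical step;
    it receives the segment lengths and must return "one" or "both".
    """
    if not segments:
        return "none"
    if has_overlap(segments):
        # An overlapping region implies a contribution from each parent.
        return "both"
    return decide_from_lengths([s.end - s.start for s in segments])
```

Passing the statistical step in as a function keeps the sketch honest about what the abstract leaves open: only the overlap shortcut and the reliance on segment lengths come from the text.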

Load balancing and conflict processing in workflow with task dependencies

Embodiments in the disclosure are directed to the use of distributed computing to align reads against multiple portions of a reference dataset. Aligned portions of the reference dataset that correspond with an above-threshold alignment score can be assessed for the presence of sparse indicators that can be categorized and used to influence a determination of a state transition likelihood. Various tasks associated with the processing of reads (e.g., alignment, sparse indicator detection, and/or determination of a state transition likelihood) may be able to take advantage of parallel processing and can be distributed among the machines while considering the resource utilization of those machines. Different load-balancing mechanisms can be employed in order to achieve even resource utilization across the machines, and in some cases may involve assessing various processing characteristics that reflect a predicted resource expenditure and/or time profile for each task to be processed by a machine.
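One simple way to spread tasks across machines using a predicted resource expenditure is a greedy longest-task-first assignment, sketched below. This is a generic scheduling heuristic chosen for illustration, not the mechanism claimed in the disclosure. The task names, cost values, and function name are hypothetical, and task dependencies are ignored.

```python
import heapq


def assign_tasks(predicted_cost: dict[str, float], machines: list[str]) -> dict[str, list[str]]:
    """Greedily give each task to the machine with the lowest accumulated load.

    predicted_cost maps a task name to its estimated cost (hypothetical units).
    Tasks are placed largest first, which tends to even out the final loads.
    """
    # Min-heap of (accumulated load, machine name).
    heap = [(0.0, m) for m in machines]
    heapq.heapify(heap)
    assignment: dict[str, list[str]] = {m: [] for m in machines}
    for task, cost in sorted(predicted_cost.items(), key=lambda kv: kv[1], reverse=True):
        load, machine = heapq.heappop(heap)        # least-loaded machine so far
        assignment[machine].append(task)
        heapq.heappush(heap, (load + cost, machine))
    return assignment
```

With costs of 5, 4, 3, and 2 for four tasks and two machines, this places the tasks with costs 5 and 2 on one machine and those with costs 4 and 3 on the other, for a load of 7 each.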

METHODS AND PROCESSES FOR NON-INVASIVE ASSESSMENT OF GENETIC VARIATIONS
20170316150 · 2017-11-02 ·

Provided herein are methods, processes and apparatuses for non-invasive assessment of genetic variations that make use of nucleic acid fragments from circulating cell free nucleic acid. Also provided herein are methods for partitioning one or more genomic regions of a reference genome into a plurality of portions according to one or more features.

METHODS AND SYSTEMS FOR IDENTIFYING LIGAND-PROTEIN BINDING SITES

The invention provides a novel integrated structure and system-based approach for drug target prediction that enables the large-scale discovery of new targets for existing drugs. Novel computer-readable storage media and computer systems are also provided. Methods and systems of the invention use novel sequence order-independent structure alignment, hierarchical clustering, and probabilistic sequence similarity techniques to construct a probabilistic pocket ensemble (PPE) that captures even promiscuous structural features of different binding sites for a drug on known targets. The drug's PPE is combined with an approximation of the drug delivery profile to facilitate large-scale prediction of novel drug-protein interactions with several applications to biological research and drug development.