G16B10/00

SNP MARKERS OF DRUG REDUCED SUSCEPTIBILITY RELATED EVOLUTIONARY BRANCHES OF CLOSTRIDIUM DIFFICILE, METHOD FOR IDENTIFYING STRAIN CATEGORY, AND USE THEREOF

Provided are SNP markers of evolutionary branches of Clostridium difficile associated with reduced drug susceptibility, a method for identifying the category of a Clostridium difficile strain, and use thereof. The SNP markers are specific to three categories within Clostridium difficile clade 2 (mainly hypervirulent ribotype 027), allowing rapid and accurate identification of the evolutionary branches of Clostridium difficile strains that are resistant to a variety of therapeutic drugs and related drugs. Accurate categorization of these branches not only provides evidence for the evolutionary traceability of drug-resistant pathogens, but also offers effective and actionable guidance on clinical drug usage.
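As a rough illustration of marker-based categorization, the sketch below assigns a strain to a category when all of that category's marker alleles match the strain's SNP calls. The marker positions, alleles, and category names are hypothetical placeholders, not the markers disclosed in the patent.

```python
# Hypothetical marker table: positions and alleles are placeholders only.
MARKERS = {
    "category_1": {1001: "A", 2050: "T"},
    "category_2": {1001: "G", 3120: "C"},
    "category_3": {2050: "C", 3120: "T"},
}


def classify_strain(calls):
    """Return the single category whose marker alleles all match, else None.

    calls maps a genome position to the base observed in the strain.
    """
    hits = [
        name
        for name, snps in MARKERS.items()
        if all(calls.get(pos) == allele for pos, allele in snps.items())
    ]
    # Ambiguous or missing matches are reported as "no category".
    return hits[0] if len(hits) == 1 else None


strain_calls = {1001: "A", 2050: "T", 3120: "G"}
category = classify_strain(strain_calls)
```

Returning None for zero or multiple matches is a design choice of this sketch, so that an uncertain call is never reported as a category.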

METHODS AND SYSTEMS FOR DETERMINING ANCESTRAL RELATEDNESS

The present disclosure provides methods of estimating a degree of ancestral relatedness between individuals. In an aspect, a method comprises (a) receiving haplotype data comprising genetic markers shared among a population of individuals; (b) dividing the haplotype data into segments based on the genetic markers; (c) for each of the population of individuals: (i) based on the genetic markers, matching segments of the haplotype data that are identical-by-descent between two individuals, (ii) for each of the matched segments: dividing the matched segment into discrete genomic intervals, scoring each of the discrete genomic intervals based on a degree of matching within or between the individuals, and correcting the scores for consistency, and (iii) calculating a weighted sum over the discrete genomic intervals of the matched segment, based on the corrected scores and assigned weights; and (d) estimating the degree of ancestral relatedness between the individuals based on the weighted sums of the matched segments.
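The scoring steps can be sketched as follows. This is a minimal illustration under stated assumptions: the abstract does not specify the correction or the final aggregation, so clamping scores to [0, 1] and summing the per-segment weighted sums are placeholders, and all function names are hypothetical.

```python
def correct_scores(scores):
    """Placeholder consistency correction (assumption): clamp to [0, 1]."""
    return [min(1.0, max(0.0, s)) for s in scores]


def segment_score(interval_scores, weights):
    """Weighted sum over the discrete genomic intervals of one matched segment."""
    if len(interval_scores) != len(weights):
        raise ValueError("one weight per interval is required")
    return sum(s * w for s, w in zip(interval_scores, weights))


def relatedness(segments):
    """Combine per-segment weighted sums for one pair of individuals.

    segments is a list of (interval_scores, weights) pairs, one per matched
    segment. Summing the segment scores is an assumption of this sketch.
    """
    return sum(segment_score(correct_scores(s), w) for s, w in segments)


pair_segments = [([1.0, 0.5], [2.0, 2.0]), ([1.5], [4.0])]
estimate = relatedness(pair_segments)
```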

METHOD FOR CONSTRUCTING FUNCTIONAL CLASSIFIERS FOR MICROBIOME ANALYSIS
20230245785 · 2023-08-03

A method for classifying microbial function within any microbiome can be carried out with any coding system. The method, which does not entail measuring the distance between sequences, includes: (1) selecting a reference database that links a coding system to a set of biological sequences; (2) constructing an N×M matrix with each row (N) representing a code from the coding system, each column (M) representing a single biological sequence from the set, and cells representing the presence, absence, or frequency of the single biological sequence for one or more codes; (3) computing the pair-wise distance between the rows of the matrix to form an N×N matrix, wherein N is the number of codes in the matrix; (4) clustering the results to form a data tree; (5) generating a taxonomic tree from the cluster results; and (6) applying a classification tool to the taxonomic tree to classify the microbiome.
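Steps (2) to (4) can be sketched with the standard library alone. The abstract names neither the distance measure nor the clustering algorithm, so Jaccard distance on presence/absence rows and single-linkage agglomerative clustering are assumptions here, as are the code labels and sequence identifiers.

```python
from itertools import combinations


def build_matrix(code_to_seqs, sequences):
    """N x M presence/absence matrix: one row per code, one column per sequence."""
    return {
        code: [1 if s in seqs else 0 for s in sequences]
        for code, seqs in code_to_seqs.items()
    }


def jaccard_distance(a, b):
    """Distance between two rows (codes), not between sequences."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return 1.0 - both / either if either else 0.0


def pairwise_distances(matrix):
    """N x N distances between every pair of rows."""
    return {
        (p, q): jaccard_distance(matrix[p], matrix[q])
        for p in matrix
        for q in matrix
    }


def cluster(matrix):
    """Single-linkage clustering; returns the tree as nested tuples."""
    dist = pairwise_distances(matrix)
    nodes = [(code, [code]) for code in matrix]  # (subtree, member codes)
    while len(nodes) > 1:
        i, j = min(
            combinations(range(len(nodes)), 2),
            key=lambda ij: min(
                dist[a, b] for a in nodes[ij[0]][1] for b in nodes[ij[1]][1]
            ),
        )
        merged = ((nodes[i][0], nodes[j][0]), nodes[i][1] + nodes[j][1])
        nodes = [n for k, n in enumerate(nodes) if k not in (i, j)] + [merged]
    return nodes[0][0]


# Hypothetical codes and sequence identifiers.
codes = {"K1": {"s1", "s2"}, "K2": {"s1", "s2", "s3"}, "K3": {"s4"}}
matrix = build_matrix(codes, ["s1", "s2", "s3", "s4"])
tree = cluster(matrix)
```

Codes that share most of their sequences (K1 and K2 here) merge first, so the tree groups codes by shared sequence membership rather than by sequence similarity.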

Functional analysis of time-series phylogenetic tumor evolution tree

A computer-implemented method includes determining, by a processor, from a time-series evolution tree comprising one or more clones at each of a plurality of time points, that the one or more clones are sensitive clones or resistant clones, wherein the time-series evolution tree is based on sequence data for a tumor from a subject at the plurality of time points, wherein each time point in the time-series evolution tree represents an event in the subject's cancer treatment, and wherein a clone is a collection of gene alterations; based at least in part on determining that the one or more clones are the sensitive or resistant clones, determining, by the processor, a geneset composition of the one or more clones that are the sensitive or resistant clones; and based at least in part on determining the geneset composition, determining, by the processor, a further treatment for the subject.
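The first two determinations can be illustrated with a small sketch. The abstract does not say how sensitivity is decided, so the rule below (a clone whose prevalence does not fall between the first and last time point is labeled resistant) is an assumption, and the clone and gene names are placeholders.

```python
def label_clones(prevalence):
    """Label each clone from its prevalence at successive time points.

    Assumed rule: last value >= first value means "resistant",
    otherwise "sensitive".
    """
    return {
        clone: "resistant" if series[-1] >= series[0] else "sensitive"
        for clone, series in prevalence.items()
    }


def geneset(clones, labels, wanted):
    """Union of gene alterations across clones carrying the wanted label."""
    genes = set()
    for clone, alterations in clones.items():
        if labels[clone] == wanted:
            genes |= set(alterations)
    return genes


# Hypothetical data: prevalence per time point and alterations per clone.
prevalence = {"c1": [0.6, 0.3, 0.05], "c2": [0.1, 0.4, 0.8]}
clones = {"c1": ["GENE_A", "GENE_B"], "c2": ["GENE_B", "GENE_C"]}

labels = label_clones(prevalence)
resistant_genes = geneset(clones, labels, "resistant")
```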

Characterizing heterogeneity with fine-scale population structure

Described are techniques for determining population structure from identity-by-descent (IBD) of individuals. The techniques may be used to predict that an individual belongs to zero, one or more of a number of communities identified within an IBD network. Additional data may be used to annotate the communities with birth location, surname, and ethnicity information. In turn, these data may be used to provide to an individual a prediction of membership to zero, one or more communities, accompanied by a summary of the information annotated to those communities. Ethnicity heterogeneity and age information may be tabulated and provided based on community membership information.
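A minimal sketch of the idea, assuming a simplified stand-in for community detection: IBD edges below a sharing threshold are dropped and the remaining connected components are treated as communities. The threshold, the edge format, and the annotation by most common birth location are illustrative assumptions, not the technique the abstract describes in detail.

```python
from collections import Counter, defaultdict


def communities(ibd_edges, min_shared):
    """Connected components of the IBD network after thresholding.

    ibd_edges holds (person_a, person_b, shared_amount) tuples. People with
    no retained edge end up in zero communities.
    """
    graph = defaultdict(set)
    for a, b, shared in ibd_edges:
        if shared >= min_shared:
            graph[a].add(b)
            graph[b].add(a)
    seen, result = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        stack, members = [start], set()
        while stack:
            node = stack.pop()
            if node in members:
                continue
            members.add(node)
            stack.extend(graph[node] - members)
        seen |= members
        result.append(members)
    return result


def annotate(community, birthplaces):
    """Most common birth location among members with a known birthplace."""
    counts = Counter(birthplaces[p] for p in community if p in birthplaces)
    return counts.most_common(1)[0][0] if counts else None


edges = [("a", "b", 20), ("b", "c", 15), ("c", "d", 3), ("d", "e", 30)]
found = communities(edges, min_shared=10)
label = annotate(found[0], {"a": "X", "b": "X", "c": "Y"})
```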

Method of deconvolution of mixed molecular information in a complex sample to identify organism(s)

The present invention relates to methods to determine the identity of one or more organisms present in a sample (if these are already reported in a taxonomic database) or the identity of the closest related organism reported in a taxonomic database. The present invention does this by comparing a data set acquired by analyzing at least one component of the biological sample to a database, so as to match each component of the analyzed content of the sample to one or more taxa, and then collating the phylogenetic distance between each taxon and the taxon with the highest number of matches in the data set. A deconvolution function is then generated for the taxon with the highest number of matches, based on a correlation curve between the number of matches per taxon (Y axis) and the phylogenetic distance (X axis), the outcome of this function providing the identity of the organism or the closest known organism to it.
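The collation step can be sketched as below: pick the taxon with the most matches, pair every taxon's match count with its phylogenetic distance from that taxon, and summarize the resulting curve. The abstract does not define the deconvolution function itself, so the Pearson correlation used here is only an assumed summary of the curve, and the taxa, counts, and distances are invented for illustration.

```python
def collate(matches, distance):
    """Return the top taxon and (distance, matches) points sorted by distance.

    matches maps taxon -> number of matched components; distance[top][t] is
    the phylogenetic distance from the top taxon to taxon t.
    """
    top = max(matches, key=matches.get)
    points = sorted((distance[top][t], n) for t, n in matches.items())
    return top, points


def pearson(points):
    """Pearson correlation of the points; 0.0 if either axis has no variance."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    cov = sum((x - mx) * (y - my) for x, y in points)
    vx = sum((x - mx) ** 2 for x, _ in points)
    vy = sum((y - my) ** 2 for _, y in points)
    denom = (vx * vy) ** 0.5
    return cov / denom if denom else 0.0


# Hypothetical match counts and distances.
matches = {"A": 100, "B": 60, "C": 20}
distance = {"A": {"A": 0, "B": 1, "C": 2}}

top, points = collate(matches, distance)
r = pearson(points)
```

In this toy data the match count falls steadily with distance from taxon A, which gives a strongly negative correlation.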

Method and Apparatus For Analysing a Sample
20220005545 · 2022-01-06

We describe a method and apparatus for analysing a sample. The method may comprise extracting a plurality of sequence reads from within the sample. Genomic analysis is then performed on the plurality of sequence reads by comparing the plurality of sequence reads to reference genomes stored in a reference database, wherein each stored reference genome comprises a set of reference sequences. Before performing the genomic analysis, the method further comprises comparing screening sequences with at least one of the set of reference sequences and the plurality of sequence reads from the sample. When it is determined that a screening sequence matches at least one sequence within either the set of reference sequences or the plurality of sequence reads, the at least one matching sequence is masked.
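The screening step might look like the following sketch, which masks every exact occurrence of a screening sequence before analysis. Exact substring matching and replacement with "N" characters are assumptions; the abstract does not state how a match is determined or how the masking is represented.

```python
def mask(sequences, screening, mask_char="N"):
    """Mask each exact occurrence of a screening sequence in every input.

    The same function can be applied to sample reads or to reference
    sequences, since the abstract allows screening against either.
    """
    masked = []
    for seq in sequences:
        for s in screening:
            seq = seq.replace(s, mask_char * len(s))
        masked.append(seq)
    return masked


reads = ["ACGTACGT", "TTTT"]
screened_reads = mask(reads, ["GTA"])
```

Masking in place keeps read lengths and coordinates unchanged, so downstream comparison against the reference database can proceed on the remaining bases.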