Patent classifications
G16B20/30
Genome-Wide Detection of DNA Repeats Expanded in Disease
A method of detecting tandem repeat expansions associated with a disease is provided. The method includes the steps of: detecting tandem repeat sequences comprising a repeated motif sequence in nucleic acid samples from individuals within a population of interest, simulating the length distribution of the tandem repeat sequences in the population of interest to a normal distribution if no tandem repeat sequences are detected, and detecting one or more outlier tandem repeat sequences among the tandem repeat sequences detected, wherein an outlier tandem repeat sequence has a length that is greater than that of 90% of the tandem repeat sequences detected in the population of interest and occurs at a frequency of less than 1% of the tandem repeat sequences detected in a control population. The method is useful for the diagnosis of disease and subsequent treatment of a diagnosed individual.
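The outlier criterion in this abstract (longer than 90% of repeats in the population of interest, seen in under 1% of control repeats) can be illustrated with a minimal sketch. The function name, the parameter names, and the reading of "frequency in a control population" as the fraction of control repeats at least as long as the candidate are assumptions made for illustration only; the abstract does not specify how the thresholds are computed.

```python
def find_outlier_lengths(case_lengths, control_lengths,
                         length_quantile=0.90, max_control_freq=0.01):
    """Return repeat lengths that look like expansion outliers.

    Hypothetical helper, not the patented procedure. A length is reported
    when (a) at least `length_quantile` of the case-population repeats are
    strictly shorter, and (b) fewer than `max_control_freq` of the control
    repeats are at least that long (an assumed reading of "frequency").
    """
    if not case_lengths or not control_lengths:
        return []
    n_case = len(case_lengths)
    n_control = len(control_lengths)
    outliers = []
    for length in sorted(set(case_lengths)):
        # Fraction of case-population repeats shorter than this candidate.
        frac_shorter = sum(1 for x in case_lengths if x < length) / n_case
        # Fraction of control repeats that reach this length.
        control_freq = sum(1 for x in control_lengths if x >= length) / n_control
        if frac_shorter >= length_quantile and control_freq < max_control_freq:
            outliers.append(length)
    return outliers


if __name__ == "__main__":
    # Made-up repeat counts: 95 typical alleles and 5 expanded ones.
    cases = [10] * 95 + [50] * 5
    controls = [10] * 200
    print(find_outlier_lengths(cases, controls))
```

Both thresholds are exposed as parameters so the 90% and 1% cut-offs quoted in the abstract can be varied without changing the logic.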
METHODS AND SYSTEMS FOR DETERMINING POLYPEPTIDE INTERACTIONS
Methods and systems for identifying and/or quantifying polypeptide binding interactions of ligand-binding polypeptides are disclosed. Detailed methods include methods for identifying binding ligands of ligand-binding polypeptides and methods for assessing changes in binding behavior due to alterations of ligand-binding polypeptides. Detailed systems include array-based systems that permit detection of ligand binding interactions at single-analyte resolution.
Superior bioinformatics process for identifying at risk subject populations
A bioinformatics method for determining a risk score that indicates a risk that a subject, in particular a human, will experience a negative clinical event within a certain period of time. The risk score is based on a unique combination of activities of two or more cellular signaling pathways in a subject, wherein the selected cellular signaling pathways are the TGF-β pathway and one or more of a PI3K pathway, a Wnt pathway, an ER pathway, and an HH pathway. The invention includes an apparatus with a digital processor configured to perform such a method, a non-transitory storage medium storing instructions that are executable by a digital processing device to perform such a method, and a computer program comprising program code means for causing a digital processing device to perform such a method. The bioinformatics invention allows for more accurate prognosis of specific negative clinical events in a patient with, for example, a tumor or cancer, such as disease progression, recurrence, development of metastasis, or even death.
METHOD AND APPARATUS FOR CLASSIFICATION MODEL TRAINING AND CLASSIFICATION, COMPUTER DEVICE, AND STORAGE MEDIUM
This disclosure relates to a method and an apparatus for classification model training. The method includes: obtaining a support set and a query set, the support set comprising support sample feature vectors and corresponding drug resistance category labels, and the query set comprising query sample feature vectors and corresponding drug resistance category labels; inputting the support set and the query set into an initial drug resistance classification model; performing drug resistance-related feature screening to obtain target support feature vectors and target query feature vectors; calculating an initial category representation vector corresponding to a drug resistance category; determining training drug resistance category information corresponding to the query sample feature vectors; updating the initial drug resistance classification model based on the training drug resistance category information and the corresponding drug resistance category labels; and obtaining a target drug resistance classification model in response to training being completed.
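The support-set and query-set steps above resemble few-shot classification with per-category representation vectors. The sketch below assumes the category representation is the mean of the support feature vectors and that a query is assigned to the nearest representation by Euclidean distance. Neither choice is stated in the abstract, and the function names and labels are hypothetical. Feature screening and model updating are omitted.

```python
import math
from collections import defaultdict


def category_representations(support_vectors, support_labels):
    """Average the support feature vectors of each category (assumed rule)."""
    grouped = defaultdict(list)
    for vector, label in zip(support_vectors, support_labels):
        grouped[label].append(vector)
    return {
        label: [sum(column) / len(vectors) for column in zip(*vectors)]
        for label, vectors in grouped.items()
    }


def classify_query(query_vector, representations):
    """Assign the query to the category with the closest representation."""
    return min(
        representations,
        key=lambda label: math.dist(query_vector, representations[label]),
    )


if __name__ == "__main__":
    # Made-up two-dimensional feature vectors with illustrative labels.
    support = [[0.0, 0.0], [0.0, 2.0], [10.0, 10.0], [12.0, 10.0]]
    labels = ["sensitive", "sensitive", "resistant", "resistant"]
    reps = category_representations(support, labels)
    print(classify_query([1.0, 1.0], reps))
```

In a training loop, the predicted categories for the query set would be compared with their labels to update the model; that step depends on details the abstract does not give.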
BIOMARKER FOR PREDICTING AGE IN DAYS OF PIGS, AND PREDICTION METHOD
Provided are biomarkers and a prediction method for predicting age in days of pigs. The biomarkers include one or more CpG sites with different methylation levels, and the different methylation levels of the CpG sites correspond to different ages in days of pigs. An Elastic Net linear regression model is constructed by using the methylation levels of the CpG sites and the weight corresponding to each CpG site, thereby predicting the age in days of pigs to be tested. The prediction method is accurate and reliable in detecting age in days of pigs, fills the gap in DNA methylation-based age prediction models for pigs, and provides an ideal model for investigating important scientific issues such as development and aging of humans and animals.
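Once an Elastic Net model has been fitted, prediction reduces to a weighted sum of methylation levels plus an intercept. The sketch below shows only that prediction step. The CpG site names, weights, and intercept are made-up placeholders, not values from the patent, and the fitting procedure itself is not shown.

```python
def predict_age_days(methylation, weights, intercept):
    """Predict age in days as intercept + sum(weight * methylation level).

    `methylation` maps CpG site name -> methylation level (0 to 1).
    `weights` maps CpG site name -> fitted coefficient (hypothetical here).
    """
    missing = sorted(set(weights) - set(methylation))
    if missing:
        raise KeyError(f"missing methylation values for sites: {missing}")
    return intercept + sum(weights[site] * methylation[site] for site in weights)


if __name__ == "__main__":
    # Placeholder coefficients for two illustrative CpG sites.
    example_weights = {"cpg_a": 100.0, "cpg_b": -50.0}
    example_sample = {"cpg_a": 0.5, "cpg_b": 0.2}
    print(predict_age_days(example_sample, example_weights, intercept=20.0))
```

Elastic Net regularization tends to drive many coefficients to exactly zero, so a fitted model of this kind typically needs only the sites with non-zero weights at prediction time.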
GENERATIVE TNA SEQUENCE DESIGN WITH EXPERIMENT-IN-THE-LOOP TRAINING
A latent space is defined to represent sequences using training data and a machine-learning model. The training data identifies sequences of molecules and binding-approximation metrics that characterize whether the molecules bind to a particular target and/or that approximate an extent to which a molecule is more likely to bind to the particular target than some other molecules. Supplemental training data is accessed that identifies other sequences of other molecules and binding affinity scores quantifying binding strengths between the other molecules and the particular target. Representations of the other sequences in the supplemental training data are projected into the latent space using the binding affinity scores. An area or position of interest within the latent space is identified based on the projections. A particular sequence represented within the area of interest or at the position of interest is identified for downstream processing.