Patent classifications
G16B35/00
EXPERIMENT AND MACHINE-LEARNING TECHNIQUES TO IDENTIFY AND GENERATE HIGH AFFINITY BINDERS
The present disclosure relates to in vitro experiments and in silico computation and machine-learning based techniques to iteratively improve a process for identifying binders that can bind any given molecular target. Particularly, aspects of the present disclosure are directed to: obtaining sequence data for aptamers that bind to a target, where the sequence data has a first signal-to-noise ratio; generating, by a search process, a first set of aptamer sequences derived from the sequence data; obtaining subsequent sequence data for subsequent aptamers that bind to the target, where the subsequent aptamers include aptamers synthesized from the first set of aptamer sequences and the subsequent sequence data has a second signal-to-noise ratio greater than the first signal-to-noise ratio; generating, by a linear machine-learning model, a second set of aptamer sequences derived from the subsequent sequence data; and outputting the second set of aptamer sequences.
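To make the "linear machine-learning model" step more concrete, the following is a minimal sketch and not the method claimed in the abstract. It assumes a DNA alphabet, equal-length sequences, and a binding score per sequence (for example, an enrichment value), none of which the abstract specifies. The model is additive over (position, base) features, with weights estimated by simple per-feature averaging as a stand-in for a fitted linear regression. The function names `fit_additive_model`, `predict`, and `propose_single_mutants` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

BASES = "ACGT"  # assumption: DNA aptamers
Weights = Dict[Tuple[int, str], float]


def fit_additive_model(seqs: List[str], scores: List[float]) -> Tuple[float, Weights]:
    """Estimate one weight per (position, base) as the mean deviation of the
    scores of sequences carrying that base from the global mean score."""
    mean = sum(scores) / len(scores)
    totals: Dict[Tuple[int, str], float] = defaultdict(float)
    counts: Dict[Tuple[int, str], int] = defaultdict(int)
    for seq, score in zip(seqs, scores):
        for pos, base in enumerate(seq):
            totals[(pos, base)] += score - mean
            counts[(pos, base)] += 1
    weights = {key: totals[key] / counts[key] for key in totals}
    return mean, weights


def predict(seq: str, mean: float, weights: Weights) -> float:
    """Linear prediction: global mean plus the weights of the observed bases.
    Unseen (position, base) pairs contribute zero."""
    return mean + sum(weights.get((pos, base), 0.0) for pos, base in enumerate(seq))


def propose_single_mutants(seed: str, mean: float, weights: Weights, top_k: int = 3) -> List[str]:
    """Enumerate all single-base variants of `seed` and return the top_k by
    predicted score (ties broken alphabetically)."""
    candidates = {
        seed[:pos] + base + seed[pos + 1:]
        for pos in range(len(seed))
        for base in BASES
        if base != seed[pos]
    }
    return sorted(candidates, key=lambda s: (-predict(s, mean, weights), s))[:top_k]


# Made-up illustrative data, not experimental results.
train_seqs = ["AAAA", "AAAC", "CAAA", "CAAC"]
train_scores = [1.0, 3.0, 2.0, 4.0]
mean_score, model_weights = fit_additive_model(train_seqs, train_scores)
proposed = propose_single_mutants("AAAA", mean_score, model_weights, top_k=2)
```

In an iterative workflow of the kind the abstract describes, the proposed sequences would be synthesized and measured, and the new measurements would be used to refit the model in the next round.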
ACCELERATED METHOD FOR GENERATING TARGET ELITE INBREDS WITH SPECIFIC AND DESIGNED TRAIT MODIFICATION
The present disclosure provides a method of generating a new trait-converted elite cultivar through breeding. For instance, the method involves using parent plants that are the respective traited variants of the parents of the non-traited elite cultivar, and estimating a minimum population size necessary to generate a progeny plant that comprises the desired trait and shares a sufficiently high identity by descent with the non-traited elite cultivar to ensure replication and equivalency of general performance. The present method may be used to generate an elite cultivar in fewer generations, thereby accelerating new line production and reducing costs. The present method may also be used to generate non-traited variants of traited lines.
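The abstract does not say how the minimum population size is estimated. As one hedged illustration, assume each progeny plant independently has probability `p_success` of both carrying the trait and meeting the identity-by-descent threshold. The smallest population that yields at least one such plant with a chosen confidence then follows from the requirement that (1 - p_success)^n be no greater than 1 - confidence. The function name and both parameters are hypothetical.

```python
import math


def min_population_size(p_success: float, confidence: float = 0.95) -> int:
    """Smallest n such that P(at least one qualifying progeny) >= confidence,
    assuming independent progeny with per-plant success probability p_success."""
    if not 0.0 < p_success < 1.0:
        raise ValueError("p_success must be strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    # (1 - p)^n <= 1 - c  <=>  n >= ln(1 - c) / ln(1 - p)
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_success))


# Illustrative probabilities only, not values from the disclosure.
n_common = min_population_size(0.5, 0.95)
n_rare = min_population_size(0.01, 0.95)
```

Under this simplified model, a per-plant success probability of 1% requires roughly 300 progeny for 95% confidence, which shows why the estimate matters for cost.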
In silico process for selecting protein formulation excipients
The invention relates to an in silico screening method to identify candidate excipients for reducing aggregation of a protein in a formulation. The method combines computational molecular modeling and molecular dynamics simulations to identify sites on a protein where non-specific self-interaction and interaction of different test excipients may occur, determine the relative binding energies of such interactions, and select one or more test excipients that meet specified interaction criteria for use as candidate excipients in empirical screening studies.
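The final selection step can be sketched as a filter over per-site binding energies. This is an illustration under assumptions: the abstract does not state its "specified interaction criteria", so the rule here (an excipient qualifies at a site when its binding energy is more favorable than the protein's self-interaction energy by at least `margin`) is hypothetical, as are the class and function names and all numbers.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class SiteEnergy:
    """Relative binding energies at one surface site (more negative = stronger)."""
    site: str
    self_interaction: float        # protein-protein energy at this site
    excipient: Dict[str, float]    # excipient name -> binding energy at this site


def select_candidates(sites: List[SiteEnergy], margin: float = 0.5, min_sites: int = 1) -> List[str]:
    """Return excipients that out-compete self-interaction by `margin`
    at no fewer than `min_sites` sites, sorted by name."""
    qualifying: Dict[str, int] = {}
    for entry in sites:
        for name, energy in entry.excipient.items():
            if energy <= entry.self_interaction - margin:
                qualifying[name] = qualifying.get(name, 0) + 1
    return sorted(name for name, count in qualifying.items() if count >= min_sites)


# Made-up energies for two hypothetical sites and two hypothetical excipients.
example_sites = [
    SiteEnergy("site_1", -5.0, {"excipient_A": -6.0, "excipient_B": -4.0}),
    SiteEnergy("site_2", -3.0, {"excipient_A": -3.2, "excipient_B": -4.0}),
]
one_site = select_candidates(example_sites, margin=0.5, min_sites=1)
both_sites = select_candidates(example_sites, margin=0.5, min_sites=2)
```

In practice the energies would come from the molecular dynamics simulations the abstract mentions, and the selected excipients would go on to empirical screening.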
EMBEDDING-BASED GENERATIVE MODEL FOR PROTEIN DESIGN
A system and method for designing protein sequences conditioned on a specific target fold. The system is a transformer-based generative framework for modeling a complex sequence-structure relationship. To mitigate the heterogeneity between the sequence domain and the fold domain, a Fold-to-Sequence model jointly learns a sequence embedding using a transformer and a fold embedding from the density of secondary structural elements in 3D voxels. The joint sequence-fold representation is learned through novel intra-domain and cross-domain losses: an intra-domain loss forces two semantically similar samples from the same domain (that is, proteins that should have the same fold or folds) to be close to each other in a latent space, while a cross-domain loss forces two semantically similar samples in different domains to be closer. In an embodiment, the Fold-to-Sequence model performs design tasks that include low-resolution structures, structures with regions of missing residues, and NMR structural ensembles.
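The two losses can be illustrated with a small sketch. The abstract does not give their functional form, so this example assumes mean squared Euclidean distance over pairs of samples that share a fold label; the actual model may use a different similarity measure. All function names and the toy embeddings are hypothetical.

```python
from itertools import combinations, product
from typing import List, Sequence

Vector = Sequence[float]


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two embeddings."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def intra_domain_loss(embs: List[Vector], folds: List[str]) -> float:
    """Mean squared distance over same-fold pairs within ONE domain
    (all sequence embeddings, or all fold embeddings)."""
    pairs = [(i, j) for i, j in combinations(range(len(embs)), 2) if folds[i] == folds[j]]
    if not pairs:
        return 0.0
    return sum(sq_dist(embs[i], embs[j]) for i, j in pairs) / len(pairs)


def cross_domain_loss(seq_embs: List[Vector], seq_folds: List[str],
                      fold_embs: List[Vector], fold_folds: List[str]) -> float:
    """Mean squared distance over same-fold pairs ACROSS domains
    (one sequence embedding, one fold embedding)."""
    pairs = [(i, j)
             for i, j in product(range(len(seq_embs)), range(len(fold_embs)))
             if seq_folds[i] == fold_folds[j]]
    if not pairs:
        return 0.0
    return sum(sq_dist(seq_embs[i], fold_embs[j]) for i, j in pairs) / len(pairs)


# Toy 2-D embeddings with made-up fold labels.
seq_embs = [[0.0, 0.0], [1.0, 0.0], [5.0, 5.0]]
seq_folds = ["a", "a", "b"]
fold_embs = [[0.0, 1.0], [5.0, 5.0]]
fold_folds = ["a", "b"]

intra = intra_domain_loss(seq_embs, seq_folds)
cross = cross_domain_loss(seq_embs, seq_folds, fold_embs, fold_folds)
```

Minimizing both terms during training would pull same-fold samples together within each domain and across the two domains, which is the behavior the abstract describes.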
Methods and systems for genetic analysis
This disclosure provides systems and methods for sample processing and data analysis. Sample processing may include nucleic acid sample processing and subsequent sequencing. Some or all of a nucleic acid sample may be sequenced to provide sequence information, which may be stored or otherwise maintained in an electronic storage location. The sequence information may be analyzed with the aid of a computer processor, and the analyzed sequence information may be stored in an electronic storage location that may include a pool or collection of sequence information and analyzed sequence information generated from the nucleic acid sample. Methods and systems of the present disclosure can be used, for example, for the analysis of a nucleic acid sample, for producing one or more libraries, and for producing biomedical reports. Methods and systems of the disclosure can aid in the diagnosis, monitoring, treatment, and prevention of one or more diseases and conditions.
AUTOMATED SCREENING OF ENZYME VARIANTS
Disclosed are methods for identifying bio-molecules with desired properties (or which are most suitable for a round of directed evolution) from complex bio-molecule libraries or sets of such libraries. Some embodiments of the present disclosure provide methods for virtually screening proteins for beneficial properties. Some embodiments of the present disclosure provide methods for virtually screening enzymes for desired activity and/or selectivity for catalytic reactions involving particular substrates. Some embodiments combine screening and directed evolution to design and develop proteins and enzymes having desired properties. Systems and computer program products implementing the methods are also provided.