Patent classifications
G16B50/50
SYSTEM AND METHOD FOR EFFECTIVE COMPRESSION, REPRESENTATION AND DECOMPRESSION OF DIVERSE TABULATED DATA
A method for controlling compression of data includes accessing genomic annotation data in one of a plurality of first file formats, extracting attributes from the genomic annotation data, dividing the genomic annotation data into multiple chunks, and processing the extracted attributes and chunks into correlated information. The method also includes selecting different compressors for the attributes and chunks identified in the correlated information and generating a file in a second file format that includes the correlated information and information indicative of the different compressors for the chunks and attributes indicated in the correlated information. The information indicative of the different compressors is processed into the second file format to allow selective decompression of the attributes and chunks indicated in the correlated information.
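The flow described in this abstract (split records into chunks, compress each attribute of each chunk with its own compressor, and record the choice so that a single block can be decompressed later) can be illustrated with a minimal sketch. Everything here is an assumption for illustration and not the patented file format: the function names `build_container` and `read_block`, the dictionary layout of the container, the JSON serialization of attribute values, and the "pick the smallest output" rule for selecting among zlib, bz2 and lzma.

```python
import bz2
import json
import lzma
import zlib

# Candidate compressors (an assumed set); each entry is (compress, decompress).
COMPRESSORS = {
    "zlib": (zlib.compress, zlib.decompress),
    "bz2": (bz2.compress, bz2.decompress),
    "lzma": (lzma.compress, lzma.decompress),
}


def build_container(records, attributes, chunk_size):
    """Compress every (chunk, attribute) block separately and index it."""
    index = []
    payload = bytearray()
    for start in range(0, len(records), chunk_size):
        chunk = records[start:start + chunk_size]
        for attr in attributes:
            raw = json.dumps([r[attr] for r in chunk]).encode("utf-8")
            # Hypothetical selection rule: keep whichever output is smallest.
            name, blob = min(
                ((n, comp(raw)) for n, (comp, _) in COMPRESSORS.items()),
                key=lambda item: len(item[1]),
            )
            index.append({
                "chunk": start // chunk_size,
                "attribute": attr,
                "compressor": name,
                "offset": len(payload),
                "length": len(blob),
            })
            payload += blob
    return {"index": index, "payload": bytes(payload)}


def read_block(container, chunk, attribute):
    """Decompress one (chunk, attribute) block without touching the others."""
    for entry in container["index"]:
        if entry["chunk"] == chunk and entry["attribute"] == attribute:
            begin = entry["offset"]
            blob = container["payload"][begin:begin + entry["length"]]
            _, decompress = COMPRESSORS[entry["compressor"]]
            return json.loads(decompress(blob).decode("utf-8"))
    raise KeyError((chunk, attribute))
```

Because the index stores the compressor name, offset and length of every block, a reader that needs only one attribute of one chunk can seek to that block and decompress it alone.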
QUALITY SCORE COMPRESSION
Methods, systems, and computer programs for compressing nucleic acid sequence data are disclosed. A method can include: obtaining nucleic acid sequence data representing (i) a read sequence and (ii) a plurality of quality scores; determining whether the read sequence includes at least one “N” base; based on a determination that the read sequence includes at least one “N” base, generating, by one or more computers, a first encoded data set by using a first encoding process to encode each set of four quality scores of the read sequence into a single byte of memory; and using a second encoding process to encode the first encoded data set, thereby compressing the data to be compressed.
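Encoding four quality scores into a single byte implies two bits per score. The sketch below shows one way such a two-stage scheme could look, under stated assumptions: the quality scores are taken to be already quantized to four levels (codes 0 to 3), zlib stands in for the unspecified second encoding process, and the "N"-base check that selects this path in the abstract is omitted. The function names are hypothetical.

```python
import zlib


def pack_quads(codes):
    """First stage: pack 2-bit codes (0..3) four to a byte, zero-padding the tail."""
    if any(c < 0 or c > 3 for c in codes):
        raise ValueError("each code must fit in two bits (0..3)")
    out = bytearray()
    for i in range(0, len(codes), 4):
        group = list(codes[i:i + 4])
        group += [0] * (4 - len(group))  # pad the final, partial group
        out.append((group[0] << 6) | (group[1] << 4) | (group[2] << 2) | group[3])
    return bytes(out)


def unpack_quads(packed, count):
    """Inverse of pack_quads; count drops the padding added to the last byte."""
    codes = []
    for byte in packed:
        codes.extend([(byte >> 6) & 3, (byte >> 4) & 3, (byte >> 2) & 3, byte & 3])
    return codes[:count]


def compress_qualities(codes):
    """Second stage (assumed to be zlib here) applied to the packed bytes."""
    return zlib.compress(pack_quads(codes))


def decompress_qualities(blob, count):
    return unpack_quads(zlib.decompress(blob), count)
```

The number of scores has to be stored alongside the compressed bytes, since the packed form cannot distinguish padding from a real code of 0.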
Mining all atom simulations for diagnosing and treating disease
The present disclosure describes methods for determining the functional consequences of mutations. The methods include the use of machine learning to identify and quantify features of all atom molecular dynamics simulations to obtain the disruptive severity of genetic variants on molecular function.
Cyphergenics-based verifications of blockchains
A method for verifying a material data chain (MDC) that is maintained by a creator is disclosed. The method includes receiving an unverified portion of the MDC from the creator including a set of consecutive material data blocks (MDBs). Each respective MDB includes respective material data, respective metadata, and a creator verification value. The method includes modifying a genomic differentiation object assigned to the verification cohort based on first genomic regulation instructions (GRI) that were used by the creator to generate the creator verification value. For each MDB in the unverified portion, the method includes determining a verifier verification value based on the MDB, a preceding MDB in the MDC, and a genomic engagement factor (GEF) determined with respect to the MDB. The GEF corresponding to an MDB is determined by extracting a sequence from the metadata of the MDB and mapping the sequence into the modified genomic differentiation object.
BIOCOMPATIBLE NUCLEIC ACIDS FOR DIGITAL DATA STORAGE
A device for the storage and/or the editing of digital data including at least one double-stranded, replicative, composite nucleic acid molecule. The composite nucleic acid molecule includes both digital data-encoding and non-digital data-encoding nucleic acids. The non-digital data-encoding nucleic acids may allow indexing and/or the provision of metadata for the flanking digital data-encoding nucleic acid. The composite nucleic acid molecules may be pooled to constitute an array, and arrays may constitute a DNA drive, which represents the physical support on which the digital data are stored.
Genetic Data in Transactions
A method performed by computer equipment of a consuming party, comprising: accessing an electronic document comprising a plurality of pointers, each pointer comprising a respective transaction identifier of a respective destination transaction stored on a blockchain, wherein the destination transactions comprise one or more first transactions storing respective genetic data of at least part of a reference genome, and one or more second transactions storing respective genetic data of at least a corresponding part of a target genome in compressed form compressed relative to the reference genome; accessing the genetic data from at least one of the first destination transactions and at least a corresponding one of the second destination transactions based on the respective identifiers accessed from the electronic document; and decompressing the accessed genetic data of the target genome based on the accessed genetic data of the reference genome.
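Decompressing a target genome that was compressed relative to a reference genome can be sketched as follows. This is an illustration under stated assumptions, not the claimed method: the compressed form is assumed to be a list of single-base substitutions (so reference and target have equal length), the blockchain is modeled as a plain dictionary from transaction identifier to stored data, and the names `reference_txid`, `target_txid` and all function names are hypothetical.

```python
def compress_against_reference(reference, target):
    """Keep only the positions where the target differs from the reference."""
    if len(reference) != len(target):
        raise ValueError("this sketch handles substitutions only (equal lengths)")
    return [(i, t) for i, (r, t) in enumerate(zip(reference, target)) if r != t]


def decompress_with_reference(reference, diffs):
    """Rebuild the target by applying the stored substitutions to the reference."""
    bases = list(reference)
    for position, base in diffs:
        bases[position] = base
    return "".join(bases)


def resolve_document(document, store):
    """Follow the document's pointers, then decompress the target genome.

    `store` stands in for transaction lookup on a blockchain (an assumption);
    `document` holds the transaction identifiers acting as pointers.
    """
    reference = store[document["reference_txid"]]
    diffs = store[document["target_txid"]]
    return decompress_with_reference(reference, diffs)
```

For example, with reference `ACGTACGT` and target `ACGAACCT`, only the two substitutions at positions 3 and 6 need to be stored for the target.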