G16B50/50

METHOD AND SYSTEM FOR COMPRESSING GENOME SEQUENCES USING GRAPHIC PROCESSING UNITS
20180011870 · 2018-01-11 ·

The present invention provides a method for compressing genome sequence reads using a graphics processing unit (GPU). The method comprises the steps of: identifying the position of each given read's character string in the sequence of a reference genome; determining the alignment of each read string within the reference genome; comparing each read's character string to the corresponding reference genome sequence based on the determined alignment; filtering the characters of each read on the GPU by eliminating matching characters and extracting only the differing characters together with their positions in the genome sequence; and recording the filtered data of each read, in association with its alignment in the reference genome, in the compressed genome database.
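
The compare-and-filter step in the abstract above can be illustrated with a minimal CPU-only sketch in Python. It is not the patented GPU implementation: the function names `diff_read` and `restore_read`, the zero-based `pos` argument, and the restriction to substitutions (no insertions or deletions) are assumptions made for illustration.

```python
def diff_read(reference: str, read: str, pos: int) -> list[tuple[int, str]]:
    """Return (offset, base) pairs where the read differs from the reference.

    Assumption: the read is aligned at zero-based position `pos` and differs
    from the reference by substitutions only.
    """
    if pos < 0:
        raise ValueError("alignment position must be non-negative")
    window = reference[pos:pos + len(read)]
    if len(window) != len(read):
        raise ValueError("read extends past the end of the reference")
    # Matching characters are dropped; only the differences are kept.
    return [(i, b) for i, (a, b) in enumerate(zip(window, read)) if a != b]


def restore_read(reference: str, pos: int, length: int,
                 diffs: list[tuple[int, str]]) -> str:
    """Rebuild the original string from the reference and the stored differences."""
    bases = list(reference[pos:pos + length])
    for offset, base in diffs:
        bases[offset] = base
    return "".join(bases)


reference = "ACGTACGTTAGC"
read = "ACGATA"
diffs = diff_read(reference, read, pos=4)   # one substitution at offset 3
restored = restore_read(reference, 4, len(read), diffs)
```

Storing only `(pos, length, diffs)` per string is what makes the representation compact when the string closely matches the reference. Each comparison is independent of the others, which is why this kind of workload is a plausible fit for parallel hardware.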

ARTIFICIAL INTELLIGENCE ANALYSIS OF RNA TRANSCRIPTOME FOR DRUG DISCOVERY
20230238081 · 2023-07-27 ·

A system and method may be provided to receive sample RNA reads from patients and generate lists of genes and their associated RNA expression levels in each patient. Some of the RNA reads may be matched to an RNA transcript or gene or gene family in terms of their match likelihood and other RNA reads may be matched to an RNA transcript or gene or gene family through the use of one or more machine learning classifiers. A machine learning classifier may be trained based on the plurality of the lists and a plurality of corresponding patients’ clinical status data to identify gene patterns that recur with a high degree of frequency in the plurality of the lists. Those gene patterns can be capable of modifying a disease or treatment response and can be targeted for drug/treatment development.

BIOLOGICAL SEQUENCE COMPRESSION USING SEQUENCE ALIGNMENT
20230230659 · 2023-07-20 ·

Compressing files is disclosed. A DNA sequence to be compressed is first aligned. Aligning the DNA sequence includes splitting the DNA sequence into smaller sequences or portions that can be aligned. After the DNA sequence is split one or more times and aligned, a compression matrix is generated. Each row of the compression matrix corresponds to part of the DNA sequence. A consensus sequence is determined from the compression matrix. Using the consensus sequence, pointer pairs are generated. Each pointer pair identifies a subsequence of the consensus matrix. The compressed file includes the pointer pairs and the consensus sequence.
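
The abstract above does not say how the consensus sequence is derived from the compression matrix. One common choice, assumed here purely for illustration, is a column-wise majority vote over equal-length aligned rows, with `-` as a hypothetical gap symbol. The sketch does not cover pointer-pair generation.

```python
from collections import Counter


def consensus(rows: list[str], gap: str = "-") -> str:
    """Column-wise majority vote over aligned rows (an assumed rule).

    Gap characters are ignored when voting; a column containing only gaps
    yields a gap. Ties go to the base seen first in the column.
    """
    if len({len(row) for row in rows}) != 1:
        raise ValueError("rows must be non-empty and of equal length")
    result = []
    for column in zip(*rows):
        counts = Counter(c for c in column if c != gap)
        result.append(counts.most_common(1)[0][0] if counts else gap)
    return "".join(result)


matrix = ["ACGT-", "ACCTA", "A-GTA"]
print(consensus(matrix))
```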

High-Capacity Storage of Digital Information in DNA

A method for storage of an item of information (210) is disclosed. The method comprises encoding bytes (720) in the item of information (210), and representing using a schema the encoded bytes by a DNA nucleotide to produce a DNA sequence (230). The DNA sequence (230) is broken into a plurality of overlapping DNA segments (240) and indexing information (250) added to the plurality of DNA segments. Finally, the plurality of DNA segments (240) is synthesized (790) and stored (795).
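
The pipeline in the abstract above (encode bytes, map them to nucleotides, cut overlapping segments, attach an index) can be sketched as follows. The two-bits-per-base mapping and the names `bytes_to_dna`, `dna_to_bytes`, and `overlapping_segments` are assumptions for illustration; the abstract does not specify the schema, the segment length, or the index format.

```python
BASES = "ACGT"  # assumed schema: 00->A, 01->C, 10->G, 11->T


def bytes_to_dna(data: bytes) -> str:
    """Represent each byte as four nucleotides, most significant bits first."""
    return "".join(BASES[(byte >> shift) & 0b11]
                   for byte in data for shift in (6, 4, 2, 0))


def dna_to_bytes(dna: str) -> bytes:
    """Inverse of bytes_to_dna."""
    if len(dna) % 4:
        raise ValueError("DNA length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(dna), 4):
        value = 0
        for base in dna[i:i + 4]:
            value = (value << 2) | BASES.index(base)
        out.append(value)
    return bytes(out)


def overlapping_segments(dna: str, length: int, step: int) -> list[tuple[int, str]]:
    """Cut `dna` into indexed segments whose start positions advance by `step`.

    With step < length, consecutive segments overlap by length - step bases.
    """
    if not 0 < step <= length:
        raise ValueError("step must satisfy 0 < step <= length")
    last_start = max(len(dna) - length, 0)
    starts = list(range(0, last_start + 1, step))
    if starts[-1] != last_start:
        starts.append(last_start)  # make sure the tail is covered
    return [(index, dna[s:s + length]) for index, s in enumerate(starts)]


sequence = bytes_to_dna(b"Hi")
segments = overlapping_segments(sequence, length=4, step=2)
```

Because neighbouring segments overlap, each base away from the ends appears in more than one segment, so the loss of a single segment does not necessarily lose data. The index attached to each segment records its order for reassembly.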

Methods Of Cross Correlation Of Biofield Scans To Enome Database, Genome Database, Blood Test, And Phenotype Data
20230215517 · 2023-07-06 ·

Systems and methods are provided for identifying characteristics of a subject using a biofield scan obtained from the subject. An embodiment can include a method for cross-correlating biofield scans to an enome database and/or a genome database. A phenotype history and a biofield scan can be created from a user. A user's biofield scan can be created from measured amplitude and frequency. A database is created from a user's phenotype history and biofield scan. The user's phenotype history and biofield scans are then correlated with known physical and biochemical characteristics. A biofield signature is created and compared to the user's phenotype history and biofield scan.

Gene sequencing data compression preprocessing, compression and decompression method, system, and computer-readable medium

The present invention discloses a gene sequencing data compression preprocessing, compression and decompression method, a system, and a computer-readable medium. The preprocessing method implementation steps include: obtaining reference genome data; and obtaining a mapping relationship between a short string K-mer and a prediction character c, to obtain a prediction data model P1 containing any short string K-mer in the positive strand and negative strand of a reference genome and the prediction character c in a corresponding adjacent bit. The compression and decompression methods relate to performing compression/decompression on the basis of the prediction data model P1. The system is a computer system including a program for executing the previous method. The computer-readable medium includes a computer program for executing the previous method. The present invention can be oriented towards lossless gene sequencing data compression and provides fully effective information for a high-performance lossless compression and decompression algorithm for gene sequencing data.
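
The preprocessing step in the abstract above, which maps each K-mer on both strands of the reference to a prediction character, can be sketched as follows. Several details are assumptions, not taken from the abstract: the negative strand is treated as the reverse complement, the adjacent character is the one immediately following the K-mer, and when a K-mer has several successors the most frequent one is chosen. The name `build_prediction_model` is hypothetical.

```python
from collections import Counter, defaultdict

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement, used here as the negative strand."""
    return seq.translate(COMPLEMENT)[::-1]


def build_prediction_model(reference: str, k: int) -> dict[str, str]:
    """Map every k-mer on both strands to its most frequent following base."""
    if k <= 0:
        raise ValueError("k must be positive")
    successors: dict[str, Counter] = defaultdict(Counter)
    for strand in (reference, reverse_complement(reference)):
        for i in range(len(strand) - k):
            successors[strand[i:i + k]][strand[i + k]] += 1
    # Assumed tie-break: the successor seen first wins.
    return {kmer: counts.most_common(1)[0][0]
            for kmer, counts in successors.items()}


model = build_prediction_model("ACGTT", k=2)
# A compressor could then store only whether each base of a sequenced string
# matches model[previous_k_bases], which is cheap to encode when the
# predictions are usually right.
```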

Method for the Compression of Genome Sequence Data
20220415441 · 2022-12-29 ·

The invention relates to a reference-based method for the compression of genome sequence data produced by a sequencing machine. The sequences of nucleotides or bases, which have been previously aligned to a reference sequence, are determined to be perfectly mapped, imperfectly mapped or unmapped with the reference sequence, and then coded according to said determination. The determining step comprises comparing, for each imperfectly mapped sequence, the number of mismatches between said sequence and the reference sequence with a reference threshold value, and encoding the imperfectly mapped sequences according to distinct encoding processes, depending on the result of said comparison.
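
The classification described in the abstract above can be sketched as a small function that labels each aligned sequence before encoding. The label names, the use of `None` for an unmapped sequence, and the `<=` direction of the threshold comparison are assumptions for illustration; only substitutions are counted as mismatches.

```python
from typing import Optional


def classify_read(reference: str, read: str, pos: Optional[int],
                  threshold: int) -> str:
    """Label a sequence so that a later stage can pick an encoding process.

    `pos` is the zero-based alignment position, or None if unmapped.
    """
    if pos is None:
        return "unmapped"
    window = reference[pos:pos + len(read)]
    if len(window) != len(read):
        # Assumption: a sequence overhanging the reference is treated as unmapped.
        return "unmapped"
    mismatches = sum(a != b for a, b in zip(window, read))
    if mismatches == 0:
        return "perfect"
    # Imperfectly mapped sequences are split by the reference threshold value.
    return "imperfect_few" if mismatches <= threshold else "imperfect_many"


reference = "ACGTACGT"
labels = [
    classify_read(reference, "ACGT", 0, threshold=1),
    classify_read(reference, "ACGA", 0, threshold=1),
    classify_read(reference, "TTTT", 0, threshold=1),
    classify_read(reference, "GGGG", None, threshold=1),
]
```

Splitting the imperfect class lets a codec use a compact difference list when there are few mismatches and fall back to a different encoding when listing every mismatch would cost more than it saves.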