G16B30/20

Methods for Determining Lymphocyte Receptor Chain Pairs

Provided herein are high-throughput sequencing methods for studying the diversity, functionality, and pairing of lymphocyte receptor chains. Specifically, the methods are used to identify with confidence one or more lymphocyte receptor chain pairs in a sample, for example one or more functional chain pairs.

Haplotype resolved genome sequencing
11492656 · 2022-11-08

Methods of determining a haplotype or partial haplotype of a DNA sample containing high molecular weight segments of genomic DNA are disclosed. Such methods may include sequencing DNA in an enriched DNA sample to produce a plurality of sequence reads, where some of the sequence reads contain a first allele of the first haplotype and others contain a second allele of the first haplotype. Some methods align the sequence reads to a reference genome to produce aligned reads, where aligned reads from the first high molecular weight segment tend to cluster into islands on the reference genome. Some methods further determine the distances separating adjacent aligned reads on the reference genome and select a first group of aligned reads whose separation distances to adjacent aligned reads are smaller than a cutoff value. Using alleles from the first group of aligned reads, the method may define a first haplotype or first partial haplotype.
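
The island-selection step can be illustrated with a minimal Python sketch. It assumes each aligned read is reduced to a single start coordinate on one reference contig, and the function name `group_reads_into_islands` and the example cutoff are hypothetical. It is not the patented method, only an illustration of grouping reads by separation distance.

```python
from typing import List


def group_reads_into_islands(positions: List[int], cutoff: int) -> List[List[int]]:
    """Group aligned-read start positions into islands.

    Two adjacent reads (in reference order) fall in the same island when
    the distance between them is smaller than `cutoff`.
    """
    islands: List[List[int]] = []
    for pos in sorted(positions):
        # Extend the current island if this read is close enough to the last one.
        if islands and pos - islands[-1][-1] < cutoff:
            islands[-1].append(pos)
        else:
            # Otherwise start a new island.
            islands.append([pos])
    return islands


if __name__ == "__main__":
    # Hypothetical read positions and cutoff, for illustration only.
    reads = [100, 150, 220, 5000, 5050, 12000]
    for island in group_reads_into_islands(reads, cutoff=1000):
        print(island)
```

Alleles observed on the reads of one island could then be assigned to the same haplotype or partial haplotype, since those reads are presumed to come from the same high molecular weight segment.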

Flexible decoding in DNA data storage based on redundancy codes

Data that has been stored according to a DNA data storage method can be decoded using a flexible approach that supports both solitary strand mapping and cluster-based trace reconstruction. Solitary strand mapping can place strings based on integrity verification. Redundancy information can be partitioned to support error correction during the solitary strand mapping while still achieving integrity verification. Clusters with verified strands can be skipped during cluster-based trace reconstruction. The approach is useful for increasing the accuracy of the trace reconstruction procedure.
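
The interplay between integrity verification and cluster skipping can be sketched in Python under strong simplifying assumptions. Here a strand is modeled as bytes laid out as a 2-byte index, a payload, and a 4-byte CRC-32. This layout, the CRC-32 check, and the names `encode_strand`, `try_place`, and `decode` are all hypothetical stand-ins, and the error-correction partition of the redundancy is omitted.

```python
import zlib
from typing import Dict, List, Optional, Tuple


def encode_strand(index: int, payload: bytes) -> bytes:
    """Build a toy strand: 2-byte index + payload + 4-byte CRC-32 (assumed layout)."""
    body = index.to_bytes(2, "big") + payload
    return body + zlib.crc32(body).to_bytes(4, "big")


def try_place(read: bytes) -> Optional[Tuple[int, bytes]]:
    """Return (index, payload) if the read passes the integrity check, else None."""
    if len(read) < 6:
        return None
    body, check = read[:-4], read[-4:]
    if zlib.crc32(body).to_bytes(4, "big") != check:
        return None
    return int.from_bytes(body[:2], "big"), body[2:]


def decode(clusters: List[List[bytes]]) -> Tuple[Dict[int, bytes], List[List[bytes]]]:
    """Place verified strands directly; return clusters still needing reconstruction."""
    placed: Dict[int, bytes] = {}
    pending: List[List[bytes]] = []
    for cluster in clusters:
        hits = [hit for hit in map(try_place, cluster) if hit is not None]
        if hits:
            # A verified strand exists, so this cluster skips trace reconstruction.
            index, payload = hits[0]
            placed[index] = payload
        else:
            pending.append(cluster)
    return placed, pending


if __name__ == "__main__":
    good = encode_strand(3, b"ACGT")
    bad = bytearray(good)
    bad[2] ^= 1  # simulate a single-bit read error in the payload
    placed, pending = decode([[bytes(bad), good], [bytes(bad)]])
    print(placed, len(pending))
```

Only the clusters returned in `pending` would be passed on to a cluster-based trace reconstruction step, which is not shown.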

Identifying presence and composition of cell-free nucleic acids
20230095082 · 2023-03-30

This disclosure describes example techniques and systems for identifying the presence and/or composition of nucleic acids in the blood of a host organism of a model species harboring tissue of a donor organism of another species. For example, the technique may involve identifying the presence and composition of nucleic acids in the blood of a mouse harboring tissue of a human or a companion animal. The identified cell-free nucleic acids can be used as biomarkers to determine the presence of a disease, its biological behavior, its rate of progression, and/or the response of the disease to one or more unique therapies. In other examples, the cell-free nucleic acids may be used as biomarkers to determine a response of the host organism to the tissue of the donor organism, or a response of the donor-derived tissue to transplantation within the host organism.

Methods and systems for de novo peptide sequencing using deep learning

The present systems and methods introduce deep learning to de novo peptide sequencing from tandem mass spectrometry data. The systems and methods achieve improvements in sequencing accuracy over existing systems and methods and enable complete assembly of novel protein sequences without assisting databases. The present systems and methods are re-trainable to adapt to new sources of data and provide a complete end-to-end training and prediction solution, which is advantageous given the rapidly growing amount of data. The systems and methods combine deep learning and dynamic programming to solve optimization problems.

Methods and systems for detecting sequence variants
11488688 · 2022-11-01

The invention provides methods for identifying rare variants near a structural variation in a genetic sequence, for example, in a nucleic acid sample taken from a subject. The invention additionally includes methods for aligning reads (e.g., nucleic acid reads) to a reference sequence construct accounting for the structural variation, methods for building a reference sequence construct accounting for the structural variation or the structural variation and the rare variant, and systems that use the alignment methods to identify rare variants. The method is scalable, and can be used to align millions of reads to a construct thousands of bases long, or longer.