G06V2201/04

SYSTEM AND METHOD FOR ACHIEVING HIGH GENE DATA RESOLUTION USING TRAINING SETS

Systems, methods, and computer program products for generating an enhanced set of sequences for taxonomical classification are disclosed. In various embodiments, a plurality of reference sequences are received. Each of the plurality of reference sequences corresponds to a taxonomical classification. A label corresponding to at least one of the reference sequences is assigned to each of a plurality of supplemental sequences. Each of the plurality of supplemental sequences and each of the plurality of reference sequences are truncated to a region of interest to thereby generate a truncated set of sequences. Similarity is measured between pairs of truncated sequences in the truncated set of sequences to determine whether the similarity is above a predetermined threshold. An intermediate taxonomical label is assigned to the pair of truncated sequences in the truncated set of sequences when the similarity is above the predetermined threshold to thereby generate an enhanced set of sequences.
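The truncate, compare, and relabel steps in this abstract can be illustrated with a minimal sketch. Everything named here is hypothetical: the abstract does not specify the similarity measure (plain positional identity is assumed), the region of interest is assumed to be a fixed index range, and the intermediate label is assumed to be the joined set of the original labels.

```python
from itertools import combinations

def identity(a, b):
    """Fraction of matching positions between two equal-length strings (assumed similarity measure)."""
    if len(a) != len(b) or not a:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)

def enhance(labeled, start, end, threshold):
    """labeled maps a sequence name to (sequence, taxonomic label).

    Each sequence is truncated to [start, end). Pairs whose truncated
    similarity exceeds the threshold share an intermediate label, assumed
    here to be the sorted, slash-joined set of the labels involved.
    """
    truncated = {name: (seq[start:end], label) for name, (seq, label) in labeled.items()}
    groups = {name: {label} for name, (_, label) in truncated.items()}
    for (n1, (s1, l1)), (n2, (s2, l2)) in combinations(truncated.items(), 2):
        if identity(s1, s2) > threshold:
            groups[n1].add(l2)
            groups[n2].add(l1)
    return {name: "/".join(sorted(labels)) for name, labels in groups.items()}

labeled = {
    "ref_a": ("AAACGTACGTTT", "Genus x"),
    "ref_b": ("CCCCGTACGTGG", "Genus y"),
    "sup_c": ("AAATTTTTTTTT", "Genus z"),
}
result = enhance(labeled, 3, 9, 0.9)
```

In this toy input, ref_a and ref_b are identical inside the region of interest, so both receive the combined label, while sup_c keeps its own.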

Copy number variant caller

Direct targeted sequencing (DTS) methods and a hidden Markov model (HMM) can be used to call the copy number of a segment of interest within a region of interest. Described herein are methods for calling a copy number variant or a copy number variant abnormality using an HMM, and methods for determining a copy number based on a copy number likelihood model, in a test sequencing library that has been sequenced using DTS methods. Also described herein are methods for determining a copy number of a segment, including accounting for spurious capture probes that may arise from the DTS methods.
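As a rough illustration of HMM-based copy number calling, the sketch below runs Viterbi decoding over per-probe normalised depths. It is not the method of the abstract: the Gaussian emission model, the `stay` and `sigma` parameters, and the uniform start distribution are all assumptions chosen for the example, and spurious-probe handling is omitted.

```python
import math

def viterbi_copy_number(depths, states=(1, 2, 3), stay=0.98, sigma=0.15):
    """Most likely copy-number path for normalised depths (1.0 is assumed to mean two copies)."""
    if not depths:
        return []

    def log_emit(x, cn):
        # Assumed Gaussian emission centred on cn / 2 (constant terms dropped).
        return -((x - cn / 2.0) ** 2) / (2 * sigma ** 2)

    switch = (1.0 - stay) / (len(states) - 1)

    def log_trans(a, b):
        return math.log(stay if a == b else switch)

    score = {s: log_emit(depths[0], s) for s in states}  # uniform start
    back = []
    for x in depths[1:]:
        prev, score, pointers = score, {}, {}
        for s in states:
            best = max(states, key=lambda p: prev[p] + log_trans(p, s))
            score[s] = prev[best] + log_trans(best, s) + log_emit(x, s)
            pointers[s] = best
        back.append(pointers)

    # Trace back from the best final state.
    state = max(states, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]

calls = viterbi_copy_number([1.0, 1.02, 0.97, 1.5, 1.48, 1.52, 1.0, 0.99])
```

With these assumed parameters, a run of three elevated probes is called as a gain, whereas a single elevated probe is absorbed as noise because the cost of switching states twice outweighs it.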

SYSTEMS AND METHODS FOR PROCESSING ELECTRONIC IMAGES
20220012880 · 2022-01-13

An image processing method including receiving a target image of a slide corresponding to a target specimen comprising a tissue sample of a patient; generating a machine learning system by processing a plurality of training images, each training image comprising an image of human tissue and a label characterizing at least one of a slide morphology, a diagnostic value, a pathologist review outcome, and an analytic difficulty; automatically identifying, using the machine learning system, an area of interest of the target image by analyzing microscopic features extracted from multiple image regions in the target image; determining, using the machine learning system, a probability of a target feature being present in the area of interest of the target image based on an average probability; and determining, using the machine learning system, a prioritization value, of a plurality of prioritization values.
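The last two steps, a probability based on an average and a prioritization value chosen from several, might look like the following sketch. The per-region probabilities are assumed to come from some upstream model, and the cutoff values and integer tiers are hypothetical; the abstract does not say how a probability maps to a prioritization value.

```python
from statistics import mean

def target_probability(region_probs):
    """Average the per-region probabilities within the area of interest."""
    return mean(region_probs)

def prioritization_value(prob, cutoffs=(0.2, 0.5, 0.8)):
    """Map a probability to an integer tier from 0 to len(cutoffs) (assumed scheme)."""
    return sum(prob >= c for c in cutoffs)

prob = target_probability([0.9, 0.7, 0.8])
tier = prioritization_value(0.65)
```

A higher tier would then move the slide earlier in a review queue, under the assumption that prioritization values are ordered.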

Equalization-Based Image Processing and Spatial Crosstalk Attenuator

The technology disclosed attenuates spatial crosstalk from sequencing images for base calling. In particular, the technology disclosed accesses an image whose pixels depict intensity emissions from a target cluster and intensity emissions from additional adjacent clusters. The pixels include a center pixel that contains a center of the target cluster. Each pixel in the pixels is divisible into a plurality of subpixels. Depending upon a particular subpixel, in a plurality of subpixels of the center pixel, which contains the center of the target cluster, the technology disclosed selects, from a bank of subpixel lookup tables, a subpixel lookup table that corresponds to the particular subpixel. The selected subpixel lookup table contains pixel coefficients that are configured to maximize a signal-to-noise ratio. The technology disclosed element-wise multiplies the pixel coefficients with the pixels and determines a weighted sum.
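The lookup-table selection and weighted sum can be sketched as follows. The function names, the row-major binning of the center pixel into `bins` by `bins` subpixels, and the coefficient values are all assumptions for illustration; in the abstract the coefficients are configured to maximize signal-to-noise ratio, which this sketch does not attempt.

```python
def subpixel_index(cx, cy, bins=3):
    """Map the cluster center's fractional position in the center pixel to a subpixel bin (row-major)."""
    fx, fy = cx % 1.0, cy % 1.0
    return int(fy * bins) * bins + int(fx * bins)

def equalize(patch, lut_bank, cx, cy, bins=3):
    """Pick the lookup table for the subpixel holding the center, multiply element-wise, and sum."""
    coeffs = lut_bank[subpixel_index(cx, cy, bins)]
    return sum(c * p
               for crow, prow in zip(coeffs, patch)
               for c, p in zip(crow, prow))

# Placeholder coefficient tables for a 2 x 2 subpixel grid (values are made up).
passthrough = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
sharpen = [[0, -0.5, 0], [-0.5, 2, -0.5], [0, -0.5, 0]]
lut_bank = [passthrough, passthrough, passthrough, sharpen]

patch = [[1, 2, 1], [2, 10, 2], [1, 2, 1]]  # pixels around the target cluster
a = equalize(patch, lut_bank, 5.25, 7.25, bins=2)
b = equalize(patch, lut_bank, 5.75, 7.75, bins=2)
```

Here the same patch yields 10 or 16.0 depending only on which subpixel contains the cluster center, which is the role the bank of tables plays.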

Method for constructing sequencing template based on image, and base recognition method and device

A method for constructing a sequencing template based on an image, a device, and a system. The image includes first, second, third and fourth images of the same field of view corresponding to base extensions of A, T/U, G, and C respectively; the first, second, third and fourth images respectively include images M1 and M2, images N1 and N2, images P1 and P2, and images Q1 and Q2; the method includes combining any two of the images M1, M2, N1, N2, P1, P2, Q1, and Q2 to perform bright spot matching, and enabling such images to participate in the combination at least once to obtain a plurality of combined images including first coincident bright spots, and merging the first coincident bright spots on the plurality of combined images to obtain a bright spot set corresponding to the sequencing template.
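A simplified sketch of pairwise bright spot matching and merging is given below. It assumes bright spots have already been detected as (x, y) coordinates, that two spots coincide when they lie within a tolerance `tol` of each other, and that a coincident pair is represented by its midpoint; none of these details come from the abstract. All pairwise combinations are used, so every image participates at least once.

```python
from itertools import combinations

def coincident(spots_a, spots_b, tol=1.0):
    """Midpoints of spots in spots_a that have a partner in spots_b within tol."""
    out = []
    for xa, ya in spots_a:
        for xb, yb in spots_b:
            if (xa - xb) ** 2 + (ya - yb) ** 2 <= tol ** 2:
                out.append(((xa + xb) / 2, (ya + yb) / 2))
                break
    return out

def build_template(images, tol=1.0):
    """images maps an image name to its list of bright spots; returns the merged spot set."""
    template = []
    for a, b in combinations(images, 2):
        for x, y in coincident(images[a], images[b], tol):
            # Merge: skip spots already represented in the template.
            if all((x - tx) ** 2 + (y - ty) ** 2 > tol ** 2 for tx, ty in template):
                template.append((x, y))
    return template

images = {
    "M1": [(10, 10), (20, 20)],
    "M2": [(10.2, 10.0)],
    "N1": [(20.0, 20.4), (30, 30)],
}
template = build_template(images)
```

In this example the spot near (30, 30) appears in only one image, so it has no coincident partner and is left out of the template.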

PRIMARY ANALYSIS IN NEXT GENERATION SEQUENCING
20230326065 · 2023-10-12

Image data analysis, and particularly identifying cluster or polony locations for performing base-calling in a digital image of a flow cell during DNA sequencing, is described. A method may include generating a first plurality of flow cell images of a cellular sample immobilized on a support by conducting one or more cycles of sequencing reactions. The cellular sample may include a plurality of concatemer molecules therewithin. For the first plurality of flow cell images, pixel intensities and a respective color purity of each of the pixel intensities may be determined. A base calling template may include base calling locations based on the pixel intensities and the respective color purity of the pixel intensities. The base calling template may be for registering a second plurality of flow cell images of the support in one or more subsequent cycles of the one or more cycles.
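One way to picture selecting base calling locations from pixel intensities and color purity is sketched below. The abstract does not define color purity; the sketch assumes it is the brightest channel's share of the total intensity, and the two thresholds are hypothetical parameters.

```python
def color_purity(channels):
    """Assumed definition: brightest channel divided by the sum of all channels."""
    total = sum(channels)
    return max(channels) / total if total > 0 else 0.0

def base_calling_locations(pixels, min_intensity, min_purity):
    """pixels maps (row, col) to a tuple of per-channel intensities.

    A pixel is kept when its brightest channel and its purity both
    reach the given (hypothetical) thresholds.
    """
    return sorted(
        pos for pos, ch in pixels.items()
        if max(ch) >= min_intensity and color_purity(ch) >= min_purity
    )

pixels = {
    (0, 0): (100, 5, 5, 5),    # bright and dominated by one channel
    (0, 1): (40, 40, 10, 10),  # mixed signal, likely overlapping clusters
    (1, 0): (3, 0, 0, 0),      # pure but too dim
}
locations = base_calling_locations(pixels, min_intensity=50, min_purity=0.8)
```

Only the first pixel passes both checks, which reflects the idea that a usable location should be both bright and dominated by a single color.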

PRIMARY ANALYSIS IN NEXT GENERATION SEQUENCING

Image data analysis, particularly identifying cluster locations for performing base-calling in a digital flow cell image during DNA sequencing, is described. Each nucleic acid template molecule immobilized on a support may include an insert sequence and a sample index sequence. The sample index sequence may include a k-mer sequence. A sequencing system may conduct k cycles of sequencing reactions of the k-mer sequence before conducting one or more cycles of the insert sequence sequencing reactions and generate a first plurality of flow cell images. Pixel intensities may be determined for pixels of the first plurality of flow cell images. A base calling template may be determined and include base calling locations based on the pixel intensities and respective color purities of the pixel intensities. The base calling template may register a second plurality of flow cell images of the support in one or more cycles subsequent to the k cycles.

Systems and methods for processing electronic images
11776681 · 2023-10-03

An image processing method including identifying, using a machine learning system, an area of interest of a target image by analyzing features extracted from image regions in the target image, the machine learning system being generated by processing a plurality of training images each comprising an image of human tissue and a diagnostic label characterizing at least one of a slide morphology, a diagnostic value, and a pathologist review outcome; determining, using the machine learning system, a probability of a target feature being present in the area of interest of the target image based on an average probability; and determining, using the machine learning system, a prioritization value, of a plurality of prioritization values, of the target image based on the probability of the target feature being present in the target image.

Somatic mutation detection apparatus and method with reduced sequencing platform-specific error

A mutation detection apparatus includes a memory configured to store software for implementing a neural network and a processor configured to detect a mutation by executing the software, wherein the processor is configured to generate first genome data extracted from a target tissue and second genome data extracted from a normal tissue, extract image data by preprocessing the first genome data and the second genome data, and detect a mutation of the target tissue on the basis of the image data through the neural network trained to correct a sequencing platform-specific false positive.

SINGLE-PASS PRIMARY ANALYSIS

Methods and systems for image analysis are provided, and in particular for identifying a set of base-calling locations in a flow cell for DNA sequencing. These include capturing flow cell images after each sequencing step performed on the flow cell, and identifying candidate cluster centers in at least one of the flow cell images. Intensities are determined for each candidate cluster center in a set of flow cell images. Purities are determined for each candidate cluster center based on the intensities. Each candidate cluster center with a purity greater than the purity of the surrounding candidate cluster centers within a distance threshold is added to a template set of base-calling locations.
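The final selection rule, keeping a candidate only if its purity exceeds that of every other candidate within the distance threshold, can be sketched as below. Purities are taken as already computed from the intensities, Euclidean distance is assumed, and the quadratic neighbour search is chosen for clarity rather than speed.

```python
def template_locations(candidates, distance_threshold):
    """candidates is a list of (x, y, purity) tuples; returns the kept (x, y) locations."""
    limit = distance_threshold ** 2
    kept = []
    for i, (x, y, purity) in enumerate(candidates):
        neighbours = [
            q for j, (u, v, q) in enumerate(candidates)
            if j != i and (x - u) ** 2 + (y - v) ** 2 <= limit
        ]
        # Strictly greater than every neighbour; isolated candidates always pass.
        if all(purity > q for q in neighbours):
            kept.append((x, y))
    return kept

candidates = [(0, 0, 0.9), (1, 0, 0.7), (10, 10, 0.6), (10, 11, 0.8)]
template = template_locations(candidates, distance_threshold=2)
```

Each nearby pair contributes only its purer member, so the template here contains (0, 0) and (10, 11). Note that two neighbours with exactly equal purity would both be dropped under this strict comparison; how ties are handled is not stated in the abstract.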