G16B25/00

Method, apparatus, and computer-readable medium for predicting a hybridization rate constant of a first sequence

Embodiments of methods, systems, and tangible non-transitory computer-readable media having instructions are presented. A method includes calculating a plurality of feature values for a number of bioinformatic features of the desired hybridization reaction, and calculating distances between the plurality of feature values and corresponding database rate constant values stored in a database, the database comprising a plurality of hybridization reactions having known rate constants. The method additionally includes calculating a weighted average of the logarithms of the database rate constant values, with larger weights assigned to value instances that are closer in distance to the plurality of feature values of the desired hybridization reaction, and providing the weighted average as a predicted logarithm of the rate constant of the desired hybridization reaction.
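The distance-weighted averaging described in this abstract can be illustrated with a minimal sketch. It assumes Euclidean distance between feature vectors and an exponential weighting kernel with a `bandwidth` parameter; the abstract specifies neither, and the function name, parameter names, and example values are hypothetical.

```python
import math

def predict_log_rate(query, database, bandwidth=1.0):
    """Predict log10 of a rate constant as a distance-weighted average.

    query:    feature vector of the desired hybridization reaction
    database: list of (feature_vector, known_rate_constant) pairs
    The exponential kernel and Euclidean distance are assumptions;
    the abstract only requires larger weights for closer entries.
    """
    weights = []
    log_rates = []
    for features, rate_constant in database:
        distance = math.dist(query, features)  # Euclidean distance
        weights.append(math.exp(-distance / bandwidth))
        log_rates.append(math.log10(rate_constant))
    total = sum(weights)
    return sum(w * lr for w, lr in zip(weights, log_rates)) / total

# Made-up database of two reactions with two features each.
example_db = [([0.0, 0.0], 1e6), ([10.0, 10.0], 1e4)]
near_first = predict_log_rate([0.0, 0.0], example_db)
midpoint = predict_log_rate([5.0, 5.0], example_db)
```

A query that coincides with a database entry is dominated by that entry, while a query equidistant from both entries receives the plain mean of their logarithms.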

Alignment free filtering for identifying fusions

Cell free nucleic acids from a test sample obtained from an individual are analyzed to identify possible fusion events. Cell free nucleic acids are sequenced and processed to generate fragments. Fragments are decomposed into kmers and the kmers are either analyzed de novo or compared to targeted nucleic acid sequences that are known to be associated with fusion gene pairs of interest. Thus, kmers that may have originated from a fusion event can be identified. These kmers are consolidated to generate gene ranges from various genes that match sequences in the fragment. A candidate fusion event can be called given the spanning of one or more gene ranges across the fragment.
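The targeted mode described in this abstract (decompose a fragment into kmers, match them against known gene sequences, consolidate matches into gene ranges) might be sketched as follows. All function names, the toy sequences, and the rule "two or more genes with ranges implies a candidate" are illustrative assumptions; the de novo mode is not covered.

```python
def kmers(seq, k):
    """Return all overlapping substrings of length k."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]

def build_index(targets, k):
    """Map each kmer to the set of target genes that contain it."""
    index = {}
    for gene, seq in targets.items():
        for km in kmers(seq, k):
            index.setdefault(km, set()).add(gene)
    return index

def gene_ranges(fragment, index, k):
    """Merge overlapping kmer hits into (gene, start, end) ranges, end exclusive."""
    open_ranges = {}  # gene -> [start, end] of the range still being extended
    finished = []
    for i, km in enumerate(kmers(fragment, k)):
        for gene in index.get(km, ()):
            if gene in open_ranges and i <= open_ranges[gene][1]:
                open_ranges[gene][1] = i + k  # hit touches the open range: extend it
            else:
                if gene in open_ranges:
                    finished.append((gene, *open_ranges[gene]))
                open_ranges[gene] = [i, i + k]
    finished.extend((g, s, e) for g, (s, e) in open_ranges.items())
    return sorted(finished, key=lambda r: r[1])

def is_candidate_fusion(ranges):
    """Assumed rule: ranges from at least two distinct genes span the fragment."""
    return len({gene for gene, _, _ in ranges}) >= 2

# Toy target sequences and a fragment joining a piece of each (made up).
targets = {"GENE_A": "ACGTACGGTCA", "GENE_B": "TTGCCGATAGC"}
index = build_index(targets, k=4)
ranges = gene_ranges("ACGTACGGCCGATAGC", index, k=4)
```

No alignment is performed here: the kmer index lookup is a plain dictionary access, which is what makes this kind of filtering cheap compared with aligning every fragment.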

Analytical signal for determination of the presence of a target nucleic acid sequence
11473127 · 2022-10-18

The present invention relates to a method for providing an analytical signal for determination of the presence of a target nucleic acid sequence in a sample. The present invention can contribute to dramatic improvement in methods for detecting target nucleic acid sequences using different detection temperatures and reference values. The present invention allows detection of a target nucleic acid sequence in a more accurate, effective and reproducible manner, by removing or adjusting a signal region that may affect the detection of a target nucleic acid sequence.

Methods and systems for modeling phasing effects in sequencing using termination chemistry

A method for nucleic acid sequencing includes receiving observed or measured nucleic acid sequencing data from a sequencing instrument that receives and processes a sample nucleic acid in a termination sequencing-by-synthesis process. The method also includes generating a set of candidate sequences of bases for the observed or measured nucleic acid sequencing data by determining a predicted signal for candidate sequences using a simulation framework. The simulation framework incorporates an estimated carry forward rate (CFR), an estimated incomplete extension rate (IER), an estimated droop rate (DR), an estimated reactivated molecules rate (RMR), and an estimated termination failure rate (TFR), the RMR being greater than or equal to zero and the TFR being less than one. The method also includes identifying, from the set of candidate sequences of bases, one candidate sequence leading to optimization of a solver function as corresponding to the sequence for the sample nucleic acid.
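To give a feel for how phasing parameters shape a predicted signal, here is a generic lag/lead/droop population model. It is a toy, not the simulation framework of this abstract: it assumes one base is incorporated per cycle, treats carry forward as a two-position advance, and omits the RMR and TFR terms entirely. The function name and default rates are hypothetical.

```python
def simulate_phasing(n_cycles, ier=0.01, cfr=0.005, dr=0.002):
    """Track the fraction of active strands at each template position.

    ier: fraction of strands that fail to extend in a cycle (lag)
    cfr: fraction of extending strands that advance two positions (lead)
    dr:  fraction of active strands lost per cycle (droop)
    Returns one distribution over positions per cycle.
    """
    # state[p] = fraction of strands that have incorporated p bases
    state = [1.0] + [0.0] * (2 * n_cycles)
    history = []
    for _ in range(n_cycles):
        new_state = [0.0] * len(state)
        for p, frac in enumerate(state):
            if frac == 0.0:
                continue
            alive = frac * (1.0 - dr)                           # droop removes strands
            new_state[p] += alive * ier                         # lagging strands stay put
            new_state[p + 1] += alive * (1.0 - ier) * (1.0 - cfr)  # in-phase strands
            new_state[p + 2] += alive * (1.0 - ier) * cfr       # leading strands
        state = new_state
        history.append(state)
    return history

history = simulate_phasing(n_cycles=50)
in_phase_fraction = [dist[c + 1] for c, dist in enumerate(history)]
```

In a base caller built on such a model, each candidate sequence would be pushed through the simulation to obtain a predicted signal, and the candidate whose prediction best matches the measurement would be kept.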

Methods for validation of microbiome sequence processing and differential abundance analyses via multiple bespoke spike-in mixtures

Compositions, systems and methods for generating and using internal standard spike-in mixes including a combination of template spikes. Compositions, systems and methods described herein are directed to using the internal standard spike-in mixes to evaluate a set of workflow pipelines to perform differential abundance analyses on a sample containing variations of a target nucleic acid sequence of interest. Compositions, systems and methods described herein are directed to using the internal spike-in mixes to validate results obtained from differential abundance analyses performed on a sample containing variations of a target nucleic acid sequence of interest, where the variations may be of highly variable levels of relative abundance.

Conformal Inference for Optimization

Accurate function estimates and well-calibrated uncertainties are important for Bayesian optimization (BO). Most theoretical guarantees for BO are established for methods that model the objective function with a surrogate drawn from a Gaussian process (GP) prior. GP priors are poorly suited for discrete, high-dimensional, combinatorial spaces, such as biopolymer sequences. Using a neural network (NN) as the surrogate function can yield more accurate function estimates. An NN allows arbitrarily complex models, removes the GP prior assumption, and enables easy pretraining, which is beneficial in the low-data BO regime. However, a fully Bayesian treatment of uncertainty in NNs remains intractable, and existing approximate methods, such as Monte Carlo dropout and variational inference, can produce badly miscalibrated uncertainty estimates. Conformal Inference Optimization (CI-OPT) uses confidence intervals calculated using conformal inference as a replacement for posterior uncertainties in certain BO acquisition functions. A conformal scoring function with properties amenable to optimization is effective on standard BO datasets and real-world protein datasets.
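As background for the conformal intervals mentioned in this abstract, the following sketch shows standard split conformal prediction with absolute residuals as the score. This is the textbook construction, not the CI-OPT scoring function, which the abstract does not specify; the function names and the toy calibration data are assumptions.

```python
import math

def conformal_quantile(scores, alpha):
    """Finite-sample-corrected (1 - alpha) quantile of calibration scores."""
    n = len(scores)
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        return math.inf  # too few calibration points for this coverage level
    return sorted(scores)[rank - 1]

def conformal_interval(predict, calib_x, calib_y, x_new, alpha=0.1):
    """Interval around predict(x_new) built from held-out calibration residuals."""
    scores = [abs(y - predict(x)) for x, y in zip(calib_x, calib_y)]
    q = conformal_quantile(scores, alpha)
    center = predict(x_new)
    return center - q, center + q

# Toy surrogate and made-up calibration data with residuals 0.1 to 1.0.
surrogate = lambda x: 2 * x
calib_x = list(range(10))
calib_y = [2 * x + 0.1 * (i + 1) for i, x in enumerate(calib_x)]
lower, upper = conformal_interval(surrogate, calib_x, calib_y, x_new=5, alpha=0.1)
```

In an optimization loop, the upper end of such an interval could play the role that a posterior mean plus scaled standard deviation plays in an upper-confidence-bound acquisition function; whether CI-OPT uses the interval this way is not stated in the abstract.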

Biomarker signature method, and apparatus and kits therefor
20220325348 · 2022-10-13

The present invention discloses methods, kits, and apparatus as well as reagents and compositions associated therewith for deriving an indicator for use in diagnosing the presence, absence or degree of at least one condition in a biological subject or in prognosing at least one condition in a biological subject. Also disclosed is a biomarker signature for use in diagnosing the presence, absence or degree of at least one condition in a biological subject or in prognosing at least one condition in a biological subject. The present invention further discloses methods, kits and apparatus, as well as reagents and compositions associated therewith, for identifying biomarkers for use in a biomarker signature.