G06V30/1983

INFORMATION PROCESSING APPARATUS AND NON-TRANSITORY COMPUTER READABLE MEDIUM STORING PROGRAM

An information processing apparatus includes a processor configured to extract a specific text string from a text string which is a text recognition target, calculate a reliability degree of text recognition for the specific text string, and output the reliability degree as a reliability degree of text recognition for an entirety of the text string which is the text recognition target.
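The idea in this abstract can be illustrated with a minimal sketch. The abstract does not say how the specific text string is extracted or how its reliability degree is calculated, so the sketch assumes a regular expression for extraction and the minimum per-character confidence as the reliability degree. The function name `whole_string_reliability` and its parameters are hypothetical.

```python
import re

def whole_string_reliability(text, confidences, pattern):
    """Return the reliability of the first substring matching `pattern`,
    reported as the reliability of the whole recognized string.

    Assumptions (not from the abstract): `confidences` holds one OCR
    confidence per character of `text`, and the reliability degree is
    the minimum confidence inside the extracted substring.
    """
    if len(text) != len(confidences):
        raise ValueError("need one confidence per character")
    match = re.search(pattern, text)
    if match is None or match.start() == match.end():
        return None  # no specific text string found
    return min(confidences[match.start():match.end()])

# Example: only the digits matter, so the poorly recognized label is ignored.
text = "No. 4821"
confidences = [0.4, 0.5, 0.3, 0.2, 0.9, 0.95, 0.8, 0.99]
print(whole_string_reliability(text, confidences, r"\d+"))
```

The design choice here is that low confidence outside the extracted substring does not lower the reported value, which matches the abstract's statement that the substring's reliability is output for the entire string.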

STANDARDIZATION IN THE CONTEXT OF DATA INTEGRATION

Techniques are described relating to automatic data standardization in a managed services domain of a cloud computing environment. An associated computer-implemented method includes receiving a dataset during a data onboarding procedure and classifying datapoints within the dataset. The method further includes applying a machine learning data standardization model to each classified datapoint within the dataset and deriving a proposed set of data standardization rules for the dataset based upon any standardization modification determined consequent to model application. Optionally, the method includes presenting the proposed set of data standardization rules for client review and, responsive to acceptance of the proposed set of data standardization rules, applying the proposed set of data standardization rules to the dataset. The method further includes, responsive to acceptance of the proposed set of data standardization rules, updating the machine learning data standardization model accordingly.
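The flow of classifying datapoints, applying a standardization model, deriving proposed rules, and applying them on acceptance might be sketched as follows. This is only an illustration: the hand-written `classify` and `standardize` functions are hypothetical stand-ins for the classifier and the machine learning model, and the model update step is omitted.

```python
def classify(value):
    """Toy classifier (stand-in): label a datapoint by its surface form."""
    if value.replace("-", "").replace(" ", "").isdigit():
        return "phone"
    return "text"

def standardize(label, value):
    """Stand-in for the learned standardization model."""
    if label == "phone":
        return "".join(ch for ch in value if ch.isdigit())
    return value.strip().lower()

def propose_rules(dataset):
    """Derive one proposed rule per datapoint that the model would modify."""
    rules = []
    for value in dataset:
        label = classify(value)
        standardized = standardize(label, value)
        if standardized != value:
            rules.append((label, value, standardized))
    return rules

def apply_rules(dataset, rules, accepted):
    """Apply the proposed rules only if the client accepted them."""
    if not accepted:
        return list(dataset)
    mapping = {old: new for _, old, new in rules}
    return [mapping.get(value, value) for value in dataset]

dataset = ["555-1234", " Alice ", "bob"]
rules = propose_rules(dataset)
print(rules)
print(apply_rules(dataset, rules, accepted=True))
```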

User interface for regular expression generation

Disclosed herein are techniques related to automated generation of regular expressions. In some embodiments, a regular expression generator may receive input data comprising one or more character sequences. The regular expression generator may convert the character sequences into sets of regular expression codes and/or span data structures. The regular expression generator may identify a longest common subsequence shared by the sets of regular expression codes and/or spans, and may generate a regular expression based upon the longest common subsequence.
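A minimal sketch of the approach, under stated assumptions: each character is mapped to a coarse regular expression code (`\d`, `[A-Za-z]`, or an escaped literal), the codes shared by all inputs are found with a standard dynamic-programming longest common subsequence, and the result is joined with lazy wildcards. The code set, the pairwise folding over more than two inputs (which yields a common subsequence, not necessarily the longest one), and all function names are assumptions rather than the generator described in the abstract.

```python
import re
from functools import reduce

def to_codes(text):
    """Map each character to a coarse regular expression code."""
    codes = []
    for ch in text:
        if ch.isascii() and ch.isdigit():
            codes.append(r"\d")
        elif ch.isascii() and ch.isalpha():
            codes.append("[A-Za-z]")
        else:
            codes.append(re.escape(ch))
    return codes

def lcs(a, b):
    """Longest common subsequence of two sequences (dynamic programming)."""
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m - 1, -1, -1):
        for j in range(n - 1, -1, -1):
            if a[i] == b[j]:
                dp[i][j] = 1 + dp[i + 1][j + 1]
            else:
                dp[i][j] = max(dp[i + 1][j], dp[i][j + 1])
    out, i, j = [], 0, 0
    while i < m and j < n:
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif dp[i + 1][j] >= dp[i][j + 1]:
            i += 1
        else:
            j += 1
    return out

def generate_regex(samples):
    """Build a pattern from the codes common to every sample."""
    if not samples:
        raise ValueError("at least one sample is required")
    common = reduce(lcs, [to_codes(s) for s in samples])
    return ".*?".join(common)

pattern = generate_regex(["AB-123", "XY-9"])
print(pattern)
```

Because the pattern is built from a subsequence that occurs in order in every input, `re.search(pattern, s)` finds a match in each sample it was generated from.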

Systems and methods for using dynamic reference graphs to accurately align sequence reads

A method for matching character strings to a reference character string is disclosed. One or more processors receive a plurality of character strings. The one or more processors match each of the plurality of character strings to a main reference character string and register a match to positions on the main reference character string that satisfy pre-set match criteria. The one or more processors match each of the plurality of character strings to an alternate reference character string and register a match to positions on the alternate reference character string that satisfy the pre-set match criteria. The alternate reference character string is derived from the main reference character string. The one or more processors identify a match for each of the plurality of character strings that matches either a position on the main reference character string or a position on the alternate reference character string.
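The matching steps can be sketched as follows. The abstract leaves the match criteria open, so this sketch assumes a maximum number of mismatching characters at a given position, and it assumes the alternate reference is the main reference with a single substitution. The names `find_matches` and `align` are hypothetical.

```python
def find_matches(read, reference, max_mismatches):
    """Positions on `reference` where `read` fits within the mismatch limit."""
    hits = []
    for pos in range(len(reference) - len(read) + 1):
        window = reference[pos:pos + len(read)]
        mismatches = sum(a != b for a, b in zip(read, window))
        if mismatches <= max_mismatches:
            hits.append(pos)
    return hits

def align(reads, main_ref, alt_ref, max_mismatches=0):
    """Register matches of each read on the main and alternate references."""
    results = {}
    for read in reads:
        results[read] = {
            "main": find_matches(read, main_ref, max_mismatches),
            "alt": find_matches(read, alt_ref, max_mismatches),
        }
    return results

main_ref = "ACGTACGT"
alt_ref = "ACGTTCGT"  # assumed variant: A replaced by T at index 4
print(align(["GTAC", "GTTC"], main_ref, alt_ref))
```

A read carrying the variant finds no exact match on the main reference but does match the alternate one, which is the case the second reference is meant to cover.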

Regular expression generation using combinatoric longest common subsequence algorithms

Disclosed herein are techniques related to automated generation of regular expressions. In some embodiments, a regular expression generator may receive input data comprising one or more character sequences. The regular expression generator may convert the character sequences into sets of regular expression codes and/or span data structures. The regular expression generator may identify a longest common subsequence shared by the sets of regular expression codes and/or spans, and may generate a regular expression based upon the longest common subsequence.

Recognition method and recognition system for unambiguously recognizing an object
20220044080 · 2022-02-10 ·

The presented invention relates to a computer-implemented recognition method (100) for unambiguously recognizing an object. The recognition method (100) comprises a first determining step (101) for determining, by means of a first optical sensor (201) at a first point in time, reference information by capturing a number of symbols applied to a reference object, a training step (103) for training a machine learner on the basis of the reference information and a provided ground truth which assigns respective reference information to a first class or a further class, a second determining step (105) for determining, by means of a second optical sensor (205) at a second point in time, sample information by capturing a number of symbols applied to a sample object, an assigning step (107) for assigning the sample information to the first class or the further class by the machine learner, and an outputting step (109) for outputting a validation signal in case the machine learner assigns the sample information to the first class.

Furthermore, the presented invention relates to a recognition system (200).

SUPPORT APPARATUS, GENERATION APPARATUS, ANALYSIS APPARATUS, SUPPORT METHOD, GENERATION METHOD, ANALYSIS METHOD, AND NON-TRANSITORY COMPUTER-READABLE RECORDING MEDIUM
20220043847 · 2022-02-10 ·

A support apparatus includes a generation apparatus and an analysis apparatus. The generation apparatus executes (a-1) to (a-5) with i=1 to n, and generates pieces of process information. The generation apparatus extracts material words from a document i in (a-1), extracts a treatment word i from the document i in (a-2), extracts a synthesis condition i from the document i in (a-3), extracts a characteristic value i related to a target material from the document i in (a-4), and associates the material words, the treatment word i, the synthesis condition i, and the characteristic value i with each other to generate process information i in (a-5). The analysis apparatus includes a combiner that generates composite process information including a common part common to the pieces of process information and different parts different among the pieces of process information, and an outputter that outputs the composite process information.
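The combiner step can be sketched as follows, assuming each piece of process information is a dictionary with one entry per extracted field. The field names and example values are made up for illustration, and the extraction steps (a-1) to (a-4) are not shown.

```python
def combine(process_infos):
    """Split fields into a common part and per-document different parts."""
    keys = set().union(*process_infos)
    common, different = {}, {}
    for key in sorted(keys):
        values = [info.get(key) for info in process_infos]
        if all(value == values[0] for value in values):
            common[key] = values[0]   # identical in every document
        else:
            different[key] = values   # one value per document, in order
    return {"common": common, "different": different}

# Hypothetical process information generated from two documents.
info_1 = {"materials": ("A", "B"), "treatment": "sintering",
          "condition": "condition-1", "characteristic": 1.0}
info_2 = {"materials": ("A", "B"), "treatment": "sintering",
          "condition": "condition-2", "characteristic": 2.0}
print(combine([info_1, info_2]))
```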

A METHOD FOR DISTINGUISHING BETWEEN MORE THAN ONE FLUORESCENT SPECIES PRESENT IN A SAMPLE
20210334513 · 2021-10-28 ·

Methods and systems are provided for distinguishing between more than one fluorescent species present in a sample in fluorescence microscopy. The method involves illuminating the sample with at least one light source. More than two images of the illuminated sample are recorded over a period of time, each image comprising a plurality of pixels, wherein each pixel corresponds to a location in the sample and records a degree of fluorescence at the location in the sample at a particular point in time. A photostability characteristic of the degree of fluorescence at each pixel over the period of time over which the more than two images were recorded is determined and used to distinguish between the more than one fluorescent species present in the sample.
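One way to picture the per-pixel analysis is the sketch below. The abstract does not define the photostability characteristic, so this sketch assumes it is a photobleaching decay rate, estimated as the negated least-squares slope of log intensity against frame index, and it assumes two species separated by a single threshold. Function names, labels, and the threshold are hypothetical.

```python
import math

def decay_rate(series):
    """Negated least-squares slope of log(intensity) versus frame index."""
    n = len(series)
    xs = list(range(n))
    ys = [math.log(value) for value in series]  # requires intensities > 0
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return -num / den

def classify_pixels(frames, threshold):
    """Label each pixel by how quickly its fluorescence fades."""
    if len(frames) < 3:
        raise ValueError("more than two images are required")
    rows, cols = len(frames[0]), len(frames[0][0])
    labels = []
    for r in range(rows):
        row = []
        for c in range(cols):
            rate = decay_rate([frame[r][c] for frame in frames])
            row.append("fast-bleaching" if rate > threshold else "slow-bleaching")
        labels.append(row)
    return labels

# Three 1x2 frames: the left pixel halves each frame, the right loses 5%.
frames = [[[100.0, 100.0]], [[50.0, 95.0]], [[25.0, 90.25]]]
print(classify_pixels(frames, threshold=0.3))
```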

Stacked cross-modal matching

The present concepts relate to matching data of two different modalities using two stages of attention. First data is encoded as a set of first vectors representing components of the first data, and second data is encoded as a set of second vectors representing components of the second data. In the first stage, the components of the first data are attended by comparing the first vectors and the second vectors to generate a set of attended vectors. In the second stage, the components of the second data are attended by comparing the second vectors and the attended vectors to generate a plurality of relevance scores. Then, the relevance scores are pooled to calculate a similarity score that indicates a degree of similarity between the first data and the second data.
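The two stages can be sketched in plain Python under one possible reading of the abstract: for each second vector, the first vectors are weighted by a softmax over cosine similarities to form an attended vector, each second vector is then compared with its attended vector to give a relevance score, and the scores are pooled by averaging. The cosine comparison, the temperature, and the mean pooling are assumptions, as are the function names.

```python
import math

def cosine(u, v):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def softmax(scores, temperature):
    """Numerically stable softmax with a temperature."""
    top = max(scores)
    exps = [math.exp((s - top) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def similarity(first_vectors, second_vectors, temperature=0.1):
    """Two-stage attention followed by mean pooling of relevance scores."""
    relevance = []
    for second in second_vectors:
        # Stage 1: attend to the first data's components for this vector.
        weights = softmax([cosine(f, second) for f in first_vectors], temperature)
        attended = [sum(w * f[d] for w, f in zip(weights, first_vectors))
                    for d in range(len(second))]
        # Stage 2: compare the second vector with its attended vector.
        relevance.append(cosine(second, attended))
    return sum(relevance) / len(relevance)

regions = [[1.0, 0.0], [0.0, 1.0]]
print(similarity(regions, [[1.0, 0.0], [0.0, 1.0]]))
print(similarity(regions, [[-1.0, 0.0], [0.0, -1.0]]))
```

The sketch assumes both modalities have already been encoded into vectors of the same dimension; the encoders themselves are outside its scope.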

SIMILARITY PROCESSING METHOD, APPARATUS, SERVER AND STORAGE MEDIUM

The present application discloses a similarity processing method, an apparatus, a server and a storage medium, and relates to the fields of information processing and natural language processing. The specific implementation solution is as follows: acquiring a first character string and a second character string; determining a pronunciation pattern similarity and a character pattern similarity between the first character string and the second character string; and determining a comprehensive similarity between the first character string and the second character string, based on the pronunciation pattern similarity and the character pattern similarity.
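A minimal sketch of combining the two similarities, under stated assumptions: both similarities are computed with the ratio from Python's standard `difflib.SequenceMatcher`, the pronunciation of a string comes from a caller-supplied function, and the comprehensive similarity is a weighted average. The abstract does not specify any of these choices, and the pronunciation table in the example is a made-up toy.

```python
from difflib import SequenceMatcher

def ratio(a, b):
    """Similarity in [0, 1] based on matching subsequences."""
    return SequenceMatcher(None, a, b).ratio()

def comprehensive_similarity(first, second, pronounce, weight=0.5):
    """Weighted mix of pronunciation similarity and character similarity.

    `pronounce` is a hypothetical callable that maps a string to a
    pronunciation string; `weight` is the share given to pronunciation.
    """
    character_sim = ratio(first, second)
    pronunciation_sim = ratio(pronounce(first), pronounce(second))
    return weight * pronunciation_sim + (1 - weight) * character_sim

# Toy pronunciation table, for illustration only.
TABLE = {"night": "nait", "knight": "nait"}

def pronounce(text):
    return TABLE.get(text, text)

print(comprehensive_similarity("night", "knight", pronounce))
```

Strings that are spelled differently but pronounced alike score higher here than under character comparison alone.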