G06V30/268

Image processing device, image processing method, and storage medium storing program

An image processing device including: a first feature quantity selecting unit configured to select a first feature quantity of a document image that is a character recognition target among first feature quantities that are recorded in advance and represent features of character strings of an item; a character recognition processing unit configured to perform a character recognition process for the document image; a character string selecting unit configured to select a character string of a specific item corresponding to the first feature quantity among the character strings acquired as a result of the character recognition process; and a determination result acquiring unit configured to acquire a determination result indicating whether or not a character string that has been input in advance matches the character string of the specific item in a case in which the character string selecting unit has not selected any one of the character strings.

MAPPER COMPONENT FOR A NEURO-LINGUISTIC BEHAVIOR RECOGNITION SYSTEM

Techniques are disclosed for generating a sequence of symbols based on input data for a neuro-linguistic model. The model may be used by a behavior recognition system to analyze the input data. A mapper component of a neuro-linguistic module in the behavior recognition system receives one or more normalized vectors generated from the input data. The mapper component generates one or more clusters based on a statistical distribution of the normalized vectors. The mapper component evaluates statistics and identifies statistically relevant clusters. The mapper component assigns a distinct symbol to each of the identified clusters.
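The cluster-then-label flow in this abstract can be illustrated with a minimal sketch. It is not the patented mapper: the greedy distance-threshold clustering, the frequency test for "statistically relevant", and the names `map_to_symbols`, `radius`, and `min_share` are all assumptions chosen for illustration.

```python
import math
from string import ascii_uppercase


def map_to_symbols(vectors, radius=0.15, min_share=0.1):
    """Greedily cluster vectors, keep frequent clusters, and label them.

    Assumptions (not from the abstract): a vector joins the nearest cluster
    if it lies within `radius` of its centroid, and a cluster counts as
    statistically relevant once it holds at least `min_share` of all vectors.
    Returns the symbol sequence (None for vectors in irrelevant clusters)
    and the cluster-index-to-symbol mapping.
    """
    centroids, counts, assignment = [], [], []
    for v in vectors:
        # Locate the nearest existing centroid, if any.
        best, best_d = None, float("inf")
        for i, c in enumerate(centroids):
            d = math.dist(v, c)
            if d < best_d:
                best, best_d = i, d
        if best is None or best_d > radius:
            # Too far from every cluster: start a new one.
            centroids.append(list(v))
            counts.append(1)
            best = len(centroids) - 1
        else:
            # Update the running mean of the chosen cluster.
            counts[best] += 1
            n = counts[best]
            centroids[best] = [c + (x - c) / n for c, x in zip(centroids[best], v)]
        assignment.append(best)

    total = len(vectors)
    relevant = [i for i, n in enumerate(counts) if n / total >= min_share]
    # One distinct letter per relevant cluster; this supports at most 26
    # clusters, which holds whenever min_share is at least 1/26.
    symbols = {i: ascii_uppercase[k] for k, i in enumerate(relevant)}
    return [symbols.get(i) for i in assignment], symbols
```

Calling `map_to_symbols` on a list of equal-length numeric tuples returns one symbol per input vector, so the output can be read as the symbol sequence the abstract refers to.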

Handwriting recognition systems and methods
11216688 · 2022-01-04

The present disclosure includes systems and methods for handwriting recognition. Handwriting data is received. Geometric data of text in the handwriting data is determined. Sub-characters of the text are determined. The sub-characters are matched to a model. The most probable characters of the text are determined based on the matching.

Unsupervised representation learning for structured records

Techniques for generating record embeddings from structured records are described. A record embeddings generating engine processes structured records to build a token vocabulary. Token embeddings are created for each token in the vocabulary. The token embeddings are trained using a loss function that relates the token embeddings to the record-attribute-data structure of the structured records. A record embedding is assembled from the trained token embeddings.
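The vocabulary-building and assembly steps can be sketched as follows. The training step with the loss function is deliberately omitted, since the abstract does not specify it. Keying tokens by attribute, whitespace tokenization, mean pooling, and the function names are assumptions for illustration only.

```python
import random


def build_vocabulary(records):
    """Map each distinct (attribute, token) pair to an integer id.

    Assumption: values are tokenized by lowercasing and whitespace splitting.
    """
    vocab = {}
    for record in records:
        for attribute, value in record.items():
            for token in str(value).lower().split():
                vocab.setdefault((attribute, token), len(vocab))
    return vocab


def init_embeddings(vocab, dim=4, seed=0):
    """Create one randomly initialized vector per vocabulary entry.

    A real system would train these vectors; training is out of scope here.
    """
    rng = random.Random(seed)
    return [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in vocab]


def record_embedding(record, vocab, embeddings):
    """Assemble a record embedding as the mean of its known token embeddings."""
    ids = [
        vocab[(attribute, token)]
        for attribute, value in record.items()
        for token in str(value).lower().split()
        if (attribute, token) in vocab
    ]
    dim = len(embeddings[0]) if embeddings else 0
    if not ids:
        # No known tokens: fall back to a zero vector.
        return [0.0] * dim
    return [sum(embeddings[i][k] for i in ids) / len(ids) for k in range(dim)]
```

Mean pooling is only one plausible way to assemble a record embedding from token embeddings; the abstract does not say which aggregation is used.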

Data extraction and duplicate detection

A system provides an end-to-end solution for invoice processing which includes reading files (such as PDFs and images), extracting key relevant information from the files, organizing the relevant information in a structured template as a key-value pair, and comparing files based on the similarities between different file fields to identify potential duplicate files.
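The final comparison step can be sketched under the assumption that extraction has already produced a key-value record per file. The similarity measure (`difflib.SequenceMatcher`), the unweighted mean across shared fields, the 0.9 threshold, and the field names in the usage note are illustrative assumptions, not details from the abstract.

```python
from difflib import SequenceMatcher
from itertools import combinations


def field_similarity(a, b):
    """Case-insensitive string similarity between two field values, in [0, 1]."""
    return SequenceMatcher(None, str(a).lower(), str(b).lower()).ratio()


def record_similarity(r1, r2):
    """Mean field similarity over the keys the two records share."""
    shared = r1.keys() & r2.keys()
    if not shared:
        return 0.0
    return sum(field_similarity(r1[k], r2[k]) for k in shared) / len(shared)


def find_potential_duplicates(records, threshold=0.9):
    """Return pairs of file names whose extracted fields are nearly identical.

    `records` maps a file name to its extracted key-value fields.
    """
    return [
        (name1, name2)
        for (name1, r1), (name2, r2) in combinations(records.items(), 2)
        if record_similarity(r1, r2) >= threshold
    ]
```

For example, two records that share an `invoice_no` and `total` and differ only in vendor punctuation ("Acme Corp" versus "ACME Corp.") would be flagged as a potential duplicate pair, while a record with a different number, vendor, and total would not.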

METHOD FOR PREDICTING TRIP PURPOSES

Certain aspects of the present disclosure provide techniques for recommending trip purposes to users of an application. Embodiments include receiving labeled travel data from the application running on a remote device including a plurality of trip purposes. Embodiments include building a topic model representing words associated with a plurality of topics. Embodiments include training a topic prediction model, using the plurality of topics and one or more features derived from each of the plurality of trip records, to output a topic based on an input trip record. Embodiments include training a purpose prediction model, using the topic model and the plurality of trip purposes, to output a trip purpose based on an input topic. The trip purpose may be recommended to a user via a user interface of the application running on the remote device.

DATA EXTRACTION AND DUPLICATE DETECTION

A system provides an end-to-end solution for invoice processing which includes reading files (such as PDFs and images), extracting key relevant information from the files, organizing the relevant information in a structured template as a key-value pair, and comparing files based on the similarities between different file fields to identify potential duplicate files.

COMPUTATIONALLY REACTING TO A MULTIPARTY CONVERSATION

Technology is provided for causing a computing system to extract conversation features from a multiparty conversation (e.g., between a coach and mentee), apply the conversation features to a machine learning system to generate conversation analysis indicators, and apply a mapping of conversation analysis indicators to actions and inferences to determine actions to take or inferences to make for the multiparty conversation. In various implementations, the actions and inferences can include determining scores for the multiparty conversation such as a score for progress toward a coaching goal, instant scores for various points throughout the conversation, conversation impact score, ownership scores, etc. These scores can be, e.g., surfaced in various user interfaces along with context and benchmark indicators, used to select resources for the coach or mentee, used to update coach/mentee matchings, used to provide real-time alerts to signify how the conversation is going, etc.

Data extraction and duplicate detection

A system provides an end-to-end solution for invoice processing which includes reading invoices (both PDFs and images), extracting key relevant information from the face of invoices, organizing the relevant information in a structured template as a key-value pair, and comparing invoices based on the similarities between different invoice fields to identify potential duplicate invoices.

Sound playback interval control method, sound playback interval control program, and information processing apparatus
11386684 · 2022-07-12

A sound playback interval control method performed by a computer is provided for a speech recognition system. The method includes: arranging and displaying a word block subjected to correction and confirmation in a central portion of a first area on a display screen, the first area being an area in which a plurality of word blocks generated by using morphological analysis from a character string obtained by speech recognition are displayed, and performing playback control on sound of the word block subjected to correction and confirmation displayed in the first area.