G06V30/19113

Optical receipt processing
10229314 · 2019-03-12

Techniques for providing improved optical character recognition (OCR) for receipts are discussed herein. Some embodiments may provide for a system including one or more servers configured to perform receipt image cleanup, logo identification, and text extraction. The image cleanup may include transforming image data of the receipt by using image parameter values that optimize the logo identification, and performing logo identification using a comparison of the image data with training logos associated with merchants. When a merchant is identified, a second image cleanup may be performed by using image parameter values optimized for text extraction. A receipt structure may be used to categorize the extracted text. Improved OCR accuracy is also achieved by applying format rules of the receipt structure to the extracted text.
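The last step, applying format rules to extracted text, can be illustrated with a minimal Python sketch. It assumes a single hypothetical rule (a price field must look like digits, a period, and two digits) and a hypothetical table of common OCR character confusions; neither is taken from the patent, and the names `CONFUSIONS`, `PRICE`, and `fix_price` are illustrative only.

```python
import re

# Hypothetical table of characters that OCR often confuses with digits.
CONFUSIONS = str.maketrans({"O": "0", "o": "0", "l": "1", "I": "1", "S": "5", "B": "8"})

# Hypothetical format rule for a price field: digits, a period, two digits.
PRICE = re.compile(r"\d+\.\d{2}")


def fix_price(text):
    """Return text coerced to the price format, or None if it cannot be made to fit."""
    cleaned = text.strip().lstrip("$")
    if PRICE.fullmatch(cleaned):
        return cleaned
    # Retry after swapping look-alike letters for digits.
    candidate = cleaned.translate(CONFUSIONS)
    return candidate if PRICE.fullmatch(candidate) else None


print(fix_price("$l2.5O"))
```

The rule is used both to validate and to repair: a string that already satisfies the format is left untouched, and the substitution is attempted only when validation fails.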

EFFICIENT IMAGE ANALYSIS

Methods, systems, and apparatus for efficient image analysis. In some aspects, a system includes a camera configured to capture images, one or more environment sensors configured to detect movement of the camera, a data processing apparatus, and a memory storage apparatus in data communication with the data processing apparatus. The data processing apparatus can access, for each of a multitude of images captured by a mobile device camera, data indicative of movement of the camera at a time at which the camera captured the image. The data processing apparatus can also select, from the images, a particular image for analysis based on the data indicative of the movement of the camera for each image, analyze the particular image to recognize one or more objects depicted in the particular image, and present content related to the one or more recognized objects.
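As a rough Python illustration of selecting an image based on camera movement, the sketch below picks the frame with the smallest movement reading. The `Frame` type, the single scalar `motion` value, and the optional `max_motion` cutoff are assumptions made for the example; the abstract does not specify how movement is measured or compared.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    frame_id: int
    motion: float  # hypothetical movement magnitude recorded when the frame was captured


def select_frame(frames, max_motion=None):
    """Return the frame captured with the least camera movement, or None."""
    candidates = [f for f in frames if max_motion is None or f.motion <= max_motion]
    if not candidates:
        return None
    return min(candidates, key=lambda f: f.motion)


frames = [Frame(0, 0.8), Frame(1, 0.1), Frame(2, 0.4)]
best = select_frame(frames)
print(best.frame_id)
```

Only the selected frame would then be passed to the more expensive object recognition step.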

Usage based resource utilization of training pool for chatbots

A training request including an identifier that is indicative of a type of a machine learning (ML) model that is to be trained is received. A plurality of workers are maintained in a training pool, and a plurality of jobs are maintained in a queue of training jobs. Each worker is configured to train a particular type of ML model. Upon the training request being validated, a training job is created for the request and submitted to the queue of training jobs. For each type of ML model, a first metric and a second metric are obtained. A target metric is computed based on the first and the second metrics. The number of workers included in the training pool is modified based on the target metric.
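The abstract does not say what the two metrics are. Purely as an assumption for illustration, the Python sketch below treats the first metric as the number of queued jobs for a model type and the second as the number of jobs one worker can absorb, and computes a clamped target worker count from them. The function names and the `min_workers` and `max_workers` bounds are hypothetical.

```python
import math


def target_workers(queued_jobs, jobs_per_worker, min_workers=1, max_workers=10):
    """Compute a target worker count from two assumed metrics, clamped to pool limits."""
    if jobs_per_worker <= 0:
        raise ValueError("jobs_per_worker must be positive")
    wanted = math.ceil(queued_jobs / jobs_per_worker)
    return max(min_workers, min(max_workers, wanted))


def resize_pool(metrics):
    """Map each model type to its target worker count.

    metrics: dict of model type -> (queued_jobs, jobs_per_worker)
    """
    return {model_type: target_workers(*pair) for model_type, pair in metrics.items()}


print(resize_pool({"intent": (7, 2), "entity": (0, 2)}))
```

Clamping keeps at least one warm worker per model type and caps growth when the queue spikes.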

Efficient image analysis

Methods, systems, and apparatus for efficient image analysis. In some aspects, a system includes a camera configured to capture images, one or more environment sensors configured to detect movement of the camera, a data processing apparatus, and a memory storage apparatus in data communication with the data processing apparatus. The data processing apparatus can access, for each of a multitude of images captured by a mobile device camera, data indicative of movement of the camera at a time at which the camera captured the image. The data processing apparatus can also select, from the images, a particular image for analysis based on the data indicative of the movement of the camera for each image, analyze the particular image to recognize one or more objects depicted in the particular image, and present content related to the one or more recognized objects.

Image Analysis System for Testing in Manufacturing

A vision analytics and validation (VAV) system for providing an improved inspection of robotic assembly, the VAV system comprising a trained neural network three-way classifier to classify each component as good, bad, or do not know, and an operator station configured to enable an operator to review an output of the trained neural network and to determine whether a board including one or more components classified as bad or do not know passes review and is classified as good, or fails review and is classified as bad. In one embodiment, a retraining trigger utilizes the output of the operator station to train the trained neural network, based on the determination received from the operator station.
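One simple way to obtain a three-way decision is to threshold a classifier's confidence, sketched below in Python. This is an assumption for illustration: the abstract describes a trained three-way classifier and does not say that thresholds are used, and the `accept` and `reject` values are hypothetical.

```python
def classify(p_good, accept=0.9, reject=0.1):
    """Map a probability that a component is good to one of three labels."""
    if p_good >= accept:
        return "good"
    if p_good <= reject:
        return "bad"
    return "do_not_know"


def needs_review(component_labels):
    """A board goes to the operator station if any component is not classified as good."""
    return any(label != "good" for label in component_labels)


labels = [classify(p) for p in (0.97, 0.55, 0.99)]
print(labels, needs_review(labels))
```

The operator's pass or fail decision on a reviewed board could then be stored as a labeled example for retraining.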

Dynamic presentation of targeted information in a mixed media reality recognition system
10007928 · 2018-06-26

A context-aware targeted information delivery system comprises a mobile device, an MMR matching unit, a plurality of databases for user profiles, user context and advertising information, a plurality of comparison engines and a plurality of weight adjusters. The mobile device is coupled to deliver an image patch to the MMR matching unit, which in turn performs recognition to produce recognized text. The recognized text is provided to first and second comparison engines to produce relevant topics and relevant ads. The relevant topics and relevant ads are adjusted with information from a user context database including information such as location, date, time, and other information from a user profile. A third comparison engine compares the relevant topics and relevant ads to produce a set of final ads that are most related to the topics of interest for the user and are delivered for display on the mobile device.

Efficient Image Analysis

A computing system includes one or more memory devices to store instructions; and one or more processors to execute the instructions to perform operations. The operations include: receiving a plurality of images captured by a camera; selecting a portion of the plurality of images having a quality rating above a threshold level; processing, by a coarse classifier, a first image among the portion of the plurality of images to determine whether the first image depicts at least one object from one or more particular classes of objects; in response to determining the first image depicts the at least one object from the one or more particular classes of objects, performing an object recognition process to recognize the at least one object; and presenting content related to the at least one object recognized via the object recognition process.
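The staged operations can be sketched in Python as a filter followed by a cheap gate and an expensive recognizer. The three callables (`quality`, `is_candidate`, `recognize`) and the default `threshold` are placeholders assumed for the example, not components named in the patent.

```python
def analyze(images, quality, is_candidate, recognize, threshold=0.5):
    """Run recognition on the first sufficiently sharp image that passes the coarse check."""
    # Stage 1: keep only images whose quality rating is above the threshold.
    good = [img for img in images if quality(img) > threshold]
    for img in good:
        # Stage 2: the coarse classifier decides whether a full pass is worthwhile.
        if is_candidate(img):
            # Stage 3: the expensive object recognition runs only on this image.
            return recognize(img)
    return None


images = [{"q": 0.2, "obj": "dog"}, {"q": 0.9, "obj": None}, {"q": 0.8, "obj": "cat"}]
result = analyze(
    images,
    quality=lambda img: img["q"],
    is_candidate=lambda img: img["obj"] is not None,
    recognize=lambda img: img["obj"],
)
print(result)
```

Ordering the stages from cheapest to most expensive means the recognizer is invoked at most once per batch in this sketch.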

USAGE BASED RESOURCE UTILIZATION OF TRAINING POOL FOR CHATBOTS

A training request including an identifier that is indicative of a type of a machine learning (ML) model that is to be trained is received. A plurality of workers are maintained in a training pool, and a plurality of jobs are maintained in a queue of training jobs. Each worker is configured to train a particular type of ML model. Upon the training request being validated, a training job is created for the request and submitted to the queue of training jobs. For each type of ML model, a first metric and a second metric are obtained. A target metric is computed based on the first and the second metrics. The number of workers included in the training pool is modified based on the target metric.

PREDICTING MISSING ENTITY IDENTITIES IN IMAGE-TYPE DOCUMENTS
20240420497 · 2024-12-19

Techniques for predicting a missing value in an image-type document are disclosed. A system predicts the identity of a supplier associated with an image-type document in which the supplier's identity may not be extracted by text recognition. When a system determines that the supplier identity cannot be identified using a text recognition application, the system generates a set of machine learning model input features from features extracted from the image-type document to predict the supplier's identity. One input feature is a data file bounds feature indicating whether the image-type document is a scanned document or a non-scanned document. The system predicts a value for the supplier's identity based on the data file bounds value and additional feature values, including color channel characteristics and spatial characteristics of regions-of-interest. The system generates a mapping of values to defined attributes based in part on the predicted value for the supplier's identity.

Generating weighted contextual themes to guide unsupervised keyphrase relevance models

The present disclosure relates to systems, methods, and non-transitory computer readable media that utilize intelligent contextual bias weights for informing keyphrase relevance models to extract keyphrases. For example, the disclosed systems generate a graph from a digital document by mapping words from the digital document to nodes of the graph. In addition, the disclosed systems determine named entity bias weights for the nodes of the graph utilizing frequencies with which the words corresponding to the nodes appear within named entities identified from the digital document. Moreover, the disclosed systems generate a keyphrase summary for the digital document utilizing the graph and a machine learning model biased according to the named entity bias weights for the nodes of the graph.
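A graph whose nodes carry bias weights can be ranked with a personalized PageRank iteration, sketched below in Python. PageRank is swapped in here as a familiar stand-in; the abstract does not name the keyphrase relevance model it uses. The co-occurrence window, the "1 plus entity count" weighting, and all function names are assumptions for illustration.

```python
from collections import defaultdict


def build_graph(words, span=1):
    """Undirected co-occurrence graph linking each word to the next `span` words."""
    graph = defaultdict(set)
    for i, word in enumerate(words):
        for other in words[i + 1 : i + 1 + span]:
            if other != word:
                graph[word].add(other)
                graph[other].add(word)
    return graph


def entity_bias(words, entities):
    """Bias weight per word: 1 plus its count inside named entities, normalized to sum to 1."""
    counts = {word: 1.0 for word in set(words)}
    for entity in entities:
        for word in entity.split():
            if word in counts:
                counts[word] += 1.0
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def biased_rank(graph, bias, damping=0.85, iterations=50):
    """Personalized PageRank: the teleport distribution is the entity bias."""
    rank = dict(bias)
    for _ in range(iterations):
        updated = {}
        for node in bias:
            incoming = sum(rank[n] / len(graph[n]) for n in graph[node])
            updated[node] = (1 - damping) * bias[node] + damping * incoming
        rank = updated
    return rank


words = "acme corp sells acme widgets".split()
rank = biased_rank(build_graph(words), entity_bias(words, ["acme corp"]))
print(max(rank, key=rank.get))
```

Words that appear inside named entities receive a larger share of the teleport probability, so they rank higher than graph structure alone would place them.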