Patent classification: G06V30/19167
NEURAL NETWORK BASED SCENE TEXT RECOGNITION
A system uses a neural network based model to perform scene text recognition. The system achieves high prediction accuracy on text in scenes using a neural network architecture with a double attention mechanism. The model includes a convolutional neural network component that outputs a set of visual features and an attention extractor neural network component that determines attention scores from those visual features. The visual features and the attention scores are combined to generate mixed features, which are provided as input to a character recognizer component that determines a second attention score and recognizes the characters based on that second attention score. The system trains the model by adjusting the neural network parameters to minimize a multi-class gradient harmonizing mechanism (GHM) loss, which varies with the level of difficulty of each training sample.
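The harmonizing idea behind a GHM loss is to down-weight samples whose gradient norms are over-represented (typically the many easy samples) so they do not drown out the rare hard ones. A minimal sketch of that weighting; the binning scheme and weight formula here are illustrative assumptions, not the patent's exact formulation:

```python
import numpy as np

def ghm_weights(grad_norms, num_bins=10):
    """Weight each sample inversely to the density of its gradient norm.

    grad_norms: per-sample gradient norms in [0, 1] (e.g. |p - y| for the
    predicted true-class probability p). Samples in crowded bins (usually
    easy samples) get smaller weights, harmonizing the contribution of
    easy and hard samples to the loss.
    """
    grad_norms = np.asarray(grad_norms, dtype=float)
    n = len(grad_norms)
    edges = np.linspace(0.0, 1.0, num_bins + 1)
    # Assign each sample to a bin; clip so a norm of exactly 1.0 lands
    # in the last bin.
    bins = np.clip(np.digitize(grad_norms, edges) - 1, 0, num_bins - 1)
    counts = np.bincount(bins, minlength=num_bins)
    # Gradient density of a sample = population of its bin; weight is
    # n / density, so sparsely populated (hard) bins are up-weighted.
    density = counts[bins].astype(float)
    return n / density

# Example: three easy samples (small gradient norm) and one hard outlier.
w = ghm_weights([0.05, 0.06, 0.04, 0.95])
```

In a multi-class setting these weights would scale the per-sample cross-entropy terms before summation.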
REMOVAL OF SENSITIVE DATA FROM DOCUMENTS FOR USE AS TRAINING SETS
Systems and methods relating to the replacement or removal of sensitive data in images of documents. An initial image of a document containing sensitive data is received at an execution module, and changes are made based on the execution module's training. The changes include replacing or effectively removing the sensitive data from the image of the document. The resulting sanitized image is then sent to a user for validation of the changes, and the user's feedback is used to train the execution module to refine its behaviour when applying changes to other initial images of documents. To train the execution module, training data sets of document images with sensitive data manually tagged by users are used. The execution module thus learns to identify sensitive data, and its submodules replace that sensitive data with suitable replacement data. User feedback progressively improves the sanitized images the execution module produces.
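The replace-with-suitable-data step is easiest to picture on extracted text rather than pixels. A minimal rule-based sketch; the patterns and placeholder values are illustrative assumptions, since the patented execution module learns this behaviour from user-tagged training data rather than from fixed rules:

```python
import re

# Illustrative patterns; a trained execution module would learn to
# identify sensitive spans rather than rely on hand-written rules.
PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
}

DEFAULT_REPLACEMENTS = {"SSN": "000-00-0000", "EMAIL": "user@example.com"}

def sanitize(text, replacements=None):
    """Replace detected sensitive spans with suitable placeholder data."""
    replacements = replacements or DEFAULT_REPLACEMENTS
    for label, pattern in PATTERNS.items():
        text = pattern.sub(replacements[label], text)
    return text

s = sanitize("Contact jane@corp.com, SSN 123-45-6789.")
```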
Edge-based adaptive machine learning for object recognition
Examples of techniques for adaptive model training are provided. According to one or more embodiments of the present invention, a computer-implemented method for adaptive model training includes generating, by a processing system, a training instance based at least in part on a plurality of images that match a contextual specification of a target visual domain. The method further includes extracting, by the processing system, objects from one of the plurality of images. The method further includes, for each extracted object, generating, by the processing system, a plurality of machine learning model features and label recommendations for a user.
Information processing apparatus and non-transitory computer readable medium for selecting a proper version of a recognition dictionary that is not necessarily a latest version
An information processing apparatus includes a selection unit that, when a target document is to be recognized, selects either a first mode, in which the latest version of a recognition dictionary is applied, or a second mode, in which the version of the recognition dictionary with the highest correct answer rate among the versions other than the latest is applied. The correct answer rate is obtained from the recognition result and the confirmation or correction result for each of a plurality of documents.
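The mode selection reduces to a simple choice over per-version accuracy statistics. A sketch under assumed data layout and mode names (neither is specified by the abstract):

```python
def select_dictionary(versions, mode):
    """Choose which recognition-dictionary version to apply.

    versions: list of (version_id, correct_answer_rate) ordered oldest to
    newest, where each rate is computed from recognition results plus the
    user's confirmation/correction results over multiple documents.
    """
    if mode == "first":  # first mode: always apply the latest version
        return versions[-1][0]
    # Second mode: the non-latest version with the highest correct rate,
    # which may outperform the latest on this kind of document.
    return max(versions[:-1], key=lambda v: v[1])[0]

versions = [("v1", 0.91), ("v2", 0.97), ("v3", 0.95)]
```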
Asynchronous parameter aggregation for machine learning
Systems and methods are provided for training a machine learned model on a large number of devices, each device acquiring a local set of training data without sharing data sets across devices. The devices train the model on the respective device's set of training data. The devices communicate a parameter vector from the trained model asynchronously with a parameter server. The parameter server updates a master parameter vector and transmits the master parameter vector to the respective device.
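A toy sketch of the asynchronous push/pull cycle between a device and the parameter server; the mixing-based update rule is an assumption for illustration, since the abstract does not specify the aggregation formula:

```python
import numpy as np

class ParameterServer:
    """Holds the master parameter vector. Devices push their locally
    trained parameter vectors asynchronously (no raw data is shared)
    and receive the updated master vector back."""

    def __init__(self, dim, mix=0.5):
        self.master = np.zeros(dim)
        self.mix = mix  # how strongly one device's vector pulls the master

    def push_pull(self, local_params):
        # Illustrative update: move the master toward the device's
        # locally trained parameters, then return the new master.
        self.master = (1 - self.mix) * self.master + self.mix * np.asarray(local_params)
        return self.master.copy()

server = ParameterServer(dim=2)
# Two devices report in at different times, each trained on local data.
a = server.push_pull([2.0, 0.0])
b = server.push_pull([0.0, 4.0])
```

Because updates are asynchronous, the master a device receives reflects whichever pushes happened to arrive before its own.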
IMAGE PROCESSING APPARATUS CAPABLE OF RESTORING DEGRADED IMAGE WITH HIGH ACCURACY, IMAGE PROCESSING METHOD, AND STORAGE MEDIUM
An image processing apparatus that is capable of restoring a degraded image with high accuracy. The image processing apparatus acquires image data including an image of character information and identifies the character type of the character information. A learned model adapted to the identified character type is selected from a plurality of learned models, each produced by machine learning on training images associated with one of a plurality of character type conditions and their corresponding correct answer images. The image of character information is input to the selected learned model to restore it.
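Selecting the learned model by character type reduces to a registry lookup. A minimal sketch with stand-in models; the type names and registry layout are hypothetical:

```python
# Hypothetical registry mapping character-type conditions to learned
# restoration models (stand-in functions here; in practice, trained nets).
MODELS = {
    "printed_digits": lambda img: img + "|restored:digits",
    "handwritten_kanji": lambda img: img + "|restored:kanji",
}

def restore(image, character_type):
    """Pick the learned model matching the identified character type and
    run the degraded character image through it."""
    model = MODELS.get(character_type)
    if model is None:
        raise KeyError(f"no learned model for character type {character_type!r}")
    return model(image)
```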
Apparatus and methods for storing and dispensing medications
An apparatus for automated storage and dispensing of medications. Medications are stored in one or more inventory storage containers attached to a frame of the apparatus and are delivered to the apparatus via a locked delivery container. A carrier mechanism retrieves medications from the inventory storage container and the delivery container and moves them to various subsystems of the apparatus. Information related to a medication is communicated to a remote pharmacist prior to dispensing it. Multiple installations of the apparatus are centrally coordinated.
Using generative adversarial networks in compression
The compression system trains a machine-learned encoder and decoder through an autoencoder architecture. The encoder can be deployed by a sender system to encode content for transmission to a receiver system, and the decoder can be deployed by the receiver system to decode the encoded content and reconstruct the original content. The encoder is coupled to receive content and output a tensor as a compact representation of the content. The content may be, for example, images, videos, or text. The decoder is coupled to receive a tensor representing content and output a reconstructed version of the content. The compression system trains the autoencoder with a discriminator to reduce compression artifacts in the reconstructed content. The discriminator is coupled to receive one or more items of input content and output a discrimination prediction of whether the input content is the original or the reconstructed version of the content.
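The training objective pairs a distortion term with the discriminator's adversarial signal. A sketch of the two losses; the exact loss forms and the adversarial weight are assumptions, not taken from the abstract:

```python
import numpy as np

def compression_losses(original, reconstructed,
                       disc_pred_real, disc_pred_fake, adv_weight=0.1):
    """Illustrative losses for an autoencoder compressor trained with a
    discriminator.

    disc_pred_real / disc_pred_fake: the discriminator's predicted
    probability that the original / reconstructed content is original.
    """
    distortion = float(np.mean(
        (np.asarray(original) - np.asarray(reconstructed)) ** 2))
    # Discriminator loss: label originals 1, reconstructions 0
    # (binary cross-entropy on its discrimination predictions).
    d_loss = -np.log(disc_pred_real) - np.log(1 - disc_pred_fake)
    # Encoder/decoder loss: keep distortion low while fooling the
    # discriminator, which pushes down compression artifacts.
    g_loss = distortion + adv_weight * (-np.log(disc_pred_fake))
    return d_loss, g_loss
```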
DOCUMENT PROCESSING USING HYBRID RULE-BASED ARTIFICIAL INTELLIGENCE (AI) MECHANISMS
A hybrid rule-based Artificial Intelligence (AI) document processing system processes a non-editable document including at least one invoice to accurately extract data from tables in the invoices. The non-editable document is preprocessed for conversion into a markup format, and the pages containing the invoice are identified. The invoice is processed via a document process, which parses the pages in different directions to generate a first set of predictions, and via a block process, wherein logical information blocks from the invoice are processed to generate a second set of predictions. Missing entries in a selected table are identified by applying rules to the first and second sets of predictions. Any discrepancies between the two sets of predicted values for the missing entries are resolved, and the resulting data is exported to downstream systems for further use.
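Merging the two prediction sets can be sketched as a rule-based reconciliation; the specific discrepancy rule below (prefer the document-process value) is an illustrative assumption:

```python
def resolve_table(doc_preds, block_preds):
    """Merge two prediction sets for a table's fields, filling entries
    one pass missed and resolving disagreements by rule.

    doc_preds / block_preds: field -> predicted value from the document
    process and the block process respectively.
    """
    merged = {}
    for field in set(doc_preds) | set(block_preds):
        a, b = doc_preds.get(field), block_preds.get(field)
        if a is not None and b is not None:
            # Discrepancy rule (assumed): prefer the document-process value.
            merged[field] = a
        else:
            # One process missed this entry; take whichever found it.
            merged[field] = a if a is not None else b
    return merged
```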
Edge-based adaptive machine learning for object recognition
Examples of techniques for adaptive object recognition for a target visual domain given a generic machine learning model are provided. According to one or more embodiments of the present invention, a computer-implemented method for adaptive object recognition for a target visual domain given a generic machine learning model includes creating, by a processing device, an adapted model and identifying classes of the target visual domain using the generic machine learning model. The method further includes creating, by the processing device, a domain-constrained machine learning model based at least in part on the generic machine learning model such that the domain-constrained machine learning model is restricted to recognize only the identified classes of the target visual domain. The method further includes computing, by the processing device, a recognition result based at least in part on combining predictions of the domain-constrained machine learning model and the adapted model.
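Combining the two models' predictions can be sketched as a weighted vote over the identified target-domain classes; the averaging rule and the weight alpha are assumptions for illustration:

```python
import numpy as np

def combined_recognition(constrained_probs, adapted_probs, alpha=0.5):
    """Combine class probabilities from the domain-constrained model and
    the adapted model, then return the top class index.

    Both probability vectors range only over the classes identified for
    the target visual domain, per the domain constraint.
    """
    scores = (alpha * np.asarray(constrained_probs)
              + (1 - alpha) * np.asarray(adapted_probs))
    return int(np.argmax(scores))
```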