CONTINUOUS MACHINE LEARNING METHOD AND SYSTEM FOR INFORMATION EXTRACTION
Methods and systems for artificial intelligence (AI)-assisted document annotation and training of machine learning-based models for document data extraction are described. The methods and systems described herein take advantage of a continuous machine learning approach to create document processing pipelines that provide accurate and efficient data extraction from documents that include structured text, semi-structured text, unstructured text, or any combination thereof.
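The continuous-learning pipeline described above can be sketched as a predict/review/retrain loop. This is a minimal illustrative sketch, not the patent's actual implementation: the `KeywordExtractor`, its keyword-before-value heuristic, and the `review` callback are all assumptions introduced for illustration.

```python
class KeywordExtractor:
    """Toy extractor: for each field, learns the word that precedes the value."""
    def __init__(self):
        self.patterns = {}  # field name -> keyword that precedes its value

    def train(self, annotated_docs):
        for doc, labels in annotated_docs:
            words = doc.split()
            for field, value in labels.items():
                if value in words:
                    i = words.index(value)
                    if i > 0:
                        self.patterns[field] = words[i - 1]

    def extract(self, doc):
        words = doc.split()
        out = {}
        for field, keyword in self.patterns.items():
            if keyword in words:
                i = words.index(keyword)
                if i + 1 < len(words):
                    out[field] = words[i + 1]
        return out


def continuous_loop(extractor, corpus, review):
    """One iteration of continuous learning: predict, collect human
    corrections via the review callback, then retrain on the corrections."""
    new_annotations = []
    for doc in corpus:
        pred = extractor.extract(doc)
        corrected = review(doc, pred)   # human-in-the-loop annotation
        new_annotations.append((doc, corrected))
    extractor.train(new_annotations)    # model improves with each pass
    return extractor
```

Each review pass feeds corrected annotations back into training, so extraction accuracy can improve as more documents flow through the pipeline.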
CHARACTER RECOGNITION MODEL TRAINING METHOD AND APPARATUS, CHARACTER RECOGNITION METHOD AND APPARATUS, DEVICE AND STORAGE MEDIUM
The present disclosure provides a character recognition model training method and apparatus, a character recognition method and apparatus, a device, and a storage medium, relating to the technical field of artificial intelligence, specifically to the fields of deep learning, image processing, and computer vision, and applicable to scenarios such as character detection and recognition. The specific implementation is: partitioning an untagged training sample into at least two sub-sample images; dividing the at least two sub-sample images into a first training set and a second training set, where the first training set includes a first sub-sample image with a visible attribute and the second training set includes a second sub-sample image with an invisible attribute; and performing self-supervised training on a to-be-trained encoder, using the second training set as the label for the first training set, to obtain a target encoder.
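The data-preparation step above (partition an untagged image into sub-samples, then split them into a visible set and an invisible set that serves as the self-supervised target) can be sketched as follows. This is an illustrative sketch only; the function name, patch geometry, and mask ratio are assumptions, not the patent's claimed parameters.

```python
import numpy as np

def partition_and_split(image, patch, mask_ratio=0.5, seed=0):
    """Partition an untagged 2-D image into patch-sized sub-sample images,
    then randomly split them into a visible set (encoder input) and a
    masked/invisible set (used as the self-supervised training target)."""
    h, w = image.shape
    patches = [image[r:r + patch, c:c + patch]
               for r in range(0, h, patch)
               for c in range(0, w, patch)]
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(patches))
    n_masked = int(len(patches) * mask_ratio)
    masked = [patches[i] for i in idx[:n_masked]]    # invisible: acts as the "tag"
    visible = [patches[i] for i in idx[n_masked:]]   # visible: fed to the encoder
    return visible, masked
```

An encoder would then be trained to reconstruct the masked patches from the visible ones, in the spirit of masked-image self-supervision.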
System and method for providing an interactive visual learning environment for creation, presentation, sharing, organizing and analysis of knowledge on subject matter
The embodiments herein disclose a system and a method for providing an online web-based interactive audio-visual platform for note creation, presentation, sharing, organizing, and analysis. The system provides a conceptual and interactive interface to content; it analyses a student's notes and instantly determines the accuracy of the conceptual connections made and the student's understanding of a topic. The system enables the student to add and use audio, visual, drawing, and text notes and mathematical equations in addition to those suggested by the note-taking solution; to collate notes from various sources in a meaningful manner by grouping concepts using colors, images, and text; and to personalize other maps developed within the same environment while maintaining links back to the original source from which the notes are derived. The system highlights keywords in conjunction with spoken text to complement the advantages of using visual maps to improve learning outcomes.
System and method for multi-modal image classification
Systems and methods for classifying images (e.g., ads) are described. An image is accessed. Optical character recognition is performed on at least a first portion of the image. Image recognition is performed via a convolutional neural network on at least a second portion of the image. At least one class for the image is automatically identified, via a fully connected neural network, based on one or more predictions, each of the one or more predictions being based on both the optical character recognition and the image recognition. Finally, the at least one class identified for the image is output.
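The fusion step above (OCR on one portion of the image, CNN features on another, combined by a fully connected layer) can be sketched in miniature. Everything here is a stand-in: the bag-of-words text features, the per-channel pixel means in place of real CNN features, and the single dense layer are illustrative assumptions, not the patent's architecture.

```python
import numpy as np

def text_features(ocr_text, vocab):
    # Bag-of-words over the OCR output of one image portion.
    words = ocr_text.lower().split()
    return np.array([words.count(w) for w in vocab], dtype=float)

def image_features(pixels):
    # Stand-in for CNN features: per-channel pixel means.
    return pixels.reshape(-1, pixels.shape[-1]).mean(axis=0)

def classify(ocr_text, pixels, vocab, W, b):
    # Fuse both modalities and score classes with one fully connected layer.
    x = np.concatenate([text_features(ocr_text, vocab),
                        image_features(pixels)])
    logits = W @ x + b
    return int(np.argmax(logits))
```

The key point the sketch illustrates is that each prediction depends jointly on the OCR-derived features and the image-derived features, rather than on either modality alone.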
Entity extraction with encoder decoder machine learning model
A method includes executing an encoder machine learning model on multiple token values contained in a document to create an encoder hidden state vector. A decoder machine learning model executing on the encoder hidden state vector generates raw text comprising an entity value and an entity label for each of multiple entities. The method further includes generating a structural representation of the entities directly from the raw text and outputting the structural representation of the entities of the document.
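The final step, building a structural representation directly from the decoder's raw text, can be sketched under an assumed output format. The `value <label>` layout below is an illustrative convention, not the format claimed in the patent.

```python
def parse_entities(raw_text):
    """Build a structural representation of entities directly from decoder
    raw text, assuming the (illustrative) format 'value <label> value <label>'."""
    entities = {}
    value_tokens = []
    for tok in raw_text.split():
        if tok.startswith("<") and tok.endswith(">"):
            label = tok[1:-1]
            entities.setdefault(label, []).append(" ".join(value_tokens))
            value_tokens = []
        else:
            value_tokens.append(tok)
    return entities
```

A multi-word entity value is accumulated until its label tag closes it, so the structural output groups values under their entity labels.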
Information processing apparatus and non-transitory computer readable medium storing program
An information processing apparatus includes a processor configured to acquire a first recognition result and a first recognition probability on target data from a first recognizer, acquire a second recognition result and a second recognition probability on the target data from a second recognizer, execute checking of the first recognition result and the second recognition result, and execute first control in a case where the first recognition result and the second recognition result match each other as a result of the checking. The first control is control for executing either of first processing or second processing on the matched recognition result and outputting a processing result based on at least one of the first recognition probability or the second recognition probability. A human workload for the first processing is smaller than a human workload for the second processing.
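The control flow above can be sketched as a routing function: when the two recognizers agree, the matched result is sent to either the lighter first processing or the heavier second processing based on the recognition probabilities. The threshold rule and the use of the maximum probability are assumptions made for illustration; the patent only requires that the choice depend on at least one of the two probabilities.

```python
def route(result1, prob1, result2, prob2, threshold=0.9):
    """Check two recognizers' results; on a match, route to first processing
    (smaller human workload) when confidence is high, otherwise to second
    processing (larger human workload). Threshold rule is illustrative."""
    if result1 != result2:
        return ("mismatch", None)
    confidence = max(prob1, prob2)   # "at least one of" the two probabilities
    if confidence >= threshold:
        return ("first_processing", result1)
    return ("second_processing", result1)
```

The effect is to reserve the more labor-intensive verification path for matched results the recognizers are less sure about.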
OPTICAL CHARACTER RECOGNITION TRAINING WITH SEMANTIC CONSTRAINTS
A method, computer system, and a computer program product for optical character recognition training are provided. A text image and plain text labels for the text image may be received. The text image may include words. The plain text labels may include machine-encoded text corresponding to the words. Semantic feature vectors for the words, respectively, may be generated based on the plain text label. The text image, the plain text labels, and the semantic feature vectors may be input together into a machine learning model to train the machine learning model for optical character recognition. The plain text labels and the semantic feature vectors may be constraints for the training.
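One plausible way to realize "plain text labels and semantic feature vectors as constraints" is a combined training loss: a recognition term on the text labels plus a semantic term pulling the model toward the precomputed semantic vectors. The loss form and the weighting factor `alpha` below are assumptions for illustration, not the patent's stated objective.

```python
import numpy as np

def combined_loss(pred_char_probs, label_ids, pred_semantic, semantic_vec, alpha=0.5):
    """Sketch of a constrained OCR training loss: cross-entropy against the
    plain-text labels plus a mean-squared semantic constraint against the
    word's semantic feature vector (weighting alpha is illustrative)."""
    ce = -np.mean(np.log(pred_char_probs[np.arange(len(label_ids)), label_ids]))
    sem = np.mean((pred_semantic - semantic_vec) ** 2)
    return ce + alpha * sem
```

Minimizing this loss trains the recognizer to produce the correct machine-encoded text while keeping its internal representation consistent with the word's semantics.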
IMAGE PROCESSING APPARATUS, IMAGE PROCESSING METHOD, AND STORAGE MEDIUM
A training image is generated in accordance with the way a hane (a short flick at the end of a stroke, found in actual handwriting) occurs. Among the line segments constituting a handwritten character in a character image, a line segment at which a handwritten hane may occur is detected. A training image is then generated by adding a simulated hane to the end portion of the detected line segment.
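The augmentation step, adding a simulated hane to the end of a detected line segment, can be sketched on a binary pixel grid. The diagonal-tail geometry and its length are illustrative assumptions; the patent covers generating a hane in accordance with how one actually occurs in handwriting.

```python
import numpy as np

def add_simulated_hane(img, end_row, end_col, length=2):
    """Add a simulated hane at the end of a detected line segment by
    drawing a short upward-right diagonal tail (geometry is illustrative)."""
    out = img.copy()
    for i in range(1, length + 1):
        r, c = end_row - i, end_col + i   # flick up and to the right
        if 0 <= r < out.shape[0] and 0 <= c < out.shape[1]:
            out[r, c] = 1
    return out
```

Applying this to character images without natural hane yields additional training images that reflect the stroke-end variation a recognizer will meet in real handwriting.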
Systems and methods for domain agnostic document extraction with zero-shot task transfer
A system for performing document extraction is configured to: (a) receive a first document; (b) extract the first document into document elements, the document elements including pages, lines, paragraphs, or any combination thereof; (c) determine a first set of fields of interest for the first document, wherein the first set of fields of interest are determined via a type of the first document or via a first set of queries for probing the first document; (d) determine, from a plurality of closed domain question answering (CDQA) models, a first set of CDQA models that provides answers to each field of interest included in the first set of fields of interest; and (e) provide answers to the first set of fields of interest to the client device.
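Steps (c) through (e) amount to probing a pool of closed-domain question answering (CDQA) models with each field of interest and keeping models that can answer it. The first-model-that-answers selection strategy and the callable model interface below are illustrative assumptions.

```python
def extract_fields(document, fields, cdqa_models):
    """For each field of interest, probe the CDQA model pool and keep the
    first model that returns an answer (illustrative selection strategy).
    Each model is a callable: model(field, document) -> answer or None."""
    answers = {}
    for field in fields:
        for model in cdqa_models:
            ans = model(field, document)
            if ans is not None:
                answers[field] = ans
                break
    return answers
```

A toy CDQA model for key-value text might be `lambda f, d: d.split(f + ": ")[1].split()[0] if f + ": " in d else None`, which lets the same pool serve documents of different types, matching the zero-shot, domain-agnostic framing.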
METHOD OF FEDERATED LEARNING, ELECTRONIC DEVICE, AND STORAGE MEDIUM
A display method, an electronic device, and a storage medium are provided, relating to the fields of natural language processing and display. The display method includes: acquiring a content to be displayed; extracting a target term from the content using a term extraction rule; acquiring annotation information for at least one target term, responsive to an extraction of the at least one target term; and displaying the annotation information for the at least one target term together with the content.
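The extract-then-annotate flow above can be sketched with a simple glossary-match extraction rule. Both the rule and the glossary lookup are illustrative assumptions; the patent leaves the term extraction rule unspecified.

```python
import re

def annotate_content(content, glossary):
    """Extract target terms from the content using a simple rule
    (whole-word glossary match, illustrative) and acquire annotation
    information for each extracted term for display alongside the content."""
    terms = [t for t in glossary
             if re.search(r"\b" + re.escape(t) + r"\b", content)]
    annotations = {t: glossary[t] for t in terms}
    return terms, annotations
```

The returned annotations would then be rendered next to, or overlaid on, the displayed content.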