Patent classifications
G06V30/148
GEOGRAPHIC MANAGEMENT OF DOCUMENT CONTENT
Methods and systems are provided to manage documents and extract information from documents by defining segments in each document, each of which is assigned a location in a coordinate system defined over a collection of documents. Metadata is attached to each segment to describe the contents, position, and semantic meaning of material within the segment. A segmenting-specific query language can be used to query the segments and respond to requests for information contained in the documents.
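The abstract above describes segments placed in a collection-wide coordinate system, each carrying metadata, plus a query facility over that metadata. A minimal sketch of that idea follows; the `Segment` fields and the keyword-matching `query` helper are illustrative assumptions, not the patented query language.

```python
from dataclasses import dataclass, field

@dataclass
class Segment:
    # Location of the segment in a coordinate system defined over the collection.
    doc_id: str
    x: int
    y: int
    # Metadata describing the contents, position, and semantic meaning
    # of the material within the segment.
    metadata: dict = field(default_factory=dict)

def query(segments, **criteria):
    """Return segments whose metadata matches every given key/value pair."""
    return [s for s in segments
            if all(s.metadata.get(k) == v for k, v in criteria.items())]

segments = [
    Segment("doc1", 0, 0, {"type": "title", "text": "Annual Report"}),
    Segment("doc1", 0, 1, {"type": "body", "text": "Revenue grew."}),
    Segment("doc2", 1, 0, {"type": "title", "text": "Quarterly Update"}),
]
titles = query(segments, type="title")
```

A real implementation would back this with an index rather than a linear scan, but the shape of the lookup is the same.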
Text classification
A text classifying apparatus (100), an optical character recognition unit (1), a text classifying method (S220) and a program are provided for performing the classification of text. A segmentation unit (110) segments an image into a plurality of lines of text (401-412; 451-457; 501-504; 701-705) (S221). A selection unit (120) selects a line of text from the plurality of lines of text (S222-S223). An identification unit (130) identifies a sequence of classes corresponding to the selected line of text (S224). A recording unit (140) records, for the selected line of text, a global class corresponding to a class of the sequence of classes (S225-S226). A classification unit (150) classifies the image according to the global class, based on a confidence level of the global class (S227-S228).
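The pipeline above (segment into lines, identify a class sequence per line, record a global class per line, classify the image by confidence) can be sketched as follows. The `predict_classes` callable and the majority-vote confidence rule are stand-in assumptions for the patented units.

```python
from collections import Counter

def classify_image(lines, predict_classes, threshold=0.6):
    """Derive a global class for each line from its per-character class
    sequence, then classify the whole image by the most frequent global
    class if its confidence (relative frequency) clears the threshold."""
    global_classes = []
    for line in lines:
        seq = predict_classes(line)  # sequence of classes for the line
        global_classes.append(Counter(seq).most_common(1)[0][0])
    label, count = Counter(global_classes).most_common(1)[0]
    confidence = count / len(global_classes)
    return label if confidence >= threshold else None
```

For example, with a toy classifier that labels each character `digit` or `alpha`, an image whose lines are mostly numeric is classified `digit`; when no global class is confident enough, `None` is returned.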
Optical character recognition method and apparatus, electronic device and storage medium
The present application discloses a method and an apparatus for optical character recognition, an electronic device and a storage medium, and relates to the fields of artificial intelligence and deep learning. The method may include: determining, for a to-be-recognized image, a text bounding box of a text area therein, and extracting a text area image from the to-be-recognized image according to the text bounding box; determining a bounding box of text lines in the text area image, and extracting a text-line image from the text area image according to the bounding box; and performing text sequence recognition on the text-line image, and obtaining a recognition result. The application of the solution in the present application can improve a recognition speed and the like.
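The two-stage cropping strategy described above (detect a text area, crop it, then detect and crop individual text lines before sequence recognition) can be sketched like this. The three detector/recognizer callables and the row-of-strings image format are illustrative assumptions.

```python
def crop(img, box):
    """Crop a region from an image represented as a list of row strings."""
    top, bottom, left, right = box
    return [row[left:right] for row in img[top:bottom]]

def recognize(image, detect_text_areas, detect_lines, recognize_line):
    """Two-stage OCR pipeline: crop text areas first, then crop individual
    text lines within each area, and run sequence recognition per line."""
    results = []
    for area_box in detect_text_areas(image):
        area = crop(image, area_box)
        for line_box in detect_lines(area):
            results.append(recognize_line(crop(area, line_box)))
    return results
```

Cropping to the text area before line detection is what yields the claimed speed-up: the line detector and recognizer operate on much smaller images.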
Information processing apparatus, control method, and recording medium storing program
An information processing apparatus includes: a determiner that determines an area including a handwritten figure from image data; a recognizer that recognizes a handwritten character from the handwritten figure; an acquirer that acquires a file name; and a file generator that generates a file named after the handwritten character when the recognizer recognizes one from the image data, and otherwise generates a file with the file name acquired by the acquirer.
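The file-generator's fallback logic reduces to a simple branch: use the recognized handwritten text as the file name when recognition succeeds, otherwise use the acquired name. A minimal sketch, with both callables as assumptions:

```python
def generate_file_name(image_data, recognize_handwriting, acquire_default_name):
    """Name the file after the recognized handwritten character(s) when
    recognition succeeds; otherwise fall back to the acquired name
    (e.g. a date- or counter-based default)."""
    text = recognize_handwriting(image_data)
    return text if text else acquire_default_name()
```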
Predictive data analysis using image representations of categorical data to determine temporal patterns
There is a need for more effective and efficient predictive data analysis solutions and/or more effective and efficient solutions for generating image representations of categorical data. In one example, embodiments comprise receiving a categorical input feature, generating an image representation of the categorical input feature, generating an image-based prediction based at least in part on the image representation, and performing one or more prediction-based actions based at least in part on the image-based prediction.
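One way to picture "an image representation of a categorical input feature" is to lay category indices out on a fixed-size grayscale grid that a CNN could then consume. The encoding below is a toy stand-in for the patented scheme; the grid layout and intensity scaling are assumptions.

```python
def categorical_to_image(values, vocab, size=4):
    """Encode a sequence of categorical values as a size x size grayscale
    grid: each cell holds the vocabulary index of one value, scaled into
    the 0-255 range. Unused cells stay 0."""
    grid = [[0] * size for _ in range(size)]
    scale = 255 // max(1, len(vocab) - 1)
    for i, v in enumerate(values[: size * size]):
        grid[i // size][i % size] = vocab.index(v) * scale
    return grid
```

Once categorical data is in image form, standard image classifiers can be used to generate the image-based prediction the abstract refers to.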
METHODS, SYSTEMS, ARTICLES OF MANUFACTURE AND APPARATUS TO DECODE RECEIPTS BASED ON NEURAL GRAPH ARCHITECTURE
Methods, apparatus, systems, and articles of manufacture are disclosed to decode receipts based on neural graph architecture. An example apparatus for decoding receipts includes: vertex feature representation circuitry to extract features from optical-character-recognition (OCR) words; polar coordinate circuitry to calculate polar coordinates of the OCR words based on respective ones of the extracted features; graph neural network circuitry to generate an adjacency matrix based on the extracted features; post-processing circuitry to traverse the adjacency matrix to generate cliques of OCR-processed words; and output circuitry to generate lines of text based on the cliques of OCR-processed words.
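Two of the steps above are easy to sketch: converting word-box centres to polar coordinates, and traversing an adjacency matrix to group words into text lines. In this simplification the "cliques" are connected components; the centre-of-page origin and the dense 0/1 matrix are assumptions, and the adjacency matrix would come from a trained graph neural network rather than being hand-written.

```python
import math

def polar_coordinates(centres):
    """Convert OCR word-box centres to polar coordinates about their centroid."""
    cx = sum(x for x, y in centres) / len(centres)
    cy = sum(y for x, y in centres) / len(centres)
    return [(math.hypot(x - cx, y - cy), math.atan2(y - cy, x - cx))
            for x, y in centres]

def lines_from_adjacency(words, adj):
    """Traverse a 0/1 adjacency matrix, group connected words, and emit
    one text line per connected group (in original word order)."""
    seen, lines = set(), []
    for i in range(len(words)):
        if i in seen:
            continue
        stack, component = [i], []
        while stack:
            j = stack.pop()
            if j in seen:
                continue
            seen.add(j)
            component.append(j)
            stack.extend(k for k, linked in enumerate(adj[j])
                         if linked and k not in seen)
        lines.append(" ".join(words[j] for j in sorted(component)))
    return lines
```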
METHODS, SYSTEMS, ARTICLES OF MANUFACTURE, AND APPARATUS FOR DECODING PURCHASE DATA USING AN IMAGE
Methods, apparatus, systems, and articles of manufacture are disclosed that decode purchase data using an image. An example apparatus includes processor circuitry to execute machine readable instructions to at least crop an image of a receipt based on detected regions of interest, apply a first mask to a first cropped image to generate first bounding boxes corresponding to rows of the receipt, apply a second mask to a second cropped image to generate second bounding boxes corresponding to columns of the receipt, generate a structure of the receipt by mapping words detected by an optical character recognition engine to corresponding first bounding boxes and second bounding boxes based on a mapping criterion, classify the second bounding boxes by identifying an expression of interest in ones of the second bounding boxes, and generate purchase information by extracting text of interest from the structured receipt based on the classifications.
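The core of the structure-generation step is mapping each OCR word to the row box and column box that contain its centre. A minimal sketch under the assumption that rows and columns are given as 1-D coordinate spans and each word carries its centre point:

```python
def map_words_to_cells(words, row_spans, col_spans):
    """Assign each OCR word (cx, cy, text) to the row whose vertical span
    contains cy and the column whose horizontal span contains cx,
    reconstructing the receipt's row/column structure as a cell table."""
    def find(spans, coord):
        for idx, (lo, hi) in enumerate(spans):
            if lo <= coord < hi:
                return idx
        return None  # word falls outside every span; drop it

    table = {}
    for cx, cy, text in words:
        r, c = find(row_spans, cy), find(col_spans, cx)
        if r is not None and c is not None:
            table[(r, c)] = text
    return table
```

Once words sit in `(row, column)` cells, classifying a column (e.g. as prices, by spotting an expression of interest) and extracting purchase information become lookups over the table.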
CHARACTER RECOGNITION OF LICENSE PLATE UNDER COMPLEX BACKGROUND
A system, method, and computer program product provide a way to separate connected or adhered adjacent characters in a digital image for license plate recognition. As threshold processing, the method recognizes character adhesion by obtaining character parameters using an image processor; the parameters include a horizontal max crossing and a width-to-height ratio. A first rule-based module, responsive to the character parameters, distinguishes the adhered characters (character adhesions) that are easy to judge, leaving the uncertain cases to a character adhesion classifier model for discrimination. Character adhesion data is obtained by data augmentation, including adding a random distance between two single characters to create adhesion-like samples. The character adhesion classifier model is then trained on single-character and character-adhesion data, and the uncertain cases are distinguished by the trained model.
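The rule-based first stage and the augmentation step both lend themselves to short sketches. The thresholds below are illustrative assumptions, not values from the patent, and the "image strips" are simple row-of-characters stand-ins for pixel data.

```python
import random

def is_adhesion(width, height, horizontal_max_crossing,
                ratio_threshold=1.0, crossing_threshold=4):
    """Rule-based check: a glyph wider than it is tall, or with many
    horizontal stroke crossings, is judged an adhesion of two characters.
    Cases near the thresholds would go to the trained classifier instead."""
    if width / height > ratio_threshold:
        return True
    if horizontal_max_crossing >= crossing_threshold:
        return True
    return False

def make_adhesion_sample(char_a, char_b, max_gap=2):
    """Data augmentation: join two single-character strips with a random
    small gap to synthesise an adhered-characters training sample."""
    gap = random.randint(0, max_gap)
    return [ra + " " * gap + rb for ra, rb in zip(char_a, char_b)]
```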
System and method for multi-modal image classification
Systems and methods for classifying images (e.g., ads) are described. An image is accessed. Optical character recognition is performed on at least a first portion of the image. Image recognition is performed via a convolutional neural network on at least a second portion of the image. At least one class for the image is automatically identified, via a fully connected neural network, based on one or more predictions, each of the one or more predictions being based on both the optical character recognition and the image recognition. Finally, the at least one class identified for the image is output.
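The fusion step described above, in which a fully connected network combines OCR-based and image-based signals into one prediction, can be sketched as a late-fusion head. The three callables are assumptions standing in for the real OCR front end, convolutional network, and fully connected network.

```python
def classify_multimodal(image, ocr_features, cnn_features, fused_head):
    """Late-fusion sketch: concatenate the OCR-derived and CNN-derived
    feature vectors and feed the result to a fully connected head that
    returns per-class scores; the arg-max class is the output."""
    features = ocr_features(image) + cnn_features(image)
    scores = fused_head(features)
    return max(range(len(scores)), key=scores.__getitem__)
```

Because each prediction sees both modalities, text-heavy and image-heavy inputs (such as ads) can be classified by a single head.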