Patent classifications
G06V30/1607
OPTICAL CHARACTER RECOGNITION OF SERIES OF IMAGES
Systems and methods for performing OCR of a series of images depicting text symbols. An example method comprises performing OCR of a series of images to produce a current symbol sequence and corresponding symbol sequence quadrangle; associating the current symbol sequence with a previous symbol sequence for a previously received image; identifying a median string; determining a median symbol sequence quadrangle; and displaying, using the median symbol sequence quadrangle, a resulting OCR text representing at least a portion of the original document.
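The median-string step above is commonly formulated as a set median: among the symbol sequences recognized across the image series, pick the one minimizing total edit distance to all the others. A minimal sketch, assuming Levenshtein distance as the metric (the function names are illustrative, not from the patent):

```python
def edit_distance(a: str, b: str) -> int:
    # classic dynamic-programming Levenshtein distance
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def median_string(candidates: list[str]) -> str:
    # set median: the candidate with minimal total edit distance to the set
    return min(candidates, key=lambda s: sum(edit_distance(s, t) for t in candidates))
```

Restricting the search to the observed candidates keeps the step cheap; a true generalized median over all strings would be far more expensive.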
CHARACTER AUTHENTICITY DETERMINATION
A computer-implemented method for assessing if a character in a sample image is formed from a predefined selection of characters, comprising: processing a sample image with an alignment network to form a corrective transformation; applying the corrective transformation to the sample image to form a transformed image; computing a similarity of the transformed image with a corresponding reference image of a character from a predefined selection of characters to form a similarity score; and declaring the sample image not to comprise the character from the predefined selection of characters if the similarity score is less than a threshold.
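The decision step above (similarity score against a reference image, thresholded) can be sketched with normalized cross-correlation as an assumed similarity measure on flattened grayscale pixels; the alignment network and corrective transformation are omitted, and the threshold value is illustrative:

```python
import math

def ncc(a: list[float], b: list[float]) -> float:
    # normalized cross-correlation of two equal-length flattened pixel lists
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    da = math.sqrt(sum((x - ma) ** 2 for x in a))
    db = math.sqrt(sum((y - mb) ** 2 for y in b))
    return num / (da * db) if da and db else 0.0

def is_authentic(transformed: list[float], reference: list[float],
                 threshold: float = 0.8) -> bool:
    # declare the sample NOT to be the reference character when
    # similarity falls below the threshold
    return ncc(transformed, reference) >= threshold
```

In the claimed method the sample would first be warped by the transformation predicted by the alignment network; only the compare-and-threshold tail is shown here.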
CONVERTING AN IMAGE INTO A STRUCTURED TABLE
- Gopalakrishnan Venkateswaran
- Tumu Sree Bharath
- Jeet Mukeshkumar Patel
- Ajit Kumar Singh
- Milos Lazarevic
- Dhiresh Kumar Nagwani
- Abhas Sinha
- Ivan Vujic
- Naresh Jain
- Sanjay Krupakar Bhat
- Aleksandar Sretenovic
- Tamara Paunovic
- Aljosa Obuljen
- Sasa Vuckovic
- Dusan Lukic
- Catherine William Neylan
- Marko Rakita
A system for converting an image of an unstructured table into a structured table is provided. The system may comprise a memory storing machine readable instructions. The system may include a processor to receive an image of an unstructured table and convert the image of the unstructured table into a structured table. Converting the image of the unstructured table into the structured table may include providing cell mapping and low confidence determination to highlight potentially misconverted content. The low confidence determination may be based on a first input and a second input. The processor may export the structured table, upon validation, to an application that supports structured tables.
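The low-confidence determination above combines two inputs per cell. A minimal sketch, assuming the two inputs are an OCR confidence and a cell-mapping (layout) confidence, combined conservatively with `min` — both the representation and the combination rule are assumptions, not from the patent:

```python
def flag_low_confidence(cells: list[tuple[str, float, float]],
                        threshold: float = 0.6) -> list[str]:
    # each cell: (text, ocr_conf, layout_conf); flag the text of any cell
    # whose combined confidence falls below the threshold, so the UI can
    # highlight potentially misconverted content before export
    flagged = []
    for text, ocr_conf, layout_conf in cells:
        combined = min(ocr_conf, layout_conf)  # conservative combination (assumption)
        if combined < threshold:
            flagged.append(text)
    return flagged
```

After the user validates or corrects the flagged cells, the table can be exported to a structured-table application.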
Computer-based systems and methods for recognizing and correcting distorted text in facsimile documents
A method includes passing an original text document through distortion filter generators to generate a training dataset that includes distorted text documents. Each distortion filter generator is configured to distort words or letters of words in phrases of text of a facsimile image in a respective unique manner. A neural network model is trained to recognize each respective distortion and match each respective distortion with each respective distortion filter generator based on the training dataset and the original text document. Image data of one facsimile having at least one text distortion is received and inputted to the trained neural network model. The output of the trained neural network model is coupled to an input of an optical character recognition (OCR) engine. The trained neural network model and the OCR engine convert the received image data of the incoming facsimile corrected for the at least one text distortion to machine-encoded text.
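The training-data step above runs the original document through several distortion filter generators, each distorting in a unique, identifiable way. A text-level sketch of the idea (the real patent distorts facsimile *images*; the filter names and the specific distortions here are invented for illustration):

```python
import random

def drop_filter(text: str, rng: random.Random) -> str:
    # distortion 1: randomly drop ~10% of characters
    return "".join(c for c in text if rng.random() > 0.1)

def lookalike_filter(text: str, rng: random.Random) -> str:
    # distortion 2: substitute visually similar glyphs
    table = {"O": "0", "l": "1", "S": "5"}
    return "".join(table.get(c, c) for c in text)

FILTERS = {"drop": drop_filter, "lookalike": lookalike_filter}

def build_training_set(lines: list[str], seed: int = 0):
    # pair each distorted line with the name of the filter that produced it
    # and the clean original, so a model can learn to identify the distortion
    rng = random.Random(seed)
    return [(name, f(line, rng), line)
            for line in lines for name, f in FILTERS.items()]
```

Each triple (filter name, distorted text, original text) supplies both the recognition target and the distortion label the trained model must match.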
METHOD AND APPARATUS FOR DETERMINING INFORMATION ABOUT A DRUG-CONTAINING VESSEL
Information about a drug-containing vessel is determined by capturing image data of the curved surface of a cylindrical portion of a drug-containing vessel. The image data is unfurled from around the curved surface and binarised, and a template matching algorithm is employed to determine that the label information comprises candidate information about the vessel and/or the drug.
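The unfurling step above has a simple geometric core: under an assumed orthographic projection, a point at arc length s along the cylinder's circumference appears at x = R·sin(s/R) in the flat image, so each column of the unfurled output samples the input at that x. A sketch of the column mapping (the visible-half assumption and function name are illustrative):

```python
import math

def unfurl_columns(radius: float, n_out: int) -> list[float]:
    # for each output column (equal arc-length steps across the visible
    # half of the cylinder), compute the source x coordinate in the flat
    # image, assuming orthographic projection: x = R * sin(s / R)
    half_arc = (math.pi / 2) * radius  # a quarter turn visible each side
    cols = []
    for k in range(n_out):
        s = -half_arc + (2 * half_arc) * k / (n_out - 1)
        cols.append(radius * math.sin(s / radius))
    return cols
```

Resampling the captured image at these x positions stretches the compressed label edges back to uniform width before binarisation and template matching.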
Neural network based scene text recognition
A system uses a neural network based model to perform scene text recognition. The system achieves high accuracy of prediction of text from scenes based on a neural network architecture that uses a double attention mechanism. The neural network based model includes a convolutional neural network component that outputs a set of visual features and an attention extractor neural network component that determines attention scores based on the visual features. The visual features and the attention scores are combined to generate mixed features that are provided as input to a character recognizer component that determines a second attention score and recognizes the characters based on the second attention score. The system trains the neural network based model by adjusting the neural network parameters to minimize a multi-class gradient harmonizing mechanism (GHM) loss. The multi-class GHM loss varies based on the level of difficulty of each training sample.
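The GHM idea referenced above weights each training sample inversely to how crowded its gradient-norm region is, so that the loss is dominated neither by the flood of easy samples nor by a few outliers. A minimal sketch of the weighting step, assuming gradient norms already normalized to [0, 1] and a fixed histogram binning (both assumptions):

```python
def ghm_weights(grad_norms: list[float], n_bins: int = 10) -> list[float]:
    # weight each sample by N / (population of its gradient-norm bin):
    # samples in crowded bins (typically the easy ones) are down-weighted
    n = len(grad_norms)
    counts = [0] * n_bins
    idx = []
    for g in grad_norms:
        b = min(int(g * n_bins), n_bins - 1)  # g assumed in [0, 1]
        counts[b] += 1
        idx.append(b)
    return [n / counts[b] for b in idx]
```

In training, the per-sample cross-entropy terms would be multiplied by these weights before summing; the multi-class variant in the patent applies the same harmonizing idea across character classes.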
Text recognition method and apparatus
A text recognition method and apparatus are disclosed. The text recognition method includes: obtaining a to-be-detected image; determining a target text detection area in the to-be-detected image, where the target text detection area includes target text in the to-be-detected image, and the target text detection area is a polygonal area including m vertex pairs, m being a positive integer greater than 2; correcting the polygonal area to m−1 rectangular areas to obtain a corrected target text detection area; and performing text recognition on the corrected target text detection area to determine the target text, and outputting the target text.
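The decomposition above follows from the polygon's structure: with the detection area bounded by a top chain and a bottom chain of matching vertices, each pair of consecutive vertex pairs bounds one sub-quadrilateral that can be rectified independently, yielding one fewer piece than there are pairs. A sketch, assuming the polygon is given as parallel top/bottom point lists (an assumed representation):

```python
Point = tuple[float, float]

def split_into_quads(top: list[Point], bottom: list[Point]):
    # top and bottom each hold the m vertex pairs of the polygonal text
    # area; consecutive pairs bound one quadrilateral each, so m pairs
    # yield m - 1 quads, each of which can be warped to a rectangle
    assert len(top) == len(bottom) and len(top) > 2
    return [(top[i], top[i + 1], bottom[i + 1], bottom[i])
            for i in range(len(top) - 1)]
```

Each returned quad would then be mapped to an axis-aligned rectangle (e.g. by a perspective warp) and the rectified strips concatenated before recognition.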
Optical character recognition of series of images
Systems and methods for performing OCR of a series of images depicting text symbols. An example method comprises performing OCR of a series of images to produce symbol sequences and corresponding symbol sequence quadrangles; identifying a median string; calculating a transformation of the symbol sequence quadrangles into a common coordinate system; determining distances between the transformed symbol sequence quadrangles; identifying a median symbol sequence quadrangle; and displaying, using the median symbol sequence quadrangle, a resulting OCR text representing at least a portion of the original document.
Device, Method, and Graphical User Interface for Processing Document
A method for detecting a document edge is provided. The method includes: obtaining multi-color channel data of each pixel in a color image (103), where the multi-color channel data includes two-dimensional coordinate values of the pixel and a value of the pixel on each color channel; performing line detection on the multi-color channel data of each pixel in the color image (105); and detecting a quadrilateral based on preset condition and some or all of straight lines obtained by performing the line detection (107). According to the foregoing method, a success rate of detecting a document edge can be increased.
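The final step above assembles a quadrilateral from the detected straight lines, which reduces to intersecting candidate edge lines pairwise. A sketch of that corner computation, assuming lines in the general form a·x + b·y = c (the representation and function names are illustrative):

```python
def intersect(l1, l2):
    # lines given as (a, b, c) with a*x + b*y = c; returns their
    # intersection point, or None for (near-)parallel lines
    (a1, b1, c1), (a2, b2, c2) = l1, l2
    det = a1 * b2 - a2 * b1
    if abs(det) < 1e-12:
        return None
    return ((c1 * b2 - c2 * b1) / det, (a1 * c2 - a2 * c1) / det)

def quad_from_lines(top, bottom, left, right):
    # corners of the candidate document quadrilateral, in order:
    # top-left, top-right, bottom-right, bottom-left
    return [intersect(top, left), intersect(top, right),
            intersect(bottom, right), intersect(bottom, left)]
```

The preset conditions in the claim (e.g. plausible corner angles and side lengths) would then filter these candidate quadrilaterals down to the document edge.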
Automated methods and systems for locating document subimages in images to facilitate extraction of information from the located document subimages
The present document is directed to methods and subsystems that identify and characterize document-containing subimages in a document-containing image. In one implementation, each type of document is modeled as a set of features that are extracted from a set of images known to contain the document. To locate and characterize a document subimage in an image, the currently described methods and subsystems extract features from the image and then match model features of each model in a set of models to the extracted features to select the model that best corresponds to the extracted features. Additional information contained in the selected model is then used to identify the location of the subimage corresponding to the document and to process the document subimage to correct for a variety of distortions and deficiencies in order to facilitate subsequent data extraction from the corrected document subimage.
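The model-selection step above matches each model's features against the features extracted from the image and keeps the best-corresponding model. A deliberately simplified sketch scoring models by feature-set overlap (real feature matching would compare descriptors with tolerances; the names and the dict representation are assumptions):

```python
def best_model(extracted: list[str], models: dict[str, set[str]]) -> str:
    # pick the document model whose feature set best overlaps the
    # features extracted from the image
    def score(name: str) -> int:
        return len(set(extracted) & models[name])
    return max(models, key=score)
```

The selected model's stored layout information would then locate the document subimage and drive the distortion-correction step before data extraction.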