G06V30/133

METHOD FOR AUGMENTING DATA FOR DOCUMENT CLASSIFICATION AND APPARATUS THEREOF
20230137931 · 2023-05-04

The disclosure relates to a data augmentation method for document classification based on artificial intelligence, which includes: obtaining a plurality of document data; measuring quality information of the plurality of document data; classifying the plurality of document data by quality using the measured quality information, and detecting a distribution of the plurality of document data classified by quality; and augmenting document data corresponding to a specific quality group based on the detected document data distribution by quality.
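The classify-then-augment step can be illustrated with a short Python sketch. It is not the method claimed in the application: the quality scores in [0, 1], the two bucket thresholds, and the policy of topping every group up to the size of the largest one are all assumptions made for illustration.

```python
from collections import Counter

GROUPS = ("low", "medium", "high")

def quality_group(score, low=0.4, high=0.7):
    """Bucket a quality score in [0, 1] into a group (thresholds are assumed)."""
    if score < low:
        return "low"
    if score < high:
        return "medium"
    return "high"

def augmentation_plan(scores, low=0.4, high=0.7):
    """Return how many extra samples each group needs to match the largest group."""
    counts = Counter(quality_group(s, low, high) for s in scores)
    target = max(counts.values(), default=0)
    return {group: target - counts.get(group, 0) for group in GROUPS}

# Hypothetical quality scores for six documents.
scores = [0.9, 0.85, 0.8, 0.75, 0.5, 0.2]
plan = augmentation_plan(scores)
print(plan)
```

Here the high-quality group dominates, so the plan asks for additional low- and medium-quality samples.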

GENERALIZED ANOMALY DETECTION

Described are methods and systems for training a system for detecting anomalies in images of documents in a class of documents. A plurality of training document images of training documents in a class of documents are obtained. For each training document image, the training document image is segmented into a plurality of region of interest (ROI) images, each ROI image corresponding to a respective ROI of the training document. For each ROI image, a plurality of transformations are applied to the ROI image to generate respective transform-specific features for the ROI image and respective transform-specific anomaly scores from the transform-specific features. Based on the respective anomaly scores of the plurality of training document images, a transform-specific threshold is computed for each transformation to separate document images containing an anomaly from document images not containing an anomaly.
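A minimal Python sketch of per-transformation thresholding follows. The abstract does not say how the threshold is computed, so the rule used here (mean plus k standard deviations of the training scores) is an assumption, as are the transformation names and the simplification to one score per image per transformation instead of one per ROI.

```python
import statistics

def transform_thresholds(scores_by_transform, k=3.0):
    """Compute one threshold per transformation from training anomaly scores.

    Assumed rule: mean + k * population standard deviation.
    """
    thresholds = {}
    for name, scores in scores_by_transform.items():
        mu = statistics.fmean(scores)
        sigma = statistics.pstdev(scores)
        thresholds[name] = mu + k * sigma
    return thresholds

def is_anomalous(doc_scores, thresholds):
    """Flag a document if any transform-specific score exceeds its threshold."""
    return any(doc_scores[name] > limit for name, limit in thresholds.items())

# Hypothetical training scores for two transformations.
training = {"blur": [0.1, 0.2, 0.3], "edges": [1.0, 1.0, 1.0]}
thresholds = transform_thresholds(training)
print(is_anomalous({"blur": 0.5, "edges": 1.0}, thresholds))
print(is_anomalous({"blur": 0.3, "edges": 1.0}, thresholds))
```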

OPTICAL CHARACTER RECOGNITION QUALITY EVALUATION AND OPTIMIZATION
20230368551 · 2023-11-16

A processor may receive an image and determine a number of foreground pixels in the image. The processor may obtain a result of optical character recognition (OCR) processing performed on the image. The processor may identify at least one bounding box surrounding at least one portion of text in the result and overlay the at least one bounding box on the image to form a masked image. The processor may determine a number of foreground pixels in the masked image and a decrease in the number of foreground pixels in the masked image relative to the number of foreground pixels in the image. Based on the decrease, the processor may modify an aspect of the OCR processing for subsequent image processing.
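The foreground-pixel comparison can be sketched in Python as below. The binary image representation (1 for foreground), the half-open box format `(x0, y0, x1, y1)`, and the function names are assumptions for illustration; the application does not specify them.

```python
def count_foreground(image):
    """Count foreground pixels in a binary image given as rows of 0/1 values."""
    return sum(sum(row) for row in image)

def mask_boxes(image, boxes):
    """Return a copy of the image with every pixel inside an OCR box cleared."""
    masked = [row[:] for row in image]
    for x0, y0, x1, y1 in boxes:
        for y in range(y0, y1):
            for x in range(x0, x1):
                masked[y][x] = 0
    return masked

def foreground_decrease(image, boxes):
    """Fraction of foreground pixels covered by OCR boxes (1.0 if none exist)."""
    before = count_foreground(image)
    if before == 0:
        return 1.0
    after = count_foreground(mask_boxes(image, boxes))
    return (before - after) / before

# Hypothetical 3x4 binary image and a single OCR bounding box.
image = [
    [1, 1, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 1],
]
boxes = [(0, 0, 2, 2)]
decrease = foreground_decrease(image, boxes)
print(decrease)
```

A small decrease suggests that the OCR result missed much of the foreground content, which is the kind of signal that could prompt a change to the OCR settings.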

TEXT EXTRACTION USING OPTICAL CHARACTER RECOGNITION

Provided herein are systems and methods for extracting text from a document. Different optical character recognition (OCR) tools are used to extract different versions of the text in the document. Metrics evaluating the quality of the extracted text are compared to identify and select higher quality extracted text. A selected portion of text is compared to a threshold to ensure minimal quality. The selected portion of text is then saved. Error correction can be applied to the selected portion of text based on errors specific to the OCR tools or the document contents.
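The select-then-threshold flow can be sketched in Python. The quality metric used here (the share of words found in a known vocabulary) is a stand-in chosen for illustration, and the tool names, vocabulary, and `min_quality` value are hypothetical; the abstract does not name its metrics.

```python
def dictionary_hit_rate(text, vocabulary):
    """Assumed quality metric: fraction of words present in a known vocabulary."""
    words = [w.strip(".,;:!?").lower() for w in text.split()]
    words = [w for w in words if w]
    if not words:
        return 0.0
    return sum(w in vocabulary for w in words) / len(words)

def select_text(candidates, vocabulary, min_quality=0.5):
    """Pick the best-scoring OCR output, or None if it misses the quality bar."""
    if not candidates:
        return None
    scored = {tool: dictionary_hit_rate(text, vocabulary)
              for tool, text in candidates.items()}
    best_tool = max(scored, key=scored.get)
    if scored[best_tool] < min_quality:
        return None
    return best_tool, candidates[best_tool]

# Hypothetical outputs of two OCR tools for the same document region.
vocabulary = {"invoice", "total", "due"}
candidates = {"tool_a": "invoice t0tal due", "tool_b": "invoice total due"}
selected = select_text(candidates, vocabulary)
print(selected)
```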

CAPTURED DOCUMENT IMAGE ENHANCEMENT

A contextual feature matrix that aggregates contextual information within a captured image of a document at multiple scales is generated using a multiscale aggregator machine learning model. Pixel-wise enhancement curves for the captured image are estimated based on the contextual feature matrix using an enhancement curve prediction machine learning model. The pixel-wise enhancement curves are iteratively applied to the captured image to enhance the document within the captured image.
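The iterative application of pixel-wise curves can be sketched in Python. The abstract does not give the form of the curve, so the quadratic curve `x + a * x * (1 - x)` used below is an assumption, as are the per-pixel parameters in [-1, 1] and intensities normalized to [0, 1]. In the described system the parameters would come from the prediction model; here they are hard-coded.

```python
def apply_curves(image, alphas, iterations=4):
    """Apply an assumed quadratic curve x + a*x*(1-x) to each pixel repeatedly.

    image: rows of intensities in [0, 1]; alphas: per-pixel parameters in [-1, 1].
    """
    out = [row[:] for row in image]
    for _ in range(iterations):
        out = [
            [x + a * x * (1.0 - x) for x, a in zip(row, alpha_row)]
            for row, alpha_row in zip(out, alphas)
        ]
    return out

# Hypothetical one-row image with the strongest brightening parameter everywhere.
image = [[0.0, 0.5, 1.0]]
alphas = [[1.0, 1.0, 1.0]]
once = apply_curves(image, alphas, iterations=1)
twice = apply_curves(image, alphas, iterations=2)
print(once, twice)
```

With this curve, black and white pixels stay fixed while mid-tones are pushed brighter on each pass.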

Transformation matching in optical recognition
11715311 · 2023-08-01

The present disclosure provides methods and systems for performing licence plate matching. A reference list comprising licence plate identifiers is obtained. At least one transformation rule is applied to the licence plate identifiers of the reference list to generate entries forming at least one augmented reference list, and the at least one augmented reference list is stored in at least one database. A search request comprising an input licence plate identifier is obtained. The at least one transformation rule is applied to the input licence plate identifier to generate a search list comprising at least one search term. The at least one database is queried with the search list to identify a match between at least one of the entries of the at least one augmented reference list and the at least one search term of the search list. A signal is output responsive to identifying the match.
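The idea of applying the same transformation rule to both the reference list and the query can be sketched in Python. The single rule shown (mapping visually confusable letters to digits), the in-memory dictionary standing in for the database, and all names are assumptions for illustration; the patent does not specify them here.

```python
# Assumed transformation rule: collapse visually confusable characters.
CONFUSABLE = str.maketrans({"O": "0", "I": "1", "B": "8", "S": "5"})

def apply_rule(plate):
    """Normalize a plate: uppercase, drop spaces, map confusable characters."""
    return plate.upper().replace(" ", "").translate(CONFUSABLE)

def build_augmented_index(reference):
    """Map each transformed identifier to the original plates that produce it."""
    index = {}
    for plate in reference:
        index.setdefault(apply_rule(plate), []).append(plate)
    return index

def find_matches(index, query):
    """Transform the query with the same rule and look it up in the index."""
    return index.get(apply_rule(query), [])

# Hypothetical reference list and a misread query.
index = build_augmented_index(["ABC 1O0", "XYZ 555"])
matches = find_matches(index, "A8C100")
print(matches)
```

Because both sides pass through the same rule, a read that confuses `B` with `8` or `O` with `0` still lands on the same index entry.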

Information processing apparatus, method and non-transitory recording medium storing program codes for replacing color of character pixel based on selection of first and second processing methods
11528387 · 2022-12-13

An information processing apparatus includes circuitry configured to recognize a character in an original image by character recognition, generate a font, calculate a certainty factor of the character recognition, determine a color of the font based on the certainty factor by a first processing method, determine a replacement color by a second processing method based on the certainty factor, and convert the original image into an editable electronic document including the font and an image in which the character pixel has the replacement color. The replacement color is the color with which the color of a character pixel of the recognized character is replaced.

Computer-implemented machine learning for detection and statistical analysis of errors by healthcare providers

For training data pairs comprising training text (a radiological report) and training images (radiological images associated with the radiological report), a first encoder network determines word embeddings for the training text. A concept is generated from the operation of layers of the first encoder network, which is regularized by a first loss between the generated concept and a labeled concept for the training text. A second encoder network determines features for the training image. A heatmap is generated from the operation of layers of the second encoder network, which is regularized by a second loss between the generated heatmap and a labeled heatmap for the training image. A categorical cross entropy loss is calculated between a diagnostic quality category (classified by an error encoder) and a labeled diagnostic quality category for the training data pair. A total loss function comprising the first, second, and categorical cross entropy losses is minimized.
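The combined objective can be sketched in Python for a single training pair. The abstract says only that the total loss comprises the three terms, so the unweighted sum below is an assumption, as are the function names and the example values; the concept and heatmap losses are taken as already-computed numbers because their form is not specified.

```python
import math

def categorical_cross_entropy(probs, label_index, eps=1e-12):
    """Cross entropy between predicted class probabilities and a one-hot label."""
    return -math.log(max(probs[label_index], eps))

def total_loss(concept_loss, heatmap_loss, probs, label_index):
    """Assumed combination: unweighted sum of the three loss terms."""
    return concept_loss + heatmap_loss + categorical_cross_entropy(probs, label_index)

# Hypothetical values for one training pair with two quality categories.
loss = total_loss(0.5, 0.25, [0.1, 0.9], 1)
print(loss)
```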

Image processing method and apparatus, and storage medium

Provided are an image processing method, an apparatus, and a storage medium. The method includes: collecting, in real time, at least one image within a field of view of an image collection device of a terminal device; determining, based on a collected first image, whether the first image includes at least one character; when the collected first image includes the character, outputting prompt information to a user; receiving a setting instruction input by the user; and improving sharpness of the character according to the setting instruction to obtain a second image, and compressing and storing the second image.

CHARACTER INPUT DEVICE, CHARACTER INPUT METHOD, AND COMPUTER-READABLE STORAGE MEDIUM STORING A CHARACTER INPUT PROGRAM
20220215681 · 2022-07-07

A first character string obtainment unit according to one or more embodiments may obtain a first character string in response to an input character string that has been input. A similar character extraction unit extracts similar characters whose shapes are similar to those of characters in the first character string. A second character string generation unit generates one or more second character strings in which some or all of the characters in the first character string are replaced with similar characters extracted by the similar character extraction unit. Then, a conversion candidate output unit outputs the first character string and the second character strings as conversion candidates for the input character string.
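Candidate generation by similar-character substitution can be sketched in Python. The similarity table, the `limit` cap, and the function name are hypothetical; the application does not say how similar characters are found or how many candidates are produced.

```python
from itertools import product

# Hypothetical table of visually similar characters.
SIMILAR = {"0": ["O"], "1": ["l", "I"]}

def conversion_candidates(first, similar=SIMILAR, limit=20):
    """Return the first string followed by variants with similar characters swapped in."""
    options = [[ch] + similar.get(ch, []) for ch in first]
    candidates = []
    for combo in product(*options):
        candidates.append("".join(combo))
        if len(candidates) >= limit:
            break
    return candidates

candidates = conversion_candidates("a10")
print(candidates)
```

The first entry is always the unmodified first character string, and the later entries replace some or all of the replaceable characters.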