G06V30/166

Neural network-based optical character recognition

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for neural network-based optical character recognition. An embodiment of the system may generate a set of bounding boxes based on reshaped image portions that correspond to image data of a source image. The system may merge any intersecting bounding boxes into a merged bounding box to generate a set of merged bounding boxes indicative of image data portions that likely portray one or more words. Each merged bounding box may be fed by the system into a neural network to identify one or more words of the source image represented in the respective merged bounding box. The one or more identified words may be displayed by the system according to a standardized font and a confidence score.
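
The merging step described in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the box format `(left, top, right, bottom)`, the helper names `intersects`, `union`, and `merge_intersecting`, and the choice to treat boxes that merely touch as intersecting are all assumptions made for illustration.

```python
from typing import List, Tuple

# ASSUMPTION: a box is (left, top, right, bottom) in pixel coordinates.
Box = Tuple[int, int, int, int]


def intersects(a: Box, b: Box) -> bool:
    """True if the two boxes overlap or touch (touching counts, by assumption)."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def union(a: Box, b: Box) -> Box:
    """Smallest box that contains both a and b."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def merge_intersecting(boxes: List[Box]) -> List[Box]:
    """Merge boxes until no two boxes in the result intersect."""
    merged: List[Box] = []
    for box in boxes:
        current = box
        changed = True
        # A grown box may reach boxes it did not touch before, so repeat
        # until a full pass absorbs nothing.
        while changed:
            changed = False
            remaining: List[Box] = []
            for other in merged:
                if intersects(current, other):
                    current = union(current, other)
                    changed = True
                else:
                    remaining.append(other)
            merged = remaining
        merged.append(current)
    return merged


if __name__ == "__main__":
    word_boxes = [(0, 0, 10, 10), (5, 5, 15, 15), (30, 30, 40, 40), (14, 0, 20, 6)]
    print(sorted(merge_intersecting(word_boxes)))
```

In a pipeline like the one the abstract outlines, each merged box would then be cropped from the source image and passed to the recognition network; that step is outside this sketch.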

SYSTEMS AND METHODS FOR MOBILE IMAGE CAPTURE AND PROCESSING OF DOCUMENTS
20210073786 · 2021-03-11

Techniques for processing images of documents captured using a mobile device are provided. The images can include different sides of a document from a mobile device for an authenticated transaction. In an example implementation, a method includes inspecting the images to detect a feature associated with a first side of the document. In response to determining that an image is the first side of the document, a type of content is selected to be analyzed on the image of the first side, and one or more regions of interest (ROIs) that are known to include the selected type of content are identified on the image of the first side. A process can include receiving a sub-image of the image of the first side from the preprocessing unit and performing a content detection test on the sub-image.

Electronic handwriting processor with convolutional neural networks
10949660 · 2021-03-16

An improved machine learning system is provided. For example, a content management server may provide real-time analysis of a user's handwriting to assess the user's knowledge of a language, including by using a convolutional neural network method. The convolutional neural network method may be executed to normalize at least some identified strokes in the user's handwritten input. Normalization may be performed by translating a window comprising a subset of pixels in a digital representation of the handwritten user input among a plurality of pixels in the digital representation.

Method for adaptive contrast enhancement in document images
10915746 · 2021-02-09

Systems and methods here may include utilizing a computer with a processor and a memory for receiving a pixelated image of an original size, converting the pixelated image to grayscale, calculating a magnitude of spatial gradients in the received pixelated grayscale image, downscaling the received pixelated grayscale image, computing a multiplicative gain correction for the downscaled received pixelated grayscale image, re-enlarging a gain multiplication for the original image, and applying the gain multiplication to the image to generate a processed image with higher contrast than the received pixelated image.
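
The sequence of steps in this abstract (gradient magnitude, downscaling, gain computation, re-enlargement, gain application) can be sketched roughly as follows. The abstract does not say how the gain is derived, so the rule used here (boost blocks whose average gradient is below a hypothetical `target`, capped at `max_gain`) is purely an assumption, as are all function names, the block-average downscaling, and the nearest-neighbour enlargement. The sketch also downscales the gradient map rather than the image itself, and it starts from an image that is already grayscale.

```python
from typing import List

Image = List[List[float]]  # rows of grayscale values in [0, 255]


def gradient_magnitude(gray: Image) -> Image:
    """Central-difference gradient magnitude, clamping indices at the borders."""
    h, w = len(gray), len(gray[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            gx = gray[y][min(x + 1, w - 1)] - gray[y][max(x - 1, 0)]
            gy = gray[min(y + 1, h - 1)][x] - gray[max(y - 1, 0)][x]
            out[y][x] = (gx * gx + gy * gy) ** 0.5
    return out


def downscale(img: Image, f: int) -> Image:
    """Average non-overlapping f x f blocks (dimensions assumed divisible by f)."""
    h, w = len(img), len(img[0])
    return [
        [sum(img[y + dy][x + dx] for dy in range(f) for dx in range(f)) / (f * f)
         for x in range(0, w, f)]
        for y in range(0, h, f)
    ]


def upscale(small: Image, f: int) -> Image:
    """Nearest-neighbour enlargement by an integer factor f."""
    return [[v for v in row for _ in range(f)] for row in small for _ in range(f)]


def enhance_contrast(gray: Image, f: int = 2, target: float = 20.0,
                     max_gain: float = 2.0) -> Image:
    """Apply a per-block multiplicative gain computed at reduced resolution."""
    grad_small = downscale(gradient_magnitude(gray), f)
    # ASSUMPTION: weak-gradient blocks get a larger gain, limited to [1, max_gain].
    gain_small = [[min(max_gain, max(1.0, target / max(g, 1e-6))) for g in row]
                  for row in grad_small]
    gain = upscale(gain_small, f)  # back to the original size
    return [[min(255.0, v * k) for v, k in zip(row, gain_row)]
            for row, gain_row in zip(gray, gain)]


if __name__ == "__main__":
    faint_edge = [[100.0, 100.0, 110.0, 110.0] for _ in range(4)]
    for row in enhance_contrast(faint_edge):
        print(row)
```

Computing the gain on the downscaled map keeps the correction smooth across neighbouring pixels and reduces the work per image, which is one plausible reason for the downscale and re-enlarge steps named in the abstract.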

ROTATION AND SCALING FOR OPTICAL CHARACTER RECOGNITION USING END-TO-END DEEP LEARNING
20210073566 · 2021-03-11

Disclosed herein are system, method, and computer program product embodiments for optical character recognition (OCR) pre-processing using machine learning. In an embodiment, a neural network may be trained to identify a standardized document rotation and scale expected by an OCR service performing character recognition. The neural network may then analyze a received document image to identify a corresponding rotation and scale of the document image relative to the expected standardized values. In response to this identification, the document image may be modified in the inverse to standardize the rotation and scale of the document image to match the format expected by the OCR service. In some embodiments, a neural network may perform the standardization as well as the character recognition using a shared computation graph.
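
The "modified in the inverse" step can be illustrated on point coordinates. This is a minimal sketch under stated assumptions: the rotation angle and scale factor are taken as already predicted (the neural network itself is not modelled), rotation is counter-clockwise about the origin, and the function names `apply_transform` and `standardize` are hypothetical.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def apply_transform(p: Point, rotation_deg: float, scale: float) -> Point:
    """Rotate p counter-clockwise about the origin, then scale it.

    Models how a captured document deviates from the standardized layout.
    """
    t = math.radians(rotation_deg)
    x, y = p
    return (scale * (x * math.cos(t) - y * math.sin(t)),
            scale * (x * math.sin(t) + y * math.cos(t)))


def standardize(p: Point, rotation_deg: float, scale: float) -> Point:
    """Undo a detected rotation and scale: divide by scale, rotate by -angle."""
    t = math.radians(rotation_deg)
    x, y = p[0] / scale, p[1] / scale
    return (x * math.cos(t) + y * math.sin(t),
            -x * math.sin(t) + y * math.cos(t))


if __name__ == "__main__":
    original = (3.0, 4.0)
    distorted = apply_transform(original, rotation_deg=30.0, scale=1.5)
    # ASSUMPTION: 30.0 and 1.5 stand in for the values a trained model would predict.
    print(standardize(distorted, rotation_deg=30.0, scale=1.5))
```

For a full image, the same inverse mapping would be applied to every pixel coordinate, typically through an image library's affine warp rather than point by point.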

APPARATUS FOR PREDICTING METADATA OF MEDICAL IMAGE AND METHOD THEREOF
20210012160 · 2021-01-14

This disclosure relates to an apparatus for performing machine learning on the relationship between medical images and metadata using a neural network and for acquiring metadata by applying a machine learning model to medical images, and to a method thereof. The apparatus and method may include training a prediction model for predicting metadata of medical images, based on multiple training medical images and the metadata matched with each of them, and predicting metadata of an input medical image.

Systems and methods for mobile image capture and processing of documents

Techniques for processing images of documents captured using a mobile device are provided. The images can include different sides of a document from a mobile device for an authenticated transaction. In an example implementation, a method includes inspecting the images to detect a feature associated with a first side of the document. In response to determining that an image is the first side of the document, a type of content is selected to be analyzed on the image of the first side, and one or more regions of interest (ROIs) that are known to include the selected type of content are identified on the image of the first side. A process can include receiving a sub-image of the image of the first side from the preprocessing unit and performing a content detection test on the sub-image.

APPARATUS FOR PREDICTING METADATA OF MEDICAL IMAGE AND METHOD THEREOF
20200372299 · 2020-11-26

This disclosure relates to an apparatus for performing machine learning on the relationship between medical images and metadata using a neural network and for acquiring metadata by applying a machine learning model to medical images, and to a method thereof. The apparatus and method may include training a prediction model for predicting metadata of medical images, based on multiple training medical images and the metadata matched with each of them, and predicting metadata of an input medical image.

Apparatus for predicting metadata of medical image and method thereof
10824908 · 2020-11-03

This disclosure relates to an apparatus for performing machine learning on the relationship between medical images and metadata using a neural network and for acquiring metadata by applying a machine learning model to medical images, and to a method thereof. The apparatus and method may include training a prediction model for predicting metadata of medical images, based on multiple training medical images and the metadata matched with each of them, and predicting metadata of an input medical image.