G06K9/34

Method of character recognition in written document

A method for recognizing characters in an image of a document having at least one alphanumeric field. The method includes the steps of: enhancing the image contrast to highlight the characters in the image; detecting contours of objects in the image to create a mask that highlights the characters; segmenting the image using a tree of connected components and applying the mask thereto in order to extract the characters from the image; and performing character recognition on the extracted objects. A device for implementing the method is also provided.
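
The segmentation step can be illustrated with a minimal sketch. It assumes the image has already been binarized and that the contour mask is a same-sized grid of 0/1 values. It uses a flat breadth-first labeling of 4-connected regions rather than the component tree named in the abstract, and the function name `extract_components` is hypothetical.

```python
from collections import deque

def extract_components(binary, mask):
    """Return 4-connected foreground regions of `binary` that lie inside `mask`.

    Both arguments are lists of rows holding 0/1 values (an assumption of
    this sketch). Each region is a list of (row, col) pixels.
    """
    rows, cols = len(binary), len(binary[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or not (binary[r][c] and mask[r][c]):
                continue
            # Breadth-first flood fill from an unvisited masked foreground pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and binary[ny][nx] and mask[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            components.append(pixels)
    return components
```

Each returned region would then be passed to the character recognizer; pixels outside the mask never start or join a region.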

People detection system with feature space enhancement

A people detection system with feature space enhancement is provided. The system includes a memory having computer-readable instructions stored therein. The system includes a processor configured to access a plurality of video frames captured using one or more overhead video cameras installed in a space and to extract one or more raw images of the space from the plurality of video frames. The processor is further configured to process the one or more raw images to generate a plurality of positive image samples and a plurality of negative image samples. The positive image samples include images having one or more persons present within the space and the negative image samples comprise images without the persons. The processor is configured to apply at least one of a crop factor and a resize factor to the positive and the negative image samples to generate curated positive and negative image samples and to detect one or more persons present in the space using a detection model trained by the curated positive and negative image samples.

Method and apparatus for segmenting sky area, and convolutional neural network
11151403 · 2021-10-19

The present disclosure provides a method and apparatus for segmenting a sky area, and a convolutional neural network. The method includes: acquiring, by the image input layer, an original image; extracting, by the first convolutional neural network, a plurality of sky feature images with different scales from the original image; processing, by the plurality of cascaded second convolutional neural networks, the plurality of sky feature images to output a target feature image; up-sampling, by the up-sampling layer, the target feature image to obtain an up-sampled feature image; and determining, by the sky area determining layer, a pixel area whose gray value is greater than or equal to a preset gray value in the up-sampled feature image as a sky area.
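
The last two steps, up-sampling and thresholding, can be sketched as follows. The abstract does not state which up-sampling method is used, so nearest-neighbor repetition is an assumption here, as are the function names `upsample_nearest` and `sky_mask`. The convolutional stages are not modeled.

```python
def upsample_nearest(feature, factor):
    """Enlarge a 2D grid by an integer factor using nearest-neighbor repetition.

    Nearest-neighbor is an assumption; the disclosure does not name the method.
    """
    out = []
    for row in feature:
        wide = [value for value in row for _ in range(factor)]
        # Repeat each widened row `factor` times, copying so rows stay independent.
        out.extend(list(wide) for _ in range(factor))
    return out

def sky_mask(gray, preset):
    """Mark pixels whose gray value is greater than or equal to `preset`."""
    return [[value >= preset for value in row] for row in gray]
```

For instance, `sky_mask(upsample_nearest(target, 2), 128)` would yield a boolean grid at twice the resolution of a hypothetical `target` feature image.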

Text line image splitting with different font sizes
11151371 · 2021-10-19

A method for splitting text line images includes receiving a text line image and identifying that the text line image comprises a plurality of zones, wherein each zone includes text whose font differs from the text of adjacent zones. The method further includes selecting a splitting position between multiple zones, splitting the text line image at the splitting position into a plurality of image segments, wherein each image segment contains at least one zone of the text line image, and performing optical character recognition on each image segment to recognize a text segment of the image segment. In certain implementations, the method further includes generating one or more confidence measurements and selecting a splitting position that corresponds to a large gradient in the confidence measurement.
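
Choosing a splitting position from a confidence measurement can be sketched as below. The sketch assumes one confidence value per horizontal position along the line and treats "a large gradient" as the largest absolute difference between neighboring values; both choices, and the name `best_split`, are assumptions rather than details from the abstract.

```python
def best_split(confidences):
    """Return k such that the line splits into confidences[:k] and confidences[k:].

    The split is placed where neighboring confidence values differ the most.
    """
    if len(confidences) < 2:
        raise ValueError("need at least two confidence values to split")
    gradients = [abs(b - a) for a, b in zip(confidences, confidences[1:])]
    steepest = max(range(len(gradients)), key=gradients.__getitem__)
    return steepest + 1
```

With values such as `[0.9, 0.88, 0.91, 0.4, 0.42]`, the steepest change lies between the third and fourth entries, so the split falls there.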

Information processing apparatus and information processing method

An information processing apparatus includes processing circuitry. The circuitry acquires first ledger sheet definition information and second ledger sheet definition information from a memory. The first ledger sheet definition information defines relative positions of an item and a value of the item in a ledger sheet. The second ledger sheet definition information defines relative positions of an item and a value of the item in a ledger sheet unique to a user. Based on at least one of the first ledger sheet definition information and the second ledger sheet definition information, the circuitry extracts an item and a value of the item from reading result information that associates a character string read from a ledger sheet image with information representing a position of the character string, and the circuitry outputs the extracted item and value of the item as a recognition result.
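
A minimal sketch of the extraction step follows. It assumes each definition maps an item label to the (dx, dy) offset at which its value is expected, that the reading result is a list of (text, x, y) tuples, and that user-specific offsets override the common ones. None of these representations, nor the name `extract_fields`, comes from the abstract.

```python
def extract_fields(reading, common_def, user_def=None):
    """Pair each known item label with the string nearest its expected offset.

    reading: list of (text, x, y) tuples from the OCR reading result.
    common_def / user_def: dicts mapping item label -> (dx, dy) offset.
    """
    # Assumption: entries in the user-specific definition take priority.
    definitions = {**common_def, **(user_def or {})}
    result = {}
    for i, (text, x, y) in enumerate(reading):
        if text not in definitions:
            continue
        dx, dy = definitions[text]
        tx, ty = x + dx, y + dy
        others = [s for j, s in enumerate(reading) if j != i]
        if not others:
            continue
        # Pick the string closest (squared Euclidean distance) to the target.
        nearest = min(others, key=lambda s: (s[1] - tx) ** 2 + (s[2] - ty) ** 2)
        result[text] = nearest[0]
    return result
```

A production system would likely also bound the search distance so that a missing value is not replaced by an unrelated string; that check is omitted here for brevity.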

Image processing device, method and non-transitory computer readable medium

An image processing device includes multiple image processing units, each trained to accommodate a different feature possibly contained in an image, a decision unit that decides a sequence of the multiple image processing units according to the features contained in an input image, and an application unit that applies the image processing units to the input image in the sequence decided by the decision unit.

Text wrap detection
11151370 · 2021-10-19

In implementations of text wrap detection, one or more computing devices of a system implement a text wrap module for detecting text wrap around a component of digital content of a document. The document is preprocessed to segregate the digital content into a text group and a non-text group. Members of the text group are overlaid with a graphical element colored to provide a contrast between the graphical element and the component of the digital content. The document is converted to a digital image and a feature map of the digital image is generated. The feature map is further processed using machine learning and a detection indication is output. The detection indication may indicate that text wrap is detected around a member of the text group, a member of the non-text group, or that no text wrap is detected.

TEXT CLASSIFICATION
20210319247 · 2021-10-14

A text classifying apparatus (100), an optical character recognition unit (1), a text classifying method (S220) and a program are provided for performing the classification of text. A segmentation unit (110) segments an image into a plurality of lines of text (401-412; 451-457; 501-504; 701-705) (S221). A selection unit (120) selects a line of text from the plurality of lines of text (S222-S223). An identification unit (130) identifies a sequence of classes corresponding to the selected line of text (S224). A recording unit (140) records, for the selected line of text, a global class corresponding to a class of the sequence of classes (S225-S226). A classification unit (150) classifies the image according to the global class, based on a confidence level of the global class (S227-S228).
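
The flow from per-line class sequences to an image-level class can be sketched as below. The abstract does not define how the global class or its confidence level is computed, so this sketch assumes the global class is the most frequent class in the sequence, the confidence is its share of the sequence, and lines are tried in order until one meets a threshold. The names and the 0.8 default are hypothetical.

```python
from collections import Counter

def global_class(sequence):
    """Return the most frequent class in `sequence` and its share of the sequence."""
    label, count = Counter(sequence).most_common(1)[0]
    return label, count / len(sequence)

def classify_image(lines, threshold=0.8):
    """Classify an image from per-line class sequences.

    Lines are examined in order; the first line whose global class reaches
    `threshold` decides the image class. Returns None if no line qualifies.
    """
    for sequence in lines:
        if not sequence:
            continue
        label, confidence = global_class(sequence)
        if confidence >= threshold:
            return label
    return None
```

Stopping at the first confident line avoids identifying class sequences for every line in the image, which appears to be the motivation for selecting lines one at a time.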

OCR SYSTEM
20210319248 · 2021-10-14

An OCR system which acquires character data from a form (50) through OCR processing is characterized by: managing an OCR information table (34e) in which an issuer name of an issuer on the form (50) is associated with a font name of a font used in the OCR processing; and, when the OCR processing is performed on an issuer-recorded content reading target area in the form (50), performing the OCR processing (S156) in the font indicated by the font name associated in the OCR information table with the issuer name of the issuer of the form (50).
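
The issuer-to-font lookup can be sketched as a plain table. The issuer names, font names, form layout, and the `recognize` callable below are all hypothetical stand-ins, and falling back to a default font for unknown issuers is an assumption not stated in the abstract.

```python
# Hypothetical OCR information table: issuer name -> font name.
OCR_INFO_TABLE = {
    "Acme Bank": "OCR-B",
    "Example Utilities": "Courier",
}
DEFAULT_FONT = "default"  # assumption: used when the issuer is not in the table

def font_for_issuer(issuer_name, table=OCR_INFO_TABLE):
    """Look up the font associated with an issuer, with a default fallback."""
    return table.get(issuer_name, DEFAULT_FONT)

def read_issuer_area(form, recognize):
    """Run OCR on the issuer-recorded area using the issuer's font.

    form: dict with hypothetical keys "issuer" and "issuer_area".
    recognize: caller-supplied callable (area, font) -> text, standing in
    for a real OCR engine.
    """
    font = font_for_issuer(form["issuer"])
    return recognize(form["issuer_area"], font)
```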

ONLINE TRAINING DATA GENERATION FOR OPTICAL CHARACTER RECOGNITION

A method and system to generate training data for a deep learning model in memory instead of loading pre-generated data from disk storage. A corpus may be stored as lines of text. The lines of text can be manipulated in the memory of a central processing unit (CPU) of a computing system, using asynchronous multi-processing, in parallel with a training process being conducted on the system's graphics processing unit (GPU). With such an approach, for a given line of text, it is possible to take advantage of different fonts and different types of image augmentation without having to put the images in disk storage for subsequent retrieval. Consequently, the same line of text can be used to generate different training images for use in different epochs, providing more variability in training data (no training sample is trained on more than once). A single training corpus may yield many different training data sets. In one aspect, the model being trained is a deep learning model, which may be one of several different types of neural networks. The training enables the deep learning model to perform OCR on line images.
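
The idea of producing a different training image from the same line of text in each epoch can be sketched as a generator. The font and augmentation names are hypothetical, `render` is a stand-in that returns a description instead of a rasterized image, and the asynchronous multi-processing alongside GPU training is not modeled.

```python
import random

# Hypothetical choices; a real system would use actual font files and image ops.
FONTS = ["serif", "sans", "mono"]
AUGMENTATIONS = ["none", "blur", "noise", "skew"]

def render(line, font, augmentation):
    """Stand-in for rasterizing a line of text; returns a description only."""
    return {"text": line, "font": font, "augmentation": augmentation}

def epoch_samples(corpus, rng):
    """Yield (sample, label) pairs for one epoch, built in memory.

    Each call draws a fresh font and augmentation per line, so repeated
    epochs over the same corpus can produce different samples.
    """
    for line in corpus:
        sample = render(line, rng.choice(FONTS), rng.choice(AUGMENTATIONS))
        yield sample, line
```

Calling `epoch_samples(corpus, rng)` once per epoch with a shared `random.Random` instance keeps everything in memory; nothing is written to or read back from disk.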