Patent classifications
G06V30/158
TEXT LINE IMAGE SPLITTING WITH DIFFERENT FONT SIZES
A method for splitting text line images includes receiving a text line image and identifying that the text line image comprises a plurality of zones, wherein each zone includes text whose font differs from the text of adjacent zones. The method further includes selecting a splitting position between zones, splitting the text line image at the splitting position into a plurality of image segments, wherein each image segment contains at least one zone of the text line image, and performing optical character recognition on each image segment to recognize a text segment of that image segment. In certain implementations, the method further includes generating one or more confidence measurements and selecting a splitting position that corresponds to a large gradient in the confidence measurement.
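A minimal sketch of the confidence-gradient split selection described above, assuming one OCR confidence value per horizontal position of the line (the abstract does not fix the representation, so this interface is an assumption):

```python
def select_splitting_position(confidences):
    """Return the position with the largest absolute confidence gradient.

    `confidences` holds one OCR confidence value per horizontal
    position of the text line image (an assumed representation).
    """
    best_pos, best_grad = 0, 0.0
    for i in range(1, len(confidences)):
        grad = abs(confidences[i] - confidences[i - 1])
        if grad > best_grad:
            best_pos, best_grad = i, grad
    return best_pos

def split_line(columns, position):
    """Split a sequence representing the text line at `position`."""
    return columns[:position], columns[position:]
```

A sharp drop in confidence typically marks a font-size boundary, which is why the largest gradient is a plausible split point in this sketch.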
Character image processing method and apparatus, device, and storage medium
Provided are character image processing methods, apparatuses, devices, storage media, and computer programs. The character image processing method mainly comprises: obtaining at least one image block containing a character from a character image to be processed; obtaining form transformation information for the image block on the basis of a neural network, the form transformation information being used to change the character orientation in the image block to a predetermined orientation, and the neural network having been trained using image block samples with form transformation label information; performing form transformation processing on the character image to be processed according to the image block form transformation information; and performing character recognition on the transformed character image.
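A hedged sketch of the orientation-normalization step, using quarter-turn rotations of a 2D list in place of both the neural network's prediction and a real image library (both are assumptions for illustration only):

```python
def rotate_block_90(block, times):
    """Rotate a 2D list (an image block) clockwise by 90 degrees `times` times."""
    for _ in range(times % 4):
        block = [list(row) for row in zip(*block[::-1])]
    return block

def normalize_orientation(block, predicted_quarter_turns):
    """Apply the form transformation to bring the character to the
    predetermined orientation.

    `predicted_quarter_turns` stands in for the neural network's
    form transformation output (a simplifying assumption).
    """
    return rotate_block_90(block, predicted_quarter_turns)
```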
TEXT PARTITIONING METHOD, TEXT CLASSIFYING METHOD, APPARATUS, DEVICE AND STORAGE MEDIUM
A text partitioning method, a text classifying method, an apparatus, a device and a storage medium, wherein the method includes: parsing a content image to obtain a target text in a text format; partitioning the target text into a plurality of text sections according to line breaks in the target text; and partitioning the plurality of text sections sequentially into a plurality of text-to-be-predicted sets according to a first data-volume threshold, wherein the data volume of the last text section in each text-to-be-predicted set is greater than a second data-volume threshold.
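The sequential packing step can be sketched as follows, taking character count as the data-volume measure (the abstract does not specify the measure, so this is an assumption; the second-threshold check on the last section is omitted for brevity):

```python
def partition_text(target_text, first_threshold):
    """Split the target text at line breaks, then pack the sections
    sequentially into sets whose total data volume (here: character
    count, an assumed measure) stays within `first_threshold`."""
    sections = [s for s in target_text.split("\n") if s]
    sets, current, size = [], [], 0
    for section in sections:
        if current and size + len(section) > first_threshold:
            sets.append(current)      # start a new text-to-be-predicted set
            current, size = [], 0
        current.append(section)
        size += len(section)
    if current:
        sets.append(current)
    return sets
```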
SYSTEM FOR DISTRIBUTED SERVER NETWORK WITH EMBEDDED IMAGE DECODER AS CHAIN CODE PROGRAM RUNTIME
A system is provided for a distributed server network with embedded image decoder as a chain code program runtime event. In particular, the system may comprise a distributed computing network comprising one or more decentralized nodes, each of which may store a separate copy of a distributed data register. The system may further comprise one or more specialized nodes which receive, assess, and analyze user input data, where the one or more specialized nodes may include a client identity node comprising an embedded image decoder which may be configured to analyze image portions of the user input data. Once the image data has been analyzed, the client identity node may convert the image data into a text format for storage within the distributed register.
Computer implemented method and system for optical character recognition
A computer-implemented method for optical character recognition (OCR) of a character string in a text image. The method efficiently combines two different OCR engines, with the computation performed by the second OCR engine depending on the results found by the first OCR engine. This method provides, in particular, high-speed and accurate results when the first OCR engine is fast and the second OCR engine is accurate. The combination is possible because the segments to be processed by the second OCR engine can be identified without the second engine having to process all segments.
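One way to sketch this two-engine combination is to re-run only the low-confidence segments through the accurate engine; the engine interfaces and the confidence threshold below are illustrative assumptions, not the patent's actual criterion:

```python
def combined_ocr(segments, fast_engine, accurate_engine, threshold=0.8):
    """Run the fast engine on every segment; fall back to the accurate
    engine only where the fast result's confidence is low.

    `fast_engine(seg)` returns (text, confidence) and
    `accurate_engine(seg)` returns text -- assumed interfaces.
    """
    results = []
    for seg in segments:
        text, confidence = fast_engine(seg)
        if confidence < threshold:
            text = accurate_engine(seg)   # only this segment is reprocessed
        results.append(text)
    return results
```

The speed benefit comes from the accurate (slow) engine touching only the segments the fast engine flagged, rather than the whole image.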
TEXT LINE NORMALIZATION SYSTEMS AND METHODS
A method for estimating text heights of text line images includes estimating a text height with a sequence recognizer. The method further includes normalizing a vertical dimension and/or position of text within a text line image based on the text height. The method may also further include calculating a feature of the text line image. In some examples, the sequence recognizer estimates the text height with a machine learning model.
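A minimal sketch of the normalization step, assuming the text line is a list of pixel rows and using nearest-neighbour resampling (the resampling scheme and the estimated height are assumptions; the patent's sequence recognizer would supply the estimate):

```python
def normalize_text_height(line_rows, estimated_height, target_height):
    """Rescale the vertical dimension of a text line (a list of pixel
    rows) so that `estimated_height` maps to `target_height`."""
    scale = target_height / estimated_height
    new_row_count = max(1, round(len(line_rows) * scale))
    # Nearest-neighbour pick of source rows (a simplifying assumption).
    return [line_rows[min(len(line_rows) - 1, int(i / scale))]
            for i in range(new_row_count)]
```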
Apparatus, method, and non-transitory recording medium for a document fold determination based on the change point block detection
An image processing apparatus includes: a character determining unit configured to divide the read image into multiple blocks, each block including multiple characters, and to determine an inclination of each character in each block; a block processing unit configured to detect a change point block, the change point block being a block that includes at least a first threshold number of characters whose inclination falls within a first inclination interval and at least the first threshold number of characters whose inclination falls within a second inclination interval different from the first; and a fold determining unit configured to determine that the document is folded if the change point block is detected.
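The change-point-block test can be sketched directly from the description above; representing each block as a list of per-character inclination angles and intervals as (low, high) tuples is an assumption of this sketch:

```python
def is_change_point_block(char_inclinations, interval_a, interval_b, threshold):
    """Return True if at least `threshold` characters fall in
    `interval_a` AND at least `threshold` fall in the distinct
    `interval_b` (intervals given as half-open (low, high) tuples)."""
    in_a = sum(interval_a[0] <= x < interval_a[1] for x in char_inclinations)
    in_b = sum(interval_b[0] <= x < interval_b[1] for x in char_inclinations)
    return in_a >= threshold and in_b >= threshold

def document_is_folded(blocks, interval_a, interval_b, threshold):
    """The document is judged folded if any change point block exists."""
    return any(is_change_point_block(b, interval_a, interval_b, threshold)
               for b in blocks)
```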
Neural network-based optical character recognition
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for neural network-based optical character recognition. An embodiment of the system may generate a set of bounding boxes based on reshaped image portions that correspond to image data of a source image. The system may merge any intersecting bounding boxes into a merged bounding box to generate a set of merged bounding boxes indicative of image data portions that likely portray one or more words. Each merged bounding box may be fed by the system into a neural network to identify one or more words of the source image represented in the respective merged bounding box. The one or more identified words may be displayed by the system according to a standardized font and a confidence score.
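The bounding-box merge step can be sketched as repeatedly replacing any pair of intersecting boxes with their bounding union until no intersections remain; the (x1, y1, x2, y2) box representation is an assumption:

```python
def intersects(a, b):
    """Axis-aligned overlap test for boxes given as (x1, y1, x2, y2)."""
    return not (a[2] < b[0] or b[2] < a[0] or a[3] < b[1] or b[3] < a[1])

def merge_boxes(boxes):
    """Merge intersecting boxes into their bounding union, yielding one
    merged box per connected group of intersecting boxes."""
    boxes = list(boxes)
    merged = True
    while merged:                 # repeat until no pair intersects
        merged = False
        kept = []
        while boxes:
            box = boxes.pop()
            for i, other in enumerate(boxes):
                if intersects(box, other):
                    # Replace the pair with its bounding union.
                    boxes[i] = (min(box[0], other[0]), min(box[1], other[1]),
                                max(box[2], other[2]), max(box[3], other[3]))
                    merged = True
                    break
            else:
                kept.append(box)
        boxes = kept
    return boxes
```

Each surviving box then corresponds to an image region likely portraying one or more words, ready to be fed to the recognition network.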
Method, terminal, and computer storage medium for image classification
Disclosed are a method, terminal and computer readable storage medium for image classification. The method includes: determining an image feature vector of an image based on a convolutional neural network, where the image comprises textual information; determining a text feature vector based on the textual information and an embedded network; determining an image-text feature vector by joining the image feature vector with the text feature vector; and determining a category of the image based on a result of a deep neural network, where the result is determined based on the image feature vector, the text feature vector and the image-text feature vector.
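A minimal sketch of the feature-joining and classification flow; concatenation as the joining operation and the `head` callable standing in for the deep neural network are both assumptions of this sketch:

```python
def join_features(image_vec, text_vec):
    """Join the image and text feature vectors into the image-text
    feature vector (concatenation is an assumed joining operation)."""
    return list(image_vec) + list(text_vec)

def classify(image_vec, text_vec, head):
    """Determine the image category from the image, text, and joint
    vectors; `head` stands in for the deep neural network."""
    joint = join_features(image_vec, text_vec)
    return head(image_vec, text_vec, joint)
```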