G06K9/34

Image processing apparatus
11188779 · 2021-11-30

Processing a dithered image comprising a grid of pixels includes: defining an array of pixels corresponding to a sub-region of the image; performing edge detection along the rows and the columns of the array; counting the number of edges detected along the rows of the array to determine the number of horizontal edges in the array; counting the number of edges detected along the columns of the array to determine the number of vertical edges in the array; identifying whether the sub-region is dithered based on the number of horizontal and vertical edges in the array; and selectively processing the corresponding sub-region of the image based on whether or not the sub-region is identified to be dithered. The identification step may also be based on the lengths of segments of similar pixels in the lines of the array.
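The counting steps above can be sketched as follows. The edge threshold, the per-line edge count used as a dither heuristic, and the test patterns are illustrative assumptions, not values from the patent.

```python
import numpy as np

def count_edges(line, threshold=64):
    """Count intensity transitions (edges) along a 1-D line of pixels."""
    diffs = np.abs(np.diff(line.astype(int)))
    return int(np.sum(diffs >= threshold))

def is_dithered(array, min_edges_per_line=2):
    """Flag a sub-region as dithered when its rows and columns together
    contain many edges; the threshold here is an illustrative heuristic."""
    h_edges = sum(count_edges(row) for row in array)    # edges along rows
    v_edges = sum(count_edges(col) for col in array.T)  # edges along columns
    n_lines = array.shape[0] + array.shape[1]
    return (h_edges + v_edges) >= min_edges_per_line * n_lines

checker = (np.indices((8, 8)).sum(axis=0) % 2) * 255  # dither-like pattern
flat = np.full((8, 8), 128)                           # smooth region
```

A checkerboard sub-region produces an edge at every pixel boundary in both directions and is flagged; a flat region produces none.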

Text rendering by microshifting the display in a head mounted display

Improved text rendering by microshifting the display in a head mounted display is provided. Systems, methods, and computer-readable devices provide a head mounted display. The head mounted display includes a display unit; a rotational actuator coupled to the display unit; and a rotation processor having a rotation sensor coupled to the display unit, wherein, as the head mounted display is rotated, the rotation processor is operable to signal the rotational actuator to rotate the display unit to counter the rotation of the head mounted display.
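The counter-rotation logic reduces to commanding the actuator with the negative of the sensed rotation. The simple clamped scheme and the actuator range below are illustrative assumptions; a real controller would also handle dynamics and timing.

```python
def counter_rotation_command(sensed_rotation_deg, max_actuator_deg=5.0):
    """Return the actuator command that counters the sensed head
    rotation, clamped to the actuator's mechanical range (the range
    value here is an assumption)."""
    command = -sensed_rotation_deg
    return max(-max_actuator_deg, min(max_actuator_deg, command))
```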

Enhancing electronic documents for character recognition

Techniques for translating a document image into an editable electronic textual document are presented. Utilizing respective applications, a document processing management component (DPMC) can convert the document image to a grayscale document image, remove noise from the image, rotate the image to reduce or eliminate skewing, and perform character recognition on the rotated grayscale document image to extract the textual information and generate an electronic textual document. The DPMC can associate a document identifier with the electronic textual document, and the document and its identifier can be stored in a data store. When the document relates to a device or other item, a code or textual string can be associated with the device or item, wherein a communication device can scan the code or textual string. In response, the DPMC can retrieve the document, or information relating thereto, from the data store.
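The pipeline order (grayscale, denoise, deskew, OCR, then identifier assignment) can be sketched as below. The stage functions are caller-supplied stand-ins, and the hash-based identifier scheme is an illustrative assumption, not the patent's.

```python
import hashlib

def process_document(image, to_grayscale, denoise, deskew, recognize):
    """Run the described pipeline -- grayscale, denoise, deskew, OCR --
    then associate a document identifier with the extracted text."""
    gray = to_grayscale(image)
    clean = denoise(gray)
    aligned = deskew(clean)
    text = recognize(aligned)
    doc_id = hashlib.sha256(text.encode()).hexdigest()[:12]  # illustrative id
    return doc_id, text

# Identity stubs stand in for real image-processing stages:
identity = lambda x: x
doc_id, text = process_document("scanned page", identity, identity, identity, identity)
```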

Method for Image Classification, Computer Device, and Storage Medium
20210365732 · 2021-11-25

A method for image classification includes acquiring a to-be-classified image and inputting it into a trained image classification model. The trained image classification model includes a localization segmentation sub-network, an alignment sub-network, and a classification sub-network, where the alignment sub-network is formulated as a valve linkage function: in a forward-propagation phase, the output of the valve linkage function is an aligned image; in a backward-propagation phase, its output is a function. The method further includes passing the to-be-classified image through the localization segmentation sub-network to locate and segment a target object of the to-be-classified image and obtain a segmented image; passing the segmented image through the alignment sub-network, which aligns the target object to obtain an aligned image; and passing the aligned image through the classification sub-network for fine-grained classification to obtain the class corresponding to the to-be-classified image.

DUAL STAGE NEURAL NETWORK PIPELINE SYSTEMS AND METHODS
20210365737 · 2021-11-25

A method of identifying and recognizing characters using a dual-stage neural network pipeline, the method including: receiving, by a computing device, image data; providing the image data to a first convolutional layer of a convolutional neural network (CNN); applying, using the CNN, pattern recognition to the image data to identify a region of the image data containing text; providing sub-image data comprising the identified region of the image data to a convolutional recurrent neural network (CRNN); and recognizing, using the CRNN, the characters within the sub-image data.
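The two-stage structure (detect a text region, then recognize characters inside it) can be sketched with toy stand-ins for the networks. The ASCII "image", the ink test, and the stage functions are all illustrative assumptions; the real pipeline uses a CNN and a CRNN.

```python
def detect_text_region(image):
    """Stage 1 (a stand-in for the CNN detector): return the span of
    rows that contain any ink (non-space characters)."""
    rows = [i for i, row in enumerate(image) if row.strip()]
    return (rows[0], rows[-1]) if rows else None

def recognize_characters(sub_image):
    """Stage 2 (a stand-in for the CRNN recognizer): read characters
    out of the cropped region."""
    return "".join(row.strip() for row in sub_image)

def read_text(image):
    """Dual-stage pipeline: detect the text region, crop it, recognize it."""
    span = detect_text_region(image)
    if span is None:
        return ""
    top, bottom = span
    return recognize_characters(image[top:bottom + 1])

page = ["        ", "  HEL   ", "  LO    ", "        "]
```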

IMAGE CLASSIFICATION METHOD, COMPUTER-READABLE STORAGE MEDIUM, AND COMPUTER DEVICE
20210365741 · 2021-11-25

A computer device obtains a plurality of medical images. The device generates a texture image based on image data of a region of interest in the medical images. The device extracts a local feature from the texture image using a first network model. The device extracts a global feature from the medical images using a second network model. The device fuses the extracted local feature and the extracted global feature to form a fused feature. The device performs image classification based on the fused feature.
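The fusion step can be sketched as follows. Concatenation is one common fusion operator and the linear head is illustrative; the abstract does not specify either choice.

```python
import numpy as np

def fuse_features(local_feat, global_feat):
    """Fuse the local (texture) feature and the global feature by
    concatenation -- one common fusion choice."""
    return np.concatenate([local_feat, global_feat])

def classify(fused, weights, bias):
    """Illustrative linear classification head over the fused feature."""
    return int(np.argmax(weights @ fused + bias))

local = np.array([1.0, 1.0, 1.0])   # e.g. from the texture-image model
global_ = np.array([0.0, 0.0])      # e.g. from the whole-image model
fused = fuse_features(local, global_)
```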

METHOD AND SYSTEM FOR SEGMENTING TOUCHING TEXT LINES IN IMAGE OF UCHEN-SCRIPT TIBETAN HISTORICAL DOCUMENT

A method and system for segmenting touching text lines in an image of a uchen-script Tibetan historical document are provided. The method includes: first obtaining a binary image of a uchen-script Tibetan historical document after layout analysis; detecting local baselines in the binary image, to generate a local baseline information set; detecting and segmenting a touching region in the binary image according to the local baseline information set, to generate a touching-region-segmented image; allocating connected components in the touching-region-segmented image to corresponding lines, to generate a text line allocation result; and splitting text lines in the touching-region-segmented image according to the text line allocation result, to generate a line-segmented image. In the present disclosure, touching text lines in a Tibetan historical document can be effectively segmented, and text line segmentation efficiency of the Tibetan historical document is improved.
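For contrast with the baseline-guided method above, the simplest line-segmentation scheme is a projection-profile split, which fails exactly where lines touch. The sketch below shows only that naive split, on an assumed binary row-list representation; it is not the patent's touching-aware algorithm.

```python
def split_lines(binary_image):
    """Minimal projection-profile split: blank rows (no ink) separate
    text lines. A stand-in for, not an implementation of, the
    baseline-guided, touching-aware segmentation."""
    lines, current = [], []
    for row in binary_image:
        if any(row):
            current.append(row)
        elif current:
            lines.append(current)
            current = []
    if current:
        lines.append(current)
    return lines

page = [
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],  # blank separator row
    [1, 1, 0, 0],
]
```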

INFORMATION PROCESSING METHOD AND INFORMATION PROCESSING SYSTEM

An information processing method includes the following, executed by a computer: acquiring a first image and object data of an object appearing in the first image; extracting a portion of the first image that corresponds to a difference between the object data and an object detection result obtained by inputting the first image to a trained model, the trained model receiving an image as input and outputting an object detection result; acquiring a second image that includes a portion corresponding to the same object data as the object data corresponding to the extracted portion of the first image; reflecting an image based on the extracted portion of the first image in the portion of the acquired second image that corresponds to the same object data; and generating training data for the trained model.
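The "difference" step, reduced to object labels, can be sketched as below. Representing objects as plain strings and list membership as the comparison are illustrative assumptions; the method operates on image portions.

```python
def missed_portions(object_data, detection_result):
    """Objects present in the annotated data but absent from the
    model's detections -- the 'difference' the method extracts."""
    return [obj for obj in object_data if obj not in detection_result]

def reflect_into(second_image_objects, missed):
    """Reflect the extracted portions into the second image's object
    list, yielding a new training sample."""
    return second_image_objects + [m for m in missed if m not in second_image_objects]

annotated = ["car", "pedestrian", "sign"]
detected = ["car"]  # the model missed two annotated objects
sample = reflect_into(["sign"], missed_portions(annotated, detected))
```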

APPARATUS AND METHOD FOR TRAINING MODEL FOR IMAGE SEGMENTATION
20210366128 · 2021-11-25

An image segmentation model training apparatus according to a disclosed embodiment includes: a predictor, which generates a plurality of original masks using an unlabeled original image set and a pre-trained image segmentation model; a label generator, which generates a synthesized image set based on the original image set and the plurality of original masks, and generates a plurality of pseudo labels based on the synthesized image set; a pre-processor, which generates a training image set by performing pseudo labeling on the original image set and the synthesized image set using the plurality of pseudo labels; and a model trainer, which further trains the image segmentation model based on the training image set.
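The predictor and pre-processor stages can be sketched in miniature. Reducing the segmentation model to an intensity threshold is an illustrative assumption; the synthesized-image generation and model training stages are omitted.

```python
import numpy as np

def predict_masks(threshold, images):
    """Predictor: apply the current segmentation model (reduced here to
    a simple intensity threshold) to unlabeled images to get masks."""
    return [(img > threshold).astype(int) for img in images]

def build_training_set(images, pseudo_labels):
    """Pre-processor: pair each image with its pseudo label to form the
    training set used to further train the model."""
    return list(zip(images, pseudo_labels))

originals = [np.array([[10, 200], [30, 250]])]
masks = predict_masks(128, originals)
train = build_training_set(originals, masks)
```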

Terminal apparatus, character recognition system, and character recognition method
11182635 · 2021-11-23

A personal information separation unit separates a document image containing personal information into a personal information image containing the personal information and a general information image that does not contain the personal information on the basis of the document image, and transmits the general information image to a cloud server. A recognition result integration unit receives a general recognition result that is the recognition result of the character recognition processing for the general information image from the cloud server, and acquires a target recognition result that is the recognition result of the character recognition processing for the document image in accordance with the general recognition result and the information based on the personal information image.
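The separate-then-integrate flow can be sketched over field-level data. Modeling the document as a field dictionary and a key-based personal-information test are illustrative assumptions; the patent separates image regions.

```python
def separate(fields, is_personal):
    """Split a document's fields into a personal-information part (kept
    local) and a general part (sent to the cloud server)."""
    personal = {k: v for k, v in fields.items() if is_personal(k)}
    general = {k: v for k, v in fields.items() if not is_personal(k)}
    return personal, general

def integrate(general_result, personal_result):
    """Combine the cloud's recognition result for the general image with
    the locally handled personal-information result."""
    merged = dict(general_result)
    merged.update(personal_result)
    return merged

fields = {"name": "img_name", "total": "img_total"}
personal, general = separate(fields, lambda k: k == "name")
```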