Patent classifications
G06V30/162
Handwritten content removing method and device and storage medium
A handwritten content removing method, a device, and a storage medium are provided. The handwritten content removing method comprises: acquiring an input image of a text page to be processed, the input image comprising a handwritten region that contains handwritten content (S10); identifying the input image so as to determine the handwritten content in the handwritten region (S11); and removing the handwritten content from the input image so as to obtain an output image (S12).
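The removal step (S12) can be illustrated with a minimal sketch. It assumes the identification step (S11) has already produced a boolean mask of handwritten pixels; the mask, the grayscale list-of-rows image format, and the function name `remove_handwriting` are illustrative assumptions, since the abstract does not say how the handwriting is located or erased.

```python
from typing import List

Image = List[List[int]]  # grayscale rows, 0 = black, 255 = white
Mask = List[List[bool]]  # True where a pixel was identified as handwriting (assumed input)


def remove_handwriting(image: Image, mask: Mask, background: int = 255) -> Image:
    """Return a copy of `image` with every masked pixel set to `background`."""
    if len(image) != len(mask) or any(len(r) != len(m) for r, m in zip(image, mask)):
        raise ValueError("image and mask must have the same shape")
    return [
        [background if is_handwriting else pixel
         for pixel, is_handwriting in zip(row, mask_row)]
        for row, mask_row in zip(image, mask)
    ]
```

Painting masked pixels with a flat background value is the simplest possible choice; it would also erase any printed text that the handwriting overlaps.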
IMAGE READING APPARATUS
An image reading apparatus includes a conveyance unit configured to convey an original; a reading unit comprising a reading sensor, the reading sensor having a light receiving element to receive light of a first color and a light receiving element to receive light of a second color that is different from the first color, wherein the reading unit is configured to read an image of the original conveyed by the conveyance unit by using the reading sensor to generate image data which represents a reading result of the reading unit; at least one processor configured to: determine a first abnormal position that is a position in a first direction of an abnormal pixel of the first color in an image represented by the image data.
Range and/or polarity-based thresholding for improved data extraction
Computerized techniques for improved binarization and extraction of information from digital image data are disclosed in accordance with various embodiments. The inventive concepts include rendering a digital image using a plurality of binarization thresholds to generate a plurality of binarized digital images, wherein at least some of the binarized digital images are generated using one or more binarization thresholds that are determined based on a priori knowledge regarding an object depicted in the digital image; identifying one or more connected components within the plurality of binarized digital images; and identifying one or more text regions within the digital image based on some or all of the connected components. Systems and computer program products are also disclosed.
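A minimal sketch of the first two steps, rendering one image at several thresholds and collecting connected components from each rendering. The thresholds are simply passed in here, whereas the abstract derives at least some of them from a priori knowledge about the depicted object; the function names, the 4-connectivity, and the bounding-box output are assumptions, and the final grouping of components into text regions is not shown.

```python
from collections import deque
from typing import Dict, List, Tuple

Image = List[List[int]]          # grayscale rows, 0..255
Box = Tuple[int, int, int, int]  # (top, left, bottom, right), inclusive


def binarize(image: Image, threshold: int, dark_on_light: bool = True) -> List[List[bool]]:
    """Mark foreground pixels; the polarity flag picks which side of the threshold counts."""
    if dark_on_light:
        return [[px < threshold for px in row] for row in image]
    return [[px >= threshold for px in row] for row in image]


def connected_components(binary: List[List[bool]]) -> List[Box]:
    """Bounding boxes of 4-connected foreground regions, found by breadth-first search."""
    height = len(binary)
    width = len(binary[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    boxes: List[Box] = []
    for r in range(height):
        for c in range(width):
            if not binary[r][c] or seen[r][c]:
                continue
            seen[r][c] = True
            queue = deque([(r, c)])
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and binary[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def components_over_thresholds(image: Image, thresholds: List[int]) -> Dict[int, List[Box]]:
    """Map each threshold to the component boxes found in its binarized rendering."""
    return {t: connected_components(binarize(image, t)) for t in thresholds}
```

Faint strokes that vanish at a low threshold can reappear as separate components at a higher one, which is the reason for keeping the results of every rendering rather than a single binarization.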
Preprocessing images for OCR using character pixel height estimation and cycle generative adversarial networks for better character recognition
A text extraction computing method comprises calculating an estimated character pixel height of text from a digital image. The method may scale the digital image using the estimated character pixel height and a preferred character pixel height. The method may binarize the digital image. The method may remove distortions using a neural network trained by a cycle GAN on a set of source text images and a set of clean text images. The sets of source text images and clean text images are unpaired. The source text images may be distorted images of text. Calculating the estimated character pixel height may include summarizing the rows of pixels into a horizontal projection, determining a line-repetition period from the projection, and quantifying the portion of the line-repetition period that corresponds to the text as the estimated character pixel height. The method may extract characters from the digital image using OCR.
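The height estimation can be sketched as follows. This is one possible reading of the abstract, not the patented procedure: the period is taken as the lag that maximizes the autocorrelation of the projection, and the text portion of a period is taken as the fraction of rows whose ink count exceeds half the maximum. Both choices, and all function names, are assumptions. The input is an already binarized image with 1 for ink and 0 for background.

```python
from typing import List


def horizontal_projection(binary: List[List[int]]) -> List[int]:
    """Summarize each pixel row as its count of ink pixels (value 1)."""
    return [sum(row) for row in binary]


def line_repetition_period(projection: List[int]) -> int:
    """Lag in rows that maximizes the autocorrelation of the mean-centred projection.

    The raw (unnormalized) sum is used, which favours the shortest period
    over its integer multiples.
    """
    n = len(projection)
    if n < 4:
        raise ValueError("projection is too short to contain a repeating line pattern")
    mean = sum(projection) / n
    centred = [p - mean for p in projection]
    best_lag, best_score = 0, float("-inf")
    for lag in range(2, n // 2 + 1):
        score = sum(centred[i] * centred[i + lag] for i in range(n - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def estimate_character_height(binary: List[List[int]]) -> float:
    """Period multiplied by the fraction of rows that carry text."""
    projection = horizontal_projection(binary)
    period = line_repetition_period(projection)
    cutoff = max(projection) / 2  # assumed rule for deciding that a row belongs to text
    text_fraction = sum(1 for p in projection if p > cutoff) / len(projection)
    return period * text_fraction


def scale_factor(estimated_height: float, preferred_height: float) -> float:
    """Factor by which to resize the image so characters reach the preferred height."""
    if estimated_height <= 0:
        raise ValueError("estimated height must be positive")
    return preferred_height / estimated_height
```

For a synthetic page made of three ink rows followed by five blank rows, repeated four times, the sketch finds a period of 8 rows and a character height of 3 pixels, so a preferred height of 30 pixels gives a scale factor of 10.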
SYSTEM LANGUAGE SWITCHING METHOD, READABLE STORAGE MEDIUM, TERMINAL DEVICE, AND APPARATUS
The present application relates to the technical field of computers, and particularly to a system language switching method, a computer readable storage medium, a terminal device, and an apparatus. The method includes first obtaining a preset image for setting a system language of a target terminal, then extracting text information in the image and determining a target language corresponding to the text information, and finally switching the system language of the target terminal to the target language. With the present application, the user only needs to prepare in advance an image for setting the system language of the target terminal, for example, a piece of paper with Chinese text written on it. The system can then obtain the text information in the image through image acquisition, text information extraction, and the like, determine that the text information is Chinese, and finally switch the system language of the target terminal to Chinese. The entire process is simple and convenient, which improves the user experience.
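The step of determining a target language from the extracted text can be sketched with a simple script count. The abstract does not say how the language is determined, so the Unicode-range approach, the small range table, and the name `detect_target_language` are assumptions for illustration. Mapping Cyrillic to "ru" and basic Latin letters to "en" is a deliberate simplification, and text extraction and the actual system language switch are not shown.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple

# Illustrative ranges only; a real system would cover many more scripts
# and would need more than script detection to separate languages.
SCRIPT_RANGES: Dict[str, List[Tuple[int, int]]] = {
    "zh": [(0x4E00, 0x9FFF)],                    # CJK Unified Ideographs
    "ja": [(0x3040, 0x309F), (0x30A0, 0x30FF)],  # Hiragana, Katakana
    "ko": [(0xAC00, 0xD7A3)],                    # Hangul syllables
    "ru": [(0x0400, 0x04FF)],                    # Cyrillic
    "en": [(0x0041, 0x005A), (0x0061, 0x007A)],  # Basic Latin letters
}


def detect_target_language(text: str) -> Optional[str]:
    """Return the language code whose script covers the most characters, or None."""
    counts: Counter = Counter()
    for ch in text:
        code_point = ord(ch)
        for language, ranges in SCRIPT_RANGES.items():
            if any(low <= code_point <= high for low, high in ranges):
                counts[language] += 1
                break
    if not counts:
        return None
    return counts.most_common(1)[0][0]
```

With this table, text extracted from a sheet of paper with Chinese characters on it resolves to "zh", which a caller could then pass to whatever interface the terminal exposes for changing its system language.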
SYSTEMS AND METHODS FOR OBTAINING INSURANCE OFFERS USING MOBILE IMAGE CAPTURE
Systems and methods for using a mobile device to submit an application for an insurance policy using images of documents captured by the mobile device are provided herein. The information is then used by an insurance company to generate a quote which is then displayed to the user on the mobile device. A user captures images of one or more documents containing information needed to complete an insurance application, after which the information on the documents is extracted and sent to the insurance company where a quote for the insurance policy can be developed. The quote can then be transmitted back to the user. Applications on the mobile device are configured to capture images of the documents needed for an insurance application, such as a driver's license, insurance information card or a vehicle identification number (VIN). The images are then processed to extract the information needed for the insurance application.
Background noise reduction using a variable range of color values dependent upon the initial background color distribution
A method to reduce background noise in a document image. The method includes extracting, from the document image, a connected component corresponding to a background of the document image, generating a histogram of pixel values of the connected component, generating, using a non-linear mapping function based on the histogram, a non-linear probability distribution of the pixel values in the connected component, generating, based at least on a comparison between the non-linear probability distribution and a predetermined threshold, a replacement range of the pixel values, selecting, from the connected component, a pixel having a pixel value within the replacement range, and converting the pixel value of the pixel to a uniform background color.
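The histogram-to-replacement-range steps can be sketched as below. The background component is assumed to be given as a boolean mask, the non-linear mapping is assumed to be a power function with exponent `gamma`, and the uniform background color is assumed to be the most frequent background value. The abstract specifies none of these choices, so this is an illustration of the flow rather than the claimed method.

```python
from collections import Counter
from typing import List, Optional, Tuple

Image = List[List[int]]  # grayscale rows, 0..255


def replacement_range(values: List[int], threshold: float,
                      gamma: float = 0.5) -> Optional[Tuple[int, int]]:
    """Range of pixel values whose non-linearly mapped probability reaches `threshold`."""
    histogram = Counter(values)
    total = sum(histogram.values())
    # Assumed non-linear mapping: raise each relative frequency to the power gamma.
    mapped = {value: (count / total) ** gamma for value, count in histogram.items()}
    norm = sum(mapped.values())
    kept = [value for value, m in mapped.items() if m / norm >= threshold]
    return (min(kept), max(kept)) if kept else None


def flatten_background(image: Image, background_mask: List[List[bool]],
                       threshold: float = 0.2) -> Image:
    """Set background pixels inside the replacement range to one uniform value."""
    values = [px for row, mask_row in zip(image, background_mask)
              for px, is_bg in zip(row, mask_row) if is_bg]
    value_range = replacement_range(values, threshold) if values else None
    if value_range is None:
        return [row[:] for row in image]
    low, high = value_range
    uniform = Counter(values).most_common(1)[0][0]  # assumed choice of background color
    return [
        [uniform if is_bg and low <= px <= high else px
         for px, is_bg in zip(row, mask_row)]
        for row, mask_row in zip(image, background_mask)
    ]
```

Because the range depends on the measured distribution, a page with a noisy background yields a wider replacement range than a clean one, while rare outlier values inside the background component are left untouched.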