G06V30/164

RELIABLE DETERMINATION OF FIELD VALUES IN DOCUMENTS WITH REMOVAL OF STATIC FIELD ELEMENTS
20240144711 · 2024-05-02

Aspects and implementations provide mechanisms for detecting fields in electronic documents and determining the values of the detected fields. The disclosed techniques include obtaining an input into a machine learning model (MLM), the input including a first image of a field extracted from a document and depicting one or more static elements of the field and a field value, and further including a second image of the field. The input may be processed using the MLM to identify one or more static regions that correspond to the static elements of the field. The identified static regions may be used to produce a modified first image in which the static regions are removed or have reduced visibility. The modified image may then be used to determine the field value.
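As a minimal sketch of the masking step, assuming the static regions have already been predicted by the MLM (the helper name, box format, and background value are illustrative, not the patent's API):

```python
import numpy as np

def remove_static_regions(field_image, static_regions, background=255):
    """Suppress static field elements (e.g. pre-printed labels or boxes)
    so that only the filled-in field value remains visible.

    field_image:    2D numpy array (grayscale crop of the field).
    static_regions: list of (top, left, bottom, right) boxes, as might be
                    predicted by the MLM (supplied directly in this toy).
    """
    cleaned = field_image.copy()
    for top, left, bottom, right in static_regions:
        cleaned[top:bottom, left:right] = background  # erase static element
    return cleaned

# Toy example: a 10x10 "field" where rows 0-2 hold a static label.
img = np.zeros((10, 10), dtype=np.uint8)          # dark ink everywhere
cleaned = remove_static_regions(img, [(0, 0, 3, 10)])
print(cleaned[0, 0], cleaned[5, 5])  # 255 0
```

The cleaned crop would then be passed to the value-recognition stage, which no longer has to distinguish pre-printed elements from the handwritten or typed value.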

METHOD AND DEVICE FOR EXTRACTING INFORMATION IN HISTOGRAM
20190266435 · 2019-08-29

The present application relates to a method and device for extracting information from a histogram for display on an electronic device. The method comprises the following steps: inputting, into the electronic device, a document that includes a histogram to be processed; detecting each element in the histogram by using a target detection method based on a Faster R-CNN model pre-stored in the electronic device; performing text recognition on each detected text element box to extract the corresponding text information; and converting all the detected elements and text information into structured data for display on the electronic device. Because the method and device detect all the elements in the histogram through deep learning, using the Faster R-CNN model for target detection, they provide a simple and effective solution for information extraction from histograms.
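Downstream of detection and OCR, the structuring step might look like this sketch (the detections are supplied directly here; in the patent they would come from the Faster R-CNN detector, and the nearest-label matching rule is an illustrative assumption):

```python
def structure_histogram(detections):
    """Convert raw detections from a histogram image into structured data.

    detections: list of dicts with keys 'label' (element class from the
                detector), 'box' ((x1, y1, x2, y2) pixel box) and, for
                text elements, 'text' from OCR.
    Returns a dict pairing each bar with its closest x-axis label.
    """
    bars = [d for d in detections if d["label"] == "bar"]
    labels = [d for d in detections if d["label"] == "x_label"]
    structured = {}
    for bar in bars:
        bx = (bar["box"][0] + bar["box"][2]) / 2  # bar centre (x)
        # Match the x-axis label whose centre is horizontally closest.
        nearest = min(labels,
                      key=lambda l: abs((l["box"][0] + l["box"][2]) / 2 - bx))
        # Bar top y serves as a height proxy (smaller y = taller bar
        # when the baseline sits below the bars).
        structured[nearest["text"]] = bar["box"][1]
    return structured

detections = [
    {"label": "bar", "box": (10, 50, 30, 100)},
    {"label": "bar", "box": (40, 20, 60, 100)},
    {"label": "x_label", "box": (10, 100, 30, 110), "text": "A"},
    {"label": "x_label", "box": (40, 100, 60, 110), "text": "B"},
]
structured = structure_histogram(detections)
print(structured)  # {'A': 50, 'B': 20}
```

A real implementation would additionally read axis ticks to convert pixel heights into data values.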

CONTRAST ENHANCEMENT AND REDUCTION OF NOISE IN IMAGES FROM CAMERAS

The subject matter of this specification can be implemented in, among other things, a method including identifying one or more blocks in an electronic image that depicts text characters. The method includes identifying one or more text blocks among the blocks that depict the text characters. The method includes identifying an average text contrast for each of the text blocks. The method includes identifying a type for each pixel in each of the text blocks based on the average text contrast. The method includes performing local adaptive filtering on a first neighborhood of pixels around each pixel in each of the text blocks to determine a brightness for the pixel based on the identified type. The method includes storing, in at least one memory, the electronic image including the determined brightness for each pixel in each of the text blocks.
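A toy version of the per-pixel step, assuming a single text block and a simple contrast rule (the threshold test and neighbourhood mean are illustrative stand-ins for the patent's pixel-type classification and local adaptive filtering):

```python
import numpy as np

def adaptive_filter(block, contrast_threshold, radius=1):
    """Classify each pixel as 'text' or 'background' relative to the
    block's mean (a stand-in for the average text contrast), then smooth
    only background pixels so text edges stay sharp."""
    h, w = block.shape
    out = block.astype(float).copy()
    mean = block.mean()
    for y in range(h):
        for x in range(w):
            if abs(block[y, x] - mean) > contrast_threshold:
                continue                      # text pixel: keep it sharp
            y0, y1 = max(0, y - radius), min(h, y + radius + 1)
            x0, x1 = max(0, x - radius), min(w, x + radius + 1)
            out[y, x] = block[y0:y1, x0:x1].mean()  # smooth background
    return out.astype(np.uint8)

block = np.full((5, 5), 200, dtype=np.uint8)
block[2, 2] = 0                               # one dark "text" pixel
block[0, 0] = 180                             # background noise speck
out = adaptive_filter(block, contrast_threshold=50)
```

Here the dark text pixel survives untouched while the noisy background pixel is pulled back toward its neighbourhood mean, which is the intent of filtering by pixel type.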

SALIENCE-AWARE CROSS-ATTENTION FOR ABSTRACTIVE SUMMARIZATION

A method including: receiving an input comprising natural language texts at an encoder; adding a token to the input; obtaining a last-layer hidden state as a natural language text representation; feeding the natural language text representation into a single-layer classification head; predicting a salience allocation based on the single-layer classification head; developing a salience-aware cross-attention (SACA) decoder to determine salience in the natural language text representation; mapping a plurality of salience degrees to a plurality of trainable salience embeddings; estimating an amount of signal to accept from the plurality of trainable salience embeddings; incorporating the salience allocation and the signal in a cross-attention layer model; and generating a summarization based on the SACA decoder and the cross-attention layer model.
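The embedding-and-gating step can be sketched as follows; the additive, scalar-gated form is an illustrative assumption, since the abstract only says an amount of signal to accept from the salience embeddings is estimated:

```python
import numpy as np

rng = np.random.default_rng(0)

def salience_aware_keys(hidden, salience_ids, salience_emb, gate):
    """Mix gated salience embeddings into the encoder states that the
    decoder cross-attends over.

    hidden:       (T, d) last-layer encoder hidden states.
    salience_ids: (T,) predicted salience degree per token.
    salience_emb: (S, d) one trainable embedding per salience degree.
    gate:         scalar in [0, 1], the accepted amount of signal.
    """
    return hidden + gate * salience_emb[salience_ids]

T, d, S = 4, 8, 3
hidden = rng.normal(size=(T, d))       # natural language text representation
emb = rng.normal(size=(S, d))          # trainable salience embeddings
ids = np.array([0, 2, 1, 2])           # e.g. output of the classification head
keys = salience_aware_keys(hidden, ids, emb, gate=0.5)
```

In a full SACA decoder these salience-augmented states would feed the key/value side of each cross-attention layer during summary generation.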

ELECTRONIC APPARATUS AND IMAGE PROCESSING METHOD
20240249542 · 2024-07-25

An electronic apparatus includes a processor configured to execute area extraction processing of extracting a character area including a character in an image, processing of applying dithering to an area other than the character area in the image, and emphasizing processing of emphasizing the character in the character area.
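The two-step processing can be sketched as follows, with an ordered (Bayer) dither and simple darkening standing in for whichever concrete operators an implementation would choose:

```python
import numpy as np

BAYER_2X2 = np.array([[0, 2], [3, 1]]) / 4.0  # ordered-dither thresholds

def dither_and_emphasize(gray, char_mask):
    """Apply dithering outside the extracted character area and emphasise
    pixels inside it (both operators are illustrative choices).

    gray:      2D float array in [0, 1].
    char_mask: 2D bool array, True where a character was extracted.
    """
    h, w = gray.shape
    thresh = np.tile(BAYER_2X2, (h // 2 + 1, w // 2 + 1))[:h, :w]
    dithered = (gray > thresh).astype(float)       # binarise non-text area
    emphasized = np.clip(gray * 0.5, 0.0, 1.0)     # darken character pixels
    return np.where(char_mask, emphasized, dithered)

gray = np.full((4, 4), 0.6)
char_mask = np.zeros((4, 4), dtype=bool)
char_mask[1:3, 1:3] = True                         # extracted character area
out = dither_and_emphasize(gray, char_mask)
```

The split keeps character pixels continuous-tone (and darker, hence emphasised) while the surrounding area is reduced to a dithered pattern.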

Model based document image enhancement

Systems and methods are disclosed for model based document image enhancement. Instead of requiring paired dirty and clean images for training a model to clean document images (which may cause privacy concerns), two models are trained on the unpaired images such that only the dirty images are accessed or only the clean images are accessed at one time. One model is a first implicit model to translate the dirty images from a source space to a latent space, and the other model is a second implicit model to translate the images from the latent space to clean images in a target space. The second implicit model is trained based on translating electronic document images in the target space to the latent space. In some implementations, the implicit models are diffusion models, such as denoising diffusion implicit models based on solving ordinary differential equations.
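The ODE-based translation can be sketched with a toy deterministic DDIM round trip. A zero noise predictor replaces the trained networks here, so this only demonstrates that the eta = 0 mapping between image and latent space is invertible; the patent's scheme would use two differently trained models for the dirty-to-latent and latent-to-clean directions:

```python
import numpy as np

abar = np.cumprod(np.linspace(0.999, 0.7, 10))  # cumulative signal levels

def ddim_step(x, eps, a_from, a_to):
    """One deterministic (eta = 0) DDIM update between noise levels."""
    x0 = (x - np.sqrt(1 - a_from) * eps) / np.sqrt(a_from)
    return np.sqrt(a_to) * x0 + np.sqrt(1 - a_to) * eps

def encode(x, eps_fn):  # image -> latent, following the forward ODE
    for t in range(len(abar) - 1):
        x = ddim_step(x, eps_fn(x), abar[t], abar[t + 1])
    return x

def decode(x, eps_fn):  # latent -> image, reversing the ODE
    for t in range(len(abar) - 1, 0, -1):
        x = ddim_step(x, eps_fn(x), abar[t], abar[t - 1])
    return x

zero_eps = lambda v: np.zeros_like(v)  # stand-in noise predictor
x = np.array([1.0, -2.0])              # toy "dirty image"
latent = encode(x, zero_eps)           # into the shared latent space
recon = decode(latent, zero_eps)       # back out (here: a round trip)
```

With two trained predictors, `encode` with the dirty-space model followed by `decode` with the clean-space model performs the dirty-to-clean translation without ever pairing the training images.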

Optical character recognition systems and methods for personal data extraction

Methods and systems for extracting personal data from a sensitive document are provided. The system includes a document prediction module, a cropping module, a denoising module, and an optical character recognition (OCR) module. The document prediction module predicts the document type of the sensitive document using a keypoint matching-based approach, and the cropping module extracts the document shape and one or more fields comprising text or pictures from the sensitive document. The denoising module prepares the one or more fields for optical character recognition, and the OCR module performs optical character recognition on the denoised fields to detect the characters they contain.
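The four modules compose into a simple pipeline; this sketch wires them as pluggable callables with toy string-based stand-ins (the names and signatures are illustrative, not the patent's API):

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class OcrPipeline:
    """Module chain from the abstract; every stage is a pluggable callable."""
    predict_type: Callable   # document prediction (e.g. keypoint matching)
    crop_fields: Callable    # shape extraction + field cropping
    denoise: Callable        # prepare a field for OCR
    ocr: Callable            # character detection on a denoised field

    def run(self, document):
        doc_type = self.predict_type(document)
        fields = self.crop_fields(document, doc_type)
        return {name: self.ocr(self.denoise(f)) for name, f in fields.items()}

# Toy stand-ins that operate on strings instead of images.
pipeline = OcrPipeline(
    predict_type=lambda doc: "id_card",
    crop_fields=lambda doc, doc_type: {"name": "  alice  "},
    denoise=str.strip,
    ocr=str.upper,
)
result = pipeline.run("raw scan bytes")
print(result)  # {'name': 'ALICE'}
```

Keeping each module a separate callable mirrors the abstract's structure and makes it easy to swap in, say, a different denoiser per document type.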
