G06V30/414

ON-DEVICE ARTIFICIAL INTELLIGENCE SYSTEMS AND METHODS FOR DOCUMENT AUTO-ROTATION
20230049296 · 2023-02-16

An auto-rotation module having a single-layer neural network on a user device can convert a document image to a monochrome image having black and white pixels and segment the monochrome image into bounding boxes, each bounding box defining a connected segment of black pixels in the monochrome image. The auto-rotation module can determine textual snippets from the bounding boxes and prepare them into input images for the single-layer neural network. The single-layer neural network is trained to process each input image, recognize a correct orientation, and output a set of results for each input image. Each result indicates a probability associated with a particular orientation. The auto-rotation module can examine the results, determine what degree of rotation is needed to achieve a correct orientation of the document image, and automatically rotate the document image by the degree of rotation needed to achieve the correct orientation of the document image.
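
The final step of the abstract, examining the per-snippet results and choosing a rotation, can be illustrated with a minimal sketch. It assumes four orientation classes (0, 90, 180, 270 degrees) and aggregates by summing probabilities across snippets. Neither detail is stated in the abstract, and the function name `rotation_needed` is hypothetical.

```python
from typing import Sequence

# Assumed orientation classes: degrees by which the text is currently rotated clockwise.
ORIENTATIONS = (0, 90, 180, 270)


def rotation_needed(results: Sequence[Sequence[float]]) -> int:
    """Return the clockwise rotation (degrees) that would make the page upright.

    `results` holds one probability vector per textual snippet, with one
    probability per entry of ORIENTATIONS (an assumption of this sketch).
    """
    if not results:
        return 0  # nothing to vote on, so leave the image unchanged

    totals = [0.0] * len(ORIENTATIONS)
    for probs in results:
        if len(probs) != len(ORIENTATIONS):
            raise ValueError("each result needs one probability per orientation")
        for i, p in enumerate(probs):
            totals[i] += p  # summed probabilities act as a soft vote

    best = max(range(len(totals)), key=totals.__getitem__)
    detected = ORIENTATIONS[best]
    # Undo the detected rotation.
    return (360 - detected) % 360
```

Summing probabilities rather than counting hard votes lets confident snippets outweigh ambiguous ones; other aggregation rules are equally plausible.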

METHOD AND PLATFORM OF GENERATING DOCUMENT, ELECTRONIC DEVICE AND STORAGE MEDIUM

A method and a platform for generating a document, an electronic device, and a storage medium are provided. They relate to the field of artificial intelligence, in particular to computer vision and deep learning, and may be applied to text recognition and other scenarios. The method includes: performing category recognition on a document picture to obtain a target category result; determining a target structured model that matches the target category result; and performing, by using the target structured model, structure recognition on the document picture to obtain a structure recognition result, so as to generate an electronic document based on the structure recognition result. The structure recognition result includes a field attribute recognition result and a field position recognition result.
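
The three steps of the method (classify, select a matching structured model, run structure recognition) amount to a dispatch on the category result. The sketch below is only an illustration of that flow: the `Field` type, the `generate_document` function, and the dictionary-based model registry are hypothetical, and the classifier and models are passed in as plain callables.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Field:
    attribute: str                   # field attribute recognition result
    box: Tuple[int, int, int, int]   # field position recognition result (x0, y0, x1, y1)
    text: str


# A "structured model" is modeled here as any callable from picture bytes to fields.
StructuredModel = Callable[[bytes], List[Field]]


def generate_document(
    picture: bytes,
    classify: Callable[[bytes], str],
    models: Dict[str, StructuredModel],
) -> dict:
    """Classify the picture, pick the matching model, and build a simple document."""
    category = classify(picture)              # step 1: category recognition
    if category not in models:                # step 2: match a structured model
        raise ValueError(f"no structured model for category {category!r}")
    fields = models[category](picture)        # step 3: structure recognition
    return {
        "category": category,
        "fields": [
            {"attribute": f.attribute, "box": f.box, "text": f.text} for f in fields
        ],
    }
```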

PRINTING SYSTEM, IMAGE PROCESSING APPARATUS, AND COMPARISON METHOD
20230049493 · 2023-02-16

A printing system includes processing circuitry. The processing circuitry acquires print data of a plurality of pages and extracts comparison data from the print data for each page. For each page of a printed material on which the print data is printed, the processing circuitry acquires, from first image data read from the printed material, second image data at a position corresponding to the comparison data. The processing circuitry outputs, for each page, a comparison result of the comparison data and the read image data.
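
As a rough illustration of the per-page comparison, the sketch below crops the scanned page at the position of the comparison data and checks it against the expected pixels. Images are represented as nested lists, the comparison is exact equality, and the names `crop` and `compare_pages` are hypothetical. The abstract does not specify the image representation or the comparison criterion.

```python
from typing import List, Tuple

Image = List[List[int]]                       # rows of pixel values
Comparison = Tuple[Tuple[int, int], Image]    # ((top, left), expected pixels)


def crop(image: Image, top: int, left: int, height: int, width: int) -> Image:
    """Return the height x width region of `image` starting at (top, left)."""
    return [row[left:left + width] for row in image[top:top + height]]


def compare_pages(comparison_data: List[Comparison], scanned_pages: List[Image]) -> List[bool]:
    """Return one boolean per page: True when the scanned region matches."""
    results = []
    for ((top, left), expected), scan in zip(comparison_data, scanned_pages):
        height, width = len(expected), len(expected[0])
        # "Second image data": the part of the scan at the comparison position.
        actual = crop(scan, top, left, height, width)
        results.append(actual == expected)
    return results
```

A real scanner pipeline would likely need a tolerance-based comparison rather than exact equality, since scanned pixels rarely match print data bit for bit.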

CONTINUOUS MACHINE LEARNING METHOD AND SYSTEM FOR INFORMATION EXTRACTION

Methods and systems for artificial intelligence (AI)-assisted document annotation and training of machine learning-based models for document data extraction are described. The methods and systems described herein take advantage of a continuous machine learning approach to create document processing pipelines that provide accurate and efficient data extraction from documents that include structured text, semi-structured text, unstructured text, or any combination thereof.

Representative document hierarchy generation

In some aspects, a method includes performing optical character recognition (OCR) based on data corresponding to a document to generate text data, detecting one or more bounded regions from the data based on a predetermined boundary rule set, and matching one or more portions of the text data to the one or more bounded regions to generate matched text data. Each bounded region of the one or more bounded regions encloses a corresponding block of text. The method also includes extracting features from the matched text data to generate a plurality of feature vectors and providing the plurality of feature vectors to a trained machine-learning classifier to generate one or more labels associated with the one or more bounded regions. The method further includes outputting metadata indicating a hierarchical layout associated with the document based on the one or more labels and the matched text data.
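
The matching and feature-extraction steps can be sketched as follows. This is an illustration under stated assumptions: OCR output is taken to be a list of words with bounding boxes, a word is matched to the first region containing its center, and the three features (word count, uppercase ratio, region height) are placeholders chosen for the example. The abstract does not specify the matching rule or the features, and the trained classifier is omitted.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1)


def match_text_to_regions(words: List[Tuple[str, Box]], regions: List[Box]) -> Dict[int, List[str]]:
    """Assign each OCR word to the first bounded region containing its center."""
    matched: Dict[int, List[str]] = {i: [] for i in range(len(regions))}
    for text, (x0, y0, x1, y1) in words:
        cx, cy = (x0 + x1) / 2, (y0 + y1) / 2
        for i, (rx0, ry0, rx1, ry1) in enumerate(regions):
            if rx0 <= cx <= rx1 and ry0 <= cy <= ry1:
                matched[i].append(text)
                break  # a word belongs to at most one region
    return matched


def feature_vector(texts: List[str], region: Box) -> List[float]:
    """Build a toy feature vector: word count, uppercase ratio, region height."""
    letters = [c for c in " ".join(texts) if c.isalpha()]
    upper_ratio = sum(c.isupper() for c in letters) / len(letters) if letters else 0.0
    return [float(len(texts)), upper_ratio, region[3] - region[1]]
```

Feature vectors like these would then be passed to the trained classifier to obtain labels such as heading or body text for each region.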

METHOD OF DETECTING, SEGMENTING AND EXTRACTING SALIENT REGIONS IN DOCUMENTS USING ATTENTION TRACKING SENSORS

A method and system for detecting, segmenting, and extracting salient regions in documents by using attention tracking sensors are provided. The method includes: receiving an image that corresponds to a document; receiving, from a sensor, a sequence of measurements that correspond to a human reading of the document; determining, based on the sequence of measurements, at least one region of the document as being a salient document region; demarcating the salient document region in an electronically displayable manner; and outputting a file that includes a displayable version of the document with the demarcated document region. The salient document region may include a title, a section header, and/or a table. The sensor may be an eye-tracking sensor that detects a sequence of eye-gaze positions on the document as a function of time.
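
One simple way to turn a gaze sequence into salient regions is to accumulate dwell time per candidate region and apply a threshold. The sketch below does that. The dwell-time criterion, the predefined candidate regions, the `min_dwell` parameter, and the function name `salient_regions` are all assumptions for illustration, since the abstract does not state how saliency is determined.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]     # (x0, y0, x1, y1)
GazeSample = Tuple[float, float, float]     # (time in seconds, x, y)


def salient_regions(
    gaze: List[GazeSample],
    regions: Dict[str, Box],
    min_dwell: float = 0.5,
) -> List[str]:
    """Return names of regions whose accumulated gaze dwell time reaches min_dwell."""
    dwell = {name: 0.0 for name in regions}
    # Each sample is credited with the time until the next sample arrives.
    for (t0, x, y), (t1, _, _) in zip(gaze, gaze[1:]):
        for name, (x0, y0, x1, y1) in regions.items():
            if x0 <= x <= x1 and y0 <= y <= y1:
                dwell[name] += t1 - t0
                break
    return [name for name, d in dwell.items() if d >= min_dwell]
```

The returned region names could then drive the demarcation step, for example by drawing the corresponding boxes onto the displayable version of the document.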