G06V30/1914

Synthetic augmentation of document images

A computerized method and system for adding distortions to a computer-generated image of a document stored in an image file. An original computer-generated image file is selected and is processed to generate one or more distorted image files for each original computer-generated image file by selecting one or more augmentation modules from a set of augmentation modules to form an augmentation sub-system. The original computer-generated image file is processed with the augmentation sub-system to generate an augmented image file by altering the original computer-generated image file to add distortions that simulate distortions introduced during scanning of a paper-based representation of a document represented in the original computer-generated image file.
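
The selection-and-apply flow described in this abstract can be illustrated with a minimal Python sketch. The three distortion modules (`add_salt_pepper`, `shift_rows`, `darken`), the nested-list image type, and every parameter value are assumptions made for illustration; the abstract does not name specific distortions.

```python
import random
from typing import Callable, List, Sequence

Image = List[List[int]]  # grayscale pixels, 0 (black) to 255 (white)
Augmentation = Callable[[Image, random.Random], Image]


def add_salt_pepper(img: Image, rng: random.Random, rate: float = 0.05) -> Image:
    """Flip a fraction of pixels to pure black or white, like scanner speckle."""
    return [
        [rng.choice((0, 255)) if rng.random() < rate else px for px in row]
        for row in img
    ]


def shift_rows(img: Image, rng: random.Random, max_shift: int = 2) -> Image:
    """Shift each row sideways by a small random offset, padding with white."""
    out = []
    for row in img:
        s = rng.randint(-max_shift, max_shift)
        if s > 0:
            row = [255] * s + row[:-s]
        elif s < 0:
            row = row[-s:] + [255] * (-s)
        out.append(list(row))
    return out


def darken(img: Image, rng: random.Random) -> Image:
    """Scale brightness down by a random factor, like uneven exposure."""
    factor = rng.uniform(0.7, 0.95)
    return [[int(px * factor) for px in row] for row in img]


# Hypothetical set of available augmentation modules.
MODULES: Sequence[Augmentation] = (add_salt_pepper, shift_rows, darken)


def build_subsystem(rng: random.Random, k: int) -> List[Augmentation]:
    """Select k modules from the set to form an augmentation sub-system."""
    return rng.sample(list(MODULES), k)


def augment(img: Image, subsystem: Sequence[Augmentation], rng: random.Random) -> Image:
    """Apply each selected module in turn to a copy of the original image."""
    out = [list(row) for row in img]
    for module in subsystem:
        out = module(out, rng)
    return out


rng = random.Random(0)
original = [[255] * 8 for _ in range(4)]  # blank stand-in for a rendered page
# Several distorted variants are generated from one original.
distorted = [augment(original, build_subsystem(rng, 2), rng) for _ in range(3)]
```

Each call to `build_subsystem` draws a different subset of modules, so one original can yield several differently distorted variants while the original itself stays unchanged.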

Book scanning using machine-trained model
10991081 · 2021-04-27

This application discloses a technology for flattening a photographed page of a book and straightening the text therein. The technology uses one or more mathematical models to represent a curved shape of the photographed page with certain parameters. The technology also uses one or more photographic image processing techniques to dewarp the photographed page using the parameters of the curved shape. The technology uses one or more additional parameters that represent certain features of the photographed page to dewarp the photographed page.
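
As a rough illustration of parameter-driven dewarping, the sketch below assumes the page curvature is modeled as a quadratic vertical displacement, `shift(x) = a * (x - center)**2`. This model, the `dewarp` function, and its parameters are assumptions chosen for simplicity; the abstract does not state which mathematical model or parameters are used.

```python
from typing import List

Image = List[List[int]]  # grayscale pixels, 0 (black) to 255 (white)


def dewarp(img: Image, a: float, center: int, fill: int = 255) -> Image:
    """Undo an assumed quadratic sag by moving each column back up."""
    height, width = len(img), len(img[0])
    out = [[fill] * width for _ in range(height)]
    for x in range(width):
        # Assumed curve model: columns far from the center sag further down.
        shift = round(a * (x - center) ** 2)
        for y in range(height):
            src = y + shift
            if 0 <= src < height:
                out[y][x] = img[src][x]
    return out


# Build a toy warped page: one text line that sags away from column 2.
a, center, height, width = 1, 2, 6, 5
warped = [
    [0 if y == 1 + a * (x - center) ** 2 else 255 for x in range(width)]
    for y in range(height)
]
flat = dewarp(warped, a, center)  # the text line ends up straight on row 1
```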

Predictive Data Analysis Using Image Representations of Categorical and Scalar Feature Data
20210133455 · 2021-05-06

There is a need for more effective and efficient predictive data analysis solutions and/or more effective and efficient solutions for generating image representations of categorical/scalar data. Various embodiments of the present invention address one or more of the noted technical challenges. In one example, a method comprises receiving one or more categorical input features; generating an image representation of the one or more categorical input features, wherein the image representation comprises image region values each associated with a categorical input feature, and further wherein each image region value of the one or more image region values is determined based at least in part on the corresponding categorical input feature associated with the image region value; and processing the image representation using an image-based machine learning model to generate image-based predictions.
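
One way to picture the image-representation step is the sketch below, which assumes each categorical feature is rendered as a square region whose intensity encodes the index of its category. The function `encode_features`, the vocabulary layout, and the intensity mapping are illustrative assumptions. The downstream image-based model is out of scope here.

```python
from typing import Dict, List, Sequence


def encode_features(
    features: Dict[str, str],
    vocab: Dict[str, Sequence[str]],
    region: int = 2,
) -> List[List[int]]:
    """Render each feature as a region x region block of a single intensity."""
    rows: List[List[int]] = [[] for _ in range(region)]
    for name in sorted(vocab):  # fixed feature order keeps the layout stable
        categories = vocab[name]
        idx = categories.index(features[name])
        # Assumed mapping: spread category indices evenly over 1..255.
        value = round(255 * (idx + 1) / len(categories))
        for row in rows:
            row.extend([value] * region)
    return rows


vocab = {"color": ["red", "green", "blue"], "size": ["S", "L"]}
image = encode_features({"color": "green", "size": "L"}, vocab)
# image == [[170, 170, 255, 255], [170, 170, 255, 255]]
```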

ELECTRONIC DOCUMENT DATA EXTRACTION
20210133498 · 2021-05-06

Methods, systems, and computer storage media are provided for data extraction. A target document representation may be generated based on modified text of a target electronic document. A measure of similarity may be determined between the target document representation and a reference document representation, which may be based on modified text of a reference electronic document. Based on the measure of similarity, the reference document representation may be selected. An extraction model associated with the selected reference document representation can then be used to extract data from the target document.
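
The select-by-similarity step can be sketched as follows. The sketch assumes that "modified text" means lowercased text with digit runs masked, that a representation is a bag of words, and that similarity is cosine similarity. None of these choices is stated in the abstract, and the names `represent`, `cosine`, and `select_reference` are hypothetical.

```python
import math
import re
from collections import Counter
from typing import Dict


def represent(text: str) -> Counter:
    """Bag-of-words representation built from modified (masked) text."""
    modified = re.sub(r"\d+", "<num>", text.lower())
    return Counter(re.findall(r"<num>|[a-z]+", modified))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def select_reference(target_text: str, references: Dict[str, str]) -> str:
    """Return the name of the reference document most similar to the target."""
    target = represent(target_text)
    return max(references, key=lambda name: cosine(target, represent(references[name])))


references = {
    "invoice": "Invoice number 1001 total due 250",
    "receipt": "Receipt for payment received on 2020",
}
best = select_reference("Invoice number 7345 total due 980", references)
# The extraction model tied to `best` would then be applied to the target.
```

Masking the digits lets two invoices with different numbers map to the same representation, so the layout vocabulary drives the match rather than the variable values.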

Image processing device for displaying object detected from input picture image
10930037 · 2021-02-23

An image processing device including an object detection unit for detecting one or more images of objects from an input picture image, on the basis of a model pattern of the object, and a detection result display unit for graphically superimposing and displaying a detection result. The detection result display unit includes a first frame for displaying the entire input picture image and a second frame for listing and displaying one or more partial picture images each including an image detected. In the input picture image displayed in the first frame, a detection result is superimposed and displayed on all the detected images, and in the partial picture image displayed in the second frame, a detection result of an image corresponding to each partial picture image is superimposed and displayed.

Generating variations of a known shred
10963685 · 2021-03-30

Introduced here is a machine-learning technique for supplying an observed model with additional training data based on previously received training data. Determining the textual content of a character string from a digital image that includes a handwritten version of the character string requires a substantial amount of training data. The character string can include one or more characters, and the characters can include any of letters, numerals, punctuation marks, symbols, spaces, etc. Disclosed herein is a technique to determine variations between different images of matching known character strings and substitute those variations into the images in order to create more images with the same known character string.

IMAGE PROCESSING SYSTEM, IMAGE PROCESSING METHOD, AND STORAGE MEDIUM
20210064859 · 2021-03-04

According to the present disclosure, a handwriting image and a background image are combined, thereby generating a combined image, a correct answer label image is generated based on the handwriting image, and the generated combined image and the generated correct answer label image are used as learning data for training a neural network.
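
A minimal sketch of producing one training pair is shown below. It assumes grayscale images of equal size, compositing by keeping the darker pixel, and a label that marks ink pixels by thresholding the handwriting image. The function `make_training_pair` and the threshold value are illustrative assumptions, not details from the disclosure.

```python
from typing import List, Tuple

Image = List[List[int]]  # grayscale pixels, 0 (black) to 255 (white)


def make_training_pair(
    handwriting: Image, background: Image, ink_threshold: int = 128
) -> Tuple[Image, Image]:
    """Return (combined image, correct answer label image) for one sample."""
    # Assumed compositing rule: the darker of the two pixels wins.
    combined = [
        [min(h, b) for h, b in zip(h_row, b_row)]
        for h_row, b_row in zip(handwriting, background)
    ]
    # The label comes from the handwriting alone: 1 marks ink, 0 marks paper.
    label = [[1 if h < ink_threshold else 0 for h in h_row] for h_row in handwriting]
    return combined, label


handwriting = [[255, 0], [0, 255]]
background = [[200, 200], [100, 100]]
combined, label = make_training_pair(handwriting, background)
# combined == [[200, 0], [0, 100]], label == [[0, 1], [1, 0]]
```

Because the label is derived before the background is mixed in, it stays exact even where the background is dark enough to be mistaken for ink.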

Method and apparatus for continuously displaying images on basis of similarity of images

Various embodiments of the present disclosure may store instructions to perform image recognition for a plurality of images, calculate similarity between the plurality of images, based at least partially on a result of the image recognition, create a group including at least two images of the plurality of images, based at least partially on the calculated similarity, determine a sequence for displaying the at least two images included in the group, based at least partially on similarity between the at least two images included in the group, and output the at least two images onto the display in the sequence. In addition, various other embodiments are possible.
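
The grouping and sequencing steps might look like the sketch below. It assumes image recognition yields a set of tags per image, that similarity is the Jaccard index of those tag sets, and that the display order is a greedy chain of nearest neighbors. These choices and the names `jaccard`, `build_group`, and `display_sequence` are assumptions for illustration.

```python
from typing import Dict, List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Similarity of two tag sets: shared tags over all tags."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_group(seed: str, tags: Dict[str, Set[str]], threshold: float = 0.3) -> List[str]:
    """Collect the seed and every image similar enough to it."""
    return [
        name
        for name in tags
        if name == seed or jaccard(tags[seed], tags[name]) >= threshold
    ]


def display_sequence(group: List[str], tags: Dict[str, Set[str]]) -> List[str]:
    """Order a group so each image is followed by its most similar remaining one."""
    if not group:
        return []
    remaining = list(group)
    sequence = [remaining.pop(0)]
    while remaining:
        last = sequence[-1]
        nxt = max(remaining, key=lambda n: jaccard(tags[last], tags[n]))
        remaining.remove(nxt)
        sequence.append(nxt)
    return sequence


# Hypothetical recognition results for four images.
tags = {
    "a": {"beach", "sea", "sun"},
    "b": {"beach", "sea", "dog"},
    "c": {"city", "night"},
    "d": {"beach", "sun", "sea", "boat"},
}
group = build_group("a", tags)           # ["a", "b", "d"]
order = display_sequence(group, tags)    # ["a", "d", "b"]
```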

SYSTEMS AND METHODS FOR INFORMATION EXTRACTION FROM TEXT DOCUMENTS WITH SPATIAL CONTEXT

Performing information extraction from an electronic document is disclosed. A method comprises: receiving a semi-structured input document; retrieving an entity model that provides one or more domain variable definitions for one or more domain variables, wherein the entity model and the input document correspond to a common domain; determining that the input document includes an entity that satisfies a first domain variable definition corresponding to a first domain variable; retrieving a relational model that provides, for the first domain variable, one or more relational definitions comprising spatial restrictions for one or more values corresponding to the first domain variable; extracting one or more data elements from the input document that satisfy the one or more relational definitions; and generating an information graph having a structured data format, wherein the one or more data elements extracted from the input document correspond to the first domain variable in the structured data format.
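
The interplay of the entity model and the relational model can be sketched as follows. The sketch assumes tokens carry grid coordinates, that a domain variable definition is a regular expression matching a label token, and that a spatial restriction is either "right of the label on the same row" or "below the label in the same column". The output is simplified to a flat dictionary rather than a full information graph. All names and patterns here are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Token:
    text: str
    x: int  # column position
    y: int  # row position


# Assumed entity model: domain variable -> pattern that identifies its label.
ENTITY_MODEL = {
    "invoice_total": re.compile(r"^total:?$", re.I),
    "invoice_date": re.compile(r"^date:?$", re.I),
}

# Assumed relational model: where the value sits relative to the label.
RELATIONAL_MODEL = {
    "invoice_total": {"direction": "right", "pattern": re.compile(r"^\d+\.\d{2}$")},
    "invoice_date": {"direction": "below", "pattern": re.compile(r"^\d{4}-\d{2}-\d{2}$")},
}


def satisfies(label: Token, cand: Token, direction: str) -> bool:
    """Check the spatial restriction between a label and a candidate value."""
    if direction == "right":
        return cand.y == label.y and cand.x > label.x
    if direction == "below":
        return cand.x == label.x and cand.y > label.y
    return False


def extract(tokens: List[Token]) -> Dict[str, str]:
    """Map each domain variable found in the document to its nearest valid value."""
    result: Dict[str, str] = {}
    for var, label_re in ENTITY_MODEL.items():
        label = next((t for t in tokens if label_re.match(t.text)), None)
        if label is None:
            continue
        rule = RELATIONAL_MODEL[var]
        candidates = [
            t
            for t in tokens
            if satisfies(label, t, rule["direction"]) and rule["pattern"].match(t.text)
        ]
        if candidates:
            nearest = min(candidates, key=lambda t: abs(t.x - label.x) + abs(t.y - label.y))
            result[var] = nearest.text
    return result


tokens = [
    Token("Date", 0, 0), Token("Total:", 10, 0), Token("99.50", 20, 0),
    Token("2021-03-04", 0, 1), Token("12.00", 20, 1),
]
extracted = extract(tokens)
```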

Image creation and assessment method and system
10867562 · 2020-12-15

A software-based method and system enables the creation and automatic assessment of an image such as a free-body diagram by adding pre-made or user-created components to create the image. The input to the software may be a vectorized entity representation of constituent elements of a user's submitted image, including element properties. A pre-authored correct solution and a grading rubric may be used to compute a grade to be returned to the user as output. The assigned grade can be proportional to the severity of any errors in accordance with the grading rubric. The method and system may also generate feedback to the user indicating where and why grade points were lost.
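
A rubric-driven grading pass over a vectorized free-body diagram could be sketched as below. The `Force` element type, the penalty values in `RUBRIC`, and the angle tolerance are assumptions for illustration; the abstract does not specify the rubric format.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Force:
    name: str
    angle_deg: float  # direction of the force arrow


# Hypothetical rubric: points lost per error type, scaled by severity.
RUBRIC = {"missing": 30, "wrong_direction": 15, "extra": 10}


def grade(
    submitted: List[Force], solution: List[Force], tolerance: float = 5.0
) -> Tuple[int, List[str]]:
    """Compare a submission to the authored solution; return (grade, feedback)."""
    score, feedback = 100, []
    by_name = {f.name: f for f in submitted}
    expected = {f.name for f in solution}
    for want in solution:
        got = by_name.get(want.name)
        if got is None:
            score -= RUBRIC["missing"]
            feedback.append(f"Missing force '{want.name}' (-{RUBRIC['missing']}).")
            continue
        # Smallest angular difference, handling wrap-around at 360 degrees.
        diff = abs((got.angle_deg - want.angle_deg + 180) % 360 - 180)
        if diff > tolerance:
            score -= RUBRIC["wrong_direction"]
            feedback.append(
                f"Force '{want.name}' is off by {diff:.0f} degrees "
                f"(-{RUBRIC['wrong_direction']})."
            )
    for f in submitted:
        if f.name not in expected:
            score -= RUBRIC["extra"]
            feedback.append(f"Unexpected force '{f.name}' (-{RUBRIC['extra']}).")
    return max(score, 0), feedback


solution = [Force("weight", 270), Force("normal", 90), Force("friction", 180)]
submitted = [Force("weight", 270), Force("normal", 80), Force("thrust", 0)]
score, feedback = grade(submitted, solution)  # score is 45 with three feedback lines
```

Each deduction is paired with a feedback line, so the user can see where and why points were lost.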