G06V30/412

TECHNIQUES FOR DOCUMENT CREATION BASED ON IMAGE SECTIONS
20230222282 · 2023-07-13

In an embodiment, an image reception system is communicatively coupled to an image analysis system and is configured to receive a digital image and analyze the pixels of the digital image to determine one or more regions in the digital image. For each region in the one or more regions in the digital image, the image analysis system recognizes the content in the region. A document creation system communicatively coupled to the image analysis system is configured to create a digital document based on the recognized content for the one or more regions. In some embodiments, the image analysis system is further configured to analyze the digital image to detect one or more of the following: region markers, tables, headers.

Intelligent recognition and extraction of numerical data from non-numerical graphical representations

Embodiments of the invention are directed to systems, methods, and computer program products for a unique platform for analyzing, classifying, extracting, and processing information from graphical representations. Embodiments of the invention are configured to provide an end-to-end automated solution for extracting data from graphical representations and creating a centralized database for providing graphical attributes, image skeletons, and other metadata information integrated with a graphical representation classification training layer. The invention is designed to receive a graphical representation for analysis, intelligently identify and extract objects and data in the graphical representation, and store the data attributes of the graphical representation in an accessible format in an automated fashion.

METHODS AND SYSTEMS FOR DETERMINING AUTHENTICITY OF A DOCUMENT
20230222826 · 2023-07-13

A method for determining authenticity of a document is provided that includes receiving, by an electronic device, an image of a document, assigning a label to the image, and obtaining vectors for each image in a subset of images. Each image is of a document and is assigned the same label as the received image. Moreover, the method includes encoding the received image into a vector, calculating a distance between the vector of the received image and each obtained vector, comparing each of the calculated distances against a threshold distance, and calculating a number of the calculated distances that are less than or equal to the threshold distance. In response to determining the calculated number is at least equal to a required number, the document in the received image is determined to be authentic. Otherwise, the received image requires manual review.
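
The decision rule in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not name a distance metric or an encoder, so Euclidean distance and pre-computed vectors are assumptions here, and the names `is_authentic`, `threshold`, and `required` are hypothetical.

```python
import math
from typing import Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def is_authentic(
    query_vec: Sequence[float],
    reference_vecs: Sequence[Sequence[float]],
    threshold: float,
    required: int,
) -> bool:
    """Return True when enough same-label reference vectors lie within the threshold.

    A False result corresponds to the "requires manual review" branch.
    """
    # Count the distances that are less than or equal to the threshold distance.
    close = sum(1 for ref in reference_vecs if euclidean(query_vec, ref) <= threshold)
    # Authentic when the count is at least equal to the required number.
    return close >= required


# Hypothetical vectors for three reference images sharing the query's label.
references = [[0.0, 1.0], [3.0, 4.0], [0.5, 0.0]]
print(is_authentic([0.0, 0.0], references, threshold=1.0, required=2))
```

In this example two of the three reference vectors fall within the threshold, so the call returns `True`; raising `required` to 3 would route the image to manual review.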

METHODS, SYSTEMS, ARTICLES OF MANUFACTURE AND APPARATUS TO LABEL TEXT ON IMAGES

Methods, systems, articles of manufacture and apparatus are disclosed to label text on images. An example apparatus includes colorizer circuitry to apply color to text boxes corresponding to optical character recognition (OCR) data associated with an image, OCR manager circuitry to render an OCR text prompt associated with the OCR data, the OCR text prompt to be rendered proximate to respective ones of the text boxes, the OCR text prompt to display a text portion of the OCR data, and edit circuitry to (a) render an interface in response to selection of the OCR text prompt, the interface populated with the text portion of the OCR data, and (b) in response to an overwrite input to the interface, update the text portion of the OCR data in a memory corresponding to the image.

Image analysis based document processing for inference of key-value pairs in non-fixed digital documents

An online system extracts information from non-fixed form documents. The online system receives an image of a form document and obtains a set of phrases and locations of the set of phrases on the form image. For at least one field, the online system determines key scores for the set of phrases. The online system identifies a set of candidate values for the field from the set of identified phrases and identifies a set of neighbors for each candidate value from the set of identified phrases. The online system determines neighbor scores, where a neighbor score for a candidate value and a respective neighbor is determined based on the key score for the neighbor and a spatial relationship of the neighbor to the candidate value. The online system selects a candidate value and a respective neighbor based on the neighbor score as the value and key for the field.
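
The neighbor scoring step can be sketched as follows. This is an illustration under stated assumptions, not the method claimed in the patent: the abstract says only that a neighbor score depends on the neighbor's key score and its spatial relationship to the candidate, so the multiplicative combination, the inverse-distance weight, and all names (`Phrase`, `spatial_weight`, `select_key_value`) are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Phrase:
    text: str
    x: float  # horizontal position on the form image
    y: float  # vertical position on the form image


def spatial_weight(neighbor: Phrase, candidate: Phrase) -> float:
    """Assumed spatial term: decays with the distance between the two phrases."""
    distance = math.hypot(candidate.x - neighbor.x, candidate.y - neighbor.y)
    return 1.0 / (1.0 + distance)


def select_key_value(
    candidates: List[Phrase],
    phrases: List[Phrase],
    key_scores: Dict[str, float],
) -> Optional[Tuple[Phrase, Phrase, float]]:
    """Return (key, value, score) for the best-scoring neighbor/candidate pair."""
    best: Optional[Tuple[Phrase, Phrase, float]] = None
    for cand in candidates:
        for nb in phrases:
            if nb is cand:
                continue  # a phrase is not its own neighbor
            # Assumed combination: key score scaled by the spatial weight.
            score = key_scores.get(nb.text, 0.0) * spatial_weight(nb, cand)
            if best is None or score > best[2]:
                best = (nb, cand, score)
    return best


# Hypothetical phrases for a "total" field; the key scores are made up.
phrases = [
    Phrase("Total", 0.0, 0.0),
    Phrase("$42.00", 1.0, 0.0),
    Phrase("Date", 0.0, 5.0),
    Phrase("2023-07-13", 1.0, 5.0),
]
key_scores = {"Total": 0.9, "Date": 0.1}
candidates = [phrases[1], phrases[3]]
key, value, score = select_key_value(candidates, phrases, key_scores)
print(key.text, value.text, round(score, 2))
```

Here the phrase "Total" sits right next to "$42.00" and has the highest key score, so that pair is selected as the key and value for the field.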

Information processing apparatus and non-transitory computer readable medium

An information processing apparatus includes a processor configured to: acquire an image corresponding to a key character string from a target image in response to the key character string that serves as a character string specified beforehand as a key and is acquired from results of character recognition performed on the target image including character strings; extract, by using results of acquiring the image corresponding to the key character string, from the results of the character recognition a value character string that serves as a character string indicating a value corresponding to the key character string; and output the key character string and the value character string corresponding to the key character string.

HANDWRITING RECOGNITION PIPELINES FOR GENEALOGICAL RECORDS

Disclosed herein are example embodiments for recognizing handwritten information in a genealogical record. A computing server may receive a genealogical record. The genealogical record may take the form of an image of a physical form having a structured layout, fields, and handwritten information. The computing server may divide the genealogical record into a plurality of areas based on the structured layout. The computing server may identify, for a particular area, a type of field that is included within the particular area. The computing server may select a handwriting recognition model for identifying the handwritten information in the particular area. The handwriting recognition model may be selected based on the type of the field. The computing server may input an image of the particular area to the handwriting recognition model to generate text of the handwritten information. The computing server may store the text of the handwritten information.
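
The per-field model selection can be sketched as a simple dispatch table. This is a hedged illustration only: the recognizers below are stand-in stubs rather than real handwriting models, the field type names are invented, and the fallback to a default model is an assumption that the abstract does not mention.

```python
from typing import Callable, Dict, Tuple

# A recognizer takes the image bytes of one area and returns the transcribed text.
Recognizer = Callable[[bytes], str]


def recognize_name(area_image: bytes) -> str:
    """Stub standing in for a model specialized on handwritten names."""
    return f"name-model:{len(area_image)} bytes"


def recognize_date(area_image: bytes) -> str:
    """Stub standing in for a model specialized on handwritten dates."""
    return f"date-model:{len(area_image)} bytes"


def recognize_generic(area_image: bytes) -> str:
    """Stub for a general-purpose fallback model (an assumption of this sketch)."""
    return f"generic-model:{len(area_image)} bytes"


# Hypothetical mapping from field type to recognition model.
MODELS: Dict[str, Recognizer] = {"name": recognize_name, "date": recognize_date}


def transcribe(
    areas: Dict[str, Tuple[str, bytes]],
    models: Dict[str, Recognizer],
    default: Recognizer,
) -> Dict[str, str]:
    """Pick a model per area based on its field type and store the generated text."""
    results: Dict[str, str] = {}
    for area_id, (field_type, area_image) in areas.items():
        model = models.get(field_type, default)  # selection based on field type
        results[area_id] = model(area_image)
    return results


# Each area carries the field type identified for it and its cropped image bytes.
areas = {
    "row1_col1": ("name", b"\x00" * 10),
    "row1_col2": ("date", b"\x00" * 4),
    "row1_col3": ("occupation", b"\x00" * 7),
}
print(transcribe(areas, MODELS, recognize_generic))
```

Keeping the mapping in a plain dictionary makes it easy to add a specialized model for a new field type without changing the dispatch loop.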
