G06V30/414

METHOD AND SYSTEM FOR CLASSIFYING DOCUMENT IMAGES

A method and system are used for managing and classifying electronic document images. Each of the electronic document images is divided into an array of image segments. The method extracts image features from each of the image segments to obtain numerical coefficients for each of the image segments. The numerical coefficients are compared with each other to generate sub-codes. A classification code is determined as a combination of the sub-codes. The classification codes of a plurality of electronic document images can be stored in a database for further analysis. Based on the classification codes, similarity rates between at least two document images can be determined.
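
The segment, compare, and combine steps described above can be sketched as follows. This is only an illustrative sketch: the abstract does not say which image features are extracted or how coefficients are compared, so the mean intensity per segment, the neighbour-to-neighbour comparison, and the names `segment_means`, `classification_code`, and `similarity_rate` are all assumptions.

```python
from typing import List, Sequence

Image = Sequence[Sequence[float]]  # grayscale pixel rows (assumed input format)


def segment_means(image: Image, rows: int, cols: int) -> List[float]:
    """Split the image into a rows x cols grid and return each cell's mean.

    Assumes the image has at least `rows` pixel rows and `cols` pixel columns,
    so that no grid cell is empty.
    """
    height, width = len(image), len(image[0])
    means = []
    for r in range(rows):
        y0, y1 = r * height // rows, (r + 1) * height // rows
        for c in range(cols):
            x0, x1 = c * width // cols, (c + 1) * width // cols
            cell = [image[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            means.append(sum(cell) / len(cell))
    return means


def classification_code(image: Image, rows: int = 2, cols: int = 2) -> str:
    """Compare consecutive coefficients to get one-bit sub-codes, then join them."""
    coeffs = segment_means(image, rows, cols)
    sub_codes = ["1" if a > b else "0" for a, b in zip(coeffs, coeffs[1:])]
    return "".join(sub_codes)


def similarity_rate(code_a: str, code_b: str) -> float:
    """Fraction of positions at which two equal-length codes agree."""
    if len(code_a) != len(code_b):
        raise ValueError("codes must have the same length")
    if not code_a:
        return 1.0
    return sum(a == b for a, b in zip(code_a, code_b)) / len(code_a)


page_a = [[0, 0, 9, 9], [0, 0, 9, 9], [5, 5, 1, 1], [5, 5, 1, 1]]
page_b = [[0, 0, 9, 9], [0, 0, 9, 9], [5, 5, 7, 7], [5, 5, 7, 7]]
code_a = classification_code(page_a)
code_b = classification_code(page_b)
rate = similarity_rate(code_a, code_b)
```

Because each code is a short string, codes for many documents could be stored in an ordinary database column and compared without reopening the images.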

Collision avoidance for document field placement

Users of a database management engine may generate fillable digital documents by mapping interface elements onto form documents. When a user maps interface elements onto a form document, the user may accidentally overlap two or more interface elements. To rectify this, the database management engine may modify the position of one of the interface elements based on a set of positioning rules. In addition, the database management engine may identify and suggest mappings to users based on similar documents that have been previously mapped. The database management engine identifies similar documents using information about the document, the user, and the mapping itself. The mapping associated with the most similar document may be provided to the user as a suggested mapping. The database management engine converts the form document and finalized mapping into a fillable digital document. The fillable digital document is sent to recipients, who complete the fillable digital document.
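
The overlap correction described above can be sketched as follows. The abstract does not list the actual positioning rules, so the two rules used here (shift the moved element to the right of the fixed one, or place it below when it would run off the page) are assumptions, as are the names `Field`, `overlaps`, and `resolve_collision`.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Field:
    """Axis-aligned rectangle for one interface element (hypothetical model)."""
    x: float
    y: float
    width: float
    height: float


def overlaps(a: Field, b: Field) -> bool:
    """True when the two rectangles share any interior area."""
    return (a.x < b.x + b.width and b.x < a.x + a.width and
            a.y < b.y + b.height and b.y < a.y + a.height)


def resolve_collision(fixed: Field, moved: Field,
                      page_width: float, margin: float = 4.0) -> Field:
    """Return a new position for `moved` that no longer overlaps `fixed`.

    Assumed rule 1: shift right of the fixed element if it still fits on the page.
    Assumed rule 2: otherwise place it directly below the fixed element.
    Page height is not checked in this sketch.
    """
    if not overlaps(fixed, moved):
        return moved
    right_x = fixed.x + fixed.width + margin
    if right_x + moved.width <= page_width:
        return replace(moved, x=right_x)
    return replace(moved, y=fixed.y + fixed.height + margin)


signature = Field(x=10, y=10, width=100, height=20)
date_box = Field(x=50, y=15, width=80, height=20)
placed_wide = resolve_collision(signature, date_box, page_width=300)
placed_narrow = resolve_collision(signature, date_box, page_width=150)
```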

Analyzing documents using machine learning

A document analysis device includes a memory operable to store a machine learning model configured to receive a sentence as an input and to output a classification identifier that is associated with a sentence type for the received sentence. The device further includes an artificial intelligence (AI) processing engine configured to receive a document comprising text, to identify sentences within the document, and to classify the sentences using the machine learning model. The AI processing engine is further configured to identify tagging rules for the document and to annotate one or more sentences from the document with a sentence type that matches a sentence type that is identified by the tagging rules for the document.
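
The classify-then-annotate flow described above can be sketched as follows. The keyword-based `keyword_classifier` is only a stand-in for the trained machine learning model, and the sentence types ("obligation", "permission", "other"), the regex sentence splitter, and the representation of tagging rules as a set of sentence types are all assumptions made for illustration.

```python
import re
from typing import Callable, List, Set, Tuple


def split_sentences(text: str) -> List[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def keyword_classifier(sentence: str) -> str:
    """Stand-in for the trained model: maps a sentence to a sentence type."""
    lowered = sentence.lower()
    if re.search(r"\b(shall|must)\b", lowered):
        return "obligation"
    if re.search(r"\bmay\b", lowered):
        return "permission"
    return "other"


def annotate(text: str,
             classify: Callable[[str], str],
             tagging_rules: Set[str]) -> List[Tuple[str, str]]:
    """Keep only sentences whose predicted type is named by the tagging rules."""
    annotated = []
    for sentence in split_sentences(text):
        sentence_type = classify(sentence)
        if sentence_type in tagging_rules:
            annotated.append((sentence, sentence_type))
    return annotated


lease = ("The tenant shall pay rent monthly. "
         "The tenant may sublet the unit. "
         "Rent is due on the first.")
tags = annotate(lease, keyword_classifier, tagging_rules={"obligation"})
```

Passing the classifier in as a callable keeps the annotation step independent of whichever model is actually used.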

Method and apparatus for customizing natural language processing model

A method for model customization according to an embodiment includes providing a user with prediction results of each of a plurality of pre-trained natural language processing models for a document subjected to analysis selected from a document set including a plurality of documents; acquiring user feedback on the prediction results from the user; generating a plurality of augmented documents from at least one of the plurality of documents based on data attributes of each of the plurality of documents and the user feedback; and retraining at least one of the plurality of natural language processing models using training data including the plurality of augmented documents.

Using serial machine learning models to extract data from electronic documents
11594057 · 2023-02-28

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for machine learning. One of the methods includes receiving a document having a plurality of first text strings; extracting the plurality of first text strings from the document; providing the extracted plurality of first text strings to a first machine learning model, wherein the first machine learning model is trained to output a numerical vector representation for each input first text string; providing the output vector representations from the first machine learning model to a second machine learning model, wherein the second machine learning model is trained to output a second text string for each input vector representation; and processing the second text strings to generate an output.
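
The two-stage flow described above (text string to vector, vector to second text string, then post-processing) can be sketched as follows. Both `toy_encoder` and `toy_decoder` are hand-written stand-ins for the trained models, and the two-element vector, the "amount"/"name" labels, and the grouping step in `run_pipeline` are assumptions chosen only to make the data flow concrete.

```python
from typing import Callable, Dict, List

Vector = List[float]


def toy_encoder(text: str) -> Vector:
    """Stand-in for the first model: [digit fraction, letter fraction]."""
    if not text:
        return [0.0, 0.0]
    digits = sum(ch.isdigit() for ch in text)
    letters = sum(ch.isalpha() for ch in text)
    return [digits / len(text), letters / len(text)]


def toy_decoder(vector: Vector) -> str:
    """Stand-in for the second model: maps a vector to a second text string."""
    return "amount" if vector[0] > 0.5 else "name"


def run_pipeline(strings: List[str],
                 encoder: Callable[[str], Vector],
                 decoder: Callable[[Vector], str]) -> Dict[str, List[str]]:
    """Run the models in series and group the input strings by decoded label."""
    vectors = [encoder(s) for s in strings]      # first model
    labels = [decoder(v) for v in vectors]       # second model
    output: Dict[str, List[str]] = {}
    for text, label in zip(strings, labels):     # processing into an output
        output.setdefault(label, []).append(text)
    return output


extracted = ["Acme Corp", "1,250.00"]
result = run_pipeline(extracted, toy_encoder, toy_decoder)
```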

METHOD AND DEVICE FOR PROVIDING A TRUSTED ENVIRONMENT FOR EXECUTING AN ANALOGUE-DIGITAL SIGNATURE
20180013563 · 2018-01-11

The invention relates to the field of providing a trusted environment for executing an analogue-digital signature. The claimed document-signing device in the form of a stylus includes a protective compartment, in which the following are disposed: a microcontroller with a programme code; a memory with a secret digital signature key; and additionally inertial sensors, which are connected to the microcontroller; a lens; and a camera, which is also connected to the microcontroller. A wireless interface is used in order to communicate with a computer. The inertial sensors serve to verify the handwritten signature of the user, while the lens and camera serve to carry out a comparison with the text of an electronic document uploaded via the wireless interface. In this way it is ensured that verified information enters the trusted environment of the stylus.

Text recognition for a neural network
11710304 · 2023-07-25

Image data having text associated with a plurality of text-field types is received, the image data including target image data and context image data. The target image data includes target text associated with a text-field type. The context image data provides a context for the target image data. A trained neural network that is constrained to a set of characters for the text-field type is applied to the image data. The trained neural network identifies the target text of the text-field type using a vector embedding that is based on learned patterns for recognizing the context provided by the context image data. One or more predicted characters are provided for the target text of the text-field type in response to identifying the target text using the trained neural network.
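
One way to picture a recognizer that is constrained to a set of characters for a text-field type is to restrict the output choice at each character position to the characters allowed for that field. The abstract does not say how the constraint is enforced, so this decode-time restriction is an assumption, as are `ALPHABET`, `ALLOWED`, and `constrained_decode`; the per-position scores stand in for the output of the trained network.

```python
from typing import Dict, List, Sequence, Set

# Hypothetical output alphabet of the recognizer, one score per character.
ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ/-"

# Hypothetical character sets per text-field type.
ALLOWED: Dict[str, Set[str]] = {
    "date": set("0123456789/-"),
    "name": set("ABCDEFGHIJKLMNOPQRSTUVWXYZ-"),
}


def constrained_decode(scores: Sequence[Sequence[float]], field_type: str) -> str:
    """Pick, at each position, the best-scoring character allowed for the field."""
    allowed = ALLOWED[field_type]
    candidates = [i for i, ch in enumerate(ALPHABET) if ch in allowed]
    chars = []
    for step in scores:
        best = max(candidates, key=lambda i: step[i])
        chars.append(ALPHABET[best])
    return "".join(chars)


def one_step(values: Dict[str, float]) -> List[float]:
    """Build one position's score vector; unlisted characters get a low score."""
    step = [-10.0] * len(ALPHABET)
    for ch, value in values.items():
        step[ALPHABET.index(ch)] = value
    return step


# The letter 'O' scores highest at the first position, but a date field
# only permits digits and separators, so the digit '0' is chosen instead.
scores = [one_step({"O": 3.0, "0": 2.0}), one_step({"7": 1.0})]
as_date = constrained_decode(scores, "date")
as_name = constrained_decode([one_step({"O": 3.0, "0": 2.0})], "name")
```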