G06V30/41

Systems and methods for generating document score adjustments

Disclosed is a computer-implemented method for determining a score adjustment for a search document, comprising determining a first attractiveness model of a first document from one or more documents based on one or more user interactions associated with the first document; determining a second attractiveness model of a second document from one or more documents based on one or more user interactions associated with the second document; determining one or more pairwise comparisons of documents based on the first and second attractiveness models of the first and second documents; training an adjustment model based on the pairwise comparisons of documents; and inputting the search document into the adjustment model to determine the score adjustment.
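
The flow in this abstract (attractiveness models, then pairwise comparisons, then a trained adjustment model) can be illustrated with a minimal sketch. Everything concrete here is an assumption for illustration and is not taken from the patent: attractiveness is modeled as a smoothed click-through rate, each document is a hypothetical feature vector, and the adjustment model is a linear scorer trained with a RankNet-style pairwise logistic loss.

```python
import math

def attractiveness(clicks, impressions, prior=1.0):
    # Assumption: attractiveness = smoothed click-through rate.
    return (clicks + prior) / (impressions + 2.0 * prior)

def pairwise_comparisons(docs):
    # docs maps a name to (features, clicks, impressions).
    # Returns (winner_features, loser_features) for each unequal pair.
    names = sorted(docs)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            score_a = attractiveness(docs[a][1], docs[a][2])
            score_b = attractiveness(docs[b][1], docs[b][2])
            if score_a == score_b:
                continue  # no preference to learn from a tie
            winner, loser = (a, b) if score_a > score_b else (b, a)
            pairs.append((docs[winner][0], docs[loser][0]))
    return pairs

def train_adjustment_model(pairs, dim, lr=0.1, epochs=200):
    # Linear model trained by gradient descent on -log(sigmoid(w . diff)).
    w = [0.0] * dim
    for _ in range(epochs):
        for win, lose in pairs:
            diff = [x - y for x, y in zip(win, lose)]
            margin = sum(wi * di for wi, di in zip(w, diff))
            scale = 1.0 / (1.0 + math.exp(margin))  # equals 1 - sigmoid(margin)
            w = [wi + lr * scale * di for wi, di in zip(w, diff)]
    return w

def score_adjustment(w, features):
    # The adjustment is the learned linear score of the search document.
    return sum(wi * xi for wi, xi in zip(w, features))

docs = {
    "a": ([1.0, 0.0], 90, 100),  # frequently clicked
    "b": ([0.0, 1.0], 10, 100),  # rarely clicked
}
weights = train_adjustment_model(pairwise_comparisons(docs), dim=2)
```

With these toy inputs, a search document resembling "a" receives a higher adjustment than one resembling "b".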

SYSTEM AND METHOD FOR GENERATING BEST POTENTIAL RECTIFIED DATA BASED ON PAST RECORDINGS OF DATA
20230237822 · 2023-07-27

Various methods, apparatuses/systems, and media for data processing are disclosed. A processor receives a digital document; applies an optical character recognition (OCR) algorithm to the received digital document by utilizing an OCR tool; identifies defective data extracted by the OCR tool as a result of relatively inferior image quality of the received digital document; implements an auto-rectification algorithm on the identified defective data; automatically generates, in response to implementing the auto-rectification algorithm, corresponding auto-rectified data for each item of identified defective data; records the defective data and the corresponding auto-rectified data at a field level; receives user input data on the recorded auto-rectified data; determines whether the auto-rectified data is correct; and populates, based on determining that the auto-rectified data is correct, a machine learning model with the received user input data, to be utilized for subsequently received digital documents.
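
The record-and-confirm loop in this abstract can be sketched as follows. This is an illustrative sketch under stated assumptions, not the patented method: the defect test (a digit-only field containing non-digits), the character confusion map, and the `CorrectionStore` lookup table standing in for the machine learning model are all hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical OCR confusion map for digit-only fields.
CONFUSIONS = str.maketrans({"O": "0", "l": "1", "I": "1", "S": "5", "B": "8"})

@dataclass
class FieldRecord:
    # Field-level record of defective data and its auto-rectified form.
    field_name: str
    defective: str
    rectified: str

@dataclass
class CorrectionStore:
    # Stand-in for the machine learning model: remembers confirmed fixes.
    confirmed: dict = field(default_factory=dict)

    def rectify(self, field_name, value):
        # Prefer a previously confirmed correction, else apply the map.
        key = (field_name, value)
        if key in self.confirmed:
            return self.confirmed[key]
        return value.translate(CONFUSIONS)

    def feedback(self, record, user_value):
        # Store the user input only when it confirms the auto-rectified data.
        is_correct = user_value == record.rectified
        if is_correct:
            self.confirmed[(record.field_name, record.defective)] = user_value
        return is_correct

def process(store, fields):
    # Assumption: every field is digit-only, so non-digits mark defects.
    records = []
    for name, value in fields.items():
        if not value.isdigit():
            records.append(FieldRecord(name, value, store.rectify(name, value)))
    return records

store = CorrectionStore()
records = process(store, {"zip": "9O2l0", "qty": "42"})
```

Calling `store.feedback(records[0], "90210")` then records the confirmed correction so that later documents with the same defect reuse it.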

Method and System of Predictive Document Verification and Machine Learning Therefor
20230230088 · 2023-07-20

Provided are a methodology and system for countering fraudulent document and/or image use when a transaction based on the use of a given document or image must be authenticated. Also provided is a manner of machine learning that adapts the methodology for its implementation.

INFORMATION PROCESSING APPARATUS, NON-TRANSITORY COMPUTER READABLE MEDIUM, AND INFORMATION PROCESSING METHOD
20230231956 · 2023-07-20

An information processing apparatus includes a processor configured to: obtain image data; obtain information including at least one of setting information set in advance for optical character recognition processing by plural apparatuses capable of communicating with the information processing apparatus or attribute information of each of the plural apparatuses; and based on the obtained image data and the obtained information, determine an apparatus used for optical character recognition processing of the image data from among the plural apparatuses.

Image processing apparatus, image processing method, and storage medium
11704921 · 2023-07-18

Character recognition processing suited to each of a handwritten character area and a printed character area, among the character areas in a scanned image of a document, is performed. Next, the character recognition results for the handwritten character area and those for the printed character area are integrated; a likelihood indicating the probability of being an extraction target is calculated for each candidate character string, that is, each extraction candidate among the integrated character recognition results; and the character string that is the item value is determined. At the time of the determination, different evaluation indications are used depending on whether a character originating from the handwritten character area is included in the characters constituting the candidate character string.
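
The idea of evaluating a candidate differently when it contains handwritten characters can be sketched as below. The abstract does not say how the likelihood is computed or what the evaluation indications are, so this sketch assumes a mean per-character confidence and two hypothetical acceptance thresholds.

```python
from statistics import mean

# Hypothetical thresholds; the abstract does not specify the criteria.
PRINTED_THRESHOLD = 0.90
HANDWRITTEN_THRESHOLD = 0.70

def likelihood(candidate):
    # Assumption: likelihood = mean per-character recognition confidence.
    return mean(conf for _, conf, _ in candidate)

def contains_handwriting(candidate):
    return any(source == "handwritten" for _, _, source in candidate)

def select_item_value(candidates):
    # Each candidate is a list of (char, confidence, source) tuples taken
    # from the integrated recognition results.
    best_text, best_score = None, 0.0
    for candidate in candidates:
        score = likelihood(candidate)
        threshold = (HANDWRITTEN_THRESHOLD if contains_handwriting(candidate)
                     else PRINTED_THRESHOLD)
        if score >= threshold and score > best_score:
            best_text = "".join(ch for ch, _, _ in candidate)
            best_score = score
    return best_text

printed_only = [("1", 0.85, "printed"), ("2", 0.85, "printed")]
with_handwriting = [("4", 0.80, "handwritten"), ("2", 0.80, "printed")]
item_value = select_item_value([printed_only, with_handwriting])
```

Here the printed-only candidate is rejected under the stricter printed threshold, while the candidate containing a handwritten character passes the more lenient one.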

METHOD OF RECOGNIZING TEXT, DEVICE, STORAGE MEDIUM AND SMART DICTIONARY PEN

A method of recognizing text is disclosed. It relates to the field of artificial intelligence technology, in particular to computer vision and deep learning technology, and may be applied to optical character recognition or other applications. The method includes: acquiring a plurality of image sequences by continuously scanning a document; performing image stitching to obtain a plurality of successive frames of stitched images corresponding respectively to the plurality of image sequences, wherein an overlapping region exists between every two successive frames of stitched images; performing text recognition based on the plurality of successive frames of stitched images to obtain a plurality of corresponding recognition results; and performing de-duplication on the plurality of recognition results based on the overlapping region between every two successive frames of stitched images, so as to obtain a text recognition result for the document.
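
The de-duplication step can be illustrated with a minimal sketch. It assumes, hypothetically, that the overlapping region between successive stitched frames appears as text shared between the end of one recognition result and the start of the next. The abstract does not state how the overlap is located, so the longest suffix/prefix match is a stand-in technique.

```python
def merge_results(results, min_overlap=1):
    # Concatenate recognition results, dropping text repeated in the overlap.
    merged = ""
    for text in results:
        if not merged:
            merged = text
            continue
        overlap = 0
        longest = min(len(merged), len(text))
        # Find the longest prefix of `text` that is also a suffix of `merged`.
        for k in range(longest, min_overlap - 1, -1):
            if merged.endswith(text[:k]):
                overlap = k
                break
        merged += text[overlap:]
    return merged

frames = ["the quick br", "ick brown f", "own fox"]
document_text = merge_results(frames)
```

A pure string match like this can merge unrelated text that happens to share characters; raising the hypothetical `min_overlap` parameter reduces that risk.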
