Patent classifications
G06V30/41
METHODS, SYSTEMS, AND COMPUTER PROGRAM PRODUCTS FOR AUTOMATICALLY PROCESSING A CLINICAL RECORD FOR A PATIENT TO DETECT PROTECTED HEALTH INFORMATION (PHI) VIOLATIONS
A method includes receiving a record including at least one page and containing clinical information associated with a first patient; receiving respective first patient identification values for one or more patient identification parameters corresponding to the first patient; automatically processing the record to identify first example instances referencing the patient identification parameters including values therefor; automatically processing the record to identify second example instances of the first patient identification values; automatically processing the record to determine whether any of the at least one page contained therein cannot be semantically linked to another one of the at least one page in the record; and assigning a grade to each of the at least one page indicating a degree of confidence that the record does not include clinical information associated with a second patient based on the first example instances, the second example instances, and the determination whether any of the at least one page contained therein cannot be semantically linked to another one of the at least one page in the record.
HANDWRITTEN CONTENT REMOVING METHOD AND DEVICE AND STORAGE MEDIUM
A handwritten content removing method, a device, and a storage medium are provided. The handwritten content removing method comprises: acquiring an input image of a text page to be processed, the input image comprising a handwritten region that contains handwritten content (S10); identifying the input image so as to determine the handwritten content in the handwritten region (S11); and removing the handwritten content from the input image so as to obtain an output image (S12).
MACHINE LEARNING MODEL-AGNOSTIC CONFIDENCE CALIBRATION SYSTEM AND METHOD
A method may include extracting, from a document, a first key-value pair including a key and a first value and corresponding to a first confidence score, extracting a second key-value pair including the key and a second value corresponding to a second confidence score, classifying a first match probability for the first key-value pair and a second match probability for the second key-value pair, generating a first calibrated confidence score for the first confidence score and a second calibrated confidence score for the second confidence score by transforming, using precision lookup tables constructed from training records, the first match probability to the first calibrated confidence score and the second match probability to the second calibrated confidence score, selecting, using the first and second calibrated confidence scores, one of the first key-value pair and the second key-value pair, and presenting, in a graphical user interface (GUI), the selected key-value pair.
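The calibration step in this abstract can be illustrated with a minimal sketch. It assumes the precision lookup table is built by grouping the match probabilities of training records into equal-width bins and recording, per bin, the fraction of extractions that were correct. The abstract does not say how the tables are constructed, so the binning scheme, the fallback for empty bins, and the names `build_precision_table`, `calibrate`, and `select_pair` are all hypothetical.

```python
def build_precision_table(train_probs, train_correct, n_bins=5):
    """Per-bin precision: share of training extractions in each
    equal-width probability bin that were actually correct."""
    hits = [0] * n_bins
    totals = [0] * n_bins
    for prob, ok in zip(train_probs, train_correct):
        b = min(int(prob * n_bins), n_bins - 1)
        totals[b] += 1
        hits[b] += int(ok)
    # Empty bins fall back to the bin midpoint (an assumption of this sketch).
    return [
        hits[b] / totals[b] if totals[b] else (b + 0.5) / n_bins
        for b in range(n_bins)
    ]


def calibrate(prob, table):
    """Transform a match probability into a calibrated confidence score."""
    b = min(int(prob * len(table)), len(table) - 1)
    return table[b]


def select_pair(candidates, table):
    """candidates: list of (key, value, match_probability) tuples.
    Returns the candidate with the highest calibrated confidence."""
    scored = [(calibrate(p, table), key, value) for key, value, p in candidates]
    confidence, key, value = max(scored, key=lambda item: item[0])
    return {"key": key, "value": value, "calibrated_confidence": confidence}


# Toy training records: (match probability, was the extraction correct?)
train_probs = [0.95, 0.9, 0.85, 0.55, 0.5, 0.45, 0.15, 0.1]
train_correct = [True, True, True, True, False, False, False, False]
table = build_precision_table(train_probs, train_correct)

# Two competing extractions for the same key.
selected = select_pair(
    [("total", "100.00", 0.52), ("total", "108.00", 0.91)], table
)
```

Because the table is indexed only by the match probability, the same lookup works regardless of which extraction model produced the original confidence scores, which is one plausible reading of "model-agnostic" in the title.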
DOCUMENT AUTHENTICITY IDENTIFICATION METHOD AND APPARATUS, COMPUTER-READABLE MEDIUM, AND ELECTRONIC DEVICE
A document authenticity identification method is provided. A dynamic anti-counterfeiting point is detected in each document image of a subset of a plurality of document images. A static anti-counterfeiting point is detected in a document image of the plurality of document images. A static anti-counterfeiting point feature is generated based on image feature information of the static anti-counterfeiting point that is extracted from the document image. A dynamic anti-counterfeiting point feature is generated based on image feature information of the dynamic anti-counterfeiting point and variation feature information of the dynamic anti-counterfeiting point. A first authenticity result corresponding to the static anti-counterfeiting point is determined based on the static anti-counterfeiting point feature. A second authenticity result corresponding to the dynamic anti-counterfeiting point is determined based on the dynamic anti-counterfeiting point feature. Authenticity of the document is determined based on the first authenticity result and the second authenticity result.
SYSTEMS AND METHODS FOR EXTRACTING AND PROCESSING DATA USING OPTICAL CHARACTER RECOGNITION IN REAL-TIME ENVIRONMENTS
Methods and systems are provided for extracting and processing data using optical character recognition in real-time environments. For example, the methods and systems provide novel techniques for extracting data using OCR and a mechanism for processing that data. These methods and systems are particularly relevant in real-time environments because they limit the need for manual review.
Amendment Tracking In An Online Document System
An online document system can allow users to track various amendments made over time and corresponding to an original document. The online document system accesses the original document comprising a plurality of content sections and a set of amendment documents each comprising one or more amendments to the original document. The online document system applies a machine-learned model to the original document and the set of amendment documents to identify, for each amendment, a content section of the plurality that corresponds to the amendment and a type of amendment corresponding to the amendment. The online document system generates an amended original document comprising the plurality of content sections modified to include each amendment. The online document system displays the amended original document by displaying each of the plurality of content sections and, in conjunction with each content section, highlighting any amendments corresponding to the content section.
METHOD FOR DETECTING FRAUD IN DOCUMENTS
Described are methods and systems for detecting fraud in documents. First images of a first set of genuine documents and second images of a second set of genuine documents are obtained. A printed feature, spacings between printed features in the first images, and positions of printed features in the second images are selected. Selected features, spacings and positions are annotated to obtain original landmark locations for each printed feature, spacing and position. Annotated features, spacings and positions are transformed to obtain transformed features, transformed spacings and transformed positions. The transformed features, spacings and positions are combined with a noise model to generate modified features, modified spacings and modified positions. Each modified feature, modified spacing and modified position comprises annotations indicating modified landmark locations. Input data for a machine learning model is generated using original landmark locations and modified landmark locations. The machine learning model is trained using the input data.
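The transform-then-perturb step described above can be sketched as follows. This is only an illustration under stated assumptions: the transformation is taken to be a 2D rotation, uniform scaling, and translation, and the noise model is taken to be independent Gaussian jitter on each coordinate. The abstract names neither, and the function names are hypothetical.

```python
import math
import random


def transform_landmarks(points, angle_deg=0.0, scale=1.0, shift=(0.0, 0.0)):
    """Rotate, scale, and translate annotated landmark locations."""
    a = math.radians(angle_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    return [
        (scale * (x * cos_a - y * sin_a) + shift[0],
         scale * (x * sin_a + y * cos_a) + shift[1])
        for x, y in points
    ]


def add_noise(points, sigma, rng):
    """Assumed noise model: independent Gaussian jitter per coordinate."""
    return [(x + rng.gauss(0.0, sigma), y + rng.gauss(0.0, sigma))
            for x, y in points]


def make_training_pair(original, rng, angle_deg, scale, shift, sigma):
    """Pair original landmark locations with their modified counterparts."""
    transformed = transform_landmarks(original, angle_deg, scale, shift)
    return {"original": list(original),
            "modified": add_noise(transformed, sigma, rng)}


rng = random.Random(0)  # seeded so the sketch is repeatable
landmarks = [(10.0, 20.0), (40.0, 20.0), (10.0, 60.0)]  # toy annotations
pair = make_training_pair(landmarks, rng, angle_deg=2.0, scale=1.01,
                          shift=(0.5, -0.5), sigma=0.3)
```

Each resulting pair keeps the original and modified landmark locations side by side, matching the abstract's statement that the input data for the model is generated from both.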
IMAGE PROCESSING DEVICE, IMAGE PROCESSING METHOD, AND NON-TEMPORARY RECORDING MEDIUM
An image processing device includes a reader and a controller. The reader acquires a document image formed on a sheet. The controller acquires an area image in a designated area designated in the document image acquired by the reader, performs a character recognition process and a code recognition process on the area image, and, in response to the character recognition process not recognizing a character, employs the code recognition result of the code recognition process performed on the area image as the recognition result of the area image.
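The fallback logic in this abstract can be sketched in a few lines. The two recognizers are passed in as plain callables (`read_text` and `read_code`, both hypothetical) so the sketch does not depend on any particular OCR or barcode library. The abstract only states what happens when no character is recognized; using the text result otherwise is an assumption of this sketch.

```python
from typing import Callable, Optional


def recognize_area(
    area_image: object,
    read_text: Callable[[object], Optional[str]],
    read_code: Callable[[object], Optional[str]],
) -> dict:
    """Run both recognizers on the area image and pick the result to use."""
    # Both processes are performed on the same area image.
    text = (read_text(area_image) or "").strip()
    code = read_code(area_image)
    if text:
        # Assumption: a non-empty character result takes precedence.
        return {"source": "character", "value": text}
    # No character recognized: employ the code recognition result.
    return {"source": "code", "value": code}


# Stand-in recognizers for illustration only.
result = recognize_area(
    "area.png",
    read_text=lambda image: "",               # OCR finds no characters
    read_code=lambda image: "4901234567894",  # barcode decoder succeeds
)
```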
AUTOMATED CATEGORIZATION AND ASSEMBLY OF LOW-QUALITY IMAGES INTO ELECTRONIC DOCUMENTS
An apparatus includes a memory and a processor. The memory stores document categories, text generated from an image of a physical document page, and a machine learning algorithm. The text includes errors associated with noise in the image. The machine learning algorithm is configured to extract, from the text, a first plurality of features associated with natural language processing and a second plurality of features associated with the errors. The machine learning algorithm is also configured to generate a feature vector that includes the first and second pluralities of features, and to generate, based on the feature vector, a set of probabilities, each of which is associated with a document category and indicates a probability that the physical document from which the text was generated belongs to that document category. The processor applies the machine learning algorithm to the text to generate the set of probabilities, identifies a largest probability, and assigns the image to the associated document category.
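A rough sketch of the pipeline in this abstract follows: language features and OCR-error features are concatenated into one feature vector, mapped to per-category probabilities, and the largest probability decides the category. The specific features (keyword frequencies, symbol ratio, share of non-alphabetic tokens), the linear softmax scorer, and the weights are all assumptions made for illustration; the abstract does not specify them.

```python
import math

KEYWORDS = ("invoice", "total", "patient", "diagnosis")  # hypothetical vocabulary


def nlp_features(tokens):
    """First plurality: relative frequency of each keyword."""
    n = max(len(tokens), 1)
    return [tokens.count(word) / n for word in KEYWORDS]


def error_features(text, tokens):
    """Second plurality: crude proxies for OCR noise in the text."""
    symbols = sum(1 for c in text if not (c.isalnum() or c.isspace()))
    garbled = sum(1 for t in tokens if not t.isalpha())
    return [symbols / max(len(text), 1), garbled / max(len(tokens), 1)]


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def categorize(text, weights):
    """weights: {category: one weight per feature}. Returns (best, probs)."""
    tokens = text.lower().split()
    features = nlp_features(tokens) + error_features(text, tokens)
    categories = list(weights)
    scores = [sum(w * f for w, f in zip(weights[c], features)) for c in categories]
    probs = dict(zip(categories, softmax(scores)))
    best = max(probs, key=probs.get)  # category with the largest probability
    return best, probs


# Made-up weights; a real system would learn these from labeled pages.
weights = {
    "billing": [5.0, 5.0, 0.0, 0.0, 0.0, 0.0],
    "clinical": [0.0, 0.0, 5.0, 5.0, 0.0, 0.0],
}
best, probs = categorize("inv0ice total t0tal due: invoice #123", weights)
```

In this toy input the garbled tokens (`inv0ice`, `t0tal`) do not match any keyword, so the error features are where such noise would show up; the made-up weights above simply ignore them.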