Patent classifications
G06V30/40
Document portion identification in a recorded video
Document portion identification in a recorded video is disclosed, including: obtaining a recorded video; identifying a document portion that appears during the recorded video, wherein the document portion belongs to a document; and determining a video segment during which the document portion appears in the recorded video.
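The abstract does not say how the video segment is determined. One plausible reading, sketched below, assumes each sampled frame has already been labeled with the document portion it shows (or `None` when no document is visible) and collapses consecutive frames with the same label into a segment. The function name `segments_from_frames` and the `(timestamp, label)` input format are hypothetical, not taken from the patent.

```python
from typing import List, Optional, Tuple

# (timestamp in seconds, detected document-portion id or None) -- assumed input format
Frame = Tuple[float, Optional[str]]
# (document-portion id, first timestamp seen, last timestamp seen)
Segment = Tuple[str, float, float]


def segments_from_frames(frames: List[Frame]) -> List[Segment]:
    """Collapse runs of consecutive frames with the same label into segments."""
    segments: List[Segment] = []
    current: Optional[str] = None
    start = end = 0.0
    for ts, label in frames:
        if label != current:
            # The label changed: close the run that was open, if any.
            if current is not None:
                segments.append((current, start, end))
            current, start = label, ts
        end = ts
    if current is not None:
        segments.append((current, start, end))
    return segments


frames = [
    (0.0, None),
    (1.0, "page-1"),
    (2.0, "page-1"),
    (3.0, "page-2"),
    (4.0, None),
    (5.0, "page-1"),
]
print(segments_from_frames(frames))
```

A portion that reappears later in the video yields a separate segment, which may or may not match the behavior the patent claims.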
OPTICAL RECEIPT PROCESSING
Techniques for providing improved optical character recognition (OCR) for receipts are discussed herein. Some embodiments may provide for a system including one or more servers configured to perform receipt image cleanup, logo identification, and text extraction. The image cleanup may include transforming image data of the receipt by using image parameter values that optimize the logo identification, and performing logo identification using a comparison of the image data with training logos associated with merchants. When a merchant is identified, a second image cleanup may be performed by using image parameter values optimized for text extraction. A receipt structure may be used to categorize the extracted text. Improved OCR accuracy is also achieved by applying format rules of the receipt structure to the extracted text.
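As an illustration of the last step, applying a format rule to extracted text, the sketch below assumes a hypothetical rule that a price field must look like `12.50`. It swaps characters that OCR commonly confuses with digits and then validates the result. The confusion table, the regular expression, and the name `apply_price_rule` are assumptions made for this example, not details from the patent.

```python
import re
from typing import Optional

# Hypothetical table of letters that OCR often produces in place of digits.
CONFUSIONS = str.maketrans({"O": "0", "o": "0", "l": "1", "I": "1", "S": "5", "B": "8"})

# Hypothetical format rule: digits, a decimal point, exactly two digits.
PRICE_RE = re.compile(r"^\d+\.\d{2}$")


def apply_price_rule(raw: str) -> Optional[str]:
    """Return a corrected price string, or None if the text cannot be a price."""
    candidate = raw.strip().lstrip("$").translate(CONFUSIONS)
    return candidate if PRICE_RE.match(candidate) else None


print(apply_price_rule("$l2.5O"))  # prints 12.50
print(apply_price_rule("TOTAL"))   # prints None
```

The substitution is only safe because the receipt structure has already identified the field as a price. Applying it to free text would corrupt ordinary words.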
LEARNING USER INTERFACE CONTROLS VIA INCREMENTAL DATA SYNTHESIS
A User Interface (UI) object detection system employs an initial dataset comprising a set of images, which may include synthesized images, to train a Machine Learning (ML) engine to generate an initial trained model. A data point generator is employed to generate an updated synthesized image set, which is used to further train the ML engine. The data point generator may employ images generated by an application program as a reference by which to generate the updated synthesized image set. The images generated by the application program may be tagged in advance. Alternatively, or in addition, the images generated by the application program may be captured dynamically by a user using the application program.
ANOMALY AND FRAUD DETECTION WITH FAKE EVENT DETECTION USING MACHINE LEARNING
The present disclosure involves systems, software, and computer-implemented methods for transaction auditing. One example method includes training at least one machine learning model to determine features that can be used to determine whether an image is an authentic image of a document or an automatically generated document image, using a training set of authentic images and a training set of automatically generated document images. A request to classify an image as either an authentic image of a document or an automatically generated document image is received. The machine learning model(s) are used to classify the image as either an authentic image of a document or an automatically generated document image, based on features included in the image that are identified by the machine learning model(s). A classification of the image is provided. The machine learning model(s) are updated based on the image and the classification of the image.
EFFICIENT BOUNDING BOX MERGING
A system can merge text bounding boxes such as Optical Character Recognition (OCR) bounding boxes. A document can comprise a plurality of the text bounding boxes. Distances between text bounding boxes can be compared against a distance threshold. Distance thresholds can vary depending on context information associated with the document. In response to a determination that text bounding boxes satisfy the distance threshold, the text bounding boxes can be assigned to a bounding box group.
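The grouping step can be sketched as follows, under several assumptions: boxes are axis-aligned `(x_min, y_min, x_max, y_max)` tuples, distance is the Euclidean gap between box edges, and grouping is transitive (implemented with union-find). The per-context threshold values and all names here are hypothetical. The pairwise loop is a naive O(n²) illustration and is not the efficient method the title refers to.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)

# Hypothetical thresholds, in pixels, keyed by document context.
THRESHOLD_BY_CONTEXT: Dict[str, float] = {"receipt": 5.0, "invoice": 12.0}


def box_gap(a: Box, b: Box) -> float:
    """Euclidean gap between two boxes; 0.0 if they touch or overlap."""
    dx = max(a[0] - b[2], b[0] - a[2], 0.0)
    dy = max(a[1] - b[3], b[1] - a[3], 0.0)
    return (dx * dx + dy * dy) ** 0.5


def group_boxes(boxes: List[Box], threshold: float) -> List[List[int]]:
    """Group box indices so that boxes within `threshold` share a group."""
    parent = list(range(len(boxes)))

    def find(i: int) -> int:
        # Find the group root, halving the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            if box_gap(boxes[i], boxes[j]) <= threshold:
                ri, rj = find(i), find(j)
                if ri != rj:
                    parent[rj] = ri

    groups: Dict[int, List[int]] = {}
    for i in range(len(boxes)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


boxes = [(0, 0, 10, 10), (12, 0, 20, 10), (100, 100, 110, 110)]
print(group_boxes(boxes, THRESHOLD_BY_CONTEXT["receipt"]))  # prints [[0, 1], [2]]
```

The first two boxes are 2 pixels apart and fall into one group under the 5-pixel threshold, while the third stays alone.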
COMPUTER-VISION PICKUP SYSTEM AND METHODS
Real-time video is captured of a pickup area for orders at a store. The video images are analyzed to track unique orders being placed in the pickup area and orders being removed from it. A customer at a location remote from the pickup area operates a customer-operated device to identify the store where the customer placed an order. Images of the orders that are present within the pickup area and order-identifying information for the orders are provided to the customer via the customer-operated device.