G06V30/147

IMAGE CAPTURING METHOD
20230138373 · 2023-05-04

An image capturing method includes: providing an image capturing area on a display screen of a user device; providing an indication area in the image capturing area; identifying each license plate in the indication area; calculating a license plate area of each license plate in the indication area; calculating a center point difference from a center point of each license plate in the indication area to a center point of the indication area; marking a license plate that has a license plate area larger than half of a largest license plate area and has a smallest center point difference; and capturing an image including the marked license plate in the image capturing area.
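
The marking step can be sketched as follows. This is a minimal illustration, not the patented implementation: the `Plate` class and `select_plate` function are hypothetical names, the "center point difference" is assumed to be a Euclidean distance, and the selection rule is read as "among plates larger than half the largest area, pick the one closest to the center of the indication area".

```python
from dataclasses import dataclass
from math import hypot
from typing import Optional, Sequence, Tuple


@dataclass
class Plate:
    """Hypothetical axis-aligned bounding box of a detected license plate."""
    x: float  # left edge
    y: float  # top edge
    w: float  # width
    h: float  # height

    @property
    def area(self) -> float:
        return self.w * self.h

    @property
    def center(self) -> Tuple[float, float]:
        return (self.x + self.w / 2, self.y + self.h / 2)


def select_plate(plates: Sequence[Plate],
                 indication_center: Tuple[float, float]) -> Optional[Plate]:
    """Return the plate to mark, or None when no plate was identified."""
    if not plates:
        return None
    largest_area = max(p.area for p in plates)
    # Keep only plates whose area exceeds half of the largest plate area.
    candidates = [p for p in plates if p.area > largest_area / 2]

    def center_difference(p: Plate) -> float:
        # Assumption: the center point difference is a Euclidean distance.
        cx, cy = p.center
        return hypot(cx - indication_center[0], cy - indication_center[1])

    return min(candidates, key=center_difference)
```

Under this reading, a small plate sitting very close to the center is skipped in favor of a sufficiently large plate that is nearly as central, which suppresses distant vehicles in the background.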

UNIFIED PRETRAINING FRAMEWORK FOR DOCUMENT UNDERSTANDING

The technology described includes methods for pretraining a document encoder model based on multimodal self cross-attention. One method includes receiving image data that encodes a set of pretraining documents. A set of sentences is extracted from the image data. A bounding box for each sentence is generated. For each sentence, a set of predicted features is generated by using an encoder machine-learning model. The encoder model performs cross-attention between a set of masked-textual features for the sentence and a set of masked-visual features for the sentence. The set of masked-textual features is based on a masking function and the sentence. The set of masked-visual features is based on the masking function and the corresponding bounding box. A document-encoder model is pretrained based on the set of predicted features for each sentence and pretraining tasks. The pretraining tasks include masked sentence modeling, visual contrastive learning, or visual-language alignment.

GENERALIZED ANOMALY DETECTION

Described are methods and systems for training a system for detecting anomalies in images of documents in a class of documents. A plurality of training document images of training documents in a class of documents are obtained. For each training document image, the training document image is segmented into a plurality of region of interest (ROI) images, each ROI image corresponding to a respective ROI of the training document. For each ROI image, a plurality of transformations are applied to the ROI image to generate respective transform-specific features for the ROI image and respective transform-specific anomaly scores from the transform-specific features. Based on the respective anomaly scores of the plurality of training document images, a transform-specific threshold is computed for each transformation to separate document images containing an anomaly from document images not containing an anomaly.
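
The abstract does not say how each transform-specific threshold is derived from the training scores. The sketch below shows one common possibility, assumed here purely for illustration: the training documents are treated as anomaly-free and each threshold is set to the mean score plus `k` standard deviations. The function names, the `k` parameter, and the input layout (one list of document-level scores per transformation) are all hypothetical.

```python
from statistics import mean, pstdev
from typing import Dict, List


def compute_thresholds(scores_by_transform: Dict[str, List[float]],
                       k: float = 3.0) -> Dict[str, float]:
    """Map each transformation name to an anomaly-score threshold.

    Assumption: threshold = mean + k * population standard deviation of the
    scores observed on training documents for that transformation.
    """
    return {
        name: mean(scores) + k * pstdev(scores)
        for name, scores in scores_by_transform.items()
    }


def is_anomalous(scores: Dict[str, float],
                 thresholds: Dict[str, float]) -> bool:
    """Flag a document if any transform-specific score exceeds its threshold."""
    return any(scores[name] > thresholds[name] for name in thresholds)
```

Keeping a separate threshold per transformation matters because scores from different transformations are generally not on a comparable scale.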

READING SYSTEM, READING DEVICE, READING METHOD, AND STORAGE MEDIUM
20220398824 · 2022-12-15

According to one embodiment, a reading system includes an extractor, a determiner, and a reader. The extractor extracts a candidate image from an input image. The candidate image is a candidate of a portion of the input image in which a segment display is imaged. The determiner uses the candidate image and a mask to calculate a match ratio indicating a certainty of a segment display being included in the candidate image, and determines that the candidate image is an image of a segment display when the match ratio is not less than a threshold. The mask and the threshold are preset. The reader reads a numerical value displayed in a segment display from the candidate image determined to be an image of a segment display.
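
The determiner's decision can be sketched as below. The abstract does not define the match ratio, so this example assumes binary images of equal size and takes the ratio to be the fraction of the candidate's foreground pixels that fall on mask pixels. `match_ratio` and `is_segment_display` are hypothetical names.

```python
from typing import Sequence

Image = Sequence[Sequence[int]]  # rows of 0/1 pixels


def match_ratio(candidate: Image, mask: Image) -> float:
    """Fraction of foreground pixels in `candidate` that lie where `mask` is 1.

    Assumption: both images are binary and have the same dimensions.
    """
    foreground = 0
    hits = 0
    for cand_row, mask_row in zip(candidate, mask):
        for c, m in zip(cand_row, mask_row):
            if c:
                foreground += 1
                if m:
                    hits += 1
    return hits / foreground if foreground else 0.0


def is_segment_display(candidate: Image, mask: Image, threshold: float) -> bool:
    # "Not less than a threshold" in the abstract corresponds to >=.
    return match_ratio(candidate, mask) >= threshold
```

Only candidates that pass this check would be handed to the reader, so digits are never read from regions that merely resemble a display.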

MERCHANT ADVERTISEMENT INFORMED ITEM LEVEL DATA PREDICTIONS
20230018545 · 2023-01-19

Systems as described herein may include predicting item level data based on merchant advertisement information. A transaction pattern may be detected. The merchant advertisement information may be retrieved and parsed to generate a price list. A number of transactions that each shares a common payment amount may be determined and the number may reach a threshold value. Items from the price list may be matched with the common payment amount. The transaction records may be updated to indicate likely item level transaction information. In a variety of embodiments, the likely transaction information may be presented to a user.
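
A minimal sketch of the matching step follows, under simplifying assumptions that are not in the abstract: amounts are integer cents, a payment matches an item only when it equals the advertised price exactly (no tax, tips, or multi-item baskets), and the record layout and the `annotate_transactions` name are hypothetical.

```python
from collections import Counter
from typing import Dict, List


def annotate_transactions(transactions: List[dict],
                          price_list: Dict[str, int],
                          min_count: int) -> List[dict]:
    """Return copies of the records, adding `likely_items` where applicable.

    A payment amount counts as "common" once at least `min_count`
    transactions share it.
    """
    counts = Counter(t["amount_cents"] for t in transactions)

    # Invert the price list: price in cents -> item names at that price.
    items_by_price: Dict[int, List[str]] = {}
    for item, price_cents in price_list.items():
        items_by_price.setdefault(price_cents, []).append(item)

    annotated = []
    for t in transactions:
        record = dict(t)  # leave the input records untouched
        amount = t["amount_cents"]
        if counts[amount] >= min_count:
            record["likely_items"] = items_by_price.get(amount, [])
        annotated.append(record)
    return annotated
```

Requiring a minimum count before matching is what separates a recurring purchase pattern from a coincidental price match on a single transaction.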

Methods and system for imaging of moving printed materials by an optical device having plurality of cameras arranged in an array
11811983 · 2023-11-07

A system for capturing images during production of printed material includes an optical device comprising a plurality of cameras arranged in an array with adjacent pairs of cameras having overlapping fields of view. An imaging controller device determines a layout of content on printed material, and determines, based on the layout, an optical system configuration profile. Determining the optical system configuration profile includes selecting one or more cameras for capturing images of regions of interest on the printed material and determining a trigger interval for triggering the selected one or more cameras. The imaging controller device triggers the selected cameras at times determined based on the trigger interval to capture images of the regions of interest on the printed material as the printed material moves in fields of view of the one or more cameras during production of the printed material.
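
The two parts of the configuration profile can be illustrated with a short sketch. It rests on assumptions that the abstract does not state: each camera's field of view is modeled as an interval across the width of the material, a camera is selected when that interval overlaps the region of interest, and the trigger interval is the layout's repeat length divided by the material speed. All names and units are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Camera:
    """Hypothetical camera with a field of view spanning [x_min, x_max] in mm."""
    name: str
    x_min: float
    x_max: float


def select_cameras(cameras: List[Camera],
                   roi_x_min: float, roi_x_max: float) -> List[str]:
    """Names of cameras whose field of view overlaps the region of interest."""
    return [c.name for c in cameras
            if c.x_min < roi_x_max and roi_x_min < c.x_max]


def trigger_interval_s(repeat_length_mm: float, speed_mm_per_s: float) -> float:
    """Seconds between triggers so each repeat of the layout is imaged once."""
    if speed_mm_per_s <= 0:
        raise ValueError("material speed must be positive")
    return repeat_length_mm / speed_mm_per_s
```

Because adjacent cameras have overlapping fields of view, a region of interest near a boundary can select two cameras, and the captured images would then need to be combined or one of them chosen.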

Method for detecting image of esophageal cancer using hyperspectral imaging

This application provides a method for detecting images of a test object using hyperspectral imaging. First, hyperspectral imaging information is obtained from a reference image. This information is used to obtain a corresponding hyperspectral image from an input image, along with corresponding feature values, which are simplified by principal component analysis. Feature images are then obtained using convolution kernels, and an image of the object under detection is located in the feature images using a default box and a boundary box. By comparison with esophageal cancer sample images, the image of the object under detection is classified as an esophageal cancer image or a non-esophageal cancer image. In this way, a convolutional neural network examines an input image from the image capturing device to judge whether it is an esophageal cancer image, helping the doctor interpret the image of the object under detection.

FRAMEWORK FOR DOCUMENT LAYOUT AND INFORMATION EXTRACTION

Provided herein are system, apparatus, device, method, and/or computer program product embodiments, and/or combinations and sub-combinations thereof, for extracting data from a file. Embodiments described herein provide a framework to merge outputs of various models comprising extracted information from a file with its location information and annotated regions of interest into an output file ingestible by a database or knowledge base.

AUTOMATIC MULTI-PLATE RECOGNITION

A method and related system operations include obtaining a video stream with an image sensor of a camera device and detecting a plurality of target objects by executing a neural network model based on the video stream with a vision processor unit of the camera device. The method also includes generating a plurality of bounding boxes and determining a plurality of character sequences by, for each respective bounding box of the plurality of bounding boxes, performing a set of optical character recognition (OCR) operations to determine a respective character sequence of the plurality of character sequences. The method also includes updating a plurality of tracklets to indicate the plurality of bounding boxes and storing the plurality of tracklets in association with the plurality of character sequences in a memory of the camera device.

Data processing and classification

The present invention discloses a method, a system and a computer program product for data processing and classification. The invention provides warm start and cold start classification tools for classification of data obtained from known or unknown entities. The system and method are also configured to be employed over blockchain based networks.