G06V30/19153

VIDEO OBJECT DETECTION WITH CO-OCCURRENCE

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for model co-occurrence object detection. One of the methods includes accessing, for a training image, first data that indicates a detected bounding box for a first object depicted in the training image and a predicted type label, accessing, for the training image, ground truth data for one or more ground truth objects, determining, using the first data and the ground truth data, that i) the detected bounding box represents an object that is not a ground truth object represented by the ground truth data or ii) the predicted type label for the first object does not match a ground truth label for the first object identified by the ground truth data, determining a penalty to adjust the model using a distance between the detected bounding box and the labeled bounding box, and training the model using the penalty.
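The abstract does not say how the penalty depends on the distance, so the following is only a minimal sketch under stated assumptions: boxes are `(x_min, y_min, x_max, y_max)` tuples, a detection counts as a ground truth object when its IoU with the nearest labeled box reaches a threshold, and the penalty grows linearly with the center-to-center distance. The function names and the `iou_threshold`, `base`, and `scale` parameters are hypothetical and do not come from the source.

```python
import math


def center(box):
    """Return the (x, y) center of a box given as (x_min, y_min, x_max, y_max)."""
    x_min, y_min, x_max, y_max = box
    return ((x_min + x_max) / 2.0, (y_min + y_max) / 2.0)


def iou(a, b):
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def detection_penalty(det_box, det_label, ground_truth,
                      iou_threshold=0.5, base=1.0, scale=100.0):
    """Return 0.0 for a correct detection, otherwise a distance-based penalty.

    ground_truth is a list of (box, label) pairs. The linear form
    base + distance / scale is an assumption made for illustration only.
    """
    if not ground_truth:
        return base  # nothing labeled, so any detection is an error

    det_x, det_y = center(det_box)

    def distance_to(entry):
        gt_x, gt_y = center(entry[0])
        return math.hypot(det_x - gt_x, det_y - gt_y)

    # Compare against the labeled box whose center is closest to the detection.
    nearest = min(ground_truth, key=distance_to)
    is_ground_truth_object = iou(det_box, nearest[0]) >= iou_threshold
    if is_ground_truth_object and det_label == nearest[1]:
        return 0.0  # neither error condition i) nor ii) applies
    return base + distance_to(nearest) / scale
```

In a training loop, a value like this would typically be added to or used to weight the model's loss for the offending detection; how the described method applies it is not stated in the abstract.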

PICTURE PROCESSING METHOD, AND TASK DATA PROCESSING METHOD AND APPARATUS
20200401829 · 2020-12-24

A picture processing method is provided for a computer device. The method includes obtaining a to-be-processed picture; extracting a text feature in the to-be-processed picture using a machine learning model; and determining text box proposals at any angle in the to-be-processed picture according to the text feature. Corresponding subtasks are performed by using processing units corresponding to substructures in the machine learning model, and at least part of the processing units comprise a field-programmable gate array (FPGA) unit. The method also includes performing rotation region of interest (RROI) pooling processing on each text box proposal, and projecting the text box proposal onto a feature graph of a fixed size, to obtain a text box feature graph corresponding to the text box proposal; and recognizing text in the text box feature graph, to obtain a text recognition result.
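The projection of a rotated text box proposal onto a fixed-size grid can be illustrated with a simplified sketch. It assumes a single-channel feature map stored as a list of rows, a proposal given as `(cx, cy, w, h, angle)` with the angle in radians, and one nearest-pixel sample per output bin; RROI pooling as usually described takes a maximum over each bin, so this is a stand-in rather than the method of the abstract. The function name and box format are hypothetical.

```python
import math


def rroi_pool(feature_map, box, out_h, out_w):
    """Project a rotated box onto an out_h x out_w grid.

    feature_map: list of rows of numbers (single channel).
    box: (cx, cy, w, h, angle) with the angle in radians.
    Each output bin takes the value of the pixel under its center;
    samples that fall outside the feature map yield 0.0.
    """
    cx, cy, w, h, angle = box
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    rows, cols = len(feature_map), len(feature_map[0])

    pooled = []
    for i in range(out_h):
        pooled_row = []
        for j in range(out_w):
            # Bin center in box-local coordinates (origin at the box center).
            u = ((j + 0.5) / out_w - 0.5) * w
            v = ((i + 0.5) / out_h - 0.5) * h
            # Rotate into feature-map coordinates.
            x = cx + u * cos_a - v * sin_a
            y = cy + u * sin_a + v * cos_a
            col, row = math.floor(x), math.floor(y)
            if 0 <= row < rows and 0 <= col < cols:
                pooled_row.append(feature_map[row][col])
            else:
                pooled_row.append(0.0)
        pooled.append(pooled_row)
    return pooled
```

Because every proposal is resampled to the same `out_h` by `out_w` shape regardless of its size or angle, the downstream text recognizer can work on inputs of a single fixed size.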

MACHINE LEARNING BASED INFORMATION EXTRACTION
20240062568 · 2024-02-22

Computer-readable media, methods, and systems are disclosed for applying machine learning mechanisms to classify and validate documents based on expense rule sets and external data validation services. Document images associated with expenses are received in connection with a reimbursable event. For each received document image, data associated with the received document image is transmitted to an optical character recognition (OCR) image processor that can recognize contents and associated coordinates. OCR data is received and transmitted to a text tokenizer. Tokenized text corresponding to expense details is received, and the tokenized text and coordinates are sent to a text feature generator. Text feature vectors are received and transmitted to a document classifier, and a document classification is received. Document fields are extracted, and based thereon a document is validated and a corresponding reimbursement instruction is generated.

Visual labeling for machine learning training
11907336 · 2024-02-20

Systems, methods, and computer-readable media are disclosed for visual labeling of training data items for training a machine learning model. Training data items may be generated for training the machine learning model. Visual labels, such as QR codes, may be created for the training data items. The creation of the training data item and the visual label may be automated. The visual labels and the training data items may be combined to obtain a labeled training data item. The labeled training data item may comprise a separator to distinguish the training data item from the visual label. The labeled training data item may be used for training and validation of the machine learning model. The machine learning model may analyze the training data item, attempt to identify the training data item, and compare the identification against the embedded label.
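The combine-then-split flow with a separator can be sketched as follows. This is an illustration only: the training data item is a small grid of pixel values, the visual label is a row of 0/1 pixels encoding an integer class id (a stand-in for a QR code, which would need an external library), and the separator is a row of a sentinel value. The names `SEPARATOR`, `attach_label`, and `split_labeled` are hypothetical.

```python
SEPARATOR = -1  # hypothetical sentinel value not used by item or label pixels


def encode_label(class_id, bits=8):
    """Encode an integer class id as a row of 0/1 pixels (QR code stand-in)."""
    return [(class_id >> i) & 1 for i in reversed(range(bits))]


def decode_label(pixels):
    """Inverse of encode_label: read the bits back into an integer."""
    value = 0
    for bit in pixels:
        value = (value << 1) | bit
    return value


def attach_label(item_rows, class_id, bits=8):
    """Return the item with a separator row and a label row appended."""
    width = len(item_rows[0])
    if width < bits:
        raise ValueError("item is too narrow to hold the label")
    label_row = encode_label(class_id, bits) + [0] * (width - bits)
    return [row[:] for row in item_rows] + [[SEPARATOR] * width, label_row]


def split_labeled(labeled_rows, bits=8):
    """Split a labeled item at the separator into (item_rows, class_id)."""
    sep_index = next(
        i for i, row in enumerate(labeled_rows)
        if all(pixel == SEPARATOR for pixel in row)
    )
    item_rows = labeled_rows[:sep_index]
    class_id = decode_label(labeled_rows[sep_index + 1][:bits])
    return item_rows, class_id
```

During validation, the model's prediction for `item_rows` would be compared against the `class_id` recovered from the embedded label.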

CHARACTER RECOGNITION DEVICE AND CHARACTER RECOGNITION METHOD

A character recognition device includes a recognizer that recognizes at least one character string from an image including a trailer captured by an imaging device, an attribute determinator that determines an attribute of the character string recognized by the recognizer, and a trailer ID estimator that estimates whether the character string is a trailer ID based on the attribute of the character string determined by the attribute determinator.
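The abstract does not specify which attributes are used, so the sketch below assumes a purely illustrative one: the character-class pattern of the string. The attribute names and the pattern treated as a likely trailer ID (two to four capital letters followed by three to seven digits) are hypothetical, and the recognizer is replaced by a list of already recognized strings.

```python
import re


def determine_attribute(text):
    """Classify a recognized string into a coarse, hypothetical attribute."""
    s = text.strip()
    if re.fullmatch(r"[A-Z]{2,4}[- ]?\d{3,7}", s):
        return "alnum_id"    # letters followed by digits, e.g. a fleet number
    if re.fullmatch(r"\d+", s):
        return "numeric"
    if re.fullmatch(r"[A-Za-z ]+", s):
        return "alphabetic"  # e.g. a company name painted on the trailer
    return "other"


def estimate_trailer_ids(recognized_strings):
    """Return the strings whose attribute suggests a trailer ID."""
    return [s for s in recognized_strings
            if determine_attribute(s) == "alnum_id"]
```

A deployed estimator could combine several attributes, such as position on the trailer or character size, but those choices are not given in the source.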

ENTITY EXTRACTION VIA DOCUMENT IMAGE PROCESSING

A document processing system processes a document image to identify document image regions including floating images, structured data units, and unstructured floating text. A first masked image is generated by deleting any floating images from the document image and a second masked image is generated by deleting any structured data units from the first masked image. The structured data units and the unstructured floating text are thus identified serially one after another. Textual data is extracted from the structured data units and the unstructured floating text by processing the corresponding document image regions via optical character recognition (OCR). Entities are extracted from the textual data using natural language processing (NLP) techniques.
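The serial masking order can be sketched as below. The image is a list of pixel rows, regions are `(top, left, bottom, right)` boxes with exclusive bottom and right edges, and the two detector callables are hypothetical placeholders for whatever models locate floating images and structured data units; none of these names come from the source.

```python
def mask_regions(image, boxes, fill=0):
    """Return a copy of image with every (top, left, bottom, right) box filled."""
    masked = [row[:] for row in image]
    for top, left, bottom, right in boxes:
        for r in range(top, bottom):
            for c in range(left, right):
                masked[r][c] = fill
    return masked


def serial_masking(image, detect_floating_images, detect_structured_units):
    """Mask floating images first, then structured units, one after another.

    Both detectors are caller-supplied callables that take an image and
    return a list of boxes. The second detector only sees the first masked
    image, so deleted floating images cannot be mistaken for structured data.
    """
    image_boxes = detect_floating_images(image)
    first_masked = mask_regions(image, image_boxes)

    unit_boxes = detect_structured_units(first_masked)
    second_masked = mask_regions(first_masked, unit_boxes)

    # What remains unmasked in second_masked is the unstructured floating text.
    return first_masked, second_masked, unit_boxes
```

The regions in `unit_boxes` and the remaining content of `second_masked` would then be passed to OCR, followed by NLP-based entity extraction.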

Medical image processing method and apparatus, electronic medical device, and storage medium

A medical image processing method includes: obtaining a biological tissue image including a biological tissue; recognizing, in the biological tissue image, a first region of a lesion object in the biological tissue; recognizing a lesion attribute matching the lesion object; dividing an image region of the biological tissue in the biological tissue image into a plurality of quadrant regions; obtaining target quadrant position information of a quadrant region in which the first region is located; and generating medical service data according to the target quadrant position information and the lesion attribute.
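The quadrant lookup can be illustrated with a minimal sketch. It assumes that both the tissue region and the lesion region are axis-aligned boxes `(x_min, y_min, x_max, y_max)` in image coordinates with y increasing downward, that the tissue region is split into four quadrants through its center, and that the lesion is assigned to the quadrant containing its center. The abstract does not state how the dividing lines are chosen, and the function names and output fields are hypothetical.

```python
def box_center(box):
    """Center of a box given as (x_min, y_min, x_max, y_max)."""
    return ((box[0] + box[2]) / 2.0, (box[1] + box[3]) / 2.0)


def quadrant_of(lesion_box, tissue_box):
    """Name the quadrant of the tissue region that holds the lesion center."""
    lesion_x, lesion_y = box_center(lesion_box)
    tissue_x, tissue_y = box_center(tissue_box)
    vertical = "upper" if lesion_y < tissue_y else "lower"  # y grows downward
    horizontal = "left" if lesion_x < tissue_x else "right"
    return f"{vertical}-{horizontal}"


def medical_service_data(lesion_box, lesion_attribute, tissue_box):
    """Combine quadrant position and lesion attribute into one record."""
    return {
        "quadrant": quadrant_of(lesion_box, tissue_box),
        "lesion_attribute": lesion_attribute,
    }
```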

Character recognition device and character recognition method

A character recognition device includes a recognizer that recognizes at least one character string from an image including a trailer captured by an imaging device, an attribute determinator that determines an attribute of the character string recognized by the recognizer, and a trailer ID estimator that estimates whether the character string is a trailer ID based on the attribute of the character string determined by the attribute determinator.

MEDICAL IMAGE PROCESSING METHOD AND APPARATUS, ELECTRONIC MEDICAL DEVICE, AND STORAGE MEDIUM
20240266054 · 2024-08-08

A medical image processing method includes: obtaining a biological tissue image including a biological tissue; recognizing, in the biological tissue image, a first region of a lesion object in the biological tissue; recognizing a lesion attribute matching the lesion object; dividing an image region of the biological tissue in the biological tissue image into a plurality of quadrant regions; obtaining target quadrant position information of a quadrant region in which the first region is located; and generating medical service data according to the target quadrant position information and the lesion attribute.

Picture processing method, and task data processing method and apparatus

A picture processing method is provided for a computer device. The method includes obtaining a to-be-processed picture; extracting a text feature in the to-be-processed picture using a machine learning model; and determining text box proposals at any angle in the to-be-processed picture according to the text feature. Corresponding subtasks are performed by using processing units corresponding to substructures in the machine learning model, and at least part of the processing units comprise a field-programmable gate array (FPGA) unit. The method also includes performing rotation region of interest (RROI) pooling processing on each text box proposal, and projecting the text box proposal onto a feature graph of a fixed size, to obtain a text box feature graph corresponding to the text box proposal; and recognizing text in the text box feature graph, to obtain a text recognition result.