G06V30/1918

Image processing apparatus, image processing method, and storage medium
11704921 · 2023-07-18

Character recognition processing suited to each area type is performed on a handwritten character area and a printed character area among the character areas in a scanned image of a document. Next, the character recognition results for the handwritten character area and those for the printed character area are integrated. A likelihood indicating the probability of being an extraction target is calculated for each candidate character string that is an extraction candidate among the integrated character recognition results, and the character string that is the item value is determined. In this determination, different evaluation indications are used depending on whether the characters constituting the candidate character string include a character originating from the handwritten character area.
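The determination step can be illustrated with a minimal sketch. It assumes that the "different evaluation indications" are two likelihood thresholds, one for candidates containing handwritten characters and one for purely printed candidates. The abstract does not say this, and the `Candidate` class, the threshold values, and `select_item_value` are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    text: str              # candidate character string from the integrated OCR results
    likelihood: float      # probability of being the extraction target
    has_handwritten: bool  # True if any character came from a handwritten area


# Hypothetical thresholds; the abstract does not specify the evaluation indications.
HANDWRITTEN_THRESHOLD = 0.6
PRINTED_THRESHOLD = 0.8


def select_item_value(candidates):
    """Return the text of the most likely candidate that passes its own threshold."""
    best = None
    for c in candidates:
        threshold = HANDWRITTEN_THRESHOLD if c.has_handwritten else PRINTED_THRESHOLD
        if c.likelihood >= threshold and (best is None or c.likelihood > best.likelihood):
            best = c
    return best.text if best else None
```

In this sketch a handwritten candidate with likelihood 0.7 is accepted while a printed candidate with likelihood 0.75 is rejected, because each is judged against the threshold for its own origin.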

Multiple field of view (FOV) vision system

Multiple field of view (FOV) systems are disclosed herein. An example system includes a bioptic barcode reader having a target imaging region. The bioptic barcode reader includes at least one imager having a first FOV and a second FOV and is configured to capture an image of a target object from each FOV. The example system includes one or more processors configured to receive the images and a trained object recognition model stored in memory communicatively coupled to the one or more processors. The memory includes instructions that, when executed, cause the one or more processors to analyze the images to identify at least a portion of a barcode and one or more features associated with the target object. The instructions further cause the one or more processors to determine a target object identification probability and to determine whether a predicted product identifies the target object.

PREPROCESSOR TRAINING FOR OPTICAL CHARACTER RECOGNITION

A method includes executing an Optical Character Recognition (OCR) preprocessor on training images to obtain OCR preprocessor output, executing an OCR engine on the OCR preprocessor output to obtain OCR engine output, and executing an approximator on the OCR preprocessor output to obtain approximator output. The method further includes iteratively adjusting the approximator to simulate the OCR engine using the OCR engine output and the approximator output, and generating OCR preprocessor losses using the approximator output and target labels. The method further includes iteratively adjusting the OCR preprocessor using the OCR preprocessor losses to obtain a customized OCR preprocessor.

TERM WEIGHT GENERATION METHOD, APPARATUS, DEVICE AND MEDIUM
20230057010 · 2023-02-23

A term weight determination method includes: obtaining a video and video-associated text, the video-associated text including at least one term; generating a halfway vector of the term by performing multimodal feature fusion on the features of the video, the video-associated text and the at least one term; and generating the weight of the at least one term based on the halfway vector of the at least one term.

INTER-WORD SCORE CALCULATION APPARATUS, QUESTION AND ANSWER EXTRACTION SYSTEM AND INTER-WORD SCORE CALCULATION METHOD
20220366714 · 2022-11-17

The degree of relatedness between words included in an amount of data can be calculated, allowing suitable related words and phrases to be extracted. An inter-word score calculation apparatus includes: document data (a response history document) in which documents input from outside are accumulated; term list data in which predetermined terms are written; a word combination unit that can execute a combination process of amplifying an amplification candidate word, which is a word that corresponds to a term written in the term list data and is included in a document constituting the document data, and adding the amplified word to the document data; and an inter-word score calculation unit that calculates a degree of relatedness between words in the document data using a predetermined calculation method. When the number of documents accumulated in the document data is smaller than a first predetermined amount, the word combination unit adds the amplification candidate word to the document data.
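The amplification and scoring steps can be sketched as follows. This is an illustration under stated assumptions, not the claimed method: "amplifying" is taken to mean appending extra copies of listed terms to a document, and the "predetermined calculation method" is replaced by a simple count-weighted co-occurrence score. The function names and the `factor` and `min_docs` parameters are hypothetical.

```python
from collections import Counter
from itertools import combinations


def amplify(docs, term_list, min_docs, factor=2):
    """Repeat listed terms `factor` times per document, but only for small corpora.

    Assumption: `min_docs` plays the role of the first predetermined amount.
    """
    if len(docs) >= min_docs:
        return [list(d) for d in docs]
    amplified = []
    for d in docs:
        extra = [w for w in d if w in term_list] * (factor - 1)
        amplified.append(list(d) + extra)
    return amplified


def cooccurrence_scores(docs):
    """Score each word pair by the product of their counts, summed over documents.

    Assumption: this stands in for the unspecified inter-word score.
    """
    scores = Counter()
    for d in docs:
        counts = Counter(d)
        for a, b in combinations(sorted(counts), 2):
            scores[(a, b)] += counts[a] * counts[b]
    return scores
```

With two short documents and `"password"` in the term list, amplification doubles the weight of every pair involving `"password"`, which is one way a small response history could still surface the listed terms as strongly related words.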

Cross modality training of machine learning models

There is provided a method, comprising: providing a training dataset including medical images and corresponding text-based reports, and concurrently training a natural language processing (NLP) machine learning (ML) model for generating an NLP category for a target text-based report and a visual ML model for generating a visual finding for a target image, by: training the NLP ML model using the text-based reports of the training dataset and a ground truth comprising the visual finding generated by the visual ML model in response to an input of the images corresponding to the text-based reports of the training dataset, and training the visual ML model using the images of the training dataset and a ground truth comprising the NLP category generated by the NLP ML model in response to an input of the text-based reports corresponding to the images of the training dataset.

SYSTEMS AND METHODS FOR SYNCHRONIZING AN IMAGE SENSOR

Systems and methods for synchronization are provided. In some aspects, a method for synchronizing an image sensor is provided. The method includes receiving image data captured using an image sensor that is moving along a pathway, and assembling an image sensor trajectory using the image data. The method also includes receiving position data acquired along the pathway using a position sensor, wherein timestamps for the image data and position data are asynchronous, and assembling a position sensor trajectory using the position data. The method further includes generating a spatial transformation that aligns the image sensor trajectory and position sensor trajectory, and synchronizing the image sensor based on the spatial transformation.

PICTURE SEARCH METHOD AND APPARATUS, ELECTRONIC DEVICE, COMPUTER-READABLE STORAGE MEDIUM
20230082638 · 2023-03-16

A picture search method and apparatus, an electronic device, a computer-readable storage medium, and a computer program product relate to the field of artificial intelligence. The method includes: obtaining an OCR result of pictures in a preset picture library in response to a picture search request; traversing pictures in the preset picture library that have not been subjected to low-dimensional OCR processing and high-dimensional OCR processing, and performing the low-dimensional OCR processing based on an OCR threshold on each of the traversed pictures to obtain a low-dimensional OCR result of each corresponding picture; determining a target picture matching a key character string in the preset picture library according to at least one of the low-dimensional OCR result and the high-dimensional OCR result of each picture; and determining the target picture as a search result of the picture search request and displaying the search result.

CONTENT RECOGNITION METHOD AND APPARATUS, COMPUTER DEVICE, AND STORAGE MEDIUM

A method for content recognition includes acquiring, from a content for recognition, a text piece and a media piece associated with the text piece, performing a first feature extraction on the text piece to obtain text features, performing a second feature extraction on the media piece associated with the text piece to obtain media features, and determining feature association measures between the media features and the text features. A feature association measure for a first feature in the media features and a second feature in the text features indicates an association degree between the first feature and the second feature. The method further includes adjusting the text features based on the feature association measures to obtain adjusted text features, and performing a recognition based on the adjusted text features to obtain a content recognition result of the content. Apparatus and non-transitory computer-readable storage medium counterpart embodiments are also contemplated.

MODULAR ARCHITECTURE FOR ASYNCHRONOUS LATE-FUSION OF OBJECTS
20230074275 · 2023-03-09

Systems and methods for asynchronous late-fusion of measurements. State information and intermediate values may be calculated as measurements arrive and are stored. When late sensor measurements arrive out of the temporal order in which the measurements were generated, the late measurements are stored in temporal order rather than the order in which measurements arrive. State information is then recalculated to account for the late-arriving sensor measurement, with state outputs propagated forward in temporal order using the previously computed intermediate values to speed up computation. In this manner, more accurate revised state information is efficiently generated, accounting for any late-arriving measurements. This modular processing framework also enables sensors to be added or changed, which may cause measurements to arrive asynchronously, without having to reprogram the processing framework.
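The reordering and recomputation described above can be sketched in a few lines. This is a minimal illustration, assuming scalar measurements and an exponential-smoothing update as a stand-in for the unspecified state estimator. `LateFusionBuffer`, `alpha`, and `_step` are hypothetical names, and the cached per-measurement states play the role of the "previously computed intermediate values".

```python
import bisect
from dataclasses import dataclass, field


@dataclass
class LateFusionBuffer:
    times: list = field(default_factory=list)   # measurement timestamps, kept sorted
    values: list = field(default_factory=list)  # measurements, in timestamp order
    states: list = field(default_factory=list)  # states[i] = state after measurement i
    alpha: float = 0.5                          # hypothetical smoothing factor

    def _step(self, prev, value):
        # Hypothetical update rule standing in for the real state estimator.
        return value if prev is None else (1 - self.alpha) * prev + self.alpha * value

    def add(self, t, value):
        """Insert a measurement by timestamp and return the revised latest state."""
        i = bisect.bisect_right(self.times, t)
        self.times.insert(i, t)
        self.values.insert(i, value)
        # States before position i are unaffected, so they are reused as-is.
        del self.states[i:]
        prev = self.states[i - 1] if i > 0 else None
        # Propagate forward in temporal order from the insertion point.
        for v in self.values[i:]:
            prev = self._step(prev, v)
            self.states.append(prev)
        return self.states[-1]
```

Feeding the measurements at times 1.0, 3.0, and then a late one at 2.0 yields the same final state as feeding them in temporal order, and only the states from the insertion point onward are recomputed.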