Patent classifications
G06V30/24
Storage Medium Storing Editing Program and Information Processing Apparatus
A non-transitory computer-readable storage medium stores an editing program including a set of program instructions for an information processing apparatus comprising a controller and an input interface. The set of program instructions, when executed by the controller, causes the information processing apparatus to perform: acquiring a plurality of strokes inputted via the input interface; calculating a distance between two strokes of the acquired plurality of strokes; in response to determining that the calculated distance is shorter than a distance threshold, recognizing the two strokes as a same item; in response to determining that the calculated distance is longer than or equal to the distance threshold, recognizing the two strokes as separate items; and changing the distance threshold based on input via the input interface.
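The threshold-based grouping described above can be sketched as follows. This is an illustrative reading, not the patented implementation: strokes are assumed to be lists of (x, y) points, the distance between two strokes is taken as the minimum point-to-point gap, and strokes closer than the (user-adjustable) threshold are unioned into one item.

```python
import math
from itertools import combinations

def stroke_distance(a, b):
    """Minimum Euclidean distance between any two points of two strokes."""
    return min(math.dist(p, q) for p in a for q in b)

def group_strokes(strokes, threshold):
    """Union strokes into the same item when their distance is below threshold."""
    parent = list(range(len(strokes)))

    def find(i):
        # Union-find with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(strokes)), 2):
        if stroke_distance(strokes[i], strokes[j]) < threshold:
            parent[find(i)] = find(j)

    items = {}
    for i in range(len(strokes)):
        items.setdefault(find(i), []).append(i)
    return list(items.values())
```

Raising or lowering `threshold` (the claimed "changing the distance threshold based on input") directly changes how many items the same strokes form.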
Method and system for visio-linguistic understanding using contextual language model reasoners
This disclosure relates generally to visio-linguistic understanding. Conventional methods use a contextual visio-linguistic reasoner for visio-linguistic understanding, which requires more compute power and a large amount of pre-training data. Embodiments of the present disclosure provide a method for visio-linguistic understanding using a contextual language model reasoner. The method converts the visual information of an input image into a format that the contextual language model reasoner understands and accepts for a downstream task. The method utilizes the image captions and the confidence scores associated with the image captions, along with a knowledge graph, to obtain a combined input in a format compatible with the contextual language model reasoner. Contextual embeddings corresponding to the downstream task are obtained using the combined input. The disclosed method can be used to solve several downstream tasks such as scene understanding, visual question answering, visual common-sense reasoning, and so on.
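The "combined input" step could look roughly like the sketch below, which serializes captions with their confidence scores and related knowledge-graph triples into one text sequence a language model reasoner could accept. The marker tokens (`[CAP]`, `[KG]`, `[Q]`) and the exact serialization format are assumptions for illustration, not the disclosed format.

```python
def build_combined_input(captions, kg_triples, question):
    """Serialize captions (text, confidence), KG triples (subject, relation,
    object), and a downstream-task question into one text sequence."""
    parts = [f"[CAP {score:.2f}] {text}" for text, score in captions]
    parts += [f"[KG] {s} {r} {o}" for s, r, o in kg_triples]
    parts.append(f"[Q] {question}")
    return " ".join(parts)
```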
IMAGE FORMING APPARATUS THAT PERFORMS INSPECTION PROCESSING ON PRINT DATA AND METHOD OF CONTROLLING IMAGE FORMING APPARATUS
An image forming apparatus is provided that is capable of controlling execution of inspection without increasing the time period required to complete printing. On a registration screen, whether or not to execute inspection of data to be printed is set, and keywords indicative of confidentiality are registered. Text information is extracted from the data to be printed, and whether or not any registered keyword matches the text information is determined. Execution of print processing of the data to be printed is controlled based on a result of the determination. When non-execution of inspection is set, the print processing of the data to be printed is executed without executing the determination, whereas when execution of inspection is set, the print processing of the data to be printed is controlled based on a result of the determination.
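The gating logic described above reduces to a small decision function. This is a minimal sketch of that reading, assuming the extracted text and registered keywords are plain strings; it is not the apparatus's actual control flow.

```python
def should_print(extracted_text, keywords, inspect_enabled):
    """Gate print processing: when inspection is disabled, always print
    (no keyword check is run); when enabled, block any job whose text
    contains a registered confidentiality keyword."""
    if not inspect_enabled:
        return True
    return not any(kw in extracted_text for kw in keywords)
```

Note that the keyword check is skipped entirely when inspection is off, matching the claim that non-execution of inspection avoids adding time to the print job.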
Recognition and selection of a discrete pattern within a scene containing multiple patterns
A memory device is provided including instructions that, when executed, cause one or more processors to perform the steps including receiving a plurality of images acquired by a camera, the plurality of images including a plurality of optical patterns, wherein an optical pattern of the plurality of optical patterns encodes an object identifier. The steps include presenting the plurality of images comprising the plurality of optical patterns on a display, and presenting a plurality of visual indications overlying the plurality of optical patterns in the plurality of images. The steps also include identifying a selected optical pattern of the plurality of optical patterns based on a user action and a position of the selected optical pattern in one or more of the plurality of images. The steps also include decoding the selected optical pattern to generate the object identifier and storing the object identifier in a second memory device.
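Identifying the selected pattern "based on a user action and a position" can be illustrated as a simple hit test: the tap position is checked against each detected pattern's bounding box. The bounding-box representation and the `(object_id, box)` tuple shape are assumptions for illustration.

```python
def select_pattern(patterns, tap):
    """Return the object identifier of the optical pattern whose bounding
    box contains the tap point, or None if the tap hits no pattern.
    patterns: list of (object_id, (x0, y0, x1, y1)) in image coordinates."""
    x, y = tap
    for obj_id, (x0, y0, x1, y1) in patterns:
        if x0 <= x <= x1 and y0 <= y <= y1:
            return obj_id
    return None
```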
Shadow and cloud masking for remote sensing images in agriculture applications using a multilayer perceptron
A method for shadow and cloud masking for remote sensing images of an agricultural field using multi-layer perceptrons includes electronically receiving an observed image, performing using at least one processor an image segmentation of the observed image to divide the observed image into a plurality of image segments or superpixels, extracting features for each of the image segments using the at least one processor, and determining by a cloud mask generation module executing on the at least one processor a classification for each of the image segments using the features extracted for each of the image segments, wherein the cloud mask generation module applies a classification model including an ensemble of multilayer perceptrons to generate a cloud mask for the observed image such that each pixel within the observed image has a corresponding classification.
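The per-segment ensemble classification step can be sketched as a majority vote over an ensemble, with the winning label propagated to every pixel in the segment. The trained multilayer perceptrons are stood in for by arbitrary callables here; the label set ("cloud", "shadow", "clear") and the voting rule are assumptions for illustration.

```python
from collections import Counter

def ensemble_classify(features, models):
    """Majority-vote an ensemble of classifiers on one segment's features.
    Each model is a callable mapping a feature vector to a class label."""
    votes = Counter(model(features) for model in models)
    return votes.most_common(1)[0][0]

def build_mask(segments, models):
    """Label every segment (superpixel) with the ensemble's vote; each
    pixel inherits its segment's label. segments: {segment_id: features}."""
    return {seg_id: ensemble_classify(feats, models)
            for seg_id, feats in segments.items()}
```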
Mapping optical-code images to an overview image
Images of optical codes are mapped to an overview image to localize optical codes within a space. By localizing optical codes, information about locations of various products can be ascertained. One or more techniques can be used to map the images of optical codes to the overview image. The overview image can be a composite image formed by stitching together several images.
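Once a code image has been registered against the overview (e.g. via stitching), localizing a code reduces to applying the registration transform to its pixel position. A minimal sketch, assuming the registration yields a 2x3 affine transform; real stitching would typically estimate a full homography.

```python
def map_to_overview(point, transform):
    """Map a point from a code image into overview-image coordinates
    using a 2x3 affine transform ((a, b, tx), (c, d, ty))."""
    x, y = point
    (a, b, tx), (c, d, ty) = transform
    return (a * x + b * y + tx, c * x + d * y + ty)
```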
SYSTEMS AND METHODS FOR HANDWRITING RECOGNITION
Examples described herein generally relate to systems and methods for handwriting recognition. In an example, a computing device may receive input corresponding to a handwritten word and apply a first recognition model to the input. The first recognition model may be configured to determine that a first confidence level of a first portion of the input is greater than a second confidence level of a second portion of the input. The computing device may also apply a second recognition model to the input, wherein the second recognition model is different from the first recognition model, and combine results of the first recognition model and the second recognition model to determine a list of candidate words. The computing device may also output one or more candidate words from the list of candidate words.
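Combining the two models' results could be done, for example, by a weighted merge of their per-candidate confidence scores. The dict-of-scores representation and the linear weighting are assumptions for illustration; the examples do not specify the combination rule.

```python
def combine_candidates(results_a, results_b, weight_a=0.5):
    """Merge two recognition models' candidate scores into one ranked list.
    results_a, results_b: dicts mapping candidate word -> confidence in [0, 1].
    Missing candidates contribute a score of 0 from that model."""
    combined = {}
    for word in set(results_a) | set(results_b):
        combined[word] = (weight_a * results_a.get(word, 0.0)
                          + (1 - weight_a) * results_b.get(word, 0.0))
    return sorted(combined, key=combined.get, reverse=True)
```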
IMAGE RECOGNITION METHOD AND APPARATUS, TRAINING METHOD, ELECTRONIC DEVICE, AND STORAGE MEDIUM
An image recognition method and apparatus, a training method, an electronic device, and a storage medium are provided. The image recognition method includes: acquiring an image to be recognized, the image to be recognized including a target text; and determining text content of the target text based on knowledge information and image information of the image to be recognized.
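One simple way knowledge information could influence text recognition is by reranking recognition hypotheses against a knowledge lexicon. This sketch is one illustrative reading, not the disclosed method: hypotheses matching known terms are preferred, with a fallback to the raw top hypothesis.

```python
def rerank_with_knowledge(hypotheses, known_terms):
    """Pick the text content for a target text: prefer the highest-scored
    hypothesis that appears in the knowledge lexicon, else fall back to
    the highest-scored raw hypothesis. hypotheses: list of (text, score)."""
    matches = [(t, s) for t, s in hypotheses if t in known_terms]
    pool = matches or hypotheses
    return max(pool, key=lambda ts: ts[1])[0]
```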