Patent classification: G06V30/1444
LEARNING CONTRASTIVE REPRESENTATION FOR SEMANTIC CORRESPONDENCE
A multi-level contrastive training strategy for training a neural network relies on image pairs (no other labels) to learn semantic correspondences at the image level and region or pixel level. The neural network is trained using contrasting image pairs including different objects and corresponding image pairs including different views of the same object. Conceptually, contrastive training pulls corresponding image pairs closer and pushes contrasting image pairs apart. An image-level contrastive loss is computed from the outputs (predictions) of the neural network and used to update parameters (weights) of the neural network via backpropagation. The neural network is also trained via pixel-level contrastive learning using only image pairs. Pixel-level contrastive learning receives an image pair, where each image includes an object in a particular category.
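The image-level loss described above can be sketched as an InfoNCE-style objective: corresponding pairs (two views of the same object) are pulled together and contrasting pairs (different objects) pushed apart. The function names, the cosine-similarity choice, and the temperature value are illustrative assumptions, not details from the patent.

```python
import math

def cosine(a, b):
    # Cosine similarity between two embedding vectors (plain Python lists).
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def contrastive_loss(anchor, positive, negatives, temperature=0.1):
    # InfoNCE-style loss: the anchor is pulled toward the positive (another
    # view of the same object) and pushed away from each negative (an
    # embedding of a different object). Low loss when the anchor is much
    # closer to the positive than to any negative.
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))
```

The same form applies at the pixel level by treating each embedding as a per-pixel feature rather than a whole-image feature.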
Electronic device and method for providing multiple services respectively corresponding to multiple external objects included in image
An electronic device according to various embodiments includes a communication circuit, a memory, and a processor, and the processor is configured to: receive a first image from a first external electronic device by using the communication circuit; perform image recognition on the first image; generate information regarding an external object included in the first image, based on a result of the recognition; based on the information regarding the external object satisfying a first designated condition, transmit at least a portion of the first image to a second external electronic device corresponding to the first designated condition; and, based on the information regarding the external object satisfying a second designated condition, transmit the at least a portion of the first image to a third external electronic device corresponding to the second designated condition.
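The recognize-then-route behavior above can be sketched as follows; `recognize`, the condition predicates, and the device identifiers are all hypothetical stand-ins for the claimed components.

```python
def route_image(image_bytes, recognize, conditions):
    # Run recognition on the received image, then forward it to every
    # external device whose designated condition the recognized object
    # information satisfies. `conditions` is a list of
    # (predicate, device_id) pairs; both are assumptions for illustration.
    info = recognize(image_bytes)  # e.g. {"label": "car", "confidence": 0.9}
    destinations = []
    for condition, device in conditions:
        if condition(info):
            destinations.append(device)
    return destinations
```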
TEXT RECOGNITION IN IMAGE
According to implementations of the subject matter described herein, there is provided a solution for text recognition in an image. In this solution, a target text line area, which is expected to include a text to be recognized, is determined from an image. Probability distribution information of one or more character model elements present in the target text line area is determined using a single character model. The single character model is trained based on training text line areas and respective ground-truth texts in the training text line areas. Texts in the training text line areas are arranged in different orientations, and/or the ground-truth texts comprise texts related to various languages (e.g., texts related to a Latin language and an Eastern language). The text in the target text line area can be determined based on the determined probability distribution information. The single character model enables more efficient and convenient text recognition.
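The final step, determining the text from per-element probability distributions, can be illustrated with a greedy decoder: at each position, pick the alphabet entry with the highest probability. This is one plausible decoding rule, not the specific method claimed.

```python
def decode_text(prob_rows, alphabet):
    # prob_rows: one probability distribution over `alphabet` per character
    # model element detected in the text line area. Greedily take the
    # argmax at each position and concatenate the results.
    chars = []
    for row in prob_rows:
        best = max(range(len(alphabet)), key=lambda i: row[i])
        chars.append(alphabet[best])
    return "".join(chars)
```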
IMAGE PROCESSING APPARATUS, NON-TRANSITORY STORAGE MEDIUM, AND IMAGE PROCESSING METHOD
When entity extraction processing fails to extract the character string corresponding to a predetermined item from a first document image being processed, that character string is acquired based on positional information about the area where the character string corresponding to the same item was previously extracted in a second document image having the same format as the first document image.
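The fallback logic above can be sketched as follows; `entity_extract`, `crop_ocr`, and the cache structure are illustrative names, not from the patent.

```python
def extract_item(entity_extract, crop_ocr, image, cached_positions, item):
    # First try normal entity extraction on the current document image.
    value = entity_extract(image, item)
    if value is not None:
        return value
    # On failure, fall back to the area where this item was previously
    # extracted in a same-format document, and re-read just that region.
    box = cached_positions.get(item)
    if box is None:
        return None
    return crop_ocr(image, box)
```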
COUNTERFEIT OBJECT DETECTION USING IMAGE ANALYSIS
A system may receive user interface information that indicates an image, associated with a web page, that depicts an object for which a counterfeit estimation is to be determined, text associated with the web page, or an entity identifier that identifies an entity associated with the web page and the object. The system may determine a first estimation that the object is counterfeit based on performing an image analysis of the image, a second estimation that the object is counterfeit based on performing text analysis of the text, or a third estimation that the object is counterfeit based on performing an entity analysis of the entity. The system may determine the counterfeit estimation based on the first estimation, the second estimation, or the third estimation. The counterfeit estimation may indicate a likelihood that the object is counterfeit. The system may transmit information that identifies the counterfeit estimation.
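The abstract leaves the combination of the three per-signal estimations open; one simple realization is to average whichever scores are available, as sketched below (the averaging rule and parameter names are assumptions).

```python
def counterfeit_estimation(image_est=None, text_est=None, entity_est=None):
    # Combine the available estimations (each a likelihood in [0, 1]) from
    # image analysis, text analysis, and entity analysis into a single
    # counterfeit likelihood. A plain average is assumed here.
    scores = [s for s in (image_est, text_est, entity_est) if s is not None]
    if not scores:
        raise ValueError("at least one estimation is required")
    return sum(scores) / len(scores)
```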
AUTOMATED LICENSE PLATE RECOGNITION SYSTEM AND RELATED METHOD
Systems, methods, devices and computer readable media for determining a geographical location of a license plate are described herein. A first image of a license plate is acquired by a first image acquisition device of a camera unit and a second image of the license plate is acquired by a second image acquisition device of the camera unit. A three-dimensional position of the license plate relative to the camera unit is determined based on stereoscopic image processing of the first image and the second image. A geographical location of the camera unit is obtained. A geographical location of the license plate is determined from the three-dimensional position of the license plate relative to the camera unit and the geographical location of the camera unit. Other systems, methods, devices and computer readable media for detecting a license plate and identifying a license plate are described herein.
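The two geometric steps, stereoscopic depth from the image pair and the geographic offset from the camera's location, can be sketched with the standard stereo relation Z = f·B/d and a small-offset flat-earth approximation. Parameter values and names are illustrative only.

```python
import math

def plate_depth(disparity_px, focal_px, baseline_m):
    # Standard stereo triangulation: depth Z = focal length * baseline /
    # disparity, giving the plate's distance from the camera unit in meters.
    return focal_px * baseline_m / disparity_px

def plate_geolocation(cam_lat, cam_lon, east_m, north_m):
    # Offset the camera unit's geographic location by the plate's relative
    # position (meters east/north), using ~111,320 m per degree of latitude.
    dlat = north_m / 111_320.0
    dlon = east_m / (111_320.0 * math.cos(math.radians(cam_lat)))
    return cam_lat + dlat, cam_lon + dlon
```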
Reading support system and moving body
According to one embodiment, a reading support system includes a processing device. The processing device includes an extractor and a type determiner. The extractor extracts a plurality of regions from a candidate region. The candidate region is a candidate of a region in which a meter is imaged. The regions respectively include a plurality of characters of the meter. The type determiner determines a type of the meter based on positions of the regions.
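The abstract does not state the position-based rule the type determiner uses; one plausible toy rule is sketched below: if the character regions' centers lie roughly on one horizontal line, treat the meter as a counter-style (digit row) meter, otherwise as a round analog meter whose digits ring the dial.

```python
def meter_type(region_centers, tolerance=5.0):
    # region_centers: (x, y) center of each extracted character region.
    # Classify by the vertical spread of the centers; the threshold and the
    # two type labels are assumptions, not from the patent.
    ys = [y for _, y in region_centers]
    spread = max(ys) - min(ys)
    return "counter" if spread <= tolerance else "round"
```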
Systems and methods for image modification and image based content capture and extraction in neural networks
Systems and methods for image modification to increase contrast between text and non-text pixels within the image. In one embodiment, an original document image is scaled to a predetermined size for processing by a convolutional neural network. The convolutional neural network identifies a probability that each pixel in the scaled image is text and generates a heat map of these probabilities. The heat map is then scaled back to the size of the original document image, and the probabilities in the heat map are used to adjust the intensities of the text and non-text pixels. For positive text, intensities of text pixels are reduced and intensities of non-text pixels are increased in order to increase the contrast of the text against the background of the image. Optical character recognition may then be performed on the contrast-adjusted image.
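The heat-map-driven adjustment for positive (dark-on-light) text can be sketched per pixel as below; the 0.5 decision threshold, the `strength` value, and the 0-255 grayscale assumption are illustrative choices, not details from the patent.

```python
def adjust_contrast(pixels, heat_map, strength=60):
    # pixels: grayscale intensities (0-255); heat_map: per-pixel text
    # probability from the CNN, already scaled back to the original size.
    # For positive text: darken likely-text pixels, brighten likely-background
    # pixels, increasing text-vs-background contrast before OCR.
    out = []
    for value, p_text in zip(pixels, heat_map):
        if p_text >= 0.5:
            value = max(0, value - int(strength * p_text))        # darken text
        else:
            value = min(255, value + int(strength * (1 - p_text)))  # brighten bg
        out.append(value)
    return out
```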
METHOD AND ELECTRONIC DEVICE FOR RECOGNIZING PRODUCT
A method and electronic device for recognizing a product are provided. The method includes obtaining first feature information and second feature information from an image related to a product, obtaining fusion feature information based on the first feature information and the second feature information by using a main encoder model that reflects a correlation between feature information of different modalities, matching the fusion feature information against a database of the product, and providing information about the product, based on a result of the matching.
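The fuse-then-match flow above can be sketched as follows; the additive `fuse` callable stands in for the main encoder model, and the dot-product similarity and database layout are assumptions for illustration.

```python
def fuse_and_match(feat_a, feat_b, fuse, database):
    # Fuse the two modality features into one fusion feature, then return
    # the product whose stored feature is most similar (dot product) to it.
    query = fuse(feat_a, feat_b)

    def score(entry):
        return sum(x * y for x, y in zip(query, entry["feature"]))

    best = max(database, key=score)
    return best["product"]
```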