Patent classifications
G06V20/63
Methods and arrangements for identifying objects
In some arrangements, product packaging is digitally watermarked over most of its extent to facilitate high-throughput item identification at retail checkouts. Imagery captured by conventional or plenoptic cameras can be processed (e.g., by GPUs) to derive several different perspective-transformed views—further minimizing the need to manually reposition items for identification. Crinkles and other deformations in product packaging can be optically sensed, allowing such surfaces to be virtually flattened to aid identification. Piles of items can be 3D-modelled and virtually segmented into geometric primitives to aid identification, and to discover locations of obscured items. Other data (e.g., including data from sensors in aisles, shelves and carts, and gaze tracking for clues about visual saliency) can be used in assessing identification hypotheses about an item. Logos may be identified and used—or ignored—in product identification. A great variety of other features and arrangements are also detailed.
IMAGE PROCESSING APPARATUS, IMAGE PROCESSING METHOD, AND PROGRAM
By processing an image in which a plurality of articles (such as products) placed on a shelf are captured, an image processing unit (120) determines the article name (such as the product name) of each of the plurality of articles captured in the image. Determining an article name also includes determining identification information tied to the article name. When a specific condition is satisfied for an article the article name of which is indeterminable by the image processing unit (120) (hereinafter described as an undetermined article), an article inference unit (130) infers the article name of the undetermined article to be the article name of an article positioned adjacent to the undetermined article.
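The adjacency inference described above can be illustrated with a minimal sketch. The abstract does not state what the "specific condition" is, so the rule used here (every determined neighbour on the same shelf row must agree) is an assumption, and the function name `infer_undetermined` is hypothetical.

```python
from typing import List, Optional


def infer_undetermined(names: List[Optional[str]]) -> List[Optional[str]]:
    """Fill undetermined articles (None) on one shelf row from their neighbours.

    Assumed condition (not from the source): an undetermined article takes a
    neighbour's name only if all of its determined neighbours share that name.
    """
    inferred = list(names)
    for i, name in enumerate(names):
        if name is not None:
            continue  # already determined by image processing
        left = names[i - 1] if i > 0 else None
        right = names[i + 1] if i + 1 < len(names) else None
        # Collect the distinct names of determined neighbours only.
        candidates = {n for n in (left, right) if n is not None}
        if len(candidates) == 1:
            inferred[i] = candidates.pop()
    return inferred
```

The lookup reads from the original list rather than the partially filled one, so an inferred name is never propagated further along the row.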
USER INTERFACES FOR MANAGING VISUAL CONTENT IN MEDIA
The present disclosure generally relates to methods and user interfaces for managing visual content at a computer system. In some embodiments, methods and user interfaces for managing visual content in media are described. In some embodiments, methods and user interfaces for managing visual indicators for visual content in media are described. In some embodiments, methods and user interfaces for inserting visual content in media are described. In some embodiments, methods and user interfaces for identifying visual content in media are described. In some embodiments, methods and user interfaces for translating visual content in media are described.
Geographic object detection apparatus and geographic object detection method
A geographic object recognition unit (120) recognizes, using image data (192) obtained by photographing in a measurement region where a geographic object exists, a type of the geographic object from an image that the image data (192) represents. A position specification unit (130) specifies, using three-dimensional point cloud data (191) indicating a three-dimensional coordinate value of each of a plurality of points in the measurement region, a position of the geographic object.
Video analytics traffic monitoring and control
A controlled intersection employs video analytics to identify incoming vehicles and, together with autonomous driving capabilities in the vehicle, selectively provides intervention for collision avoidance. A camera image of an approaching vehicle is used to identify its range and speed, and to compute whether intervention is appropriate based on the vehicle's detected distance from the intersection and its speed. A vehicle approaching a stop signal (e.g., “red light”) at an unsafe rate of speed triggers an invocation of on-board autonomous systems in the vehicle that provide appropriate warnings and, ultimately, forced braking if the warnings go unheeded. A registration system maintains a local grouping of vehicles in proximity to an intersection to minimize latency in vehicle identification when commencing intervention. In this manner, on-board vehicle collision avoidance systems collaborate with complementary traffic control logic at a controlled intersection to prevent inadvertent or intentional disregard of a red signal.
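One way to picture the distance-and-speed decision is to compute the constant deceleration the vehicle would need to stop at the intersection, v^2 / (2d), and compare it with thresholds. This is only a sketch: the function name `intervention_level` and both threshold values are hypothetical, and the abstract does not say how the decision is actually computed.

```python
def intervention_level(distance_m: float, speed_mps: float,
                       warn_decel: float = 3.0,
                       brake_decel: float = 6.0) -> str:
    """Return "none", "warn" or "brake" for a vehicle approaching a red signal.

    The thresholds (in m/s^2) are illustrative assumptions, not source values.
    """
    if speed_mps <= 0:
        return "none"  # stopped or moving away from the stop line
    if distance_m <= 0:
        return "brake"  # already at or past the stop line while moving
    # Constant deceleration required to reach zero speed in distance_m.
    required = speed_mps ** 2 / (2.0 * distance_m)
    if required >= brake_decel:
        return "brake"
    if required >= warn_decel:
        return "warn"
    return "none"
```

For example, a vehicle 30 m out at 15 m/s needs 3.75 m/s^2 to stop, which falls in the warning band under these assumed thresholds.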
TEXT EXTRACTION METHOD, TEXT EXTRACTION MODEL TRAINING METHOD, ELECTRONIC DEVICE AND STORAGE MEDIUM
A text extraction method and a text extraction model training method are provided. The present disclosure relates to the technical field of artificial intelligence, in particular to the technical field of computer vision. An implementation of the method comprises: obtaining a visual encoding feature of a to-be-detected image; extracting a plurality of sets of multimodal features from the to-be-detected image, wherein each set of multimodal features includes position information of one detection frame extracted from the to-be-detected image, a detection feature in the detection frame and first text information in the detection frame; and obtaining second text information matched with a to-be-extracted attribute based on the visual encoding feature, the to-be-extracted attribute and the plurality of sets of multimodal features, wherein the to-be-extracted attribute is an attribute of text information needing to be extracted.
Systems and methods of legibly capturing vehicle markings
A system and method for legible capture of vehicle identification data includes video cameras and a computer. Recording attributes such as gain, shutter speed, and white balance are adjusted throughout their ranges to maximize the likelihood of capturing at least one frame in which characters, such as those on the license plate, are legible. Successful capture of a legible frame may trigger storage of the data, while unsuccessful capture may trigger additional scans.
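The attribute sweep can be sketched as a loop over every combination of settings that stops at the first legible frame. The callables `capture` and `is_legible` are hypothetical stand-ins for the camera interface and the legibility check, neither of which the abstract specifies.

```python
from itertools import product
from typing import Any, Callable, Iterable, Optional, Tuple

Settings = Tuple[float, float, int]  # (gain, shutter speed, white balance)


def sweep_for_legible_frame(
    gains: Iterable[float],
    shutters: Iterable[float],
    white_balances: Iterable[int],
    capture: Callable[[float, float, int], Any],   # hypothetical camera call
    is_legible: Callable[[Any], bool],             # hypothetical legibility test
) -> Optional[Tuple[Settings, Any]]:
    """Try each settings combination; return the first legible frame found."""
    for settings in product(gains, shutters, white_balances):
        frame = capture(*settings)
        if is_legible(frame):
            return settings, frame  # caller may store this frame
    return None  # caller may schedule an additional scan
```

Returning `None` on failure leaves the choice of rescanning to the caller, mirroring the abstract's "additional scans" on unsuccessful capture.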
Generating searchable text for documents portrayed in a repository of digital images utilizing orientation and text prediction neural networks
The present disclosure relates to generating computer searchable text from digital images that depict documents utilizing an orientation neural network and/or text prediction neural network. For example, one or more embodiments detect digital images that depict documents, identify the orientation of the depicted documents, and generate computer searchable text from the depicted documents in the detected digital images. In particular, one or more embodiments train an orientation neural network to identify the orientation of a depicted document in a digital image. Additionally, one or more embodiments train a text prediction neural network to analyze a depicted document in a digital image to generate computer searchable text from the depicted document. By utilizing the identified orientation of the depicted document before analyzing the depicted document with a text prediction neural network, the disclosed systems can efficiently and accurately generate computer searchable text for a digital image that depicts a document.
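The two-stage flow (identify orientation first, then predict text) can be sketched with plain callables standing in for the two neural networks. Everything named here is hypothetical: `predict_orientation` is assumed to return the clockwise rotation of the depicted document in degrees (a multiple of 90), and `predict_text` is assumed to expect an upright image.

```python
from typing import Callable, List

Image = List[List[int]]  # toy image: rows of pixel values


def rotate_cw(image: Image) -> Image:
    """Rotate a 2D grid 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def searchable_text(
    image: Image,
    predict_orientation: Callable[[Image], int],  # stand-in for orientation network
    predict_text: Callable[[Image], str],         # stand-in for text prediction network
) -> str:
    """Undo the predicted rotation, then run text prediction on the upright image."""
    degrees = predict_orientation(image) % 360
    if degrees % 90 != 0:
        raise ValueError("this sketch only handles multiples of 90 degrees")
    # A document rotated k quarter-turns clockwise needs (4 - k) more to be upright.
    for _ in range((4 - degrees // 90) % 4):
        image = rotate_cw(image)
    return predict_text(image)
```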
METHOD ON IDENTIFYING INDICIA ORIENTATION AND DECODING INDICIA FOR MACHINE VISION SYSTEMS
A method and system for performing indicia recognition includes obtaining, at an image sensor, an image of an object of interest and identifying at least one region of interest in the image. The region of interest contains one or more indicia indicative of the object of interest. A processor then determines the position of each region of interest and further determines a geometric shape based on those positions. An orientation classification is identified for each region of interest based on its position relative to the geometric shape. The processor then identifies and performs one or more transformations for each region of interest, with each transformation determined by that region's orientation classification. The processor then performs indicia recognition on each of the one or more transformed regions of interest.
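As a rough illustration of classifying orientation from position relative to a shape, the sketch below uses the mean of the region centers as a stand-in for the geometric shape and assigns each region to one of four sides. The class names, the rotation table `ROTATION_FOR_CLASS`, and the four-way split are all assumptions; the abstract does not describe the shape or the classes.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]  # (x, y) in image coordinates, y pointing down

# Hypothetical mapping from orientation class to the clockwise rotation (degrees)
# applied before recognition.
ROTATION_FOR_CLASS: Dict[str, int] = {"top": 0, "right": 270, "bottom": 180, "left": 90}


def classify_orientations(centers: List[Point]) -> List[str]:
    """Classify each region by which side of the centers' mean point it lies on."""
    if not centers:
        return []
    cx = sum(x for x, _ in centers) / len(centers)
    cy = sum(y for _, y in centers) / len(centers)
    classes = []
    for x, y in centers:
        dx, dy = x - cx, y - cy
        if abs(dx) >= abs(dy):
            # A region exactly at the mean point falls through to "left".
            classes.append("right" if dx > 0 else "left")
        else:
            classes.append("bottom" if dy > 0 else "top")
    return classes


def transformations(centers: List[Point]) -> List[int]:
    """Look up the assumed rotation for each region's orientation class."""
    return [ROTATION_FOR_CLASS[c] for c in classify_orientations(centers)]
```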
INTELLIGENT AUTOMATIC LICENSE PLATE RECOGNITION FOR ELECTRONIC TOLLING ENVIRONMENTS
An intelligent automatic license plate recognition (IALPR) system implements technical solutions that improve the accuracy of automatic license plate recognition. The IALPR analyzes an image of a vehicle proximate to a toll collection point using optical character recognition (OCR), and determines candidate license plate identifications based, at least in part, on the corresponding OCR confidence level. The IALPR can also perform fingerprinting for candidate license plate images and matching analysis with a knowledge base, resulting in additional confidence levels. The IALPR can also perform behavioral analysis on the candidate license plate identifications, including trip context analysis, historical behavioral analysis, or other analytics. The IALPR can generate an overall confidence level for the candidate license plate identifications responsive to the OCR and vehicle fingerprint confidence levels and the behavioral analysis. This enhanced analysis helps the IALPR reduce the number of incorrect license plate identifications and reduce the need for human review.
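The combination of confidence levels can be sketched as a weighted average followed by a review threshold. The abstract does not say how the overall confidence is derived, so the weights, the threshold, and the function names `overall_confidence` and `best_candidate` are illustrative assumptions only.

```python
from typing import Dict, Tuple

Scores = Tuple[float, float, float]  # (OCR, fingerprint, behavioral), each in [0, 1]


def overall_confidence(ocr: float, fingerprint: float, behavioral: float,
                       weights: Scores = (0.5, 0.3, 0.2)) -> float:
    """Weighted average of the three confidence levels (weights are assumed)."""
    scores = (ocr, fingerprint, behavioral)
    if any(not 0.0 <= s <= 1.0 for s in scores):
        raise ValueError("confidence levels must lie in [0, 1]")
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)


def best_candidate(candidates: Dict[str, Scores],
                   review_threshold: float = 0.85) -> Tuple[str, float, bool]:
    """Pick the highest-scoring plate and flag it for human review if below threshold."""
    plate, scores = max(candidates.items(),
                        key=lambda item: overall_confidence(*item[1]))
    confidence = overall_confidence(*scores)
    return plate, confidence, confidence < review_threshold
```

With candidates `{"ABC123": (0.9, 0.8, 0.7), "A8C123": (0.6, 0.8, 0.7)}`, the first plate wins with an overall confidence of about 0.83 and, under the assumed 0.85 threshold, would still be routed to human review.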