G06V10/759

Object detection device, object detection method, and program
12148195 · 2024-11-19

An object detection device that detects a specific object included in an input image includes a first candidate region specifying unit that specifies a first candidate region in which an object candidate is included from a first input image obtained by imaging a subject in a first posture, a second candidate region specifying unit that specifies a second candidate region in which an object candidate is included from a second input image obtained by imaging the subject in a second posture different from the first posture, a deformation displacement field generation unit that generates a deformation displacement field between the first input image and the second input image, a coordinate transformation unit that transforms a coordinate of the second candidate region to a coordinate of the first posture based on the deformation displacement field, an association unit that associates the first candidate region with the transformed second candidate region that is close to the first candidate region, and a same object determination unit that determines that the object candidates included in the candidate regions associated with each other by the association unit are the same object and are the specific object.
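The coordinate transformation and association steps described above can be sketched as follows. This is only an illustration under stated assumptions, not the patented implementation: candidate regions are reduced to 2D center points, the deformation displacement field is modeled as a hypothetical callable `displacement_field(x, y)` returning an offset `(dx, dy)`, and "close" is taken to mean a Euclidean distance within a hypothetical `max_distance` threshold.

```python
import math


def transform_point(point, displacement_field):
    """Map a point from the second posture into the first posture.

    displacement_field is a hypothetical callable returning (dx, dy) at (x, y).
    """
    x, y = point
    dx, dy = displacement_field(x, y)
    return (x + dx, y + dy)


def associate(first_centers, second_centers, displacement_field, max_distance):
    """Pair each first-posture candidate with the nearest transformed
    second-posture candidate, if one lies within max_distance.

    Returns a list of (first_index, second_index) pairs.
    """
    transformed = [transform_point(p, displacement_field) for p in second_centers]
    pairs = []
    for i, (fx, fy) in enumerate(first_centers):
        best_j, best_d = None, max_distance
        for j, (tx, ty) in enumerate(transformed):
            d = math.hypot(fx - tx, fy - ty)
            if d <= best_d:
                best_j, best_d = j, d
        if best_j is not None:
            pairs.append((i, best_j))
    return pairs
```

In this sketch, every pair returned by `associate` would be treated as the same object seen in both postures; candidates that find no partner within the threshold are left unpaired.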

INFORMATION PROCESSING DEVICE, INFORMATION PROCESSING METHOD, RECORDING MEDIUM, AND IN-VEHICLE SYSTEM
20240375613 · 2024-11-14

The present technology relates to an information processing device, an information processing method, a recording medium, and an in-vehicle system capable of suitably recognizing an object using a captured image. An information processing device according to the present technology includes: a first detection unit that uses a first discriminator based on a neural network to detect, from an image captured by a camera provided in a vehicle, a substance adhering to the lens of the camera; a second detection unit that uses a second discriminator based on optical flow to detect the adhering substance from the captured image; and a region identification unit that identifies a region of the adhering substance in the captured image on the basis of a first detection result from the first detection unit and a second detection result from the second detection unit. The present technology can be applied to, for example, a vehicle that performs automated driving.

METHOD AND SYSTEM FOR VALIDATING PROMOTIONAL EMAILS AND PRODUCT AVAILABILITY FROM E-COMMERCE WEBSITES

A method and system for validating promotional emails and product availability from E-commerce websites is disclosed. In one embodiment, the method includes retrieving a first set of images corresponding to an image strip and a second set of images corresponding to a promotional email from a database. The first set of images and the second set of images may be associated with one or more products. The method further includes calculating a similarity score between each of the first set of images and each of the second set of images using a first Computer Vision (CV) technique. The method further includes selecting one or more valid images from the second set of images based on the similarity score. The method further includes determining a stock availability status of at least one product presented in the one or more valid images from at least one website using a deep learning algorithm.
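The similarity-scoring and selection steps can be illustrated with a short sketch. The abstract does not name the first Computer Vision technique, so this example assumes histogram intersection over grayscale intensities; the function names, the flat pixel-list image format, the bin count, and the threshold are all hypothetical.

```python
def histogram(pixels, bins=4, max_value=256):
    """Normalized intensity histogram of a flat list of grayscale pixel values."""
    counts = [0] * bins
    for p in pixels:
        counts[min(p * bins // max_value, bins - 1)] += 1
    total = len(pixels)
    return [c / total for c in counts]


def similarity(pixels_a, pixels_b):
    """Histogram intersection in [0, 1]; 1.0 means identical histograms."""
    ha, hb = histogram(pixels_a), histogram(pixels_b)
    return sum(min(a, b) for a, b in zip(ha, hb))


def select_valid(strip_images, email_images, threshold=0.8):
    """Return indices of email images whose best match among the strip
    images reaches the similarity threshold."""
    valid = []
    for j, email_img in enumerate(email_images):
        best = max(similarity(s, email_img) for s in strip_images)
        if best >= threshold:
            valid.append(j)
    return valid
```

The sketch assumes non-empty images and a non-empty strip set. A production system would more likely compare learned or keypoint-based features, since intensity histograms ignore spatial layout.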

Method and apparatus for training visual language pre-training model, and device and medium

Provided in the present application are a method and apparatus for training a visual language pre-training model, and a device and a medium. The method includes: acquiring pairing groups respectively corresponding to N images, wherein the pairing group of a first image includes: a first pairing group which is composed of the first image and description text of the first image, and a second pairing group which is composed of a local image of the first image and description text of the local image, N is an integer greater than 1, and the first image is any one of the N images; and training a visual language pre-training model according to the pairing groups respectively corresponding to the N images.

METHOD AND DEVICE FOR AUTHENTICATING AN IDENTITY OF A PERSON
20240371196 · 2024-11-07

A method for authenticating an identity of a person may involve acquiring a sample fingerprint image of a finger of the person. Two or more sub-portions may be extracted from the sample fingerprint image to form a plurality of fingerprint test images. Fingerprint recognition may be run independently on each of the plurality of fingerprint test images. The identity of the person may be authenticated upon a positive outcome from the independently run fingerprint recognitions. A device may be provided for authenticating an identity of a person.
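A minimal sketch of the sub-portion approach follows. It assumes the fingerprint image is a 2D list of pixel rows, that sub-portions are square windows taken on a regular grid, and that authentication requires every independent recognition to succeed; the abstract does not fix any of these choices. The `recognize` callable stands in for a real fingerprint matcher and is hypothetical.

```python
def extract_sub_portions(image, size, step):
    """Collect size x size crops from a 2D image (list of rows) on a grid
    with the given step."""
    rows, cols = len(image), len(image[0])
    crops = []
    for r in range(0, rows - size + 1, step):
        for c in range(0, cols - size + 1, step):
            crops.append([row[c:c + size] for row in image[r:r + size]])
    return crops


def authenticate(image, recognize, size, step):
    """Run the recognizer independently on each sub-portion.

    Returns True only if at least two sub-portions exist and all of them
    are recognized (an assumption about what counts as a positive outcome).
    """
    crops = extract_sub_portions(image, size, step)
    if len(crops) < 2:
        return False
    return all(recognize(crop) for crop in crops)
```

Requiring all sub-portions to match is the strictest reading; a majority vote over the independent results would be an equally plausible variant.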

Multi-source multi-modal activity recognition in aerial video surveillance

Multi-source, multi-modal activity recognition for aerial video surveillance comprises: detecting and tracking multiple dynamic targets from a moving platform; representing full-motion video (FMV) target tracks and chat messages as graphs of attributes; associating FMV tracks and chat messages using a probabilistic graph-based mapping approach; and detecting spatial-temporal activity boundaries.

SERVER, METHOD, NON-TRANSITORY COMPUTER READABLE MEDIUM ENCODED WITH PROGRAM, AND SYSTEM FOR RECOGNIZING INDIVIDUAL IDENTIFICATION INFORMATION OF MACHINE
20180068203 · 2018-03-08

A server communicable with a plurality of machine tools via a network includes a control unit and a storage unit, in which the control unit includes: an accumulating unit that collects display screen data from each machine tool at a predetermined time interval, and accumulates the display screen data in the storage unit along with time information and individual identification information of the machine tool; a receiver that receives a captured image of a display screen of a machine tool sent from an information terminal; a comparison unit that compares the captured image of the display screen of the machine tool received from the information terminal with the display screen data accumulated in the storage unit, and identifies the matching display screen data; and a transmitter that transmits, to the information terminal, the individual identification information of the machine tool identified as a result of the comparison.

Method of detecting structural parts of a scene

A method of detecting the structural elements within a scene sensed by at least one sensor within a locale, the method comprising: a) capturing data from the sensor, which data provides a first representation of the sensed scene at the current time; b) generating a second representation of the sensed scene where the second representation is generated from a prior model of the locale; and c) comparing the first and second representations with one another to determine which parts of the first representation represent structural elements of the locale.
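Step c) can be illustrated with a small sketch. It assumes, purely for illustration, that both representations are depth grids of the same shape (lists of rows of range values) and that a cell counts as structural when the sensed depth agrees with the depth predicted from the prior model within a hypothetical `tolerance`.

```python
def structural_mask(sensed, predicted, tolerance):
    """Mark cells where the sensed scene agrees with the prior-model prediction.

    sensed and predicted are equally sized 2D lists of depth values.
    True marks a cell treated as a structural element of the locale;
    False marks a cell that differs from the prior model.
    """
    return [
        [abs(s - p) <= tolerance for s, p in zip(s_row, p_row)]
        for s_row, p_row in zip(sensed, predicted)
    ]
```

Cells marked False are those the prior model does not explain, for example a vehicle or pedestrian in front of a wall that the model expects to be visible.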

INFORMATION PROCESSING APPARATUS, METHOD OF CONTROLLING INFORMATION PROCESSING APPARATUS, AND STORAGE MEDIUM
20180005405 · 2018-01-04

An information processing apparatus, comprising: a control unit configured to control a pattern that a projection apparatus projects onto an object; an obtainment unit configured to obtain a plurality of images respectively captured at a plurality of times by a plurality of image capturing apparatuses that capture the object onto which the pattern has been projected; and a measurement unit configured to measure range information of the object by performing matching, between images respectively captured by the plurality of image capturing apparatuses, using information of temporal change of pixel values of the images.
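Matching on temporal change of pixel values can be sketched as below. The example assumes two rectified cameras, so that corresponding pixels share a row, represents each camera's sequence as a list of 2D frames, and scores candidates by the sum of squared differences between frame-to-frame intensity changes. These choices, and all function names, are assumptions made for illustration rather than details from the abstract.

```python
def temporal_signature(frames, row, col):
    """Frame-to-frame differences of one pixel across a sequence of 2D frames."""
    values = [frame[row][col] for frame in frames]
    return [b - a for a, b in zip(values, values[1:])]


def match_disparity(left_frames, right_frames, row, col, max_disparity):
    """Find the disparity whose right-image pixel shows the temporal change
    most similar to that of the left-image pixel at (row, col)."""
    target = temporal_signature(left_frames, row, col)
    best_d, best_cost = 0, float("inf")
    for d in range(max_disparity + 1):
        c = col - d
        if c < 0:
            break  # candidate column falls outside the right image
        candidate = temporal_signature(right_frames, row, c)
        cost = sum((t - s) ** 2 for t, s in zip(target, candidate))
        if cost < best_cost:
            best_d, best_cost = d, cost
    return best_d
```

With a known baseline and focal length, the resulting disparity could then be converted to range by standard stereo triangulation.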

Automatically generating context-based alternative text using artificial intelligence techniques

Methods, apparatus, and processor-readable storage media for automatically generating context-based alternative text using artificial intelligence techniques are provided herein. An example computer-implemented method includes generating text captions for an image derived from a web page by processing the image using an artificial intelligence-based image captioning model; determining context information pertaining to the image by processing the image using an artificial intelligence-based context and emotion recognition library; generating context-based alternative text for at least a portion of the image by processing, using at least one artificial intelligence-based alternative text generation model, at least a portion of one or more of the generated text caption(s) for the image and the determined context information pertaining to at least a portion of the image; and performing one or more automated actions based on the generated context-based alternative text.