Patent classifications
G06V10/462
DETECTING OBJECTS IN A VIDEO USING ATTENTION MODELS
The present disclosure describes techniques for detecting objects in a video. The techniques comprise extracting features from each frame of a plurality of frames of the video; generating a first attentive feature by applying a first attention model to at least some of the features extracted from a particular frame among the plurality of frames, wherein the first attention model identifies correlations between a plurality of locations in the particular frame by computing relationships between any two locations among the plurality of locations; generating a second attentive feature by applying a second attention model to at least one pair of features at different levels selected from the features extracted from the particular frame, wherein the second attention model identifies a correlation between at least one pair of locations corresponding to the at least one pair of features; and generating a representation of an object included in the particular frame.
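As a rough illustration of the first attention model's pairwise location relationships, here is a minimal self-attention sketch in NumPy; the projection matrices, feature shapes, and grid size are all hypothetical, not taken from the patent.

```python
# Non-local self-attention relating every pair of spatial locations in a
# frame's feature map, a minimal sketch of the "first attention model".
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(features, d_k=64, rng=np.random.default_rng(0)):
    """features: (N, C) array, one C-dim vector per spatial location."""
    n, c = features.shape
    # Hypothetical learned projections, drawn randomly here for illustration.
    w_q, w_k, w_v = (rng.standard_normal((c, d_k)) / np.sqrt(c) for _ in range(3))
    q, k, v = features @ w_q, features @ w_k, features @ w_v
    # Pairwise relationships between all N locations: an (N, N) affinity matrix.
    attn = softmax(q @ k.T / np.sqrt(d_k))
    return attn @ v  # attentive feature per location

frame_features = np.random.default_rng(1).standard_normal((49, 256))  # 7x7 grid
attentive = self_attention(frame_features)
print(attentive.shape)  # (49, 64)
```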
Article management system, information processing apparatus, and control method and control program of information processing apparatus
This invention is directed to an information processing apparatus that effectively counts, on a per-type basis, articles of a plurality of types displayed in a depth direction on a display shelf. The information processing apparatus includes a display count acquirer that acquires display count information of the articles using article presence/absence sensors provided on the display shelf on which the articles are placed, an article identifier that acquires article identification information capable of identifying the types of the articles based on an image acquired by capturing the display shelf, and a display recognizer that recognizes, based on the display count information and the article identification information, a display count of each type of article.
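A minimal sketch of how the two signals might be combined; the per-lane data layout and the assumption that each lane holds a single article type are illustrative, not taken from the abstract.

```python
# Combine per-lane presence/absence sensor counts (depth direction) with
# image-based type identification to obtain a per-type display count.
def display_count_per_type(lane_counts, lane_front_types):
    """lane_counts: {lane_id: number of articles sensed in the depth direction}
    lane_front_types: {lane_id: article type recognized from the shelf image}
    Assumes each lane holds one article type, identified from its front item."""
    totals = {}
    for lane, count in lane_counts.items():
        article_type = lane_front_types.get(lane, "unknown")
        totals[article_type] = totals.get(article_type, 0) + count
    return totals

print(display_count_per_type({0: 4, 1: 2, 2: 4}, {0: "cola", 1: "tea", 2: "cola"}))
# {'cola': 8, 'tea': 2}
```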
Feature density object classification, systems and methods
A system capable of determining which recognition algorithms should be applied to regions of interest within digital representations is presented. A preprocessing module utilizes one or more feature identification algorithms to determine regions of interest based on feature density. The preprocessing module leverages the feature density signature of each region to determine which of a plurality of diverse recognition modules should operate on the region of interest. A specific embodiment that focuses on structured documents is also presented. Further, the disclosed approach can be enhanced by the addition of an object classifier that classifies the types of objects found in the regions of interest.
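A minimal sketch of feature-density routing; ORB keypoints stand in for the unspecified feature identification algorithm, and the routing threshold is an arbitrary illustration.

```python
# Count interest points per region and pick a recognition module from the
# resulting density signature.
import cv2
import numpy as np

def feature_density(image, region):
    """image: grayscale uint8 array; region: (x, y, w, h) box."""
    x, y, w, h = region
    patch = image[y:y + h, x:x + w]
    keypoints = cv2.ORB_create().detect(patch, None)
    return len(keypoints) / float(w * h)  # keypoints per pixel

def route(image, region, dense_module, sparse_module, threshold=1e-3):
    # High-density regions (e.g. text) go to one recognizer, low-density
    # regions (e.g. logos, faces) to another; the threshold is illustrative.
    d = feature_density(image, region)
    return dense_module if d >= threshold else sparse_module

img = np.zeros((100, 200), np.uint8)
cv2.putText(img, "TEXT", (10, 60), cv2.FONT_HERSHEY_SIMPLEX, 1.5, 255, 2)
print(feature_density(img, (0, 0, 200, 100)))
```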
Advanced driver assist system, method of calibrating the same, and method of detecting object in the same
An advanced driver assist system (ADAS) includes a processing circuit and a memory storing instructions executable by the processing circuit. The processing circuit executes the instructions to cause the ADAS to: obtain, from a vehicle, a video sequence including a plurality of frames captured while driving the vehicle, where each of the frames corresponds to a stereo image including a first viewpoint image and a second viewpoint image; determine depth information in the stereo image based on reflected signals received while driving the vehicle; fuse the stereo image and the depth information to generate fused information; and detect at least one object included in the stereo image based on the fused information.
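A minimal sketch of one plausible depth-fusion step; the pinhole stereo relation is standard, while the confidence-weighted blend, calibration constants, and array shapes are assumptions.

```python
# Fuse stereo-derived depth with depth from reflected signals (e.g. radar or
# LiDAR) before running a detector on the fused result.
import numpy as np

def stereo_depth(disparity, focal_px=700.0, baseline_m=0.54):
    """Classic pinhole relation: depth = f * B / disparity."""
    return np.where(disparity > 0,
                    focal_px * baseline_m / np.maximum(disparity, 1e-6), 0.0)

def fuse(stereo_z, sensor_z, sensor_conf):
    """Confidence-weighted blend of the two depth maps (one possible fusion)."""
    return sensor_conf * sensor_z + (1.0 - sensor_conf) * stereo_z

disparity = np.random.default_rng(0).uniform(1, 64, (4, 4))
radar_z = np.full((4, 4), 20.0)   # placeholder reflected-signal depth map
fused = fuse(stereo_depth(disparity), radar_z, sensor_conf=0.3)
print(fused.shape)  # (4, 4)
```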
VISUAL OBJECT DETECTION IN A SEQUENCE OF IMAGES
There are provided mechanisms for visual object detection in a sequence of images. A method is performed by a visual object detector (200). The method comprises obtaining (S102) a sequence of images of a scene. The sequence of images comprises at least a current image of the scene and a previous image of the scene. The method comprises extracting (S104) a set of objects from the sequence of images by performing visual object detection in the sequence of images. Performing the visual object detection in at least part of the current image is conditioned on a set of conditions being fulfilled. The set of conditions pertains at least to an image-wise descriptor classification score, computed for at least one of the previous image and the current image, that indicates which type of content the scene comprises, and to an image overlapping score indicating how much overlap in image area there is between the previous image and the current image. The method comprises constructing (S106) an image representation of the scene using the extracted set of objects.
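A minimal sketch of the conditional gate the abstract describes; the direction of each condition and both thresholds are assumptions.

```python
# Run the (expensive) detector on the current image only when the
# scene-content classification score and the image-overlap score permit it.
def should_detect(content_score, overlap_score,
                  content_threshold=0.5, overlap_threshold=0.8):
    # Skip detection when the scene content is unlikely to contain objects of
    # interest, or when the current image overlaps the previous one so much
    # that earlier detections can simply be propagated.
    scene_relevant = content_score >= content_threshold
    mostly_new_view = overlap_score < overlap_threshold
    return scene_relevant and mostly_new_view

print(should_detect(content_score=0.9, overlap_score=0.4))   # True: detect
print(should_detect(content_score=0.9, overlap_score=0.95))  # False: reuse
```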
Machine learning in augmented reality content items
Systems and methods herein describe receiving an image via an image capture device; generating an image augmentation decision using a machine learning model; accessing an augmented reality content item; associating the generated image augmentation decision with the augmented reality content item; modifying the received image using the augmented reality content item and the associated image augmentation decision; and causing presentation of the modified image on a graphical user interface of a computing device.
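A minimal sketch of the described pipeline; the decision fields, the model's output format, and render_onto are all hypothetical stand-ins.

```python
# A model produces an augmentation decision that parameterizes how an AR
# content item modifies the captured image.
from dataclasses import dataclass

@dataclass
class AugmentationDecision:
    apply: bool    # whether to augment the image at all
    anchor: tuple  # where the AR content item should be rendered

def decide(image, model):
    # The machine learning model maps the captured image to a decision;
    # the (probability, x, y) output format is an assumption.
    prob, x, y = model(image)
    return AugmentationDecision(apply=prob > 0.5, anchor=(x, y))

def augment(image, content_item, decision):
    if not decision.apply:
        return image
    # render_onto stands in for whatever compositing the AR content item
    # performs on the image.
    return content_item.render_onto(image, at=decision.anchor)
```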
PARAMETERISING AND MATCHING IMAGES OF FRICTION SKIN RIDGES
An apparatus and method configured to parameterise an image indicating friction skin ridges, the apparatus comprising means for: obtaining a first biometric parameter indicative of a group characteristic of a plurality of the friction ridges; obtaining a second biometric parameter indicative of one or more individual characteristics of one or more individual friction ridges of the plurality of friction ridges; and determining a third biometric parameter dependent on the first biometric parameter and dependent on the second biometric parameter. The first biometric parameter comprises a circular variance field indicative of variation of directions of the plurality of friction ridges. There is also provided a matching apparatus and method.
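The circular variance field mentioned above has a standard definition for axial data such as ridge orientations; a minimal per-window sketch follows (the window contents are illustrative).

```python
# Circular variance of ridge directions in a local window. Ridge directions
# are axial (theta and theta + 180 deg describe the same ridge), so angles are
# doubled before averaging, the usual trick for orientation data.
import numpy as np

def circular_variance(orientations_rad):
    """orientations_rad: 1-D array of ridge directions in a local window.
    Returns 1 - R, where R is the mean resultant length of the doubled angles;
    0 means perfectly aligned ridges, values near 1 mean no dominant direction."""
    z = np.exp(2j * orientations_rad)  # double angles for axial data
    return 1.0 - np.abs(z.mean())

window = np.deg2rad([10, 12, 9, 11, 190])  # 190 deg is the same axis as 10 deg
print(circular_variance(window))           # close to 0: coherent ridge flow
```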
HAND-EYE CALIBRATION OF CAMERA-GUIDED APPARATUSES
The invention describes a generic framework for hand-eye calibration of camera-guided apparatuses, wherein the rigid 3D transformation between the apparatus and the camera must be determined. An example of such an apparatus is a camera-guided robot.
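The classic formulation recovers the camera-to-gripper transform X from robot and camera pose pairs satisfying A_i X = X B_i; a minimal sketch using OpenCV's cv2.calibrateHandEye on synthetic poses (the data generation is purely illustrative).

```python
import numpy as np
import cv2

rng = np.random.default_rng(0)

def rand_rt():
    # Random rotation via QR decomposition, plus a random translation.
    q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
    if np.linalg.det(q) < 0:
        q[:, 0] *= -1
    return q, rng.standard_normal((3, 1))

def compose(Ra, ta, Rb, tb):  # rigid-transform product (Ra, ta) * (Rb, tb)
    return Ra @ Rb, Ra @ tb + ta

def invert(R, t):
    return R.T, -R.T @ t

R_x, t_x = rand_rt()  # ground-truth camera->gripper transform to recover
R_t, t_t = rand_rt()  # fixed calibration-target pose in the robot base frame

Rg, tg, Rc, tc = [], [], [], []
for _ in range(10):
    R_gi, t_gi = rand_rt()  # gripper pose in the base frame at this station
    # target->camera = inv(X) * inv(G_i) * T
    R_tmp, t_tmp = compose(*invert(R_gi, t_gi), R_t, t_t)
    R_ci, t_ci = compose(*invert(R_x, t_x), R_tmp, t_tmp)
    Rg.append(R_gi); tg.append(t_gi); Rc.append(R_ci); tc.append(t_ci)

R_est, t_est = cv2.calibrateHandEye(Rg, tg, Rc, tc)
print(np.linalg.norm(R_est - R_x), np.linalg.norm(t_est - t_x))  # both near 0
```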
TECHNOLOGIES FOR AUTOMATICALLY DETERMINING AND DISPLAYING SALIENT PORTIONS OF IMAGES
Systems and methods for automatically determining and displaying salient portions of images are disclosed. According to certain aspects, an electronic device may support a design application that may apply a saliency detection learning model to a digital image, resulting in the application generating one or more salient portions of the digital image. The electronic device may generate a digital rendering of the salient portion of the image on digital models of items or products, and may enable a user to review the digital rendering. The user may also choose alternative salient portions of the digital image and/or aspect ratios for those salient portions for inclusion on a digital model of the item or product.
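A minimal sketch of selecting a salient crop at a fixed size once a saliency map exists; the exhaustive search and the toy map are illustrative only.

```python
# Pick the window with the highest total saliency from a saliency map, as a
# stand-in for choosing a salient portion to render onto a product model.
import numpy as np

def best_crop(saliency, crop_h, crop_w):
    """Return the (x, y, w, h) window whose total saliency is highest.
    Exhaustive search for clarity; real systems would be smarter."""
    h, w = saliency.shape
    best_score, best_box = -1.0, None
    for y in range(h - crop_h + 1):
        for x in range(w - crop_w + 1):
            score = saliency[y:y + crop_h, x:x + crop_w].sum()
            if score > best_score:
                best_score, best_box = score, (x, y, crop_w, crop_h)
    return best_box

saliency = np.zeros((6, 8))
saliency[2:4, 5:7] = 1.0          # a hot spot a saliency model might produce
print(best_crop(saliency, 2, 2))  # (5, 2, 2, 2)
```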
Systems and methods for automated trade-in with limited human interaction
Aspects described herein may facilitate an automated trade-in of a vehicle with limited human interaction. A server may receive a request to begin a value determination of a vehicle associated with the user. The server may receive first data comprising: vehicle-specific identifying information, and multimedia content showing a first aspect of the vehicle. The user may be directed to place the vehicle within a predetermined staging area. The server may receive, from one or more image sensors associated with the staging area, second data comprising multimedia content showing a second aspect of the vehicle. The server may create a feature vector comprising the first data and the second data. The feature vector may be inputted into a machine learning algorithm corresponding to the vehicle-specific identifying information of the vehicle. Based on the machine learning algorithm, the server may determine a value of the vehicle.
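A minimal sketch of the feature-vector and model-selection steps; every name and the placeholder valuation function are hypothetical.

```python
# Assemble the feature vector from user-submitted and staging-area data, then
# score it with a model selected by vehicle-specific identifying information.
import numpy as np

def build_feature_vector(user_features, staging_features):
    # First data (user-submitted) and second data (staging-area image sensors)
    # are concatenated into a single vector, as the abstract describes.
    return np.concatenate([user_features, staging_features])

def appraise(vehicle_key, features, models):
    # One model per vehicle family, keyed by identifying information.
    model = models[vehicle_key]
    return model(features)

models = {"1FT": lambda f: 12000.0 + 50.0 * f.sum()}  # placeholder valuation
vector = build_feature_vector(np.array([0.2, 1.0]), np.array([0.7]))
print(appraise("1FT", vector, models))  # 12095.0
```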