Patent classifications
G06V20/41
Action recognition method and apparatus, and human-machine interaction method and apparatus
A computer device extracts a plurality of target windows from a target video. Each of the target windows comprises a respective plurality of consecutive video frames. For each of the target windows, the device performs action recognition on the respective plurality of consecutive video frames corresponding to the target window to obtain respective first action feature information of the target window. The device obtains a similarity between the first action feature information of the target window and preset feature information. The device determines, from the respective obtained similarities corresponding to the plurality of target windows, a highest first similarity and a first target window corresponding to the highest first similarity. When the highest first similarity satisfies a threshold condition, the device determines that the dynamic action corresponding to the highest first similarity is the preset dynamic action.
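The windowed matching flow described above might be sketched as follows; the feature extractor, cosine similarity measure, and threshold value are illustrative assumptions, not the claimed implementation.

```python
import numpy as np

def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def match_preset_action(frames, window_size, stride, preset_feature,
                        extract_features, threshold=0.8):
    """Slide fixed-size windows over the video, score each window's action
    feature against the preset feature, and keep the best-scoring window.
    extract_features stands in for the patent's action-recognition step."""
    best_sim, best_start = -1.0, None
    for start in range(0, len(frames) - window_size + 1, stride):
        window = frames[start:start + window_size]
        feat = extract_features(window)          # first action feature information
        sim = cosine_similarity(feat, preset_feature)
        if sim > best_sim:
            best_sim, best_start = sim, start
    # The preset dynamic action is recognized only if the highest
    # similarity clears the threshold.
    return (best_start, best_sim) if best_sim >= threshold else (None, best_sim)
```

The key design point is that the threshold is applied only to the single best window, not to every window, matching the "highest first similarity" language above.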
GRAPHICS FUSION TECHNOLOGY SCENE DETECTION AND RESOLUTION CONTROLLER
Disclosed are embodiments of a graphics scene detection technique that provides an adaptive scale factor dependent on image quality, such that certain images may be downscaled before being displayed to conserve system resources without significantly affecting image quality for a user. The inventors recognized and appreciated that certain images may be presented to a user at a lower resolution without being perceived as lower quality. Some aspects provide a scene detection module that determines a quality score from a graphics command output for an image. Depending on the quality score, the scene detection module may output a quality-aware scale factor that can be applied to reduce the pixel resolution of an image before the image is displayed to a user. As a result, the computing device may be improved by saving system resources, including memory bandwidth and processing power for other apps or instances, without negatively affecting the user's visual perception of the scene.
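A minimal sketch of mapping a quality score to a quality-aware scale factor and a reduced output resolution; the score range, thresholds, and scale steps are assumptions for illustration, not the patented controller.

```python
def quality_aware_scale(width, height, quality_score, low=0.3, high=0.7):
    """Pick a downscale factor from a quality score in [0, 1]:
    low-quality scenes tolerate aggressive downscaling, while
    high-quality scenes are kept at native resolution.
    The thresholds `low` and `high` are illustrative assumptions."""
    if quality_score < low:
        scale = 0.5      # scene unlikely to look worse at half resolution
    elif quality_score < high:
        scale = 0.75     # moderate downscaling
    else:
        scale = 1.0      # preserve full resolution
    return scale, (int(width * scale), int(height * scale))
```

For example, a 1920x1080 frame with a low quality score would be rendered at 960x540, halving the pixel count in each dimension and saving memory bandwidth.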
EFFICIENT EXPLORER FOR RECORDED MEETINGS
One example method includes generating a searchable video library. Video files are processed to extract text corresponding to both the speech and the images they contain. The extracted text can be semantically searched, such that specific portions or locations of video files can be identified and returned in response to a query.
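The search step might look like the following sketch, where semantic matching is approximated by token-overlap cosine similarity over timestamped transcript segments; a real system would use text embeddings, and the segment data model is an assumption.

```python
from collections import Counter
import math

def _vec(text):
    # Bag-of-words vector; a stand-in for a semantic text embedding.
    return Counter(text.lower().split())

def _cos(a, b):
    num = sum(a[t] * b[t] for t in set(a) & set(b))
    den = (math.sqrt(sum(v * v for v in a.values()))
           * math.sqrt(sum(v * v for v in b.values())))
    return num / den if den else 0.0

def search_library(segments, query, top_k=3):
    """segments: list of (video_id, start_seconds, extracted_text).
    Returns the best-matching (video_id, start_seconds) locations."""
    q = _vec(query)
    scored = [(_cos(q, _vec(text)), vid, start)
              for vid, start, text in segments]
    scored.sort(reverse=True)
    return [(vid, start) for score, vid, start in scored[:top_k] if score > 0]
```

Returning timestamps rather than whole files is what lets the method point at specific portions of a recorded meeting.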
SEMI-SUPERVISED VIDEO TEMPORAL ACTION RECOGNITION AND SEGMENTATION
Systems, apparatuses, and methods include technology that generates final frame predictions for a first plurality of frames of a video, where the first plurality of frames is associated with unlabeled data. The technology predicts an ordered list of actions for the first plurality of frames based on the final frame predictions, and temporally aligns the ordered list of actions to the final frame predictions to generate labels.
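The alignment step above can be sketched as a monotone dynamic program that assigns every frame to one action while respecting the predicted order, producing pseudo-labels for the unlabeled frames. The per-frame scores and the DP formulation here are illustrative assumptions, not the disclosed model.

```python
import numpy as np

def align_actions(frame_scores, ordered_actions):
    """frame_scores: (T, C) per-frame class scores.
    ordered_actions: class ids in their predicted temporal order.
    Returns a length-T label list that follows the given order."""
    T, n = len(frame_scores), len(ordered_actions)
    NEG = -1e18
    dp = np.full((T, n), NEG)           # best score ending at (frame t, action j)
    back = np.zeros((T, n), dtype=int)
    dp[0][0] = frame_scores[0][ordered_actions[0]]
    for t in range(1, T):
        for j in range(n):
            stay = dp[t - 1][j]                       # remain in action j
            move = dp[t - 1][j - 1] if j > 0 else NEG  # advance to next action
            if stay >= move:
                dp[t][j], back[t][j] = stay, j
            else:
                dp[t][j], back[t][j] = move, j - 1
            dp[t][j] += frame_scores[t][ordered_actions[j]]
    # Backtrack from the final action at the final frame.
    labels, j = [], n - 1
    for t in range(T - 1, 0, -1):
        labels.append(ordered_actions[j])
        j = back[t][j]
    labels.append(ordered_actions[0])
    return labels[::-1]
```

Because transitions only move forward through the ordered list, the generated labels are guaranteed to be a temporal segmentation consistent with the predicted action order.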
METHOD FOR DECODING IMMERSIVE VIDEO AND METHOD FOR ENCODING IMMERSIVE VIDEO
A method of processing an immersive video includes classifying each of a plurality of objects included in a view image as one of a first object group and a second object group, acquiring a patch for each of the plurality of objects, and packing patches to generate at least one atlas. In this instance, patches derived from objects belonging to the first object group may be packed in a different region or a different atlas from a region or an atlas of patches derived from objects belonging to the second object group.
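The group-separated packing can be illustrated with a naive single-row layout in which patches from each object group land in their own atlas; the data model and shelf layout are assumptions for illustration, not the codec's actual packing algorithm.

```python
def pack_patches(patches):
    """patches: list of (object_id, group, width, height).
    Returns {group: [(object_id, x, y, w, h), ...]} where each group
    gets its own atlas, packed left-to-right along a single shelf row."""
    atlases = {}
    for obj_id, group, w, h in patches:
        atlas = atlases.setdefault(group, {"cursor_x": 0, "placed": []})
        atlas["placed"].append((obj_id, atlas["cursor_x"], 0, w, h))
        atlas["cursor_x"] += w           # advance along the shelf
    return {g: a["placed"] for g, a in atlases.items()}
```

Keeping the two object groups in separate atlases (or separate regions) is what allows a decoder to handle them independently, as the abstract describes.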
IMAGE DETECTION METHOD, ELECTRONIC DEVICE, AND STORAGE MEDIUM
An image detection method determines a target object. A plurality of original images of a scene in front of a vehicle are obtained. An object in one original image is detected, and a degree of similarity between the object in the original image and the target object in a preset image is calculated. If the degree of similarity is greater than a preset similarity threshold, it is determined that the original image is a target image and the object is the target object. A position of the target object relative to the vehicle is determined and output. The method can recognize objects of interest in front of a driver.
Ergonomic man-machine interface incorporating adaptive pattern recognition based control system
An adaptive interface for a programmable system predicts a desired user function based on user history as well as machine internal status and context. The apparatus receives an input from the user and other data. A predicted input is presented for confirmation by the user, and the predictive mechanism is updated based on this feedback. Also provided is a pattern recognition system for a multimedia device, wherein a user input is matched to a video stream on a conceptual basis, allowing inexact programming of the multimedia device. The system analyzes a data stream for correspondence with a data pattern for processing and storage. The data stream is subjected to adaptive pattern recognition to extract features of interest, providing a highly compressed representation that may be efficiently processed to determine correspondence. Applications of the interface and system include a VCR, medical device, vehicle control system, audio device, environmental control system, securities trading terminal, and smart house. The system optionally includes an actuator for acting on the environment of operation, allowing closed-loop feedback operation and automated learning.
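The predict-confirm-update loop might be sketched with a toy frequency model: predict the user's most likely function for the current context, present it, and fold the confirmed choice back into the history. The context keys and frequency counting are assumptions; the patent's predictive mechanism is not specified at this level.

```python
from collections import Counter

class AdaptivePredictor:
    """Toy sketch of the adaptive interface's feedback loop."""

    def __init__(self):
        self.history = Counter()   # (context, function) -> observed count

    def predict(self, context):
        """Return the function most often confirmed in this context,
        or None when no history exists yet."""
        candidates = {f: n for (c, f), n in self.history.items() if c == context}
        return max(candidates, key=candidates.get) if candidates else None

    def feedback(self, context, actual_function):
        """Update the model with what the user actually confirmed or chose."""
        self.history[(context, actual_function)] += 1
```

The essential property is the closed loop: every confirmation or correction becomes training data, so predictions adapt to the individual user over time.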
Method and apparatus for providing special effects to video
A method of providing a special effect includes, in response to the selection of background music to be applied to a video, applying the background music and a special effect associated with the background music to the video based on a first feature extracted from the background music and a second feature extracted from the video.
Computing system with content-characteristic-based trigger feature
In one aspect, an example method includes (i) receiving, by a computing system, media content; (ii) generating, by the computing system, a fingerprint of a portion of the received media content; (iii) determining, by the computing system, that the received media content has a predefined characteristic; (iv) responsive to determining that the received media content has the predefined characteristic, transmitting, by the computing system, the generated fingerprint to a content identification server to identify the portion of the received media content; and (v) performing an action based on the identified portion of media content.
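Steps (i) through (v) can be sketched as follows; the hash-based fingerprint, the characteristic predicate, and the server call are stand-in assumptions rather than the claimed fingerprinting or identification technology.

```python
import hashlib

def fingerprint(portion: bytes) -> str:
    # Stand-in fingerprint; real systems use perceptual audio/video
    # fingerprints rather than a cryptographic hash of the bytes.
    return hashlib.sha256(portion).hexdigest()

def process_media(portion: bytes, has_characteristic, identify, act):
    """has_characteristic: predicate testing the predefined characteristic.
    identify: stand-in for the content identification server call.
    act: action performed based on the identified portion."""
    fp = fingerprint(portion)            # step (ii)
    if not has_characteristic(portion):  # step (iii)/(iv) condition not met
        return None
    identity = identify(fp)              # step (iv): transmit fingerprint
    return act(identity)                 # step (v)
```

Gating the server call on the predefined characteristic means fingerprints are only transmitted for content worth identifying, which is the trigger feature named in the title.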
Observed-object recognition system and method
To accurately recognize observed objects, an observed-object recognition system includes an observation region estimation portion, an existence region estimation portion, and an object recognition portion. The observation region estimation portion estimates an observation region, i.e., a region relatively likely to contain the point of observation, in at least one first-person image of a first-person video (a video captured from the first-person perspective). Based on the observation region, the existence region estimation portion estimates an existence region of the first-person image in which an observed object is likely to exist. The object recognition portion recognizes an object in the estimated existence region of the first-person image.