Patent classifications
G06V20/48
LIVE STREAMING SAMPLING METHOD AND APPARATUS, AND ELECTRONIC DEVICE
An embodiment of the disclosure provides a live stream sampling method and apparatus, and an electronic device, which belong to the technical field of image processing. The method comprises: extracting multiple frames of sampled images from a live stream; screening out forward-similar images from all the sampled images, wherein a forward-similar image is a sampled image similar to the sampled image of the previous adjacent frame; and taking the remaining sampled images, other than the forward-similar images, as the sampled images of the live stream. The solution of the disclosure effectively eliminates forward-similar images during FPS-based sampling, thereby avoiding missed detections, reducing the total workload of sampling and checking, and improving the efficiency of live stream sampling.
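The screening step described above can be sketched as a single pass over the sampled frames. The similarity measure and threshold are assumptions here; the abstract does not specify how similarity is computed:

```python
from typing import Callable, List, Sequence, TypeVar

Frame = TypeVar("Frame")

def drop_forward_similar(
    frames: Sequence[Frame],
    similarity: Callable[[Frame, Frame], float],
    threshold: float,
) -> List[Frame]:
    """Discard each sampled image that is similar to the previous adjacent frame."""
    kept: List[Frame] = []
    prev = None
    for frame in frames:
        # The first frame has no previous adjacent frame, so it is always kept.
        if prev is None or similarity(prev, frame) < threshold:
            kept.append(frame)
        # Compare against the adjacent previous frame, not the last kept frame,
        # matching the abstract's definition of a forward-similar image.
        prev = frame
    return kept
```

With an equality-based similarity, the sequence `[1, 1, 2, 2, 3]` reduces to `[1, 2, 3]`.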
TRANSFORMING ASSET OPERATION VIDEO TO AUGMENTED REALITY GUIDANCE MODEL
A method, computer system, and a computer program product for AR guidance is provided. The present invention may include detecting a plurality of objects in a video recording associated with completing a task. The present invention may include generating a plurality of three-dimensional (3D) object models based on scanning a plurality of real objects in a task space. The present invention may include matching the detected plurality of objects in the video recording with the generated plurality of 3D object models representing the plurality of real objects in the task space. The present invention may include generating, based on the video recording, an augmented reality (AR) guidance model for completing the task, wherein the generated AR guidance model replaces the detected plurality of objects in the video recording with the generated plurality of 3D object models representing the plurality of real objects in the task space.
INFORMATION GENERATION METHOD AND APPARATUS
Provided in embodiments of the disclosure are an information generation method and apparatus. The information generation method comprises: obtaining an input video, and extracting video frames and audio data from the input video; processing the video frames to determine a target video frame, and processing the audio data to obtain text information; determining, based on the time of the target video frame in the input video and the time of the text information in the input video, target text information corresponding to the target video frame; and processing the target video frame and the target text information to generate combined image and text information.
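The time-based matching step can be illustrated with a minimal sketch. The `(start, end, text)` segment layout is an assumption, since the abstract does not describe how the recognized text is timestamped:

```python
from typing import List, Tuple

def match_text_to_frame(
    frame_time: float, segments: List[Tuple[float, float, str]]
) -> List[str]:
    """Return the recognized text whose time span covers the target frame's time."""
    return [text for start, end, text in segments if start <= frame_time <= end]
```

For example, a target frame at 3.0 s would be paired with whichever speech segment spans that instant.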
Data collection from a medicament delivery device
A data collection apparatus comprises a camera configured to obtain first and second recordings of at least a portion of a medicament delivery device, said portion including a first component and a second component, the first component being configured to move relative to the second component as medicament is expelled from the medicament delivery device. The data collection apparatus also comprises a processing arrangement configured to determine, from said first and second recordings, positions in the first and second recordings of the first component relative to the second component, and determine an amount of medicament that has been expelled by the medicament delivery device by comparing the position, in the first recording, of the first component relative to the second component with a position, in the second recording, of the first component relative to the second component.
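Assuming the relative position maps linearly to dose (the abstract does not state the mapping), the final comparison reduces to a displacement scaled by a hypothetical calibration factor:

```python
def expelled_amount(
    position_first: float, position_second: float, units_per_step: float
) -> float:
    """Dose expelled between the two recordings, from the relative displacement
    of the first component with respect to the second.

    units_per_step is a hypothetical calibration constant (dose units per unit
    of relative displacement); a real device would define this mapping.
    """
    return (position_second - position_first) * units_per_step
```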
Method and device for comparing media features
The disclosure relates to a method and device for comparing media features, the method comprising: obtaining a first media feature sequence of a first media object and a second media feature sequence of a second media object, wherein the first media feature sequence comprises a plurality of first media feature units arranged in sequence, and the second media feature sequence comprises a plurality of second media feature units arranged in sequence; determining unit similarities between the first media feature units and the second media feature units; determining a similarity matrix between the first media feature sequence and the second media feature sequence according to the unit similarities; and determining a similarity between the first media object and the second media object according to the similarity matrix.
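The matrix construction and final aggregation can be sketched as follows. Cosine similarity between units, and best-match averaging over the matrix, are both assumptions; the abstract leaves the similarity measure and the aggregation unspecified:

```python
import numpy as np

def similarity_matrix(units_a: np.ndarray, units_b: np.ndarray) -> np.ndarray:
    """Unit similarities between every pair of feature units (rows).

    Cosine similarity is an assumption; the abstract does not fix the measure.
    """
    a = units_a / np.linalg.norm(units_a, axis=1, keepdims=True)
    b = units_b / np.linalg.norm(units_b, axis=1, keepdims=True)
    return a @ b.T

def object_similarity(matrix: np.ndarray) -> float:
    # Collapse the matrix into one score: average each first-sequence unit's
    # best match. This stands in for the unspecified aggregation step.
    return float(matrix.max(axis=1).mean())
```

Identical sequences yield a similarity of 1.0 under this aggregation.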
Video processing for enabling sports highlights generation
One or more highlights of a video stream may be identified. The highlights may be segments of a video stream, such as a broadcast of a sporting event, that are of particular interest to one or more users. According to one method, at least a portion of the video stream may be stored. The portion of the video stream may be compared with templates of a template database to identify the one or more highlights. Each highlight may be a subset of the video stream that is deemed likely to match the one or more templates. The highlights, an identifier that identifies each of the highlights within the video stream, and/or metadata pertaining particularly to the one or more highlights may be stored to facilitate playback of the highlights for the users.
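The template comparison can be sketched as a sliding-window match over a per-frame feature signal. Reducing the stream to a 1-D signal and using normalized correlation are both assumptions; the abstract does not specify how a portion of the stream is compared with a template:

```python
import numpy as np

def match_template(
    signal: np.ndarray, template: np.ndarray, threshold: float
) -> list:
    """Return (start, end) frame ranges where the signal correlates with the template."""
    n, m = len(signal), len(template)
    t = template - template.mean()
    hits = []
    for i in range(n - m + 1):
        w = signal[i:i + m] - signal[i:i + m].mean()
        denom = np.linalg.norm(w) * np.linalg.norm(t)
        # Skip flat windows (zero variance); otherwise test normalized correlation.
        if denom and (w @ t) / denom >= threshold:
            hits.append((i, i + m))
    return hits
```

Each returned range is a candidate highlight; an identifier and metadata for each range could then be stored for playback, as the abstract describes.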
METHODS, SYSTEMS, AND MEDIA FOR GENERATING VIDEO CLASSIFICATIONS USING MULTIMODAL VIDEO ANALYSIS
Methods, systems, and media for generating video classifications using multimodal video analysis are provided. In some embodiments, a method for classifying videos comprises: receiving, from a computing device, a video identifier; parsing a video associated with the video identifier into an audio portion and a plurality of image frames; analyzing the plurality of image frames using (i) an optical character recognition technique to obtain first textual information corresponding to text appearing in at least one of the image frames and (ii) an image classifier to obtain, for each of a plurality of objects appearing in at least one of the image frames, a probability that the object falls within an image class; concurrently with analyzing the image frames, analyzing the audio portion of the video using an automated speech recognition technique to obtain second textual information corresponding to words spoken in the video; combining the first textual information, the object probabilities, and the second textual information to obtain a combined analysis output for the video; determining, using a neural network that takes the combined analysis output as input, a safety score for each of a plurality of categories indicating whether the video contains content belonging to that category; and, in response to receiving the video identifier, transmitting the plurality of safety scores corresponding to the plurality of categories to the computing device.
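The combine-then-score step can be sketched in a few lines. Concatenating the three modality outputs, and a single linear layer with a sigmoid standing in for the unspecified neural network, are assumptions for illustration only:

```python
import numpy as np

def safety_scores(
    ocr_feats: np.ndarray,
    obj_probs: np.ndarray,
    asr_feats: np.ndarray,
    weights: np.ndarray,
    bias: np.ndarray,
) -> np.ndarray:
    """One safety score per category from the combined multimodal features.

    weights and bias are hypothetical learned parameters; a linear layer
    plus sigmoid stands in for the abstract's (unspecified) neural network.
    """
    # Combined analysis output: OCR text features, per-object class
    # probabilities, and speech-recognition text features, concatenated.
    combined = np.concatenate([ocr_feats, obj_probs, asr_feats])
    # Independent per-category scores in (0, 1).
    return 1.0 / (1.0 + np.exp(-(weights @ combined + bias)))
```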
METHOD AND APPARATUS FOR EXTRACTING A FINGERPRINT OF VIDEO HAVING A PLURALITY OF FRAMES
A method for extracting a fingerprint of a video having a plurality of frames includes obtaining a plurality of pixel value matrices from each of the plurality of frames, calculating maximum values of average pixel values in each axis of the plurality of pixel value matrices for each of the plurality of frames, and calculating the fingerprint of the video based on a temporal correlation of the maximum values calculated for the plurality of frames.
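The per-frame computation and a simple temporal encoding can be sketched as follows; sign-of-difference as the "temporal correlation" encoding is an assumption, since the abstract does not specify it:

```python
import numpy as np

def frame_signature(frame: np.ndarray) -> np.ndarray:
    """Maximum of the average pixel values along each axis of a pixel matrix."""
    row_means = frame.mean(axis=1)  # average pixel value of each row
    col_means = frame.mean(axis=0)  # average pixel value of each column
    return np.array([row_means.max(), col_means.max()])

def video_fingerprint(frames) -> np.ndarray:
    """Fingerprint from the frame-to-frame evolution of the signatures.

    Encoding only the sign of each frame-to-frame change is a stand-in for
    the temporal correlation the abstract mentions without defining.
    """
    sigs = np.array([frame_signature(f) for f in frames])
    return np.sign(np.diff(sigs, axis=0))
```

For a video whose brightness rises monotonically, every fingerprint entry is +1.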
METHOD AND APPARATUS FOR EXTRACTING A FINGERPRINT OF A VIDEO HAVING A PLURALITY OF FRAMES
A method for extracting a fingerprint of a video having a plurality of frames includes calculating two-dimensional (2D) discrete cosine transform (DCT) coefficients from each of the plurality of frames, extracting, from the 2D DCT coefficients, coefficients whose basis satisfies at least one of up-down symmetry or left-right symmetry, and calculating a fingerprint of the video based on the extracted coefficients.
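The symmetry condition has a concrete form: a 2D DCT-II basis function is up-down symmetric when its vertical frequency index is even, and left-right symmetric when its horizontal index is even. A self-contained sketch (sign-binarization of the retained coefficients is an assumption; the abstract does not say how the fingerprint is derived from them):

```python
import numpy as np

def dct_matrix(n: int) -> np.ndarray:
    """Orthonormal DCT-II transform matrix of size n x n."""
    k = np.arange(n)[:, None]  # frequency index
    x = np.arange(n)[None, :]  # sample index
    t = np.cos(np.pi * (2 * x + 1) * k / (2 * n)) * np.sqrt(2.0 / n)
    t[0] /= np.sqrt(2.0)  # DC row scaled for orthonormality
    return t

def symmetric_coefficients(frame: np.ndarray) -> np.ndarray:
    """2D DCT coefficients whose basis is up-down or left-right symmetric."""
    rows, cols = frame.shape
    coeffs = dct_matrix(rows) @ frame @ dct_matrix(cols).T
    u, v = np.indices(coeffs.shape)
    # Even vertical frequency -> up-down symmetric basis;
    # even horizontal frequency -> left-right symmetric basis.
    return coeffs[(u % 2 == 0) | (v % 2 == 0)]

def dct_fingerprint(frames) -> np.ndarray:
    return np.sign(np.array([symmetric_coefficients(f) for f in frames]))
```

For a 4x4 frame, 12 of the 16 coefficients satisfy at least one symmetry condition.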
System and method for player reidentification in broadcast video
A system and method of re-identifying players in a broadcast video feed are provided herein. A computing system retrieves a broadcast video feed for a sporting event. The broadcast video feed includes a plurality of video frames. The computing system generates a plurality of tracks based on the plurality of video frames. Each track includes a plurality of image patches associated with at least one player. Each image patch of the plurality of image patches is a subset of the corresponding frame of the plurality of video frames. For each track, the computing system generates a gallery of image patches. A jersey number of each player is visible in each image patch of the gallery. The computing system matches, via a convolutional autoencoder, tracks across galleries. The computing system measures, via a neural network, a similarity score for each matched track and associates two tracks based on the measured similarity.