Patent classifications
H04N5/2625
Apparatus and method for tuning an audiovisual system to viewer attention level
An audiovisual signal processing device including a signal processing unit configured to receive a video signal comprising a plurality of images; identify at least first and second video components of the video signal, the first video component relating to motion of the first video component between at least a first portion of the plurality of images and the second video component relating to a substantial lack of motion of the second video component between at least a second portion of the plurality of images; select at least one image from the second portion of the plurality of images including the second video component; and insert the first video component into a plurality of the selected at least one image to create a temporally repeating sequence of the at least one image including the first video component.
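The abstract describes a cinemagraph-style effect: separate a moving component from a static one, then replay the moving component over a frozen still. A minimal sketch of that idea, assuming grayscale frames as 2D lists and a simple per-pixel variance test (the function names and threshold are illustrative, not from the patent):

```python
# Toy cinemagraph sketch: pixels that vary across frames form the first
# (moving) video component; everything else is the second (static) component.

def moving_mask(frames, threshold=10):
    """Mark pixels whose value varies by more than `threshold` across frames."""
    h, w = len(frames[0]), len(frames[0][0])
    mask = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = [f[y][x] for f in frames]
            if max(vals) - min(vals) > threshold:
                mask[y][x] = True
    return mask

def cinemagraph(frames, still, mask):
    """Insert the moving pixels of each frame into copies of the selected
    still image, yielding a temporally repeating sequence over a frozen
    background."""
    out = []
    for f in frames:
        img = [row[:] for row in still]
        for y, row in enumerate(mask):
            for x, m in enumerate(row):
                if m:
                    img[y][x] = f[y][x]
        out.append(img)
    return out
```

Looping the returned sequence gives the "temporally repeating" output the claim describes: the background never changes while the masked region animates.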
TRACKING OF HANDHELD SPORTING IMPLEMENTS USING COMPUTER VISION
A path and/or orientation of at least a portion of a handheld sporting implement swung by an athlete is tracked using two or more cameras. At least two sets of video images of the handheld sporting implement being swung are obtained using at least two different cameras having different positions. Motion regions within the video images are identified, and candidate locations in 2D space of an identifiable portion (e.g., a head) of the handheld sporting implement are identified within the motion region(s). Based thereon, a probable location in 3D space of the identifiable portion is identified for each of a plurality of instants during which the handheld sporting implement was swung. A piecewise 3D trajectory of at least the identifiable portion (e.g., the head) of the sporting implement is approximated from the probable locations in 3D space for multiple instants during which the sporting implement was swung.
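The 2D-to-3D step can be illustrated with the simplest two-camera geometry: a rectified stereo pair, where depth follows from disparity (Z = f·b/d). This is a toy stand-in for the patent's multi-camera setup, with hypothetical function names and idealized pinhole cameras:

```python
def triangulate_rectified(x_left, x_right, y, focal, baseline):
    """Probable 3D location of a detected point (e.g., the bat head) from
    matched 2D detections in a rectified stereo pair.
    Z = focal * baseline / disparity; X and Y follow from the pinhole model."""
    disparity = x_left - x_right
    if disparity <= 0:
        raise ValueError("point must lie in front of both cameras")
    Z = focal * baseline / disparity
    X = x_left * Z / focal
    Y = y * Z / focal
    return (X, Y, Z)

def piecewise_trajectory(detections, focal, baseline):
    """Approximate a piecewise 3D trajectory from per-instant 2D candidate
    pairs (x_left, x_right, y): one triangulated point per instant."""
    return [triangulate_rectified(xl, xr, y, focal, baseline)
            for (xl, xr, y) in detections]
```

A production system with arbitrarily placed cameras would instead use full projection matrices and a least-squares (DLT) triangulation, but the per-instant structure is the same.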
Method and device for shooting image, and storage medium
A method for shooting an image, applied to an electronic device in which an image capturing device is mounted, includes: receiving a first input from a user; responsive to the first input, acquiring a first image and a second image captured by the image capturing device for a target object, the framing range of the first image being the same as that of the second image; and performing synthesis processing on the first image and the second image to generate a target image, the target image including a first object image of the target object in the first image and a second object image of the target object in the second image.
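The abstract leaves the synthesis step unspecified. One classic way to make the same subject appear twice in two identically framed shots is a per-pixel darken blend; this sketch assumes grayscale 2D lists and a subject darker than the background (both assumptions mine, not the patent's):

```python
def synthesize(first, second):
    """Per-pixel darken blend: with a static framing range and a subject
    darker than the background, the minimum of the two aligned frames keeps
    both object images of the target object in one target image."""
    return [[min(a, b) for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(first, second)]
```

A lighten blend (`max`) would serve the opposite case of a bright subject on a dark background.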
IMAGING DEVICE AND IMAGE PROCESSING METHOD
There is provided an imaging device including: an image processing unit; a face recognition processing unit; a storage unit; and a composition processing unit for generating composite data by a composition process so that persons photographed in each of a plurality of image data are included in one image data. The face recognition processing unit recognizes a first person by performing a face recognition process on first image data. When second image data, obtained by photographing a second person at a different photographing timing from the first image data but with the same background, is recorded in the storage unit, the composition processing unit generates composite data in which the first person and the second person are superimposed on the same background.
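Given a clean background plate (an assumption on my part; the patent only requires that both shots share the same background), superimposing the two persons reduces to taking, per pixel, whichever frame deviates from the background. A minimal grayscale sketch:

```python
def superimpose(first, second, background, threshold=30):
    """Composite the person from each frame onto the shared background:
    a pixel is taken from whichever frame deviates from the background
    (i.e., where a person occludes it); otherwise the background is kept.
    Threshold and frame format (2D grayscale lists) are illustrative."""
    h, w = len(background), len(background[0])
    out = [row[:] for row in background]
    for y in range(h):
        for x in range(w):
            if abs(first[y][x] - background[y][x]) > threshold:
                out[y][x] = first[y][x]
            elif abs(second[y][x] - background[y][x]) > threshold:
                out[y][x] = second[y][x]
    return out
```

If no background plate exists, a real device could approximate one (e.g., a per-pixel median over several frames) before compositing.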
Real-time hyper-lapse video creation via frame selection
Various technologies described herein pertain to creation of an output hyper-lapse video from an input video. Values indicative of overlaps between pairs of frames in the input video are computed. A value indicative of an overlap between a pair of frames can be computed based on a sparse set of points from each of the frames in the pair. Moreover, a subset of the frames from the input video is selected based on the values of the overlaps between the pairs of the frames in the input video and a target frame speed-up rate. Further, the output hyper-lapse video is generated based on the subset of the frames. The output hyper-lapse video can be generated without a remainder of the frames of the input video other than the subset of the frames.
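Selecting frames that balance pairwise overlap against a target speed-up is naturally a shortest-path / dynamic-programming problem. A minimal sketch, assuming `overlap_cost(i, j)` is already available (low when frames i and j overlap well, e.g., computed from matched sparse points) and a simple absolute-deviation penalty on the skip length; the jitter bound and cost weighting are my simplifications:

```python
def select_frames(n, overlap_cost, target_rate, jitter=2):
    """Choose a subset of frame indices 0..n-1 by dynamic programming.
    Each skip j - i may deviate from target_rate by at most `jitter`,
    and deviations are penalized so the output tracks the speed-up rate."""
    INF = float("inf")
    best = [INF] * n          # best[j]: min cost of a path ending at frame j
    prev = [-1] * n
    best[0] = 0.0
    for j in range(1, n):
        lo = max(0, j - (target_rate + jitter))
        hi = j - max(1, target_rate - jitter) + 1
        for i in range(lo, hi):
            if best[i] == INF:
                continue
            cost = best[i] + overlap_cost(i, j) + abs((j - i) - target_rate)
            if cost < best[j]:
                best[j], prev[j] = cost, i
    # backtrack from the last frame to recover the selected subset
    j, path = n - 1, []
    while j != -1:
        path.append(j)
        j = prev[j]
    return path[::-1]
```

With uniform overlap costs the optimum degenerates to exact target-rate skips; nonuniform costs let the path detour around shaky or poorly overlapping frames.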
PHOTO-VIDEO BASED SPATIAL-TEMPORAL VOLUMETRIC CAPTURE SYSTEM FOR DYNAMIC 4D HUMAN FACE AND BODY DIGITIZATION
The photo-video based spatial-temporal volumetric capture system more efficiently produces high-frame-rate and high-resolution 4D dynamic human videos, without the need for two separate 3D and 4D scanner systems, by combining a set of high-frame-rate machine vision video cameras with a set of high-resolution photography cameras. It reduces the need for manual CG work by temporally up-sampling the shape and texture resolution of 4D scanned video data from a temporally sparse set of higher-resolution 3D scanned keyframes that are reconstructed using both the machine vision cameras and the photography cameras. Unlike a typical performance capture system that uses a single static template model at initialization (e.g., an A- or T-pose), the photo-video based spatial-temporal volumetric capture system stores multiple keyframes of high-resolution 3D template models for robust and dynamic shape and texture refinement of the 4D scanned video sequence. For shape up-sampling, the system can apply mesh-tracking based temporal shape super resolution. For texture up-sampling, the system can apply machine learning based temporal texture super resolution.
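The core temporal idea, propagating detail from sparse high-resolution keyframes across the dense low-resolution sequence, can be reduced to its simplest form: interpolation between bracketing keyframes. This toy stand-in uses scalar values where the real system operates on tracked meshes and learned texture models:

```python
def interp_keyframes(key_indices, key_values, n):
    """Linearly interpolate sparse keyframe values over n time steps —
    a toy stand-in for the mesh-tracking / machine-learning temporal
    super-resolution the abstract describes."""
    out = []
    for t in range(n):
        # find the pair of keyframes bracketing time t
        for k in range(len(key_indices) - 1):
            i0, i1 = key_indices[k], key_indices[k + 1]
            if i0 <= t <= i1:
                w = (t - i0) / (i1 - i0)
                out.append((1 - w) * key_values[k] + w * key_values[k + 1])
                break
    return out
```

In the actual system the "values" are per-vertex positions and texture maps, and the blend is guided by mesh tracking rather than a fixed linear weight.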
IMAGE PROCESSING METHOD, APPARATUS, ELECTRONIC DEVICE AND STORAGE MEDIUM
An image processing method can include: acquiring a plurality of image frames obtained by photographing a subject, together with the photographing state parameters respectively adopted when photographing the image frames, the subject being located at the same position in each image frame; determining a zoom ratio for each of the photographing state parameters according to a preset correspondence between photographing state parameters and zoom ratios; zooming the corresponding image frames according to the determined zoom ratios to obtain a plurality of target image frames; and encoding the obtained plurality of target image frames into a video file.
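The parameter-to-ratio lookup and the per-frame zoom can be sketched directly. The preset table below is hypothetical (the patent does not specify which photographing state parameter is used; focal length is one plausible choice), and the scaling is plain nearest-neighbour on 2D grayscale lists:

```python
# Hypothetical preset: photographing state parameter (here, focal length
# in mm) -> zoom ratio to apply to the captured frame.
PRESET_ZOOM = {24: 1.0, 35: 1.5, 50: 2.0}

def zoom_frame(frame, ratio):
    """Nearest-neighbour scale of a 2D grayscale frame by `ratio`."""
    h, w = len(frame), len(frame[0])
    nh, nw = int(h * ratio), int(w * ratio)
    return [[frame[int(y / ratio)][int(x / ratio)] for x in range(nw)]
            for y in range(nh)]

def build_target_frames(frames, params, preset=PRESET_ZOOM):
    """Look up the zoom ratio for each frame's photographing state
    parameter and scale the frame; the resulting target frames would
    then be encoded into a video file."""
    return [zoom_frame(f, preset[p]) for f, p in zip(frames, params)]
```

Encoding the target frames (e.g., via a codec library) is a separate step and is omitted here.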
Image capture device with an automatic image capture capability
An image capture device may automatically capture images. An image sensor may generate visual content based on light that becomes incident thereon. A motion of interest within the visual content may be identified, and multiple images may be generated to include portions of the visual content that span a time duration of interest.
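A simple way to flag a "motion of interest" is a frame-differencing score over the visual content; frames where the score exceeds a threshold span the duration of interest. This sketch assumes grayscale 2D lists and an illustrative per-pixel change threshold:

```python
def motion_score(prev, cur, pixel_threshold=10):
    """Fraction of pixels that changed noticeably between consecutive
    frames of the visual content."""
    changed = sum(abs(a - b) > pixel_threshold
                  for ra, rb in zip(prev, cur) for a, b in zip(ra, rb))
    return changed / (len(cur) * len(cur[0]))

def auto_capture(frames, score_threshold=0.25):
    """Return indices of frames whose motion score against the previous
    frame exceeds the threshold — the portions of the visual content
    that span the time duration of interest."""
    return [i for i in range(1, len(frames))
            if motion_score(frames[i - 1], frames[i]) >= score_threshold]
```

A real device would apply hysteresis or padding around the detected run so that the captured images bracket the motion rather than starting exactly at it.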
DATA PROCESSING SYSTEMS AND METHODS FOR ENHANCED AUGMENTATION OF INTERACTIVE VIDEO CONTENT
Data processing systems and methods are disclosed for augmenting video content with one or more augmentations to produce augmented video. Elements within video content may be identified by spatiotemporal indices and may have associated values. An advertiser can pay to have an augmentation added to an element that, for example, advertises the advertiser's goods and/or includes a link that, when activated, takes a user to the advertiser's web site. Elements may have associated contexts that can be used to determine augmentations and element value, such as a position and/or current use of the element.
High frame rate reconstruction with N-tap camera sensor
A camera captures image data at a target frame rate. The camera includes a sensor and a controller. The sensor is configured to detect light from a local area and includes a plurality of augmented pixels. Each augmented pixel comprises at least a first and a second gate. The first gates are configured to store a first plurality of image frames as first image data according to a first activation pattern. The second gates are configured to store a second plurality of image frames as second image data according to a second activation pattern. The controller reads out the image data to generate a first image from the first image data and a second image from the second image data. The first and second images may be used to reconstruct a combined set of image frames at the target frame rate with a reconstruction algorithm.
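After read-out, the two per-gate image streams must be merged back into one sequence at the target frame rate. A minimal sketch, assuming the two activation patterns are complementary per time slot (a simplification; the patent's patterns and reconstruction algorithm may be more elaborate):

```python
def reconstruct(first_frames, second_frames, first_pattern, second_pattern):
    """Merge the first-gate and second-gate read-outs into one combined
    set of image frames at the target frame rate. Patterns are per-slot
    booleans saying which gate was active in that slot."""
    out, i, j = [], 0, 0
    for a, b in zip(first_pattern, second_pattern):
        if a:
            out.append(first_frames[i]); i += 1
        elif b:
            out.append(second_frames[j]); j += 1
    return out
```

With complementary patterns this is a pure interleave; overlapping or coded patterns would require solving for the underlying frames instead of simply concatenating them.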