Patent classifications
G11B27/28
VIDEO PROCESSING OPTIMIZATION AND CONTENT SEARCHING
Techniques are disclosed for automatic scene detection and character extraction. In one example, audiovisual content comprising video frames, an audio recording, and timing information is received. A score based on visual characteristics is determined for a first frame and for each subsequent frame. The first frame's score is compared with each subsequent frame's score to determine whether the difference between the scores exceeds a threshold. When the difference exceeds the threshold, the subsequent frame is classified as the start of a new scene. The audiovisual content is segmented into scenes, and textual characters are identified in at least one frame from each scene. The characters are stored and indexed in a searchable database along with the timing information for the scene in which they were identified. The audio recording is transcribed, and the transcribed words are stored and indexed in the searchable database with timing information.
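A minimal Python sketch of the threshold-based scene classification described above. The histogram-based scoring function and the threshold value are illustrative assumptions; the abstract does not fix a particular visual metric.

```python
import cv2
import numpy as np

def frame_score(frame):
    """Score a frame by its normalized grayscale intensity histogram."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    hist = cv2.calcHist([gray], [0], None, [64], [0, 256])
    return cv2.normalize(hist, hist).flatten()

def detect_scenes(path, threshold=0.4):
    """Return frame indices where the score difference exceeds the threshold."""
    cap = cv2.VideoCapture(path)
    boundaries, prev, idx = [0], None, 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        score = frame_score(frame)
        if prev is not None and np.abs(score - prev).sum() > threshold:
            boundaries.append(idx)  # classify this frame as starting a new scene
        prev = score
        idx += 1
    cap.release()
    return boundaries
```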
Playback device, playback method, and recording medium
A decoding system decodes a video stream, which is encoded video information. The decoding system includes a decoder that acquires the video stream and generates decoded video information, and a maximum luminance information acquirer that, in a case where the dynamic range of luminance of the video stream is a second dynamic range wider than a first dynamic range, acquires from the video stream maximum luminance information indicating the maximum luminance of the video stream. The decoding system also includes an outputter that outputs the decoded video information and the maximum luminance information. Where the dynamic range of luminance of the video stream is expressed by the maximum luminance of all pictures in the video stream as the maximum luminance information, the outputter outputs the decoded video information along with the maximum luminance information indicating the maximum luminance of all pictures in the video stream.
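The acquirer/outputter flow can be illustrated with a minimal Python sketch. The types, field names, and the decode stub are assumptions for illustration; actual HDR metadata is carried in codec-level structures rather than a plain struct.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class VideoStream:
    encoded: bytes
    is_wide_dynamic_range: bool  # True when the second (wider) dynamic range is used
    max_luminance_nits: int      # maximum luminance over all pictures in the stream

def decode(encoded: bytes) -> bytes:
    """Stand-in for the actual video decoder."""
    return encoded

def decode_and_output(stream: VideoStream) -> Tuple[bytes, Optional[int]]:
    """Output decoded video, plus the stream-wide maximum luminance when HDR."""
    decoded = decode(stream.encoded)
    if stream.is_wide_dynamic_range:
        return decoded, stream.max_luminance_nits
    return decoded, None
```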
System and method to support synchronization, closed captioning and highlight within a text document or a media file
The present invention relates to a system and method for synchronizing and highlighting a target text and audio associated with a reference document. The system and method may comprise one or more of an input unit, an extracting unit, a mapping unit, a processing unit, and an image resizing unit. The system and method may synchronize the target text and audio in order to provide a user with a read-along experience. The invention further synchronizes and highlights closed captions and audio, which helps people with hearing impairments better comprehend a movie or song while watching or listening.
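A minimal Python sketch of the highlighting side of the synchronization, assuming word-level timestamps have already been produced by the mapping unit (e.g. via forced alignment); the words and times are illustrative.

```python
# Each entry: (word, start time in seconds, end time in seconds).
words = [("Once", 0.0, 0.4), ("upon", 0.4, 0.7), ("a", 0.7, 0.8), ("time", 0.8, 1.3)]

def highlighted_text(words, t):
    """Render the target text with the word spoken at playback time t marked."""
    parts = []
    for w, start, end in words:
        parts.append(f"[{w}]" if start <= t < end else w)
    return " ".join(parts)

print(highlighted_text(words, 0.5))  # Once [upon] a time
```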
Information processing apparatus, information processing method, and non-transitory computer readable medium
An information processing apparatus includes a processor configured to record a motion of a user made until a video pause instruction to pause displayed video is given, and to display a still image of the video as displayed at a second time point earlier than a first time point. The first time point is the time point at which the video pause instruction is received; the second time point is the most recent of the time points at which acceleration involved with the motion of the user first exceeds a predetermined threshold.
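A minimal Python sketch of selecting the second time point, assuming a sampled acceleration trace; the sample values and the threshold are illustrative.

```python
def second_time_point(samples, pause_t, threshold):
    """Most recent time before pause_t where acceleration crossed above the threshold."""
    crossing, prev = None, 0.0
    for t, a in samples:
        if t > pause_t:
            break
        if prev <= threshold < a:  # upward crossing of the threshold
            crossing = t
        prev = a
    return crossing

trace = [(0.0, 0.1), (0.5, 0.3), (1.0, 1.2), (1.5, 0.4), (2.0, 1.5), (2.5, 0.2)]
print(second_time_point(trace, pause_t=2.4, threshold=1.0))  # 2.0
```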
VIDEO INFORMATION GENERATION METHOD, APPARATUS, AND SYSTEM AND STORAGE MEDIUM
This application provides a video information generation method, apparatus, and system and a storage medium. The video information generation method includes: obtaining a plurality of temporally consecutive target images; obtaining first information of a target object in the target images; and associating the first information of the same target object located in different target images to generate target information. Because the per-image first information of the same target object is associated into a single record, the resulting target information has a relatively small amount of data, thereby improving the efficiency of remotely viewing a video by a user.
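A minimal Python sketch of the association step, assuming per-frame detections already carry a stable object identifier; the detection tuples are illustrative.

```python
from collections import defaultdict

# Each detection: (frame index, object id, bounding box).
detections = [
    (0, "car_1", (10, 10, 50, 40)),
    (1, "car_1", (14, 10, 54, 40)),
    (2, "car_1", (18, 11, 58, 41)),
]

def build_target_info(detections):
    """Merge first information of the same object across frames into one track."""
    tracks = defaultdict(list)
    for frame, obj_id, box in detections:
        tracks[obj_id].append((frame, box))
    # One record per object instead of one per frame: far less data to transmit.
    return {
        obj_id: {
            "first_frame": obs[0][0],
            "last_frame": obs[-1][0],
            "boxes": [b for _, b in obs],
        }
        for obj_id, obs in tracks.items()
    }

print(build_target_info(detections))
```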
SEGMENTATION CONTOUR SYNCHRONIZATION WITH BEAT
Systems and methods for rendering a segmentation contour effect are described. More specifically, video data including one or more video frames and audio data are obtained. Based on the video data, one or more segments in each of the one or more video frames are determined. The audio data is analyzed to determine the characteristics of each beat. A segmentation contour effect to be applied to the one or more segments in the video data is determined based on the beat characteristics. A rendered video is generated by synchronizing the segmentation contour effect with the audio data.
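A minimal Python sketch with OpenCV of one way the effect could track the beat, assuming beat times and strengths have already been extracted from the audio and a binary segment mask is available per frame; the decay constants are illustrative.

```python
import cv2

def contour_thickness(t, beats, base=1, boost=6, decay=0.15):
    """Thicker contours right after each beat, decaying back to the base width."""
    past = [(t - bt, s) for bt, s in beats if bt <= t]
    if not past:
        return base
    dt, strength = past[-1]
    return base + int(boost * strength * max(0.0, 1.0 - dt / decay))

def render_frame(frame, mask, t, beats):
    """Draw the segment contour on the frame with beat-driven thickness."""
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    out = frame.copy()
    cv2.drawContours(out, contours, -1, (0, 255, 0), contour_thickness(t, beats))
    return out
```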
Systems and methods for generating comic books from video and images
Techniques for a comic book feature are described herein. A visual data stream of a video may be parsed into a plurality of frames. Scene boundaries may be determined to generate a scene using the plurality of frames where a scene includes a subset of frames. A key frame may be determined for the scene using the subset of frames. An audio portion of an audio data stream of the video may be identified that maps to the subset of frames based on time information. The key frame may be converted to a comic image based on an algorithm. First dimensions and placement for a data object may be determined for the comic image. The data object may include the audio portion for the comic image. A comic panel may be generated for the comic image that incorporates the data object using the determined first dimensions and the placement.
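A minimal Python sketch of assembling one comic panel with Pillow. Posterization stands in for the unspecified comic-conversion algorithm, and the dialogue box illustrates the data object with determined dimensions and placement; all sizes and positions are assumptions.

```python
from PIL import Image, ImageDraw, ImageOps

def make_panel(key_frame: Image.Image, dialogue: str) -> Image.Image:
    """Convert a key frame to a comic image and place the dialogue data object."""
    comic = ImageOps.posterize(key_frame.convert("RGB"), 3)  # comic-style image
    draw = ImageDraw.Draw(comic)
    w, h = comic.size
    # First dimensions and placement for the dialogue data object (speech box).
    box = (int(w * 0.05), int(h * 0.05), int(w * 0.6), int(h * 0.2))
    draw.rectangle(box, fill="white", outline="black", width=3)
    draw.text((box[0] + 10, box[1] + 10), dialogue, fill="black")
    return comic
```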
Systems And Methods For Providing Real-Time Composite Video From Multiple Source Devices Featuring Augmented Reality Elements
Systems and methods for superimposing the human elements of video generated by computing devices are described. A first user device and a second user device capture and transmit video to a central server, which analyzes the video to identify and extract human elements, superimposes those human elements upon one another, adds at least one augmented reality element, and then transmits the newly created composite video back to at least one of the user devices.
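A minimal Python sketch of the server-side compositing step, assuming boolean person masks from an upstream segmentation model (not specified in the abstract); frames are H x W x 3 NumPy arrays.

```python
import numpy as np

def composite(frame_a, frame_b, person_mask_b, ar_layer, ar_mask):
    """Superimpose the human element from frame_b onto frame_a, then add AR."""
    out = frame_a.copy()
    out[person_mask_b] = frame_b[person_mask_b]  # human element from device two
    out[ar_mask] = ar_layer[ar_mask]             # augmented reality element on top
    return out
```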
Method and system for automatic object-aware video or audio redaction
A system and a method for automatic video and/or audio redaction are provided herein. The method may include the following steps: obtaining an input video; obtaining at least one prespecified object, being a visual or an acoustic object or a descriptor thereof; analyzing the input video to detect matched objects, namely objects having descriptors similar to those of the at least one prespecified object; and generating a redacted video by removing or replacing the matched objects.
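A minimal Python sketch with OpenCV of the visual redaction step, assuming bounding boxes of matched objects have already been produced by the descriptor-matching detector; blurring stands in for "removing or replacing".

```python
import cv2

def redact(frame, matched_boxes, ksize=51):
    """Replace each matched region with a heavily blurred version of itself."""
    out = frame.copy()
    for (x, y, w, h) in matched_boxes:
        roi = out[y:y + h, x:x + w]
        out[y:y + h, x:x + w] = cv2.GaussianBlur(roi, (ksize, ksize), 0)
    return out
```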