H04N21/234336

METHODS AND SYSTEMS FOR SELECTIVE PLAYBACK AND ATTENUATION OF AUDIO BASED ON USER PREFERENCE

Systems and methods are presented for filtering unwanted sounds from a media asset. Voice profiles of a first character and a second character are generated based on a first voice signal and a second voice signal received from the media device during a presentation. The user provides a selection to avoid a certain sound or voice associated with the second character. During a presentation of the media asset, a second audio segment is analyzed to determine, based on the voice profile of the second character, whether the second voice signal includes the voice of the second character. If so, the output characteristics of the second voice signal are adjusted to reduce the sound.
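The matching and attenuation steps can be illustrated with a minimal sketch. It assumes each audio segment and each voice profile is already represented as a numeric feature vector, and it uses cosine similarity with a fixed threshold and gain; the abstract specifies none of these, and the function names are hypothetical.

```python
from math import sqrt

def cosine_similarity(a, b):
    """Similarity between two feature vectors (assumed representation)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def attenuate_if_match(samples, segment_features, voice_profile,
                       threshold=0.8, gain=0.25):
    """Scale the samples down when the segment matches the unwanted voice.

    threshold and gain are illustrative values, not taken from the source.
    """
    if cosine_similarity(segment_features, voice_profile) >= threshold:
        return [s * gain for s in samples]
    return list(samples)
```

A segment whose features do not match the selected profile is returned unchanged, so only the unwanted voice is reduced.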

TEXT-DRIVEN EDITOR FOR AUDIO AND VIDEO ASSEMBLY
20220130427 · 2022-04-28

The disclosed technology is a system and computer-implemented method for assembling and editing a video program from spoken words or soundbites. The disclosed technology imports source audio/video clips in any of multiple formats. Spoken audio is transcribed into searchable text. The text transcript is synchronized to the video track by timecode markers. Each spoken word corresponds to a timecode marker, which in turn corresponds to a video frame or frames. Using word processing operations and text editing functions, a user selects video segments by selecting corresponding transcribed text segments. By selecting text and arranging that text, a corresponding video program is assembled. The selected video segments are assembled on a timeline display in any chosen order by the user. The sequence of video segments may be reordered and edited, as desired, to produce a finished video program for export.
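The mapping from selected text to video segments can be sketched as follows. The `Word` record, the use of seconds as timecodes, and the function names are assumptions made for illustration; the abstract states only that each word corresponds to a timecode marker.

```python
from dataclasses import dataclass

@dataclass
class Word:
    text: str
    start: float  # timecode in seconds where the word begins (assumed unit)
    end: float    # timecode in seconds where the word ends

def clip_for_selection(transcript, first_word, last_word):
    """Return the (start, end) time range covered by a run of selected words."""
    selected = transcript[first_word:last_word + 1]
    return (selected[0].start, selected[-1].end)

def assemble(transcript, selections):
    """Build a timeline of clips in the order the user arranged the text."""
    return [clip_for_selection(transcript, a, b) for a, b in selections]
```

Reordering the text selections reorders the resulting clips, which is the editing model the abstract describes.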

Real-time scrubbing of online videos

A method for traversing a streaming video file includes receiving a representative streaming video file that includes less information than a higher-resolution streaming video file and spans the entire streaming video file. Based on navigation information associated with the representative streaming video file, a playback engine navigates to a different portion of the streaming video file. The navigation information may be based on input information received from a viewer of the streaming video file. One advantage of the disclosed method is that it enables fast and accurate navigation of a streaming video.
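One way to picture the navigation step is to map a scrub-bar position onto the representative stream and derive a seek time for the full stream. The sketch below assumes the representative stream holds one preview frame per fixed interval; that layout and the function name are illustrative assumptions, not details from the abstract.

```python
import math

def seek_target(scrub_fraction, duration_s, preview_interval_s):
    """Map a scrub position (0.0 to 1.0) to a preview frame and a seek time.

    Assumes one preview frame every preview_interval_s seconds.
    """
    fraction = min(max(scrub_fraction, 0.0), 1.0)
    last_index = max(math.ceil(duration_s / preview_interval_s) - 1, 0)
    index = min(int(fraction * duration_s / preview_interval_s), last_index)
    return index, index * preview_interval_s
```

The playback engine would show the preview frame at `index` while the viewer drags, then seek the higher-resolution stream to the returned time.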

VIDEO PROCESSING

A video processing method and apparatus are provided. The video processing method includes: extracting at least two types of modal information from a received target video; extracting text information from the at least two types of modal information based on extraction manners corresponding to the at least two types of modal information; and performing matching between preset object information of a target object and the text information to determine an object list corresponding to the target object included in the target video.
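The matching step can be sketched under the assumption that text has already been extracted from each modality (for example speech and on-screen text) and that matching is a case-insensitive substring test. Both the matching rule and the names below are hypothetical; the abstract does not say how matching is performed.

```python
def match_objects(modal_text, object_names):
    """Return each preset object found in the text, with the modalities it came from.

    modal_text maps a modality name to its extracted text (assumed layout).
    """
    found = []
    for name in object_names:
        sources = [modality for modality, text in modal_text.items()
                   if name.lower() in text.lower()]
        if sources:
            found.append((name, sources))
    return found
```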

SYSTEMS AND METHODS FOR GENERATING DIGITAL VIDEO CONTENT FROM NON-VIDEO CONTENT
20220124385 · 2022-04-21

Embodiments of the present invention provide for generating digital video content from non-video content. The systems and methods provide for, upon receiving an input from an end user to generate the digital video content, retrieving the non-video content; extracting metadata from the non-video content; combining the non-video content, the extracted metadata, and user preferences into a digital content instruction package; and generating the digital video content based on the digital content instruction package, wherein generating the digital video content includes (i) modifying the digital video content based on the user preferences and (ii) displaying the modified digital video content to the end user.

METHOD AND SYSTEM TO HIGHLIGHT VIDEO SEGMENTS IN A VIDEO STREAM

A method to highlight video segments in a video stream, where the method includes receiving a video stream from a video source, identifying a highlight segment within the video stream based on a machine learning model, the highlight segment being deemed to be worthy of replay by the machine learning model, and starting and ending frames of the highlight segment being identified by applying the machine learning model to the video stream and corresponding audio data, and providing an availability indication of the highlight segment in the video stream once the starting and ending frames of the highlight segment are identified.
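Identifying starting and ending frames can be illustrated by thresholding per-frame scores. The sketch assumes the machine learning model emits one score per frame and that a highlight is a run of consecutive frames at or above a threshold; the threshold, the minimum length, and the function name are illustrative assumptions.

```python
def highlight_segments(scores, threshold=0.5, min_length=2):
    """Return (start_frame, end_frame) pairs for runs of high-scoring frames."""
    segments = []
    start = None
    for i, score in enumerate(scores):
        if score >= threshold and start is None:
            start = i  # a run begins at this frame
        elif score < threshold and start is not None:
            if i - start >= min_length:
                segments.append((start, i - 1))
            start = None
    # Close a run that lasts until the final frame.
    if start is not None and len(scores) - start >= min_length:
        segments.append((start, len(scores) - 1))
    return segments
```

An availability indication could be raised as soon as a pair is appended, since both boundary frames are known at that point.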

Real time popularity based audible content acquisition

A personalized news service provides personalized news programs for its users by generating personalized combinations of audible versions of news stories derived from text-based versions of the news stories. The audible versions may be generated from the text-based version by a text-to-speech system, or by recording a person reading aloud the text-based version. To acquire recordings, the personalized news service can make a determination that a particular news story has a threshold extent of popularity. The news service can then transmit a request to a remote recording station for a recording of a verbal reading of the particular news story. The news service can then receive the requested recording from the remote recording station.
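The popularity determination can be sketched as a simple filter. It assumes popularity is measured by a view count per story and that stories with an existing recording are skipped; the metric and the names are hypothetical, since the abstract refers only to a threshold extent of popularity.

```python
def stories_needing_recording(view_counts, threshold, already_recorded):
    """Return IDs of stories popular enough to request a human recording for."""
    return [story_id for story_id, views in view_counts.items()
            if views >= threshold and story_id not in already_recorded]
```

Each returned ID would then be sent in a request to the remote recording station.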

Intelligent commentary generation and playing methods, apparatuses, and devices, and computer storage medium

The present disclosure provides an intelligent commentary generation method. The method includes: obtaining a match data stream; parsing the match data stream, to obtain candidate events from the match data stream; determining events from the candidate events, to generate a sequence of events; and generating commentary scripts corresponding to the match data stream according to the sequence of events.
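The four steps (obtain, parse, select, generate) can be sketched as a small pipeline. The event dictionaries, the template table, and the rule that selected events must be at least a few seconds apart are all assumptions for illustration; the abstract does not describe how events are chosen or how scripts are worded.

```python
# Hypothetical templates keyed by event type.
TEMPLATES = {
    "kill": "{player} takes down an opponent!",
    "objective": "{player} secures the objective.",
}

def parse_candidates(stream):
    """Keep only records whose type has a commentary template."""
    return [event for event in stream if event.get("type") in TEMPLATES]

def select_events(candidates, min_gap=5.0):
    """Order candidates by time and drop those too close to the previous pick."""
    chosen = []
    for event in sorted(candidates, key=lambda e: e["t"]):
        if not chosen or event["t"] - chosen[-1]["t"] >= min_gap:
            chosen.append(event)
    return chosen

def generate_scripts(stream):
    """Produce (time, script) pairs for the selected sequence of events."""
    events = select_events(parse_candidates(stream))
    return [(e["t"], TEMPLATES[e["type"]].format(player=e["player"]))
            for e in events]
```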

VIDEO PROCESSING FOR EMBEDDED INFORMATION CARD LOCALIZATION AND CONTENT EXTRACTION
20220027631 · 2022-01-27

Metadata for one or more highlights of a video stream may be extracted from one or more card images embedded in the video stream. The highlights may be segments of the video stream, such as a broadcast of a sporting event, that are of particular interest. According to one method, video frames of the video stream are stored. One or more information cards embedded in a decoded video frame may be detected by analyzing one or more predetermined video frame regions. Image segmentation, edge detection, and/or closed contour identification may then be performed on identified video frame region(s). Further processing may include obtaining a minimum rectangular perimeter area enclosing all remaining segments, which may then be further processed to determine precise boundaries of information card(s). The card image(s) may be analyzed to obtain metadata, which may be stored in association with at least one of the video frames.
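The step of enclosing all remaining segments in a minimum rectangle can be sketched as a bounding-box computation. The sketch assumes each segment is an axis-aligned box given as (left, top, right, bottom) in pixel coordinates; that representation and the function name are assumptions, and the earlier segmentation and contour steps are not shown.

```python
def enclosing_rectangle(segments):
    """Return the smallest axis-aligned rectangle containing every segment box."""
    lefts, tops, rights, bottoms = zip(*segments)
    return (min(lefts), min(tops), max(rights), max(bottoms))
```

The resulting rectangle is only a starting region; per the abstract, further processing would refine it to the precise card boundaries.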

DATA STORAGE SERVER WITH ON-DEMAND MEDIA SUBTITLES
20220030286 · 2022-01-27

A network-attached storage device (NAS) includes a non-volatile memory module storing a media stream, a network interface, and control circuitry coupled to the non-volatile memory module and to the network interface and configured to connect to a client over a network connection using the network interface, receive a request for the media stream from the client, determine subtitle preferences associated with the request for the media stream, access an audio stream associated with the media stream, generate subtitles based on the audio stream, and send a transport stream to the client over the network connection, the transport stream including the media stream and the subtitles.
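The last stage of subtitle generation, turning timed text into a subtitle track, can be sketched with the widely used SubRip (SRT) layout. The choice of SRT is an assumption, as the abstract names no subtitle format, and the speech recognition that would produce the cues is outside this sketch. Cue times are taken as integer milliseconds.

```python
def srt_timestamp(ms):
    """Format integer milliseconds as an SRT timestamp (HH:MM:SS,mmm)."""
    hours, rem = divmod(ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1000)
    return f"{hours:02}:{minutes:02}:{seconds:02},{millis:03}"

def to_srt(cues):
    """Render (start_ms, end_ms, text) cues as numbered SRT blocks."""
    blocks = []
    for number, (start_ms, end_ms, text) in enumerate(cues, start=1):
        blocks.append(
            f"{number}\n{srt_timestamp(start_ms)} --> {srt_timestamp(end_ms)}\n{text}\n"
        )
    return "\n".join(blocks)
```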