Patent classifications
H04N21/2335
Rendering content on computing systems
A system and method for rendering video content is disclosed. Video content is retrieved from a network and rendered by a graphics processing unit (GPU). The retrieved video content is rendered when a display of the video content is in an application foreground, and stopped when the display of the video content is moved from the application foreground to an application background. The rendering of the video content is then resumed when the display of the video content is returned from the application background to the application foreground.
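The foreground/background rendering control described above can be sketched as a small state machine; the class and method names below are illustrative stand-ins, not interfaces from the patent.

```python
# Hypothetical sketch of the foreground/background rendering control:
# rendering starts when the display enters the application foreground,
# stops when it moves to the background, and resumes on return.

class VideoRenderer:
    """Tracks whether retrieved video content should be GPU-rendered."""

    def __init__(self):
        self.rendering = False

    def on_foreground(self):
        # Display of the video content enters the application
        # foreground: start (or resume) rendering.
        self.rendering = True

    def on_background(self):
        # Display moves to the application background: stop rendering.
        self.rendering = False


renderer = VideoRenderer()
renderer.on_foreground()   # rendering starts
renderer.on_background()   # rendering stops
renderer.on_foreground()   # rendering resumes
```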
Video stream processing method, computer device, and storage medium
A video stream processing method includes: obtaining first audio stream data included in live video stream data; performing speech recognition on the first audio stream data, to obtain a speech recognition text; and generating second audio stream data according to the speech recognition text. The second audio stream data includes a second speech, a language of the second speech being different from a language of a first speech in the first audio stream data. The method also includes merging the second audio stream data and the live video stream data according to time information, to obtain processed live video stream data. The time information indicates a playing timestamp of the second audio stream data and the live video stream data.
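The final merge-by-time-information step might be sketched as follows; the speech-recognition and translation stages are omitted, and the packet representation (timestamp, payload) is an assumption for illustration.

```python
# Illustrative sketch of merging second-language audio packets into a
# live stream, ordered by playing timestamp. Each stream is modeled as
# a list of (timestamp, payload) tuples.

def merge_by_time(live_stream, second_audio):
    """Merge the generated second audio stream data into the live
    video stream data according to the playing timestamps."""
    return sorted(live_stream + second_audio, key=lambda pkt: pkt[0])


live = [(0.0, "video+audio#1"), (1.0, "video+audio#2")]
translated = [(0.5, "speech-en#1")]
processed = merge_by_time(live, translated)
```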
Text-to-speech audio segment retrieval
A client computing system sends to a server system a presentation request for an audio presentation of electronic communications, and receives a manifest from the server system. The manifest indicates a plurality of segment-specific retrieval locations in which a different one of the plurality of segment-specific retrieval locations is indicated for each of a plurality of text-to-speech audio segments of the audio presentation. For each of the plurality of text-to-speech audio segments, the client computing system identifies a presentation order of the text-to-speech audio segment within the audio presentation; sends to the server system a segment request for the text-to-speech audio segment at the segment-specific retrieval location for that text-to-speech audio segment; receives from the server system the text-to-speech audio segment responsive to the segment request for that text-to-speech audio segment; and outputs the text-to-speech audio segment in the identified presentation order.
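The client-side loop can be sketched as below, assuming a manifest that maps each segment's presentation order to its segment-specific retrieval location; `fetch` is a stand-in for the actual segment request/response round trip.

```python
# Minimal sketch of manifest-driven retrieval: request each
# text-to-speech audio segment at its own location, then output the
# segments in presentation order. All names are illustrative.

def fetch(url):
    # Placeholder for the segment request to the server system.
    return f"audio-bytes-from:{url}"

def play_presentation(manifest):
    """Retrieve every segment and return them in presentation order."""
    output = []
    for entry in sorted(manifest, key=lambda e: e["order"]):
        output.append(fetch(entry["location"]))
    return output


manifest = [
    {"order": 2, "location": "https://example.com/seg2"},
    {"order": 1, "location": "https://example.com/seg1"},
]
audio_presentation = play_presentation(manifest)
```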
Content recommendation system
A method of setting controls on a digital communication device operated by an end user participant in a digital content event provided from a digital communication network includes operating a tracking control in the digital communication network to track activity of the digital communication device in a context of the digital content event; operating a timer in conjunction with the activity to form a time-stamped set of ranking controls; attenuating the time-stamped set of ranking controls according to an elapsed time; applying the time-stamped set of ranking controls and content inputs from a digital content manager to operate a ranking control and digital filter to generate a control interface for the digital communication device, the control interface comprising a plurality of individually operable controls; and configuring the digital communication device with the control interface.
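The attenuation of the time-stamped ranking controls might take a form like the following; the exponential shape and the half-life constant are assumptions, since the abstract does not specify a decay function.

```python
# Hypothetical attenuation step: each time-stamped ranking control is
# scaled down according to elapsed time, halving its weight every
# `half_life` seconds.

def attenuate(rank, elapsed, half_life=3600.0):
    """Attenuate a ranking control by elapsed time in seconds."""
    return rank * 0.5 ** (elapsed / half_life)


# A control tracked an hour ago carries half its original weight.
weight = attenuate(1.0, elapsed=3600.0)
```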
WATERMARKING WITH PHASE SHIFTING
Apparatus, devices, systems, methods, and articles of manufacture are disclosed for watermarking with phase shifting. An example watermark encoding apparatus includes memory, machine readable instructions, and processor circuitry to execute the instructions to select a plurality of frequencies for encoding a watermark symbol, apply a phase shift pattern to the plurality of frequencies, the phase shift pattern based on a phase reference, embed the plurality of frequencies with the applied phase shift pattern in a media signal to encode the watermark symbol in the media signal, and embed the phase reference into the media signal.
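A toy sketch of the embedding idea: add low-amplitude tones at a selected set of frequencies, each offset from a common phase reference by a symbol-dependent phase shift. The amplitude, sample rate, and function names are illustrative assumptions, not the patent's encoding scheme.

```python
import math

# Encode one watermark symbol by summing phase-shifted sinusoids at
# the selected frequencies into the media signal. The phase reference
# is the common offset against which the shifts are measured.

def embed_symbol(signal, freqs, shifts, phase_ref=0.0,
                 sample_rate=8000, amplitude=0.01):
    """Return a copy of `signal` with the phase-shifted tones added."""
    out = list(signal)
    for f, shift in zip(freqs, shifts):
        for n in range(len(out)):
            t = n / sample_rate
            out[n] += amplitude * math.sin(
                2 * math.pi * f * t + phase_ref + shift)
    return out
```

A decoder would recover the phase reference from the signal and read the symbol back from the relative phases at the selected frequencies.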
A METHOD AND SYSTEM FOR CONTENT INTERNATIONALIZATION & LOCALISATION
A method of processing a video file to generate a modified video file, the modified video file including a translated audio content of the video file, the method comprising: receiving the video file; accessing a facial model or a speech model for a specific speaker, wherein the facial model maps speech to facial expressions, and the speech model maps text to speech; receiving a reference content for the originating video file for the specific speaker; generating modified audio content for the specific speaker and/or modified facial expression for the specific speaker; and modifying the video file in accordance with the modified content and/or the modified expression to generate the modified video file.
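The per-speaker pipeline can be sketched at a high level as below; every stage is stubbed, and the model interfaces are assumptions made for illustration only.

```python
# High-level sketch of the localisation pipeline: the speech model
# maps text to speech, the facial model maps speech to facial
# expressions, and both outputs are spliced back into the video.

def translate_video(video, speech_model, facial_model, reference_text):
    """Generate translated audio and matching facial expressions for a
    specific speaker, then combine them into a modified video file."""
    audio = speech_model(reference_text)      # text -> speech
    expressions = facial_model(audio)         # speech -> expressions
    return {"video": video, "audio": audio, "faces": expressions}


modified = translate_video(
    video="clip.mp4",
    speech_model=lambda text: f"tts({text})",
    facial_model=lambda audio: f"viseme-track({audio})",
    reference_text="translated transcript for the specific speaker",
)
```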
Content providing server, content providing terminal and content providing method
Disclosed is a scene meta information generating apparatus including: a subtitle information generating unit configured to detect a plurality of unit subtitles based on a subtitle file related to image contents and correct the plurality of unit subtitles; an audio information generating unit configured to extract audio information from the image contents, detect a plurality of speech segments based on the audio information, and perform speech-recognition on audio information in each speech segment; and an image information generating unit configured to detect a video segment corresponding to each speech segment, perform image-recognition on image frames in the video segment, and select a representative image from the image frames.
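One helper of the image information generating unit, detecting the video segment that corresponds to a speech segment, might look like this; the data shapes (start/end times, per-frame timestamps) are assumptions.

```python
# Illustrative helper: given a speech segment's start and end times,
# return the indices of the image frames that fall inside the
# corresponding video segment.

def video_segment_for(speech_segment, frame_times):
    """Indices of frames whose timestamps lie within the segment."""
    start, end = speech_segment
    return [i for i, t in enumerate(frame_times) if start <= t <= end]


frames = video_segment_for((1.0, 2.0), [0.5, 1.0, 1.5, 2.5])
# A representative image could then be chosen from `frames`, e.g. by
# image-recognition scoring over those frames.
```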
FRAGMENT-ALIGNED AUDIO CODING
Audio video synchronization and alignment, or alignment of audio to some other external clock, are made more effective or easier by treating the fragment grid and the frame grid as independent values while nevertheless, for each fragment, aligning the frame grid to the respective fragment's beginning. The loss of compression effectiveness may be kept low by appropriately selecting the fragment size. On the other hand, the alignment of the frame grid with respect to the fragments' beginnings allows for an easy and fragment-synchronized way of handling the fragments in connection with, for example, parallel audio video streaming, bitrate adaptive streaming, or the like.
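The fragment/frame grid relationship reduces to simple arithmetic: because the frame grid restarts at each fragment's beginning, the number of frames per fragment depends only on the fragment and frame durations, not on any global frame grid. The function below is an illustrative sketch of that count.

```python
import math

# Sketch of fragment-aligned framing: within each fragment the frame
# grid starts at the fragment boundary, so a fragment of duration D
# needs ceil(D / frame_duration) frames to cover it; the last frame
# may overshoot the fragment end.

def frames_in_fragment(fragment_duration, frame_duration):
    """Number of frames whose grid is aligned to the fragment start."""
    return math.ceil(fragment_duration / frame_duration)


# e.g. a 1-second fragment at 25 ms per audio frame needs 40 frames.
n = frames_in_fragment(1.0, 0.025)
```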
APPARATUS AND METHOD FOR PROVIDING AUDIO DESCRIPTION CONTENT
A method and apparatus are described. The method includes receiving a set of audio soundtracks associated with media content, determining if one of the audio soundtracks is an audio description soundtrack, and modifying at least one main audio soundtrack to include an indication that an audio description soundtrack is available for the media content if it is determined that one of the audio soundtracks is an audio description soundtrack. The apparatus includes memory that stores a set of audio soundtracks associated with media content. The apparatus also includes an audio processing circuit configured to retrieve the set of audio soundtracks, determine if one of the audio soundtracks is an audio description soundtrack, and modify at least one main audio soundtrack to include an indication that an audio description soundtrack is available for the media content if it is determined that one of the audio soundtracks is an audio description soundtrack.
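The determine-and-modify logic might be sketched as below; representing soundtracks as dictionaries with a `kind` field, and signalling availability via an `ad_available` flag, are both assumptions for illustration.

```python
# Illustrative sketch: if any soundtrack in the set is an audio
# description soundtrack, mark each main soundtrack with an
# indication that audio description is available.

def tag_main_tracks(soundtracks):
    """Modify main soundtracks to indicate audio description
    availability for the media content."""
    has_ad = any(t.get("kind") == "audio_description"
                 for t in soundtracks)
    for t in soundtracks:
        if t.get("kind") == "main":
            t["ad_available"] = has_ad
    return soundtracks


tracks = tag_main_tracks([
    {"kind": "main"},
    {"kind": "audio_description"},
])
```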
Method and system for implementing split and parallelized encoding or transcoding of audio and video content
Novel tools and techniques are provided for implementing split and parallelized encoding or transcoding of audio and video. In various embodiments, a computing system might split an audio-video file that is received from a content source into a single video file and a single audio file. The computing system might encode or transcode the single audio file. Concurrently, the computing system might split the single video file into a plurality of video segments. A plurality of parallel video encoders/transcoders might concurrently encode or transcode the plurality of video segments, each video encoder/transcoder encoding or transcoding one video segment of the plurality of video segments. Subsequently, the computing system might assemble the plurality of encoded or transcoded video segments with the encoded or transcoded audio file to produce an encoded or transcoded audio-video file, which may be output to a display device(s), an audio playback device(s), or the like.
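The split-and-parallelize flow above can be sketched with a thread pool; the encoder functions are stand-ins (a real pipeline would invoke a codec or an external tool such as ffmpeg), and all names are illustrative.

```python
from concurrent.futures import ThreadPoolExecutor

# Minimal sketch: encode the single audio file once while a pool of
# workers encodes the video segments in parallel, then assemble the
# results into one encoded audio-video file.

def encode_audio(audio):
    return f"enc({audio})"          # single audio file, encoded once

def encode_video_segment(segment):
    return f"enc({segment})"        # one parallel encoder per segment

def transcode(audio, video_segments):
    """Encode audio and video segments concurrently, then assemble."""
    with ThreadPoolExecutor() as pool:
        audio_future = pool.submit(encode_audio, audio)
        encoded_segments = list(pool.map(encode_video_segment,
                                         video_segments))
        encoded_audio = audio_future.result()
    return {"audio": encoded_audio, "video": encoded_segments}


result = transcode("track.aac", ["seg1", "seg2", "seg3"])
```

`pool.map` preserves segment order, so reassembly is a straight concatenation of the encoded segments with the encoded audio.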