Patent classifications
H04N21/2335
Automated voice translation dubbing for prerecorded video
A method for aligning a translation of original caption data with an audio portion of a video is provided. The method includes identifying, by a processing device, original caption data for a video that includes a plurality of caption character strings. The processing device identifies speech recognition data that includes a plurality of generated character strings and associated timing information for each generated character string. The processing device maps the plurality of caption character strings to the plurality of generated character strings using assigned values indicative of semantic similarities between character strings. The processing device assigns timing information to the individual caption character strings based on timing information of mapped individual generated character strings. The processing device aligns a translation of the original caption data with the audio portion of the video using assigned timing information of the individual caption character strings.
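The mapping step above can be sketched in a few lines. This is a minimal illustration, not the patented method: it uses lexical similarity from Python's `difflib` as a stand-in for the assigned semantic-similarity values, and models the speech recognition data as hypothetical (text, start-time) pairs.

```python
from difflib import SequenceMatcher

def align_captions(captions, asr_words):
    """Map each caption character string to its best-matching generated
    string by a similarity score, then inherit that string's timing.
    `asr_words` is a list of (generated_text, start_seconds) pairs."""
    aligned = []
    for cap in captions:
        # Assign a similarity value to every generated string and keep the best.
        best = max(asr_words,
                   key=lambda w: SequenceMatcher(None, cap.lower(), w[0].lower()).ratio())
        aligned.append((cap, best[1]))  # caption inherits the mapped timing
    return aligned

captions = ["hello world", "goodbye"]
asr = [("helo world", 0.5), ("good bye", 3.2)]
print(align_captions(captions, asr))  # [('hello world', 0.5), ('goodbye', 3.2)]
```

A translation of each caption can then reuse the same assigned start times to stay in step with the audio.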
HIGH-SPEED REAL-TIME DATA TRANSMISSION METHOD AND APPARATUS, DEVICE, AND STORAGE MEDIUM
A high-speed real-time data transmission method includes performing deduplication processing on first encoded data from a transmission device to obtain target data. The first encoded data is obtained by encoding corresponding data using a first encoding algorithm. The method further includes encoding the target data using a second encoding algorithm to obtain second encoded data, and sending the second encoded data to a receiving device. A compression ratio of the second encoding algorithm is greater than a compression ratio of the first encoding algorithm.
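The two-stage pipeline can be sketched as follows. This is an assumption-laden illustration: chunked byte strings stand in for the first encoded data, deduplication keeps first occurrences as the target data, and zlib at level 9 stands in for a second encoding algorithm with a higher compression ratio than the first.

```python
import zlib

def dedup(chunks):
    """Deduplication pass over the first encoded data: drop repeated
    chunks, keeping first occurrences (the 'target data')."""
    seen, out = set(), []
    for c in chunks:
        if c not in seen:
            seen.add(c)
            out.append(c)
    return out

def second_stage_encode(chunks):
    """Encode the target data with the higher-ratio second algorithm
    before sending it to the receiving device."""
    target = b"".join(dedup(chunks))
    return zlib.compress(target, 9)  # level 9: highest zlib compression ratio

chunks = [b"sensor-frame-A", b"sensor-frame-A", b"sensor-frame-B"]
payload = second_stage_encode(chunks)
assert zlib.decompress(payload) == b"sensor-frame-Asensor-frame-B"
```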
Adaptive marketing in cloud-based content production
Methods, apparatus and systems related to production of a movie, a TV show or a multimedia content are described. In one example aspect, a system for producing a multimedia digital content includes a pre-production subsystem configured to receive information about a storyline, cameras, cast, and other assets for the content from a user. The pre-production subsystem is configured to generate one or more machine-readable scripts that include information about one or more advertisements. The system includes a production subsystem configured to receive the one or more machine-readable scripts from the pre-production subsystem to obtain a footage according to the storyline. The production subsystem is further configured to embed one or more markers corresponding to the one or more advertisements in the footage. The system also includes a post-production editing subsystem configured to detect the one or more markers embedded in the footage and replace each of the one or more markers with a corresponding advertising target.
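The embed-then-replace workflow can be modeled compactly. This is a hypothetical sketch, not the patented system: footage is a list of scene labels, ad slots and target names are invented, and a tagged tuple stands in for a machine-readable marker.

```python
def embed_markers(footage, ad_slots):
    """Production step: insert a machine-readable marker at each index
    named by the script's ad slots. Insert in reverse index order so
    earlier insertions don't shift later positions."""
    out = list(footage)
    for idx, ad_id in sorted(ad_slots.items(), reverse=True):
        out.insert(idx, ("MARKER", ad_id))
    return out

def replace_markers(footage, targets):
    """Post-production step: detect each embedded marker and replace it
    with its corresponding advertising target."""
    return [targets[seg[1]] if isinstance(seg, tuple) and seg[0] == "MARKER" else seg
            for seg in footage]

footage = ["scene1", "scene2"]
marked = embed_markers(footage, {1: "ad-slot-7"})
final = replace_markers(marked, {"ad-slot-7": "soda-spot"})
print(final)  # ['scene1', 'soda-spot', 'scene2']
```

Because the marker carries only a slot identifier, the advertising target can be chosen (or changed) long after the footage is shot.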
METHODS AND APPARATUS FOR INTEGRATING MEDIA ACROSS A WIDE AREA NETWORK
A system for distributing media includes a wide area network (WAN), a media player coupled to the WAN at a first home, and a media server coupled to the WAN at a second home for providing media. A service is coupled to the WAN for receiving a request for media from the media player and for establishing a connection between the first and second homes over the WAN. Media is streamed across the WAN from the second home to the first home. The system may include a storage device coupled to the media player where media is transferred across the WAN for storage at the storage device. A media device may be coupled to the media player for playing the streamed/transferred media where the media player and the media device may comprise a television, stereo, or computer and the media item may comprise video, photographs, or audio.
Just after broadcast media content
Methods and apparatus are described for making broadcast content available as an on-demand asset soon after all of the video fragments of the broadcast content have been made available. As the video fragments of the broadcast content are made available, they are requested and archived. When all of the fragments for the duration of the broadcast content are available (e.g., a live event ends), a VOD-style manifest is generated and the archived fragments are made available for downloading or streaming using the VOD-style manifest.
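Once the last fragment of the broadcast is archived, generating the VOD-style manifest is a straightforward serialization step. The sketch below emits a minimal HLS-style VOD playlist; the `#EXT` tags follow the HLS playlist format, while the segment file names and fixed fragment duration are illustrative assumptions.

```python
def build_vod_manifest(fragments, fragment_duration):
    """Emit a VOD-style playlist referencing the archived fragments,
    ending with an end-list tag so players treat it as complete."""
    lines = ["#EXTM3U",
             "#EXT-X-PLAYLIST-TYPE:VOD",
             f"#EXT-X-TARGETDURATION:{fragment_duration}"]
    for frag in fragments:
        lines.append(f"#EXTINF:{fragment_duration:.1f},")  # per-fragment duration
        lines.append(frag)
    lines.append("#EXT-X-ENDLIST")  # marks the asset as on-demand, not live
    return "\n".join(lines)

manifest = build_vod_manifest(["seg0.ts", "seg1.ts"], 6)
print(manifest.splitlines()[-1])  # #EXT-X-ENDLIST
```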
METHODS AND DEVICES FOR PERSONALIZING AUDIO CONTENT
The present document describes a method (400) for personalizing audio content. The method (400) comprises receiving (401) a manifest file (140) for the audio content. The manifest file (140) comprises at least one adaptation set (281, 282) referencing an audio bitstream (121), where the audio bitstream (121) comprises a plurality of audio objects (181), and a plurality of different preselection elements (291, 292, 293) for the adaptation set (281, 282), wherein the different preselection elements (291, 292, 293) specify different combinations of the plurality of audio objects (181). The method (400) further comprises selecting (402) a preselection element (291) from the plurality of different preselection elements (291, 292, 293), and causing (403) rendering of an audio signal which depends on the selected preselection element (291).
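The selection step can be illustrated with a toy manifest. The nested dictionary below is a simplified stand-in for the real manifest structure, and the tag and audio-object names are invented for the example.

```python
def choose_preselection(manifest, preferred_tag):
    """Pick the preselection element whose tag matches the listener's
    preference, falling back to the first one; return the combination
    of audio objects that should be rendered."""
    presels = manifest["adaptation_set"]["preselections"]
    chosen = next((p for p in presels if p["tag"] == preferred_tag), presels[0])
    return chosen["audio_objects"]

manifest = {"adaptation_set": {"preselections": [
    {"tag": "default", "audio_objects": ["dialog", "music", "effects"]},
    {"tag": "dialog-boost", "audio_objects": ["dialog"]},
]}}
print(choose_preselection(manifest, "dialog-boost"))  # ['dialog']
```

Because every preselection references objects inside the same audio bitstream, switching preselections personalizes the mix without fetching a different stream.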
Systems and methods for adaptive streaming of multimedia content
The disclosed computer-implemented method includes determining that audio quality is to be adjusted for a multimedia streaming connection over which audio data and video data are being streamed to a content player. The audio data is streamed at a specified audio quality level and the video data is streamed at a specified video quality level. The method also includes determining that a specified minimum video quality level is to be maintained while adjusting the audio quality level. Still further, the method includes dynamically adjusting the audio quality level of the multimedia streaming connection while maintaining the video quality level of the connection at or above the specified minimum video quality level. Various other methods, systems, and computer-readable media are also disclosed.
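The trade-off above reduces to a constrained budget split. The sketch below is a simplified illustration under assumed numbers: a fixed total bitrate, an invented ladder of audio quality levels, and the rule that video never drops below its specified minimum.

```python
def allocate_bitrate(total_kbps, video_floor_kbps, audio_levels):
    """Raise audio quality as far as the bandwidth budget allows while
    keeping video at or above its specified minimum. Returns
    (audio_kbps, video_kbps)."""
    headroom = total_kbps - video_floor_kbps
    # Pick the highest audio level that fits in the headroom; if even the
    # lowest level doesn't fit, fall back to it anyway.
    audio = max((a for a in audio_levels if a <= headroom), default=min(audio_levels))
    return audio, total_kbps - audio

print(allocate_bitrate(1500, 1000, [64, 128, 256, 640]))  # (256, 1244)
```

Here the 640 kbps audio level is rejected because it would push video below the 1000 kbps floor, so audio settles at 256 kbps and video keeps the remaining 1244 kbps.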
APPARATUS AND METHOD FOR AUDIO ENCODING
An audio encoding apparatus comprises an audio receiver (201) receiving audio items representing an audio scene and a metadata receiver (203) receiving input presentation metadata for the audio items describing presentation constraints for the rendering of the audio items. The presentation constraints constrain a rendering parameter that can be adapted when rendering the audio items. An audio encoder (205) generates encoded audio data for the audio scene by encoding the audio items, with the encoding being adapted in response to the input presentation metadata. A metadata circuit (207) generates output presentation metadata from the input presentation metadata. The output presentation metadata comprises data for encoded audio items which constrains the extent by which an adaptable parameter of a rendering can be adapted when rendering the encoded audio items. An output (209) generates an encoded audio data stream comprising the encoded audio data and the output presentation metadata.
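On the rendering side, a constraint like this amounts to clamping an adaptable parameter to an encoder-specified range. The sketch below is illustrative only: the metadata field names and the choice of gain as the constrained parameter are assumptions, not the patent's wire format.

```python
def clamp_gain(requested_gain_db, constraints):
    """Apply the presentation constraints carried in the output
    metadata: a renderer may adapt an audio item's gain only within
    the encoder-specified range (hypothetical field names)."""
    lo, hi = constraints["min_gain_db"], constraints["max_gain_db"]
    return max(lo, min(hi, requested_gain_db))

meta = {"min_gain_db": -6.0, "max_gain_db": 3.0}
print(clamp_gain(10.0, meta))  # 3.0  (request exceeds the allowed range)
print(clamp_gain(-2.0, meta))  # -2.0 (request already within range)
```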
Method and apparatus for efficient delivery and usage of audio messages for high quality of experience
A method and a system for a virtual reality, augmented reality, mixed reality, or 360-degree video environment are disclosed. The system receives Video Streams and Audio Streams associated with the audio and video scenes to be reproduced. There are provided a Video decoder, which decodes a signal from the Video Stream for the representation of the audio and video scene; an Audio decoder, which decodes a signal from the Audio Stream for the representation of the audio and video scene to the user; and a region of interest processor deciding, based e.g. on the user's viewport, head orientation, movement data, or metadata, whether an Audio information message is to be reproduced. Based on that decision, the reproduction of the Audio information message is caused.
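A region-of-interest decision of this kind can be reduced to an angular check between the viewport and the ROI. The policy below (play the message only while the ROI is outside the viewport, so the message can guide the user toward it) is an assumed example, as are the angle parameters.

```python
def should_play_audio_message(viewport_center_deg, roi_center_deg, roi_width_deg):
    """Decide whether to reproduce the Audio information message based
    on the user's viewport: True when the region of interest lies
    outside the viewport. Angles are yaw in degrees."""
    # Shortest signed angular distance between viewport and ROI centers.
    diff = abs((viewport_center_deg - roi_center_deg + 180) % 360 - 180)
    return diff > roi_width_deg / 2

print(should_play_audio_message(10, 180, 60))   # True: ROI is off-screen
print(should_play_audio_message(175, 180, 60))  # False: ROI is in view
```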