H04N21/439

Methods and apparatus to detect commercial advertisements associated with media presentations

Methods and apparatus to detect commercial advertisements associated with media presentations are disclosed. An example method involves receiving a video frame and detecting a change in box-formatting between the video frame and a subsequent video frame. A transition between the video frame and the subsequent video frame is indicated as a commercial advertisement transition based on the detected change in box-formatting.
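The box-formatting comparison described in this abstract can be sketched as follows. This is a minimal illustration, not the patented method: it assumes a frame is a list of rows of 8-bit luma values, treats near-black border rows and columns as letterbox or pillarbox bars, and flags a transition when the bar sizes differ between two consecutive frames. The names `box_format` and `is_commercial_transition`, the `dark_threshold`, and the `tolerance` are all hypothetical.

```python
from typing import List, Tuple

Frame = List[List[int]]  # rows of 8-bit luma values (assumed representation)


def box_format(frame: Frame, dark_threshold: int = 16) -> Tuple[int, int, int, int]:
    """Return the (top, bottom, left, right) sizes of near-black borders."""
    height, width = len(frame), len(frame[0])

    def row_is_dark(r: int) -> bool:
        return all(v <= dark_threshold for v in frame[r])

    def col_is_dark(c: int) -> bool:
        return all(frame[r][c] <= dark_threshold for r in range(height))

    top = 0
    while top < height and row_is_dark(top):
        top += 1
    bottom = 0
    while bottom < height - top and row_is_dark(height - 1 - bottom):
        bottom += 1
    left = 0
    while left < width and col_is_dark(left):
        left += 1
    right = 0
    while right < width - left and col_is_dark(width - 1 - right):
        right += 1
    return top, bottom, left, right


def is_commercial_transition(frame: Frame, next_frame: Frame, tolerance: int = 1) -> bool:
    """Flag a transition when any border size changes by more than `tolerance`."""
    before, after = box_format(frame), box_format(next_frame)
    return any(abs(a - b) > tolerance for a, b in zip(before, after))
```

A real detector would likely combine this cue with others (for example, blank frames or audio silence), since a box-format change alone can also occur within program content.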

Detection of volume adjustments during media replacement events using loudness level profiles

In one aspect, an example method includes (i) determining, by a playback device, a loudness level of first media content that the playback device is receiving from a first source; (ii) comparing, by the playback device, the determined loudness level of the first media content with a reference loudness level indicated by a loudness level profile for the first media content; (iii) determining, by the playback device, a target volume level for the playback device based on a difference between the determined loudness level of the first media content and the reference loudness level; and (iv) while the playback device presents second media content from a second source in place of the first media content, adjusting, by the playback device, a volume of the playback device toward the target volume level.
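Steps (i) through (iv) can be sketched as below. This is an illustrative reading under stated assumptions: loudness is approximated by an RMS level in dB relative to full scale, volume is expressed in dB, and the target volume is the current volume shifted by the measured-minus-reference difference. The abstract does not specify any of these choices, and the function names are hypothetical.

```python
import math
from typing import Sequence


def loudness_db(samples: Sequence[float]) -> float:
    """RMS level in dB relative to full scale (a simple stand-in for a loudness measure)."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10.0 * math.log10(max(mean_square, 1e-12))


def target_volume_db(current_db: float, measured_db: float, reference_db: float) -> float:
    """Shift the current volume by the difference between measured and reference loudness."""
    return current_db + (measured_db - reference_db)


def step_toward(volume_db: float, target_db: float, max_step_db: float = 1.0) -> float:
    """Move the volume toward the target by at most `max_step_db` per call."""
    delta = target_db - volume_db
    if abs(delta) <= max_step_db:
        return target_db
    return volume_db + math.copysign(max_step_db, delta)
```

Calling `step_toward` repeatedly while the replacement content plays ramps the volume gradually instead of jumping, which is one plausible way to adjust "toward" the target.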

ADJUSTING AUDIO AND NON-AUDIO FEATURES BASED ON NOISE METRICS AND SPEECH INTELLIGIBILITY METRICS

Some implementations involve determining a noise metric and/or a speech intelligibility metric and determining a compensation process corresponding to the noise metric and/or the speech intelligibility metric. The compensation process may involve altering a processing of audio data and/or applying a non-audio-based compensation method. In some examples, altering the processing of the audio data does not involve applying a broadband gain increase to the audio signals. Some examples involve applying the compensation process in an audio environment. Other examples involve determining compensation metadata corresponding to the compensation process and transmitting an encoded content stream that includes encoded compensation metadata, encoded video data and encoded audio data from a first device to one or more other devices.

Implementation method and system of real-time subtitle in live broadcast and device

The present disclosure describes techniques of synchronizing subtitles in live broadcast. The disclosed techniques comprise obtaining a source signal and a simultaneous interpretation signal in a live broadcast; performing voice recognition on the simultaneous interpretation signal in real-time to obtain corresponding translation text; delaying the simultaneous interpretation signal to obtain a first delayed signal; delaying the source signal to obtain a second delayed signal; obtaining proofreading results of the first delayed signal and the corresponding translation text; determining proofread subtitles based on the proofreading results; and sending the proofread subtitles and the second delayed signal to a live display interface.
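The two delaying steps can be pictured as fixed-length delay lines: each incoming chunk is held for a set number of steps before it is released, which gives the proofreading stage time to finish. The sketch below is a minimal illustration; the `DelayLine` class and the idea of counting delay in chunks are assumptions, not details from the disclosure.

```python
from collections import deque
from typing import Any, Optional


class DelayLine:
    """Release each pushed item only after `delay` further items have arrived."""

    def __init__(self, delay: int) -> None:
        self.delay = delay
        self._buffer: deque = deque()

    def push(self, item: Any) -> Optional[Any]:
        """Store `item`; return the item pushed `delay` steps earlier, or None while filling."""
        self._buffer.append(item)
        if len(self._buffer) > self.delay:
            return self._buffer.popleft()
        return None
```

One instance would carry the interpretation signal to the proofreader and another would carry the source signal to the display; the two delays need not be equal.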

METADATA FOR DUCKING CONTROL

An audio encoding device and an audio decoding device are described herein. The audio encoding device may examine a set of audio channels/channel groups representing a piece of sound program content and produce a set of ducking values to associate with one of the channels/channel groups. During playback of the piece of sound program content, the ducking values may be applied to all other channels/channel groups. Application of these ducking values may cause (1) the reduction in dynamic range of ducked channels/channel groups and/or (2) movement of channels/channel groups in the sound field. This ducking may improve intelligibility of audio in the non-ducked channel/channel group. For instance, a narration channel/channel group may be more clearly heard by listeners through the use of selective ducking of other channels/channel groups during playback.
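The playback-side application of ducking values can be sketched as follows. This covers only a simple gain-based attenuation, under the assumption that ducking values are per-sample gain offsets in dB; the abstract does not specify their form, and the sound-field movement it mentions is not modeled. The function name and the channel dictionary layout are hypothetical.

```python
from typing import Dict, List


def apply_ducking(
    channels: Dict[str, List[float]],
    ducking_db: List[float],
    protected: str,
) -> Dict[str, List[float]]:
    """Attenuate every channel except `protected` by the given dB values."""
    gains = [10 ** (db / 20.0) for db in ducking_db]  # dB to linear gain
    ducked = {}
    for name, samples in channels.items():
        if name == protected:
            ducked[name] = list(samples)  # the non-ducked channel passes through
        else:
            ducked[name] = [s * g for s, g in zip(samples, gains)]
    return ducked
```

For example, with a `"narration"` channel as `protected` and a ducking value of -20 dB, a music sample of 1.0 is scaled to roughly 0.1 while the narration is unchanged.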

DECODER FOR DECODING A MEDIA SIGNAL AND ENCODER FOR ENCODING SECONDARY MEDIA DATA COMPRISING METADATA OR CONTROL DATA FOR PRIMARY MEDIA DATA
20180007398 · 2018-01-04

An encoder for encoding secondary media data including metadata and control data for primary media data is shown, wherein the encoder is configured to encode the secondary media data by adding redundancy or bandlimiting and wherein the encoder is configured to output the encoded secondary media data as a stream of digital words. Therefore, the stream of digital words may be formed such that it is capable of resisting typical processing of a digital audio stream. Furthermore, processors for processing a digital audio stream are able to process the stream of digital words, since the stream of digital words may be designed as an audio-like or analog-like digital stream.
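One way to picture "adding redundancy" so that a word stream survives ordinary audio processing is a repetition code over signed amplitudes. The sketch below is an assumption for illustration only, not the encoder of this application: each metadata bit becomes several identical positive or negative sample words, and the decoder recovers the bit from the sign of their sum, so a gain change or a single corrupted word per group does not flip it. The function names, `repeat`, and `amplitude` are hypothetical.

```python
from typing import List


def encode_words(data: bytes, repeat: int = 4, amplitude: int = 8192) -> List[int]:
    """Map each bit (MSB first) to `repeat` words of +amplitude (1) or -amplitude (0)."""
    words: List[int] = []
    for byte in data:
        for shift in range(7, -1, -1):
            bit = (byte >> shift) & 1
            words.extend([amplitude if bit else -amplitude] * repeat)
    return words


def decode_words(words: List[int], repeat: int = 4) -> bytes:
    """Recover bits from the sign of each group's sum, then pack them into bytes."""
    bits = [
        1 if sum(words[i:i + repeat]) > 0 else 0
        for i in range(0, len(words) - repeat + 1, repeat)
    ]
    out = bytearray()
    for i in range(0, len(bits) - 7, 8):
        byte = 0
        for bit in bits[i:i + 8]:
            byte = (byte << 1) | bit
        out.append(byte)
    return bytes(out)
```

Because only the sign of each group matters, halving every word (a 6 dB gain reduction) leaves the decoded bytes unchanged.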

Systems and Methods for Assessing Viewer Engagement
20180007431 · 2018-01-04

A system for quantifying viewer engagement with a video playing on a display includes at least one camera to acquire image data of a viewing area in front of the display. A microphone acquires audio data emitted by a speaker coupled to the display. The system also includes a memory to store processor-executable instructions and a processor. Upon execution of the processor-executable instructions, the processor receives the image data and the audio data and determines an identity of the video displayed on the display based on the audio data. The processor also estimates a first number of people present in the viewing area and a second number of people engaged with the video. The processor further quantifies the viewer engagement of the video based on the first number of people and the second number of people.
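The abstract does not say how the two counts are combined. A simple ratio of engaged viewers to viewers present is one plausible choice, sketched below as an assumption; the function name is hypothetical.

```python
def engagement_ratio(present: int, engaged: int) -> float:
    """Fraction of people in the viewing area who are engaged with the video."""
    if present < 0 or engaged < 0 or engaged > present:
        raise ValueError("counts must satisfy 0 <= engaged <= present")
    # An empty viewing area is reported as zero engagement rather than an error.
    return engaged / present if present else 0.0
```

With four people present and three engaged, this gives 0.75.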

PRESENTING MOBILE CONTENT BASED ON PROGRAMMING CONTEXT
20180011849 · 2018-01-11

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating search queries in response to obtaining audio samples on a client device. In one aspect, a method includes the actions of i) receiving audio data from a client device, ii) identifying specific content from captured media based on the received audio data, wherein the identified specific content is associated with the received audio data and the captured media includes at least one of audio media or audio-video media, iii) obtaining additional metadata associated with the identified content, iv) generating a search query based at least in part on the obtained additional metadata, and v) returning one or more search results to the client device, the one or more search results responsive to the search query and associated with the received audio data.