H04N21/8106

Real time popularity based audible content acquisition

A personalized news service provides personalized news programs for its users by generating personalized combinations of audible versions of news stories derived from text-based versions of the news stories. The audible versions may be generated from the text-based version by a text-to-speech system, or by recording a person reading aloud the text-based version. To acquire recordings, the personalized news service can make a determination that a particular news story has a threshold extent of popularity. The news service can then transmit a request to a remote recording station for a recording of a verbal reading of the particular news story. The news service can then receive the requested recording from the remote recording station.
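The popularity check described above can be sketched as follows. This is a minimal illustration only: the function name, the use of view counts as the popularity measure, and the `already_recorded` set are assumptions, not details taken from the abstract.

```python
from typing import Dict, List, Set


def stories_needing_recording(
    view_counts: Dict[str, int],
    threshold: int,
    already_recorded: Set[str],
) -> List[str]:
    """Return IDs of stories popular enough to request a human reading.

    Popularity is modelled here as a raw view count (an assumption);
    stories that already have a recording are skipped.
    """
    return sorted(
        story_id
        for story_id, views in view_counts.items()
        if views >= threshold and story_id not in already_recorded
    )


# Hypothetical data: only "s2" crosses the threshold without a recording.
requests = stories_needing_recording(
    {"s1": 120, "s2": 5400, "s3": 9000}, threshold=5000, already_recorded={"s3"}
)
```

Each returned ID would then be sent to the remote recording station as a request for a verbal reading.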

Signaling loudness adjustment for an audio scene
11595730 · 2023-02-28

Aspects of the disclosure include methods, apparatuses, and non-transitory computer-readable storage mediums for loudness adjustment for an audio scene associated with an MPEG-I immersive audio stream. One apparatus includes processing circuitry that receives a first syntax element indicating a number of sound signals included in the audio scene. The processing circuitry determines whether one or more speech signals are included in the sound signals indicated by the first syntax element. The processing circuitry determines a reference speech signal from the one or more speech signals based on the one or more speech signals being included in the sound signals. The processing circuitry adjusts a loudness level of the reference speech signal of the audio scene based on an anchor speech signal. The processing circuitry adjusts loudness levels of the sound signals based on the adjusted loudness level of the reference speech signal.
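One possible reading of the adjustment steps is sketched below. The abstract does not say how the reference speech signal is selected or how the other signals follow it, so this sketch assumes the first speech signal is the reference and that the same gain is applied to every signal; the class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SoundSignal:
    name: str
    loudness_db: float  # measured loudness in dB (unit is an assumption)
    is_speech: bool


def adjust_scene_loudness(signals: List[SoundSignal], anchor_speech_db: float) -> List[float]:
    """Return adjusted loudness levels for all signals in the scene.

    Assumption: the first speech signal serves as the reference, and the
    gain that aligns it with the anchor is applied to every signal.
    """
    reference = next((s for s in signals if s.is_speech), None)
    if reference is None:
        # No speech in the scene: leave the levels untouched.
        return [s.loudness_db for s in signals]
    gain_db = anchor_speech_db - reference.loudness_db
    return [s.loudness_db + gain_db for s in signals]


scene = [SoundSignal("dialog", -20.0, True), SoundSignal("music", -14.0, False)]
adjusted = adjust_scene_loudness(scene, anchor_speech_db=-23.0)
```

Applying one shared gain keeps the relative balance between speech and the other sounds while moving the speech to the anchor level.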

Intelligent Digital Interruption Management
20230056988 · 2023-02-23

The present invention is a system to manage interrupt notifications on an operating system based on the characteristics of content in which an end user is currently immersed or engaged. For example, relatively high bitrate video throughput is indicative of corresponding high information depth and more action occurring in the scene. For periods of high information depth, interrupt notifications are deferred until the information depth falls into a relative trough. Additional embodiments of the invention process scene transitions, technical cues, dialog and lyrics to release queued interrupt notifications at optimal times. A vamping process is also provided when interrupt notifications are released to keep the end user aware of the background application in which they were engaged prior to the interrupt notification coming into focus.
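The deferral behaviour can be illustrated with a small queue. This is a sketch under stated assumptions: a fixed bitrate threshold stands in for the "relative trough" (the abstract does not define how a trough is detected), and the class and method names are hypothetical.

```python
from collections import deque
from typing import Deque, List


class InterruptDeferrer:
    """Queue notifications while the video bitrate is high."""

    def __init__(self, trough_kbps: float) -> None:
        # Assumption: bitrate at or below this value counts as a trough.
        self.trough_kbps = trough_kbps
        self.queue: Deque[str] = deque()

    def notify(self, message: str, current_kbps: float) -> List[str]:
        """Queue a notification, then release whatever may be shown now."""
        self.queue.append(message)
        return self.on_bitrate(current_kbps)

    def on_bitrate(self, current_kbps: float) -> List[str]:
        """Release all queued notifications if the bitrate is in a trough."""
        if current_kbps > self.trough_kbps:
            return []
        released = list(self.queue)
        self.queue.clear()
        return released


deferrer = InterruptDeferrer(trough_kbps=2000)
during_action = deferrer.notify("mail", current_kbps=8000)  # deferred
still_action = deferrer.notify("chat", current_kbps=7500)   # deferred
at_trough = deferrer.on_bitrate(1500)                       # both released in order
```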

Content-modification system with volume level adjustment feature

In one aspect, a method includes receiving first content at a content-presentation device and presenting the first content, the first content comprising a first audio-content component. The content-presentation device may receive second content comprising a second audio-content component. The content-presentation device may determine a switch time at which to switch from presenting the first content to presenting the second content. During a first time interval prior to the switch time and ending at the switch time, the volume of the first audio-content component may be decreased to zero. At the switch time, the content-presentation device may switch from presenting the first content to presenting the second content. During a second time interval beginning at the switch time and ending at a second time after the switch time, the volume of the second audio-content component may be increased from zero to a non-zero volume level.
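The two volume ramps around the switch time can be sketched as below. The abstract only states that the first volume decreases to zero and the second increases from zero; the linear ramp shape, the function names, and the default volume of 1.0 are assumptions.

```python
def first_content_volume(t: float, switch_time: float, fade_out: float,
                         start_volume: float = 1.0) -> float:
    """Volume of the first audio component at time t (linear fade, assumed).

    fade_out is the length of the first interval, which ends at switch_time
    and must be greater than zero.
    """
    if t >= switch_time:
        return 0.0
    if t <= switch_time - fade_out:
        return start_volume
    return start_volume * (switch_time - t) / fade_out


def second_content_volume(t: float, switch_time: float, fade_in: float,
                          target_volume: float = 1.0) -> float:
    """Volume of the second audio component at time t (linear fade, assumed).

    fade_in is the length of the second interval, which begins at
    switch_time and must be greater than zero.
    """
    if t <= switch_time:
        return 0.0
    if t >= switch_time + fade_in:
        return target_volume
    return target_volume * (t - switch_time) / fade_in
```

With a switch at 10 s and 2 s ramps, the first component is at half volume at 9 s and silent at 10 s, and the second reaches half volume at 11 s and full volume at 12 s.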

Testing platform for HDMI enhanced audio return channel
11589036 · 2023-02-21

A bidirectional media communication channel testing platform includes an HDMI testing device including a video input port and an audio output port; a plurality of media streaming devices, each including a video transmission channel and an audio return channel; and a bidirectional switch including a video path and an audio path. The video path is configured to selectively couple the video input port of the HDMI testing device to a video transmission channel of a selected one of the plurality of media streaming devices, and the audio path is configured to concurrently couple the audio output port of the HDMI testing device to the audio return channel of each of the plurality of media streaming devices, regardless of a switching state of the video path of the bidirectional switch.

Identifying and removing restricted information from videos
11587591 · 2023-02-21

A video is provided to viewers using a web-based platform without restricted audio, such as a copyrighted soundtrack. To do so, a video comprising at least two audio layers is received. The audio layers can include separate and distinct audio layers or a mix of audio from separate sources. A restricted audio element is identified in a first audio layer and a speech element is identified in a second audio layer. A stitched text string can be generated by performing speech-to-text on both audio layers and removing the text corresponding to the restricted audio element of the second audio layer. When playing back the video, a portion of the video is muted based on the restricted audio element. A voice synthesizer is employed to generate audible sound during the muted portion using the stitched text string.

Systems and methods for determining whether to adjust volumes of individual audio components in a media asset based on a type of a segment of the media asset
11503379 · 2022-11-15

Systems and methods are provided herein for determining whether to adjust volumes of individual audio components in a media asset based on a type of segment of the media asset that is playing back. A media guidance application may determine that a user is playing back a segment of a media asset. The media guidance application may determine a type corresponding to the segment. The media guidance application may parse a plurality of audio components of the media asset that are playing back during the segment. The media guidance application may determine, for each audio component, whether to adjust the volume playing back during the segment based on the type. For each audio component of the plurality of audio components, in response to determining to adjust the volume, the media guidance application may adjust the volume of the audio component playing back during the segment.
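The per-component decision can be sketched as a lookup keyed by segment type. The segment types, component names, and gain values below are purely illustrative assumptions; the abstract does not specify any of them.

```python
from typing import Dict

# Hypothetical rules: which components to adjust for each segment type,
# expressed as a multiplicative gain.
GAIN_BY_SEGMENT_TYPE: Dict[str, Dict[str, float]] = {
    "dialogue": {"speech": 2.0, "music": 0.5},
    "action": {"effects": 0.5},
}


def adjust_component_volumes(segment_type: str,
                             volumes: Dict[str, float]) -> Dict[str, float]:
    """Return new volumes (0.0 to 1.0) for each audio component.

    Components with no rule for the given segment type are left unchanged.
    """
    gains = GAIN_BY_SEGMENT_TYPE.get(segment_type, {})
    return {
        name: min(1.0, volume * gains[name]) if name in gains else volume
        for name, volume in volumes.items()
    }


current = {"speech": 0.4, "music": 0.8, "effects": 0.6}
during_dialogue = adjust_component_volumes("dialogue", current)
```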

METHODS, SYSTEMS, AND MEDIA FOR PROVIDING DYNAMIC MEDIA SESSIONS WITH AUDIO STREAM EXPANSION FEATURES

Methods, systems, and media for providing dynamic media sessions with audio stream expansion features are provided. In some embodiments, the methods include: receiving an indication that audio content associated with a video content item is to be presented by a follower device synchronously with the audio content presented by a leader device; identifying candidate follower devices by determining whether devices connected to a local area network are capable of being designated as a follower device; causing a user interface to be presented that indicates each candidate follower device; receiving, via the user interface, a selection of one of the candidate follower devices; and transmitting, from the leader device to the selected follower device, control instructions that cause the audio content associated with the video content item to be presented synchronously by the selected follower device with the video content item presented by the leader device.

VIDEO PROCESSING METHOD AND APPARATUS, AND ELECTRONIC DEVICE AND STORAGE MEDIUM
20220358966 · 2022-11-10

The present invention provides a video processing method and apparatus, and an electronic device and a storage medium. The video processing method comprises: obtaining video materials; obtaining an audio material; determining music points of the audio material, and extracting video segments from each of the video materials according to the music points; stitching the extracted video segments to obtain a composite video; and adding the audio material to an audio track of the composite video to obtain a target video. The present invention improves the efficiency of producing a rhythmic video and reduces production costs.
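The step of cutting video segments to fit the music points can be sketched as follows. This is an illustration under assumptions: music points are given as timestamps in seconds, materials are used in round-robin order, and each clip is taken from the start of its material. None of these choices comes from the abstract, and the function name is hypothetical.

```python
from typing import List, Tuple


def plan_segments(music_points: List[float],
                  material_durations: List[float]) -> List[Tuple[int, float, float]]:
    """Plan one clip per interval between consecutive music points.

    Returns (material_index, clip_start, clip_end) tuples. A clip is cut
    short if its material is shorter than the interval it should fill.
    """
    plan = []
    intervals = zip(music_points, music_points[1:])
    for i, (begin, end) in enumerate(intervals):
        material = i % len(material_durations)  # round-robin (assumed)
        length = min(end - begin, material_durations[material])
        plan.append((material, 0.0, length))
    return plan


# Music points at 0, 2, 3.5 and 6 s; two materials of 10 s and 1 s.
plan = plan_segments([0.0, 2.0, 3.5, 6.0], [10.0, 1.0])
```

Stitching the planned clips in order and laying the audio material on the result's audio track would give the target video described above.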

Image and Audio Apparatus and Method

An apparatus including circuitry configured for causing audio processing to a spatial audio-visual representation of an image and sound apparatus, the spatial audio-visual representation being live or reproduced from recording; and modifying the audio processing applied to an audio-visually manipulated spatial section of the spatial audio-visual representation in response to information on a prior audio-visual manipulation with data processing in the audio-visually manipulated spatial section.