G10L19/008

SYSTEMS AND METHODS FOR IMPLEMENTING CROSS-FADING, INTERSTITIALS AND OTHER EFFECTS DOWNSTREAM
20180012611 · 2018-01-11 ·

Systems and methods are presented for cross-fading (or other multiple-clip processing) of information streams on a user or client device, such as a telephone, tablet, computer or MP3 player, or any consumer device with audio playback. Multiple-clip processing can be accomplished at the client end according to directions sent from a service provider that specify a combination of (i) the clips involved; (ii) the device on which the cross-fade or other processing is to occur and its parameters; and (iii) the service provider system. For example, a consumer device with only one decoder can use that decoder (typically hardware) to decompress, at faster than real time, one or more elements involved in a cross-fade, thus pre-fetching the next element(s) to be played in the cross-fade at the end of the currently playing element. The next element(s) can, for example, be stored in an input buffer, then decoded and stored in a decoded sample buffer, all prior to the required presentation time of the multiple-element effect. At the requisite time, a client device component can access the respective samples of the decoded audio clips as it performs the cross-fade, mix or other effect. Such exemplary embodiments use a single decoder and thus do not require synchronized simultaneous decodes.
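
The single-decoder flow described in this abstract can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration, not part of the application: `SingleDecoder` is a hypothetical stand-in for the device's one decoder (its "decompression" is a trivial byte-to-float mapping), the linear gain ramp is just one possible fade curve, and `crossfade` is a hypothetical helper. The point shown is only that both clips are decoded one after the other into sample buffers ahead of time, so the mix itself needs no simultaneous decodes.

```python
class SingleDecoder:
    """Hypothetical stand-in for the device's single decoder."""

    def decode(self, compressed):
        # Placeholder "decompression": each byte 0..255 becomes a sample in [0, 1].
        return [b / 255.0 for b in compressed]


def crossfade(outgoing, incoming, overlap):
    """Mix the last `overlap` samples of `outgoing` with the first
    `overlap` samples of `incoming` using a linear gain ramp (assumed curve)."""
    if overlap <= 0 or overlap > min(len(outgoing), len(incoming)):
        raise ValueError("overlap must be positive and fit inside both clips")
    start = len(outgoing) - overlap
    mixed = []
    for i in range(overlap):
        gain_in = (i + 1) / (overlap + 1)   # rises toward 1 across the overlap
        gain_out = 1.0 - gain_in            # falls toward 0 across the overlap
        mixed.append(outgoing[start + i] * gain_out + incoming[i] * gain_in)
    return outgoing[:start] + mixed + incoming[overlap:]


# One decoder, used sequentially: the next clip is decoded into its own
# sample buffer before the cross-fade is due.
decoder = SingleDecoder()
current_samples = decoder.decode(bytes([255, 255, 255, 255]))
next_samples = decoder.decode(bytes([0, 0, 0, 0]))
output = crossfade(current_samples, next_samples, overlap=2)
```

In a real device the second `decode` call would have to finish before the presentation time of the overlap, which is why the abstract stresses decoding at faster than real time.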

AUDIO ENCODER AND DECODER WITH DYNAMIC RANGE COMPRESSION METADATA

An audio processing unit (APU) is disclosed. The APU includes a buffer memory configured to store at least one frame of an encoded audio bitstream, where the encoded audio bitstream includes audio data and a metadata container. The metadata container includes a header and one or more metadata payloads after the header. The one or more metadata payloads include dynamic range compression (DRC) metadata, and the DRC metadata is or includes profile metadata indicative of whether the DRC metadata includes dynamic range compression (DRC) control values for use in performing dynamic range compression in accordance with at least one compression profile on audio content indicated by at least one block of the audio data.

Audio Signal Processing Apparatuses and Methods
20180012607 · 2018-01-11 ·

Audio signal processing apparatuses and methods are provided, such as an audio signal downmixing apparatus for processing an input audio signal into an output audio signal, wherein the input audio signal comprises a plurality of input channels recorded at a plurality of spatial positions and the output audio signal comprises a plurality of primary output channels. The audio signal downmixing apparatus comprises a downmix matrix determiner configured to determine for each frequency bin j of a plurality of frequency bins a downmix matrix D_U, with j being an integer in the range from 1 to N, and a processor configured to process the input audio signal using the downmix matrix D_U into the output audio signal.
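
As a rough illustration of applying one downmix matrix per frequency bin, the following Python sketch multiplies each bin's input-channel vector by that bin's matrix. The abstract does not say how the matrices are determined, so they are simply passed in here; the function name `downmix_per_bin`, the list-of-lists layout, and the example values are all assumptions made for illustration.

```python
def downmix_per_bin(spectra, matrices):
    """Apply one downmix matrix per frequency bin.

    spectra[j]  : list of Q (possibly complex) input-channel values for bin j
    matrices[j] : P x Q matrix (list of P rows) assumed for bin j
    Returns a list of N output vectors, each with P primary output channels.
    """
    if len(spectra) != len(matrices):
        raise ValueError("need exactly one matrix per frequency bin")
    output = []
    for x, d in zip(spectra, matrices):
        if any(len(row) != len(x) for row in d):
            raise ValueError("matrix width must match the number of input channels")
        # Matrix-vector product for this bin: y = D x
        output.append([sum(row[q] * x[q] for q in range(len(x))) for row in d])
    return output


# Two bins, three input channels, two output channels (illustrative values).
spectra = [[1.0, 2.0, 3.0], [1j, 0.0, -1j]]
matrices = [
    [[1.0, 0.0, 0.5], [0.0, 1.0, 0.5]],
    [[0.5, 0.5, 0.0], [0.0, 0.5, 0.5]],
]
result = downmix_per_bin(spectra, matrices)
```

Keeping a separate matrix for every bin is what lets the downmix vary with frequency; a single broadband matrix would be the special case where all entries of `matrices` are identical.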

ADAPTIVE AUDIO CONSTRUCTION

Described herein is a method for creating an object-based audio signal from an audio input, the audio input including one or more audio channels that are recorded to collectively define an audio scene. The one or more audio channels are captured from a respective one or more spatially separated microphones disposed in a stable spatial configuration. The method includes the steps of: a) receiving the audio input; b) performing spatial analysis on the one or more audio channels to identify one or more audio objects within the audio scene; c) determining contextual information relating to the one or more audio objects; d) defining respective audio streams including audio data relating to at least one of the identified one or more audio objects; and e) outputting an object-based audio signal including the audio streams and the contextual information.

AUDIO METADATA PROVIDING APPARATUS AND METHOD, AND MULTICHANNEL AUDIO DATA PLAYBACK APPARATUS AND METHOD TO SUPPORT DYNAMIC FORMAT CONVERSION

An audio metadata providing apparatus and method and a multichannel audio data playback apparatus and method to support a dynamic format conversion are provided. Dynamic format conversion information may include information about a plurality of format conversion schemes that are used to convert a first format set by an author of multichannel audio data into a second format that is based on a playback environment of the multichannel audio data and that are each set for corresponding playback periods of the multichannel audio data. The audio metadata providing apparatus may provide audio metadata including the dynamic format conversion information. The multichannel audio data playback apparatus may identify the dynamic format conversion information from the audio metadata, may convert the first format of the multichannel audio data into the second format based on the identified dynamic format conversion information, and may play back the multichannel audio data in the second format.
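
One way to picture "format conversion schemes ... each set for corresponding playback periods" is a lookup from playback time to scheme. The sketch below is only an assumed data layout, not the metadata format of the application: the `(start_seconds, scheme_name)` pairs, the scheme names, and the `scheme_at` helper are hypothetical.

```python
from bisect import bisect_right

# Hypothetical dynamic format conversion information: each entry marks the
# start of a playback period and the conversion scheme that applies from there.
PERIODS = [
    (0.0, "scheme_a"),
    (30.0, "scheme_b"),
    (90.0, "scheme_a"),
]


def scheme_at(periods, playback_time):
    """Return the conversion scheme whose playback period contains `playback_time`.

    `periods` must be sorted by start time; each period runs until the next starts.
    """
    starts = [start for start, _ in periods]
    index = bisect_right(starts, playback_time) - 1
    if index < 0:
        raise ValueError("playback_time precedes the first playback period")
    return periods[index][1]


current_scheme = scheme_at(PERIODS, 45.0)
```

A playback apparatus following this idea would call such a lookup as playback advances and switch its first-to-second format conversion whenever the returned scheme changes.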

System for maintaining reversible dynamic range control information associated with parametric audio coders

On the basis of a bitstream (P), an n-channel audio signal (X) is reconstructed by deriving an m-channel core signal (Y) and multichannel coding parameters (α) from the bitstream, where 1≤m<n. Also derived from the bitstream are pre-processing dynamic range control, DRC, parameters (DRC2) quantifying an encoder-side dynamic range limiting of the core signal. The n-channel audio signal is obtained by parametric synthesis in accordance with the multichannel coding parameters and while cancelling any encoder-side dynamic range limiting based on the pre-processing DRC parameters. In particular embodiments, the reconstruction further includes use of compensated post-processing DRC parameters quantifying a potential decoder-side dynamic range compression. Cancellation of an encoder-side range limitation and range compression are preferably performed by different decoder-side components. Cancellation and compression may be coordinated by a DRC pre-processor.
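
The cancellation step can be sketched under a strong simplifying assumption: that the pre-processing DRC parameters amount to one gain in dB per frame that the encoder applied to the core signal. The abstract does not specify the parameter format, so the per-frame dB representation and the names `db_to_linear` and `cancel_encoder_limiting` are hypothetical; parametric synthesis to n channels is omitted entirely.

```python
def db_to_linear(gain_db):
    """Convert an amplitude gain in decibels to a linear factor."""
    return 10.0 ** (gain_db / 20.0)


def cancel_encoder_limiting(core_frames, limiting_gains_db):
    """Undo an assumed per-frame encoder-side gain on the core signal.

    core_frames       : list of frames, each a list of samples
    limiting_gains_db : assumed gain (in dB) the encoder applied to each frame
    """
    if len(core_frames) != len(limiting_gains_db):
        raise ValueError("need one gain value per frame")
    restored = []
    for frame, gain_db in zip(core_frames, limiting_gains_db):
        inverse = 1.0 / db_to_linear(gain_db)  # inverse of the encoder gain
        restored.append([sample * inverse for sample in frame])
    return restored


# A frame attenuated by 6 dB at the encoder is scaled back up; an untouched
# frame (0 dB) passes through unchanged.
frames = [[0.25, -0.25], [0.1, 0.2]]
restored = cancel_encoder_limiting(frames, [-6.0, 0.0])
```

Because the applied gain travels with the bitstream, the limiting is reversible, which is the property the title refers to.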
