G10L19/24

INTEGRATION OF HIGH FREQUENCY AUDIO RECONSTRUCTION TECHNIQUES

A method for decoding an encoded audio bitstream is disclosed. The method includes receiving the encoded audio bitstream and decoding the audio data to generate a decoded lowband audio signal. The method further includes extracting high frequency reconstruction metadata and filtering the decoded lowband audio signal with an analysis filterbank to generate a filtered lowband audio signal. The method also includes extracting a flag indicating whether either spectral translation or harmonic transposition is to be performed on the audio data and regenerating a highband portion of the audio signal using the filtered lowband audio signal and the high frequency reconstruction metadata in accordance with the flag. The high frequency regeneration is performed as a post-processing operation with a delay of 3010 samples per audio channel.
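
The flag-controlled regeneration step can be illustrated with a toy sketch. It assumes real-valued subband magnitudes held in plain lists, whereas an actual decoder would work on filterbank subband samples. The function names, the `gains` list standing in for the high frequency reconstruction metadata, and the transposition factor of 2 are all hypothetical and do not come from the abstract.

```python
from typing import List


def spectral_translation(low: List[float], n_high: int) -> List[float]:
    """Copy-up patching: highband slot i reuses lowband subband i (wrapping)."""
    return [low[i % len(low)] for i in range(n_high)]


def harmonic_transposition(low: List[float], n_high: int, factor: int = 2) -> List[float]:
    """Frequency stretching: absolute band t is sourced from lowband band t // factor."""
    n_low = len(low)
    high = []
    for i in range(n_high):
        src = (n_low + i) // factor
        # Bands with no lowband source are left empty in this toy model.
        high.append(low[src] if src < n_low else 0.0)
    return high


def regenerate_highband(low: List[float], gains: List[float], use_harmonic: bool) -> List[float]:
    """Pick the patching method from the flag, then shape it with per-band gains."""
    if not low:
        raise ValueError("lowband must contain at least one subband")
    patch = harmonic_transposition if use_harmonic else spectral_translation
    high = patch(low, len(gains))
    shaped = [g * h for g, h in zip(gains, high)]
    return list(low) + shaped


if __name__ == "__main__":
    lowband = [1.0, 2.0, 3.0, 4.0]
    envelope = [1.0, 1.0, 0.5, 0.5]  # hypothetical stand-in for the metadata
    print(regenerate_highband(lowband, envelope, use_harmonic=False))
    print(regenerate_highband(lowband, envelope, use_harmonic=True))
```

Both branches share the same gain-shaping step, so the flag only changes which lowband subbands feed each highband slot.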

DETERMINATION OF SPATIAL AUDIO PARAMETER ENCODING AND ASSOCIATED DECODING
20220343928 · 2022-10-27

An apparatus comprising means configured to: generate spatial audio signal directional metadata parameters for a block of time-frequencies; generate encoded spatial audio signal directional metadata parameters (108) for the block of time-frequencies based on a first quantization resolution (203); compare the number of bits used for the encoded parameters (108) at the first quantization resolution against a determined number of bits; output or store the encoded parameters (108) based on the first quantization resolution when the number of bits used at the first quantization resolution is less than the determined number of bits (217); generate encoded parameters (108) for the block of time-frequencies based on a second quantization resolution when the number of bits used at the first quantization resolution is more than the determined number of bits and the difference between the determined number of bits and the number of bits used at the first quantization resolution is within a determined threshold (217); and generate encoded parameters (108) for the block of time-frequencies based on a third quantization resolution when the number of bits used at the first quantization resolution is more than the determined number of bits and that difference is greater than the determined threshold, wherein the third quantization resolution is determined such that the number of bits used for the encoded parameters at the third quantization resolution is always equal to or less than the determined number of bits (217).
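
The three-way choice of quantization resolution can be sketched as follows. This is a simplified model, assuming a fixed number of bits per parameter rather than entropy-coded output. The function name `select_resolution`, the default bit widths, and the treatment of an exact budget match as fitting are assumptions made for illustration only.

```python
from typing import Tuple


def select_resolution(
    n_params: int,
    budget: int,
    threshold: int,
    first_bits: int = 9,   # hypothetical bits per parameter, first resolution
    second_bits: int = 7,  # hypothetical bits per parameter, second resolution
) -> Tuple[str, int, int]:
    """Return (resolution name, bits per parameter, total bits used)."""
    if n_params <= 0:
        raise ValueError("n_params must be positive")

    used = n_params * first_bits
    if used <= budget:
        # First resolution already fits; equality is treated as fitting here.
        return ("first", first_bits, used)

    overshoot = used - budget
    if overshoot <= threshold:
        # Small overshoot: fall back to the second, coarser resolution.
        # Nothing in this sketch guarantees that the second pass fits.
        return ("second", second_bits, n_params * second_bits)

    # Large overshoot: derive a resolution from the budget itself, so the
    # result can never exceed the budget.
    third_bits = budget // n_params
    return ("third", third_bits, n_params * third_bits)


if __name__ == "__main__":
    for budget in (100, 85, 60):
        print(budget, select_resolution(n_params=10, budget=budget, threshold=10))
```

Deriving the third resolution from the budget is what makes the "always equal to or less than" guarantee hold by construction in this model.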

Backward-compatible integration of harmonic transposer for high frequency reconstruction of audio signals

A method for decoding an encoded audio bitstream is disclosed. The method includes receiving the encoded audio bitstream and decoding the audio data to generate a decoded lowband audio signal. The method further includes extracting high frequency reconstruction metadata and filtering the decoded lowband audio signal with an analysis filterbank to generate a filtered lowband audio signal. The method also includes extracting a flag indicating whether either spectral translation or harmonic transposition is to be performed on the audio data and regenerating a highband portion of the audio signal using the filtered lowband audio signal and the high frequency reconstruction metadata in accordance with the flag.

METHODS AND APPARATUS FOR DECODING A COMPRESSED HOA SIGNAL

Methods and apparatus for decoding a compressed Higher Order Ambisonics (HOA) representation of a sound or soundfield. The method may include receiving a bitstream containing the compressed HOA representation and decoding, based on a determination that there are multiple layers, the compressed HOA representation from the bitstream to obtain a sequence of decoded HOA representations. A first subset of the sequence of decoded HOA representations is determined based only on corresponding ambient HOA components. A second subset of the sequence of decoded HOA representations is determined based on corresponding ambient HOA components and corresponding predominant sound components. For a frame k, the sequence of decoded HOA representations is represented at least in part by

\[
\hat{c}_n(k-1) =
\begin{cases}
\hat{c}_{\mathrm{AMB},n}(k-1), & \text{for } n \text{ in the first subset} \\
\hat{c}_{\mathrm{PS},n}(k-1) + \hat{c}_{\mathrm{AMB},n}(k-1), & \text{for } n \text{ in the second subset}
\end{cases}
\]

where $\hat{c}_{\mathrm{AMB},n}(k-1)$ corresponds to the corresponding ambient HOA components and $\hat{c}_{\mathrm{PS},n}(k-1)$ corresponds to the corresponding predominant sound components.

SUPPORT FOR GENERATION OF COMFORT NOISE, AND GENERATION OF COMFORT NOISE

A method for generation of comfort noise for at least two audio channels. The method comprises determining a spatial coherence between audio signals on the respective audio channels, wherein at least one spatial coherence value per frame and frequency band is determined to form a vector of spatial coherence values. A vector of predicted spatial coherence values is formed by a weighted combination of a first coherence prediction and a second coherence prediction that are combined using a weight factor α. The method comprises signaling information about the weight factor α to a receiving node, for enabling the generation of the comfort noise for the at least two audio channels at the receiving node.
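
The weighted combination of the two coherence predictions can be sketched as below. The convex form α·first + (1 − α)·second is an assumption, since the abstract only states that a weight factor α is used. The helper `pick_alpha`, which chooses α from a candidate set by squared error, is likewise hypothetical.

```python
from typing import List, Sequence


def combine_predictions(first: Sequence[float], second: Sequence[float], alpha: float) -> List[float]:
    """Per-band blend of two coherence predictions (assumed convex combination)."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if len(first) != len(second):
        raise ValueError("prediction vectors must have the same length")
    return [alpha * a + (1.0 - alpha) * b for a, b in zip(first, second)]


def pick_alpha(
    actual: Sequence[float],
    first: Sequence[float],
    second: Sequence[float],
    candidates: Sequence[float],
) -> float:
    """Hypothetical sender-side choice: the candidate with the lowest squared error."""
    def error(alpha: float) -> float:
        predicted = combine_predictions(first, second, alpha)
        return sum((p - c) ** 2 for p, c in zip(predicted, actual))

    return min(candidates, key=error)


if __name__ == "__main__":
    first_pred = [0.8, 0.6]   # e.g. one prediction per frequency band
    second_pred = [0.4, 0.2]
    measured = [0.8, 0.6]
    alpha = pick_alpha(measured, first_pred, second_pred, [0.0, 0.5, 1.0])
    print(alpha, combine_predictions(first_pred, second_pred, alpha))
```

In this sketch only the chosen α would need to be signaled, because a receiver that can form the same two predictions can rebuild the blended vector.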

AUTOMATIC DISCOVERY AND LOCALIZATION OF VOICE DEGRADATION FAULTS USING ULTRASOUND TECHNIQUES
20220337442 · 2022-10-20

A method comprises, at a local participant device, establishing audio connections with remote participant devices over a network for an online voice conference. The method includes generating ultrasound signals for corresponding ones of the remote participant devices, and transmitting the ultrasound signals over corresponding ones of the audio connections. The method further includes collecting indications, transmitted by corresponding ones of the remote participant devices over the network, that indicate whether the corresponding ones of the remote participant devices detected the ultrasound signals. The method includes identifying which of the remote participant devices detected the ultrasound signals based on the indications, and localizing degraded voice quality to particular ones of the local participant device and the remote participant devices based, at least in part, on results of identifying.
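
One way to read the localization step is sketched below. The decision rule is an assumption, not taken from the abstract: if no remote device detects the ultrasound signal, the local device is suspected, and otherwise the remote devices that missed it are suspected. The function name and the device identifiers are hypothetical.

```python
from typing import Dict, List


def localize_fault(indications: Dict[str, bool]) -> List[str]:
    """Map per-remote detection reports to a list of suspected devices.

    `indications` maps a remote device id to True when that device reported
    detecting the ultrasound signal sent over its audio connection.
    """
    if not indications:
        return []  # nothing was reported, so nothing can be localized

    missed = sorted(dev for dev, detected in indications.items() if not detected)
    if not missed:
        return []  # every remote heard the probe: no fault found
    if len(missed) == len(indications):
        return ["local"]  # nobody heard it: suspect the sending side
    return missed  # only some missed it: suspect those remotes


if __name__ == "__main__":
    print(localize_fault({"alice": True, "bob": True}))
    print(localize_fault({"alice": True, "bob": False}))
    print(localize_fault({"alice": False, "bob": False}))
```

A fuller treatment would also need to separate device faults from network path faults, which this sketch does not attempt.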

Audio Transcoding Method and Apparatus, Audio Transcoder, Device, and Storage Medium

Provided is an audio transcoding method, including: (301) performing entropy decoding on a first audio stream with a first bitrate, to obtain an audio feature parameter and an excitation signal of the first audio stream, the excitation signal being a quantized audio signal; (302) obtaining a time-domain audio signal corresponding to the excitation signal based on the audio feature parameter and the excitation signal; (303) re-quantizing the excitation signal and the audio feature parameter based on the time-domain audio signal and a target transcoding bitrate, to obtain a target excitation signal and a target audio feature parameter; and (304) performing entropy coding on the target audio feature parameter and the target excitation signal, to obtain a second audio stream with a second bitrate, the second bitrate being lower than the first bitrate.