G10L19/173

Optimized Audio Forwarding
20220038818 · 2022-02-03 ·

Methods and systems for optimizing the routing of audio data to audio transmitting devices over a Bluetooth network are disclosed. One method includes receiving, at a first speaker of an audio rendering system, an encoded audio bitstream comprising a first and a second audio channel; separating a first set of spectral components of the first audio channel and a second set of spectral components of the second audio channel from the encoded audio bitstream without decoding the audio bitstream; generating a first encoded bitstream from the first set of spectral components; and forwarding the first encoded bitstream to a second speaker of the audio rendering system over a wireless link.

Audio metadata providing apparatus and method, and multichannel audio data playback apparatus and method to support dynamic format conversion

An audio metadata providing apparatus and method and a multichannel audio data playback apparatus and method that support dynamic format conversion are provided. Dynamic format conversion information may include information about a plurality of format conversion schemes, each set for a corresponding playback period of the multichannel audio data, that are used to convert a first format set by an author of the multichannel audio data into a second format based on a playback environment of the multichannel audio data. The audio metadata providing apparatus may provide audio metadata including the dynamic format conversion information. The multichannel audio data playback apparatus may identify the dynamic format conversion information from the audio metadata, may convert the first format of the multichannel audio data into the second format based on the identified dynamic format conversion information, and may play back the multichannel audio data in the second format.
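The per-period lookup described above can be sketched as follows. This is a minimal illustration, not the claimed apparatus: the names ConversionPeriod, scheme_at and convert_frame are hypothetical, and representing each format conversion scheme as a downmix matrix is an assumption made here for concreteness.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ConversionPeriod:
    """One metadata entry: a conversion scheme valid for a playback period.

    Assumption: the scheme is a gain matrix (rows = output channels,
    columns = input channels). The abstract does not specify this.
    """
    start_s: float
    end_s: float
    matrix: List[List[float]]


def scheme_at(periods: List[ConversionPeriod], t: float) -> ConversionPeriod:
    """Return the conversion period that covers playback time t (seconds)."""
    for period in periods:
        if period.start_s <= t < period.end_s:
            return period
    raise ValueError(f"no conversion scheme covers t={t}")


def convert_frame(matrix: List[List[float]], frame: List[float]) -> List[float]:
    """Apply the gain matrix to one multichannel sample frame."""
    return [sum(g * x for g, x in zip(row, frame)) for row in matrix]


# Hypothetical metadata: 3 input channels (L, R, C) converted to stereo.
periods = [
    # 0-10 s: fold the centre channel into both outputs.
    ConversionPeriod(0.0, 10.0, [[1.0, 0.0, 0.707], [0.0, 1.0, 0.707]]),
    # 10-20 s: drop the centre channel.
    ConversionPeriod(10.0, 20.0, [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]),
]

frame = [1.0, 0.0, 1.0]
early = convert_frame(scheme_at(periods, 1.0).matrix, frame)
late = convert_frame(scheme_at(periods, 12.0).matrix, frame)
```

The same input frame is converted differently depending on the playback time, which is the behaviour the dynamic format conversion information is meant to carry.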

Frame coding for spatial audio data

The techniques disclosed herein provide apparatuses and related methods for the communication of spatial audio and related metadata. In some implementations, a source provides prerecorded spatial audio that has embedded metadata. A computing device processes the prerecorded spatial audio to generate an audio codec that is segmented to include a first section of audio data and a second section that includes metadata extracted from the prerecorded spatial audio. The generated audio codec may be received by a device that includes an encoder. The encoder may process the generated audio codec to generate audio data that includes the metadata.

Speech transcoding in packet networks

Speech transcoding in packet networks may be useful when both the incoming and outgoing speech streams of the transcoding entity are packet based. The transcoding entity can be any entity having packet interfaces. A method can include omitting jitter buffering before decoding in a transcoder and omitting bad frame handling in a decoding stage of the transcoder. The method can also include freezing the decoder and the encoder when a packet is not received. The method can also include sending packet loss information from the decoder to the encoder as side information when the packet is not received. The method can further include setting an outgoing packet stream to permit detection of missing packets by a downstream decoder upon receiving a valid packet after the packet is not received.
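A minimal sketch of the loss handling described above. The names Packet and transcode, and the decode and encode callables, are hypothetical. Leaving a gap in the outgoing sequence numbers is one assumed way to let a downstream decoder detect the loss; the abstract does not say how the outgoing stream is set.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Packet:
    seq: int
    payload: bytes


def transcode(
    slots: List[Optional[Packet]],
    decode: Callable[[bytes], list],
    encode: Callable[[list, int], bytes],
) -> List[Packet]:
    """Transcode one packet per frame slot; None marks a packet not received.

    No jitter buffer and no bad-frame concealment: a missing packet simply
    freezes both codec stages for that slot.
    """
    out: List[Packet] = []
    out_seq = 0
    lost = 0
    for pkt in slots:
        if pkt is None:
            lost += 1      # freeze: neither decode nor encode is called
            out_seq += 1   # assumption: skip a sequence number, send nothing
            continue
        pcm = decode(pkt.payload)
        # Side information: number of frames lost just before this one.
        payload = encode(pcm, lost)
        out.append(Packet(out_seq, payload))
        out_seq += 1
        lost = 0
    return out


# Toy codecs for illustration only.
decode_calls = []


def toy_decode(payload: bytes) -> list:
    decode_calls.append(payload)
    return list(payload)


def toy_encode(pcm: list, lost: int) -> bytes:
    return bytes(pcm) + bytes([lost])


sent = transcode([Packet(0, b"\x01"), None, Packet(2, b"\x03")],
                 toy_decode, toy_encode)
```

In this sketch the downstream decoder sees sequence numbers 0 and 2, so it can run its own concealment for the missing frame instead of the transcoder doing so.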

Multi-Channel Signal Encoding Method, Multi-Channel Signal Decoding Method, Encoder, and Decoder
20220046376 · 2022-02-10 ·

A multi-channel signal encoding method includes determining a downmixed signal of a first channel signal and a second channel signal in a multi-channel signal, and reverberation gain parameters corresponding to different subbands of the first channel signal and the second channel signal, where the reverberation gain parameters belong to at least two reverberation gain parameter groups. The method further includes selecting a target reverberation gain parameter group from the at least two reverberation gain parameter groups. The method further includes generating parameter indication information, where the parameter indication information indicates the target reverberation gain parameter group. The method further includes encoding the reverberation gain parameters corresponding to the target reverberation gain parameter group, the parameter indication information, and the downmixed signal to obtain a bitstream.

SOUND QUALITY DETECTION METHOD AND DEVICE FOR HOMOLOGOUS AUDIO AND STORAGE MEDIUM
20220230645 · 2022-07-21 ·

Provided are a method for detecting the tone quality of homologous audio, a device, and a storage medium, which belong to the technical field of audio. The method comprises: acquiring a plurality of audio files to be detected that belong to homologous audio files (101); extracting features of each audio file of the plurality of audio files to obtain at least one audio feature of each audio file, and generating a correspondence list between the at least one audio feature of each audio file and the audio file identifier (102); and, on the basis of the correspondence list between the at least one audio feature of the plurality of audio files and the audio file identifiers, determining a tone quality score of each audio file of the plurality of audio files through a tone quality detection model (103). Tone quality detection of the homologous audio files is thereby achieved, which makes it convenient to store, acquire and manage the homologous audio files according to tone quality and reduces the cost of storing, acquiring and managing them.

Apparatus and method for processing an encoded audio signal
20210383818 · 2021-12-09 ·

An apparatus for processing an encoded audio signal, which includes a sequence of access units, each access unit including a core signal with a first spectral width and parameters describing a spectrum above the first spectral width, has: a demultiplexer generating, from an access unit of the encoded audio signal, the core signal and a set of the parameters; an upsampler upsampling the core signal of the access unit and outputting a first upsampled spectrum and a temporally consecutive second upsampled spectrum, both having the same content as the core signal and a second spectral width greater than the first spectral width of the core spectrum; a parameter converter converting parameters of the set of parameters of the access unit to obtain converted parameters; and a spectral gap filling processor processing the first upsampled spectrum and the second upsampled spectrum using the converted parameters.

DIRECTIONAL LOUDNESS MAP BASED AUDIO PROCESSING
20210383820 · 2021-12-09 ·

An audio analyzer is configured to obtain spectral domain representations of two or more input audio signals. Additionally, the audio analyzer is configured to obtain directional information associated with spectral bands of the spectral domain representations and to obtain loudness information associated with different directions as an analysis result. Contributions to the loudness information are determined in dependence on the directional information.
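A rough sketch of the idea for two input signals. Several things here are assumptions that the abstract does not state: the directional information is taken to be a level-based panning index per spectral band, band energy stands in for a perceptual loudness measure, and the function name directional_loudness_map is hypothetical.

```python
from typing import List


def directional_loudness_map(
    left_bands: List[float],
    right_bands: List[float],
    n_directions: int = 5,
) -> List[float]:
    """Accumulate per-band energy into direction bins.

    left_bands / right_bands: per-band magnitudes of the two input signals.
    Returns one accumulated value per direction, ordered left to right.
    """
    dmap = [0.0] * n_directions
    for l_mag, r_mag in zip(left_bands, right_bands):
        energy = l_mag * l_mag + r_mag * r_mag
        if energy == 0.0:
            continue  # a silent band contributes to no direction
        # Panning index: 0.0 = fully left, 1.0 = fully right (assumption).
        pan = (r_mag * r_mag) / energy
        idx = min(int(pan * n_directions), n_directions - 1)
        # Energy is used here as a simple stand-in for loudness.
        dmap[idx] += energy
    return dmap


# Three bands: hard left, hard right, and centred.
dl_map = directional_loudness_map([1.0, 0.0, 1.0], [0.0, 1.0, 1.0],
                                  n_directions=3)
```

Each band contributes only to the direction bin selected by its directional information, which mirrors the last sentence of the abstract.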

Methods, Encoder And Decoder For Linear Predictive Encoding And Decoding Of Sound Signals Upon Transition Between Frames Having Different Sampling Rates
20210375296 · 2021-12-02 ·

Methods, an encoder and a decoder are configured for transition between frames with different internal sampling rates. Linear predictive (LP) filter parameters are converted from a sampling rate S1 to a sampling rate S2. A power spectrum of an LP synthesis filter is computed, at the sampling rate S1, using the LP filter parameters. The power spectrum of the LP synthesis filter is modified to convert it from the sampling rate S1 to the sampling rate S2. The modified power spectrum of the LP synthesis filter is inverse transformed to determine autocorrelations of the LP synthesis filter at the sampling rate S2. The autocorrelations are used to compute the LP filter parameters at the sampling rate S2.
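The four steps above can be sketched in plain Python. This is an illustration under stated assumptions, not the patented implementation: the spectrum is modified by truncating it when S2 < S1 and by repeating its last value when S2 > S1, the spectrum is sampled at mid-bin frequencies so that a cosine sum serves as the inverse transform, and all function names are hypothetical. The Levinson-Durbin recursion is the standard way to obtain LP coefficients from autocorrelations.

```python
import cmath
import math


def power_spectrum(a, n_bins):
    """Sample 1/|A(e^{jw})|^2 at n_bins mid-bin frequencies in (0, pi)."""
    spec = []
    for k in range(n_bins):
        w = math.pi * (k + 0.5) / n_bins
        resp = sum(c * cmath.exp(-1j * w * i) for i, c in enumerate(a))
        spec.append(1.0 / abs(resp) ** 2)
    return spec


def modify_spectrum(spec, s1, s2):
    """Assumed modification: truncate (S2 < S1) or extend (S2 > S1)."""
    n_new = max(1, round(len(spec) * s2 / s1))
    if n_new <= len(spec):
        return spec[:n_new]
    return spec + [spec[-1]] * (n_new - len(spec))


def autocorrelation(spec, order):
    """Inverse transform of the symmetric power spectrum (cosine sum)."""
    n = len(spec)
    return [
        sum(p * math.cos(m * math.pi * (k + 0.5) / n)
            for k, p in enumerate(spec)) / n
        for m in range(order + 1)
    ]


def levinson_durbin(r, order):
    """Solve for LP coefficients a[0..order] (a[0] = 1) from autocorrelations."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return a


def convert_lp(a, s1, s2, n_bins=256):
    """Convert LP coefficients of A(z) from sampling rate s1 to s2."""
    order = len(a) - 1
    spec = modify_spectrum(power_spectrum(a, n_bins), s1, s2)
    return levinson_durbin(autocorrelation(spec, order), order)


lp_16k = [1.0, -1.2, 0.5]                      # example stable filter
lp_same = convert_lp(lp_16k, 16000, 16000)     # sanity check: no rate change
lp_12k8 = convert_lp(lp_16k, 16000, 12800)     # downward conversion
```

With S1 equal to S2 the sketch returns the input coefficients up to rounding error, which is a convenient sanity check before trying an actual rate change.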

AUDIO CODEC EXTENSION
20220165281 · 2022-05-26 ·

An apparatus comprising means configured to: receive a primary track comprising at least one audio signal; receive at least one secondary track, each of the at least one secondary track comprising at least one audio signal, wherein the at least one secondary track is based on the primary track; and decode and render the primary track and the at least one secondary track using spatial audio decoding.