G10L19/04

MACHINE LEARNING-BASED KEY GENERATION FOR KEY-GUIDED AUDIO SIGNAL TRANSFORMATION
20230186926 · 2023-06-15 ·

A method comprises: receiving input audio and target audio having a target audio characteristic; using a first neural network, trained to generate key parameters that represent the target audio characteristic based on one or more of the target audio and the input audio, generating the key parameters; and configuring a second neural network, trained to be configured by the key parameters, with the key parameters to cause the second neural network to perform a signal transformation of the input audio, to produce output audio having an output audio characteristic that corresponds to and matches the target audio characteristic.

Audio decoding device and method with decoding branches for decoding audio signal encoded in a plurality of domains

An audio encoder has a first, information-sink-oriented encoding branch such as a spectral domain encoding branch; a second, information-source-oriented or SNR-oriented encoding branch such as an LPC-domain encoding branch; and a switch for switching between the first and second encoding branches. The second encoding branch has a converter into a specific domain different from the spectral domain, such as an LPC analysis stage generating an excitation signal. It also has a specific domain coding branch such as an LPC domain processing branch, a specific spectral domain coding branch such as an LPC spectral domain processing branch, and an additional switch for switching between the specific domain coding branch and the specific spectral domain coding branch. An audio decoder has a first domain decoder, a second domain decoder, and a third domain decoder, as well as two cascaded switches for switching between the decoders.

Speech coding using auto-regressive generative neural networks

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for coding speech using neural networks. One of the methods includes obtaining a bitstream of parametric coder parameters characterizing spoken speech; generating, from the parametric coder parameters, a conditioning sequence; generating a reconstruction of the spoken speech that includes a respective speech sample at each of a plurality of decoder time steps, comprising, at each decoder time step: processing a current reconstruction sequence using an auto-regressive generative neural network, wherein the auto-regressive generative neural network is configured to process the current reconstruction to compute a score distribution over possible speech sample values, and wherein the processing comprises conditioning the auto-regressive generative neural network on at least a portion of the conditioning sequence; and sampling a speech sample from the possible speech sample values.
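The decoding loop described above can be sketched as follows. This is a minimal illustration only: `toy_score_distribution` is a hypothetical stand-in for the auto-regressive generative neural network, and the 8-level sample alphabet and the one-value-per-step conditioning sequence are assumptions made for brevity, not details from the abstract.

```python
import math
import random


def toy_score_distribution(history, cond, num_levels=8):
    """Hypothetical stand-in for the network: probabilities over sample values.

    The distribution is centred between the previous sample and the
    conditioning value for this time step.
    """
    prev = history[-1] if history else num_levels // 2
    center = 0.5 * prev + 0.5 * cond
    logits = [-(v - center) ** 2 for v in range(num_levels)]
    peak = max(logits)
    exps = [math.exp(l - peak) for l in logits]  # numerically stable softmax
    total = sum(exps)
    return [e / total for e in exps]


def decode(conditioning, num_levels=8, seed=0):
    """Generate one speech sample per decoder time step."""
    rng = random.Random(seed)
    reconstruction = []
    for cond in conditioning:
        # Score distribution over possible sample values, conditioned on
        # the current reconstruction and on the conditioning sequence.
        probs = toy_score_distribution(reconstruction, cond, num_levels)
        # Sample the next speech sample from that distribution.
        sample = rng.choices(range(num_levels), weights=probs, k=1)[0]
        reconstruction.append(sample)
    return reconstruction
```

Because each sample is drawn from a distribution rather than taken as an argmax, two decoders only agree if they share the random seed, as the fixed `seed` argument does here.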

Wireless sound transmission and method
09832575 · 2017-11-28 ·

A system for providing sound to at least one user has at least one audio signal source; a transmission unit with a digital transmitter for transmitting audio data packets from the audio signal source via a wireless digital audio link; at least one receiver unit having at least one digital receiver; and a hearing stimulator responsive to audio signals from the receiver unit. The transmission unit encodes the audio signal as audio data blocks distributed onto at least two audio data packets, one of which is a low-quality packet and one of which is a high-quality packet, with only a low-quality version of the audio signal being retrievable from the low-quality packets, and a high-quality version of the audio signal being retrievable from both the low-quality packets and the high-quality packets. The low-quality packets and the high-quality packets are transmitted in respective dedicated slots of a multiple access protocol frame.
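One way to picture the two-packet layering is sketched below. The abstract does not say how a block is divided, so the split used here (upper 8 bits of each unsigned 16-bit sample in the low-quality packet, lower 8 bits in the high-quality packet) is purely an assumption, and `encode_block` and `decode_block` are hypothetical names.

```python
def encode_block(samples):
    """Split unsigned 16-bit samples into a low-quality and a high-quality packet."""
    low_quality_packet = bytes(s >> 8 for s in samples)     # coarse upper 8 bits
    high_quality_packet = bytes(s & 0xFF for s in samples)  # refinement lower 8 bits
    return low_quality_packet, high_quality_packet


def decode_block(low_quality_packet, high_quality_packet=None):
    """Rebuild samples from the low-quality packet alone, or from both packets."""
    if high_quality_packet is None:
        # Only the coarse bits arrived: reconstruct at the interval mid-point.
        return [(b << 8) | 0x80 for b in low_quality_packet]
    # Both packets arrived: the original samples are recovered exactly.
    return [(hi << 8) | lo
            for hi, lo in zip(low_quality_packet, high_quality_packet)]
```

A receiver that misses the slot carrying the high-quality packet can still call `decode_block(low_quality_packet)` and play a coarser signal.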

ADAPTIVE AUDIO CODEC SYSTEM, METHOD AND ARTICLE
20170330574 · 2017-11-16 ·

A decoder generates decoded signals based on quantized signals. The decoder includes an inverse quantizer and a predictor circuit. The quantized signals are generated in an encoder by low-pass filtering an input signal and encoding the filtered signal using adaptive differential pulse code modulation. The predictor circuit has filter coefficients based on a frequency response of the low-pass filter used to filter the input signal.
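The inverse quantizer and predictor pairing can be illustrated with a small differential PCM sketch. It is simplified on purpose: the step size is fixed rather than adaptive, and the first-order predictor coefficient `coeff` is an arbitrary assumed value, whereas the abstract derives the predictor coefficients from the frequency response of the encoder's low-pass filter.

```python
def dpcm_encode(samples, step=4, coeff=0.9):
    """Quantize the prediction error of each sample (encoder side)."""
    codes, prediction = [], 0.0
    for x in samples:
        code = round((x - prediction) / step)  # quantized difference
        codes.append(code)
        reconstructed = prediction + code * step
        prediction = coeff * reconstructed     # track the decoder's state
    return codes


def dpcm_decode(codes, step=4, coeff=0.9):
    """Inverse-quantize each code and add the predictor output (decoder side)."""
    decoded, prediction = [], 0.0
    for code in codes:
        reconstructed = prediction + code * step  # inverse quantizer + prediction
        decoded.append(reconstructed)
        prediction = coeff * reconstructed        # first-order predictor
    return decoded
```

The encoder runs the same predictor on reconstructed values, so both sides stay in step and the per-sample error stays within half a quantizer step.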

ADAPTIVE AUDIO CODEC SYSTEM, METHOD AND ARTICLE
20170330572 · 2017-11-16 ·

An encoder generates quantized signal words based on a difference signal. The encoder includes an adaptive quantizer. A step size applied by the adaptive quantizer is generated in a feedback loop and based on a loading factor and quantized signal words generated by the adaptive quantizer. The encoder includes coding circuitry which generates code words based on quantized signal words generated by the adaptive quantizer. The coding circuitry generates an escape code in response to a quantized signal word not being associated with a corresponding coding code word.
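A rough sketch of the feedback step-size loop and the escape mechanism follows. The adaptation rule in `adapt_step`, the prefix codes in `CODEBOOK`, and the `ESCAPE` marker are all hypothetical choices for illustration; the abstract only states that the step size depends on a loading factor and on previously generated quantized words, and that an escape code is emitted for words without a code word.

```python
ESCAPE = "ESC"
# Hypothetical prefix codes; words outside this table trigger the escape code.
CODEBOOK = {0: "0", 1: "10", -1: "110", 2: "1110", -2: "11110"}


def adapt_step(step, word, loading_factor=2.0, min_step=1.0, max_step=1024.0):
    """Assumed rule: grow the step after large words, shrink it after small ones."""
    if abs(word) > loading_factor:
        step *= 1.5
    else:
        step *= 0.9
    return min(max(step, min_step), max_step)


def encode(differences, step=4.0):
    """Quantize difference values and map each quantized word to a code word."""
    output = []
    for d in differences:
        word = round(d / step)  # adaptive quantizer
        if word in CODEBOOK:
            output.append(CODEBOOK[word])
        else:
            # No code word exists: emit the escape code with the raw word.
            output.append((ESCAPE, word))
        step = adapt_step(step, word)  # feedback from the word just produced
    return output
```

For instance, `encode([0, 4, 100])` codes the first two differences from the table and escapes the third, whose quantized word falls outside it.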

ADAPTIVE AUDIO CODEC SYSTEM, METHOD AND ARTICLE
20170330575 · 2017-11-16 ·

An adaptive noise shaping filter flattens signal components below a threshold frequency range in a filtered signal to be encoded. An encoder generates quantized signals based on a difference signal and includes an adaptive quantizer and a decoder. The decoder generates feedback signals and has an inverse quantizer and a predictor. The predictor has determined control parameters based on the threshold frequency range.

Support for generation of comfort noise, and generation of comfort noise

A method for generation of comfort noise for at least two audio channels. The method comprises determining a spatial coherence between audio signals on the respective audio channels, wherein at least one spatial coherence value per frame and frequency band is determined to form a vector of spatial coherence values. A vector of predicted spatial coherence values is formed by a weighted combination of a first coherence prediction and a second coherence prediction that are combined using a weight factor α. The method comprises signaling information about the weight factor α to the receiving node, for enabling the generation of the comfort noise for the at least two audio channels at the receiving node.
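The weighted combination can be sketched as below. The convex form, weight `alpha` on the first prediction and `1 - alpha` on the second, is an assumption about how the weight factor is applied, and the candidate search in `choose_alpha` is a hypothetical way of picking a value to signal; neither is spelled out in the abstract.

```python
def combine_predictions(pred1, pred2, alpha):
    """Per-band weighted combination of two coherence predictions."""
    return [alpha * a + (1.0 - alpha) * b for a, b in zip(pred1, pred2)]


def choose_alpha(coherence, pred1, pred2,
                 candidates=(0.0, 0.25, 0.5, 0.75, 1.0)):
    """Pick the candidate weight whose combined prediction best fits the
    measured coherence vector (squared-error criterion, assumed here)."""
    def squared_error(alpha):
        combined = combine_predictions(pred1, pred2, alpha)
        return sum((c - p) ** 2 for c, p in zip(coherence, combined))
    return min(candidates, key=squared_error)
```

Only the chosen weight (or an index into the candidate set) would need to be signaled, since the receiving node can form both predictions itself and apply the same combination.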

METHOD AND APPARATUS FOR CONTROLLING MULTICHANNEL AUDIO FRAME LOSS CONCEALMENT
20220059099 · 2022-02-24 ·

A method of approximating a lost or corrupted multichannel audio frame of a multichannel audio signal in a decoding device is provided. The device may generate a down-mix error concealment frame and transform the frame into a frequency domain to generate a transformed down-mix error concealment frame. The device may decorrelate the transformed frame to generate a decorrelated concealment frame. The device may obtain a residual signal spectrum of a stored residual signal of a previously received multichannel audio signal frame and generate an energy adjusted decorrelated residual signal concealment frame using the residual signal spectrum. The device may obtain a set of multi-channel audio substitution parameters and provide the frames and substitution parameters to an audio synthesis component to generate a synthesized multichannel audio frame. The device performs an inverse frequency domain transformation of the audio frame to generate a substitution frame for the lost or corrupted audio frame.