G10L21/0388

APPARATUS AND METHOD FOR ENCODING AND DECODING AN ENCODED AUDIO SIGNAL USING TEMPORAL NOISE/PATCH SHAPING

An apparatus for decoding an encoded audio signal includes: a spectral domain audio decoder for generating a first decoded representation of a first set of first spectral portions being spectral prediction residual values; a frequency regenerator for generating a reconstructed second spectral portion using a first spectral portion of the first set of first spectral portions, wherein the reconstructed second spectral portion additionally includes spectral prediction residual values; and an inverse prediction filter for performing an inverse prediction over frequency using the spectral residual values for the first set of first spectral portions and the reconstructed second spectral portion using prediction filter information included in the encoded audio signal.
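
The decoder flow described in this abstract can be illustrated with a minimal sketch. It assumes a plain list of real spectral coefficients, a simple copy-up as the frequency regenerator, and an all-pole filter with the sign convention x[k] = e[k] - sum(a[i] * x[k-i]). The function names, the sign convention, and the filter order are assumptions made for illustration and are not taken from the patent.

```python
def forward_prediction(spectrum, coeffs):
    """Encoder-side prediction over frequency (assumed convention).

    e[k] = x[k] + sum_i a[i] * x[k - i], where k runs over frequency bins.
    """
    return [
        x + sum(a * spectrum[k - i]
                for i, a in enumerate(coeffs, start=1) if k - i >= 0)
        for k, x in enumerate(spectrum)
    ]


def regenerate_residual(residual, source_start, target_start, length):
    """Fill a target range by copying residual values from a source range.

    The regenerated portion therefore also consists of residual values.
    """
    out = list(residual)
    out[target_start:target_start + length] = \
        residual[source_start:source_start + length]
    return out


def inverse_prediction(residual, coeffs):
    """Inverse prediction over frequency, applied to the full spectrum.

    x[k] = e[k] - sum_i a[i] * x[k - i], covering both the decoded and the
    regenerated residual values in a single pass.
    """
    out = []
    for k, e in enumerate(residual):
        past = sum(a * out[k - i]
                   for i, a in enumerate(coeffs, start=1) if k - i >= 0)
        out.append(e - past)
    return out


# Hypothetical example: 4 decoded residual bins, 4 bins to regenerate.
coeffs = [0.5, -0.25]  # stands in for the transmitted filter information
decoded = forward_prediction([4.0, 2.0, -1.0, 3.0], coeffs) + [0.0] * 4
filled = regenerate_residual(decoded, source_start=0, target_start=4, length=4)
shaped = inverse_prediction(filled, coeffs)
```

In this sketch the inverse filter runs once over the combined residual spectrum, so the regenerated portion receives the same shaping as the decoded portion.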

ENHANCED MULTI-CHANNEL ACOUSTIC MODELS

This specification describes computer-implemented methods and systems. One method includes receiving, by a neural network of a speech recognition system, first data representing a first raw audio signal and second data representing a second raw audio signal. The first raw audio signal and the second raw audio signal describe audio occurring during the same period of time. The method further includes generating, by a spatial filtering layer of the neural network, a spatial filtered output using the first data and the second data, and generating, by a spectral filtering layer of the neural network, a spectral filtered output using the spatial filtered output. Generating the spectral filtered output comprises processing frequency-domain data representing the spatial filtered output. The method still further includes processing, by one or more additional layers of the neural network, the spectral filtered output to predict sub-word units encoded in both the first raw audio signal and the second raw audio signal.
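
The two filtering stages can be sketched as follows. This is a rough illustration only: it assumes the spatial layer behaves like a filter-and-sum operation over the two raw channels and that the spectral layer applies a weight matrix to DFT magnitudes of the spatially filtered frame. In the described system these weights would be learned by the neural network. The fixed values, function names, and frame size here are hypothetical.

```python
import cmath


def fir(signal, taps):
    """Causal FIR filter with zero initial state."""
    return [
        sum(t * signal[n - j] for j, t in enumerate(taps) if n - j >= 0)
        for n in range(len(signal))
    ]


def spatial_filter(channels, taps_per_channel):
    """Filter each raw channel with its own taps, then sum across channels."""
    filtered = [fir(x, h) for x, h in zip(channels, taps_per_channel)]
    return [sum(samples) for samples in zip(*filtered)]


def dft_magnitudes(frame):
    """Magnitudes of the non-redundant DFT bins of a real-valued frame."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def spectral_filter(frame, filterbank):
    """Apply a weight matrix to frequency-domain data of the frame."""
    mags = dft_magnitudes(frame)
    return [sum(w * m for w, m in zip(row, mags)) for row in filterbank]


# Hypothetical two-microphone frame covering the same period of time.
first = [1.0, 1.0, 1.0, 1.0]
second = [1.0, 1.0, 1.0, 1.0]
spatial_out = spatial_filter([first, second], [[1.0], [1.0]])
features = spectral_filter(spatial_out, [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]])
```

The resulting feature vector is what the additional layers would consume to predict sub-word units. Those layers are omitted from the sketch.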

Natural ear

Methods and systems for assisting tonally-challenged singers. A microphone can be integrated with a sound reinforcement system used in a live performance. The microphone, which can transduce the performer's voice, can serve multiple purposes such as, for example, to feed input to the natural ear and to the sound reinforcement system. The processed sound of the performer's voice (with fundamental frequencies emphasized) can be mixed into the signal fed to a stage “monitor” speaker facing the performer or a headset worn by the performer.

APPARATUS AND METHOD FOR DECODING AND ENCODING AN AUDIO SIGNAL USING ADAPTIVE SPECTRAL TILE SELECTION

An apparatus for decoding an encoded signal includes: an audio decoder for decoding an encoded representation of a first set of first spectral portions to obtain a decoded first set of first spectral portions; a parametric decoder for decoding an encoded parametric representation of a second set of second spectral portions to obtain a decoded representation of the parametric representation, wherein the parametric information includes, for each target frequency tile, a source region identification as matching information; and a frequency regenerator for regenerating a target frequency tile using a source region from the first set of first spectral portions identified by the matching information.
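
A minimal sketch of the tile regeneration follows. It assumes equally sized tiles, a table of candidate source start bins, and one source index per target tile as the matching information. The encoder-side selection by least squared error is an assumption added for illustration, as the abstract does not state how the match is found. All names are hypothetical.

```python
def choose_source(core, source_starts, tile_len, target):
    """Hypothetical encoder-side choice: index of the closest source region.

    Uses squared error between each candidate region and the target tile.
    """
    def error(start):
        region = core[start:start + tile_len]
        return sum((s - t) ** 2 for s, t in zip(region, target))

    return min(range(len(source_starts)),
               key=lambda i: error(source_starts[i]))


def regenerate_tiles(core, source_starts, tile_len, matching_info):
    """Decoder side: append one target tile per entry in matching_info.

    Each entry identifies which source region of the decoded core spectrum
    is copied into the corresponding target frequency tile.
    """
    out = list(core)
    for source_id in matching_info:
        start = source_starts[source_id]
        out.extend(core[start:start + tile_len])
    return out


# Hypothetical example: 6 decoded core bins, three candidate source regions.
core = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
source_starts = [0, 2, 4]
matching_info = [choose_source(core, source_starts, 2, [5.1, 5.9]), 0]
full_band = regenerate_tiles(core, source_starts, 2, matching_info)
```

Signalling a source index per tile lets the regenerated spectrum draw on whichever region matches best, instead of a fixed copy-up pattern.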

Inter-channel encoding and decoding of multiple high-band audio signals

A device includes an encoder and a transmitter. The encoder is configured to generate a first high-band portion of a first signal based on a left signal and a right signal. The encoder is also configured to generate a set of adjustment gain parameters based on a high-band non-reference signal. The high-band non-reference signal corresponds to either a left high-band portion of the left signal or a right high-band portion of the right signal. The transmitter is configured to transmit information corresponding to the first high-band portion of the first signal. The transmitter is also configured to transmit the set of adjustment gain parameters corresponding to the high-band non-reference signal.
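
One way to picture the encoder is the sketch below. It makes several assumptions that are not stated in the abstract: the high-band portions of the left and right signals are already available as sample lists, the first signal is a mid (average) signal, and each adjustment gain is the square root of the energy ratio between the non-reference high band and the mid high band over a subframe. The function name, the `reference` argument, and the subframe count are hypothetical.

```python
import math


def encode_high_band(left_hb, right_hb, reference="left", subframes=2):
    """Return (mid high band, adjustment gains for the non-reference channel).

    Assumed scheme: the mid high band is the per-sample average of the two
    channels, and one gain per subframe maps the mid high band energy onto
    the energy of the non-reference high band.
    """
    mid_hb = [(l + r) / 2 for l, r in zip(left_hb, right_hb)]
    non_reference = right_hb if reference == "left" else left_hb
    size = len(mid_hb) // subframes
    gains = []
    for s in range(subframes):
        seg = slice(s * size, (s + 1) * size)
        energy_mid = sum(x * x for x in mid_hb[seg])
        energy_non_ref = sum(x * x for x in non_reference[seg])
        # Guard against silent subframes to avoid division by zero.
        gains.append(math.sqrt(energy_non_ref / energy_mid)
                     if energy_mid > 0 else 0.0)
    return mid_hb, gains


# Hypothetical frame with the left channel chosen as the reference.
mid_hb, gains = encode_high_band([2.0, 2.0, 2.0, 2.0], [1.0, 1.0, 0.0, 0.0])
```

Under these assumptions, the transmitter would send a coded form of `mid_hb` together with `gains`, and a decoder could scale its reconstruction of the mid high band by the gains to approximate the non-reference channel.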
