Patent classifications
G10L19/035
AUDIO TRANSMITTER PROCESSOR, AUDIO RECEIVER PROCESSOR AND RELATED METHODS AND COMPUTER PROGRAMS
An audio transmitter processor for generating an error protected frame using encoded audio data of an audio frame, the encoded audio data for the audio frame having a first amount of information units and a second amount of information units, has: a frame builder for building a codeword frame having a codeword raster, wherein the frame builder is configured to determine a border between a first amount of information units and a second amount of information units so that a starting information unit of the second amount of information units coincides with a codeword border; and an error protection coder to obtain a plurality of processed codewords representing the error protected frame.
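The border rule in this abstract can be illustrated with a minimal sketch. It assumes a fixed codeword payload size and a round-up policy, neither of which is stated in the abstract; the function name `align_border` and its parameters are hypothetical.

```python
def align_border(first_units: int, total_units: int, codeword_size: int) -> int:
    """Return the index at which the second amount of information units starts.

    Assumption (not from the abstract): all codewords carry `codeword_size`
    information units and the border is moved up to the next codeword border.
    """
    if codeword_size <= 0:
        raise ValueError("codeword_size must be positive")
    if not 0 <= first_units <= total_units:
        raise ValueError("first_units must lie within the frame")
    # Ceiling division, then scale back to an index on the codeword raster.
    border = -(-first_units // codeword_size) * codeword_size
    # If rounding overshoots the frame, the second amount is empty.
    return min(border, total_units)
```

With 16 units per codeword, a requested border of 70 would move to 80, so the second amount begins exactly where the sixth codeword begins.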
Signal encoding method and device and signal decoding method and device
A spectrum encoding method includes selecting an important spectral component in band units for a normalized spectrum and encoding information of the selected important spectral component for a band, based on a number, a position, a magnitude and a sign thereof. A spectrum decoding method includes obtaining from a bitstream, information about an important spectral component for a band of an encoded spectrum and decoding the obtained information of the important spectral component, based on a number, a position, a magnitude and a sign of the important spectral component.
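As a rough illustration of describing a band by the number, position, magnitude and sign of its important components, the sketch below picks the largest-magnitude lines of an already quantized band. The selection rule, the dictionary layout and the names `encode_band` and `decode_band` are assumptions, not the method claimed in the patent.

```python
def encode_band(band, max_components):
    """Describe a band of integer spectral values by its largest components."""
    # Rank line indices by magnitude, largest first (hypothetical selection rule).
    ranked = sorted(range(len(band)), key=lambda i: abs(band[i]), reverse=True)
    positions = sorted(i for i in ranked[:max_components] if band[i] != 0)
    return {
        "number": len(positions),
        "positions": positions,
        "magnitudes": [abs(band[i]) for i in positions],
        "signs": [0 if band[i] > 0 else 1 for i in positions],  # 1 means negative
    }


def decode_band(info, band_size):
    """Rebuild the band; lines that were not selected come back as zero."""
    band = [0] * band_size
    for pos, mag, sign in zip(info["positions"], info["magnitudes"], info["signs"]):
        band[pos] = -mag if sign else mag
    return band
```

A real codec would additionally entropy-code each of the four fields; that step is omitted here.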
Low-complexity tonality-adaptive audio signal quantization
The invention provides an audio encoder for encoding an audio signal so as to produce therefrom an encoded signal, the audio encoder including: a framing device configured to extract frames from the audio signal; a quantizer configured to map spectral lines of a spectrum signal derived from the frame of the audio signal to quantization indices, wherein the quantizer has a dead-zone, in which the input spectral lines are mapped to quantization index zero; and a control device configured to modify the dead-zone; wherein the control device includes a tonality calculating device configured to calculate at least one tonality indicating value for at least one spectrum line or for at least one group of spectral lines, wherein the control device is configured to modify the dead-zone for the at least one spectrum line or the at least one group of spectrum lines depending on the respective tonality indicating value.
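A dead-zone quantizer whose dead zone depends on tonality might look like the sketch below. The rounding-offset formulation, the offset range of 0.33 to 0.5, and the linear mapping from a tonality value in [0, 1] are all assumptions made for illustration; the abstract does not specify how the control device modifies the dead zone.

```python
import math


def deadzone_quantize(x, step, rounding_offset):
    """Uniform quantizer: index = sign(x) * floor(|x| / step + rounding_offset).

    An offset of 0.5 is plain rounding; a smaller offset widens the dead zone,
    i.e. the range of inputs that map to quantization index zero.
    """
    return int(math.copysign(math.floor(abs(x) / step + rounding_offset), x))


def tonality_offset(tonality, noisy_offset=0.33, tonal_offset=0.5):
    """Hypothetical control rule: tonal lines get a smaller dead zone."""
    t = min(max(tonality, 0.0), 1.0)
    return noisy_offset + (tonal_offset - noisy_offset) * t


def quantize_line(x, step, tonality):
    """Quantize one spectral line with a tonality-dependent dead zone."""
    return deadzone_quantize(x, step, tonality_offset(tonality))
```

Under these assumptions a line of value 0.6 with step 1.0 survives as index 1 when it is fully tonal and is zeroed when it is fully noise-like.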
Audio signal encoding and decoding
An audio codec suitable for robust wireless transmission of high quality audio with low latency, still at a moderate bit rate. The encoding and decoding methods are based on ADPCM and in addition to the encoded output bits APM, additional data QB are included in output data blocks, namely data QB representing an internal value of the adaptive quantization ADQ of the ADPCM encoding algorithm, especially a scaling factor encoded and truncated to such as 8 bits. Further, output data blocks preferably include data CFB representing an internal value of the predictor PR of the ADPCM encoding algorithm, especially data CFB representing coefficients of a lattice prediction FIR filter which, truncated to such as 8 bits, can be sequentially included in output data blocks. These additional data QB, CFB regarding internal values of the ADPCM encoding algorithm can be utilized at the decoder side to increase robustness against loss of data blocks in wireless transmission. Especially, the decoding algorithm may comprise comparing its current internal ADPCM decoding values with the corresponding received internal values QB, CFB from the encoder, and in case there is a difference, the decoder can adapt or overwrite its internal values to the ones received QB, CFB. This helps to ensure fast recovery after lost data blocks, thereby ensuring robustness against artefacts in the reconstructed signal, e.g. clicks in case of audio.
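The compare-and-overwrite step can be sketched as follows. The state layout, the 16-bit source width, the four-coefficient predictor and the convention of sending one coefficient per block together with its index are hypothetical choices; only the idea of truncating to 8 bits and overwriting mismatching decoder values comes from the abstract.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AdpcmDecoderState:
    """Hypothetical decoder-side copy of the values carried as QB and CFB."""
    scale_q8: int = 0
    coeffs_q8: List[int] = field(default_factory=lambda: [0, 0, 0, 0])


def truncate_to_8_bits(value: int, total_bits: int = 16) -> int:
    """Keep the 8 most significant bits of an unsigned `total_bits`-wide value."""
    return (value >> (total_bits - 8)) & 0xFF


def resync(state: AdpcmDecoderState, qb: int, cfb_index: int, cfb_value: int) -> bool:
    """Overwrite decoder values that differ from the received ones.

    Returns True if anything was corrected, e.g. after a lost data block.
    """
    changed = False
    if state.scale_q8 != qb:
        state.scale_q8 = qb
        changed = True
    if state.coeffs_q8[cfb_index] != cfb_value:
        state.coeffs_q8[cfb_index] = cfb_value
        changed = True
    return changed
```

Sending one coefficient per block keeps the overhead small while still refreshing the whole predictor after a few blocks.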
QUANTIZATION OF SPATIAL AUDIO DIRECTION PARAMETERS
A method for spatial audio signal encoding comprising: obtaining a plurality of audio direction parameters, wherein each parameter comprises an elevation value and an azimuth value and wherein each parameter has an ordered position; deriving for each of the plurality of audio direction parameters a corresponding derived audio direction parameter (SP) comprising an elevation and an azimuth value, corresponding derived audio direction parameters (SP) being arranged in a manner determined by a spatial utilization defined by the elevation values and the azimuth values of the plurality of audio direction parameters; rotating each derived audio direction parameter (SP) by the azimuth value (φ₀) of an audio direction parameter in the first position of the plurality of audio direction parameters and quantizing the rotation to determine for each a corresponding quantized rotated derived audio direction parameter; changing the ordered position of an audio direction parameter to a further position coinciding with a position of a rotated derived audio direction parameter when the azimuth value of the audio direction parameter is closest to the azimuth value of the further rotated derived audio direction parameter compared to the azimuth values of other rotated derived audio direction parameters, followed by determining for each of the plurality of audio direction parameters a difference between each audio direction parameter and their corresponding quantized rotated derived audio direction parameter; and quantizing a difference for each of the plurality of audio direction parameters, wherein a difference quantization resolution for each of the plurality of audio direction parameters is defined based on a spatial extent of the audio direction parameters.
Self-supervised audio representation learning for mobile devices
Systems and methods for training a machine-learned model are provided. A method can include obtaining an unlabeled audio signal, sampling the unlabeled audio signal to select one or more sampled slices, inputting the one or more sampled slices into a machine-learned model, receiving, as an output of the machine-learned model, one or more determined characteristics associated with the audio signal, determining a loss function for the machine-learned model based at least in part on a difference between the one or more determined characteristics and one or more corresponding ground truth characteristics of the audio signal, and training the machine-learned model from end to end based at least in part on the loss function. The one or more determined characteristics can include one or more reconstructed portions of the audio signal temporally adjacent to the one or more sampled slices or an estimated distance between two sampled slices.
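The distance-estimation variant can be sketched on the data side as follows: two slices are drawn from an unlabeled signal, and their temporal gap serves as a ground truth that needs no human labels. The normalization of the gap to [0, 1], the squared-error loss and the function names are assumptions; the model itself is left out.

```python
import random


def sample_slice_pair(signal, slice_len, rng):
    """Draw two slices and return them with their normalized temporal gap."""
    if slice_len <= 0 or len(signal) < 2 * slice_len:
        raise ValueError("signal too short for two slices of this length")
    max_start = len(signal) - slice_len
    a = rng.randrange(max_start + 1)
    b = rng.randrange(max_start + 1)
    start_a, start_b = min(a, b), max(a, b)
    # Ground truth derived from the signal itself (hypothetical normalization).
    gap = (start_b - start_a) / max_start
    first = signal[start_a:start_a + slice_len]
    second = signal[start_b:start_b + slice_len]
    return first, second, gap


def squared_error(predicted_gap, true_gap):
    """Loss between the model's estimated distance and the ground truth."""
    return (predicted_gap - true_gap) ** 2
```

In training, a model would receive both slices, output an estimated gap, and be updated end to end on this loss.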