G10L19/02

Reducing Perceived Effects of Non-Voice Data in Digital Speech
20230043682 · 2023-02-09

Non-voice data is embedded in a voice bit stream that includes frames of voice bits by selecting a frame of voice bits to carry the non-voice data, placing non-voice identifier bits in a first portion of the voice bits in the selected frame, and placing the non-voice data in a second portion of the voice bits in the selected frame. The non-voice identifier bits are employed to reduce a perceived effect of the non-voice data on audible speech produced from the voice bit stream.
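The embedding scheme above can be sketched as follows. This is a minimal illustration, not the patented method: the frame length, identifier pattern, and function names are all assumptions.

```python
def embed_non_voice(frame_bits, payload_bits, id_pattern=(1, 0, 1, 1)):
    """Embed non-voice payload bits in a frame of voice bits.

    A first portion of the frame is overwritten with identifier bits that
    mark the frame as carrying non-voice data; a second portion carries
    the payload. The identifier pattern here is purely illustrative.
    """
    n_id = len(id_pattern)
    if n_id + len(payload_bits) > len(frame_bits):
        raise ValueError("payload does not fit in frame")
    out = list(frame_bits)
    out[:n_id] = list(id_pattern)                             # identifier portion
    out[n_id:n_id + len(payload_bits)] = list(payload_bits)   # payload portion
    return out


def extract_non_voice(frame_bits, n_payload, id_pattern=(1, 0, 1, 1)):
    """Return the payload if the frame carries the non-voice identifier, else None."""
    n_id = len(id_pattern)
    if tuple(frame_bits[:n_id]) != tuple(id_pattern):
        return None
    return frame_bits[n_id:n_id + n_payload]
```

A decoder that recognizes the identifier bits can mute or conceal the frame, reducing the perceived effect of the non-voice data on the reproduced speech.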

Audio reconstruction method and device which use machine learning

Provided are an audio reconstruction method and device for providing improved sound quality by reconstructing a decoding parameter or an audio signal obtained from a bitstream, by using machine learning. The audio reconstruction method includes obtaining a plurality of decoding parameters of a current frame by decoding a bitstream, determining characteristics of a second parameter included in the plurality of decoding parameters and associated with a first parameter, based on the first parameter included in the plurality of decoding parameters, obtaining a reconstructed second parameter by applying a machine learning model to at least one of the plurality of decoding parameters, the second parameter, and the characteristics of the second parameter, and decoding an audio signal, based on the reconstructed second parameter.
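The reconstruction flow above can be sketched as follows. The parameter names (a scale factor as the first parameter, a quantized spectrum as the second) and the toy model interface are assumptions for illustration only.

```python
import numpy as np

def reconstruct_parameter(decoding_params, model):
    """Sketch of the reconstruction flow: derive characteristics of a second
    decoded parameter from a first one, then apply a machine-learning model
    to obtain a reconstructed second parameter (names are illustrative)."""
    first = decoding_params["scale_factor"]            # first parameter
    second = decoding_params["quantized_spectrum"]     # second parameter
    # Characteristics of the second parameter derived from the first:
    # here, the dequantized (scaled) spectrum.
    characteristics = second * first
    features = np.concatenate([second, characteristics])
    # The model maps the combined features to a reconstructed second parameter.
    return model(features)
```

The reconstructed parameter would then replace the decoded one when synthesizing the audio signal.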

AUDIO ENCODING/DECODING APPARATUS AND METHOD USING VECTOR QUANTIZED RESIDUAL ERROR FEATURE

An audio encoding/decoding apparatus and method using vector quantized residual error features are disclosed. An audio signal encoding method includes outputting a bitstream of a main codec by encoding an original signal, decoding the bitstream of the main codec, determining a residual error feature vector from a feature vector of a decoded signal and a feature vector of the original signal, and outputting a bitstream of additional information by encoding the residual error feature vector.
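The residual-feature path can be sketched as below: the residual error feature vector is the difference between the feature vectors of the original and decoded signals, and a nearest-neighbour vector quantizer turns it into a codebook index for the side-information bitstream. The codebook and feature shapes are illustrative assumptions.

```python
import numpy as np

def residual_feature(orig_feat, decoded_feat):
    """Residual error feature vector: original-signal features minus
    decoded-signal features."""
    return np.asarray(orig_feat) - np.asarray(decoded_feat)

def vq_encode(vec, codebook):
    """Nearest-neighbour vector quantization: the index of the closest
    codebook entry (by Euclidean distance) forms the additional bitstream."""
    dists = np.linalg.norm(codebook - vec, axis=1)
    return int(np.argmin(dists))

def vq_decode(index, codebook):
    """Recover the quantized residual feature vector from its index."""
    return codebook[index]
```

At the decoder, the dequantized residual feature would be added back to the main codec's decoded features to improve quality.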

Methods, Apparatus and Systems for Determining Reconstructed Audio Signal

According to an aspect of the present invention, a method for reconstructing an audio signal having a baseband portion and a highband portion is disclosed. The method includes obtaining a decoded baseband audio signal by decoding an encoded audio signal and obtaining a plurality of subband signals by filtering the decoded baseband audio signal. The method further includes generating a high-frequency reconstructed signal by copying a number of consecutive subband signals of the plurality of subband signals and obtaining an envelope adjusted high-frequency signal. The method further includes generating a noise component based on a noise parameter. Finally, the method includes adjusting a phase of the high-frequency reconstructed signal and combining the decoded baseband audio signal with the combined high-frequency signal to obtain a time-domain reconstructed audio signal.
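The high-frequency reconstruction steps above can be sketched as follows. The copy range, envelope representation (per-subband RMS targets), and noise model are assumptions for illustration, not the patented procedure.

```python
import numpy as np

def hfr_reconstruct(subbands, copy_start, copy_count, target_envelope,
                    noise_level, rng=None):
    """Sketch of high-frequency reconstruction: copy a run of consecutive
    low-band subband signals up to the high band, scale each to a target
    envelope, and add a noise component controlled by a noise parameter."""
    rng = rng or np.random.default_rng(0)
    # Copy a number of consecutive subband signals.
    copied = [np.array(subbands[copy_start + i]) for i in range(copy_count)]
    high = []
    for sig, env in zip(copied, target_envelope):
        cur = np.sqrt(np.mean(sig ** 2)) or 1.0
        adjusted = sig * (env / cur)                       # envelope adjustment
        noise = noise_level * rng.standard_normal(len(sig))
        high.append(adjusted + noise)                      # add noise component
    return high
```

The resulting high-frequency subbands would then be phase-adjusted, synthesized, and combined with the decoded baseband signal.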

BIT ALLOCATING, AUDIO ENCODING AND DECODING

A bit allocating method is provided that includes determining the allocated number of bits in decimal-point (fractional) units for each frequency band so that the Signal-to-Noise Ratio (SNR) of a spectrum in a given frequency band is maximized within the allowable number of bits for a given frame, and adjusting the allocated number of bits for each frequency band.
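A greedy version of such fractional bit allocation can be sketched as below. The 6 dB/bit noise model and the step size are illustrative assumptions, not the claimed method.

```python
import numpy as np

def allocate_bits(band_energies, total_bits, step=0.1):
    """Greedy bit allocation sketch in fractional (decimal-point) units.

    Repeatedly gives `step` bits to the band whose quantization noise is
    currently largest (energy / 4**bits under the classic ~6 dB-per-bit
    model), which drives SNR up within the allowable bit budget."""
    bits = np.zeros(len(band_energies))
    remaining = total_bits
    while remaining >= step:
        noise = np.asarray(band_energies) / (4.0 ** bits)  # ~6.02 dB per bit
        k = int(np.argmax(noise))                          # noisiest band
        bits[k] += step
        remaining -= step
    return bits
```

With a fractional step, the per-band allocation is not restricted to integer bit counts, matching the decimal-point units mentioned in the abstract.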

Digital encapsulation of audio signals
11710493 · 2023-07-25

Encoding and decoding systems are described for the provision of high quality digital representations of audio signals with particular attention to the correct perceptual rendering of fast transients at modest sample rates. This is achieved by optimising downsampling and upsampling filters to minimise the length of the impulse response while adequately attenuating alias products that have been found perceptually harmful.
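The trade-off described above can be illustrated with a short windowed-sinc lowpass filter: fewer taps give a shorter impulse response (better rendering of fast transients) at the cost of weaker alias attenuation. The tap count, cutoff, and window choice are assumptions, not the optimized filters of the patent.

```python
import numpy as np

def short_lowpass(num_taps, cutoff):
    """Windowed-sinc lowpass FIR (Hann window) for down/upsampling.

    `cutoff` is in cycles per sample (0 < cutoff < 0.5). A small
    `num_taps` keeps the impulse response short; a larger one attenuates
    alias products more strongly."""
    n = np.arange(num_taps) - (num_taps - 1) / 2.0
    h = 2 * cutoff * np.sinc(2 * cutoff * n)   # ideal lowpass impulse response
    h *= np.hanning(num_taps)                  # truncate smoothly with a window
    return h / h.sum()                         # normalize to unity DC gain
```

Optimizing such filters means choosing the shortest impulse response whose stopband attenuation still renders alias products perceptually harmless.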

SUBBAND BLOCK BASED HARMONIC TRANSPOSITION
20230238017 · 2023-07-27

The present document relates to audio source coding systems which make use of a harmonic transposition method for high frequency reconstruction (HFR), as well as to digital effect processors, e.g. exciters, where generation of harmonic distortion adds brightness to the processed signal, and to time stretchers where a signal duration is prolonged with maintained spectral content. A system and method configured to generate a time stretched and/or frequency transposed signal from an input signal is described. The system comprises an analysis filterbank configured to provide an analysis subband signal from the input signal; wherein the analysis subband signal comprises a plurality of complex valued analysis samples, each having a phase and a magnitude. Furthermore, the system comprises a subband processing unit configured to determine a synthesis subband signal from the analysis subband signal using a subband transposition factor Q and a subband stretch factor S. The subband processing unit performs a block based nonlinear processing wherein the magnitudes of samples of the synthesis subband signal are determined from the magnitudes of corresponding samples of the analysis subband signal and a predetermined sample of the analysis subband signal. In addition, the system comprises a synthesis filterbank configured to generate the time stretched and/or frequency transposed signal from the synthesis subband signal.
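The core nonlinear subband operation can be sketched as below: each synthesis sample keeps the analysis magnitude while its phase is multiplied by the transposition factor Q, generating the Q-th harmonic of the subband content. This simplifies away the block-based processing and the stretch factor S described above.

```python
import numpy as np

def transpose_subband(analysis, Q=2):
    """Harmonic transposition sketch on one complex-valued subband signal:
    preserve the magnitude of each analysis sample, multiply its phase by
    the subband transposition factor Q."""
    mag = np.abs(analysis)
    phase = np.angle(analysis)
    return mag * np.exp(1j * Q * phase)
```

Applied across a filterbank, this phase multiplication is what shifts subband content to Q times its original frequency while keeping its envelope.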

AUDIO PROCESSING FOR VOICE ENCODING AND DECODING

The present document relates to an audio encoding and decoding system (referred to as an audio codec system). In particular, the present document relates to an audio codec system which is particularly well suited for voice encoding/decoding. A transform-based speech encoder configured to encode a speech signal into a bitstream is described. A speech decoder configured to decode audio signals from a bitstream is further described.