G10L21/0388

FEATURE DOMAIN BANDWIDTH EXTENSION AND SPECTRAL REBALANCE FOR ASR DATA AUGMENTATION
20230186925 · 2023-06-15

A method of processing speech includes: providing a first set of audio data having audio features in a first bandwidth; down-sampling the first set of audio data to a second bandwidth lower than the first bandwidth; producing, by a high frequency reconstruction network (HFRN), an estimate of audio features in the first bandwidth for the first set of audio data, based on at least the down-sampled audio data; inputting, into the HFRN, a second set of audio data having audio features in the second bandwidth; producing, by the HFRN, based on the second set of audio data, an estimate of audio features in the first bandwidth for the second set of audio data; and training a speech processing system (SPS) using the estimates of audio features in the first bandwidth for the first and second sets of audio data.
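
The data flow in this abstract can be sketched in a few lines. This is a minimal illustration and not the patented network: the HFRN is replaced by a toy predictor that learns one offset per high band, features are plain lists of per-band values, and down-sampling is emulated by dropping the upper bands. The names `fit_offsets`, `extend`, and `build_augmented_set` are hypothetical.

```python
from statistics import mean


def split_bands(frame, n_low):
    """Split one feature frame into its low-band and high-band parts."""
    return frame[:n_low], frame[n_low:]


def fit_offsets(wideband_frames, n_low):
    """Toy stand-in for HFRN training (assumption, not the patented model).

    For each high band, learn the average offset from the mean of the
    low bands of the same frame.
    """
    n_high = len(wideband_frames[0]) - n_low
    diffs = [[] for _ in range(n_high)]
    for frame in wideband_frames:
        low, high = split_bands(frame, n_low)
        base = mean(low)
        for k, value in enumerate(high):
            diffs[k].append(value - base)
    return [mean(d) for d in diffs]


def extend(low_frame, offsets):
    """Estimate a first-bandwidth frame from a second-bandwidth frame."""
    base = mean(low_frame)
    return list(low_frame) + [base + o for o in offsets]


def build_augmented_set(wideband_frames, narrowband_frames, n_low):
    """Return first-bandwidth estimates for both sets of audio data."""
    offsets = fit_offsets(wideband_frames, n_low)
    # Emulate down-sampling in the feature domain by dropping the high bands.
    downsampled = [split_bands(f, n_low)[0] for f in wideband_frames]
    est_first = [extend(f, offsets) for f in downsampled]
    est_second = [extend(f, offsets) for f in narrowband_frames]
    return est_first + est_second


wide = [[1.0, 3.0, 5.0, 0.0], [2.0, 2.0, 3.0, 1.0]]
narrow = [[4.0, 6.0]]
training_frames = build_augmented_set(wide, narrow, n_low=2)
```

The frames in `training_frames` all span the first (wider) bandwidth and are what would be passed on to train the SPS.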

High-band signal generation

A device for signal processing includes a receiver and a high-band excitation signal generator. The receiver is configured to receive a parameter associated with a bandwidth-extended audio stream. The high-band excitation signal generator is configured to determine a value of the parameter. The high-band excitation signal generator is also configured to select, based on the value of the parameter, one of target gain information associated with the bandwidth-extended audio stream or filter information associated with the bandwidth-extended audio stream. The high-band excitation signal generator is further configured to generate a high-band excitation signal based on the one of the target gain information or the filter information.
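
The selection step can be illustrated as follows. This is a sketch under stated assumptions: the abstract does not say which parameter value selects which information, nor how the selected information is applied. Here a value of 0 selects the target gain (applied as a scale factor), and any other value selects the filter information (applied as FIR coefficients). The function name and arguments are hypothetical.

```python
def generate_high_band_excitation(base_excitation, parameter_value,
                                  target_gain, filter_coeffs):
    """Generate a high-band excitation from either gain or filter information.

    Assumption: parameter_value == 0 selects the target gain; any other
    value selects the filter coefficients.
    """
    if parameter_value == 0:
        # Gain path: scale every sample by the target gain.
        return [target_gain * x for x in base_excitation]

    # Filter path: direct-form FIR filtering with zero initial state.
    output = []
    for n in range(len(base_excitation)):
        acc = 0.0
        for k, coeff in enumerate(filter_coeffs):
            if n - k >= 0:
                acc += coeff * base_excitation[n - k]
        output.append(acc)
    return output


gain_result = generate_high_band_excitation([1.0, 2.0], 0, 0.5, [1.0, -1.0])
filter_result = generate_high_band_excitation([1.0, 2.0, 4.0], 1, 0.5, [1.0, -1.0])
```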

APPARATUS AND METHOD FOR PROCESSING AN AUDIO SIGNAL TO OBTAIN A PROCESSED AUDIO SIGNAL USING A TARGET TIME-DOMAIN ENVELOPE
20170345433 · 2017-11-30

The invention relates to an apparatus for processing an audio signal to obtain a processed audio signal. The apparatus includes a phase calculator for calculating phase values for spectral values of a sequence of frequency-domain frames representing overlapping frames of the audio signal. The phase calculator is configured to calculate the phase values based on information on a target time-domain envelope related to the processed audio signal, so that the processed audio signal has, at least approximately, the target time-domain envelope and a spectral envelope determined by the sequence of frequency-domain frames.

AUDIO PROCESSING APPARATUS AND AUDIO PROCESSING METHOD
20170345442 · 2017-11-30

An upper limit of a frequency range of audio indicated by input audio data is detected. A representative point extraction unit downsamples the input audio data to a sampling rate set to be less than or equal to twice the detected upper limit to obtain representative-point audio data. An interpolation processing unit upsamples the representative-point audio data by using a fractal interpolation function (FIF) that uses a mapping function calculated by a mapping function calculation unit, while using the input audio data, if necessary, to generate high-frequency interpolated audio data.
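
The rate constraint in this abstract comes down to simple arithmetic, sketched below. The sketch covers only the representative-point extraction and assumes an integer decimation factor with no anti-aliasing filter. The fractal interpolation step is not shown, and the function names are hypothetical.

```python
import math


def decimation_factor(input_rate_hz, upper_limit_hz):
    """Smallest integer factor whose output rate is <= 2 * upper_limit_hz."""
    max_rate_hz = 2 * upper_limit_hz
    return max(1, math.ceil(input_rate_hz / max_rate_hz))


def representative_points(samples, factor):
    """Keep every `factor`-th sample (plain decimation, an assumption)."""
    return samples[::factor]


# 48 kHz input whose content stops at 8 kHz: the target rate is at most 16 kHz.
factor = decimation_factor(48000, 8000)
points = representative_points(list(range(7)), factor)
```

With a 48 kHz input and a detected upper limit of 8 kHz, the factor is 3, which gives a 16 kHz representative-point stream.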

PHASE RECONSTRUCTION IN A SPEECH DECODER

Innovations in phase quantization during speech encoding and phase reconstruction during speech decoding are described. For example, to encode a set of phase values, a speech encoder omits higher-frequency phase values and/or represents at least some of the phase values as a weighted sum of basis functions. Or, as another example, to decode a set of phase values, a speech decoder reconstructs at least some of the phase values using a weighted sum of basis functions and/or reconstructs lower-frequency phase values then uses at least some of the lower-frequency phase values to synthesize higher-frequency phase values. In many cases, the innovations improve the performance of a speech codec in low bitrate scenarios, even when encoded data is delivered over a network that suffers from insufficient bandwidth or transmission quality problems.
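
The decoder-side idea can be sketched as follows. The abstract does not specify the basis functions or how lower-frequency phase values are used to synthesize higher-frequency ones. This illustration assumes a sine basis and fills the omitted values by cyclically reusing the lower-frequency phases. Both choices and both function names are hypothetical.

```python
import math


def reconstruct_low_phases(weights, count):
    """Rebuild `count` phase values as a weighted sum of sine basis functions.

    Assumed basis: sin(pi * (k + 1) * x) with x running from 0 to 1.
    """
    phases = []
    for i in range(count):
        x = i / (count - 1) if count > 1 else 0.0
        phases.append(sum(w * math.sin(math.pi * (k + 1) * x)
                          for k, w in enumerate(weights)))
    return phases


def synthesize_high_phases(low_phases, total):
    """Extend to `total` phase values by reusing the lower-frequency ones."""
    return [low_phases[i % len(low_phases)] for i in range(total)]


low = reconstruct_low_phases([1.0], 3)
full = synthesize_high_phases(low, 5)
```

Only the weights need to be transmitted in this sketch, which is the source of the bitrate saving the abstract describes.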

Time domain spectral bandwidth replication

A wireless audio system for encoding and decoding an audio signal using spectral bandwidth replication is provided. Bandwidth extension is performed in the time domain, enabling low-latency audio coding.

ENCODING DEVICE, DECODING DEVICE, AND COMMUNICATION SYSTEM FOR EXTENDING VOICE BAND
20170330584 · 2017-11-16

A first encoding unit generates a first encoded signal by encoding a component within a first band in a voice signal. A frequency shifting unit shifts the frequency of a component within a second band in the voice signal, the second band having a frequency higher than that of the first band, to the frequency of a component within the first band. A second encoding unit generates a second encoded signal by encoding the component whose frequency has been shifted in the frequency shifting unit. An output unit outputs both the first encoded signal generated in the first encoding unit and the second encoded signal generated in the second encoding unit.
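
The encoder structure can be illustrated on a list of spectral bins. This is a sketch under several assumptions: the voice signal is already represented as real-valued bins, the frequency shift is a relabeling of bin indices, and both encoding units are replaced by a uniform quantizer. The decoder shown is the obvious inverse and is not described in the abstract. All names are hypothetical.

```python
def quantize(bins, step):
    """Stand-in encoder: uniform quantization to integer indices."""
    return [round(b / step) for b in bins]


def encode_voice(spectrum, split, step=0.5):
    """Encode the first band and the frequency-shifted second band."""
    first_band = spectrum[:split]
    # Frequency shift: bin k of the second band moves down to bin k - split,
    # so it occupies the same frequency range as the first band.
    shifted_second_band = list(spectrum[split:])
    return quantize(first_band, step), quantize(shifted_second_band, step)


def decode_voice(first_code, second_code, step=0.5):
    """Assumed inverse: dequantize, then shift the second band back up."""
    first_band = [q * step for q in first_code]
    second_band = [q * step for q in second_code]
    return first_band + second_band


spectrum = [1.0, 2.0, 3.0, 4.0, 0.5, 1.5, 2.5, 3.5]
first_code, second_code = encode_voice(spectrum, split=4)
restored = decode_voice(first_code, second_code)
```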

Audio signal encoding and decoding method, and audio signal encoding and decoding apparatus
09805736 · 2017-10-31

Disclosed are an audio signal encoding and decoding method, an audio signal encoding and decoding apparatus, a transmitter, a receiver, and a communications system that can improve encoding and/or decoding performance. The audio signal encoding method includes dividing a to-be-encoded time domain signal into a low band signal and a high band signal; encoding the low band signal to obtain a low frequency encoding parameter; calculating a voiced degree factor, and predicting a high band excitation signal; weighting the high band excitation signal and random noise using the voiced degree factor, so as to obtain a synthesized excitation signal; and obtaining a high frequency encoding parameter based on the synthesized excitation signal and the high band signal.
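
The weighting step can be sketched as below. The abstract does not give the weighting rule, so this illustration assumes a linear mix in which a voiced degree factor `v` between 0 and 1 weights the predicted excitation by `v` and unit-variance Gaussian noise by `1 - v`. The function name and the noise model are hypothetical.

```python
import random


def synthesize_excitation(predicted_excitation, voiced_factor, rng):
    """Mix the predicted high band excitation with random noise.

    Assumed rule: out[n] = v * excitation[n] + (1 - v) * noise[n].
    """
    if not 0.0 <= voiced_factor <= 1.0:
        raise ValueError("voiced_factor must lie in [0, 1]")
    noise = [rng.gauss(0.0, 1.0) for _ in predicted_excitation]
    return [voiced_factor * e + (1.0 - voiced_factor) * n
            for e, n in zip(predicted_excitation, noise)]


excitation = [0.5, -0.25, 1.0]
fully_voiced = synthesize_excitation(excitation, 1.0, random.Random(0))
unvoiced = synthesize_excitation(excitation, 0.0, random.Random(0))
```

A strongly voiced frame keeps the predicted excitation almost unchanged, while an unvoiced frame is dominated by noise.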