G10L19/26

SPEECH ENHANCEMENT METHOD AND APPARATUS, DEVICE, AND STORAGE MEDIUM
20230097520 · 2023-03-30

A speech enhancement method includes: performing pre-enhancement on a target speech frame according to a complex spectrum corresponding to the target speech frame, to obtain a first complex spectrum; performing speech decomposition on the target speech frame according to the first complex spectrum, to obtain a glottal parameter, a gain, and an excitation signal that correspond to the target speech frame; and performing synthesis according to the glottal parameter, the gain, and the excitation signal, to obtain an enhanced speech signal corresponding to the target speech frame.
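
As an illustration of the final synthesis step, the sketch below drives an all-pole filter with a scaled excitation. It assumes the glottal parameter is available as linear prediction coefficients `a[1..p]`, which the abstract does not state; the function name `synthesize_frame` and the sample values are hypothetical.

```python
def synthesize_frame(lpc_coeffs, gain, excitation):
    """All-pole synthesis: s[n] = gain * e[n] - sum_k a[k] * s[n - k].

    lpc_coeffs holds a[1..p] (assumed form of the glottal parameter).
    """
    out = []
    for n, e in enumerate(excitation):
        acc = gain * e
        for k, a in enumerate(lpc_coeffs, start=1):
            if n - k >= 0:
                acc -= a * out[n - k]
        out.append(acc)
    return out


# Hypothetical frame: one coefficient, unit-impulse excitation.
frame = synthesize_frame([-0.5], 2.0, [1.0, 0.0, 0.0, 0.0])
```

With a single coefficient of -0.5 and a unit impulse as excitation, the output starts at the gain value and halves on every sample.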

AUDIO SIGNAL ENHANCEMENT METHOD AND APPARATUS, COMPUTER DEVICE, STORAGE MEDIUM AND COMPUTER PROGRAM PRODUCT
20230099343 · 2023-03-30

This application relates to an audio signal enhancement method performed by a computer device. The method includes: decoding received speech packets sequentially to obtain a residual signal, long term filtering parameters, and linear filtering parameters; filtering the residual signal to obtain an audio signal; extracting feature parameters from the audio signal when the audio signal is a feedforward error correction frame signal; converting the audio signal into a filter speech excitation signal based on the linear filtering parameters; performing speech enhancement on the filter speech excitation signal according to the feature parameters, the long term filtering parameters, and the linear filtering parameters to obtain an enhanced speech excitation signal; and performing speech synthesis based on the enhanced speech excitation signal and the linear filtering parameters to obtain an enhanced speech signal.

AUDIO ENCODER AND BANDWIDTH EXTENSION DECODER

An audio encoder for providing an output signal using an input audio signal includes a patch generator, a comparator and an output interface. The patch generator generates at least one bandwidth extension high-frequency signal, wherein a bandwidth extension high-frequency signal includes a high-frequency band. The high-frequency band of the bandwidth extension high-frequency signal is based on a low frequency band of the input audio signal. A comparator calculates a plurality of comparison parameters. A comparison parameter is calculated based on a comparison of the input audio signal and a generated bandwidth extension high-frequency signal. Each comparison parameter of the plurality of comparison parameters is calculated based on a different offset frequency between the input audio signal and a generated bandwidth extension high-frequency signal. Further, the comparator determines a comparison parameter from the plurality of comparison parameters, wherein the determined comparison parameter fulfils a predefined criterion.
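
The comparator's search over offset frequencies can be sketched as follows. This is a minimal illustration, assuming magnitude spectra stored as lists of bins, a patch copied from the low band, and minimum squared error as the predefined criterion; none of these choices is specified by the abstract, and `best_patch_offset` is a hypothetical name.

```python
def best_patch_offset(spectrum, crossover, offsets):
    """Return (offset, error) for the low-band patch that best matches
    the true high band, using squared error (an assumed criterion)."""
    high = spectrum[crossover:]
    best = None
    for off in offsets:
        start = crossover - len(high) - off
        if start < 0:
            continue  # patch would read before bin 0
        patch = spectrum[start:start + len(high)]
        err = sum((h - p) ** 2 for h, p in zip(high, patch))
        if best is None or err < best[1]:
            best = (off, err)
    return best


# Hypothetical 8-bin magnitude spectrum; bins 5..7 form the high band.
result = best_patch_offset([1, 5, 2, 7, 3, 5, 2, 7], 5, [0, 1, 2])
```

In this toy spectrum the high band repeats bins 1 to 3 of the low band, so the search selects the offset of one bin.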

BACKWARD-COMPATIBLE INTEGRATION OF HARMONIC TRANSPOSER FOR HIGH FREQUENCY RECONSTRUCTION OF AUDIO SIGNALS

A method for decoding an encoded audio bitstream is disclosed. The method includes receiving the encoded audio bitstream and decoding the audio data to generate a decoded lowband audio signal. The method further includes extracting high frequency reconstruction metadata and filtering the decoded lowband audio signal with an analysis filterbank to generate a filtered lowband audio signal. The method also includes extracting a flag indicating whether either spectral translation or harmonic transposition is to be performed on the audio data and regenerating a highband portion of the audio signal using the filtered lowband audio signal and the high frequency reconstruction metadata in accordance with the flag.

METHOD AND APPARATUS FOR TRANSMITTING/RECEIVING VOICE SIGNAL ON BASIS OF ARTIFICIAL NEURAL NETWORK
20230036087 · 2023-02-02

The present disclosure relates to a communication method and system for converging a 5th-Generation (5G) communication system for supporting higher data rates beyond a 4th-Generation (4G) system with a technology for Internet of Things (IoT). The present disclosure may be applied to intelligent services based on the 5G communication technology and the IoT-related technology, such as smart home, smart building, smart city, smart car, connected car, health care, digital education, smart retail, and security and safety services. A method and an apparatus for transmitting/receiving a voice signal based on an artificial neural network are disclosed. A method of a transmission terminal transmitting a voice signal comprises the steps of: transmitting, to a reception terminal, neural network structure information related to the transmission terminal; generating a wideband signal on the basis of an inputted voice; generating a narrowband signal by down-sampling the wideband signal; and transmitting the narrowband signal to the reception terminal.
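
The down-sampling step on the transmission side might look like the sketch below. It assumes a 2:1 ratio and a crude 3-tap low-pass filter before decimation; the abstract gives neither the ratio nor the filter, and `downsample_by_two` is a hypothetical name.

```python
def downsample_by_two(wideband):
    """Crude 2:1 decimation: smooth with taps [0.25, 0.5, 0.25]
    (edges repeat the border sample), then keep every other sample."""
    n = len(wideband)
    smoothed = []
    for i in range(n):
        prev = wideband[i - 1] if i > 0 else wideband[i]
        nxt = wideband[i + 1] if i < n - 1 else wideband[i]
        smoothed.append(0.25 * prev + 0.5 * wideband[i] + 0.25 * nxt)
    return smoothed[::2]


# A constant signal passes through unchanged at half the length.
narrowband = downsample_by_two([1.0] * 8)
```

A production codec would use a longer anti-aliasing filter; the short kernel here only keeps the example readable.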

LOW-FREQUENCY EMPHASIS FOR LPC-BASED CODING IN FREQUENCY DOMAIN

The invention provides an audio encoder including a combination of a linear predictive coding filter having a plurality of linear predictive coding coefficients and a time-frequency converter, wherein the combination is configured to filter and to convert a frame of the audio signal into a frequency domain in order to output a spectrum based on the frame and on the linear predictive coding coefficients; a low frequency emphasizer configured to calculate a processed spectrum based on the spectrum, wherein spectral lines of the processed spectrum representing a lower frequency than a reference spectral line are emphasized; and a control device configured to control the calculation of the processed spectrum by the low frequency emphasizer depending on the linear predictive coding coefficients of the linear predictive coding filter.

RESIDUAL CODING METHOD OF LINEAR PREDICTION CODING COEFFICIENT BASED ON COLLABORATIVE QUANTIZATION, AND COMPUTING DEVICE FOR PERFORMING THE METHOD

Disclosed are a method for coding a residual signal of LPC coefficients based on collaborative quantization and a computing device for performing the method. The residual signal coding method includes: generating encoded LPC coefficients and an LPC residual signal by performing LPC analysis and quantization on an input speech; determining a predicted LPC residual signal by applying the LPC residual signal to cross-module residual learning; performing LPC synthesis using the encoded LPC coefficients and the predicted LPC residual signal; and determining an output speech that is a synthesized output according to a result of the LPC synthesis.
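
The analysis, quantization, and synthesis stages can be illustrated with a first-order predictor. This is only a sketch under stated assumptions: the order-1 model, the uniform quantizer, and all function names are hypothetical, and the cross-module residual learning stage (a learned component in the abstract) is not modeled, so the quantized residual is passed straight to synthesis.

```python
def lpc_order1(signal):
    """Order-1 LPC coefficient a1 from the autocorrelation ratio."""
    r0 = sum(x * x for x in signal)
    r1 = sum(signal[i] * signal[i - 1] for i in range(1, len(signal)))
    return -r1 / r0 if r0 else 0.0


def analyze(signal, a1):
    """Residual e[n] = s[n] + a1 * s[n - 1]."""
    return [s + a1 * (signal[i - 1] if i else 0.0) for i, s in enumerate(signal)]


def quantize(values, step):
    """Uniform quantizer (stand-in for the codec's quantization)."""
    return [round(v / step) * step for v in values]


def synthesize(residual, a1):
    """Inverse of analyze: s[n] = e[n] - a1 * s[n - 1]."""
    out = []
    for i, e in enumerate(residual):
        out.append(e - a1 * (out[i - 1] if i else 0.0))
    return out


speech = [1.0, 0.5, 0.25, 0.125]           # hypothetical input frame
a1 = lpc_order1(speech)
residual = quantize(analyze(speech, a1), 0.01)
output = synthesize(residual, a1)           # approximates the input
```

Without quantization, synthesis reproduces the input up to rounding error; with it, the reconstruction error depends on the step size.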

AUDIO PROCESSING METHOD AND DEVICE, TERMINAL, AND COMPUTER-READABLE STORAGE MEDIUM

Implementations of the disclosure provide an audio processing method, a device, a terminal, and a computer-readable storage medium. The method can include the following. A to-be-matched audio and a reference audio are obtained. A frequency spectrum distribution of the to-be-matched audio and a frequency spectrum distribution of the reference audio are obtained. A target filter set for matching the to-be-matched audio with the reference audio is determined according to the two frequency spectrum distributions. The target filter set is determined as a matching rule for the to-be-matched audio and the reference audio. An audio playing device is compensated by using the matching rule to adjust an audio playing effect of the audio playing device, and an audio is played through the compensated audio playing device.
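
One simple way to derive a filter set from two spectrum distributions is a per-band gain, sketched below. It assumes each distribution is a list of positive band powers and that the filters are plain gains clipped to a limit; the abstract specifies neither, and `matching_gains_db` and `max_gain_db` are hypothetical names.

```python
import math


def matching_gains_db(source_power, reference_power, max_gain_db=12.0):
    """Per-band gain in dB that moves the source spectrum toward the
    reference, clipped to +/- max_gain_db. Band powers must be > 0."""
    gains = []
    for src, ref in zip(source_power, reference_power):
        g = 10.0 * math.log10(ref / src)
        gains.append(max(-max_gain_db, min(max_gain_db, g)))
    return gains


# Hypothetical three-band powers for the two signals.
gains = matching_gains_db([1.0, 10.0, 1.0], [10.0, 1.0, 1000.0])
```

The clipping keeps a very quiet band in the to-be-matched audio from producing an extreme boost.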