G10L25/69

AUDIO ENCODING METHOD, AUDIO DECODING METHOD, APPARATUS, COMPUTER DEVICE, STORAGE MEDIUM, AND COMPUTER PROGRAM PRODUCT
20230046509 · 2023-02-16

An audio encoding bit rate prediction model training method is performed by a computer device. The method includes: obtaining a sample audio feature parameter corresponding to each of sample audio frames in a first sample audio; performing encoding bit rate prediction on the sample audio feature parameter through an encoding bit rate prediction model, to obtain a sample encoding bit rate for each of the sample audio frames; performing audio encoding on the sample audio frames based on the corresponding sample encoding bit rates to generate sample audio data corresponding to the sample audio frames; performing audio decoding on the sample audio data, to obtain a second sample audio corresponding to the sample audio data; and training the encoding bit rate prediction model based on the first sample audio and the second sample audio until a sample encoding quality score reaches a target encoding quality score.
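
The predict, encode, decode, and score cycle described above can be sketched as follows. This is a minimal toy under stated assumptions, not the method claimed in the abstract: the "codec" is plain uniform quantization, the quality score is SNR in dB, the feature parameter is frame energy, and the one-parameter predictor with its step-wise update (predict_bits, train_predictor) is hypothetical. A real system would train a neural model against a perceptual quality score.

```python
import math

def frame_energy(frame):
    """Toy audio feature parameter: mean power of one frame."""
    return sum(x * x for x in frame) / len(frame)

def encode(frame, bits):
    """Toy encoder: uniform quantization of samples in [-1, 1] to `bits` bits."""
    levels = 2 ** bits - 1
    return [round((x + 1.0) / 2.0 * levels) for x in frame]

def decode(codes, bits):
    """Toy decoder: map quantization indices back to samples in [-1, 1]."""
    levels = 2 ** bits - 1
    return [c / levels * 2.0 - 1.0 for c in codes]

def snr_db(reference, decoded):
    """Stand-in quality score comparing the first and second sample audio."""
    noise = sum((r - d) ** 2 for r, d in zip(reference, decoded))
    signal = sum(r * r for r in reference)
    if noise == 0:
        return float("inf")
    return 10.0 * math.log10(signal / noise)

def predict_bits(energy, bias):
    """Hypothetical predictor: louder frames get one extra bit (clamped to 2..16)."""
    return max(2, min(16, bias + (1 if energy > 0.1 else 0)))

def train_predictor(frames, target_db, max_steps=20):
    """Raise the predictor's bias until the mean quality score reaches the target."""
    bias, quality = 2, float("-inf")
    for _ in range(max_steps):
        scores = []
        for frame in frames:
            bits = predict_bits(frame_energy(frame), bias)
            scores.append(snr_db(frame, decode(encode(frame, bits), bits)))
        quality = sum(scores) / len(scores)
        if quality >= target_db:
            break
        bias += 1  # assumed update rule; stands in for a gradient step
    return bias, quality

# Two sample frames: a quiet and a loud sine wave.
frames = [[a * math.sin(2 * math.pi * k / 16) for k in range(16)] for a in (0.2, 0.8)]
bias, quality = train_predictor(frames, target_db=30.0)
```

The loop stops as soon as the sample quality score meets the target, which mirrors the stopping condition in the abstract.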

METHOD AND APPARATUS FOR DETERMINING PARAMETERS OF A GENERATIVE NEURAL NETWORK
20230229892 · 2023-07-20

Described herein is a method of determining parameters for a generative neural network for processing an audio signal, wherein the generative neural network includes an encoder stage mapping to a coded feature space and a decoder stage, each stage including a plurality of convolutional layers with one or more weight coefficients, the method comprising a plurality of cycles with sequential processes of: pruning the weight coefficients of either or both stages based on pruning control information, the pruning control information determining the number of weight coefficients that are pruned for respective convolutional layers; training the pruned generative neural network based on a set of training data; determining a loss for the trained and pruned generative neural network based on a loss function; and determining updated pruning control information based on the determined loss and a target loss. Further described are corresponding apparatus, programs, and computer-readable storage media.
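
The cycle of pruning, training, loss measurement, and pruning-control update can be illustrated with a small sketch. Everything here is an assumption made for illustration: the pruning control information is modeled as one integer percentage per layer, pruning is by smallest magnitude, training is a no-op placeholder, the loss is the total magnitude of removed weights, and the update rule simply raises or lowers every percentage by a fixed step. Each cycle also prunes from the original weights rather than cumulatively. The abstract does not specify any of these choices.

```python
def prune_layer(weights, percent):
    """Zero the smallest-magnitude `percent` percent of a layer's weights."""
    k = len(weights) * percent // 100
    drop = set(sorted(range(len(weights)), key=lambda i: abs(weights[i]))[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]

def train(layers):
    """Placeholder for retraining the pruned network (no-op in this sketch)."""
    return layers

def toy_loss(original, pruned):
    """Stand-in loss: total magnitude of the weights that were pruned away."""
    return sum(
        abs(o)
        for layer_o, layer_p in zip(original, pruned)
        for o, p in zip(layer_o, layer_p)
        if p == 0.0 and o != 0.0
    )

def run_cycles(layers, target_loss, cycles=10, step=10):
    """Adapt per-layer pruning percentages so the loss hovers near the target."""
    percents = [0] * len(layers)  # pruning control information
    history = []
    for _ in range(cycles):
        pruned = train([prune_layer(w, p) for w, p in zip(layers, percents)])
        loss = toy_loss(layers, pruned)
        history.append((list(percents), loss))
        # Assumed rule: prune more while under the target loss, less otherwise.
        delta = step if loss < target_loss else -step
        percents = [max(0, min(90, p + delta)) for p in percents]
    return percents, history

layers = [[0.1, -0.2, 0.3, -0.4, 0.5, 0.6, -0.7, 0.8, 0.9, -1.0]]
final_percents, history = run_cycles(layers, target_loss=0.8)
```

With these toy weights the controller climbs to 30 to 40 percent pruning and then oscillates there, since pruning four weights pushes the stand-in loss past the target.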

Acoustic quality evaluation apparatus, acoustic quality evaluation method, and program

The aim is to obtain an appropriate evaluation value in an acoustic quality evaluation conducted by a conversational test. An acoustic quality evaluation apparatus 3 evaluates the acoustic quality of a call performed between a near-end terminal 1 and a far-end terminal 2 via a voice communication network 4. An evaluation value presenting unit 31 displays, on a display unit 13, evaluation categories obtained by classifying each of a plurality of evaluation viewpoints into a predetermined number of levels. An input unit 14 transmits the evaluation category selected by the evaluator for each of the evaluation viewpoints to an evaluation value determination unit 32. The evaluation value determination unit 32 determines, as the subjective evaluation value for acoustic quality, the lowest of the evaluation values assigned to the evaluation categories received from the input unit 14.
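
The final step, taking the lowest value across viewpoints, is simple enough to show directly. In this sketch the viewpoint names and the 1 to 5 scale are hypothetical; the abstract only says that each viewpoint is classified into a predetermined number of levels.

```python
def subjective_score(category_values):
    """Return the lowest evaluation value across all evaluation viewpoints."""
    if not category_values:
        raise ValueError("at least one evaluation viewpoint is required")
    return min(category_values.values())

# Hypothetical viewpoints rated by one evaluator on an assumed 1-5 scale.
ratings = {"noise": 4, "echo": 2, "delay": 5}
score = subjective_score(ratings)
```

Taking the minimum means a single badly rated viewpoint determines the overall score, however well the others were rated.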

COMPUTERIZED MONITORING OF DIGITAL AUDIO SIGNALS
20220399945 · 2022-12-15

A digital audio quality monitoring device uses a deep neural network (DNN) to provide accurate estimates of signal-to-noise ratio (SNR) from a limited set of features extracted from incoming audio. Some embodiments improve the SNR estimate accuracy by selecting a DNN model from a plurality of available models based on a codec used to compress/decompress the incoming audio. Each model has been trained on audio compressed/decompressed by a codec associated with the model, and the monitoring device selects the model associated with the codec used to compress/decompress the incoming audio. Other embodiments are also provided.
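
Codec-based model selection amounts to a lookup keyed by codec. The sketch below is illustrative only: the codec names, feature vectors, and weights are made up, and small linear functions stand in for the per-codec trained DNNs.

```python
def make_linear_model(weights, bias):
    """Build a stand-in SNR estimator; a real system would load a trained DNN."""
    def model(features):
        return sum(w * f for w, f in zip(weights, features)) + bias
    return model

# Hypothetical registry: one model per codec it was trained on.
MODELS = {
    "opus": make_linear_model([0.5, 1.0], 3.0),
    "g711": make_linear_model([0.25, 2.0], 0.0),
}

def estimate_snr(codec, features):
    """Pick the model associated with the codec and run it on the features."""
    try:
        model = MODELS[codec.lower()]
    except KeyError:
        raise ValueError(f"no model trained for codec {codec!r}") from None
    return model(features)

snr_opus = estimate_snr("opus", [2.0, 4.0])
snr_g711 = estimate_snr("G711", [2.0, 4.0])
```

The same features yield different estimates depending on the codec, which is the point of keeping one model per codec.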

AUTOMATED PIPELINE SELECTION FOR SYNTHESIS OF AUDIO ASSETS

An example method of automated selection of audio asset synthesizing pipelines includes: receiving an audio stream comprising human speech; determining one or more features of the audio stream; selecting, based on the one or more features of the audio stream, an audio asset synthesizing pipeline; training, using the audio stream, one or more audio asset synthesizing models implementing respective stages of the selected audio asset synthesizing pipeline; and responsive to determining that a quality metric of the audio asset synthesizing pipeline satisfies a predetermined quality condition, synthesizing one or more audio assets by the selected audio asset synthesizing pipeline.

Real-time assessment of call quality

Disclosed embodiments provide techniques for improved call quality during telephony sessions. The speech quality of an active voice session is periodically evaluated using multiple noise reduction algorithms. In an instance where the speech quality of the currently used noise reduction algorithm is below the quality of another noise reduction algorithm, the telephony system may switch to a new noise reduction algorithm as the currently used (active) noise reduction algorithm in order to improve call quality during an active voice session.
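
The switching decision can be sketched as a comparison of periodically measured scores. The algorithm names and scores below are hypothetical, and the margin parameter is an added assumption (a hysteresis guard against rapid toggling); with the default margin of 0 the rule matches the abstract's condition that the active algorithm scores below another one.

```python
def choose_active(active, scores, margin=0.0):
    """Return the algorithm to use next, given per-algorithm speech quality scores."""
    best = max(scores, key=scores.get)
    if best != active and scores[best] > scores[active] + margin:
        return best  # another algorithm currently yields better speech quality
    return active

# Hypothetical scores from one periodic evaluation of the active voice session.
scores = {"spectral_subtraction": 3.1, "wiener": 3.6}
next_active = choose_active("spectral_subtraction", scores)
```

In a running system this function would be called once per evaluation period with fresh scores.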

METHOD AND APPARATUS FOR PROCESSING AN INITIAL AUDIO SIGNAL
20230087486 · 2023-03-23

A method processes an initial audio signal, having a target portion and a side portion, by receiving the initial audio signal; modifying the received initial audio signal using a first signal modifier to obtain a first modified audio signal and modifying the received initial audio signal using a second signal modifier to obtain a second modified audio signal; comparing the received initial audio signal with the first modified audio signal to obtain a first perceptual similarity value describing the perceptual similarity between the initial audio signal and the first modified audio signal; comparing the received initial audio signal with the second modified audio signal to obtain a second perceptual similarity value describing the perceptual similarity between the initial audio signal and the second modified audio signal; and selecting the first or second modified audio signal dependent on the respective first or second perceptual similarity value.
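
The modify, compare, and select flow can be illustrated as below. Two things are assumptions: the similarity function is a mean-squared-error placeholder rather than a perceptual model, and the selection rule picks the candidate with the higher similarity, whereas the abstract only says the choice depends on the similarity values. The two gain-based modifiers are also hypothetical.

```python
def similarity(reference, candidate):
    """Placeholder similarity in (0, 1]; a real system would use a perceptual model."""
    mse = sum((r - c) ** 2 for r, c in zip(reference, candidate)) / len(reference)
    return 1.0 / (1.0 + mse)

def select_modified(initial, modifiers):
    """Apply each modifier, score it against the initial signal, return the best."""
    candidates = [modify(initial) for modify in modifiers]
    scores = [similarity(initial, c) for c in candidates]
    best = max(range(len(candidates)), key=scores.__getitem__)
    return candidates[best], scores

def mild(signal):
    """Hypothetical first signal modifier: light attenuation."""
    return [0.9 * v for v in signal]

def strong(signal):
    """Hypothetical second signal modifier: heavy attenuation."""
    return [0.5 * v for v in signal]

initial = [0.0, 1.0, -1.0, 0.5]
chosen, scores = select_modified(initial, [mild, strong])
```

Here the lightly attenuated signal is selected because it stays closer to the initial signal under the placeholder metric.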