G10L19/09

AUDIO SIGNAL ENCODING METHOD AND APPARATUS, AND AUDIO SIGNAL DECODING METHOD AND APPARATUS
20220335960 · 2022-10-20

An audio signal encoding method and apparatus, and an audio signal decoding method and apparatus, are provided. The audio signal encoding method includes: obtaining a frequency-domain coefficient of a current frame and a frequency-domain coefficient of a reference signal of the current frame; performing filtering processing on the frequency-domain coefficient of the current frame to obtain a filtering parameter; determining a target frequency-domain coefficient of the current frame based on the filtering parameter; performing filtering processing on the frequency-domain coefficient of the reference signal and a reference frequency-domain coefficient based on the filtering parameter to obtain a target frequency-domain coefficient of the reference signal; and encoding the target frequency-domain coefficient of the current frame based on the target frequency-domain coefficient of the current frame, the target frequency-domain coefficient of the reference signal, and a reference target frequency-domain coefficient. The method can improve audio signal encoding and decoding efficiency.
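
The central idea in the abstract above, deriving a filtering parameter from the current frame and then applying the same filter to the reference signal, can be pictured with a rough sketch. The first-order predictor across frequency bins, the function names, and the sample values are all assumptions made for illustration; the abstract does not say which filter is used or how the filtered coefficients are combined during encoding.

```python
def estimate_filter_parameter(coeffs):
    """Hypothetical filtering parameter: a first-order predictor across frequency bins."""
    num = sum(coeffs[k] * coeffs[k - 1] for k in range(1, len(coeffs)))
    den = sum(c * c for c in coeffs[:-1])
    return num / den if den else 0.0


def apply_filter(coeffs, a):
    """Filter along frequency: y[k] = x[k] - a * x[k-1], with x[-1] taken as 0."""
    return [c - a * (coeffs[k - 1] if k else 0.0) for k, c in enumerate(coeffs)]


# Made-up frequency-domain coefficients for the current frame and its reference signal.
current = [1.0, 0.5, 0.25, 0.125]
reference = [0.8, 0.4, 0.2, 0.1]

# The parameter is estimated on the current frame only ...
a = estimate_filter_parameter(current)
# ... and the same parameter filters both the current frame and the reference signal.
target_current = apply_filter(current, a)
target_reference = apply_filter(reference, a)
```

Because both sets of coefficients pass through the same filter, they stay comparable afterwards, which is what a later prediction or encoding step would rely on.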

SOUND SIGNAL DATABASE GENERATION APPARATUS, SOUND SIGNAL SEARCH APPARATUS, SOUND SIGNAL DATABASE GENERATION METHOD, SOUND SIGNAL SEARCH METHOD, DATABASE GENERATION APPARATUS, DATA SEARCH APPARATUS, DATABASE GENERATION METHOD, DATA SEARCH METHOD, AND PROGRAM

Database generation techniques are provided that can accurately and efficiently generate a database usable in text-based sound signal search. A sound signal database generation apparatus includes: a latent variable generation unit that generates, from a sound signal, a latent variable corresponding to the sound signal using a sound signal encoder; a data generation unit that generates a natural language representation corresponding to the sound signal from the latent variable and a condition concerning an index for a natural language representation using a natural language representation decoder; and a sound signal database generation unit that generates, from the sound signal and its corresponding natural language representation, a record including both, and generates a sound signal database made up of such records.

Method for speech coding, method for speech decoding and their apparatuses
09852740 · 2017-12-26

High-quality speech is reproduced with a small amount of data in speech coding and decoding, which compress a speech signal into a digital signal and decode it. In a speech coding method based on code-excited linear prediction (CELP), the noise level of the speech in the coding period concerned is evaluated using a code or coding result of at least one of spectrum information, power information, and pitch information, and different excitation codebooks are used depending on the evaluation result.
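
The codebook switching described above can be sketched roughly as follows. The threshold rule on pitch gain, the two tiny codebooks, and all function names are hypothetical; the abstract only states that the noise level is evaluated from already-coded spectrum, power, or pitch information and that the excitation codebook depends on the result.

```python
def evaluate_noise_level(pitch_gain, threshold=0.5):
    """Assumed rule: weak pitch periodicity (low pitch gain) marks the period as noisy."""
    return "noisy" if pitch_gain < threshold else "non_noisy"


# Toy excitation codebooks, one per noise level (values are made up).
CODEBOOKS = {
    "noisy": [[0.3, -0.2, 0.4, -0.3], [-0.4, 0.3, -0.2, 0.3]],
    "non_noisy": [[1.0, 0.0, 0.0, 0.0], [0.0, 0.0, -1.0, 0.0]],
}


def best_code_index(target, codebook):
    """Return the index of the code vector with the smallest squared error to the target."""
    def error(vec):
        return sum((t - v) ** 2 for t, v in zip(target, vec))
    return min(range(len(codebook)), key=lambda i: error(codebook[i]))


def encode_excitation(target, pitch_gain):
    """Pick the codebook from the noise evaluation, then search only that codebook."""
    level = evaluate_noise_level(pitch_gain)
    return level, best_code_index(target, CODEBOOKS[level])
```

Since the evaluation in this sketch uses only a value that has already been coded, a decoder holding the same value could repeat the evaluation and select the same codebook.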

Method and apparatus for polyphonic audio signal prediction in coding and networking systems

A method, device, and apparatus provide the ability to predict a portion of a polyphonic audio signal for compression and networking applications. The solution involves a framework of a cascade of long term prediction filters, which by design is tailored to account for all periodic components present in a polyphonic signal. This framework is complemented with a design method to optimize the system parameters. Specialization may include specific techniques for coding and networking scenarios, where the potential of each enhanced prediction is realized to considerably improve the overall system performance for that application. One specific technique provides enhanced inter-frame prediction for the compression of polyphonic audio signals, particularly at low delay. Another specific technique provides improved frame loss concealment capabilities to combat packet loss in audio communications.
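
As an illustrative sketch only (not the patented design), a cascade of long-term prediction filters can be pictured as one single-tap filter per periodic component, applied in sequence. The single-tap form e[n] = x[n] - g * x[n - N], the function names, and the lags and gains below are assumptions made for illustration.

```python
def ltp_residual(signal, lag, gain):
    """One single-tap long-term prediction error filter: e[n] = x[n] - gain * x[n - lag]."""
    return [
        x - gain * (signal[n - lag] if n >= lag else 0.0)
        for n, x in enumerate(signal)
    ]


def cascade_ltp_residual(signal, stages):
    """Pass the signal through each (lag, gain) stage in turn."""
    residual = list(signal)
    for lag, gain in stages:
        residual = ltp_residual(residual, lag, gain)
    return residual


# A toy "polyphonic" signal: the sum of two periodic components with periods 3 and 4.
period_a = [1.0, 0.0, -1.0]
period_b = [0.5, 0.5, -0.5, -0.5]
mix = [period_a[n % 3] + period_b[n % 4] for n in range(48)]

# One stage per periodic component; a single stage alone cannot remove both.
single_stage = ltp_residual(mix, 3, 1.0)
residual = cascade_ltp_residual(mix, [(3, 1.0), (4, 1.0)])
```

In this toy case the first stage cancels the period-3 component and the second cancels what remains of the period-4 component, so the cascaded residual is zero once both filters have enough history (from sample 7 onward).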

Apparatus and method realizing a fading of an MDCT spectrum to white noise prior to FDNS application

An apparatus for decoding an encoded audio signal to obtain a reconstructed audio signal includes a receiving interface for receiving one or more frames comprising information on a plurality of audio signal samples of an audio signal spectrum of the encoded audio signal, and a processor for generating the reconstructed audio signal. The processor is configured to generate the reconstructed audio signal by fading a modified spectrum to a target spectrum, if a current frame is not received by the receiving interface or if the current frame is received by the receiving interface but is corrupted, wherein the modified spectrum includes a plurality of modified signal samples, wherein, for each of the modified signal samples of the modified spectrum, an absolute value of the modified signal sample is equal to an absolute value of one of the audio signal samples of the audio signal spectrum.
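
A minimal sketch of the concealment idea above, under stated assumptions: the modified spectrum is formed here by random sign flips (one way to keep each sample's absolute value), the target is a small white-noise-like spectrum, and the fade is a linear cross-fade over a fixed number of frames. None of these specifics come from the abstract, and all names and values are hypothetical.

```python
import random


def sign_scramble(spectrum, rng):
    """Randomly flip signs; each output keeps the magnitude of the matching input bin."""
    return [x if rng.random() < 0.5 else -x for x in spectrum]


def fade_to_target(modified, target, fade):
    """Linear cross-fade: fade=0.0 returns the modified spectrum, fade=1.0 the target."""
    return [(1.0 - fade) * m + fade * t for m, t in zip(modified, target)]


rng = random.Random(0)
last_good = [0.9, -0.4, 0.25, -0.1]  # made-up spectrum of the last good frame
modified = sign_scramble(last_good, rng)
noise = [rng.uniform(-0.2, 0.2) for _ in last_good]  # made-up noise target

# One concealed spectrum per lost frame; the five-step schedule is an assumption.
concealed = [fade_to_target(modified, noise, step / 4) for step in range(5)]
```

The first concealed frame equals the modified spectrum and the last equals the noise target, with intermediate frames blending the two.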
