G10L19/09

PACKET LOSS CONCEALMENT FOR SPEECH CODING
20180012606 · 2018-01-11

A speech coding method reduces error propagation due to voice packet loss by limiting or reducing the pitch gain only for the first subframe or the first two subframes within a speech frame. The excitation of the next frame is obtained according to the reduced or limited pitch gain value of the first subframe, and the next frame is encoded according to the obtained excitation. The method is used for a voiced speech class.
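
As a rough illustration of the gain-limiting step only, the sketch below caps the pitch gain of the first one or two subframes of a voiced frame and leaves the remaining subframes untouched. The function name, the cap value of 0.8, and the four-subframe frame are assumptions made for the example; the abstract does not specify them, and it also allows reducing the gain rather than capping it.

```python
from typing import List, Sequence


def limit_pitch_gains(gains: Sequence[float], voiced: bool,
                      max_gain: float = 0.8, n_limited: int = 1) -> List[float]:
    """Cap the pitch gain of the first n_limited subframes of a voiced frame.

    max_gain and n_limited are hypothetical tuning values, not taken from the source.
    """
    if not voiced:
        # Non-voiced frames are passed through unchanged.
        return list(gains)
    return [min(g, max_gain) if i < n_limited else g
            for i, g in enumerate(gains)]


# One pitch gain per subframe of a single frame (illustrative values).
frame_gains = [1.2, 1.1, 0.9, 1.0]
print(limit_pitch_gains(frame_gains, voiced=True))               # first subframe only
print(limit_pitch_gains(frame_gains, voiced=True, n_limited=2))  # first two subframes
```

Because later frames build their adaptive excitation on the limited gain, a lost packet contributes less energy to the frames that follow it, which is the error-propagation effect the abstract targets.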

Signal filtering

In methods and systems for filtering an information input signal, a system may have: a first filter unit filtering an input signal at an initial subinterval in a current update interval according to parameters associated with the preceding update interval, the parameters being scaled by a first scaling factor changing towards 0; and a second filter unit filtering a second filter input signal, based on the output of the first filter unit, at the initial subinterval, according to parameters associated with the current update interval, the parameters being scaled by a second scaling factor changing from 0, or a value close to 0, toward a value more distant from 0.
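
The two-stage arrangement can be sketched as a cascade in which the old parameters fade out while the new ones fade in. The example below is a minimal sketch under assumptions: the one-tap filter form, the linear ramps, and all names are chosen for illustration and are not taken from the abstract.

```python
from typing import List, Sequence


def one_tap_filter(x: Sequence[float], gain: float, lag: int,
                   scale: Sequence[float]) -> List[float]:
    """y[n] = x[n] - scale[n] * gain * x[n - lag], with x taken as 0 before n = 0."""
    return [x[n] - scale[n] * gain * (x[n - lag] if n >= lag else 0.0)
            for n in range(len(x))]


def crossfade_filter(x: Sequence[float], prev_gain: float, cur_gain: float,
                     lag: int = 1) -> List[float]:
    """Filter the initial subinterval with both the old and the new parameters."""
    n = len(x)
    fade_out = [1.0 - k / n for k in range(n)]  # first scaling factor: 1 towards 0
    fade_in = [k / n for k in range(n)]         # second scaling factor: 0 towards 1
    stage1 = one_tap_filter(x, prev_gain, lag, fade_out)  # preceding-interval parameters
    return one_tap_filter(stage1, cur_gain, lag, fade_in)  # current-interval parameters


print(crossfade_filter([1.0, 1.0, 1.0, 1.0], prev_gain=0.5, cur_gain=0.5))
```

With a scaling factor of 0 each stage reduces to a pass-through, so the old filter is fully active at the start of the subinterval and the new filter takes over smoothly by its end.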

System and method for recognizing user's speech

Provided are a system and a method for recognizing a user's speech. A method, performed by a server, of providing a text string for a speech signal input to a device includes: receiving, from the device, an encoder output value derived from an encoder of an end-to-end automatic speech recognition (ASR) model included in the device; identifying a domain corresponding to the received encoder output value; selecting a decoder corresponding to the identified domain from among a plurality of decoders of an end-to-end ASR model included in the server; obtaining a text string from the received encoder output value using the selected decoder; and providing the obtained text string to the device.
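
The server-side steps listed above can be sketched as a small dispatch routine. Everything concrete here is a stand-in: the domain classifier, the two toy decoders, and the names `identify_domain` and `serve_text` are hypothetical and only show the control flow, not a real ASR model.

```python
from typing import Callable, Dict, Sequence

EncoderOutput = Sequence[float]
Decoder = Callable[[EncoderOutput], str]


def serve_text(encoder_output: EncoderOutput,
               identify_domain: Callable[[EncoderOutput], str],
               decoders: Dict[str, Decoder]) -> str:
    """Pick the decoder that matches the identified domain and decode with it."""
    domain = identify_domain(encoder_output)  # identify the domain
    decoder = decoders[domain]                # select the matching decoder
    return decoder(encoder_output)            # obtain the text string


# Toy stand-ins for the domain classifier and the per-domain decoders.
def identify_domain(enc: EncoderOutput) -> str:
    return "music" if sum(enc) > 0 else "navigation"


decoders: Dict[str, Decoder] = {
    "music": lambda enc: "play the next song",
    "navigation": lambda enc: "route to the airport",
}

print(serve_text([0.3, 0.9], identify_domain, decoders))
```

In this arrangement the device only runs the encoder and sends its output, while the server keeps one decoder per domain and returns the decoded text.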

Post filter for audio signals

In some embodiments, a pitch filter for filtering a preliminary audio signal generated from an audio bitstream is disclosed. The pitch filter has an operating mode that is either (i) an active mode, where the preliminary audio signal is filtered using filtering information to obtain a filtered audio signal, or (ii) an inactive mode, where the pitch filter is disabled. The preliminary audio signal is generated in an audio encoder or audio decoder having a coding mode selected from at least two distinct coding modes, and the pitch filter is capable of being selectively operated in either the active mode or the inactive mode while operating in the coding mode based on control information.
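
A minimal sketch of the mode switch follows, assuming the filtering information is a gain and a pitch lag and the control information is a single boolean flag. The comb-style filter form and the name `pitch_post_filter` are illustrative assumptions; the abstract does not define the filter itself.

```python
from typing import List, Sequence


def pitch_post_filter(signal: Sequence[float], gain: float, lag: int,
                      active: bool) -> List[float]:
    """Apply a simple pitch-emphasis filter, or bypass it when inactive."""
    if not active:
        # Inactive mode: the preliminary signal is passed through unchanged.
        return list(signal)
    # Active mode: y[n] = x[n] + gain * x[n - lag], with x taken as 0 before n = 0.
    return [signal[n] + gain * (signal[n - lag] if n >= lag else 0.0)
            for n in range(len(signal))]


preliminary = [1.0, 2.0, 3.0]
print(pitch_post_filter(preliminary, gain=0.5, lag=1, active=True))
print(pitch_post_filter(preliminary, gain=0.5, lag=1, active=False))
```

The flag is independent of the coding mode in this sketch, which mirrors the abstract's point that the filter can be switched on or off while the codec stays in the same coding mode.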

AUDIO SIGNAL ENCODING METHOD AND APPARATUS, AND AUDIO SIGNAL DECODING METHOD AND APPARATUS
20220335961 · 2022-10-20

An audio signal encoding method and apparatus, and an audio signal decoding method and apparatus, are described. The encoding method includes obtaining a target frequency-domain coefficient of a current frame and a reference target frequency-domain coefficient of the current frame. The encoding method further includes calculating a cost function based on the target frequency-domain coefficient and the reference target frequency-domain coefficient of the current frame, where the cost function is for determining whether to perform long-term prediction (LTP) processing on the current frame during encoding of the target frequency-domain coefficient of the current frame. Additionally, the method includes encoding the target frequency-domain coefficient of the current frame based on the cost function.
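
One way to picture a cost-function-based LTP decision is to compare the energy of the target coefficients with the energy left after subtracting a scaled reference. The sketch below is an assumption throughout: the least-squares gain, the residual-energy cost, the 0.5 threshold, and the name `ltp_decision` are chosen for illustration and are not the cost function of the application.

```python
from typing import Sequence, Tuple


def ltp_decision(target: Sequence[float], reference: Sequence[float],
                 threshold: float = 0.5) -> Tuple[bool, float]:
    """Return (use_ltp, gain) for one frame of frequency-domain coefficients.

    threshold is a hypothetical tuning value, not taken from the source.
    """
    num = sum(t * r for t, r in zip(target, reference))
    den = sum(r * r for r in reference)
    gain = num / den if den > 0 else 0.0  # least-squares prediction gain
    energy_plain = sum(t * t for t in target)
    energy_resid = sum((t - gain * r) ** 2 for t, r in zip(target, reference))
    # Enable LTP only when prediction removes enough of the target energy.
    use_ltp = energy_resid < threshold * energy_plain
    return use_ltp, gain


print(ltp_decision([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))  # reference matches the target
print(ltp_decision([1.0, -1.0], [1.0, 1.0]))           # reference is uncorrelated
```

When the decision is positive, an encoder along these lines would code the residual instead of the target coefficients and signal the choice to the decoder.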
