G10L21/0324

Speech Signal Processing Method and Apparatus
20230029267 · 2023-01-26

This application relates to the field of signal processing technologies and headsets, and provides a speech signal processing method and apparatus, to provide a full-band low-noise speech signal. The method is applied to a headset including at least two speech collectors, where the at least two speech collectors include an ear canal speech collector and at least one external speech collector. The method includes: preprocessing a speech signal that is in a first frequency band and that is collected by the ear canal speech collector, to obtain a first speech signal; preprocessing a speech signal that is in a second frequency band and that is collected by the at least one external speech collector, to obtain an external speech signal, where frequency ranges of the first frequency band and the second frequency band are different; performing correlation processing on the first speech signal and the external speech signal to obtain a second speech signal; and outputting a target speech signal, where the target speech signal includes the first speech signal and the second speech signal.
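The band-splitting idea in the abstract can be sketched in a few lines: keep the low band from the ear canal collector (which is shielded from outside noise but band-limited) and the complementary band from the external collector, then combine. This is a minimal illustration only; the function name, the FFT-based crossover, and the fixed crossover frequency are assumptions, and the abstract's "correlation processing" step is not modeled here.

```python
import numpy as np

def combine_bands(in_ear, external, fs, crossover_hz=1000.0):
    """Hypothetical sketch: take the first frequency band (below the
    crossover) from the in-ear signal and the second band (above it)
    from the external microphone, then combine into one output."""
    n = min(len(in_ear), len(external))
    spec_in = np.fft.rfft(in_ear[:n])
    spec_ex = np.fft.rfft(external[:n])
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    low = freqs < crossover_hz            # band collected in the ear canal
    combined = np.where(low, spec_in, spec_ex)
    return np.fft.irfft(combined, n)
```

A real headset pipeline would use causal filter banks and align the two paths in time; the one-shot FFT split above only shows how the two bands are taken from different collectors.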

AUDIO TRANSMITTER PROCESSOR, AUDIO RECEIVER PROCESSOR AND RELATED METHODS AND COMPUTER PROGRAMS

An audio transmitter processor for generating an error protected frame using encoded audio data of an audio frame, the encoded audio data for the audio frame having a first amount of information units and a second amount of information units, has: a frame builder for building a codeword frame having a codeword raster, wherein the frame builder is configured to determine a border between the first amount of information units and the second amount of information units so that a starting information unit of the second amount of information units coincides with a codeword border; and an error protection coder to obtain a plurality of processed codewords representing the error protected frame.
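One way to realize the border condition is to pad the first amount of information units up to the next codeword boundary, so the second amount starts exactly on the raster. The sketch below is an assumption about how such a frame builder could work; the function name, bit-list representation, and padding strategy are illustrative, not the patented mechanism.

```python
def build_codeword_frame(part1, part2, codeword_bits=8):
    """Hypothetical frame builder: pad part1 (the first amount of
    information units) so that part2 (the second amount) starts on a
    codeword border of the raster, then split into codewords that an
    error protection coder could process individually."""
    pad = (-len(part1)) % codeword_bits           # bits needed to reach border
    frame = list(part1) + [0] * pad + list(part2)
    codewords = [frame[i:i + codeword_bits]
                 for i in range(0, len(frame), codeword_bits)]
    border = len(part1) + pad                     # bit position of the border
    return codewords, border
```

Aligning the border this way means a codeword error affects only one of the two data sets, which is the kind of error-containment the framing is aiming for.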

AUTOMATED MIXING OF AUDIO DESCRIPTION

A computer-implemented method of audio processing, the method comprising: receiving audio object data and audio description data, wherein the audio object data includes a first plurality of audio objects; calculating a long-term loudness of the audio object data and a long-term loudness of the audio description data; calculating a plurality of short-term loudnesses of the audio object data and a plurality of short-term loudnesses of the audio description data; reading a first plurality of mixing parameters that correspond to the audio object data; generating a second plurality of mixing parameters based on the first plurality of mixing parameters, the long-term loudness of the audio object data, the long-term loudness of the audio description data, the plurality of short-term loudnesses of the audio object data, and the plurality of short-term loudnesses of the audio description data; generating a gain adjustment visualization corresponding to the second plurality of mixing parameters, the audio object data and the audio description data; and generating mixed audio object data by mixing the audio object data and the audio description data according to the second plurality of mixing parameters, wherein the mixed audio object data includes a second plurality of audio objects, wherein the second plurality of audio objects correspond to the first plurality of audio objects mixed with the audio description data according to the second plurality of mixing parameters.
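The core of the claim is deriving the second set of mixing parameters from long-term and short-term loudness of both streams. A minimal sketch of that derivation, assuming a fixed target offset between description and programme and a simple short-term activity gate for ducking (all thresholds here are invented, not from the patent):

```python
import numpy as np

def ducking_gains(obj_short, desc_short, obj_long, desc_long,
                  offset_db=-10.0, duck_db=-6.0):
    """Hypothetical sketch: derive per-block gain adjustments (a second
    plurality of mixing parameters) from long-term and short-term
    loudness values (all in dB-like loudness units).
    - desc_gain_db aligns the description's long-term loudness to sit
      offset_db below the main programme.
    - obj_gain_db ducks the main audio only in blocks where both the
      description and the programme are active."""
    desc_gain_db = (obj_long + offset_db) - desc_long
    active = ((np.asarray(desc_short) > desc_long - 10.0) &
              (np.asarray(obj_short) > obj_long - 10.0))
    obj_gain_db = np.where(active, duck_db, 0.0)
    return obj_gain_db, desc_gain_db
```

Plotting `obj_gain_db` over time against the two short-term loudness curves would correspond to the "gain adjustment visualization" step of the claim.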

ESTIMATION OF BACKGROUND NOISE IN AUDIO SIGNALS
20230215447 · 2023-07-06

Background noise estimators and methods are disclosed for estimating background noise in an audio signal. Some methods include obtaining at least one parameter associated with an audio signal segment, such as a frame or part of a frame, based on a first linear prediction gain, calculated as a quotient between a residual signal from a 0th-order linear prediction and a residual signal from a 2nd-order linear prediction for the audio signal segment. A second linear prediction gain is calculated as a quotient between a residual signal from a 2nd-order linear prediction and a residual signal from a 16th-order linear prediction for the audio signal segment. Whether the audio signal segment comprises a pause is determined based at least on the obtained at least one parameter; and a background noise estimate is updated based on the audio signal segment when the audio signal segment comprises a pause.
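The two prediction gains in the abstract fall out of a standard Levinson-Durbin recursion, which yields the residual energy at every intermediate order in one pass: the first gain is E_0/E_2 and the second is E_2/E_16. The sketch below shows that computation; the function name and the pause-decision thresholds one would build on top of it are assumptions, not the claimed decision logic.

```python
import numpy as np

def lp_residual_energies(x, max_order=16):
    """Residual (prediction error) energies E_p for linear prediction
    orders p = 0..max_order, via the Levinson-Durbin recursion on the
    autocorrelation of the audio signal segment x."""
    r = np.array([np.dot(x[:len(x) - l], x[l:]) for l in range(max_order + 1)])
    a = np.zeros(max_order + 1)
    a[0] = 1.0
    e = r[0]                                 # E_0: energy of the segment
    energies = {0: e}
    for p in range(1, max_order + 1):
        acc = np.dot(a[:p], r[p:0:-1])       # sum_j a[j] * r[p - j]
        k = -acc / e                         # reflection coefficient
        a_prev = a.copy()
        for j in range(1, p):
            a[j] = a_prev[j] + k * a_prev[p - j]
        a[p] = k
        e *= (1.0 - k * k)                   # E_p for order p
        energies[p] = e
    return energies
```

Given the energies, the quotients `E[0]/E[2]` and `E[2]/E[16]` are the first and second linear prediction gains of the abstract; near-unity gains suggest noise-like content, which is the kind of evidence a pause detector can use before updating the background noise estimate.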

Detecting and Compensating for the Presence of a Speaker Mask in a Speech Signal
20230005498 · 2023-01-05

Compensating a speech signal for the presence of a speaker mask includes receiving a speech signal, dividing the speech signal into subframes, generating speech parameters for a subframe, and determining whether the subframe is suitable for use in detecting a mask. If the subframe is suitable for use in detecting a mask, the speech parameters for the subframe are used in determining whether a mask is present. If a mask is present, the speech parameters for the subframe are modified to produce modified speech parameters that compensate for the presence of the mask.
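The pipeline structure (subframe split, suitability gate, detection, compensation) can be sketched end to end. Everything concrete below is an assumption: the spectral-tilt parameter, the energy gate, the detection threshold, and the high-band boost are stand-ins for whatever speech parameters and compensation the actual method uses.

```python
import numpy as np

def compensate_for_mask(speech, fs, subframe_ms=20):
    """Hypothetical sketch of the pipeline: divide the speech signal
    into subframes, generate simple speech parameters (energy, spectral
    tilt), use only subframes with enough energy for detection, flag a
    mask when the tilt shows strong high-frequency attenuation, and
    apply a corrective high-band boost to flagged subframes."""
    n = int(fs * subframe_ms / 1000)
    out = speech.astype(float).copy()
    for start in range(0, len(out) - n + 1, n):
        sub = out[start:start + n]
        energy = np.mean(sub ** 2)
        if energy < 1e-6:                  # unsuitable: too quiet to judge
            continue
        spec = np.abs(np.fft.rfft(sub))
        half = len(spec) // 2
        tilt = np.sum(spec[half:]) / (np.sum(spec[:half]) + 1e-12)
        if tilt < 0.05:                    # mask suspected: high band damped
            hi = np.fft.rfft(sub)
            hi[half:] *= 2.0               # modify parameters: boost high band
            out[start:start + n] = np.fft.irfft(hi, n)
    return out
```

Broadband speech or noise passes through unchanged; only subframes whose spectrum looks muffled are modified, mirroring the detect-then-compensate structure of the claim.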

Loudness adjustment for downmixed audio content

Audio content coded for a reference speaker configuration is downmixed to downmix audio content coded for a specific speaker configuration. One or more gain adjustments are performed on individual portions of the downmix audio content coded for the specific speaker configuration. Loudness measurements are then performed on the individual portions of the downmix audio content. An audio signal that comprises the audio content coded for the reference speaker configuration and downmix loudness metadata is generated. The downmix loudness metadata is created based at least in part on the loudness measurements on the individual portions of the downmix audio content.
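The flow above (downmix, per-portion gain adjustment, per-portion loudness measurement, metadata creation) can be sketched for a 5.1-to-stereo case. The -3 dB centre/surround coefficients follow common downmix practice, but the RMS-based loudness stand-in, block size, and function name are assumptions; a real system would use a standardized loudness measure over each portion.

```python
import numpy as np

def downmix_loudness_metadata(channels, block=4096, block_gains_db=None):
    """Hypothetical sketch: downmix 5.1 content (L, R, C, LFE, Ls, Rs)
    to stereo, apply optional per-portion gain adjustments, and measure
    per-portion loudness to create downmix loudness metadata.  The LFE
    channel is omitted from the downmix, as is common practice."""
    L, R, C, LFE, Ls, Rs = channels
    g = 10 ** (-3.0 / 20.0)                   # -3 dB centre/surround weight
    left = L + g * C + g * Ls
    right = R + g * C + g * Rs
    metadata = []
    for bi, i in enumerate(range(0, len(left), block)):
        g_db = 0.0 if block_gains_db is None else block_gains_db[bi]
        scale = 10 ** (g_db / 20.0)           # per-portion gain adjustment
        seg = np.stack([left[i:i + block], right[i:i + block]]) * scale
        rms = np.sqrt(np.mean(seg ** 2) + 1e-12)
        metadata.append(20 * np.log10(rms))   # per-portion loudness, dBFS
    return (left, right), metadata
```

In the claimed scheme the metadata travels with the reference-configuration audio, so a downstream decoder that performs the same downmix can apply loudness correction without re-measuring.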
