Apparatus and method for synthesizing an audio signal, decoder, encoder, system and computer program
11373664 · 2022-06-28
Assignee
Inventors
- Guillaume Fuchs (Bubenrath, DE)
- Tom Baeckstroem (Nuremberg, DE)
- Ralf Geiger (Erlangen, DE)
- Wolfgang Jaegers (Erlangen, DE)
- Emmanuel Ravelli (Erlangen, DE)
Cpc classification
- G10L19/06 (PHYSICS)
- G10L19/12 (PHYSICS)
- G10L19/087 (PHYSICS)
International classification
- G10L19/00 (PHYSICS)
- G10L19/12 (PHYSICS)
- G10L19/02 (PHYSICS)
- G10L19/087 (PHYSICS)
Abstract
A method and an apparatus for synthesizing an audio signal are described. A spectral tilt is applied to the code of a codebook used for synthesizing a current frame of the audio signal. The spectral tilt is based on the spectral tilt of the current frame of the audio signal. Further, an audio decoder operating in accordance with the inventive approach is described.
Claims
1. An apparatus for synthesizing an audio signal, comprising: an input for receiving an encoded signal, a codebook for decoding the encoded audio signal, the codebook comprising a plurality of codes, a synthesizer for receiving from the codebook a code selected from the codebook on the basis of the encoded audio signal, and for generating a synthesized signal, and a processing unit comprising a hardware implementation and configured to apply a spectral tilt to the code of a codebook used for synthesizing a current frame of the audio signal, wherein the spectral tilt is based on the spectral tilt of the current frame of the audio signal, wherein the apparatus is configured to determine the spectral tilt of the current frame of the audio signal on the basis of spectral envelope information for the current frame of the audio signal, and wherein the processing unit is configured to apply the spectral tilt by filtering the code from the codebook based on a transfer function modeling the spectral tilt.
2. The apparatus of claim 1, wherein the spectral envelope information is defined by LPC coefficients, and wherein the spectral tilt of the current frame of the audio signal is defined as follows:
3. The apparatus of claim 2, wherein N is equal to the number of codes in the codebook.
4. The apparatus of claim 1, wherein the spectral envelope information is defined by LPC coefficients, and wherein the spectral tilt of the current frame of the audio signal is defined as follows:
5. The apparatus of claim 1, wherein the transfer function comprising the spectral tilt is defined as follows:
$F_{t1}(z) = 1 - \gamma z^{-1}$.
6. The apparatus of claim 1, wherein the processing unit is further configured to combine the determined spectral tilt of the current frame of the audio signal with a factor related to the voicing of the previous frame of the audio signal.
7. The apparatus of claim 6, wherein the processing unit is configured to apply the spectral tilt by filtering the code from the codebook based on a transfer function comprising the spectral tilt and the factor related to the voicing of the previous frame of the audio signal.
8. The apparatus of claim 7, wherein the transfer function comprising the spectral tilt is defined as follows:
$F_{t2}(z) = 1 - (a \cdot \beta + b \cdot \gamma) z^{-1}$, with: a, b constants.
9. The apparatus of claim 6, wherein the factor related to the voicing of the previous frame of the audio signal is defined as follows:
10. An audio decoder comprising apparatus for synthesizing an audio signal according to claim 1.
11. A system, comprising: an audio decoder comprising apparatus for synthesizing an audio signal according to claim 1, and an audio encoder for encoding an audio signal, wherein the audio encoder is configured to determine from a spectral tilt of a current frame of the audio signal a spectral tilt for a code of a codebook representing a current frame of the audio signal.
12. A method for synthesizing an audio signal, the method comprising: receiving an encoded signal, decoding the encoded audio signal using a codebook comprising a plurality of codes, a synthesizer for receiving from the codebook a code selected from the codebook on the basis of the encoded audio signal, and for generating a synthesized signal using a code selected from the codebook on the basis of the encoded audio signal, and applying, by a processing unit comprising a hardware implementation, a spectral tilt to the code of a codebook used for synthesizing a current frame of the audio signal, wherein the spectral tilt is determined on the basis of the spectral tilt of the current frame of the audio signal, wherein the spectral tilt of the current frame of the audio signal is determined on the basis of spectral envelope information for the current frame of the audio signal, and wherein applying the spectral tilt comprises filtering the code from the codebook based on a transfer function modeling the spectral tilt.
13. The method of claim 12, wherein the spectral envelope information is defined by LPC coefficients, and wherein the spectral tilt of the current frame of the audio signal is determined as follows:
14. The method of claim 13, wherein N is equal to the number of codes in the codebook.
15. The method of claim 12, wherein the spectral envelope information is defined by LPC coefficients, and wherein the spectral tilt of the current frame of the audio signal is determined as follows:
16. The method of claim 12, wherein the transfer function comprising the spectral tilt is determined as follows:
$F_{t1}(z) = 1 - \gamma z^{-1}$.
17. The method of claim 12, further comprising combining the determined spectral tilt of the current frame of the audio signal with a factor related to the voicing of the previous frame of the audio signal.
18. The method of claim 17, wherein the factor related to the voicing of the previous frame of the audio signal is determined as follows:
19. The method of claim 17, wherein applying the spectral tilt comprises filtering the code from the codebook based on a transfer function comprising the spectral tilt and the factor related to the voicing of the previous frame of the audio signal.
20. The method of claim 19, wherein the transfer function comprising the spectral tilt is determined as follows:
$F_{t2}(z) = 1 - (a \cdot \beta + b \cdot \gamma) z^{-1}$, with: a, b constants.
21. A non-transitory digital storage medium having a computer program stored thereon to perform, when said computer program is run by a computer, a method for synthesizing an audio signal, which method comprises: receiving an encoded signal, decoding the encoded audio signal using a codebook comprising a plurality of codes, a synthesizer for receiving from the codebook a code selected from the codebook on the basis of the encoded audio signal, and for generating a synthesized signal using a code selected from the codebook on the basis of the encoded audio signal, and applying, by a processing unit comprising a hardware implementation, a spectral tilt to the code of a codebook used for synthesizing a current frame of the audio signal, wherein the spectral tilt is determined on the basis of the spectral tilt of the current frame of the audio signal, wherein the spectral tilt of the current frame of the audio signal is determined on the basis of spectral envelope information for the current frame of the audio signal, and wherein applying the spectral tilt comprises filtering the code from the codebook based on a transfer function modeling the spectral tilt.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
(1) Embodiments of the present invention will be detailed subsequently referring to the appended drawings, in which:
(2)
(3)
(4)
(5)
(6)
(7) In the following, embodiments of the inventive approach will be described. It is noted that in the subsequent description similar elements/steps are referred to by the same reference signs.
DETAILED DESCRIPTION OF THE INVENTION
(8)
(9) In accordance with further embodiments, an adaptive tilt compensation for shaping codes of a CELP innovative codebook will be described.
(10) The synthesizer 200 includes the filter 218 that is connected between the fixed codebook 202 and the first amplifier 212. The filter 218 receives from the storage 216 the LPC coefficients for the current frame. By means of the inventive structure the tilt of the audio frame that is currently processed is recovered from the already transmitted LPC coefficients that are stored in storage 216. In accordance with the embodiment of
(11)
where N is the size of the truncation of the infinite impulse response $f_S(n)$. In accordance with an embodiment, N is equal to the size of the innovative codebook, i.e. N is equal to the number of codes or codewords stored in the innovative codebook. The spectral tilt is applied, in accordance with the embodiment of
$c(n) * f_{t1}(n)$,
where $f_{t1}(n)$ is the impulse response of the following transfer function:
$F_{t1}(z) = 1 - \gamma z^{-1}$.
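As an illustration of the convolution above, the following minimal sketch filters a code vector c(n) with the two-tap impulse response of the first-order transfer function, i.e. each output sample is c(n) − γ·c(n−1). The function name `apply_tilt`, the example code vector, and the zero initial filter state are assumptions made for this sketch and are not taken from the description.

```python
def apply_tilt(code, gamma):
    """Filter a codebook vector with the transfer function 1 - gamma * z^-1.

    This equals convolving the code c(n) with the impulse response
    [1, -gamma], truncated to the length of the code.
    """
    shaped = []
    previous = 0.0  # assumption: zero filter state at the start of the code
    for sample in code:
        shaped.append(sample - gamma * previous)
        previous = sample
    return shaped


# Hypothetical sparse code with two pulses.
code = [1.0, 0.0, 0.0, -1.0, 0.0]

# For gamma > 0 the filter gain is (1 - gamma) at DC and (1 + gamma) at the
# Nyquist frequency; for gamma < 0 the relation is reversed.
print(apply_tilt(code, gamma=0.5))  # [1.0, -0.5, 0.0, -1.0, 0.5]
```

Setting γ to 0 leaves the code unchanged, which corresponds to applying no tilt at all.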
(12) The embodiment of
(13) In accordance with a third embodiment, for further improving the spectral tilt to be closer to an optimal tilt, i.e. to be closer to the actual tilt of the current frame of the input signal, the LPC synthesis filter 208 has the following transfer function:
(14)
with w1=0.8 and w2=0.9. In this case, the spectral tilt is defined as follows:
(15)
(16) The weighting constants w1 and w2 are used to control the dynamics of the spectral envelope. For example, if w1=0 and w2=1, then $F_e(z)$ follows the true signal envelope quite closely. The resulting spectral tilt γ will show high dynamics and can fluctuate too much. This may be a solution for very low bit-rates where the codebook definitely lacks tilt structure. However, it was found that perceptually it is better to deduce the spectral tilt γ from a smoothed version of the spectral envelope. Good smoothing was found to be achieved with the above values w1=0.8 and w2=0.9, which show a good trade-off for a large range of bit-rates. In accordance with embodiments, w1 and w2 may be bit-rate dependent. At very high rates, if the codebook is large enough and is able to model any spectral tilt γ, one may switch off the influence of the spectral tilt γ by setting w1=w2=1.
(17) When compared to the second embodiment, which yields a tilt having a steeper slope than the optimal tilt would have, the third embodiment using the “weighted” transfer function provides for a tilt that is closer to the actual tilt of the current frame.
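The effect of the weighting constants can be explored numerically. The formulas for the weighted transfer function and for γ are not reproduced in this text, so the sketch below rests on two loudly stated assumptions. First, the weighted envelope is taken to be A(z/w1)/A(z/w2), with A(z) = 1 + Σ a_k z^−k the LPC polynomial; this form is chosen only because it reproduces the limiting cases described above (w1=0, w2=1 gives the full envelope 1/A(z); w1=w2=1 gives a flat response). Second, γ is taken to be the normalized lag-1 autocorrelation of the impulse response truncated to N samples, which stays within −1 and 1. All function names and the first-order example coefficients are hypothetical.

```python
def bandwidth_expand(lpc, w):
    """Coefficients of A(z/w): the k-th coefficient of A(z) is scaled by w**k."""
    return [a * (w ** k) for k, a in enumerate(lpc)]


def impulse_response(num, den, n_samples):
    """First n_samples of the impulse response of num(z)/den(z), with den[0] == 1."""
    h = []
    for n in range(n_samples):
        acc = num[n] if n < len(num) else 0.0
        for k in range(1, min(n, len(den) - 1) + 1):
            acc -= den[k] * h[n - k]
        h.append(acc)
    return h


def spectral_tilt(lpc, w1, w2, n_samples):
    """Assumed tilt measure: normalized lag-1 autocorrelation of the
    truncated impulse response of A(z/w1)/A(z/w2)."""
    num = bandwidth_expand(lpc, w1)
    den = bandwidth_expand(lpc, w2)
    h = impulse_response(num, den, n_samples)
    energy = sum(x * x for x in h)  # at least 1, because h[0] == 1
    lag1 = sum(h[n] * h[n - 1] for n in range(1, len(h)))
    return lag1 / energy


# Hypothetical first-order LPC polynomial A(z) = 1 - 0.9 z^-1 (low-pass envelope).
lpc = [1.0, -0.9]
N = 64  # hypothetical truncation length

raw = spectral_tilt(lpc, 0.0, 1.0, N)       # unsmoothed envelope 1/A(z)
smooth = spectral_tilt(lpc, 0.8, 0.9, N)    # smoothed envelope
disabled = spectral_tilt(lpc, 1.0, 1.0, N)  # flat response, tilt of 0

print(raw, smooth, disabled)
```

Under these assumptions the unsmoothed setting yields a tilt close to 0.9 for this example, the smoothed setting yields a much smaller positive value, and w1=w2=1 yields 0, which matches the qualitative behavior described for the three settings.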
(18)
$F_{t2}(z) = 1 - (a \cdot \beta + b \cdot \gamma) z^{-1}$
where a and b are constants. In an advantageous embodiment a=0.5 and b=0.25. The factor β may be deduced from the voicing of a previous frame as follows:
(19)
and the actual factor β may be determined as follows:
$\beta = \text{constant} \cdot (1 + \text{voicing})$
(20) The constants a and b are applied to control the mixture of the voicing tilt β and the spectral tilt γ. As mentioned above with regard to the weighting constants w1 and w2, for low and medium bit-rates, it may be relevant to shape the codebook by sharpening low frequencies or high frequencies based on the spectral tilt γ. It was also observed that the more the signal is voiced, the better it is to sharpen the high frequencies. The constants a and b may be used to normalize the tilt factors β and γ and weigh their strengths in order to combine the two effects as desired. In accordance with embodiments, the constants a and b may be found empirically by assessing the perceptual quality. The values a=0.5 and b=0.25 give about the same strength to both factors: γ is bounded between −1 and 1, so $b \cdot \gamma$ is between −0.25 and 0.25, and β is bounded between 0 and 0.5, so $a \cdot \beta$ is bounded between 0 and 0.25. As for the weighting constants w1 and w2, the constants a and b may also be made bit-rate dependent.
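The bounds stated above can be checked with a short sketch of the combined filter. It assumes that the voicing measure lies between −1 and 1 and that the unnamed constant in the expression for β equals 0.25, a value chosen here only because it yields the stated range of 0 to 0.5 for β; the voicing formula itself is not reproduced in this text, so voicing is treated as a given input. The names `voicing_tilt`, `combined_coefficient` and `apply_combined_tilt` are hypothetical.

```python
A_CONST = 0.5      # constant a from the description
B_CONST = 0.25     # constant b from the description
BETA_SCALE = 0.25  # assumption: gives beta in [0, 0.5] for voicing in [-1, 1]


def voicing_tilt(voicing):
    """beta = constant * (1 + voicing), with the constant assumed to be 0.25."""
    return BETA_SCALE * (1.0 + voicing)


def combined_coefficient(voicing, gamma):
    """First-order coefficient a * beta + b * gamma of the combined tilt filter."""
    return A_CONST * voicing_tilt(voicing) + B_CONST * gamma


def apply_combined_tilt(code, voicing, gamma):
    """Filter a code with 1 - (a * beta + b * gamma) * z^-1."""
    mu = combined_coefficient(voicing, gamma)
    shaped = []
    previous = 0.0  # assumption: zero filter state at the start of the code
    for sample in code:
        shaped.append(sample - mu * previous)
        previous = sample
    return shaped


# Extremes of the coefficient under the stated bounds.
print(combined_coefficient(voicing=1.0, gamma=1.0))    # 0.5
print(combined_coefficient(voicing=-1.0, gamma=-1.0))  # -0.25
print(apply_combined_tilt([1.0, 0.0, -1.0], voicing=1.0, gamma=1.0))
```

With these values the filter coefficient always lies between −0.25 and 0.5, so the voicing term can only push the filter toward high-frequency emphasis, while the spectral tilt term can push it in either direction.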
(21) In accordance with the fourth embodiment, the audio synthesis as shown in
(22)
(23)
(24) Although some aspects have been described in the context of an apparatus, it is clear that these aspects also represent a description of the corresponding method, where a block or device corresponds to a method step or a feature of a method step. Analogously, aspects described in the context of a method step also represent a description of a corresponding block or item or feature of a corresponding apparatus. Some or all of the method steps may be executed by (or using) a hardware apparatus, like for example, a microprocessor, a programmable computer or an electronic circuit. In some embodiments, one or more of the most important method steps may be executed by such an apparatus.
(25) Depending on certain implementation requirements, embodiments of the invention can be implemented in hardware or in software. The implementation can be performed using a non-transitory storage medium such as a digital storage medium, for example a floppy disc, a DVD, a Blu-Ray, a CD, a ROM, a PROM, an EPROM, an EEPROM or a FLASH memory, having electronically readable control signals stored thereon, which cooperate (or are capable of cooperating) with a programmable computer system such that the respective method is performed. Therefore, the digital storage medium may be computer readable.
(26) Some embodiments according to the invention comprise a data carrier having electronically readable control signals, which are capable of cooperating with a programmable computer system, such that one of the methods described herein is performed.
(27) Generally, embodiments of the present invention can be implemented as a computer program product with a program code, the program code being operative for performing one of the methods when the computer program product runs on a computer. The program code may, for example, be stored on a machine readable carrier.
(28) Other embodiments comprise the computer program for performing one of the methods described herein, stored on a machine readable carrier.
(29) In other words, an embodiment of the inventive method is, therefore, a computer program having a program code for performing one of the methods described herein, when the computer program runs on a computer.
(30) A further embodiment of the inventive method is, therefore, a data carrier (or a digital storage medium, or a computer-readable medium) comprising, recorded thereon, the computer program for performing one of the methods described herein. The data carrier, the digital storage medium or the recorded medium are typically tangible and/or non-transitory.
(31) A further embodiment of the inventive method is, therefore, a data stream or a sequence of signals representing the computer program for performing one of the methods described herein. The data stream or the sequence of signals may, for example, be configured to be transferred via a data communication connection, for example, via the internet.
(32) A further embodiment comprises a processing means, for example, a computer or a programmable logic device, configured to, or programmed to, perform one of the methods described herein.
(33) A further embodiment comprises a computer having installed thereon the computer program for performing one of the methods described herein.
(34) A further embodiment according to the invention comprises an apparatus or a system configured to transfer (for example, electronically or optically) a computer program for performing one of the methods described herein to a receiver. The receiver may, for example, be a computer, a mobile device, a memory device or the like. The apparatus or system may, for example, comprise a file server for transferring the computer program to the receiver.
(35) In some embodiments, a programmable logic device (for example, a field programmable gate array) may be used to perform some or all of the functionalities of the methods described herein. In some embodiments, a field programmable gate array may cooperate with a microprocessor in order to perform one of the methods described herein. Generally, the methods are advantageously performed by any hardware apparatus.
(36) While this invention has been described in terms of several embodiments, there are alterations, permutations, and equivalents which fall within the scope of this invention. It should also be noted that there are many alternative ways of implementing the methods and compositions of the present invention. It is therefore intended that the following appended claims be interpreted as including all such alterations, permutations and equivalents as fall within the true spirit and scope of the present invention.