Apparatus and method for downmixing or upmixing a multichannel signal using phase compensation
11488609 · 2022-11-01
Assignee
Inventors
- Jan Buethe (Erlangen, DE)
- Guillaume Fuchs (Bubenreuth, DE)
- Wolfgang Jaegers (Forchheim, DE)
- Franz Reutelhuber (Erlangen, DE)
- Juergen Herre (Erlangen, DE)
- Eleni Fotopoulou (Nuremberg, DE)
- Markus Multrus (Nuremberg, DE)
- Srikanth Korse (Nuremberg, DE)
CPC classification
- H04S2420/03 (ELECTRICITY)
- H04S7/30 (ELECTRICITY)
- G10L19/008 (PHYSICS)
- H04S3/008 (ELECTRICITY)
International classification
- G10L19/008 (PHYSICS)
- H04S3/00 (ELECTRICITY)
- H04S7/00 (ELECTRICITY)
Abstract
An apparatus for downmixing a multi-channel signal having at least two channels, has: a downmixer for calculating a downmix signal from the multi-channel signal, wherein the downmixer is configured to calculate the downmix using an absolute phase compensation, so that a channel having a lower energy among the at least two channels is only rotated or is rotated stronger than a channel having a greater energy in calculating the downmix signal; and an output interface for generating an output signal, the output signal having information on the downmix signal.
Claims
1. An apparatus for upmixing an encoded multi-channel signal, comprising: an input interface for receiving the encoded multi-channel signal and for acquiring a downmix signal from the encoded multi-channel signal and for acquiring a side gain from the encoded multi-channel signal, the side gain indicating an energy relation between a first original channel and a second original channel; and an upmixer for upmixing the downmix signal, wherein the upmixer is configured to calculate a reconstructed first channel and a reconstructed second channel using a phase compensation, wherein the downmix signal is, in calculating the reconstructed first channel, only phase-rotated or is phase-rotated stronger than in calculating the reconstructed second channel depending on the side gain.
2. The apparatus of claim 1, wherein the input interface is configured to acquire, from the encoded multichannel signal, inter-channel phase difference values, and wherein the upmixer is configured to apply the inter-channel phase difference values in the phase compensation, when calculating the reconstructed first channel and the reconstructed second channel.
3. The apparatus of claim 2, wherein the upmixer is configured to calculate a phase rotation parameter from an inter-channel phase difference value and the side gain, and to apply the phase rotation parameter in the phase compensation, when calculating the reconstructed first channel in a first manner, and to apply the inter-channel phase difference value and the phase rotation parameter in the phase compensation, when calculating the reconstructed second channel in a second manner, wherein the first manner is different from the second manner.
4. The apparatus of claim 3, wherein the upmixer is configured to calculate the phase rotation parameter so that the phase rotation parameter is within ±20% of a value determined based on the following equation:
5. The apparatus of claim 4, wherein the atan function comprises an atan2 function, the atan2(y,x) function being the two argument arctangent function whose value is an angle between the point (x,y) and a positive x-axis.
6. The apparatus of claim 4, wherein the upmixer is configured to calculate the reconstructed first channel and the reconstructed second channel so that the reconstructed first channel and the reconstructed second channel comprise values that are in the range of ±20% with respect to values as determined based on the following equations:
7. The apparatus of claim 3, wherein the upmixer is configured to calculate the reconstructed first channel and the reconstructed second channel so that the reconstructed first channel and the reconstructed second channel comprise values that are in the range of ±20% with respect to values as determined based on the following equations:
8. The apparatus of claim 1, wherein the apparatus further comprises a residual signal synthesizer for synthesizing a residual signal using the residual gain; wherein the upmixer is configured to perform a first weighting operation of the downmix signal using the side gain to acquire a first weighted downmix signal, wherein the upmixer is configured to perform a second weighting operation using the side gain and the downmix signal to acquire a second weighted downmix signal, wherein the first weighting operation is different from the second weighting operation, so that the first weighted downmix signal is different from the second weighted downmix signal, and wherein the upmixer is configured to calculate the reconstructed first channel using a combination of the first weighted downmix signal and the residual signal and to calculate the reconstructed second channel using a second combination of the second weighted downmix signal and the residual signal.
9. The apparatus of claim 8, wherein the upmixer is configured to combine the first weighted downmix signal and the residual signal using a first combination rule in calculating the reconstructed first channel, and wherein the upmixer is configured to combine the second weighted downmix signal and the residual signal using a second combination rule in calculating the reconstructed second channel, wherein the first combination rule and the second combination rule are different from each other, or wherein one of the first and the second combination rules is an adding operation and the other of the first and the second combination rules is a subtracting operation.
10. The apparatus of claim 8, wherein the upmixer is configured to perform the first weighting operation comprising a weighting factor derived from a sum of the side gain and a first predetermined number, and wherein the upmixer is configured to perform the second weighting operation comprising a weighting factor derived from a difference between a second predetermined number and the side gain, wherein the first predetermined number and the second predetermined number are equal to each other or are different from each other.
11. The apparatus of claim 8, wherein the residual signal synthesizer is configured to weight a downmix signal of a preceding frame using the residual gain for a current frame to acquire the residual signal for the current frame, or to weight a decorrelated signal derived from the current frame or from one or more preceding frames using the residual gain for the current frame to acquire the residual signal for the current frame.
12. The apparatus of claim 8, wherein the residual signal synthesizer is configured to calculate the residual signal so that an energy of the residual signal is equal to a signal energy indicated by the residual gain.
13. The apparatus of claim 8, wherein the residual signal synthesizer is configured to calculate the residual signal so that values of the residual signal are in a range of ±20% of values determined based on the following equation:
14. The apparatus of claim 13, wherein g.sub.norm is the energy normalization factor comprising values in the range of ±20% of values determined based on the following equation:
$\tilde{\rho}_{t,k} = \tilde{M}_{t-d_b,k}$
15. The apparatus of claim 1, wherein the upmixer is configured to calculate the reconstructed first channel and the reconstructed second channel in a spectral domain, wherein the apparatus further comprises a spectrum-time converter for converting the reconstructed first channel and the reconstructed second channel into a time domain, wherein the upmixer is configured to rotate the channel comprising the lower energy more than the channel comprising the higher energy only when the energy difference between the channels is greater than a predefined threshold.
16. The apparatus of claim 15, wherein the spectrum-time converter is configured to convert, for each one of the reconstructed first channel and the reconstructed second channel, subsequent frames into a time sequence of frames to weight each time frame using a synthesis window; and to overlap and add subsequent windowed time frames to acquire a time block of the reconstructed first channel and the time block of the reconstructed second channel.
17. A method of upmixing an encoded multi-channel signal, comprising: receiving the encoded multi-channel signal and acquiring a downmix signal from the encoded multi-channel signal and acquiring a side gain from the encoded multi-channel signal, the side gain indicating an energy relation between a first original channel and a second original channel; and upmixing the downmix signal, the upmixing comprising calculating a reconstructed first channel and a reconstructed second channel using a phase compensation, wherein the downmix signal is, in calculating the reconstructed first channel, only phase-rotated or is phase-rotated stronger than in calculating the reconstructed second channel depending on the side gain.
18. A non-transitory digital storage medium having stored thereon a computer program for performing, when said computer program is run by a computer, a method of upmixing an encoded multi-channel signal, comprising: receiving the encoded multi-channel signal and acquiring a downmix signal from the encoded multi-channel signal and acquiring a side gain from the encoded multi-channel signal, the side gain indicating an energy relation between a first original channel and a second original channel; upmixing the downmix signal, the upmixing comprising calculating a reconstructed first channel and a reconstructed second channel using a phase compensation, wherein the downmix signal is, in calculating the reconstructed first channel, only phase-rotated or is phase-rotated stronger than in calculating the reconstructed second channel depending on the side gain.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
(1) Embodiments of the present invention are subsequently discussed with respect to the attached drawings.
DETAILED DESCRIPTION OF THE INVENTION
(19)
(20) The multichannel signal 100 is input into a downmixer 120 for calculating a downmix signal 122 from the multichannel signal 100. For calculating the downmix signal, the downmixer can use the first channel 101, the second channel 102 and the third channel 103, or only the first and the second channel, or all channels of the multichannel signal, depending on the particular implementation.
(21) Furthermore, the apparatus for encoding comprises a parameter calculator 140 for calculating a side gain 141 from the first channel 101 and the second channel 102 of the at least two channels and, additionally, the parameter calculator 140 calculates a residual gain 142 from the first channel and the second channel. In other embodiments, an optional inter-channel phase difference (IPD) is also calculated as illustrated at 143. The downmix signal 122, the side gain 141 and the residual gain 142 are forwarded to an output interface 160 that generates an encoded multichannel signal 162 that comprises information on the downmix signal 122, on the side gain 141 and on the residual gain 142.
(22) It is to be noted that the side gain and the residual gain are typically calculated per frame so that, for each frame, a single side gain and a single residual gain are calculated. In other embodiments, however, not only a single side gain and a single residual gain are calculated for each frame, but a group of side gains and a group of residual gains are calculated for a frame, where each side gain and each residual gain relate to a certain subband of the first channel and the second channel. Thus, in embodiments, the parameter calculator calculates, for each frame of the first and the second channel, a group of side gains and a group of residual gains, where the number of side gains and of residual gains for a frame is typically equal to the number of subbands. When a high-resolution time-spectrum conversion such as a DFT is applied, the side gain and the residual gain for a certain subband are calculated from a group of frequency bins of the first channel and the second channel. However, when a low-resolution time-frequency transform is applied that results in subband signals, the parameter calculator 140 calculates a side gain and a residual gain for each subband or even for a group of subbands.
(23) When the side gain and the residual gain are calculated for a group of subband signals, the parameter resolution is reduced, resulting in a lower bitrate but also in a lower-quality parametric representation of the side signal. In other embodiments, the time resolution can also be modified so that a side gain and a residual gain are not calculated for each frame but for a group of frames, where the group of frames has two or more frames. Thus, in such an embodiment, it is of advantage to calculate subband-related side/residual gains, where the side/residual gains refer to a certain subband but also to a group of frames comprising two or more frames. Thus, in accordance with the present invention, the time and frequency resolution of the parameter calculation performed by block 140 can be modified with high flexibility.
(24) The parameter calculator 140 may be implemented as outlined in
(25) The outputs of blocks 23, 24, 25 are forwarded to a side gain calculator 26 and are also forwarded to a residual gain calculator 27. The side gain calculator 26 and the residual gain calculator 27 apply a certain relation among the first amplitude related characteristic, the second amplitude related characteristic and the inner product and the relation applied by the residual gain calculator for combining both inputs is different from the relation that is applied by the side gain calculator 26.
(26) In an embodiment, the first and the second amplitude related characteristics are energies in subbands. However, other amplitude related characteristics relate to the amplitudes in subbands themselves, to signal powers in subbands, or to any other powers of amplitudes with an exponent greater than 1, where the exponent can be a real number greater than 1 or an integer greater than 1, such as an exponent of 2, relating to a signal power and an energy, or an exponent of 3, which is associated with loudness, etc. Thus, each amplitude-related characteristic can be used for calculating the side gain and the residual gain.
(27) In an embodiment, the parameter calculator and, particularly, the side gain calculator 26 is configured to calculate the side gain as a side prediction gain that is applicable to a mid-signal of the first and the second channels to predict a side signal of the first and the second channels, or the parameter calculator and, particularly, the residual gain calculator 27 is configured to calculate the residual gain as a residual prediction gain indicating an amplitude related measure of a residual signal of a prediction of the side signal by the mid-signal using the side gain.
(28) In particular, the parameter calculator 140 and the side gain calculator 26 of
(29) In a further embodiment, the parameter calculator the residual gain calculator 27 of
(30) In particular, the side calculator 26 of
(31) The residual gain calculator 27 is configured for using, in the numerator, a weighted sum of the amplitude characteristics of the first and the second channels and an inner product, where the inner product is subtracted from the weighted sum of the amplitude characteristics of the first and the second channels. The denominator used for calculating the residual gain comprises a sum of the amplitude characteristics of the first and the second channel and the inner product, where the inner product may be multiplied by two but can be multiplied by other factors as well.
(32) Furthermore, as illustrated by the connection line 28, the residual gain calculator 27 is configured for calculating the residual gain using the side gain calculated by the side gain calculator.
(33) In another embodiment, the calculation of the residual gain and the side gain operates as follows. In particular, the bandwise inter-channel phase differences that will be described later on can be calculated or not. However, before particularly outlining the calculation of the side gain as illustrated later on in equation (9) and the specific advantageous calculation of the residual gain as illustrated later on in equation (10), a further description of the encoder is given that also refers to a calculation of IPDs and downmixing in addition to the calculation of the gain parameters.
(34) Encoding of stereo parameters and computation of the downmix signal is done in the frequency domain. To this end, time-frequency vectors $L_t$ and $R_t$ of the left and right channel are generated by simultaneously applying an analysis window followed by a discrete Fourier transform (DFT). The DFT bins are then grouped into subbands $(L_{t,k})_{k \in I_b}$ and $(R_{t,k})_{k \in I_b}$, respectively, where $I_b$ denotes the set of indices belonging to subband $b$.
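The analysis step described above can be sketched as follows. This is a minimal illustration only: the sine window, the naive DFT, the 16-sample frame and the band edges are assumptions chosen for readability and are not taken from the embodiments.

```python
import cmath
import math


def sine_window(n):
    """Sine-shaped analysis window of length n (the window shape is an assumption)."""
    return [math.sin(math.pi * (i + 0.5) / n) for i in range(n)]


def dft(frame):
    """Naive O(n^2) DFT, kept simple for illustration; a codec would use an FFT."""
    n = len(frame)
    return [
        sum(frame[m] * cmath.exp(-2j * math.pi * k * m / n) for m in range(n))
        for k in range(n)
    ]


def group_bins(spectrum, band_edges):
    """Band b collects the bins band_edges[b] .. band_edges[b + 1] - 1."""
    return [spectrum[lo:hi] for lo, hi in zip(band_edges, band_edges[1:])]


def analyze(frame, band_edges):
    """Window one time frame, transform it and group the bins into subbands."""
    window = sine_window(len(frame))
    windowed = [w * x for w, x in zip(window, frame)]
    return group_bins(dft(windowed), band_edges)


# Hypothetical 16-sample frame and band layout.
frame = [math.cos(2 * math.pi * 2 * m / 16) for m in range(16)]
bands = analyze(frame, [0, 1, 3, 5, 9])
```

Applying `analyze` to the left and to the right channel with the same band edges yields the per-band bin groups on which the stereo parameters are then computed.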
(35) Calculation of IPDs and Downmixing
(36) For the downmix, a bandwise inter-channel-phase-difference (IPD) is calculated as
$\mathrm{IPD}_{t,b} = \arg\left(\sum_{k \in I_b} L_{t,k} R_{t,k}^{*}\right)$ (1)
where z* denotes the complex conjugate of z. This is used to generate a bandwise mid and side signal
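(37) $M_{t,k} = \frac{e^{-i\beta} L_{t,k} + e^{i(\mathrm{IPD}_{t,b} - \beta)} R_{t,k}}{\sqrt{2}}$ (2)
and
$S_{t,k} = \frac{e^{-i\beta} L_{t,k} - e^{i(\mathrm{IPD}_{t,b} - \beta)} R_{t,k}}{\sqrt{2}}$ (3)
for $k \in I_b$. The absolute phase rotation parameter $\beta$ is given by
(38) $\beta = \operatorname{atan2}\left(\sin(\mathrm{IPD}_{t,b}),\ \cos(\mathrm{IPD}_{t,b}) + 2\,\frac{1 + g_{t,b}}{1 - g_{t,b}}\right)$ (4)
where $g_{t,b}$ denotes the side gain, which will be specified below. Here, atan2(y, x) is the two-argument arctangent function whose value is the angle between the point (x, y) and the positive x-axis. The intention is to carry out the IPD compensation mainly on the channel that has less energy. The factor 2 moves the singularity at $\mathrm{IPD}_{t,b} = \pm\pi$ and $g_{t,b} = 0$ to $\mathrm{IPD}_{t,b} = \pm\pi$ and $g_{t,b} = -1/3$. This way, toggling of $\beta$ is avoided in out-of-phase situations with approximately equal energy distribution in the left and right channels. The downmix signal is generated by applying the inverse DFT to $M_t$ followed by a synthesis window and overlap-add.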
(39) In other embodiments, arctangent functions other than the atan2 function can be used as well, such as a straightforward arctangent function, but the atan2 function is of advantage since it can be applied safely to the posed problem.
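The phase-compensated downmix of one band may be sketched as below. The sketch assumes that the rotation parameter has the form β = atan2(sin IPD, cos IPD + 2(1+g)/(1−g)), which is consistent with the singularity at g = −1/3 stated above, and that the mid and side signals are the normalized sum and difference of the rotated channels. The function names and the guard for g = 1 are illustrative choices, not part of the embodiments.

```python
import cmath
import math


def band_ipd(left, right):
    """IPD of one band: argument of the inner product sum(L * conj(R))."""
    return cmath.phase(sum(l * r.conjugate() for l, r in zip(left, right)))


def rotation_angle(ipd, side_gain, factor=2.0):
    """Absolute phase rotation beta (assumed form, see lead-in)."""
    if side_gain >= 1.0:
        # All energy is in the left channel: the left channel is not rotated.
        return 0.0
    weight = factor * (1.0 + side_gain) / (1.0 - side_gain)
    return math.atan2(math.sin(ipd), math.cos(ipd) + weight)


def downmix_band(left, right, side_gain):
    """Return (mid, side, ipd, beta) for the complex bins of one band."""
    ipd = band_ipd(left, right)
    beta = rotation_angle(ipd, side_gain)
    rot_left = cmath.exp(-1j * beta)
    rot_right = cmath.exp(1j * (ipd - beta))
    scale = 1.0 / math.sqrt(2.0)
    mid = [scale * (rot_left * l + rot_right * r) for l, r in zip(left, right)]
    side = [scale * (rot_left * l - rot_right * r) for l, r in zip(left, right)]
    return mid, side, ipd, beta


# Hypothetical two-bin band.
left = [1 + 0j, 0.5j]
right = [0.2 + 0.1j, -0.3j]
mid, side, ipd, beta = downmix_band(left, right, 0.5)
```

For a side gain close to 1 (left channel dominant) β approaches 0, so the right channel carries the full IPD rotation; for a side gain of −1 (right channel dominant) β equals the IPD, so only the left channel is rotated.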
(40) Calculation of Gain Parameters
(41) In addition to the band-wise IPDs, two further stereo parameters are extracted: the optimal gain for predicting $S_{t,b}$ by $M_{t,b}$, i.e., the number $g_{t,b}$ such that the energy of the remainder
$\rho_{t,k} = S_{t,k} - g_{t,b} M_{t,k}$ (5)
is minimal, and a gain factor $r_{t,b}$ which, if applied to the mid signal $M_t$, equalizes the energy of $\rho_t$ and $M_t$ in each band, i.e.
(42) $r_{t,b} = \sqrt{\frac{\sum_{k \in I_b} |\rho_{t,k}|^2}{\sum_{k \in I_b} |M_{t,k}|^2}}$ (6)
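(43) The optimal prediction gain can be calculated from the energies in the subbands
(44) $E_{L,t,b} = \sum_{k \in I_b} |L_{t,k}|^2$ and $E_{R,t,b} = \sum_{k \in I_b} |R_{t,k}|^2$ (7)
and the absolute value of the inner product of $L_t$ and $R_t$
(45) $X_{L/R,t,b} = \left|\sum_{k \in I_b} L_{t,k} R_{t,k}^{*}\right|$ (8)
as
$g_{t,b} = \frac{E_{L,t,b} - E_{R,t,b}}{E_{L,t,b} + E_{R,t,b} + 2 X_{L/R,t,b}}$ (9)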
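(46) From this it follows that $g_{t,b}$ lies in [−1, 1]. The residual gain can be calculated similarly from the energies and the inner product as
(47) $r_{t,b} = \sqrt{\frac{(1 - g_{t,b}) E_{L,t,b} + (1 + g_{t,b}) E_{R,t,b} - 2 X_{L/R,t,b}}{E_{L,t,b} + E_{R,t,b} + 2 X_{L/R,t,b}}}$ (10)
which implies
$0 \le r_{t,b} \le \sqrt{1 - g_{t,b}^2}$ (11)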
(48) In particular, this shows that $r_{t,b} \in [0, 1]$. This way, the stereo parameters can be calculated independently of the downmix by calculating the corresponding energies and the inner product. In particular, it is not necessary to compute the residual $\rho_{t,k}$ in order to compute its energy. It is noteworthy that calculation of the gains involves only one special function evaluation, whereas calculation of ILD and ICC from $E_{L,t,b}$, $E_{R,t,b}$ and $X_{L/R,t,b}$ involves two, namely a square root and a logarithm:
(49) $\mathrm{ILD}_{t,b} = 10 \log_{10}\left(\frac{E_{L,t,b}}{E_{R,t,b}}\right)$ (12)
and
$\mathrm{ICC}_{t,b} = \frac{X_{L/R,t,b}}{\sqrt{E_{L,t,b} E_{R,t,b}}}$ (13)
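The calculation of both gains from the band energies and the inner product alone, without forming the side signal or the residual, can be sketched as follows. The formulas used are the ones described in words above (energy difference over the sum of energies plus twice the inner product for the side gain; weighted energy sum minus twice the inner product over the same denominator for the residual gain); the function name and the handling of an all-zero band are assumptions.

```python
import math


def stereo_gains(left, right):
    """Side gain g and residual gain r of one band from energies and inner product."""
    e_left = sum(abs(x) ** 2 for x in left)
    e_right = sum(abs(x) ** 2 for x in right)
    x_lr = abs(sum(l * r.conjugate() for l, r in zip(left, right)))
    denom = e_left + e_right + 2.0 * x_lr
    if denom == 0.0:
        # Silent band: no meaningful gains (assumed convention).
        return 0.0, 0.0
    g = (e_left - e_right) / denom
    numer = (1.0 - g) * e_left + (1.0 + g) * e_right - 2.0 * x_lr
    # Clamp tiny negative values caused by rounding before the square root.
    r = math.sqrt(max(numer, 0.0) / denom)
    return g, r


# Identical channels: no side signal at all.
g_same, r_same = stereo_gains([1 + 0j, 2j], [1 + 0j, 2j])
# Orthogonal channels of equal energy: nothing of the side signal is predictable.
g_orth, r_orth = stereo_gains([1 + 0j, 0j], [0j, 1 + 0j])
```

The only special function evaluated is the single square root for the residual gain, in line with the remark above.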
Lowering Parameter Resolution
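(50) If a parameter resolution lower than the one given by the window length is desired, one may compute the gain parameters over h consecutive windows by replacing $X_{L/R,t,b}$ by
(51) $X_{L/R,t,b} = \left|\sum_{s=t}^{t+h-1} \sum_{k \in I_b} L_{s,k} R_{s,k}^{*}\right|$ (14)
and $E_{L,t,b}$ and $E_{R,t,b}$, respectively, by
(52) $E_{L,t,b} = \sum_{s=t}^{t+h-1} \sum_{k \in I_b} |L_{s,k}|^2$ and $E_{R,t,b} = \sum_{s=t}^{t+h-1} \sum_{k \in I_b} |R_{s,k}|^2$ (15)
in (9) and (10). The side gain is then a weighted average of the side gains for the individual windows, where the weights depend on the energy of $M_{t+i,k}$ or on the bandwise energies $E_{M,s,b}$, wherein s is the summation index in equations (14) and (15).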
(53) Similarly, the IPD values are then calculated over several windows as
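(54) $\mathrm{IPD}_{t,b} = \arg\left(\sum_{s=t}^{t+h-1} \sum_{k \in I_b} L_{s,k} R_{s,k}^{*}\right)$ (16)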
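Pooling the parameters over several windows may be sketched as follows. The sketch assumes that the inner product and the energies are accumulated over all windows before the absolute value, the argument and the gain formulas are applied; the function name and the data layout (one list of complex bins per window) are illustrative.

```python
import cmath
import math


def pooled_parameters(lefts, rights):
    """Return (g, r, ipd) of one band pooled over h windows.

    lefts and rights each hold h lists of complex bins, one list per window.
    """
    inner = sum(
        l * r.conjugate()
        for left, right in zip(lefts, rights)
        for l, r in zip(left, right)
    )
    e_left = sum(abs(l) ** 2 for left in lefts for l in left)
    e_right = sum(abs(r) ** 2 for right in rights for r in right)
    x_lr = abs(inner)
    denom = e_left + e_right + 2.0 * x_lr
    if denom == 0.0:
        return 0.0, 0.0, 0.0
    g = (e_left - e_right) / denom
    numer = (1.0 - g) * e_left + (1.0 + g) * e_right - 2.0 * x_lr
    r = math.sqrt(max(numer, 0.0) / denom)
    return g, r, cmath.phase(inner)


# Hypothetical band observed in one window and in two identical windows.
band_l = [1 + 0j, 0.5j]
band_r = [0.2 + 0.1j, -0.3j]
single = pooled_parameters([band_l], [band_r])
double = pooled_parameters([band_l, band_l], [band_r, band_r])
```

With h = 1 the function reduces to the per-window calculation, and repeating the same window leaves all three parameters unchanged because only ratios and the argument of the accumulated sums enter.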
(55) Advantageously, the parameter calculator 140 illustrated in
(56) Furthermore, the parameter calculator 140 is configured to calculate the first and the second amplitude related measures by squaring magnitudes of complex spectral values in a subband and by summing the squared magnitudes in the subband as, for example, also previously illustrated in equation (7), where index b stands for the subband.
(57) Furthermore, as also outlined in equation 8, the parameter calculator 140 and, in particular, the inner product calculator 25 of
(58) As also outlined in equations 1 to 4, it is advisable to use an absolute phase compensation. Thus, in this embodiment, the downmixer 120 is configured to calculate the downmix 122 using an absolute phase compensation so that only the channel having the lower energy among the two channels is rotated or the channel having the lower energy among the two channels is rotated stronger than the other channel that has a greater energy when calculating the downmix signal. Such a downmixer 120 is illustrated in
(59) In particular, an exponent or power of three corresponds, for example, to the loudness rather than to the energy.
(60) In particular, the IPD calculator 30 of
(61) Advantageously, block 36 is implemented as a side gain calculator so that the absolute phase rotation calculator operates based on the side gain.
(62) Thus, block 30 of
(63) In particular, the factor 2 in equation (4) before the term involving the side gain $g_{t,b}$ can be set to a value different from 2 and can be, for example, a value advantageously between 0.1 and 100. Naturally, values between −0.1 and −100 can also be used. This value makes sure that the singularity existing at an IPD of ±180° for almost equal left and right channels is moved to a different place, i.e., to a different side gain of, for example, −⅓ for the factor 2. Factors other than 2 move the singularity to a side gain parameter different from −⅓. It has been shown that all these different factors are useful, since they place the problematic singularity at a “place” in the sound stage whose associated left and right channel signals typically occur less frequently than signals that are out of phase and have equal or almost equal energy.
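The position of the singularity for a general positive factor can be worked out as follows. The derivation assumes that equation (4) has the form β = atan2(sin IPD, cos IPD + A(1+g)/(1−g)), where the symbol A is introduced here only as a placeholder for the factor. The atan2 function is undefined only where both of its arguments vanish:

```latex
\sin(\mathrm{IPD}_{t,b}) = 0
\quad\text{and}\quad
\cos(\mathrm{IPD}_{t,b}) + A\,\frac{1 + g_{t,b}}{1 - g_{t,b}} = 0
\;\Longrightarrow\;
\mathrm{IPD}_{t,b} = \pm\pi,
\quad
A\,(1 + g_{t,b}) = 1 - g_{t,b}
\;\Longrightarrow\;
g_{t,b} = \frac{1 - A}{1 + A}
```

Under this assumption, A = 1 gives g = 0 (equal energies) and A = 2 gives g = −1/3, which matches the values stated above.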
(64) In the embodiment, the output interface 160 of
(65) Particularly in the embodiment where the residual gain depends on the side gain, the side gain is quantized first and the residual gain is quantized afterwards, wherein, in this embodiment, the quantization step for the residual gain depends on the value of the side gain.
(66) In particular, this is illustrated in
(67)
(68) Thus, this dependency can be used by lowering the quantization step size for the quantization of the residual gain for higher side gains. Thus, when
(69) In a further embodiment, the quantizer is configured to perform a joint quantization using groups of quantization points, where each group of quantization points is defined by a fixed amplitude-related ratio between the first and the second channel. One example for an amplitude-related ratio is the energy between left and right, i.e., this means lines for the same ILD between the first and the second channel as illustrated in
(70) In particular, the code builder receives a sign of the side gain g and determines a sign bit 57a illustrated in
(71) Subsequently, further embodiments for the quantization are outlined.
(72) Quantization of Side and Residual Gain
(73) The inequalities in (11) reveal a strong dependence of the residual gain on the side gain, since the latter determines the range of the former. Quantizing the side gain g and the residual gain r independently by choosing quantization points in [−1,1] and [0,1] is therefore inefficient, since the number of possible quantization points for r would decrease as g tends towards ±1.
(74) Conditional Quantization
(75) There are different ways to handle this problem. The easiest way is to quantize g first and then to quantize r conditional on the quantized value $\tilde{g}$, so that the quantization points lie in the interval $[0, \sqrt{1 - \tilde{g}^2}]$. Quantization points can then, e.g., be chosen uniformly on these quantization lines, some of which are depicted in
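The conditional quantization described in this paragraph may be sketched as follows. The numbers of quantization levels (31 for g, 8 for r) and all function names are assumptions made for illustration; only the principle, i.e., uniform points for r on the interval bounded by the square root of one minus the squared quantized side gain, is taken from the text.

```python
import math


def quantize_uniform(value, lo, hi, levels):
    """Index of the nearest of `levels` equally spaced points on [lo, hi]."""
    if hi <= lo:
        return 0
    step = (hi - lo) / (levels - 1)
    index = round((value - lo) / step)
    return min(max(index, 0), levels - 1)


def dequantize_uniform(index, lo, hi, levels):
    """Point number `index` of `levels` equally spaced points on [lo, hi]."""
    if hi <= lo:
        return lo
    return lo + index * (hi - lo) / (levels - 1)


def residual_range(g_hat):
    """Upper bound of the residual gain for a given quantized side gain."""
    return math.sqrt(max(0.0, 1.0 - g_hat * g_hat))


def quantize_conditional(g, r, g_levels=31, r_levels=8):
    """Quantize g first, then r on the interval allowed by the quantized g."""
    gi = quantize_uniform(g, -1.0, 1.0, g_levels)
    g_hat = dequantize_uniform(gi, -1.0, 1.0, g_levels)
    ri = quantize_uniform(r, 0.0, residual_range(g_hat), r_levels)
    return gi, ri


def dequantize_conditional(gi, ri, g_levels=31, r_levels=8):
    """Inverse mapping: the range of r is rebuilt from the decoded g."""
    g_hat = dequantize_uniform(gi, -1.0, 1.0, g_levels)
    r_hat = dequantize_uniform(ri, 0.0, residual_range(g_hat), r_levels)
    return g_hat, r_hat


g_hat, r_hat = dequantize_conditional(*quantize_conditional(0.9, 0.3))
```

Because the decoder recomputes the range of r from the decoded side gain, no additional side information is needed, and the step size for r shrinks automatically as the magnitude of g grows.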
(76) Joint Quantization
(77) A more sophisticated way to choose quantization points is to look at lines in the (g, r)-plane which correspond to a fixed energy ratio between L and R. If c.sup.2≥1 denotes such an energy ratio, then the corresponding line is given by either (0, s) for 0≤s≤1 if c=1 or
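(78) $\left(s,\ \sqrt{2\,\frac{c^2 + 1}{c^2 - 1}\,s - s^2 - 1}\right)$ for $\frac{c - 1}{c + 1} \le s \le \frac{c^2 - 1}{c^2 + 1}$ if $c > 1$.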
(79) This also covers the case $c^2 < 1$, since swapping $L_t$ and $R_t$ only changes the sign of $g_{t,b}$ and leaves $r_{t,b}$ unchanged.
(80) This approach covers a larger region with the same number of quantization points as can be seen from
(81) A quantization scheme that has been found to work well is based on energy lines corresponding to ILD values
±{0,2,4,6,8,10,13,16,19,22,25,30,35,40,45,50}, (23)
on each of which 8 quantization points are selected. This gives rise to a code-book with 256 entries, which is organized as an 8×16 table of quantization points holding the values corresponding to non-negative values of g, plus a sign bit. This gives rise to an 8-bit integer representation of the quantization points (g, r) where, e.g., the first bit specifies the sign of g, the next four bits hold the column index in the 8×16 table and the last three bits hold the row index.
(82) Quantization of $(g_{t,b}, r_{t,b})$ could be done by an exhaustive code-book search, but it is more efficient to calculate the subband ILD first and restrict the search to the best-matching energy line. This way, only 8 points need to be considered.
(83) Dequantization is done by a simple table lookup.
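The 8-bit representation and the ILD-restricted search can be sketched as follows. Several details are assumptions: the sign bit is taken to be the most significant bit and is set for negative g, the ILD values are compared by magnitude, and the actual positions of the 8 points on each energy line are not reproduced here, so the table is supplied by the caller as `table[column][row] = (g, r)` with non-negative g.

```python
ILD_LINES = [0, 2, 4, 6, 8, 10, 13, 16, 19, 22, 25, 30, 35, 40, 45, 50]


def pack_index(negative_g, column, row):
    """1 sign bit, 4 bits column index (energy line), 3 bits row index."""
    return (int(negative_g) << 7) | (column << 3) | row


def unpack_index(code):
    """Inverse of pack_index."""
    return bool(code >> 7), (code >> 3) & 0xF, code & 0x7


def nearest_ild_column(ild):
    """Index of the energy line whose ILD magnitude is closest to |ild|."""
    magnitude = abs(ild)
    return min(range(len(ILD_LINES)), key=lambda i: abs(ILD_LINES[i] - magnitude))


def quantize_joint(g, r, ild, table):
    """Search only the 8 points on the best-matching energy line."""
    column = nearest_ild_column(ild)
    points = table[column]
    row = min(
        range(len(points)),
        key=lambda j: (points[j][0] - abs(g)) ** 2 + (points[j][1] - r) ** 2,
    )
    return pack_index(g < 0, column, row)


def dequantize_joint(code, table):
    """Table lookup followed by restoring the sign of g."""
    negative, column, row = unpack_index(code)
    g, r = table[column][row]
    return (-g if negative else g), r


# Placeholder table for demonstration only; not the code-book of the embodiment.
demo_table = [[(c / 16, j / 8) for j in range(8)] for c in range(16)]
code = quantize_joint(-0.4, 0.5, -12.0, demo_table)
```

Restricting the search to one column reduces the number of distance evaluations from 128 to 8 per band, while dequantization remains a single table access.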
(84) The 128 quantization points for this scheme covering the non-negative values of g are displayed in
(85) Although a procedure has been disclosed for calculating the side gain and the residual gain without an actual calculation of the side signal, i.e., the difference signal between the left and the right signals as illustrated in equation (9) and equation (10), a further embodiment operates to calculate the side gain and the residual gain differently, i.e., with an actual calculation of the side signal. This procedure is illustrated in
(86) In this embodiment, the parameter calculator 140 illustrated in
(87) The side signal as calculated by the side signal calculator 60 is forwarded to a residual signal calculator 61. The residual signal calculator 61 performs the procedure illustrated in equation (5), for example. The residual signal calculator 61 is configured to use different test side gains, i.e., different values for the side gain g.sub.d,b for one and the same band and frame and, consequently, different residual signals are obtained as illustrated by the multiple outputs of block 61.
(88) The side gain selector 62 in
(89) The selected specific test side gain is determined by the side gain selector 62 as the side gain parameter for a certain frame or for a certain band and a certain frame. The selected residual signal is forwarded to the residual gain calculator 63 and the residual gain calculator can, in an embodiment, simply calculate the amplitude related characteristic of the selected residual signal or can, advantageously, calculate the residual gain as a relation between the amplitude related characteristic of the residual signal with respect to the amplitude-related characteristic of the downmix signal or mid-signal. Even when a downmix is used that is different from a phase compensated downmix or is different from a downmix consisting of a sum of left and right, then the residual gain can, nevertheless, be related to a non-phase compensated addition of left and right, as the case may be.
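The procedure with test side gains may be sketched as follows. The sketch assumes that the selector chooses the test side gain whose residual signal has the smallest energy, in line with the minimum-energy criterion stated in connection with equation (5), and that the residual gain is expressed relative to the mid signal. The function name and the grid of test gains are illustrative.

```python
import math


def select_side_gain(mid, side, test_gains):
    """Return (g, r): best test side gain and the resulting residual gain."""
    best_gain = None
    best_energy = None
    for gain in test_gains:
        # Residual of predicting the side signal from the mid signal.
        residual = [s - gain * m for s, m in zip(side, mid)]
        energy = sum(abs(p) ** 2 for p in residual)
        if best_energy is None or energy < best_energy:
            best_gain, best_energy = gain, energy
    e_mid = sum(abs(m) ** 2 for m in mid)
    # Residual gain relative to the mid signal energy (assumed convention).
    r = math.sqrt(best_energy / e_mid) if e_mid > 0.0 else 0.0
    return best_gain, r


# Hypothetical band in which the side signal is exactly half the mid signal.
g, r = select_side_gain(
    mid=[1 + 0j, 1j],
    side=[0.5 + 0j, 0.5j],
    test_gains=[-1.0, -0.5, 0.0, 0.5, 1.0],
)
```

A finer grid of test gains approaches the closed-form optimum at the cost of one residual evaluation per candidate.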
(90) Thus,
(91) Furthermore, it is to be noted here that all the equations given are advantageous embodiments for the values determined by the corresponding equations. However, it has been found that values that deviate by up to ±20% from the values determined by the corresponding equations are also useful and already provide advantages over known technology, although the advantages become greater when the deviation becomes smaller. Thus, in other embodiments, it is of advantage to use values that differ from the values determined by the corresponding equations by only ±10% and, in a most advantageous embodiment, the values determined by the equations are the values used for the calculation of the several data items.
(92)
(93) In particular, the input interface 204 is configured for receiving the encoded multichannel signal 200 and for obtaining a downmix signal 207, a side gain g 206 and a residual gain r 205 from the encoded multichannel signal 200. The residual signal synthesizer 208 synthesizes a residual signal using the residual gain 205 and the upmixer 212 is configured for upmixing the downmix signal 207 using the side gain 206 and the residual signal 209 as determined by the residual signal synthesizer 208 to obtain a reconstructed first channel 213 and a reconstructed second channel 214. In the embodiment in which the residual signal synthesizer 208 and the upmixer 212 operate in the spectral domain or at least the upmixer 212 operates in the spectral domain, the reconstructed first and second channels 213, 214 are given in spectral domain representations and the spectral domain representation for each channel can be converted into the time domain by the spectrum-time converter 216 to finally output the time domain first and second reconstructed channels.
(94) In particular, the upmixer 212 is configured to perform a first weighting operation using a first weighter 70 illustrated in
(95) Advantageously, the combination rules performed by the first combiner 72 and the second combiner 73 are different from each other so that the output of block 72 on the one hand and block 73 on the other hand are substantially different to each other due to the different combining rules in block 72, 73 and due to the different weighting rules performed by block 70 and block 71.
(96) Advantageously, the first and the second combination rules are different from each other due to the fact that one combination rule is an adding operation and the other combination rule is a subtracting operation. However, other pairs of first and second combination rules can be used as well.
(97) Furthermore, the weighting rules used in block 70 and block 71 are different from each other, since one weighting rule uses a weighting with a weighting factor determined by a difference between a predetermined number and the side gain and the other weighting rule uses a weighting factor determined by a sum between a predetermined number and the side gain. The predetermined numbers can be equal to each other in both weighters or can be different from each other and the predetermined numbers are different from zero and can be integer or non-integer numbers and may be equal to 1.
(98)
(99) In an embodiment, the raw residual signal selector 80 is configured for selecting a downmix signal of a preceding frame, such as the immediately preceding frame or an even earlier frame. However, depending on the implementation, the raw residual signal selector 80 is configured for selecting the left or right signal, or first or second channel signal, as calculated for a preceding frame, or the raw residual signal selector 80 can also determine the residual signal based on, for example, a combination such as a sum or a difference of the left and right signal determined for either the immediately preceding frame or an even earlier preceding frame. In other embodiments, the decorrelated signal calculator 80 can also be configured to actually generate a decorrelated signal. However, it is of advantage that the raw residual signal selector 80 operates without a specific decorrelator such as a decorrelation filter, e.g., a reverberation filter, but, for low complexity reasons, only selects an already existing signal from the past such as the mid signal, the reconstructed left signal, the reconstructed right signal or a signal derived from the earlier reconstructed left and right signal by simple operations such as a weighted combination, i.e., a (weighted) addition or a (weighted) subtraction, that does not rely on a specific reverberation or decorrelation filter.
(100) Generally, the weighter 82 is configured to calculate the residual signal so that an energy of the residual signal is equal to a signal energy indicated by the residual gain r, where this energy can be indicated in absolute terms, but may be indicated in relative terms with respect to the mid signal 207 of the current frame.
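One way to read the relative case is sketched below. It assumes that the residual gain is an amplitude ratio with respect to the mid signal of the current frame, so that the weighted residual should carry r squared times the mid energy within a band. The function name `weight_residual`, the per-band energy sums, and the small constant `eps` guarding against division by zero are assumptions made for illustration only.

```python
import numpy as np


def weight_residual(raw_residual, mid, residual_gain, eps=1e-12):
    """Scale the raw residual so its energy is residual_gain**2 times the mid energy."""
    raw_residual = np.asarray(raw_residual, dtype=complex)
    mid = np.asarray(mid, dtype=complex)

    e_mid = np.sum(np.abs(mid) ** 2)           # energy of the current mid signal
    e_raw = np.sum(np.abs(raw_residual) ** 2)  # energy of the selected raw residual

    # Assumed energy adjusting factor: brings the raw residual to the mid energy.
    g_norm = np.sqrt(e_mid / (e_raw + eps))
    return residual_gain * g_norm * raw_residual
```

For an all-zero raw residual the output is simply zero, since the large normalization factor is multiplied by zeros.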
(101) In the embodiments for the encoder side and the decoder side, the values of the side gain and, if appropriate, of the residual gain are different from zero.
(102) Subsequently, additional embodiments for the decoder are given in equation form.
(103) The upmix is again done in frequency domain. To this end, the time-frequency transform from the encoder is applied to the decoded downmix yielding time-frequency vectors $\tilde{M}_{t,b}$. Using the dequantized values $I\tilde{P}D_{t,b}$, $\tilde{g}_{t,b}$, and $\tilde{r}_{t,b}$, left and right channel are calculated as
(104)
for $k \in I_b$, where $\tilde{\rho}_{t,k}$ is a substitute for the missing residual $\rho_{t,k}$ from the encoder, and $g_{\mathrm{norm}}$ is the energy adjusting factor
(105)
that turns the relative gain coefficient $\tilde{r}_{t,b}$ into an absolute one. One may for instance take
$\tilde{\rho}_{t,k} = \tilde{M}_{t-d_b,k}$
where $d_b > 0$ denotes a band-wise frame-delay. The phase rotation factor $\tilde{\beta}$ is calculated again as
(106)
(107) The left channel and the right channel are then generated by applying the inverse DFT to $\tilde{L}_t$ and $\tilde{R}_t$, followed by a synthesis window and overlap-add.
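The synthesis step for one channel can be sketched as follows. The sketch assumes real-valued time signals represented by one-sided spectra (as returned by NumPy's `rfft`), a fixed hop size, and a synthesis window of the same length as the frame. The function name `overlap_add_synthesis` and these conventions are illustrative assumptions, not details taken from the embodiments.

```python
import numpy as np


def overlap_add_synthesis(spectra, window, hop):
    """Inverse DFT of each frame, synthesis windowing, then overlap-add."""
    window = np.asarray(window, dtype=float)
    frame_len = len(window)
    n_frames = len(spectra)
    if n_frames == 0:
        return np.zeros(0)

    out = np.zeros(hop * (n_frames - 1) + frame_len)
    for t, spec in enumerate(spectra):
        frame = np.fft.irfft(spec, n=frame_len)  # back to the time domain
        start = t * hop
        out[start:start + frame_len] += window * frame  # window and accumulate
    return out
```

If the same sine window is used for analysis and synthesis with 50% overlap, the squared windows of neighboring frames sum to one, so the interior of the signal is reconstructed exactly when the spectra are left unmodified.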
(108)
(109)
(110) In the embodiment in which the side gain and the residual gain are calculated in the spectral domain, the left and right channels, or first and second channels, are separated into advantageously overlapping frames F(1), F(2), F(3), F(4) and so on. In the embodiment illustrated in
(111)
(112) Then, the sequences of windowed frames are input into a transform block 1302. Advantageously, the transform block 1302 performs a transform algorithm resulting in complex spectral values, such as a DFT and, specifically, an FFT. In other embodiments, however, a purely real transform algorithm such as a DCT or an MDCT (modified discrete cosine transform) can be used as well, and, subsequently, the imaginary parts can be estimated from the purely real parts as is known in the art and as is, for example, implemented in the USAC (unified speech and audio coding) standard. Other transform algorithms can be sub-band filter banks, such as QMF filter banks, that result in complex-valued subband signals. Typically, subband filter banks have a lower frequency resolution than FFT algorithms, and an FFT or DFT spectrum having a certain number of DFT bins can be transformed into a sub-band-wise representation by collecting certain bins. This is illustrated in
(113) Particularly,
(114) Thus,
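The collection of DFT bins into a sub-band-wise representation mentioned above can be sketched as a per-band energy sum. The function name `band_energies` and the band edges used in the usage note are hypothetical. The actual band layout of the embodiments is not reproduced in this text, so the edges are an assumption chosen only to show bands that widen with frequency.

```python
import numpy as np


def band_energies(spectrum, band_edges):
    """Sum the bin energies of a DFT spectrum inside each band.

    band_edges[b] and band_edges[b + 1] delimit band b (upper edge exclusive).
    """
    power = np.abs(np.asarray(spectrum, dtype=complex)) ** 2
    return np.array([
        power[lo:hi].sum()
        for lo, hi in zip(band_edges[:-1], band_edges[1:])
    ])
```

For example, `band_energies(np.ones(8), [0, 1, 3, 6, 8])` groups eight bins into four bands of one, two, three and two bins, respectively.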
(115)
(116) The backward transformer 1310 is configured to perform an algorithm resulting in a backward transform and, particularly, an algorithm that may be inverse to the algorithm applied in block 1302 of
(117) Subsequently, different specific aspects of the present invention are given in short.
1. Stereo M/S with IPD compensation and absolute phase compensation according to equation (4).
2. Stereo M/S with IPD compensation and prediction of S by M according to (10).
3. Stereo M/S with IPD compensation, prediction of S by M according to (9) and residual prediction according to gain factor (10).
4. Efficient quantization of side and residual gain factors through joint quantization.
5. Joint quantization of side and residual gain factors on lines corresponding to a fixed energy ratio of $L_t$ and $R_t$ in the (g, r)-plane.
(118) It is to be noted that, advantageously, all of the above referenced five different aspects are implemented in one and the same encoder/decoder framework. However, it is additionally to be noted that the individual aspects given before can also be implemented separately from each other. Thus, the first aspect with the IPD compensation and absolute phase compensation can be performed in any downmixer irrespective of any side gain/residual gain calculation. Furthermore, for example, the aspect of the side gain calculation and the residual gain calculation can be performed with any downmix, i.e., also with a downmix that is not calculated by a certain phase compensation.
(119) Furthermore, even the calculation of the side gain on the one hand and the residual gain on the other hand can be performed independently of each other. The calculation of the side gain alone, or together with any other parameter different from the residual gain, is also advantageous over the art, particularly with respect to an ICC or ILD calculation. Likewise, the calculation of the residual gain alone, or together with any other parameter different from the side gain, is already useful.
(120) Furthermore, the efficient joint or conditional quantization of the side and the residual gains or gain factors is useful with any particular downmix. Thus, the efficient quantization can also be used without any downmix at all. This efficient quantization can also be applied to any other pair of parameters where the second parameter depends, with respect to its value range, on the first parameter, so that a very efficient and low-complexity quantization can be performed for such dependent parameters, which can, of course, be parameters different from the side gain and the residual gain.
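The general idea of conditionally quantizing a second parameter whose value range depends on a first parameter can be sketched as follows. This is not the quantizer of the embodiments: the uniform grids, the function names, and in particular the bound `example_bound` are hypothetical and serve only to show how the grid of the second parameter can be rebuilt from the index of the first one.

```python
import numpy as np


def second_grid(first_q, second_max, second_steps):
    """Uniform grid for the second parameter, scaled to the range first_q allows."""
    return np.linspace(0.0, second_max(first_q), second_steps)


def quantize_dependent(first, second, first_levels, second_max, second_steps):
    """Return the index pair (i, j) for a dependent parameter pair."""
    first_levels = np.asarray(first_levels, dtype=float)
    i = int(np.argmin(np.abs(first_levels - first)))   # nearest first level
    grid = second_grid(first_levels[i], second_max, second_steps)
    j = int(np.argmin(np.abs(grid - second)))          # nearest point on scaled grid
    return i, j


def dequantize_dependent(i, j, first_levels, second_max, second_steps):
    """Rebuild both values from the indices; the grid follows from i alone."""
    first_q = float(first_levels[i])
    return first_q, float(second_grid(first_q, second_max, second_steps)[j])


def example_bound(g):
    # Hypothetical range limit, chosen for illustration only (not from the text).
    return float(np.sqrt(max(0.0, 1.0 - g * g)))
```

Because the grid of the second parameter shrinks with its allowed range, no index is spent on value pairs that cannot occur, which is the source of the efficiency for dependent parameters.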
(121) Thus, all of the above mentioned five aspects can be performed and implemented independent from each other or together in a certain encoder/decoder implementation, and, also, only a subgroup of the aspects can be implemented together, i.e., three aspects are implemented together without the other two aspects or only two out of the five aspects are implemented together without the other three aspects as the case may be.
(122) Although some aspects have been described in the context of an apparatus, it is clear that these aspects also represent a description of the corresponding method, where a block or device corresponds to a method step or a feature of a method step. Analogously, aspects described in the context of a method step also represent a description of a corresponding block or item or feature of a corresponding apparatus.
(123) Depending on certain implementation requirements, embodiments of the invention can be implemented in hardware or in software. The implementation can be performed using a digital storage medium, for example a floppy disk, a DVD, a CD, a ROM, a PROM, an EPROM, an EEPROM or a FLASH memory, having electronically readable control signals stored thereon, which cooperate (or are capable of cooperating) with a programmable computer system such that the respective method is performed.
(124) Some embodiments according to the invention comprise a data carrier having electronically readable control signals, which are capable of cooperating with a programmable computer system, such that one of the methods described herein is performed.
(125) Generally, embodiments of the present invention can be implemented as a computer program product with a program code, the program code being operative for performing one of the methods when the computer program product runs on a computer. The program code may for example be stored on a machine readable carrier.
(126) Other embodiments comprise the computer program for performing one of the methods described herein, stored on a machine readable carrier or a non-transitory storage medium.
(127) In other words, an embodiment of the inventive method is, therefore, a computer program having a program code for performing one of the methods described herein, when the computer program runs on a computer.
(128) A further embodiment of the inventive methods is, therefore, a data carrier (or a digital storage medium, or a computer-readable medium) comprising, recorded thereon, the computer program for performing one of the methods described herein.
(129) A further embodiment of the inventive method is, therefore, a data stream or a sequence of signals representing the computer program for performing one of the methods described herein. The data stream or the sequence of signals may for example be configured to be transferred via a data communication connection, for example via the Internet.
(130) A further embodiment comprises a processing means, for example a computer, or a programmable logic device, configured to or adapted to perform one of the methods described herein.
(131) A further embodiment comprises a computer having installed thereon the computer program for performing one of the methods described herein.
(132) In some embodiments, a programmable logic device (for example a field programmable gate array) may be used to perform some or all of the functionalities of the methods described herein. In some embodiments, a field programmable gate array may cooperate with a microprocessor in order to perform one of the methods described herein. Generally, the methods may be performed by any hardware apparatus.
(133) While this invention has been described in terms of several embodiments, there are alterations, permutations, and equivalents which will be apparent to others skilled in the art and which fall within the scope of this invention. It should also be noted that there are many alternative ways of implementing the methods and compositions of the present invention. It is therefore intended that the following appended claims be interpreted as including all such alterations, permutations, and equivalents as fall within the true spirit and scope of the present invention.
REFERENCES
(134) MPEG-4 High Efficiency Advanced Audio Coding (HE-AAC) v2.
From Joint Stereo to Spatial Audio Coding: Recent Progress and Standardization, Proc. of the 7th Int. Conference on Digital Audio Effects (DAFX-04), Naples, Italy, Oct. 5-8, 2004.