Apparatus and method for generating an output signal employing a decomposer
09729991 · 2017-08-08
Assignee
Inventors
- Andreas Walther (Crissier, CH)
- Andreas SILZLE (Buckenhof, DE)
- Oliver Hellmuth (Erlangen, DE)
- Bernhard Grill (Lauf, DE)
- Harald Popp (Tuchenbach, DE)
Cpc classification
H04S3/006
ELECTRICITY
International classification
Abstract
An apparatus for generating an output signal having at least two output channels from an input signal having at least two input channels has an ambient/direct decomposer, an ambient modification unit and a combination unit. The ambient/direct decomposer is adapted to decompose at least two input channels of the input signal such that each one of the at least two input channels is decomposed into a signal of an ambient signal group and into a signal of a direct signal group. The ambient modification unit is adapted to modify a signal of the ambient signal group or a signal derived from a signal of the ambient signal group to obtain a modified signal as a first output channel. The combination unit is adapted to combine a signal of the ambient signal group or a signal derived from a signal of the ambient signal group and a signal of the direct signal group or a signal derived from a signal of the direct signal group as a second output channel.
Claims
1. An apparatus for generating an output signal comprising at least two output channels from an input signal comprising at least two input channels, comprising: an ambient/direct decomposer being adapted to decompose at least two input channels of the input signal such that each one of the at least two input channels is decomposed into an ambient signal of an ambient signal group and into a direct signal of a direct signal group; an ambient modification unit being adapted to modify an ambient signal of the ambient signal group or a signal derived from a signal of the ambient signal group to acquire a modified ambient signal as a first output channel for a first loudspeaker of a plurality of loudspeakers; and a combination unit being adapted to combine an ambient signal of the ambient signal group or a signal derived from an ambient signal of the ambient signal group and a direct signal of the direct signal group or a signal derived from a direct signal of the direct signal group as a second output channel for a second loudspeaker of the plurality of loudspeakers, wherein the apparatus is adapted via a function to output only a first amount of ambient signal portions of one of the at least two input channels to one of the plurality of loudspeakers, and wherein the apparatus is adapted via a function to output a remaining amount of the ambient signal portions of said one of the at least two input channels plus the direct signal portions of said one of the at least two input channels to another one of the plurality of loudspeakers.
2. The apparatus according to claim 1, wherein the ambient modification unit is adapted to modify a first derived signal, wherein the first derived signal is derived by filtering, gain modifying or decorrelating an ambient signal of the ambient signal group, wherein the combination unit is adapted to modify a second derived signal, wherein the second derived signal is derived by filtering, gain modifying or decorrelating an ambient signal of the ambient signal group, and wherein the combination unit is adapted to modify a third derived signal, wherein the third derived signal is derived by filtering, gain modifying or decorrelating the direct signal of the direct signal group.
3. The apparatus according to claim 1, wherein the ambient modification unit is adapted to combine a first ambient signal of the ambient signal group and a second ambient signal of the ambient signal group to acquire a modified ambient signal.
4. The apparatus according to claim 1, wherein the apparatus further comprises a first ambient gain modifier being adapted to gain modify an ambient signal of the ambient signal group or a signal derived from an ambient signal of the ambient signal group to acquire a first gain modified ambient signal; and wherein the combination unit is adapted to combine the first gain modified ambient signal and a direct signal of the direct signal group or a signal derived from a direct signal of the direct signal group as the second output channel.
5. The apparatus according to claim 4, wherein the gain modifier is adapted to gain modify an ambient signal of the ambient signal group such that at a first point in time, the ambient signal is gain modified with a first gain modification factor while at a different second point in time, the ambient signal is gain modified with a different second gain modification factor.
6. The apparatus according to claim 1, wherein the ambient modification unit comprises a decorrelator to decorrelate a first ambient signal of the ambient signal group or a signal derived from an ambient signal of the ambient signal group to acquire the modified signal as the first output channel.
7. The apparatus according to claim 1, wherein the modification unit comprises a second ambient gain modifier being adapted to gain modify an ambient signal of the ambient signal group or a signal derived from an ambient signal of the ambient signal group to acquire the modified signal as the first output channel.
8. The apparatus according to claim 1, wherein the ambient modification unit comprises a filter unit to filter an ambient signal of the ambient signal group or a signal derived from an ambient signal of the ambient signal group to acquire the modified signal as the first output channel.
9. The apparatus according to claim 8, wherein the filter unit is adapted to employ a low pass filter.
10. The apparatus according to claim 1, wherein the combination unit is adapted to form a linear combination of an ambient signal of the ambient signal group or a signal derived from an ambient signal of the ambient signal group and a direct signal of the direct signal group or a signal derived from a direct signal of the direct signal group to generate the combination signal.
11. The apparatus according to claim 1, wherein the ambient/direct decomposer is adapted to decompose at least three input channels of the input signal, wherein the ambient/direct decomposer comprises a downmixer, an analyzer and a signal processor, wherein the downmixer is adapted to downmix the input signal to acquire a downmix signal, wherein the downmixer is configured for downmixing so that a number of downmix channels of the downmixed signal is at least 2 and smaller than the number of input channels; wherein the analyzer is adapted to analyze the downmixed signal to derive an analysis result; and wherein the signal processor is adapted to process the input signal or a signal derived from the input signal, or a signal, from which the input signal is derived, using the analysis result, wherein the signal processor is configured for applying the analysis result to the input channels of the input signal or channels of the signal derived from the input signal to acquire the decomposed signal.
12. The apparatus according to claim 11, further comprising a time/frequency converter for converting the input channels into a time sequence of channel frequency representations, each input channel frequency representation comprising a plurality of subbands, or in which the downmixer comprises a time/frequency converter for converting the downmixed signal, wherein the analyzer is configured for generating an analysis result for individual subbands, and wherein the signal processor is configured for applying the individual analysis results to corresponding subbands of the input signal or the signal derived from the input signal.
13. The apparatus according to claim 11, wherein the analyzer is configured to produce, as the analysis result, weighting factors, and wherein the signal processor is configured for applying the weighting factors to the input signal or the signal derived from the input signal by weighting with the weighting factors.
14. The apparatus according to claim 11, wherein the analyzer is configured for using a pre-stored frequency-dependent reference curve indicating a similarity between two signals generatable by previously known reference signals.
15. A method for generating an output signal comprising at least two output channels from an input signal comprising at least two input channels, comprising: decomposing at least two input channels of the input signal such that each one of the at least two input channels is decomposed into an ambient signal of an ambient group and into a direct signal of a direct signal group; modifying an ambient signal of the ambient signal group or a signal derived from an ambient signal of the ambient signal group to acquire a modified signal as a first output channel; combining an ambient signal of the ambient signal group or a signal derived from an ambient signal of the ambient signal group and a direct signal of the direct signal group or a signal derived from a direct signal of the direct signal group as a second output channel, adapting via a function so that only a first amount of ambient signal portions of one of the at least two input channels is outputted to one of a plurality of loudspeakers and wherein a remaining amount of the ambient signal portions of said one of the at least two input channels plus the direct signal portions of said one of the at least two input channels is outputted to another one of the plurality of loudspeakers.
16. An apparatus for generating an output signal comprising at least four output channels from an input signal comprising at least two input channels, comprising: an ambience extractor being adapted to extract at least two ambient signals with ambient signal portions from the at least two input channels, an ambient modification unit being adapted to modify the at least two ambient signals to acquire at least a first modified ambient signal and a second modified ambient signal, at least four speakers, wherein two speakers of the at least four speakers are placed in first heights in a listening environment with respect to a listener, wherein two further speakers of the at least four speakers are placed in second heights in a listening environment with respect to a listener, the second heights being different from the first heights, wherein the ambient modification unit is adapted to feed only the first modified ambient signal as a third output channel into a first speaker of the two further speakers, and wherein the ambient modification unit is adapted to feed the second modified ambient signal as a fourth output channel into a second speaker of the two further speakers, and wherein the apparatus for generating an output signal is adapted to feed the first input channel with direct and ambient signal portions as a first output channel into a first horizontally arranged speaker, and wherein the ambience extractor is adapted to feed the second input channel with direct and ambient signal portions as a second output channel into a second horizontally arranged speaker, wherein the apparatus is adapted via a function to output only a first amount of ambient signal portions of one of the at least two input channels to one of the plurality of loudspeakers, and wherein the apparatus is adapted to output a remaining amount of the ambient signal portions of said one of the at least two input channels plus the direct signal portions of said one of the at least two input channels to another one of the plurality of output channels.
17. The apparatus according to claim 16, wherein the ambient modification unit is configured to feed no direct signal portions into the two further speakers or, in addition to the ambient signal portions, to feed only direct signal portions into the two further speakers which are attenuated with respect to the direct signal component fed into the two speakers.
18. A method for generating an output signal comprising at least four output channels for at least four speakers from an input signal comprising at least two input channels, wherein two speakers of the at least four speakers are placed in first heights in a listening environment with respect to a listener, wherein two further speakers of the at least four speakers are placed in second heights in a listening environment with respect to a listener, the second heights being higher than the two first heights, comprising: extracting at least two ambient signals with ambient signal portions from the at least two input channels, modifying the at least two ambient signals to acquire at least a first modified ambient signal and a second modified ambient signal for at least four speakers, feeding only the first modified ambient signal as a third output channel into a first speaker of the two further speakers, feeding the second modified ambient signal as a fourth output channel into a second speaker of the two further speakers, feeding the first input channel with direct and ambient signal portions as a first output channel into a first horizontally arranged speaker, and feeding the second input channel with direct and ambient signal portions as a second output channel into a second horizontally arranged speaker, wherein the method comprises adapting, via a function, to output only a first amount of ambient signal portions of one of the at least two input channels to one of the plurality of output channels, and wherein the method comprises outputting a remaining amount of the ambient signal portions of said one of the at least two input channels plus the direct signal portions of said one of the at least two input channels to another one of the plurality of output channels.
19. A non-transitory computer-readable medium comprising a computer program for performing the method of claim 15, when the computer program is executed by a computer or processor.
20. A non-transitory computer-readable medium comprising a computer program for performing the method of claim 18, when the computer program is executed by a computer or processor.
21. An apparatus for generating an output signal having at least two output channels from an input signal having at least one input channel, wherein the apparatus comprises: an ambience extractor being adapted to extract at least one ambient signal with ambient signal portions from the at least one input channel, an ambient modification unit being adapted to modify the at least one ambient signal to obtain at least a first modified ambient signal, and at least two speakers, wherein a first speaker of the at least two speakers is placed in first heights in a listening environment with respect to a listener, wherein a second speaker of the at least two speakers is placed in second heights in a listening environment with respect to the listener, the second heights being different from the first heights, wherein the apparatus for generating an output signal is adapted to feed only the first modified ambient signal into the second speaker, and wherein the apparatus for generating an output signal is adapted to feed the first input channel with direct and ambient signal portions into the first speaker being a first horizontally arranged speaker, wherein the apparatus is adapted via a function to output only a first amount of ambient signal portions of one of the at least two input channels to one of the plurality of loudspeakers, and wherein the apparatus is adapted to output a remaining amount of the ambient signal portions of said one of the at least two input channels plus the direct signal portions of said one of the at least two input channels to another one of the plurality of speakers.
22. An apparatus according to claim 21, wherein the output signal has at least four output channels being the at least two output channels, wherein the input signal has at least two input channels as the at least one audio input channel, wherein the apparatus is configured to generate the output signal having the at least four output channels from the input signal having the at least two input channels, wherein the at least one ambient signal are at least two ambient signals, wherein the ambience extractor is adapted to extract the at least two ambient signals with ambient signal portions from the at least two input channels, wherein the ambient modification unit is adapted to modify the at least two ambient signals to obtain at least the first modified ambient signal and a second modified ambient signal, wherein the at least two speakers are at least four speakers and wherein the apparatus comprises the at least four speakers, wherein the first speaker is one of two first speakers of the at least four speakers, and wherein the second speaker is one of two second speakers of the at least four speakers, wherein the two first speakers are placed in the first heights in the listening environment with respect to the listener, wherein the two second speakers are placed in the second heights in the listening environment with respect to the listener, wherein the apparatus for generating an output signal is adapted to feed the first modified ambient signal as a third output channel into one of the two second speakers, and wherein the apparatus for generating an output signal is adapted to feed the second modified ambient signal as a fourth output channel into another one of the two second speakers, and wherein the apparatus for generating an output signal is adapted to feed one of the at least two input channels with direct and ambient signal portions as a first output channel into one of the two first speakers, being a first horizontally arranged speaker, and wherein the apparatus for generating an output signal is adapted to feed another one of the at least two input channels with direct and ambient signal portions as a second output channel into another one of the two first speakers, being a second horizontally arranged speaker.
23. An apparatus according to claim 22, wherein the apparatus for generating an output signal is configured to feed no direct signal portions into the two second speakers, or to feed direct signal portions into the two second speakers which are attenuated with respect to the direct signal component fed into the two first speakers.
24. A method for generating an output signal having at least two output channels from an input signal having at least one input channel, wherein a first speaker of at least two speakers is placed in first heights in a listening environment with respect to a listener, wherein a second speaker of the at least two speakers is placed in second heights in a listening environment with respect to the listener, the second heights being different from the first heights, wherein the method comprises: extracting at least one ambient signal with ambient signal portions from the at least one input channel, modifying the at least one ambient signal to obtain at least a first modified ambient signal, and feeding only the first modified ambient signal into the second speaker, and feeding the first input channel with direct and ambient signal portions into the first speaker being a first horizontally arranged speaker, wherein the method comprises adapting, via a function, to output only a first amount of ambient signal portions of one of at least two input channels to one of the plurality of output speakers, and wherein the method comprises outputting a remaining amount of the ambient signal portions of said one of the at least two input channels plus the direct signal portions of said one of the at least two input channels to another one of the plurality of speakers.
25. A method according to claim 24, wherein the output signal has at least four output channels being the at least two output channels, wherein the input signal has at least two input channels as the at least one audio input channel, wherein the method comprises the step of generating the output signal having the at least four output channels from the input signal having the at least two input channels, wherein the at least one ambient signal are at least two ambient signals, wherein the method comprises the step of extracting the at least two ambient signals with ambient signal portions from the at least two input channels, wherein the method comprises the step of modifying the at least two ambient signals to obtain at least the first modified ambient signal and a second modified ambient signal, wherein the at least two speakers are at least four speakers, wherein the first speaker is one of two first speakers of the at least four speakers, and wherein the second speaker is one of two second speakers of the at least four speakers, wherein the two first speakers are placed in the first heights in the listening environment with respect to the listener, wherein the two second speakers are placed in the second heights in the listening environment with respect to the listener, wherein the method comprises the step of feeding the first modified ambient signal as a third output channel into one of the two second speakers, and wherein the method comprises the step of feeding the second modified ambient signal as a fourth output channel into another one of the two second speakers, and wherein one of the at least two input channels with direct and ambient signal portions is fed as a first output channel into one of the two first speakers, being a first horizontally arranged speaker, and wherein another one of the at least two input channels with direct and ambient signal portions is fed as a second output channel into another one of the two first speakers, being a second horizontally arranged speaker.
26. A method according to claim 25, wherein no direct signal portions are fed into the two second speakers, or wherein direct signal portions are fed into the two second speakers which are attenuated with respect to the direct signal component fed into the two first speakers.
27. A non-transitory computer-readable medium comprising a computer program for performing the method of one of claims 24 to 26, when the computer program is executed by a computer or processor.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
(1) Embodiments of the present invention are subsequently discussed with respect to the accompanying figures.
DETAILED DESCRIPTION OF THE INVENTION
(18)
(19) Moreover, the apparatus of the embodiment illustrated in
(20) Furthermore, the apparatus of the embodiment illustrated in
(21) In the embodiment illustrated by
(22) In embodiments, the modification unit 120 and the combination unit 130 may be adapted to communicate with each other as illustrated by dotted line 135. Depending on this communication, the modification unit 120 may modify its received ambient signals, e.g. ambient signal 152, depending on the combinations conducted by the combination unit 130, and/or the combination unit 130 may combine its received signals, e.g. signal 152 and signal 162, depending on the modifications conducted by the modification unit 120.
(23) The embodiment of
(24) By this, in an embodiment, a certain amount of the ambient signal portions of a channel may, for example, be outputted to a certain loudspeaker, while another loudspeaker receives the remaining amount of the ambient signal portions of the channel plus the direct signal portion. For example, the ambient modification unit may gain modify the ambient signal 152 by multiplying its amplitudes by 0.7 to generate a first output channel. Moreover, the combination unit may combine the direct signal 162 and the ambient signal portion to generate a second output channel, wherein the ambient signal portions are multiplied by a factor of 0.3. The modified ambient signal 172 and the combination signal 182 then result in:
signal 172 = 0.7 · ambient signal portion of signal 142
signal 182 = 0.3 · ambient signal portion of signal 142 + direct signal portion of signal 142
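The arithmetic of this example can be sketched in a few lines of Python. This is only a minimal illustration, assuming that the ambient and direct portions of signal 142 are already available as lists of samples; the function name `split_ambient`, the parameter `share` and the sample values are hypothetical and are not taken from the described embodiment.

```python
def split_ambient(ambient, direct, share=0.7):
    """Split one decomposed channel into two output channels.

    ambient, direct: equally long lists of samples, assumed to be the
    ambient and direct portions of one input channel (e.g. signal 142).
    share: fraction of the ambient portion routed to the first output.
    """
    if len(ambient) != len(direct):
        raise ValueError("ambient and direct must have the same length")
    # First output channel: gain modified ambient signal (cf. signal 172).
    modified_ambient = [share * a for a in ambient]
    # Second output channel: remaining ambient share plus the direct
    # portion (cf. combination signal 182).
    combination = [(1.0 - share) * a + d for a, d in zip(ambient, direct)]
    return modified_ambient, combination


ambient = [0.2, -0.4, 0.1]   # hypothetical sample values
direct = [1.0, 0.5, -0.5]    # hypothetical sample values
sig_172, sig_182 = split_ambient(ambient, direct, share=0.7)
```

With a share of 0.7, the two outputs together carry the complete ambient portion: sample by sample, `sig_172` plus `sig_182` equals the ambient portion plus the direct portion, up to floating-point rounding.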
Therefore,
(25)
(26)
(27) In the embodiment of
(28) Furthermore, the ambient modification unit 320 modifies the second ambient signal 354 of the ambient signal group to obtain a second modified ambient signal 374. In further embodiments, the ambient modification unit 320 may combine the first ambient signal 352 and the second ambient signal 354 to obtain one or more modified ambient signals.
(29) Moreover, in the embodiment of
(30) In the embodiment of
(31)
(32) Gain modification in the gain modifier 490 may be conducted by multiplying the amplitudes of the ambient signal 452 by a factor <1 to reduce the weight of the ambient signal 452 in the combination signal 482. This makes it possible to add a certain amount of the ambient signal portions of an input signal to the combination signal 482, while the remaining ambient portions of the input signal may be outputted as a modified ambient signal 472.
(33) In alternative embodiments, the multiplication factor may be >1 to increase the weight of the ambient signal 452 in the combination signal 482 which is generated by the combination unit 430. This makes it possible to enhance the ambient signal portions and to create a different sound impression for the listener.
(34) While in the embodiment illustrated in
(35) In other embodiments, the input signal comprises more than two channels which are fed into the ambient/direct decomposer 410. As a result, the ambient signal group then comprises more than two ambient signals and the direct signal group also comprises more than two direct signals. Correspondingly, more than two channels may also be fed into the gain modifier 490 for gain modification. For example, three, four, five or nine input channels may be fed into the ambient gain modifier 490. In embodiments, the modification unit 420 and the combination unit 430 may be adapted to communicate with each other as illustrated by dotted line 435.
(36)
(37) In the embodiment of
(38) The decorrelated signals 562, 564, 566 are then fed into the gain modifier 524. The gain modifier gain modifies each one of the inputted signals 562, 564, 566 to obtain gain modified signals 572, 574, 576, respectively. The gain modifier 524 may be adapted to multiply the amplitudes of the inputted signals 562, 564, 566 by a factor to obtain the gain modified signals. Gain modification in the gain modifier 524 may be time-variant. For example, at a first point in time, a signal is gain modified with a first gain modification factor while at a different second point in time, a signal is gain modified with a different second gain modification factor.
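The time-variant gain modification described above may be sketched as follows. This is a minimal illustration under stated assumptions: the gain is modelled as a function of the sample index, and the names `time_variant_gain` and `step_gain` as well as the gain factors 0.5 and 0.8 are hypothetical choices for this sketch, not values taken from the embodiment.

```python
def time_variant_gain(signal, gain_at):
    """Apply a gain factor that may differ from sample to sample.

    signal: list of samples.
    gain_at: callable mapping a sample index to a gain modification
             factor (hypothetical interface chosen for this sketch).
    """
    return [gain_at(n) * x for n, x in enumerate(signal)]


def step_gain(n, switch_index=4, first=0.5, second=0.8):
    """Return the first factor before switch_index, the second afterwards."""
    return first if n < switch_index else second


decorrelated = [1.0] * 8   # hypothetical stand-in for a decorrelated signal
gain_modified = time_variant_gain(decorrelated, step_gain)
```

Here the first four samples are scaled by the first gain modification factor and the remaining samples by the second one. A practical implementation would probably interpolate between the two factors to avoid audible discontinuities, which is not shown here.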
(39) Afterwards, the gain modified signals 572, 574, 576 are fed into a low-pass filter unit 526. The low-pass filter unit 526 low-pass filters each one of the gain modified signals 572, 574, 576 to obtain modified signals 582, 584, 586, respectively. While the embodiment of
(40)
(41) The five input channels L, R, C, LS, RS are fed into an ambient/direct decomposer 610. The ambient/direct decomposer 610 decomposes the left signal L into an ambient signal L_A of an ambient signal group and into a direct signal L_D of a direct signal group. Furthermore, the ambient/direct decomposer 610 decomposes the input signal R into an ambient signal R_A of the ambient signal group and into a direct signal R_D of the direct signal group. Moreover, the ambient/direct decomposer 610 decomposes a left surround signal LS into an ambient signal LS_A of the ambient signal group and into a direct signal LS_D of the direct signal group. Furthermore, the ambient/direct decomposer 610 decomposes the right surround signal RS into an ambient signal RS_A of the ambient signal group and into a direct signal RS_D of the direct signal group.
(42) The ambient/direct decomposer 610 does not modify the center signal C. Instead, the signal C is outputted as an output channel C_h without modification.
(43) The ambient/direct decomposer 610 feeds the ambient signal L_A into a first decorrelation unit 621, which decorrelates the signal L_A. The ambient/direct decomposer 610 also passes the ambient signal L_A to a first gain modification unit 691 of a first gain modifier. The first gain modification unit 691 gain modifies the signal L_A and feeds the gain modified signal into a first combination unit 631. Furthermore, the signal L_D is fed by the ambient/direct decomposer 610 into the first combination unit 631. The first combination unit 631 combines the gain modified signal L_A and the direct signal L_D to obtain an output channel L_h.
(44) Furthermore, the ambient/direct decomposer 610 feeds the signals R_A, LS_A and RS_A into a second 692, a third 693 and a fourth 694 gain modification unit of the first gain modifier. The second 692, third 693 and fourth 694 gain modification units gain modify the received signals R_A, LS_A and RS_A, respectively. The second 692, third 693 and fourth 694 gain modification units then pass the gain modified signals to a second 632, a third 633 and a fourth 634 combination unit, respectively. Moreover, the ambient/direct decomposer 610 feeds the signal R_D into the combination unit 632, the signal LS_D into the combination unit 633 and the signal RS_D into the combination unit 634. The combination units 632, 633, 634 then combine the signals R_D, LS_D, RS_D with the gain modified signals R_A, LS_A, RS_A, respectively, to obtain the respective output channels R_h, LS_h, RS_h.
(45) Moreover, the ambient/direct decomposer 610 feeds the signal L_A into a first decorrelation unit 621, wherein the ambient signal L_A is decorrelated. The first decorrelation unit 621 then passes the decorrelated signal L_A into a fifth gain modification unit 625 of a second gain modifier, wherein the decorrelated ambient signal L_A is gain modified. Then, the fifth gain modification unit 625 passes the gain modified ambient signal L_A into a first low-pass filter unit 635, where the gain modified ambient signal is low-pass filtered to obtain a low-pass filtered ambient signal L_e as an output channel of the output signal of the apparatus.
(46) Likewise, the ambient/direct decomposer 610 passes the signals R_A, LS_A and RS_A to a second 622, third 623 and fourth 624 decorrelation unit, which decorrelate the received ambient signals, respectively. The second, third and fourth decorrelation units 622, 623, 624 pass the decorrelated ambient signals to a sixth 626, seventh 627 and eighth 628 gain modification unit of the second gain modifier, respectively. The sixth, seventh and eighth gain modification units 626, 627, 628 gain modify the decorrelated signals and pass the gain modified signals to a second 636, third 637 and fourth 638 low-pass filter unit, respectively. The second, third and fourth low-pass filter units 636, 637, 638 low-pass filter the gain modified signals, respectively, to obtain low-pass filtered output signals R_e, LS_e and RS_e as output channels of the output signal of the apparatus.
(47) In an embodiment, a modification unit may comprise the first, second, third and fourth decorrelation units 621, 622, 623, 624, the fifth, sixth, seventh and eighth gain modification units 625, 626, 627, 628 and the first, second, third and fourth low-pass filter units 635, 636, 637, 638. A joint combination unit may comprise the first, second, third and fourth combination units 631, 632, 633, 634.
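The two signal paths described for each decomposed channel (gain modification followed by combination with the direct signal, and decorrelation followed by gain modification and low-pass filtering) may be sketched as follows. This is a rough illustration only. The text does not specify how the decorrelation units or the low-pass filter units are realized, so the plain sample delay and the one-pole filter used here are assumptions, as are all function names, gain values and sample values.

```python
def delay_decorrelate(x, delay=3):
    """Stand-in decorrelator (assumption): delay the signal by `delay` samples."""
    if delay <= 0:
        return list(x)
    return ([0.0] * delay + list(x))[:len(x)]


def one_pole_lowpass(x, alpha=0.2):
    """Stand-in low-pass filter (assumption): y[n] = y[n-1] + alpha * (x[n] - y[n-1])."""
    y, state = [], 0.0
    for sample in x:
        state += alpha * (sample - state)
        y.append(state)
    return y


def process_channel(ambient, direct, horizontal_gain=0.3, elevated_gain=0.7):
    """Return (combination channel, low-pass filtered ambient channel)."""
    # Path 1: gain modify the ambient signal and combine it with the direct signal.
    horizontal = [horizontal_gain * a + d for a, d in zip(ambient, direct)]
    # Path 2: decorrelate, gain modify, then low-pass filter the ambient signal.
    decorrelated = delay_decorrelate(ambient)
    elevated = one_pole_lowpass([elevated_gain * s for s in decorrelated])
    return horizontal, elevated


# Hypothetical decomposed channels: name -> (ambient samples, direct samples).
decomposed = {
    "L": ([0.1, 0.2, 0.0, -0.1], [1.0, 0.0, 0.5, 0.0]),
    "R": ([0.0, 0.1, 0.1, 0.0], [0.0, 1.0, 0.0, 0.5]),
    "LS": ([0.2, 0.0, 0.0, 0.1], [0.0, 0.0, 0.0, 0.0]),
    "RS": ([0.0, 0.2, 0.1, 0.0], [0.0, 0.0, 0.0, 0.0]),
}
center = [0.5, 0.5, 0.5, 0.5]

outputs = {"C_h": list(center)}   # the center channel is passed on unmodified
for name, (amb, dirc) in decomposed.items():
    outputs[name + "_h"], outputs[name + "_e"] = process_channel(amb, dirc)
```

The dictionary `outputs` then holds nine channels: C_h, the four combination channels L_h, R_h, LS_h, RS_h and the four low-pass filtered ambient channels L_e, R_e, LS_e, RS_e.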
(48) In the embodiment of
(49)
(50)
(51) The five loudspeakers 810, 820, 830, 840, 850 are horizontally arranged, i.e. they are arranged horizontally with respect to a listener's position. The four other loudspeakers 860, 870, 880, 890 are elevated, i.e. they are arranged above a listener's position. In other embodiments, the loudspeakers 810, 820, 830, 840, 850 are horizontally arranged, while the four other loudspeakers 860, 870, 880, 890 are lowered, i.e. they are arranged below a listener's position. In further embodiments, one or more of the loudspeakers are horizontally arranged, one or more of the loudspeakers are elevated and one or more of the loudspeakers are lowered with respect to a listener's position.
(52) In an embodiment, an apparatus of the embodiment illustrated by
(53) In a further embodiment, an apparatus of the embodiment illustrated by
(54) In an embodiment, an apparatus for generating an output signal is provided. The output signal has at least four output channels. Moreover, the output signal is generated from an input signal having at least two input channels. The apparatus comprises an ambience extractor which is adapted to extract at least two ambient signals with ambient signal portions from the at least two input channels. The ambience extractor is adapted to feed the first input channel with direct and ambient signal portions as a first output channel into a first horizontally arranged loudspeaker. Moreover, the ambience extractor is adapted to feed the second input channel with direct and ambient signal portions as the second output channel into a second horizontally arranged loudspeaker. Furthermore, the apparatus comprises an ambient modification unit. The ambient modification unit is adapted to modify the at least two ambient signals to obtain at least a first modified ambient signal and a second modified ambient signal. Furthermore, the ambient modification unit is adapted to feed the first modified ambient signal as a third output channel into a first elevated loudspeaker. Moreover, the ambient modification unit is adapted to feed the second modified ambient signal as a fourth output channel into a second elevated loudspeaker. In further embodiments, the ambient modification unit may combine a first ambient signal and a second ambient signal to obtain one or more modified ambient signals.
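The channel routing of this embodiment can be sketched as follows. The sketch assumes that the two ambient signals have already been extracted by some ambience extractor; the function names `route_four_channels` and `attenuate` and the gain value are hypothetical, and a plain gain is only one possible ambient modification.

```python
import numpy as np

def attenuate(ambient, gain=0.7):
    """Example ambient modification: a plain gain (assumed for illustration)."""
    return gain * ambient

def route_four_channels(left_in, right_in, ambient_left, ambient_right, modify):
    """Return (horizontal_left, horizontal_right, elevated_left, elevated_right).

    The horizontal loudspeakers receive the unmodified input channels, which
    contain direct and ambient portions. The elevated loudspeakers receive
    only modified ambient signals.
    """
    return (left_in, right_in, modify(ambient_left), modify(ambient_right))

# Hypothetical input data: two input channels and their extracted ambient parts.
left = np.array([1.0, 0.5, -0.25])
right = np.array([0.8, -0.4, 0.1])
amb_left = np.array([0.1, 0.05, -0.02])
amb_right = np.array([0.08, -0.03, 0.01])
h_left, h_right, e_left, e_right = route_four_channels(
    left, right, amb_left, amb_right, attenuate)
```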
(55) In an embodiment, a plurality of loudspeakers is arranged in a motor vehicle, for example, in a car. The plurality of loudspeakers is arranged as horizontally arranged loudspeakers and as elevated loudspeakers. An apparatus according to one of the above-described embodiments is employed to generate output channels. Output channels which comprise only ambient signal portions are fed into the elevated loudspeakers. Output channels which are combination signals comprising ambient and direct signal portions are fed into the horizontally arranged loudspeakers.
(56) In embodiments, one, some or all of the elevated and/or horizontally arranged loudspeakers may be inclined.
(57) Subsequently, possible configurations of an ambient/direct decomposer according to embodiments are discussed.
(58) Various decomposers and decomposing methods that are adapted for decomposing an input signal having two channels into two ambient and two direct signals are known in the state of the art. See, for example:
C. Avendano and J.-M. Jot, “A frequency-domain approach to multichannel upmix,” Journal of the Audio Engineering Society, vol. 52, no. 7/8, pp. 740-749, 2004.
C. Faller, “Multiple-loudspeaker playback of stereo signals,” Journal of the Audio Engineering Society, vol. 54, no. 11, pp. 1051-1064, November 2006.
J. Usher and J. Benesty, “Enhancement of spatial sound quality: A new reverberation-extraction audio upmixer,” IEEE Transactions on Audio, Speech, and Language Processing, vol. 15, no. 7, pp. 2141-2150, September 2007.
(59) In the following and with respect to
(60)
(61) In
(62) The analyzer is operative to analyze the downmixed signal with respect to perceptually distinct components. These perceptually distinct components can be independent components in the individual channels on the one hand, and dependent components on the other hand. Alternative signal components to be analyzed are direct components on the one hand and ambient components on the other hand. There are many other components which can be separated, such as speech components from music components, noise components from speech components, noise components from music components, high frequency noise components with respect to low frequency noise components, in multi-pitch signals the components provided by the different instruments, etc.
(63)
(64) Hence, different possibilities exist for the signal processor and all of these possibilities are advantageous due to the unique operation of the analyzer using a pre-calculated frequency-dependent correlation curve as a reference curve to determine the analysis result.
(65) Subsequently, further embodiments are discussed. It is to be noted that, as discussed in the context of
(66) Particularly, the time/frequency converter would be placed to convert the analysis signal before the analysis signal is input into the analyzer, and the frequency/time converter would be placed at the output of the signal processor to convert the processed signal back into the time domain. When a signal deriver exists, the time/frequency converter might be placed at an input of the signal deriver so that the signal deriver, the analyzer, and the signal processor all operate in the frequency/subband domain. In this context, frequency and subband basically mean a portion in frequency of a frequency representation.
(67) It is furthermore clear that the analyzer in
(68) In
(69) In the picture, T/F denotes a time/frequency transform, commonly a short-time Fourier transform (STFT). iT/F denotes the respective inverse transform.
(70) [x.sub.1(n), . . . , x.sub.N(n)] are the time domain input signals, where n is the time index. [X.sub.1(m,i), . . . , X.sub.N(m,i)] denote the coefficients of the frequency decomposition, where m is the decomposition time index, and i is the decomposition frequency index.
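The index convention X(m,i) can be illustrated with a minimal short-time Fourier transform. This is a sketch only: the frame length, hop size, Hann window and the function name `stft` are assumptions chosen for illustration and are not prescribed by the description.

```python
import numpy as np

def stft(x, frame_len=512, hop=256):
    """Return X[m, i], where m is the frame (time) index and i the frequency-bin index."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack([x[m * hop : m * hop + frame_len] * window
                       for m in range(n_frames)])
    # Real-input FFT along each frame: frame_len // 2 + 1 frequency bins.
    return np.fft.rfft(frames, axis=1)

# Hypothetical test signal: a 1 kHz tone sampled at 8 kHz.
fs = 8000
n = np.arange(4096)
x1 = np.sin(2 * np.pi * 1000 * n / fs)
X1 = stft(x1)
```

With these settings the bin spacing is fs / frame_len = 15.625 Hz, so the 1 kHz tone falls exactly on frequency index i = 64.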
(71) [D.sub.1(m,i), D.sub.2(m,i)] are the two channels of the downmixed signal.
(72)
(73) W(m,i) is the calculated weighting. [Y.sub.1(m,i), . . . , Y.sub.N(m,i)] are the weighted frequency decompositions of each channel. H.sub.ij(i) are the downmix coefficients, which can be real-valued or complex-valued and the coefficients can be constant in time or time-variant. Hence, the downmix coefficients can be just constants or filters such as HRTF filters, reverberation filters or similar filters.
$$Y_j(m,i) = W_j(m,i) \cdot X_j(m,i), \quad j = 1, 2, \ldots, N \tag{2}$$
(74) In
$$Y_j(m,i) = W(m,i) \cdot X_j(m,i) \tag{3}$$
(75) [y.sub.1(n), . . . , y.sub.N(n)] are the time-domain output signals comprising the extracted signal components. (The input signal may have an arbitrary number of channels (N), produced for an arbitrary target playback loudspeaker setup. The downmix may include HRTFs to obtain ear-input-signals, simulation of auditory filters, etc. The downmix may also be carried out in the time domain.)
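The downmix and the two weighting variants of equations (2) and (3) can be sketched with array operations. The array layout (channel, frame, bin), the assumption of time-invariant downmix coefficients, and the function names are illustrative choices, not part of the description.

```python
import numpy as np

def downmix(X, H):
    """Two-channel downmix D[k, m, i] = sum_j H[k, j, i] * X[j, m, i].

    X has shape (N, M, I): N channels, M frames, I frequency bins.
    H has shape (2, N, I): time-invariant downmix coefficients (assumed here).
    """
    return np.einsum("kji,jmi->kmi", H, X)

def apply_channel_weights(X, W_per_channel):
    """Equation (2): Y_j(m, i) = W_j(m, i) * X_j(m, i); W_per_channel has shape (N, M, I)."""
    return W_per_channel * X

def apply_common_weight(X, W):
    """Equation (3): Y_j(m, i) = W(m, i) * X_j(m, i); W has shape (M, I)."""
    return W[np.newaxis, :, :] * X

# Hypothetical data: five channels, four frames, three bins.
X = np.ones((5, 4, 3), dtype=complex)
H = np.ones((2, 5, 3))
D = downmix(X, H)
Y = apply_common_weight(X, np.full((4, 3), 0.5))
```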
(76) In an embodiment, the difference between a reference correlation as a function of frequency (c.sub.ref(ω)) and the actual correlation of the downmixed input signal (c.sub.sig(ω)) is computed. (Throughout this text, the term correlation is used as a synonym for inter-channel similarity and may thus also include evaluations of time shifts, for which the term coherence is usually used.)
(77) The term similarity includes the correlation and the coherence, where, in a strict mathematical sense, the correlation is calculated between two signals without an additional time shift, and the coherence is calculated by shifting the two signals in time/phase so that the signals have a maximum correlation, the actual correlation over frequency then being calculated with the time/phase shift applied. For this text, similarity, correlation and coherence are considered to mean the same, i.e., a quantitative degree of similarity between two signals, where a higher absolute value of the similarity means that the two signals are more similar and a lower absolute value of the similarity means that the two signals are less similar.
(78) Even if time shifts are evaluated, the resulting value may have a sign. (Commonly, the coherence is defined as having only positive values.)
(79) Depending on the deviation of the actual curve from the reference curve, a weighting factor for each time-frequency tile is calculated, indicating if it comprises dependent or independent components. The obtained time-frequency weighting indicates the independent components and may already be applied to each channel of the input signal to yield a multichannel signal (number of channels equal to number of input channels) including independent parts that may be perceived as either distinct or diffuse.
(80) The reference curve may be defined in different ways. Examples are:
Ideal theoretical reference curve for an idealized two- or three-dimensional diffuse sound field composed of independent components.
The ideal curve achievable with the reference target loudspeaker setup for the given input signal (e.g. standard stereo setup with azimuth angles (±30°), or standard five-channel setup according to ITU-R BS.775 with azimuth angles (0°, ±30°, ±110°)).
The ideal curve for the actually present loudspeaker setup (the actual positions could be measured or known through user input; the reference curve can be calculated assuming playback of independent signals over the given loudspeakers).
The actual frequency-dependent short-time power of each input channel may be incorporated in the calculation of the reference.
(81) Given a frequency dependent reference curve (c.sub.ref(ω)), an upper threshold (c.sub.hi(ω)) and lower threshold (c.sub.lo(ω)) can be defined (see
(82) If the deviation of the actual curve from the reference curve is within the boundaries given by the thresholds, the actual bin gets a weighting indicating independent components. Above the upper threshold or below the lower threshold, the bin is indicated as dependent. This indication may be binary or gradual (i.e. following a soft-decision function). In particular, if the upper and lower thresholds coincide with the reference curve, the applied weighting is directly related to the deviation from the reference curve.
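The binary and the gradual indication can be sketched as follows. The linear fall-off used for the soft decision is an assumption made for illustration, since the description leaves the soft-decision function open. The function names are hypothetical, and the thresholds are assumed to satisfy −1 < c_lo ≤ c_hi < 1.

```python
import numpy as np

def binary_weight(c_sig, c_lo, c_hi):
    """1.0 where c_lo <= c_sig <= c_hi (independent), otherwise 0.0 (dependent)."""
    c_sig = np.asarray(c_sig, dtype=float)
    return np.where((c_sig >= c_lo) & (c_sig <= c_hi), 1.0, 0.0)

def soft_weight(c_sig, c_lo, c_hi):
    """1 inside the threshold band, falling linearly to 0 at a correlation of +1 or -1."""
    c_sig = np.asarray(c_sig, dtype=float)
    above = np.clip((c_sig - c_hi) / (1.0 - c_hi), 0.0, 1.0)  # excess above the upper threshold
    below = np.clip((c_lo - c_sig) / (c_lo + 1.0), 0.0, 1.0)  # excess below the lower threshold
    return 1.0 - np.maximum(above, below)

# Hypothetical correlation values for five bins and fixed thresholds.
c = np.array([-1.0, 0.0, 0.2, 0.6, 1.0])
w_binary = binary_weight(c, c_lo=0.1, c_hi=0.3)
w_soft = soft_weight(c, c_lo=0.1, c_hi=0.3)
```

If c_lo and c_hi are both set to the reference value, `soft_weight` depends only on the deviation from the reference curve, which corresponds to the special case mentioned at the end of the paragraph.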
(83) With reference to
(84) Then, as for example illustrated in
(85) When, however, it is determined that the determined correlation value indicates a higher absolute correlation than the reference correlation value, then it is determined that the time/frequency tile under consideration comprises dependent components. Hence, when the correlation of a time/frequency tile of the downmix or analysis signal indicates a higher absolute correlation value than the reference curve, then it can be said that the components in this time/frequency tile are dependent on each other. When, however, the correlation is indicated to be very close to the reference curve, then it can be said that the components are independent. Dependent components can receive a first weighting value such as 1 and independent components can receive a second weighting value such as 0. Advantageously, as illustrated in
(86) Furthermore, with respect to
(87) The alternative way of calculating the result is to actually calculate the distance between the correlation value determined in block 80 and the retrieved correlation value obtained in block 82 and to then determine a metric between 0 and 1 as a weighting factor based on the distance. While the first alternative (1) in
(88) The signal processor 20 in
(89) Subsequently, the calculation of a reference curve is discussed in more detail. For the present invention, however, it is basically not important how the reference curve was derived. It can be an arbitrary curve or, for example, values in a look-up table indicating an ideal or desired relation of the input signals x.sub.j in the downmix signal D or, and in the context of
(90) The physical diffusion of a sound field can be evaluated by a method introduced by Cook et al. (Richard K. Cook, R. V. Waterhouse, R. D. Berendt, Seymour Edelman, and M. C. Thompson Jr., “Measurement of correlation coefficients in reverberant sound fields,” Journal of the Acoustical Society of America, vol. 27, no. 6, pp. 1072-1077, November 1955), utilizing the correlation coefficient (r) of the steady state sound pressure of plane waves at two spatially separated points, as given by the following equation (4):
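(91) $$r = \frac{\langle p_1(n)\, p_2(n) \rangle}{\left[ \langle p_1^2(n) \rangle \, \langle p_2^2(n) \rangle \right]^{1/2}} \tag{4}$$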
where p.sub.1(n) and p.sub.2(n) are the sound pressure measurements at two points, n is the time index, and <•> denotes time averaging. In a steady state sound field, the following relations can be derived:
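(92) $$r(k,d) = \frac{\sin(kd)}{kd} \quad \text{(three-dimensional sound fields)} \tag{5}$$
$$r(k,d) = J_0(kd) \quad \text{(two-dimensional sound fields, } J_0 \text{ being the zeroth-order Bessel function of the first kind)} \tag{6}$$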
where d is the distance between the two measurement points and
(93) $$k = \frac{2\pi}{\lambda}$$
is the wavenumber, with λ being the wavelength. (The physical reference curve r(k,d) may already be used as c.sub.ref for further processing.)
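A physical reference curve of this kind can be evaluated numerically. The sketch below assumes the classical diffuse-field results, sin(kd)/(kd) for a three-dimensional field and the zeroth-order Bessel function J0(kd) for a two-dimensional field. The function names, the speed of sound (343 m/s) and the sensor distance (0.17 m) are assumptions chosen for illustration.

```python
import numpy as np

def r_3d(k, d):
    """sin(kd)/(kd). np.sinc(x) computes sin(pi*x)/(pi*x), hence the division by pi."""
    return np.sinc(np.asarray(k, dtype=float) * d / np.pi)

def r_2d(k, d, n_points=2048):
    """J0(kd), evaluated from J0(x) = (1/pi) * integral_0^pi cos(x*sin(theta)) d(theta)
    with a midpoint rule, so that no library beyond NumPy is needed."""
    x = np.atleast_1d(np.asarray(k, dtype=float) * d)
    theta = (np.arange(n_points) + 0.5) * np.pi / n_points
    return np.cos(x[:, None] * np.sin(theta)[None, :]).mean(axis=1)

c = 343.0    # assumed speed of sound in m/s
d = 0.17     # assumed distance between the two measurement points in m
freqs = np.array([0.0, 500.0, 1000.0, 2000.0])
k = 2 * np.pi * freqs / c          # k = 2*pi/lambda with lambda = c/f
ref_3d = r_3d(k, d)
ref_2d = r_2d(k, d)
```

Both curves equal 1 at kd = 0 and decay with increasing frequency or distance; the three-dimensional curve has its first zero at kd = π.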
(94) A measure for the perceptual diffuseness of a sound field is the interaural cross correlation coefficient (ρ), measured in a sound field. Measuring ρ implies that the distance between the pressure sensors (resp. the ears) is fixed. Including this restriction, r becomes a function of frequency with the radian frequency ω=kc, where c is the speed of sound in air. Furthermore, the pressure signals differ from the previously considered free field signals due to reflection, diffraction, and bending-effects caused by the listener's pinnae, head, and torso. Those effects, substantial for spatial hearing, are described by head-related transfer functions (HRTFs). Considering those influences, the resulting pressure signals at the ear entrances are p.sub.L(n,ω) and p.sub.R(n,ω). For the calculation, measured HRTF data may be used or approximations can be obtained by using an analytical model (e.g. Richard O. Duda and William L. Martens, “Range dependence of the response of a spherical head model,” Journal Of The Acoustical Society Of America, vol. 104, no. 5, pp. 3048-3058, November 1998).
(95) Since the human auditory system acts as a frequency analyzer with limited frequency selectivity, this frequency selectivity may furthermore be incorporated. The auditory filters are assumed to behave like overlapping bandpass filters. In the following example explanation, a critical band approach is used to approximate these overlapping bandpasses by rectangular filters. The equivalent rectangular bandwidth (ERB) may be calculated as a function of center frequency (Brian R. Glasberg and Brian C. J. Moore, “Derivation of auditory filter shapes from notched-noise data,” Hearing Research, vol. 47, pp. 103-138, 1990). Considering that the binaural processing follows the auditory filtering, ρ has to be calculated for separate frequency channels, yielding the following frequency dependent pressure signals
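(96) $$\hat{p}_L(n,\omega) = \frac{1}{b(\omega)} \int_{\omega - b(\omega)/2}^{\omega + b(\omega)/2} p_L(n,\omega)\, d\omega \tag{7}$$
$$\hat{p}_R(n,\omega) = \frac{1}{b(\omega)} \int_{\omega - b(\omega)/2}^{\omega + b(\omega)/2} p_R(n,\omega)\, d\omega \tag{8}$$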
where the integration limits are given by the bounds of the critical band according to the actual center frequency ω. The factors 1/b(ω) may or may not be used in equations (7) and (8).
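A discrete counterpart of this band integration can be sketched as follows. The ERB formula is the one given in the cited Glasberg and Moore paper (frequency in Hz). Replacing the integral by a sum over frequency bins and the 1/b(ω) factor by a division by the number of bins is an assumption made for illustration, as are the function names.

```python
import numpy as np

def erb_hz(center_hz):
    """Equivalent rectangular bandwidth in Hz (Glasberg and Moore, 1990)."""
    return 24.7 * (4.37 * center_hz / 1000.0 + 1.0)

def band_average(spectrum, freqs_hz, center_hz, normalize=True):
    """Sum the spectrum over a rectangular band one ERB wide around center_hz.

    With normalize=True the sum is divided by the number of bins in the band,
    a discrete stand-in for the optional 1/b factor. The band is assumed to
    contain at least one bin.
    """
    half = erb_hz(center_hz) / 2.0
    mask = (freqs_hz >= center_hz - half) & (freqs_hz <= center_hz + half)
    total = spectrum[mask].sum()
    return total / mask.sum() if normalize else total

# Hypothetical flat spectrum on a 10 Hz grid.
freqs_hz = np.arange(0.0, 4000.0, 10.0)
spectrum = np.ones_like(freqs_hz)
avg_1k = band_average(spectrum, freqs_hz, 1000.0)
sum_1k = band_average(spectrum, freqs_hz, 1000.0, normalize=False)
```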
(97) If one of the sound pressure measurements is advanced or delayed by a frequency independent time difference, the coherence of the signals can be evaluated. The human auditory system is able to make use of such a time alignment property. Usually, the interaural coherence is calculated within ±1 ms. Depending on the available processing power, calculations can be implemented using only the lag-zero value (for low complexity) or the coherence with a time advance and delay (if high complexity is possible). Throughout this document, no distinction is made between both cases.
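The two cases mentioned above, the lag-zero value and the coherence evaluated within ±1 ms, can be sketched in the time domain as follows. Taking the cross-correlation value with the largest magnitude within the lag range is an assumption about how the coherence is read off; the function names are hypothetical.

```python
import numpy as np

def normalized_xcorr(p1, p2, lag):
    """Normalized cross-correlation between p1[n] and p2[n + lag]."""
    if lag >= 0:
        a, b = p1[: len(p1) - lag], p2[lag:]
    else:
        a, b = p1[-lag:], p2[: len(p2) + lag]
    denom = np.sqrt(np.sum(a * a) * np.sum(b * b))
    return float(np.sum(a * b) / denom) if denom > 0 else 0.0

def lag_zero_correlation(p1, p2):
    """Low-complexity variant: correlation without any time shift."""
    return normalized_xcorr(p1, p2, 0)

def coherence(p1, p2, fs, max_lag_s=1e-3):
    """Higher-complexity variant: value of largest magnitude within +/- max_lag_s (sign kept)."""
    max_lag = int(round(max_lag_s * fs))
    values = [normalized_xcorr(p1, p2, lag) for lag in range(-max_lag, max_lag + 1)]
    return max(values, key=abs)

# Hypothetical example: the second signal is the first one delayed by 10 samples.
fs = 48000
rng = np.random.default_rng(0)
p1 = rng.standard_normal(4800)
p2 = np.concatenate([np.zeros(10), p1[:-10]])
corr0 = lag_zero_correlation(p1, p2)
coh = coherence(p1, p2, fs)
```

For this delayed-noise example the lag-zero correlation is close to zero, while the coherence is close to one, because the 10-sample delay lies within the ±48-sample search range.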
(98) The ideal behavior is achieved considering an ideal diffuse sound field, which can be idealized as a wave field that is composed of equally strong, uncorrelated plane waves propagating in all directions (i.e. a superposition of an infinite number of propagating plane waves with random phase relations and uniformly distributed directions of propagation). A signal radiated by a loudspeaker can be considered a plane wave for a listener positioned sufficiently far away. This plane wave assumption is common in stereophonic playback over loudspeakers. Thus, a synthetic sound field reproduced by loudspeakers consists of contributing plane waves from a limited number of directions.
(99) Consider an input signal with N channels, produced for playback over a setup with loudspeaker positions [l.sub.1, l.sub.2, l.sub.3, . . . , l.sub.N]. (In the case of a horizontal-only playback setup, l.sub.i indicates the azimuth angle. In the general case, l.sub.i=(azimuth, elevation) indicates the position of the loudspeaker relative to the listener's head. If the setup present in the listening room differs from the reference setup, l.sub.i may alternatively represent the loudspeaker positions of the actual playback setup.) With this information, an interaural coherence reference curve ρ.sub.ref for a diffuse field simulation can be calculated for this setup under the assumption that independent signals are fed to each loudspeaker. The signal power contributed by each input channel in each time-frequency tile may be included in the calculation of the reference curve. In the example implementation, ρ.sub.ref is used as c.sub.ref.
(100) Different reference curves as examples for frequency-dependent reference curves or correlation curves are illustrated in
(101) Subsequently the calculation of the analysis results as discussed in the context of
(102) The goal is to derive a weighting that equals 1, if the correlation of the downmix channels is equal to the calculated reference correlation under the assumption of independent signals being played back from all loudspeakers. If the correlation of the downmix equals +1 or −1, the derived weighting should be 0, indicating that no independent components are present. In between those extreme cases, the weighting should represent a reasonable transition between the indication as independent (W=1) or completely dependent (W=0).
(103) Given the reference correlation curve c.sub.ref(ω) and the estimation of the correlation/coherence of the actual input signal played back over the actual reproduction setup (c.sub.sig(ω)) (c.sub.sig is the correlation or, respectively, the coherence of the downmix), the deviation of c.sub.sig(ω) from c.sub.ref(ω) can be calculated. This deviation (possibly including an upper and lower threshold) is mapped to the range [0;1] to obtain a weighting (W(m,i)) that is applied to all input channels to separate the independent components.
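One way to estimate c.sub.sig for each time-frequency tile from the two downmix channels is sketched below. The estimator (real part of the recursively averaged cross-spectrum, normalized by the averaged channel powers), the smoothing constant `alpha`, and the function names are assumptions made for illustration; the description does not prescribe a particular estimator.

```python
import numpy as np

def smooth_over_time(values, alpha):
    """Recursive averaging along the frame axis (axis 0)."""
    out = np.empty_like(values)
    out[0] = values[0]
    for m in range(1, values.shape[0]):
        out[m] = alpha * values[m] + (1.0 - alpha) * out[m - 1]
    return out

def estimate_c_sig(D1, D2, alpha=0.2, eps=1e-12):
    """Signed correlation estimate per tile; D1 and D2 have shape (frames, bins)."""
    cross = smooth_over_time(D1 * np.conj(D2), alpha)
    power1 = smooth_over_time(np.abs(D1) ** 2, alpha)
    power2 = smooth_over_time(np.abs(D2) ** 2, alpha)
    # eps guards against division by zero in silent tiles.
    return np.real(cross) / np.sqrt(power1 * power2 + eps)

# Hypothetical downmix spectra with unit magnitude and random phase.
rng = np.random.default_rng(1)
D1 = np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, size=(6, 4)))
D2 = np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, size=(6, 4)))
c_same = estimate_c_sig(D1, D1)
c_opposite = estimate_c_sig(D1, -D1)
c_other = estimate_c_sig(D1, D2)
```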
(104) The following example illustrates a possible mapping when the thresholds correspond with the reference curve:
(105) The magnitude of the deviation (denoted as Δ) of the actual curve c.sub.sig from the reference c.sub.ref is given by
$$\Delta(\omega) = \left| c_{sig}(\omega) - c_{ref}(\omega) \right| \tag{9}$$
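(106) Given that the correlation/coherence is bounded between [−1;+1], the maximally possible deviation towards +1 or −1 for each frequency is given by
$$\bar{\Delta}_{+}(\omega) = 1 - c_{ref}(\omega) \tag{10}$$
$$\bar{\Delta}_{-}(\omega) = c_{ref}(\omega) + 1 \tag{11}$$
(107) The weighting for each frequency is thus obtained from
(108) $$W(\omega) = \begin{cases} 1 - \dfrac{\Delta(\omega)}{\bar{\Delta}_{+}(\omega)}, & c_{sig}(\omega) \geq c_{ref}(\omega) \\[2ex] 1 - \dfrac{\Delta(\omega)}{\bar{\Delta}_{-}(\omega)}, & c_{sig}(\omega) < c_{ref}(\omega) \end{cases} \tag{12}$$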
(109) Considering the time dependence and the limited frequency resolution of the frequency decomposition, the weighting values are derived as follows (Here, the general case of a reference curve that may change over time is given. A time-independent reference curve (i.e. c.sub.ref(i)) is also possible):
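(110) $$W(m,i) = \begin{cases} 1 - \dfrac{\left| c_{sig}(m,i) - c_{ref}(m,i) \right|}{1 - c_{ref}(m,i)}, & c_{sig}(m,i) \geq c_{ref}(m,i) \\[2ex] 1 - \dfrac{\left| c_{sig}(m,i) - c_{ref}(m,i) \right|}{1 + c_{ref}(m,i)}, & c_{sig}(m,i) < c_{ref}(m,i) \end{cases} \tag{13}$$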
(111) Such processing may be carried out in a frequency decomposition with frequency coefficients grouped into perceptually motivated subbands for reasons of computational complexity and to obtain filters with shorter impulse responses. Furthermore, smoothing filters and compression functions (i.e. distorting the weighting in a desired fashion, additionally introducing minimum and/or maximum weighting values) may be applied.
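The mapping from the deviation to a weighting, for the case in which the thresholds coincide with the reference curve, can be sketched as follows. The sketch assumes that the deviation is normalized by the maximally possible deviation towards +1 or −1, so that the weighting is 1 on the reference curve and 0 at a correlation of ±1. The function name and the clipping parameters `w_min` and `w_max`, which stand in for the minimum and maximum weighting values mentioned above, are hypothetical, and −1 < c_ref < 1 is assumed.

```python
import numpy as np

def weighting(c_sig, c_ref, w_min=0.0, w_max=1.0):
    """Map the deviation of c_sig from c_ref to a weighting in [w_min, w_max]."""
    c_sig = np.asarray(c_sig, dtype=float)
    c_ref = np.asarray(c_ref, dtype=float)
    delta = np.abs(c_sig - c_ref)
    # Largest possible deviation in the direction in which c_sig deviates.
    max_dev = np.where(c_sig >= c_ref, 1.0 - c_ref, 1.0 + c_ref)
    w = 1.0 - delta / max_dev
    return np.clip(w, w_min, w_max)

# Hypothetical tile values c_sig(m, i) and a time-independent reference c_ref(i).
c_sig = np.array([[0.3, 1.0, -1.0],
                  [0.65, 0.3, 0.0]])
c_ref = np.array([0.3, 0.3, 0.3])
W = weighting(c_sig, c_ref)
W_floor = weighting(c_sig, c_ref, w_min=0.1)
```

Passing `c_ref` with shape (bins,) broadcasts a time-independent reference over all frames; a time-variant reference would simply have the same shape as `c_sig`.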
(112)
(113) In the other alternative where there are weighting values between 0 and 1 in
(114) When, however, the signal processor 20 is implemented not for extracting the independent components but for extracting the dependent components, the weightings would be assigned in the opposite manner so that, when the weighting is performed in the multipliers 20 illustrated in
(115)
(116) Although some aspects have been described in the context of an apparatus, it is clear that these aspects also represent a description of the corresponding method, where a block or device corresponds to a method step or a feature of a method step. Analogously, aspects described in the context of a method step also represent a description of a corresponding block or item or feature of a corresponding apparatus.
(117) The inventive decomposed signal can be stored on a digital storage medium or can be transmitted on a transmission medium such as a wireless transmission medium or a wired transmission medium such as the Internet.
(118) Depending on certain implementation requirements, embodiments of the invention can be implemented in hardware or in software. The implementation can be performed using a digital storage medium, for example a floppy disk, a DVD, a CD, a ROM, a PROM, an EPROM, an EEPROM or a FLASH memory, having electronically readable control signals stored thereon, which cooperate (or are capable of cooperating) with a programmable computer system such that the respective method is performed.
(119) Some embodiments according to the invention comprise a non-transitory data carrier having electronically readable control signals, which are capable of cooperating with a programmable computer system, such that one of the methods described herein is performed.
(120) Generally, embodiments of the present invention can be implemented as a computer program product with a program code, the program code being operative for performing one of the methods when the computer program product runs on a computer. The program code may for example be stored on a machine readable carrier.
(121) Other embodiments comprise the computer program for performing one of the methods described herein, stored on a machine readable carrier.
(122) In other words, an embodiment of the inventive method is, therefore, a computer program having a program code for performing one of the methods described herein, when the computer program runs on a computer.
(123) A further embodiment of the inventive methods is, therefore, a data carrier (or a digital storage medium, or a computer-readable medium) comprising, recorded thereon, the computer program for performing one of the methods described herein.
(124) A further embodiment of the inventive method is, therefore, a data stream or a sequence of signals representing the computer program for performing one of the methods described herein. The data stream or the sequence of signals may for example be configured to be transferred via a data communication connection, for example via the Internet.
(125) A further embodiment comprises a processing means, for example a computer, or a programmable logic device, configured to or adapted to perform one of the methods described herein.
(126) A further embodiment comprises a computer having installed thereon the computer program for performing one of the methods described herein.
(127) In some embodiments, a programmable logic device (for example a field programmable gate array) may be used to perform some or all of the functionalities of the methods described herein. In some embodiments, a field programmable gate array may cooperate with a microprocessor in order to perform one of the methods described herein. Generally, the methods may be performed by any hardware apparatus.
(128) While this invention has been described in terms of several embodiments, there are alterations, permutations, and equivalents which will be apparent to others skilled in the art and which fall within the scope of this invention. It should also be noted that there are many alternative ways of implementing the methods and compositions of the present invention. It is therefore intended that the following appended claims be interpreted as including all such alterations, permutations, and equivalents as fall within the true spirit and scope of the present invention.