Spectral defect compensation for crosstalk processing of spatial audio signals
11051121 · 2021-06-29
Assignee
Inventors
Cpc classification
H04S2420/07
ELECTRICITY
H04R5/04
ELECTRICITY
H04S2400/05
ELECTRICITY
H04S3/008
ELECTRICITY
H04S2400/01
ELECTRICITY
H04R2430/03
ELECTRICITY
International classification
H04S7/00
ELECTRICITY
H04S3/00
ELECTRICITY
Abstract
An audio system provides for spatial enhancement, crosstalk processing, and crosstalk compensation of an input audio signal. The crosstalk compensation compensates for spectral defects caused by the application of the crosstalk processing to a spatially enhanced signal. The crosstalk compensation may be performed prior to the crosstalk processing, after the crosstalk processing, or in parallel with the crosstalk processing. The crosstalk compensation includes applying filters to the mid and side components of the left and right input channels to compensate for spectral defects from crosstalk processing of the audio signal. The crosstalk processing may include crosstalk simulation or crosstalk cancellation. In some embodiments, the crosstalk compensation may be integrated with a subband spatial processing that spatially enhances the audio signal.
Claims
1. A method for enhancing an audio signal having a left channel and a right channel, comprising, by a circuitry: applying a crosstalk processing to the audio signal; generating a mid component using a sum of the left channel and the right channel, the mid component being a nonspatial component of the audio signal; generating a mid compensation channel by applying filters to the mid component that compensate for spectral defects in the crosstalk processed audio signal caused by the crosstalk processing; and generating a left output channel and a right output channel using the mid compensation channel.
2. The method of claim 1, wherein the crosstalk processing includes a crosstalk cancellation.
3. The method of claim 2, wherein applying the crosstalk processing including the crosstalk cancellation includes: applying a first filter and first time delay to a portion of the left channel; and applying a second filter and a second time delay to a portion of the right channel.
4. The method of claim 1, wherein the crosstalk processing includes a crosstalk simulation.
5. The method of claim 4, wherein applying the crosstalk processing including the crosstalk simulation includes: applying a first filter and first time delay to the left channel; and applying a second filter and a second time delay to the right channel.
6. The method of claim 1, further comprising, by the circuitry, applying a subband spatial processing to the audio signal by gain adjusting mid subband components and side subband components of the left and right channels, the mid subband components being frequency bands of the mid component.
7. The method of claim 6, wherein the mid compensation channel is generated subsequent to the application of the subband spatial processing to the audio signal.
8. The method of claim 6, wherein the mid compensation channel is generated prior to the application of the subband spatial processing to the audio signal.
9. The method of claim 1, wherein the crosstalk processing is applied prior to the generation of the mid compensation channel.
10. The method of claim 1, wherein the crosstalk processing is applied subsequent to the generation of the mid compensation channel.
11. A system for enhancing an audio signal having a left channel and a right channel, comprising: a circuitry configured to: apply a crosstalk processing to the audio signal; generate a mid component using a sum of the left channel and the right channel, the mid component being a nonspatial component of the audio signal; generate a mid compensation channel by applying filters to the mid component that compensate for spectral defects in the crosstalk processed audio signal caused by the crosstalk processing; and generate a left output channel and a right output channel using the mid compensation channel.
12. The system of claim 11, wherein the crosstalk processing includes a crosstalk cancellation.
13. The system of claim 12, wherein the circuitry configured to apply the crosstalk processing including the crosstalk cancellation includes the circuitry being configured to: apply a first filter and a first time delay to a portion of the left channel; and apply a second filter and a second time delay to a portion of the right channel.
14. The system of claim 11, wherein the crosstalk processing includes a crosstalk simulation.
15. The system of claim 14, wherein the circuitry configured to apply the crosstalk processing including the crosstalk simulation includes the circuitry being configured to: apply a first filter and first time delay to the left channel; and apply a second filter and a second time delay to the right channel.
16. The system of claim 11, wherein the circuitry is further configured to apply a subband spatial processing to the audio signal by gain adjusting mid subband components and side subband components of the left and right channels, the mid subband components being frequency bands of the mid component.
17. The system of claim 16, wherein the circuitry is configured to generate the mid compensation channel subsequent to the application of the subband spatial processing to the audio signal.
18. The system of claim 16, wherein the circuitry is configured to generate the mid compensation channel prior to the application of the subband spatial processing to the audio signal.
19. The system of claim 11, wherein the circuitry is configured to apply the crosstalk processing prior to the generation of the mid compensation channel.
20. The system of claim 11, wherein the circuitry is configured to apply the crosstalk processing subsequent to the generation of the mid compensation channel.
21. A non-transitory computer readable medium comprising stored program code that when executed by a processor causes the processor to: apply a crosstalk processing to an audio signal including a left channel and a right channel; generate a mid component using a sum of the left channel and the right channel, the mid component being a nonspatial component of the audio signal; generate a mid compensation channel by applying filters to the mid component that compensate for spectral defects in the crosstalk processed audio signal caused by the crosstalk processing; and generate a left output channel and a right output channel using the mid compensation channel.
22. The computer readable medium of claim 21, wherein the crosstalk processing includes a crosstalk cancellation.
23. The computer readable medium of claim 22, wherein the program code that causes the processor to apply the crosstalk processing including the crosstalk cancellation includes the program code causing the processor to: generate a left crosstalk cancellation component by filtering and time delaying a portion of the left channel; and generate a right crosstalk cancellation component by filtering and time delaying a portion of the right channel.
24. The computer readable medium of claim 21, wherein the crosstalk processing includes a crosstalk simulation.
25. The computer readable medium of claim 24, wherein the program code that causes the processor to apply the crosstalk processing including the crosstalk simulation includes the program code causing the processor to: apply a first filter and first time delay to the left channel; and apply a second filter and a second time delay to the right channel.
26. The computer readable medium of claim 21, wherein the program code further causes the processor to apply a subband spatial processing to the audio signal by gain adjusting mid subband components and side subband components of the left and right channels, the mid subband components being frequency bands of the mid component.
27. The computer readable medium of claim 26, wherein the program code causes the processor to generate the mid compensation channel subsequent to the application of the subband spatial processing to the audio signal.
28. The computer readable medium of claim 26, wherein the program code causes the processor to generate the mid compensation channel prior to the application of the subband spatial processing to the audio signal.
29. The computer readable medium of claim 21, wherein the program code causes the processor to apply the crosstalk processing prior to the generation of the mid compensation channel.
30. The computer readable medium of claim 21, wherein the program code causes the processor to apply the crosstalk processing subsequent to the generation of the mid compensation channel.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
DETAILED DESCRIPTION
(31) The features and advantages described in the specification are not all inclusive and, in particular, many additional features and advantages will be apparent to one of ordinary skill in the art in view of the drawings, specification, and claims. Moreover, it should be noted that the language used in the specification has been principally selected for readability and instructional purposes, and may not have been selected to delineate or circumscribe the inventive subject matter.
(32) The Figures (FIG.) and the following description relate to the preferred embodiments by way of illustration only. It should be noted that from the following discussion, alternative embodiments of the structures and methods disclosed herein will be readily recognized as viable alternatives that may be employed without departing from the principles of the present invention.
(33) Reference will now be made in detail to several embodiments of the present invention(s), examples of which are illustrated in the accompanying figures. It is noted that wherever practicable similar or like reference numbers may be used in the figures and may indicate similar or like functionality. The figures depict embodiments for purposes of illustration only. One skilled in the art will readily recognize from the following description that alternative embodiments of the structures and methods illustrated herein may be employed without departing from the principles described herein.
(34) The audio systems discussed herein provide crosstalk processing for spatially enhanced audio signals. The crosstalk processing may include crosstalk cancellation for loudspeakers, or crosstalk simulation for headphones. An audio system that performs crosstalk processing for spatially enhanced signals may include a crosstalk compensation processor that adjusts for spectral defects resulting from the crosstalk processing of audio signals, with or without spatial enhancement.
(35) In a loudspeaker arrangement such as illustrated in
(36) In a head-mounted speaker arrangement such as illustrated in
(37) Example Audio System
(38)
(39) The crosstalk compensation may be applied in various ways. In one example, crosstalk compensation is performed prior to the crosstalk processing. For example, crosstalk compensation may be performed in parallel with subband spatial processing of the input audio signal X to generate a combined result, and the combined result may subsequently receive crosstalk processing. In another example, the crosstalk compensation is integrated with the subband spatial processing of the input audio signal, and the output of the subband spatial processing subsequently receives the crosstalk processing. In another example, the crosstalk compensation may be performed after crosstalk processing is performed on the spatially enhanced signal E.
(40) In some embodiments, the crosstalk compensation may include enhancement (e.g., filtering) of mid components and side components of the input audio signal X. In other embodiments, the crosstalk compensation enhances only the mid components, or only the side components.
(41)
(42) The audio processing system 200 includes a subband spatial processor 210, a crosstalk compensation processor 220, a combiner 260, and a crosstalk cancellation processor 270. The audio processing system 200 performs crosstalk compensation and subband spatial processing of the input audio channels X.sub.L, X.sub.R, combines the result of the subband spatial processing with the result of the crosstalk compensation, and then performs a crosstalk cancellation on the combined signals.
(43) The subband spatial processor 210 includes a spatial frequency band divider 240, a spatial frequency band processor 245, and a spatial frequency band combiner 250. The spatial frequency band divider 240 is coupled to the input channels X.sub.L and X.sub.R and the spatial frequency band processor 245. The spatial frequency band divider 240 receives the left input channel X.sub.L and the right input channel X.sub.R, and processes the input channels into a spatial (or “side”) component Y.sub.s and a nonspatial (or “mid”) component Y.sub.m. For example, the spatial component Y.sub.s can be generated based on a difference between the left input channel X.sub.L and the right input channel X.sub.R. The nonspatial component Y.sub.m can be generated based on a sum of the left input channel X.sub.L and the right input channel X.sub.R. The spatial frequency band divider 240 provides the spatial component Y.sub.s and the nonspatial component Y.sub.m to the spatial frequency band processor 245. Additional details regarding the spatial frequency band divider are discussed below in connection with
(44) The spatial frequency band processor 245 is coupled to the spatial frequency band divider 240 and the spatial frequency band combiner 250. The spatial frequency band processor 245 receives the spatial component Y.sub.s and the nonspatial component Y.sub.m from spatial frequency band divider 240, and enhances the received signals. In particular, the spatial frequency band processor 245 generates an enhanced spatial component E.sub.s from the spatial component Y.sub.s, and an enhanced nonspatial component E.sub.m from the nonspatial component Y.sub.m.
(45) For example, the spatial frequency band processor 245 applies subband gains to the spatial component Y.sub.s to generate the enhanced spatial component E.sub.s, and applies subband gains to the nonspatial component Y.sub.m to generate the enhanced nonspatial component E.sub.m. In some embodiments, the spatial frequency band processor 245 additionally or alternatively provides subband delays to the spatial component Y.sub.s to generate the enhanced spatial component E.sub.s, and subband delays to the nonspatial component Y.sub.m to generate the enhanced nonspatial component E.sub.m. The subband gains and/or delays can be different for the different (e.g., n) subbands of the spatial component Y.sub.s and the nonspatial component Y.sub.m, or can be the same (e.g., for two or more subbands). The spatial frequency band processor 245 adjusts the gains and/or delays for different subbands of the spatial component Y.sub.s and the nonspatial component Y.sub.m with respect to each other to generate the enhanced spatial component E.sub.s and the enhanced nonspatial component E.sub.m. The spatial frequency band processor 245 then provides the enhanced spatial component E.sub.s and the enhanced nonspatial component E.sub.m to the spatial frequency band combiner 250. Additional details regarding the spatial frequency band processor are discussed below in connection with
(46) The spatial frequency band combiner 250 is coupled to the spatial frequency band processor 245, and further coupled to the combiner 260. The spatial frequency band combiner 250 receives the enhanced spatial component E.sub.s and the enhanced nonspatial component E.sub.m from the spatial frequency band processor 245, and combines the enhanced spatial component E.sub.s and the enhanced nonspatial component E.sub.m into a left spatially enhanced channel E.sub.L and a right spatially enhanced channel E.sub.R. For example, the left spatially enhanced channel E.sub.L can be generated based on a sum of the enhanced spatial component E.sub.s and the enhanced nonspatial component E.sub.m, and the right spatially enhanced channel E.sub.R can be generated based on a difference between the enhanced nonspatial component E.sub.m and the enhanced spatial component E.sub.s. The spatial frequency band combiner 250 provides the left spatially enhanced channel E.sub.L and the right spatially enhanced channel E.sub.R to the combiner 260. Additional details regarding the spatial frequency band combiner are discussed below in connection with
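The sum and difference relationships described above for the spatial frequency band divider 240 and the spatial frequency band combiner 250 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names are hypothetical, and the 0.5 scaling on the forward conversion is an assumption chosen so that the round trip returns the original channels.

```python
def to_mid_side(left, right):
    """Split left/right sample lists into mid (sum) and side (difference).

    The 0.5 scale factor is an assumption made here so that
    to_left_right(to_mid_side(...)) reproduces the input.
    """
    mid = [0.5 * (l + r) for l, r in zip(left, right)]
    side = [0.5 * (l - r) for l, r in zip(left, right)]
    return mid, side


def to_left_right(mid, side):
    """Rebuild left as mid + side and right as mid - side."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right


x_l = [1.0, 0.5, -0.25]
x_r = [0.0, 0.5, 0.75]
y_m, y_s = to_mid_side(x_l, x_r)
e_l, e_r = to_left_right(y_m, y_s)
```

With no processing applied between the two conversions, e_l and e_r equal the input channels; any subband gains would be applied to y_m and y_s in between.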
(47) The crosstalk compensation processor 220 performs a crosstalk compensation to compensate for spectral defects or artifacts in the crosstalk cancellation. The crosstalk compensation processor 220 receives the input channels X.sub.L and X.sub.R, and performs a processing to compensate for any artifacts in a subsequent crosstalk cancellation of the enhanced nonspatial component E.sub.m and the enhanced spatial component E.sub.s performed by the crosstalk cancellation processor 270. In some embodiments, the crosstalk compensation processor 220 may perform an enhancement on the nonspatial component X.sub.m and the spatial component X.sub.s by applying filters to generate a crosstalk compensation signal Z, including a left crosstalk compensation channel Z.sub.L and a right crosstalk compensation channel Z.sub.R. In other embodiments, the crosstalk compensation processor 220 may perform an enhancement on only the nonspatial component X.sub.m. Additional details regarding crosstalk compensation processors are discussed below in connection with
(48) The combiner 260 combines the left spatially enhanced channel E.sub.L with the left crosstalk compensation channel Z.sub.L to generate a left enhanced compensation channel T.sub.L, and combines the right spatially enhanced channel E.sub.R with the right crosstalk compensation channel Z.sub.R to generate a right enhanced compensation channel T.sub.R. The combiner 260 is coupled to the crosstalk cancellation processor 270, and provides the left enhanced compensation channel T.sub.L and the right enhanced compensation channel T.sub.R to the crosstalk cancellation processor 270. Additional details regarding the combiner 260 are discussed below in connection with
(49) The crosstalk cancellation processor 270 receives the left enhanced compensation channel T.sub.L and the right enhanced compensation channel T.sub.R, and performs crosstalk cancellation on the channels T.sub.L, T.sub.R to generate the output audio signal O including left output channel O.sub.L and right output channel O.sub.R. Additional details regarding the crosstalk cancellation processor 270 are discussed below in connection with
(50)
(51)
(52) In particular, the crosstalk compensation processor 320 is coupled to the spatial frequency band processor 245 to receive the enhanced nonspatial component E.sub.m and the enhanced spatial component E.sub.s, and performs the crosstalk compensation using the enhanced nonspatial component E.sub.m and the enhanced spatial component E.sub.s (e.g., rather than the input signal X as discussed above for the audio systems 200 and 202) to generate a mid enhanced compensation channel T.sub.m and a side enhanced compensation channel T.sub.s. The spatial frequency band combiner 250 receives the mid enhanced compensation channel T.sub.m and the side enhanced compensation channel T.sub.s, and generates the left enhanced compensation channel T.sub.L and the right enhanced compensation channel T.sub.R. The crosstalk cancellation processor 270 generates output audio signal O including left output channel O.sub.L and right output channel O.sub.R by performing the crosstalk cancellation on the left enhanced compensation channel T.sub.L and the right enhanced compensation channel T.sub.R. Additional details regarding the crosstalk compensation processor 320 are discussed below in connection with
(53)
(54)
(55) The crosstalk compensation processor 520 receives the input channels X.sub.L and X.sub.R, and performs a processing to compensate for artifacts in a subsequent combination of a crosstalk simulation signal W generated by the crosstalk simulation processor 580 and the enhanced channel E. The crosstalk compensation processor 520 generates a crosstalk compensation signal Z, including a left crosstalk compensation channel Z.sub.L and a right crosstalk compensation channel Z.sub.R. The crosstalk simulation processor 580 generates a left crosstalk simulation channel W.sub.L and a right crosstalk simulation channel W.sub.R. The subband spatial processor 210 generates the left enhanced channel E.sub.L and the right enhanced channel E.sub.R. Additional details regarding the crosstalk compensation processor 520 are discussed below in connection with
(56) The combiner 560 receives the left enhanced channel E.sub.L, the right enhanced channel E.sub.R, the left crosstalk simulation channel W.sub.L, the right crosstalk simulation channel W.sub.R, the left crosstalk compensation channel Z.sub.L, and the right crosstalk compensation channel Z.sub.R. The combiner 560 generates the left output channel O.sub.L by combining the left enhanced channel E.sub.L, the right crosstalk simulation channel W.sub.R, and the left crosstalk compensation channel Z.sub.L. The combiner 560 generates the right output channel O.sub.R by combining the right enhanced channel E.sub.R, the left crosstalk simulation channel W.sub.L, and the right crosstalk compensation channel Z.sub.R. Additional details regarding the combiner 560 are discussed below in connection with
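The channel routing of the combiner 560 can be illustrated with a short sketch. Two points are assumptions made for illustration only: "combining" is taken to be sample-wise addition, and the right output is taken to mirror the left output (right enhanced channel, left crosstalk simulation channel, right crosstalk compensation channel). The function name is hypothetical.

```python
def combine_simulation_outputs(e_l, e_r, w_l, w_r, z_l, z_r):
    """Mix enhanced (E), crosstalk simulation (W), and compensation (Z) channels.

    Each output receives the simulation channel from the opposite side,
    which models the contralateral path. Addition is an assumed mixing rule.
    """
    o_l = [e + w + z for e, w, z in zip(e_l, w_r, z_l)]
    o_r = [e + w + z for e, w, z in zip(e_r, w_l, z_r)]
    return o_l, o_r


o_l, o_r = combine_simulation_outputs(
    e_l=[1.0], e_r=[2.0],
    w_l=[0.25], w_r=[0.5],
    z_l=[0.125], z_r=[0.0625],
)
```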
(57)
(58) The combiner 562 combines the left enhanced channel E.sub.L from the subband spatial processor 210 with the right simulation compensation channel SC.sub.R to generate the left output channel O.sub.L, and combines the right enhanced channel E.sub.R from the subband spatial processor 210 with the left simulation compensation channel SC.sub.L to generate the right output channel O.sub.R. Additional details regarding the combiner 562 are discussed below in connection with
(59)
(60)
(61)
(62)
(63) When the crosstalk compensation processor 800 is part of the audio system 200, 400, 500, 504, or 700, the crosstalk compensation processor 800 receives left and right input channels (e.g., X.sub.L and X.sub.R), and performs a crosstalk compensation processing, such as to generate the left crosstalk compensation channel Z.sub.L and the right crosstalk compensation channel Z.sub.R. The channels Z.sub.L, Z.sub.R may be used to compensate for any artifacts in crosstalk processing, such as crosstalk cancellation or simulation. The L/R to M/S converter 812 receives the left input audio channel X.sub.L and the right input audio channel X.sub.R, and generates the nonspatial component X.sub.m and the spatial component X.sub.s of the input channels X.sub.L, X.sub.R. In general, the left and right channels may be summed to generate the nonspatial component of the left and right channels, and subtracted to generate the spatial component of the left and right channels.
(64) The mid component processor 820 includes a plurality of filters 840, such as m mid filters 840(a), 840(b), through 840(m). Here, each of the m mid filters 840 processes one of m frequency bands of the nonspatial component X.sub.m. The mid component processor 820 generates a mid crosstalk compensation channel Z.sub.m by processing the nonspatial component X.sub.m. In some embodiments, the mid filters 840 are configured using a frequency response plot of the nonspatial component X.sub.m with crosstalk processing, obtained through simulation. In addition, by analyzing the frequency response plot, any spectral defects such as peaks or troughs in the frequency response plot over a predetermined threshold (e.g., 10 dB) occurring as an artifact of the crosstalk processing can be estimated. These artifacts result primarily from the summation of the delayed and possibly inverted (e.g., for crosstalk cancellation) contralateral signals with their corresponding ipsilateral signal in the crosstalk processing, thereby effectively introducing a comb filter-like frequency response to the final rendered result. The mid crosstalk compensation channel Z.sub.m can be generated by the mid component processor 820 to compensate for the estimated peaks or troughs, where each of the m frequency bands corresponds with a peak or trough. Specifically, based on the specific delay, filtering frequency, and gain applied in the crosstalk processing, peaks and troughs shift up and down in the frequency response, causing variable amplification and/or attenuation of energy in specific regions of the spectrum. Each of the mid filters 840 may be configured to adjust for one or more of the peaks and troughs.
(65) The side component processor 830 includes a plurality of filters 850, such as m side filters 850(a), 850(b) through 850(m). The side component processor 830 generates a side crosstalk compensation channel Z.sub.s by processing the spatial component X.sub.s. In some embodiments, a frequency response plot of the spatial component X.sub.s with crosstalk processing can be obtained through simulation. By analyzing the frequency response plot, any spectral defects such as peaks or troughs in the frequency response plot over a predetermined threshold (e.g., 10 dB) occurring as an artifact of the crosstalk processing can be estimated. The side crosstalk compensation channel Z.sub.s can be generated by the side component processor 830 to compensate for the estimated peaks or troughs. Specifically, based on the specific delay, filtering frequency, and gain applied in the crosstalk processing, peaks and troughs shift up and down in the frequency response, causing variable amplification and/or attenuation of energy in specific regions of the spectrum. Each of the side filters 850 may be configured to adjust for one or more of the peaks and troughs. In some embodiments, the mid component processor 820 and the side component processor 830 may include a different number of filters.
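The comb filter-like response mentioned above can be illustrated with a simplified model. The sketch below assumes the crosstalk path is reduced to a single delayed copy of the signal with a broadband gain below 1, added to (or, when inverted, subtracted from) the direct signal; the frequency-dependent filtering of the contralateral path is omitted. The function names and the example values are hypothetical.

```python
import cmath
import math


def comb_response_db(freq_hz, delay_s, gain, inverted=False):
    """Magnitude in dB of 1 +/- gain * exp(-j*2*pi*f*delay).

    Assumes 0 <= gain < 1 so the magnitude never reaches zero.
    """
    sign = -1.0 if inverted else 1.0
    h = 1.0 + sign * gain * cmath.exp(-2j * math.pi * freq_hz * delay_s)
    return 20.0 * math.log10(abs(h))


def find_defects(freqs_hz, delay_s, gain, threshold_db=10.0, inverted=False):
    """Return (frequency, level_db) pairs that deviate from 0 dB by more
    than threshold_db, i.e. candidate peaks or troughs to compensate."""
    defects = []
    for f in freqs_hz:
        level = comb_response_db(f, delay_s, gain, inverted)
        if abs(level) > threshold_db:
            defects.append((f, level))
    return defects


# Example: 1 ms delay, gain 0.8, checked at four probe frequencies.
probe = [250.0, 500.0, 750.0, 1000.0]
troughs = find_defects(probe, delay_s=0.001, gain=0.8)
troughs_inverted = find_defects(probe, delay_s=0.001, gain=0.8, inverted=True)
```

In this model the trough falls at 500 Hz for the non-inverted sum and moves to 1000 Hz when the delayed copy is inverted, which shows how the delay, gain, and polarity of the crosstalk processing move the defects that the compensation filters target.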
(66) In some embodiments, the mid filters 840 and side filters 850 may include a biquad filter having a transfer function defined by Equation 1:
(67) $$H(z) = \frac{b_0 + b_1 z^{-1} + b_2 z^{-2}}{a_0 + a_1 z^{-1} + a_2 z^{-2}}$$
where z is a complex variable, and a.sub.0, a.sub.1, a.sub.2, b.sub.0, b.sub.1, and b.sub.2 are digital filter coefficients. One way to implement such a filter is the direct form I topology as defined by Equation 2:
(68) $$Y[n] = \frac{b_0}{a_0} X[n] + \frac{b_1}{a_0} X[n-1] + \frac{b_2}{a_0} X[n-2] - \frac{a_1}{a_0} Y[n-1] - \frac{a_2}{a_0} Y[n-2]$$
where X is the input vector, and Y is the output. Other topologies may be used, depending on their maximum word-length and saturation behaviors.
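A direct form I biquad can be sketched in a few lines. The sketch assumes the standard difference equation y[n] = (b0 x[n] + b1 x[n-1] + b2 x[n-2] - a1 y[n-1] - a2 y[n-2]) / a0 with zero initial state; the function name and the floating-point sample lists are illustrative choices, not part of the specification.

```python
def biquad_df1(x, b, a):
    """Filter the sample list x with a biquad in direct form I.

    b = (b0, b1, b2) and a = (a0, a1, a2); all coefficients are
    normalized by a0 before the loop. Initial state is zero.
    """
    b0, b1, b2 = (c / a[0] for c in b)
    a1, a2 = a[1] / a[0], a[2] / a[0]
    x1 = x2 = y1 = y2 = 0.0  # previous two inputs and outputs
    y = []
    for xn in x:
        yn = b0 * xn + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        y.append(yn)
        x2, x1 = x1, xn
        y2, y1 = y1, yn
    return y


# Impulse response of a one-pole section expressed as a biquad.
impulse = [1.0, 0.0, 0.0, 0.0]
h = biquad_df1(impulse, b=(1.0, 0.0, 0.0), a=(1.0, -0.5, 0.0))
```

Direct form I keeps separate input and output histories, which is one reason it tends to behave predictably in fixed-point arithmetic; other topologies trade that for fewer state variables.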
(69) The biquad can then be used to implement a second-order filter with real-valued inputs and outputs. To design a discrete-time filter, a continuous-time filter is designed, and then transformed into discrete time via a bilinear transform. Furthermore, resulting shifts in center frequency and bandwidth may be compensated using frequency warping.
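(70) For example, a peaking filter may have an S-plane transfer function defined by Equation 3:
(71) $$H(s) = \frac{s^2 + s\,(A/Q) + 1}{s^2 + s/(AQ) + 1}$$
where s is a complex variable, A is the amplitude of the peak, and Q is the filter “quality,” and the digital filter coefficients are defined by:
(72) $$b_0 = 1 + \alpha A,\quad b_1 = -2\cos(\omega_0),\quad b_2 = 1 - \alpha A,\quad a_0 = 1 + \frac{\alpha}{A},\quad a_1 = -2\cos(\omega_0),\quad a_2 = 1 - \frac{\alpha}{A}$$
where ω.sub.0 is the center frequency of the filter in radians and
(73) $$\alpha = \frac{\sin(\omega_0)}{2Q}$$
(74) Furthermore, the filter quality Q may be defined by Equation 4:
(75) $$Q = \frac{f_c}{\Delta f}$$
where Δf is a bandwidth and f.sub.c is a center frequency.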
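As an illustration of how such coefficients might be computed, the sketch below assumes the widely used bilinear-transform peaking-filter form: alpha = sin(w0) / (2Q), b = (1 + alpha*A, -2cos(w0), 1 - alpha*A), a = (1 + alpha/A, -2cos(w0), 1 - alpha/A), with w0 = 2*pi*fc/fs. The function names, the sample-rate parameter, and the example values are hypothetical.

```python
import cmath
import math


def q_from_bandwidth(center_hz, bandwidth_hz):
    """Filter quality as center frequency divided by bandwidth."""
    return center_hz / bandwidth_hz


def peaking_coefficients(center_hz, sample_rate_hz, amplitude, q):
    """Return (b, a) biquad coefficient tuples for a peaking filter.

    amplitude is the linear parameter A of the assumed formulas; under
    this convention the linear gain at the center frequency is A**2.
    """
    w0 = 2.0 * math.pi * center_hz / sample_rate_hz
    alpha = math.sin(w0) / (2.0 * q)
    cos_w0 = math.cos(w0)
    b = (1.0 + alpha * amplitude, -2.0 * cos_w0, 1.0 - alpha * amplitude)
    a = (1.0 + alpha / amplitude, -2.0 * cos_w0, 1.0 - alpha / amplitude)
    return b, a


def response_at(b, a, w):
    """Complex frequency response of the biquad at w radians per sample."""
    z1 = cmath.exp(-1j * w)
    return (b[0] + b[1] * z1 + b[2] * z1 * z1) / (a[0] + a[1] * z1 + a[2] * z1 * z1)


# Example: a cut centered at 1 kHz, 500 Hz wide, at a 48 kHz sample rate.
q = q_from_bandwidth(1000.0, 500.0)
b, a = peaking_coefficients(1000.0, 48000.0, amplitude=0.5, q=q)
center_gain = abs(response_at(b, a, 2.0 * math.pi * 1000.0 / 48000.0))
```

An amplitude below 1 produces a cut (useful against a peak in the crosstalk-processed response) and an amplitude above 1 produces a boost (useful against a trough); an amplitude of exactly 1 leaves the signal unchanged.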
(76) The M/S to L/R converter 814 receives the mid crosstalk compensation channel Z.sub.m and the side crosstalk compensation channel Z.sub.s, and generates the left crosstalk compensation channel Z.sub.L and the right crosstalk compensation channel Z.sub.R. In general, the mid and side channels may be summed to generate the left channel of the mid and side components, and the mid and side channels may be subtracted to generate the right channel of the mid and side components.
(77) When the crosstalk compensation processor 800 is part of the audio system 502, the crosstalk compensation processor 800 receives the left crosstalk simulation channel W.sub.L and the right crosstalk simulation channel W.sub.R from the crosstalk simulation processor 580, and performs a preprocessing (e.g., as discussed above for the input channels X.sub.L and X.sub.R) to generate the left simulation compensation channel SC.sub.L and the right simulation compensation channel SC.sub.R.
(78) When the crosstalk compensation processor 800 is part of the audio system 700, the crosstalk compensation processor 800 receives the left enhanced compensation channel T.sub.L and the right enhanced compensation channel T.sub.R from the combiner 562, and performs a preprocessing (e.g., as discussed above for the input channels X.sub.L and X.sub.R) to generate the left output channel O.sub.L and the right output channel O.sub.R.
(79)
(80) When the crosstalk compensation processor 900 is part of the audio system 200, 500, or 504, for example, the L&R combiner 910 receives the left input audio channel X.sub.L and the right input audio channel X.sub.R, and generates the nonspatial component X.sub.m by adding the channels X.sub.L, X.sub.R. The mid component processor 820 receives the nonspatial component X.sub.m, and generates the mid crosstalk compensation channel Z.sub.m by processing the nonspatial component X.sub.m using the mid filters 840(a) through 840(m). The M to L/R converter 950 receives the mid crosstalk compensation channel Z.sub.m, and generates each of the left crosstalk compensation channel Z.sub.L and the right crosstalk compensation channel Z.sub.R using the mid crosstalk compensation channel Z.sub.m. When the crosstalk compensation processor 900 is part of the audio system 400, 502, or 700, for example, the input and output signals may be different as discussed above for the crosstalk compensation processor 800.
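The mid-only signal flow of the crosstalk compensation processor 900 can be sketched as below. Several details are assumptions for illustration: the mid filters are modeled as arbitrary callables applied in series, and the M to L/R converter is modeled as copying the mid compensation channel to both outputs. The function name and the placeholder filter are hypothetical.

```python
def mid_only_compensation(x_l, x_r, mid_filters):
    """Sum L and R into the mid component, filter it, and fan it out.

    mid_filters is a sequence of callables, each taking and returning a
    list of samples; they are applied in series (an assumed arrangement).
    """
    x_m = [l + r for l, r in zip(x_l, x_r)]  # L&R combiner
    z_m = x_m
    for mid_filter in mid_filters:  # mid component processor
        z_m = mid_filter(z_m)
    # M to L/R converter: the same mid signal feeds both channels (assumed).
    return list(z_m), list(z_m)


def halve(samples):
    """Placeholder standing in for a real compensation filter."""
    return [0.5 * s for s in samples]


z_l, z_r = mid_only_compensation([1.0, 2.0], [3.0, 4.0], [halve])
```

Because no side component is computed or filtered, this path costs roughly half the filtering work of the mid and side variant.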
(81)
(82)
(83) The crosstalk compensation processor 1100 includes the mid component processor 820 and the side component processor 830. The mid component processor 820 receives the enhanced nonspatial component E.sub.m from the spatial frequency band processor 245, and generates the mid enhanced compensation channel T.sub.m using the mid filters 840(a) through 840(m). The side component processor 830 receives the enhanced spatial component E.sub.s from the spatial frequency band processor 245, and generates the side enhanced compensation channel T.sub.s using the side filters 850(a) through 850(m).
(84)
(85)
(86) More specifically, the spatial frequency band processor 245 includes a subband filter for each of n frequency subbands of the nonspatial component Y.sub.m and a subband filter for each of the n subbands of the spatial component Y.sub.s. For n=4 subbands, for example, the spatial frequency band processor 245 includes a series of subband filters for the nonspatial component Y.sub.m including a mid equalization (EQ) filter 1362(1) for the subband (1), a mid EQ filter 1362(2) for the subband (2), a mid EQ filter 1362(3) for the subband (3), and a mid EQ filter 1362(4) for the subband (4). Each mid EQ filter 1362 applies a filter to a frequency subband portion of the nonspatial component Y.sub.m to generate the enhanced nonspatial component E.sub.m.
(87) The spatial frequency band processor 245 further includes a series of subband filters for the frequency subbands of the spatial component Y.sub.s, including a side equalization (EQ) filter 1364(1) for the subband (1), a side EQ filter 1364(2) for the subband (2), a side EQ filter 1364(3) for the subband (3), and a side EQ filter 1364(4) for the subband (4). Each side EQ filter 1364 applies a filter to a frequency subband portion of the spatial component Y.sub.s to generate the enhanced spatial component E.sub.s.
(88) Each of the n frequency subbands of the nonspatial component Y.sub.m and the spatial component Y.sub.s may correspond with a range of frequencies. For example, the frequency subband (1) may correspond to 0 to 300 Hz, the frequency subband (2) may correspond to 300 to 510 Hz, the frequency subband (3) may correspond to 510 to 2700 Hz, and the frequency subband (4) may correspond to 2700 Hz to Nyquist frequency. In some embodiments, the n frequency subbands are a consolidated set of critical bands. The critical bands may be determined using a corpus of audio samples from a wide variety of musical genres. A long term average energy ratio of mid to side components over the 24 Bark scale critical bands is determined from the samples. Contiguous frequency bands with similar long term average ratios are then grouped together to form the set of critical bands. The range of the frequency subbands, as well as the number of frequency subbands, may be adjustable.
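The example subband ranges above can be expressed as a simple lookup. The sketch below is illustrative only: it assumes each boundary frequency belongs to the higher subband, and the function names and the gain values in the usage example are hypothetical.

```python
from bisect import bisect_right

# Upper edges of subbands (1) through (3); subband (4) runs up to Nyquist.
SUBBAND_EDGES_HZ = (300.0, 510.0, 2700.0)


def subband_index(freq_hz):
    """Return the 1-based subband number containing freq_hz.

    A frequency exactly on an edge is assigned to the higher subband
    (an assumption; the text does not specify edge handling).
    """
    return bisect_right(SUBBAND_EDGES_HZ, freq_hz) + 1


def subband_gain_db(freq_hz, gains_db):
    """Look up the gain for freq_hz from a list of per-subband gains."""
    return gains_db[subband_index(freq_hz) - 1]


# Hypothetical per-subband side gains in dB, for illustration only.
side_gains_db = [0.0, 1.5, 3.0, 2.0]
gain_at_1k = subband_gain_db(1000.0, side_gains_db)
```

Keeping the edges in a single tuple makes both the ranges and the number of subbands easy to adjust, in line with the statement that they may be adjustable.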
(89)
(90) More specifically, the spatial frequency band combiner 250 includes a global mid gain 1422, a global side gain 1424, and an M/S to L/R converter 1426 coupled to the global mid gain 1422 and the global side gain 1424. The global mid gain 1422 receives the enhanced nonspatial component E.sub.m and applies a gain, and the global side gain 1424 receives the enhanced spatial component E.sub.s and applies a gain. The M/S to L/R converter 1426 receives the enhanced nonspatial component E.sub.m from the global mid gain 1422 and the enhanced spatial component E.sub.s from the global side gain 1424, and converts these inputs into the left spatially enhanced channel E.sub.L and the right spatially enhanced channel E.sub.R.
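The combiner stage described above may be sketched in Python as follows. This is a minimal illustration only: the function name, the decibel gain parameters, and the conversion convention (left as mid plus side, right as mid minus side, with no additional scaling) are assumptions and are not specified in this passage.

```python
def combine_mid_side(e_mid, e_side, mid_gain_db=0.0, side_gain_db=0.0):
    """Apply global mid/side gains, then convert M/S samples to L/R samples."""
    g_mid = 10.0 ** (mid_gain_db / 20.0)    # global mid gain (linear)
    g_side = 10.0 ** (side_gain_db / 20.0)  # global side gain (linear)
    # Assumed M/S to L/R convention: L = M + S, R = M - S.
    left = [g_mid * m + g_side * s for m, s in zip(e_mid, e_side)]
    right = [g_mid * m - g_side * s for m, s in zip(e_mid, e_side)]
    return left, right
```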
(91) When the spatial frequency band combiner 250 is part of the subband spatial processor 310 shown in
(92)
(93) In one embodiment, the crosstalk cancellation processor 260 includes an in-out band divider 1510, inverters 1520 and 1522, contralateral estimators 1530 and 1540, combiners 1550 and 1552, and an in-out band combiner 1560. These components operate together to divide the input channels T.sub.L, T.sub.R into in-band components and out-of-band components, and perform a crosstalk cancellation on the in-band components to generate the output channels O.sub.L, O.sub.R.
(94) By dividing the input audio signal T into different frequency band components and by performing crosstalk cancellation on selective components (e.g., in-band components), crosstalk cancellation can be performed for a particular frequency band while obviating degradations in other frequency bands. If crosstalk cancellation is performed without dividing the input audio signal T into different frequency bands, the audio signal after such crosstalk cancellation may exhibit significant attenuation or amplification in the nonspatial and spatial components in low frequency (e.g., below 350 Hz), higher frequency (e.g., above 12000 Hz), or both. By selectively performing crosstalk cancellation for the in-band (e.g., between 250 Hz and 14000 Hz), where the vast majority of impactful spatial cues reside, a balanced overall energy, particularly in the nonspatial component, across the spectrum in the mix can be retained.
(95) The in-out band divider 1510 separates the input channels T.sub.L, T.sub.R into in-band channels T.sub.L,In, T.sub.R,In and out-of-band channels T.sub.L,Out, T.sub.R,Out, respectively. Particularly, the in-out band divider 1510 divides the left enhanced compensation channel T.sub.L into a left in-band channel T.sub.L,In and a left out-of-band channel T.sub.L,Out. Similarly, the in-out band divider 1510 separates the right enhanced compensation channel T.sub.R into a right in-band channel T.sub.R,In and a right out-of-band channel T.sub.R,Out. Each in-band channel may encompass a portion of a respective input channel corresponding to a frequency range including, for example, 250 Hz to 14 kHz. The range of frequency bands may be adjustable, for example according to speaker parameters.
(96) The inverter 1520 and the contralateral estimator 1530 operate together to generate a left contralateral cancellation component S.sub.L to compensate for a contralateral sound component due to the left in-band channel T.sub.L,In. Similarly, the inverter 1522 and the contralateral estimator 1540 operate together to generate a right contralateral cancellation component S.sub.R to compensate for a contralateral sound component due to the right in-band channel T.sub.R,In.
(97) In one approach, the inverter 1520 receives the in-band channel T.sub.L,In and inverts a polarity of the received in-band channel T.sub.L,In to generate an inverted in-band channel T.sub.L,In′. The contralateral estimator 1530 receives the inverted in-band channel T.sub.L,In′, and extracts a portion of the inverted in-band channel T.sub.L,In′ corresponding to a contralateral sound component through filtering. Because the filtering is performed on the inverted in-band channel T.sub.L,In′, the portion extracted by the contralateral estimator 1530 is an inverse of the portion of the in-band channel T.sub.L,In attributable to the contralateral sound component. Hence, the portion extracted by the contralateral estimator 1530 becomes a left contralateral cancellation component S.sub.L, which can be added to the counterpart in-band channel T.sub.R,In to reduce the contralateral sound component due to the in-band channel T.sub.L,In. In some embodiments, the inverter 1520 and the contralateral estimator 1530 are implemented in a different sequence.
(98) The inverter 1522 and the contralateral estimator 1540 perform similar operations with respect to the in-band channel T.sub.R,In to generate the right contralateral cancellation component S.sub.R. Therefore, detailed description thereof is omitted herein for the sake of brevity.
(99) In one example implementation, the contralateral estimator 1530 includes a filter 1532, an amplifier 1534, and a delay unit 1536. The filter 1532 receives the inverted in-band channel T.sub.L,In′ and extracts a portion of the inverted in-band channel T.sub.L,In′ corresponding to a contralateral sound component through a filtering function. An example filter implementation is a Notch or Highshelf filter with a center frequency selected between 5000 and 10000 Hz, and Q selected between 0.5 and 1.0. Gain in decibels (G.sub.dB) may be derived from Equation 5:
G.sub.dB=−3.0−log.sub.1.333(D) Eq. (5)
where D is the delay amount, in samples, applied by the delay units 1536 and 1546, for example at a sampling rate of 48 kHz. An alternate implementation is a Lowpass filter with a corner frequency selected between 5000 and 10000 Hz, and Q selected between 0.5 and 1.0. Moreover, the amplifier 1534 amplifies the extracted portion by a corresponding gain coefficient G.sub.L,In, and the delay unit 1536 delays the amplified output from the amplifier 1534 according to a delay function D to generate the left contralateral cancellation component S.sub.L. The contralateral estimator 1540 includes a filter 1542, an amplifier 1544, and a delay unit 1546 that perform similar operations on the inverted in-band channel T.sub.R,In′ to generate the right contralateral cancellation component S.sub.R. In one example, the contralateral estimators 1530, 1540 generate the left and right contralateral cancellation components S.sub.L, S.sub.R, according to the equations below:
S.sub.L=D[G.sub.L,In*F[T.sub.L,In′]] Eq. (6)
S.sub.R=D[G.sub.R,In*F[T.sub.R,In′]] Eq. (7)
where F[ ] is a filter function, and D[ ] is the delay function.
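Equations 5 through 7 may be illustrated with the following Python sketch. It is a simplified example under stated assumptions: Equation 5 is read as a logarithm to the base 1.333, the filter function F[ ] is stood in for by a first-order low-pass (which has no Q parameter, unlike the filters described above), the delay function D[ ] is an integer-sample delay, and all function names and default values are hypothetical.

```python
import math


def filter_gain_db(delay_samples):
    """Eq. (5), read as G_dB = -3.0 - log base 1.333 of D (D in samples)."""
    return -3.0 - math.log(delay_samples, 1.333)


def one_pole_lowpass(x, cutoff_hz, sample_rate_hz):
    """Assumed stand-in for F[ ]: a first-order recursive low-pass filter."""
    a = math.exp(-2.0 * math.pi * cutoff_hz / sample_rate_hz)
    y, state = [], 0.0
    for sample in x:
        state = (1.0 - a) * sample + a * state
        y.append(state)
    return y


def delay(x, delay_samples):
    """D[ ]: integer-sample delay, zero padded and truncated to len(x)."""
    if delay_samples <= 0:
        return list(x)
    return ([0.0] * delay_samples + list(x))[:len(x)]


def contralateral_cancellation(t_in, gain, delay_samples,
                               cutoff_hz=7500.0, sample_rate_hz=48000.0):
    """Eq. (6)/(7): S = D[G * F[T']], where T' is the polarity-inverted input."""
    inverted = [-s for s in t_in]                        # inverter 1520 / 1522
    filtered = one_pole_lowpass(inverted, cutoff_hz, sample_rate_hz)
    amplified = [gain * s for s in filtered]             # amplifier 1534 / 1544
    return delay(amplified, delay_samples)               # delay unit 1536 / 1546
```

The same function serves for both channels: passing the left in-band channel yields S.sub.L, and passing the right in-band channel yields S.sub.R.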
(100) The configuration of the crosstalk cancellation can be determined by the speaker parameters. In one example, the filter center frequency, delay amount, amplifier gain, and filter gain can be determined according to an angle formed between the two speakers 280 with respect to a listener. In some embodiments, values between the speaker angles are used to interpolate other values.
(101) The combiner 1550 combines the right contralateral cancellation component S.sub.R with the left in-band channel T.sub.L,In to generate a left in-band crosstalk channel U.sub.L, and the combiner 1552 combines the left contralateral cancellation component S.sub.L with the right in-band channel T.sub.R,In to generate a right in-band crosstalk channel U.sub.R. The in-out band combiner 1560 combines the left in-band crosstalk channel U.sub.L with the out-of-band channel T.sub.L,Out to generate the left output channel O.sub.L, and combines the right in-band crosstalk channel U.sub.R with the out-of-band channel T.sub.R,Out to generate the right output channel O.sub.R.
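The cross-coupled combination described above may be sketched as follows. The example assumes that each combiner performs a sample-wise sum and that all six input sequences have equal length; the function name is hypothetical.

```python
def combine_crosstalk_outputs(t_l_in, t_r_in, s_l, s_r, t_l_out, t_r_out):
    """Form output channels O_L, O_R from in-band, cancellation, and out-of-band parts."""
    # Combiners 1550 / 1552: each in-band channel receives the OPPOSITE
    # channel's contralateral cancellation component.
    u_l = [t + s for t, s in zip(t_l_in, s_r)]
    u_r = [t + s for t, s in zip(t_r_in, s_l)]
    # In-out band combiner 1560: restore the untouched out-of-band content.
    o_l = [u + t for u, t in zip(u_l, t_l_out)]
    o_r = [u + t for u, t in zip(u_r, t_r_out)]
    return o_l, o_r
```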
(102) Accordingly, the left output channel O.sub.L includes the right contralateral cancellation component S.sub.R corresponding to an inverse of the portion of the in-band channel T.sub.R,In attributable to the contralateral sound, and the right output channel O.sub.R includes the left contralateral cancellation component S.sub.L corresponding to an inverse of the portion of the in-band channel T.sub.L,In attributable to the contralateral sound. In this configuration, a wavefront of an ipsilateral sound component output by the loudspeaker 280.sub.R according to the right output channel O.sub.R and arriving at the right ear can cancel a wavefront of a contralateral sound component output by the loudspeaker 280.sub.L according to the left output channel O.sub.L. Similarly, a wavefront of an ipsilateral sound component output by the loudspeaker 280.sub.L according to the left output channel O.sub.L and arriving at the left ear can cancel a wavefront of a contralateral sound component output by the loudspeaker 280.sub.R according to the right output channel O.sub.R. Thus, contralateral sound components can be reduced to enhance spatial detectability.
(103)
(104) The crosstalk simulation processor 1600 includes a left head shadow low-pass filter 1602, a left crosstalk delay 1604, and a left head shadow gain 1610 to process the left input channel X.sub.L. The crosstalk simulation processor 1600 further includes a right head shadow low-pass filter 1606, a right crosstalk delay 1608, and a right head shadow gain 1612 to process the right input channel X.sub.R. The left head shadow low-pass filter 1602 receives the left input channel X.sub.L and applies a modulation that models the frequency response of the signal after passing through the listener's head. The output of the left head shadow low-pass filter 1602 is provided to the left crosstalk delay 1604, which applies a time delay to the output of the left head shadow low-pass filter 1602. The time delay represents the trans-aural distance that is traversed by a contralateral sound component relative to an ipsilateral sound component. The frequency response can be generated based on empirical experiments to determine frequency dependent characteristics of sound wave modulation by the listener's head. For example and with reference to
(105) Similarly for the right input channel X.sub.R, the right head shadow low-pass filter 1606 receives the right input channel X.sub.R and applies a modulation that models the frequency response of the listener's head. The output of the right head shadow low-pass filter 1606 is provided to the right crosstalk delay 1608, which applies a time delay to the output of the right head shadow low-pass filter 1606. The right head shadow gain 1612 applies a gain to the output of the right crosstalk delay 1608 to generate the right crosstalk simulation channel W.sub.R.
(106) In some embodiments, the head shadow low-pass filters 1602 and 1606 have a cutoff frequency of 2,023 Hz. The crosstalk delays 1604 and 1608 apply a 0.792 millisecond delay. The head shadow gains 1610 and 1612 apply a −14.4 dB gain.
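Using the example values above, one channel of the crosstalk simulation may be sketched in Python as follows. This is an illustration under stated assumptions: the head shadow low-pass filter is modeled as a first-order recursive filter, the sampling rate is taken to be 48 kHz, the delay is rounded to a whole number of samples, and the function names are hypothetical.

```python
import math

CUTOFF_HZ = 2023.0   # head shadow low-pass cutoff from the example above
DELAY_MS = 0.792     # crosstalk delay from the example above
GAIN_DB = -14.4      # head shadow gain from the example above


def crosstalk_delay_samples(sample_rate_hz=48000.0):
    """Crosstalk delay rounded to whole samples (assumed rounding)."""
    return round(DELAY_MS * 1e-3 * sample_rate_hz)


def simulate_crosstalk(x, sample_rate_hz=48000.0):
    """Low-pass filter, attenuate, and delay one input channel."""
    a = math.exp(-2.0 * math.pi * CUTOFF_HZ / sample_rate_hz)
    gain = 10.0 ** (GAIN_DB / 20.0)          # about 0.19 in linear terms
    state, shaped = 0.0, []
    for sample in x:
        state = (1.0 - a) * sample + a * state   # assumed one-pole low-pass
        shaped.append(gain * state)
    n = crosstalk_delay_samples(sample_rate_hz)
    return ([0.0] * n + shaped)[:len(x)]
```

Because the filter, delay, and gain are each linear and time-invariant, applying them in a different order yields the same result in this sketch.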
(107) The components of the crosstalk simulation processors 1600 and 1650 may be arranged in different orders. For example, although crosstalk simulation processor 1650 includes the left head shadow low-pass filter 1602 coupled with the left head shadow high-pass filter 1624, the left head shadow high-pass filter 1624 coupled to the left crosstalk delay 1604, and the left crosstalk delay 1604 coupled to the left head shadow gain 1610, the components 1602, 1624, 1604, and 1610 may be rearranged to process the left input channel X.sub.L in different orders. Similarly, the components 1606, 1626, 1608, and 1612 that process the right input channel X.sub.R may be arranged in different orders.
(108)
(109)
(110)
(111)
(112) The sum left 2002 combines the left spatially enhanced channel E.sub.L and the left simulation compensation channel SC.sub.L to generate the left output channel O.sub.L. The sum right 2004 combines the right spatially enhanced channel E.sub.R and the right simulation compensation channel SC.sub.R to generate the right output channel O.sub.R. The output gain 2006 applies gains to the left output channel O.sub.L and the right output channel O.sub.R, and outputs the left output channel O.sub.L and the right output channel O.sub.R.
(113) For the audio system 600, the combiner 562 receives the left enhanced compensation channel T.sub.L and the right enhanced compensation channel T.sub.R from the subband spatial processor 610, and receives the left crosstalk simulation channel W.sub.L and the right crosstalk simulation channel W.sub.R from the crosstalk simulation processor 580. The sum left 2002 generates the left output channel O.sub.L by combining the left enhanced compensation channel T.sub.L and the right crosstalk simulation channel W.sub.R. The sum right 2004 generates the right output channel O.sub.R by combining the right enhanced compensation channel T.sub.R and the left crosstalk simulation channel W.sub.L.
(114) For the audio system 700, the combiner 562 receives the left spatially enhanced channel E.sub.L and the right spatially enhanced channel E.sub.R from the subband spatial processor 210, and receives the left crosstalk simulation channel W.sub.L and the right crosstalk simulation channel W.sub.R from the crosstalk simulation processor 580. The sum left 2002 generates the left enhanced compensation channel T.sub.L by combining the left spatially enhanced channel E.sub.L and the right crosstalk simulation channel W.sub.R. The sum right 2004 generates the right enhanced compensation channel T.sub.R by combining the right spatially enhanced channel E.sub.R and the left crosstalk simulation channel W.sub.L.
(115) Example Crosstalk Compensation
(116) As discussed above, a crosstalk compensation processor may compensate for comb-filtering artifacts that occur in the spatial and nonspatial signal components as a result of various crosstalk delays and gains in crosstalk cancellation. These crosstalk cancellation artifacts may be handled by applying correction filters to the nonspatial and spatial components independently. Mid/Side filtering (with associated M/S de-matrixing) can be inserted at various points in the overall signal flow of the algorithms, and the crosstalk-induced comb-filter peaks and notches in the frequency response of the spatial and nonspatial signal components may be handled in parallel.
(117)
(118) In these examples, compensation filters are applied to the spatial and nonspatial components independently, targeting all comb-filter peaks and/or troughs in the nonspatial (L+R, or mid) component, and all but the lowest comb-filter peaks and/or troughs in the spatial (L−R, or side) component. The method of compensation can be procedurally derived, tuned by ear and hand, or a combination.
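The origin of the comb-filter peaks and troughs may be illustrated with a simplified model. The Python sketch below assumes that crosstalk cancellation adds to each channel a polarity-inverted copy of the opposite channel scaled by a gain g and delayed by D samples, and it ignores the contralateral filter F[ ]. Under that assumption the mid component is shaped by 1 − g·z^(−D) and the side component by 1 + g·z^(−D). The function names and this simplified model are illustrative assumptions, not the compensation method of the disclosure.

```python
import cmath
import math


def comb_magnitude_db(freq_hz, gain, delay_samples, sample_rate_hz, component):
    """Magnitude (dB) of the assumed comb response seen by the mid or side component."""
    if component not in ("mid", "side"):
        raise ValueError("component must be 'mid' or 'side'")
    sign = -1.0 if component == "mid" else 1.0
    omega = 2.0 * math.pi * freq_hz / sample_rate_hz
    h = 1.0 + sign * gain * cmath.exp(-1j * omega * delay_samples)
    return 20.0 * math.log10(abs(h))


def compensation_gain_db(freq_hz, gain, delay_samples, sample_rate_hz, component):
    """Gain (dB) that would flatten the assumed comb response at freq_hz."""
    return -comb_magnitude_db(freq_hz, gain, delay_samples, sample_rate_hz, component)
```

In this model, with a gain of 0.5 and a delay of 4 samples at 48 kHz, the mid component has a trough at 0 Hz and a peak at 6000 Hz, while the side component has a peak at 0 Hz and a trough at 6000 Hz, so the two components call for different correction filters.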
(119)
(120)
(121)
(122)
(123)
(124)
(125)
(126)
(127) As shown in
(128) Example Processing
(129) The audio systems discussed herein perform various types of processing on an input audio signal including subband spatial processing (SBS), crosstalk compensation processing (CCP), and crosstalk processing (CP). The crosstalk processing may include crosstalk simulation or crosstalk cancellation. The order of processing for SBS, CCP, and CP may vary. In some embodiments, various steps of the SBS, CCP, or CP processing may be integrated. Some examples of processing embodiments are shown in
(130) With reference to
(131) With reference to
(132) With reference to
(133) With reference to
(134) With reference to
(135) With reference to
(136) With reference to
(137) With reference to
(138) With reference to
(139) With reference to
(140) With reference to
(141) With reference to
(142) With reference to
(143) Example Computer
(144)
(145) The storage device 3008 includes one or more non-transitory computer-readable storage media such as a hard drive, compact disk read-only memory (CD-ROM), DVD, or a solid-state memory device. The memory 3006 holds instructions and data used by the processor 3002. The pointing device 3014 is used in combination with the keyboard 3010 to input data into the computer system 3000. The graphics adapter 3012 displays images and other information on the display device 3018. In some embodiments, the display device 3018 includes a touch screen capability for receiving user input and selections. The network adapter 3016 couples the computer system 3000 to a network. Some embodiments of the computer 3000 have different and/or other components than those shown in
(146) The computer 3000 is adapted to execute computer program modules for providing functionality described herein. For example, some embodiments may include a computing device including one or more modules configured to perform the processing as discussed herein. As used herein, the term “module” refers to computer program instructions and/or other logic used to provide the specified functionality. Thus, a module can be implemented in hardware, firmware, and/or software. In one embodiment, program modules formed of executable computer program instructions are stored on the storage device 3008, loaded into the memory 3006, and executed by the processor 3002.
(147) Upon reading this disclosure, those of skill in the art will appreciate still additional alternative embodiments of the disclosed principles herein. Thus, while particular embodiments and applications have been illustrated and described, it is to be understood that the disclosed embodiments are not limited to the precise construction and components disclosed herein. Various modifications, changes and variations, which will be apparent to those skilled in the art, may be made in the arrangement, operation and details of the method and apparatus disclosed herein without departing from the scope described herein.
(148) Any of the steps, operations, or processes described herein may be performed or implemented with one or more hardware or software modules, alone or in combination with other devices. In one embodiment, a software module is implemented with a computer program product comprising a computer readable medium (e.g., non-transitory computer readable medium) containing computer program code, which can be executed by a computer processor for performing any or all of the steps, operations, or processes described.