Method and apparatus for acoustic crosstalk cancellation
11115775 · 2021-09-07
Assignee
Inventors
Cpc classification
H04S1/002
ELECTRICITY
H04S7/305
ELECTRICITY
H04S2400/09
ELECTRICITY
H04S2420/01
ELECTRICITY
H04R2499/11
ELECTRICITY
International classification
Abstract
An acoustic crosstalk canceller is determined for an asymmetric audio playback device, by determining a transfer function of an acoustic stereo playback path having asymmetries defined by speakers of the playback device. The transfer function is inverted to determine an inverse transfer function. The inverse transfer function is regularised by applying frequency dependent regularisation parameters to obtain an acoustic crosstalk canceller. Also, the inverse transfer function could be regularised for symmetric playback paths by applying aggregated frequency dependent regularisation parameters to obtain an acoustic crosstalk canceller without band branching.
Claims
1. A device for reducing acoustic crosstalk at a time of audio playback, the device comprising: a processor configured to pass a stereo audio signal through a crosstalk canceller, wherein the crosstalk canceller comprises a regularised inverse transfer function of an acoustic stereo playback path having asymmetries defined by stereo playback speakers having unequal directivity with respect to the device, wherein the crosstalk canceller is configured to provide acoustic crosstalk cancellation in relation to the speakers having unequal directivity, and wherein the crosstalk canceller has been regularised by aggregated frequency dependent regularisation parameters; and further configured to pass an output of the crosstalk canceller to the stereo playback speakers for acoustic playback.
2. The device of claim 1, wherein the aggregated frequency dependent regularisation parameters are selected so that the crosstalk canceller is configured to provide for an amount of crosstalk cancellation and spectral coloration in one part of the audio spectrum which is different from an amount of crosstalk cancellation and spectral coloration in another part of the audio spectrum.
3. The device of claim 2, wherein the aggregated frequency dependent regularisation parameters are selected to be generally larger at high frequencies, so that the crosstalk canceller is configured to provide less crosstalk cancellation and less spectral coloration at high frequencies.
4. The device of claim 2, wherein the crosstalk canceller is configured to provide less crosstalk cancellation and less spectral coloration above 8 kHz.
5. The device of claim 1, wherein the acoustic crosstalk canceller is configured to provide for matching of loudspeaker frequency response so that a difference between loudspeakers' respective frequency responses is reduced.
6. The device of claim 1, comprising a respective acoustic crosstalk canceller in relation to each of a plurality of expected use modes of the device.
7. The device of claim 6, comprising a first crosstalk canceller configured for landscape playback, and comprising a second crosstalk canceller configured for portrait playback, and wherein the processor is configured to detect whether the device is being held in a landscape or portrait position and to use the respective first or second crosstalk canceller at a time of audio or video playback.
8. The device of claim 1, wherein the crosstalk canceller has been regularised by aggregated frequency dependent regularisation parameters without band branching.
9. A method of determining an acoustic crosstalk canceller for an asymmetric audio playback device, the method comprising: determining a transfer function of an acoustic stereo playback path having asymmetries defined by speakers of the playback device having unequal directivity with respect to the device; inverting the transfer function to determine an inverse transfer function; regularising the inverse transfer function by applying aggregated frequency dependent regularisation parameters to obtain an acoustic crosstalk canceller.
10. The method of claim 9, wherein the aggregated frequency dependent regularisation parameters are selected so that the crosstalk canceller is configured to provide for a different amount of crosstalk cancellation and spectral coloration in one part of the audio spectrum as compared to another part of the audio spectrum.
11. The method of claim 10, wherein the aggregated frequency dependent regularisation parameters are selected to be generally larger at high frequencies, so that the crosstalk canceller is configured to provide less crosstalk cancellation and less spectral coloration at high frequencies.
12. The method of claim 10, wherein the crosstalk canceller is configured to provide less crosstalk cancellation and less spectral coloration above 8 kHz.
13. The method of claim 9, wherein the acoustic crosstalk canceller is configured to provide for matching of loudspeaker frequency response so that a difference between loudspeakers' respective frequency responses is reduced.
14. The method of claim 9, when performed more than once in respect of the audio playback device so as to determine a respective acoustic crosstalk canceller in relation to each of a plurality of expected use modes of the device.
15. The method of claim 14, wherein a first crosstalk canceller is designed and stored in the device in respect of landscape video playback, and a second crosstalk canceller is designed and stored in the device in respect of portrait video playback, so that selection of the appropriate crosstalk canceller may be made at a time of video playback based on whether the device is being held in a portrait or landscape position.
16. The method of claim 9, wherein regularising the inverse transfer function by applying aggregated frequency dependent regularisation parameters comprises regularising the inverse transfer function by applying aggregated frequency dependent regularisation parameters without band branching.
17. The method of claim 9, comprising deriving a directionality matrix representing directivity gains from each speaker to each ear.
18. A device for determining an acoustic crosstalk canceller for an asymmetric audio playback device, the device comprising: a processor configured to determine a transfer function of an acoustic stereo playback path having asymmetries defined by speakers of the playback device having unequal directivity with respect to the device; invert the transfer function to determine an inverse transfer function; and regularise the inverse transfer function by applying aggregated frequency dependent regularisation parameters to obtain an acoustic crosstalk canceller.
19. A method of reducing acoustic crosstalk at a time of audio playback, the method comprising: passing a stereo audio signal through a crosstalk canceller, wherein the crosstalk canceller comprises a regularised inverse transfer function of an acoustic stereo playback path having asymmetries defined by stereo playback speakers having unequal directivity with respect to the device, wherein the crosstalk canceller has been regularised by aggregated frequency dependent regularisation parameters; and passing an output of the crosstalk canceller to the stereo playback loudspeakers for acoustic playback.
20. A device for reducing acoustic crosstalk at a time of audio playback, the device comprising: a processor configured to pass a stereo audio signal through a crosstalk canceller, wherein the crosstalk canceller comprises a regularised inverse transfer function of an acoustic stereo playback path having asymmetries defined by stereo playback speakers having unequal directivity with respect to the device, wherein the crosstalk canceller has been regularised by aggregated frequency dependent regularisation parameters; and further configured to pass an output of the crosstalk canceller to stereo loudspeakers for acoustic playback.
21. A method of determining an acoustic crosstalk canceller for an asymmetric audio playback device, the method comprising: determining a transfer function of an acoustic stereo playback path having asymmetries defined by stereo playback speakers having unequal directivity with respect to the device; inverting the transfer function to determine an inverse transfer function; regularising the inverse transfer function by applying aggregated frequency dependent regularisation parameters, to obtain an acoustic crosstalk canceller.
22. A non-transitory computer readable medium for determining an acoustic crosstalk canceller for an audio playback device, comprising instructions which, when executed by one or more processors, causes performance of the method of claim 9.
23. A non-transitory computer readable medium for determining an acoustic crosstalk canceller for an audio playback device, comprising instructions which, when executed by one or more processors, causes performance of the method of claim 21.
24. A device for determining an acoustic crosstalk canceller for an asymmetric audio playback device, the device comprising: a processor configured to determine a transfer function of an acoustic stereo playback path having asymmetries defined by stereo playback speakers having unequal directivity with respect to the device; invert the transfer function to determine an inverse transfer function; and regularise the inverse transfer function by applying aggregated frequency dependent regularisation parameters, to obtain an acoustic crosstalk canceller.
25. A method of reducing acoustic crosstalk at a time of audio playback, the method comprising: passing a stereo audio signal through a crosstalk canceller, wherein the crosstalk canceller comprises a regularised inverse transfer function of an acoustic stereo playback path having asymmetries defined by stereo playback speakers having unequal directivity with respect to the device, wherein the crosstalk canceller has been regularised by aggregated frequency dependent regularisation parameters; and passing an output of the crosstalk canceller to stereo loudspeakers for acoustic playback.
26. A non-transitory computer readable medium for reducing acoustic crosstalk at a time of audio playback, comprising instructions which, when executed by one or more processors, causes performance of the method of claim 19.
27. A non-transitory computer readable medium for reducing acoustic crosstalk at a time of audio playback, comprising instructions which, when executed by one or more processors, causes performance of the method of claim 25.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
(1) An example of the invention will now be described with reference to the accompanying drawings, in which:
(2)
(3)
(4)
(5)
(6)
(7)
(8)
(9)
(10)
DESCRIPTION OF THE PREFERRED EMBODIMENTS
(11)
(12) The aim of an acoustic crosstalk canceller (XTC) is to cancel the contralateral audio signals while delivering audio from the ipsilateral loudspeakers to a listener's ears, thereby providing the listener with an accurate binaural image and retaining stereo cues.
(13) We first describe crosstalk cancellation for a generalised playback system, being a system in which it is assumed that two non-identical speakers are used, and further in which it is assumed that the respective speaker directionalities are unequal. The geometry and model of the generalised playback system are as follows.
(14) All specified geometric parameters of the playback model collectively define a spatial channel transfer function (CTF), C, which fully describes relations between the source (loudspeakers) and the sink (ear canals) of the generalised playback model. These relations are assumed to be linear so that for any chosen path, the CTF only changes amplitude and delay of the emitted soundwave.
(15) The described generalised soundwave propagation model may be represented as a typical two input-two output (“2×2”) system, as depicted in
(16) In order to derive a XTC for the generalised playback system of
(17) The stereo digital audio signal {right arrow over (d)}=[d.sub.L d.sub.R].sup.T is passed through the system analog front-end and loudspeakers s.sub.L and s.sub.R with combined frequency response S, which in the case of perfect left and right audio channel decoupling can be expressed as follows.
(18)
(19) In Equation 1 s.sub.L (jω) and s.sub.R (jω) are complex-valued frequency responses of the left and right analog front-end and loudspeaker respectively. Herein, s.sub.L (jω) and s.sub.R (jω) will be called loudspeaker frequency responses, and an analog front-end is implied. The directionality of each speaker, s.sub.L and s.sub.R, along ipsilateral paths l.sub.1 and l′.sub.1, and contralateral paths l.sub.2 and l′.sub.2 as shown in
(20)
(21) In Equation 2, b.sub.ij (jω) are complex-valued directivity gains along the left and right ipsilateral paths l.sub.1 and l′.sub.1, and the corresponding contralateral paths l.sub.2 and l′.sub.2. One method of obtaining the directionality matrix B is by measuring four frequency responses along the propagation paths l.sub.1, l.sub.2, l′.sub.1, and l′.sub.2: two for the ipsilateral paths, l.sub.1 and l′.sub.1, and two for the contralateral paths, l.sub.2 and l′.sub.2, giving b.sub.RR (jω), b.sub.LL (jω), b.sub.LR (jω), and b.sub.RL (jω) respectively for all frequencies ω. Each frequency response b.sub.ij(jω) may be measured by sweeping a test signal in frequency (from DC to the Nyquist frequency) from the left or right speaker, and recording it with a reference microphone in the left or right ear of a head and torso simulator (HATS), depending on the propagation path being identified. See also
(22) Further, the magnitude responses |b.sub.ij(jω)| of the frequency responses b.sub.ij(jω) are smoothed across the entire frequency band, and normalised so that the largest |b.sub.ij(jω)|=1, and therefore the remaining three magnitude responses are less than unity. Then, the common phase shift is removed from all b.sub.ij(jω). Propagation gains and delays due to discrepancies between the paths l.sub.1, l.sub.2, l′.sub.1 and l′.sub.2 are also removed from b.sub.LR (jω) and b.sub.RL(jω) so that the channel frequency response is removed from the measurements. It should be noted that the frequency dependent directivity gains, b.sub.ij(jω), may be reduced to corresponding scalar (frequency independent) gains and delays, depending on the required precision of directivity compensation. The overall input-output equation (the “speaker-to-ear” transfer function) can thus be expressed as follows:
$$\vec{p} = (C \circ B)\,S\,\vec{d} \qquad \text{(EQ 3)}$$
where $\circ$ is the Hadamard (element-wise) matrix product, $\vec{p} = [p_L \;\; p_R]^{T}$, and $C$ is a 2×2 channel frequency response:
(23)
(24) It is convenient to introduce a directional channel model, {tilde over (C)}, such that
(25)
(26) Substitution of EQ 5 into EQ 3 yields:
$$\vec{p} = \tilde{C}\,S\,\vec{d} \qquad \text{(EQ 6)}$$
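By way of illustration only, the relation between EQ 3 and EQ 6 can be checked numerically at a single frequency. The sketch below assumes that S is diagonal (consistent with the stated left/right decoupling) and uses made-up values for C, B, S and the input; none of these numbers are taken from the specification.

```python
import numpy as np

# Illustrative single-frequency values only; none of these numbers come from the patent.
C = np.array([[1.0, 0.8 * np.exp(-0.6j)],
              [0.8 * np.exp(-0.6j), 1.0]], dtype=complex)  # channel: rows = ears, columns = loudspeakers
B = np.array([[1.0, 0.7],
              [0.9, 0.6]], dtype=complex)                  # hypothetical directivity gains b_ij
S = np.diag([1.0 + 0.0j, 0.5 + 0.2j])                      # assumed diagonal: s_L and s_R, no coupling
d = np.array([1.0, -1.0], dtype=complex)                   # stereo input [d_L, d_R]

C_tilde = C * B               # Hadamard (element-wise) product forming the directional channel
p_eq3 = (C * B) @ S @ d       # EQ 3: element-wise product first, then ordinary matrix products
p_eq6 = C_tilde @ S @ d       # EQ 6: the same result written with the directional channel
```

The element-wise product is taken before the ordinary matrix products, which is why the two expressions agree.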
(27) The purpose of the proposed stereo enhancement method of the present invention is to seek to make the sound at the listener's ears {right arrow over (p)} very close to the original audio signal {right arrow over (d)}, but only to within a certain margin. This is done by finding a matrix (operator) H, which when applied on to the original stereo audio signal {right arrow over (d)}, largely but not completely cancels the impact of the directional channel {tilde over (C)}. This is equivalent to cancelling both crosstalk and the discrepancy in the loudspeakers' directionality.
$$\vec{p} = \tilde{C}\,S\,H\,\vec{d} \qquad \text{(EQ 7)}$$
(28) Matrix H is the frequency response of the crosstalk canceller with component filters h.sub.ij (i=L(eft) or R(ight) ear canal, j=L(eft) or R(ight) loudspeaker):
(29)
(30) In order for the crosstalk canceller to efficiently counteract the impact of the directional channel {tilde over (C)}, it is necessary to match frequency responses of the left and right loudspeakers, s.sub.L (jω) and s.sub.R (jω) respectively, so that the difference between the loudspeakers' frequency responses is minimal. The matching may be performed in a number of ways. For example, if the frequency response of the right loudspeaker is to be matched to the frequency response of the left loudspeaker, a filter
(31)
will be applied on to the frequency response of the right loudspeaker:
{tilde over (s)}.sub.R(jω)=s.sub.
where {tilde over (s)}.sub.R (jω) is the frequency response of the right loudspeaker after matching it to the frequency response of the left loudspeaker.
(32) Conversely, if the frequency response of the left loudspeaker is to be matched to the frequency response of the right loudspeaker, a filter
(33)
will be applied on to the frequency response of the left loudspeaker:
{tilde over (s)}.sub.L(jω)=s.sub.
where {tilde over (s)}.sub.L (jω) is the frequency response of the left loudspeaker after matching it to the frequency response of the right loudspeaker.
(34) In other embodiments, it is possible to match the frequency responses of both the left and right speakers to a user-defined or otherwise predefined frequency response. The matching filter derivation and the matching procedure are similar to those described above.
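The matching filter expressions themselves are not legible in this text, so the following is only a sketch under an assumption: that the matching filter is the per-bin ratio of the target response to the response being matched. The function name `matching_filter`, the `eps` guard and the sample responses are hypothetical.

```python
import numpy as np

def matching_filter(s_target, s_source, eps=1e-12):
    """Per-bin filter mapping s_source onto s_target (assumed form: a plain spectral ratio)."""
    s_source = np.asarray(s_source, dtype=complex)
    # Guard against division by (near) zero bins; eps is an illustrative choice.
    safe = np.where(np.abs(s_source) < eps, eps, s_source)
    return np.asarray(s_target, dtype=complex) / safe

# Hypothetical three-bin loudspeaker responses.
s_L = np.array([1.0 + 0.0j, 0.9 - 0.1j, 0.7 + 0.2j])
s_R = np.array([0.8 + 0.1j, 1.1 + 0.0j, 0.5 - 0.3j])

m_R = matching_filter(s_L, s_R)   # filter applied to the right loudspeaker
s_R_matched = s_R * m_R           # right response after matching to the left
```

Matching both loudspeakers to a predefined response would, under the same assumption, use that response as `s_target` for each side.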
(35) The above-described process of loudspeaker matching is convenient to represent in matrix form. Let s.sub.
(36)
(37) The loudspeaker matching is achieved by applying S on the output of the crosstalk canceller so that EQ 7 yields:
(38)
where {tilde over (s)}(jω) is the frequency response of both loudspeakers after matching.
(39) Substituting EQ 15 into EQ 14 yields:
$$\vec{p} = \tilde{s}\,\tilde{C}\,H\,\vec{d} \qquad \text{(EQ 16)}$$
(40) From EQ 16 it follows that the performance of the proposed playback system depends on the choice of the crosstalk canceller. For example, in theory, perfect cancellation is achieved when the XTC is the inverse of the directional channel frequency response, or:
$$H = \tilde{C}^{-1} \qquad \text{(EQ 17)}$$
(41) Substitution of EQ 17 into EQ 16 gives
$$\vec{p} = \tilde{s}\,\tilde{C}\,H\,\vec{d} = \tilde{s}\,\tilde{C}\,\tilde{C}^{-1}\,\vec{d} = \tilde{s}\,\vec{d} \qquad \text{(EQ 18)}$$
(42) Therefore, in theory, after perfect crosstalk cancellation the audio at the listener's ears is precisely the same as the original audio signal spectrally shaped by the frequency response of the matched loudspeakers. However in practice if the XTC is set to be the inverse of the directional channel frequency response in accordance with EQ 17, a highly sensitive and in fact impractical system results.
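The sensitivity of exact inversion can be illustrated with a toy example. The sketch below assumes a symmetric single-frequency channel with unit direct paths and a cross path of gain g and phase lag φ; this simplified channel and the value g = 0.95 are assumptions for illustration, not the directional channel of the specification.

```python
import numpy as np

def channel(g, phase):
    """Toy symmetric 2x2 channel at one frequency: unit direct paths, cross path g*exp(-j*phase)."""
    x = g * np.exp(-1j * phase)
    return np.array([[1.0, x], [x, 1.0]], dtype=complex)

def exact_xtc(C):
    """Unregularised canceller as per EQ 17: the plain matrix inverse."""
    return np.linalg.inv(C)

g = 0.95                                      # hypothetical cross-path gain close to unity
H_bad = exact_xtc(channel(g, 0.0))            # cross path in phase with direct path
H_ok = exact_xtc(channel(g, np.pi / 2))       # cross path in quadrature

gain_bad = np.linalg.norm(H_bad, 2)           # largest singular value = worst-case canceller gain
gain_ok = np.linalg.norm(H_ok, 2)
```

In this toy case the worst-case gain of the exact inverse is 20 (about 26 dB) where the channel is nearly singular, but below unity a quarter cycle away. Peaks of this kind are what give rise to spectral coloration and transducer load.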
(43)
(44) As noted above, in practice if the XTC is set to be the exact inverse of the directional channel frequency response in accordance with EQ 17, a highly sensitive and impractical system results. Accordingly, the present invention seeks to provide a robust crosstalk canceller. In order to introduce such a canceller, the following considerations are necessary.
(45) First, for a given playback system and geometry, the performance of the XTC is fully determined by the choice of H.
(46) Second, to provide a robust practical solution it is necessary to avoid perfect crosstalk cancellation as per EQ 17. This is because while in theory it totally removes crosstalk, in practice the performance of this method is highly sensitive to the listener's head position, results in excessive spectral coloration, and adds a substantial load on both transducers. When geometry of the playback is violated (e.g. the listener moves his head left or right with respect to the centre of the playback device), the effect of crosstalk cancellation is severely deteriorated, and the spectral coloration causes unpleasant sound distortion.
(47) Third, the severity of spectral coloration caused by the designed crosstalk canceller can be fully determined by a suitable method of deriving H, in accordance with the present invention. However some such methods allow a special parameterisation, which enables a trade-off between maximal spectral coloration, achievable crosstalk cancellation, and the size of the “sweet spot”, being the three dimensional volume within which maximum or sufficient crosstalk cancellation occurs and within which minimal or tolerable audible spectral coloration is perceived.
(48) Fourth, the performance of the XTC is sensitive to the position of the listener's head. By controlling spectral coloration in a trade-off against the amount of perceived binaural cues, it is possible to reduce perceived distortion arising in response to head movement.
(49) Fifth, the performance of the crosstalk canceller will progressively degrade with increasing discrepancy between the loudspeakers' frequency responses. Discrepancy in the phase responses is more damaging to the XTC than discrepancy in the magnitude responses. For this reason, in order to maximise the obtainable beneficial effect of crosstalk cancellation, in some embodiments we propose that the frequency responses of both loudspeakers be matched to each other, as per EQ 15. This matching may be advantageous in compact playback devices or indeed in any system in which relatively low cost, and thus poorly matched, speakers are employed. Embodiments deployed on devices having sufficiently well matched loudspeakers may however omit this step.
(50) Sixth, the performance of the crosstalk canceller will deteriorate if the loudspeakers have different directionality patterns. Such differences in directionality may arise due to a difference in the loudspeaker design, a difference in the loudspeaker port design, placement of the loudspeakers on non-parallel or orthogonal surfaces of the device (as shown in
(51) With particular regard to the first to fourth considerations above, the present invention provides for crosstalk canceller regularisation in order to introduce a controllable trade-off between residual crosstalk and spectral coloration. The described embodiments effect a frequency dependent regularisation using an aggregated regularisation parameter; however, other types of regularisation may be used. The described embodiment further extends this method to the more general case of asymmetric playback geometry with speaker directivity, while also significantly simplifying the method such that most of its complexity lies in the off-line design of the XTC, H, and so that on-line (run-time) complexity is minimised, to allow deployment on compact mobile devices and the like. To this end, the frequency response of the crosstalk canceller is calculated as follows.
$$H = \left[C^{\mathsf{H}} C + R\right]^{-1} C^{\mathsf{H}} \qquad \text{(EQ 19)}$$
where the superscript $\mathsf{H}$ denotes Hermitian conjugation and $R$ is a frequency dependent regularisation matrix, such that:
(52)
where Γ.sup.L and Γ.sup.R are the required levels of spectral coloration, at the left and right loudspeakers respectively, ρ.sup.L (ω,Γ) and ρ.sup.R (ω,Γ) are the aggregated frequency-dependent regularisation parameters used to achieve required spectral coloration at the left or right loudspeakers, respectively, such that
$$\rho^{L}(\omega,\Gamma^{L}) = \max\{\rho_{I}^{L}(\omega,\Gamma^{L}),\ \rho_{II}^{L}(\omega,\Gamma^{L}),\ 0\} \qquad \text{(EQ 21)}$$
$$\rho^{R}(\omega,\Gamma^{R}) = \max\{\rho_{I}^{R}(\omega,\Gamma^{R}),\ \rho_{II}^{R}(\omega,\Gamma^{R}),\ 0\} \qquad \text{(EQ 22)}$$
(53) The regularisation sub-parameters ρ.sub.I and ρ.sub.II may be calculated using a method described in U.S. Pat. No. 9,167,344, or by any other suitable method. It is to be noted that U.S. Pat. No. 9,167,344 uses the regularisation sub-parameters ρ.sub.I and ρ.sub.II in a manner unlike that of the present embodiment of the invention, by using a band branching method which requires the input audio to be divided into sub-bands whose widths are dependent on the playback system parameters (e.g. playback geometry, sampling frequency), and then processing each such band separately by a respective XTC designed specifically for each band using a respective regularisation parameter, which is complex with high MIPS and memory requirements. In contrast, the present embodiment of the invention uses the regularisation sub-parameters ρ.sub.I and ρ.sub.II to produce aggregated regularisation parameters ρ.sup.L and ρ.sup.R which importantly permits crosstalk cancellation to be effected without the use of band branching, requiring only a single XTC design.
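The aggregation of EQ 21 and EQ 22 amounts to a per-frequency maximum over the two sub-parameters and zero. A minimal sketch follows, assuming the sub-parameters have already been computed per frequency bin by some other method; the function name and the sample values are hypothetical.

```python
import numpy as np

def aggregate_regularisation(rho_I, rho_II):
    """Per-bin aggregated parameter: max{rho_I, rho_II, 0}, one value per frequency."""
    rho_I = np.asarray(rho_I, dtype=float)
    rho_II = np.asarray(rho_II, dtype=float)
    return np.maximum(np.maximum(rho_I, rho_II), 0.0)

# Hypothetical sub-parameter values at three frequency bins.
rho_I = np.array([-0.2, 0.3, 0.0])
rho_II = np.array([-0.1, 0.1, 0.5])
rho = aggregate_regularisation(rho_I, rho_II)
```

Because a single parameter vector results, one canceller covers the whole band and no sub-band splitting of the audio is needed.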
(54) In order to derive the desired aggregated regularisation parameters, the present embodiment of the invention recognises that peaks of the unregularised in-phase XTC response S.sub.i(ω) (where S.sub.i(ω)=|h.sub.LL (jω)+h.sub.LR (jω)|=|h.sub.RL(jω)+h.sub.RR(jω)|) always coincide in frequency with peaks of the frequency dependent regularisation (FDR) parameter ρ.sub.I. It was further recognised that peaks of the unregularised out-of-phase XTC response S.sub.o(ω) (where S.sub.o(ω)=|h.sub.LL(jω)−h.sub.LR (jω)|=|h.sub.RL(jω)−h.sub.RR(jω)|) always coincide in frequency with peaks of the FDR parameter ρ.sub.II. This coincidence is illustrated in
(55) In (EQ 19) all components are frequency dependent. For every jω-th spectral frequency, the crosstalk canceller is represented as a 2×2 matrix, H, as per EQ 8, and each matrix H consists of four component filters as described earlier.
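A per-frequency evaluation of EQ 19 might be sketched as below. Since EQ 20 is not legible in this text, the sketch assumes that R is the diagonal matrix diag(ρ^L, ρ^R); the toy channel and the value ρ = 0.05 are likewise illustrative only.

```python
import numpy as np

def regularised_xtc(C, rho_L, rho_R):
    """EQ 19 at one frequency: H = [C^H C + R]^{-1} C^H, with R assumed to be diag(rho_L, rho_R)."""
    R = np.diag([rho_L, rho_R]).astype(complex)
    CH = C.conj().T                              # Hermitian conjugate of C
    return np.linalg.solve(CH @ C + R, CH)       # solves (C^H C + R) H = C^H without forming an inverse

# Toy, nearly singular channel (hypothetical values).
C = np.array([[1.0, 0.95], [0.95, 1.0]], dtype=complex)

H_exact = regularised_xtc(C, 0.0, 0.0)           # rho = 0 reduces to the plain inverse
H_reg = regularised_xtc(C, 0.05, 0.05)           # modest regularisation

gain_exact = np.linalg.norm(H_exact, 2)          # worst-case gain without regularisation
gain_reg = np.linalg.norm(H_reg, 2)              # worst-case gain with regularisation
```

In this toy case the worst-case gain drops from 20 to below unity, at the cost of some residual crosstalk, since the product of C and `H_reg` is no longer the identity.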
(56) Although it is in the general case possible to achieve different spectral coloration at each loudspeaker, in this treatment, without loss of generality, we will consider a case, where the same spectral coloration is required at both left and right loudspeakers, so Γ=Γ.sup.L=Γ.sup.R is a scalar.
(57) A particular recognition of some embodiments of the present invention is that the spectral coloration caused by the frequency response, H, of the crosstalk canceller is an undesired artefact, particularly in high frequencies. Accordingly, here we propose a method of frequency selective control of spectral coloration caused by XTC, which allows reduced spectral coloration in any chosen frequency band, different to the coloration permitted in other bands. The method is as follows. If designed using EQ 19, the XTC introduces an amount of spectral coloration, Γ, that is inversely related to the regularisation parameter ρ: the smaller ρ, the larger the spectral coloration, and with ρ=0 the spectral coloration is maximal. Therefore it is possible to decrease spectral coloration by making a controlled increase in the regularisation parameter, ρ.
(58) Hence, one method of frequency selective control of the spectral coloration is to apply a “shaping” function on to the allowed spectral coloration, Γ. This function may be, but is not limited to, the “flipped” logistic function:
(59)
where e is the base of the natural logarithm, n is the n-th DFT frequency bin, n.sub.0 is the DFT frequency bin corresponding to the sigmoid's midpoint, Γ is the allowed spectral coloration (the sigmoid's maximum value), and k is the slope (steepness) of the curve.
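The shaping function itself is not legible in this text. The sketch below therefore assumes the usual mirrored logistic form Γ/(1+e^{k(n−n₀)}), which falls from Γ at low bins to zero at high bins and equals Γ/2 at the midpoint. The function name and the example values are hypothetical.

```python
import numpy as np

def flipped_logistic(n, n0, gamma_max, k):
    """Assumed shaping function gamma_max / (1 + exp(k * (n - n0))).

    Evaluated through the identity 1/(1+e^x) = 0.5*(1 - tanh(x/2)) so that
    large |x| cannot overflow.
    """
    x = k * (np.asarray(n, dtype=float) - n0)
    return gamma_max * 0.5 * (1.0 - np.tanh(0.5 * x))

bins = np.arange(200)                                                 # hypothetical DFT bin indices
allowed = flipped_logistic(bins, n0=100, gamma_max=7.0, k=0.1)        # allowed coloration per bin
```

With these example values nearly the full coloration is permitted in the lowest bins, and the allowance shrinks towards zero above the midpoint bin, which calls for stronger regularisation at high frequencies.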
(60)
(61)
(62) Accordingly, we can provide a method for XTC design for a generalised playback system. The proposed method of the XTC design is as follows. For a specific XTC use case, e.g. music video playback on a mobile phone, we define an input parameter vector {right arrow over (u)}=[r.sub.S, r′.sub.S, r.sub.h, r′.sub.h, Δr, Γ, n, f.sub.S], where Γ (dB) is the maximum allowed spectral coloration (cumulative gain due to crosstalk cancellation), n is the length of each component filter, and f.sub.S (Hz) is the sampling frequency.
(63) Next, calculate the playback geometry parameters: path lengths l.sub.1, l.sub.2, l′.sub.1, and l′.sub.2:
$$l_1 = l_{RR} = \sqrt{(0.5\,\Delta r - r_s)^{2} + r_h^{2}} \qquad \text{(EQ 24)}$$
$$l_2 = l_{LR} = \sqrt{(0.5\,\Delta r + r_s)^{2} + r_h^{2}} \qquad \text{(EQ 25)}$$
$$l'_1 = l_{LL} = \sqrt{(0.5\,\Delta r - r'_s)^{2} + {r'_h}^{2}} \qquad \text{(EQ 26)}$$
$$l'_2 = l_{RL} = \sqrt{(0.5\,\Delta r + r'_s)^{2} + {r'_h}^{2}} \qquad \text{(EQ 27)}$$
where l.sub.ij is the path length to the i-th (L(eft) or R(ight)) ear canal from the j-th loudspeaker.
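EQ 24 to EQ 27 translate directly into code. In the sketch below the argument names mirror the symbols of the input parameter vector, whose geometric meaning is defined by the drawings; the function name and the numerical values in the example are hypothetical.

```python
import math

def path_lengths(r_s, r_s_prime, r_h, r_h_prime, delta_r):
    """Path lengths per EQ 24-27, keyed by ear index then loudspeaker index."""
    return {
        "RR": math.hypot(0.5 * delta_r - r_s, r_h),              # l_1  (EQ 24)
        "LR": math.hypot(0.5 * delta_r + r_s, r_h),              # l_2  (EQ 25)
        "LL": math.hypot(0.5 * delta_r - r_s_prime, r_h_prime),  # l'_1 (EQ 26)
        "RL": math.hypot(0.5 * delta_r + r_s_prime, r_h_prime),  # l'_2 (EQ 27)
    }

# Hypothetical geometry in metres, chosen only to exercise the formulas.
lengths = path_lengths(r_s=0.075, r_s_prime=0.06, r_h=0.4, r_h_prime=0.45, delta_r=0.15)
```

`math.hypot(a, b)` evaluates the square root of a² + b², which is the form of each of the four equations.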
(64) Next, calculate the channel parameters along each propagation path l.sub.1, l.sub.2, l′.sub.1, and l′.sub.2. In particular, calculate the path attenuations g.sub.1, g.sub.2, g′.sub.1 and g′.sub.2 as follows. Select the shortest path length l.sub.min=min{l.sub.1, l.sub.2, l′.sub.1, l′.sub.2} and set the gain across this path to unity, so that g[l.sub.min]=1. Here, [A] denotes “index of A”. The remaining gains are calculated as
(65)
(66) Thereby, the path gains g.sub.1=g.sub.RR, g.sub.2=g.sub.LR, g′.sub.1=g.sub.LL and g′.sub.2=g.sub.RL are estimated. Next, calculate the path delays in seconds, τ.sub.C, and the path delays in samples, τ.sub.S, along all propagation paths l.sub.1, l.sub.2, l′.sub.1, and l′.sub.2:
(67)
(68) Next, normalise the calculated path delays (in seconds) by selecting the shortest delay τ.sub.C min and subtracting it from all delays in EQ 30-33, so that they become:
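$$\tau_{C l_1} \leftarrow \tau_{C l_1} - \tau_{C \min} \qquad \text{(EQ 34)}$$
$$\tau_{C l_2} \leftarrow \tau_{C l_2} - \tau_{C \min} \qquad \text{(EQ 35)}$$
$$\tau_{C l'_1} \leftarrow \tau_{C l'_1} - \tau_{C \min} \qquad \text{(EQ 36)}$$
$$\tau_{C l'_2} \leftarrow \tau_{C l'_2} - \tau_{C \min} \qquad \text{(EQ 37)}$$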
(69) Normalised path delays (in samples) is τ.sub.Sl.sub.
(70) Then, we construct the spatial channel frequency response, C, represented by its component filters c.sub.LL, c.sub.LR, c.sub.RL, c.sub.RR, by performing an n-point DFT on the C.sup.t component filters c.sub.LL.sup.t, c.sub.LR.sup.t, c.sub.RL.sup.t, c.sub.RR.sup.t. Next, we construct the directional channel frequency response, {tilde over (C)}, represented by its component filters {tilde over (c)}.sub.LL, {tilde over (c)}.sub.LR, {tilde over (c)}.sub.RL, {tilde over (c)}.sub.RR, by performing a Hadamard (element-wise) multiplication of the channel frequency response, C, with the speaker directionality matrix, B, as per EQ 5.
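The channel construction steps above can be sketched end to end. The gain and delay formulas are not legible in this text, so the sketch makes several assumptions: gains follow inverse-distance attenuation normalised to the shortest path; delays are path length divided by a speed of sound of 343 m/s; each impulse response is a single tap at the rounded delay; and matrices are laid out with the ear index first. All names and values are hypothetical.

```python
import numpy as np

PATHS = ("LL", "LR", "RL", "RR")   # ear index first, loudspeaker index second

def gains_and_delays(lengths, fs, c_sound=343.0):
    """Normalised path gains and delays (in samples) from path lengths in metres."""
    l_min = min(lengths.values())
    gains = {k: l_min / v for k, v in lengths.items()}                    # assumed 1/r law, shortest path = 1
    delays = {k: (v - l_min) / c_sound * fs for k, v in lengths.items()}  # shortest delay subtracted
    return gains, delays

def directional_channel(lengths, B, n, fs):
    """Return (C, C_tilde), each of shape (n, 2, 2): n-point DFTs of single-tap impulse responses."""
    gains, delays = gains_and_delays(lengths, fs)
    C = np.zeros((n, 2, 2), dtype=complex)
    for k in PATHS:
        i, j = "LR".index(k[0]), "LR".index(k[1])
        taps = np.zeros(n)
        taps[int(round(delays[k]))] = gains[k]     # gain placed at the rounded delay tap
        C[:, i, j] = np.fft.fft(taps)
    return C, C * B                                # Hadamard product; B broadcasts over frequency

# Hypothetical symmetric-looking geometry and scalar directivity gains.
lengths = {"LL": 0.4, "LR": 0.5, "RL": 0.5, "RR": 0.4}
B = np.array([[1.0, 0.7], [0.9, 1.0]])
C, C_tilde = directional_channel(lengths, B, n=64, fs=48000)
```

Here B is taken as frequency independent for brevity; a frequency dependent B of shape (n, 2, 2) would multiply in the same way.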
(71) Next we calculate the crosstalk canceller frequency response, H. For a given spectral coloration level Γ dB we calculate the frequency-dependent regularisation parameters for each (left or right) side of the playback system, ρ.sup.L (ω, Γ) and ρ.sup.R (ω, Γ), respectively.
$$\rho^{L}(\omega) = \max\{\rho_{I}^{L}(\omega),\ \rho_{II}^{L}(\omega),\ 0\} \qquad \text{(EQ 38)}$$
$$\rho^{R}(\omega) = \max\{\rho_{I}^{R}(\omega),\ \rho_{II}^{R}(\omega),\ 0\} \qquad \text{(EQ 39)}$$
(72) It is to be noted that this method for calculation of the regularisation parameters is generalised to a non-symmetric playback geometry, and it does not require band branching.
(73) For each frequency ω assemble a matrix C.sup.ω such that:
(74)
(75) For each frequency ω estimate the crosstalk canceller frequency response, H.sup.ω as:
(76)
where superscript .sup.(H) represents the Hermitian conjugation operator, and the regularisation matrix is defined by EQ 20.
(77) It is to be noted that regularisation occurs naturally at the frequencies where ρ.sup.k(ω)>0, k=L or R, which is where the magnitude frequency response of the unregularised XTC exceeds Γ dB. Otherwise, ordinary least-squares inversion is performed as there is no need for the regularisation.
(78) Next we construct the XTC impulse response, H.sup.t, represented by its component filters h.sub.ij.sup.t, by performing an n-point inverse DFT (IDFT) on the H.sup.ω component filters h.sub.ij across all frequencies, followed by a cyclic shift of n/2. The calculated component filter coefficients h.sub.ij.sup.t of the XTC are loaded into the two-input two-output filter structure H (
(79) Importantly, while derivation of the component filter coefficients h.sub.ij.sup.t of the XTC H involves the above described process and entails a considerable computational burden, this is a one-off process which can be performed just once in respect of each expected use mode of the device 100. The component filter coefficients h.sub.ij.sup.t of the XTC H do not necessarily require any further change thereafter throughout the entire lifetime of the device 100. The run-time computational burden of the presently described crosstalk canceller is much reduced as compared to the one-off design of the canceller, because the run-time process of stereo audio playback merely involves passing the input audio stereo signal d through H.
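The split between off-line design and run-time filtering might be sketched as follows. The sketch assumes that the designed frequency response is conjugate-symmetric, so that the IDFT is real, and that run-time filtering is plain FIR convolution; the function names are hypothetical, and a trivial identity canceller is used only to make the behaviour easy to verify.

```python
import numpy as np

def xtc_impulse_responses(H_freq):
    """Off-line step: n-point IDFT of each component filter, then a cyclic shift of n/2.

    H_freq has shape (n, 2, 2); the result has shape (2, 2, n).
    Assumes H_freq is conjugate-symmetric, so discarding the imaginary part is harmless.
    """
    n = H_freq.shape[0]
    h = np.fft.ifft(H_freq, axis=0)
    h = np.roll(h, n // 2, axis=0)          # centre the response so that it is causal
    return np.real(h).transpose(1, 2, 0)

def apply_xtc(h, d_L, d_R):
    """Run-time step: pass the stereo signal through the 2x2 FIR structure (output = H d)."""
    out_L = np.convolve(d_L, h[0, 0]) + np.convolve(d_R, h[0, 1])
    out_R = np.convolve(d_L, h[1, 0]) + np.convolve(d_R, h[1, 1])
    return out_L, out_R

# Trivial example: an identity canceller with n = 8 becomes a pure delay of n/2 = 4 samples.
H_freq = np.tile(np.eye(2, dtype=complex), (8, 1, 1))
h = xtc_impulse_responses(H_freq)
out_L, out_R = apply_xtc(h, np.array([1.0, 2.0, 3.0]), np.zeros(3))
```

Only `apply_xtc` would run during playback, so the run-time cost is four FIR filters regardless of how the coefficients were designed.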
(80) In another embodiment of the invention, the crosstalk canceller is designed for the case of crosstalk cancellation of a playback system having same plane placement of identical speakers.
(81) The described free-field soundwave propagation model may be represented as a typical two input-two output (“2×2”) system, as depicted in
(82)
(83)
where s(jω) is a complex-valued frequency response of both left and right analog front-end and loudspeakers, and I is a 2×2 identity matrix.
(84) In the case of identical and symmetrically placed loudspeakers, the speaker directionality matrix becomes
(85)
(86) After substituting EQ 42 and EQ 43 into EQ 3, the overall input-output equation for the symmetric free-field model can be expressed as follows.
{right arrow over (p)}=sC{right arrow over (d)} (EQ 44).
(87) Substituting EQ 17 into EQ 44 yields:
{right arrow over (p)}=sCH{right arrow over (d)}=sCC.sup.−1{right arrow over (d)}=s{right arrow over (d)} (EQ 45).
(88) Therefore, after perfect crosstalk cancellation, the audio at the listener's ears is, again only in theory, the original audio signal spectrally shaped by the frequency response of the matched loudspeakers.
(89) Hence, as shown in
(90) Accordingly, for the case of symmetric placement of two identical loudspeakers, the proposed XTC is derived as follows. For each spectral frequency ω:
H=[C.sup.HC+ρI].sup.−1C.sup.H (EQ 46)
where 0≤ρ<1 is an aggregated frequency-dependent regularisation parameter, and I is the identity matrix.
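For a single frequency bin, EQ 46 can be evaluated directly because C is only 2×2. The sketch below is illustrative: the function names and the nested-list matrix layout are assumptions, and the closed-form 2×2 inverse is simply one convenient way to compute the bracketed inverse.

```python
def matmul2(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def regularised_inverse(c, rho):
    """Return H = [C^H C + rho*I]^-1 C^H for one frequency bin (EQ 46)."""
    # Hermitian conjugate: c_h[i][j] = conj(c[j][i]).
    c_h = [[complex(c[j][i]).conjugate() for j in range(2)] for i in range(2)]
    a = matmul2(c_h, c)
    a[0][0] += rho
    a[1][1] += rho
    # Closed-form inverse of the 2x2 matrix a.
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    a_inv = [[a[1][1] / det, -a[0][1] / det],
             [-a[1][0] / det, a[0][0] / det]]
    return matmul2(a_inv, c_h)
```

With ρ=0 and an invertible C this reduces to the plain inverse C.sup.−1; with ρ>0 the gain of H is reduced at that frequency, trading crosstalk cancellation for less spectral coloration. If ρ=0 and C is singular, the division by `det` fails, which is the situation regularisation is meant to avoid.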
(91) The proposed method of the XTC design for the embodiment of
l.sub.1=√{square root over ((0.5Δr−r.sub.s).sup.2+r.sub.h.sup.2)} (EQ 47)
l.sub.2=√{square root over ((0.5Δr+r.sub.s).sup.2+r.sub.h.sup.2)} (EQ 48)
Δl=l.sub.2−l.sub.1 (EQ 49)
(92) Next we calculate channel parameters, including the path attenuation g, the path delay in seconds τ.sub.c, and the path delay in samples τ.sub.s:
(93)
where c.sub.S is the speed of sound (m/s).
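The geometry of EQ 47 to EQ 49 and the channel parameters that follow can be sketched numerically. The path lengths are taken directly from EQ 47 to EQ 49. The formulas used here for g, τ.sub.c and τ.sub.s are not reproduced in this text, so the usual free-field forms (g=l.sub.1/l.sub.2, τ.sub.c=Δl/c.sub.S, τ.sub.s=τ.sub.c·f.sub.S) are assumed, as is the default speed of sound of 343 m/s. The function name is hypothetical.

```python
import math


def channel_parameters(r_s, r_h, delta_r, f_s, c_s=343.0):
    """Path lengths (EQ 47-49) and assumed free-field channel parameters.

    Lengths are in metres, f_s in Hz, c_s in m/s (343 m/s is an assumption).
    """
    l1 = math.hypot(0.5 * delta_r - r_s, r_h)   # EQ 47
    l2 = math.hypot(0.5 * delta_r + r_s, r_h)   # EQ 48
    delta_l = l2 - l1                           # EQ 49
    g = l1 / l2              # assumed form of the path attenuation
    tau_c = delta_l / c_s    # assumed form of the path delay in seconds
    tau_s = tau_c * f_s      # assumed form of the path delay in samples
    return {"l1": l1, "l2": l2, "delta_l": delta_l,
            "g": g, "tau_c": tau_c, "tau_s": tau_s}


# Example geometry: r_s = 0.13 m, r_h = 0.5 m, delta_r = 0.175 m, f_s = 48 kHz.
params = channel_parameters(0.13, 0.5, 0.175, 48_000.0)
```

Under these assumptions the cross path is slightly longer than the direct path, so g is a little below 1 and the cross-path delay is a few samples at 48 kHz.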
(94) We then construct the spatial channel impulse response, C.sup.t. The direct-path filter c.sub.LL.sup.t=c.sub.RR.sup.t is an n-tap identity FIR. The cross-path filter c.sub.LR.sup.t=c.sub.RL.sup.t is constructed by inserting g (EQ 50) into the τ.sub.s-th (EQ 52) tap of an n-element zero vector; if τ.sub.s is non-integer it may be rounded to the nearest integer. We next construct the spatial channel frequency response, C, represented by its component filters c.sub.LL=c.sub.RR and c.sub.LR=c.sub.RL, by performing an n-point DFT on the C.sup.t component filters c.sub.LL.sup.t=c.sub.RR.sup.t and c.sub.LR.sup.t=c.sub.RL.sup.t.
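The construction of C.sup.t and C described above can be sketched as follows. Two interpretations are assumed here: the "n-tap identity FIR" is taken to be a unit impulse at tap 0, and taps are indexed from 0. The function names are hypothetical, and the direct O(n²) DFT is used only to keep the sketch free of external libraries.

```python
import cmath


def dft(x):
    """Direct n-point DFT (O(n^2)); adequate for a one-off design step."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def spatial_channel(g, tau_s, n):
    """Build the direct and cross path filters and their frequency responses."""
    delay = round(tau_s)  # non-integer delays are rounded to a whole sample
    if not 0 <= delay < n:
        raise ValueError("delay must fall inside the n-tap filter")
    c_direct_t = [1.0] + [0.0] * (n - 1)   # assumed: unit impulse at tap 0
    c_cross_t = [0.0] * n
    c_cross_t[delay] = g                   # gain g at the delayed tap
    return c_direct_t, c_cross_t, dft(c_direct_t), dft(c_cross_t)
```

In this model the direct path has a flat unit response, while the cross path has constant magnitude g and a linear phase set by the delay.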
(95) Next, construct the crosstalk canceller frequency response, H, as follows. For a given spectral coloration level Γ dB, calculate the aggregated frequency-dependent regularisation parameter:
ρ(ω)=max{ρ.sub.I(ω),ρ.sub.II(ω),0}. (EQ 53)
(96) For each frequency ω assemble a matrix C.sup.ω such that
(97)
(98) For each frequency ω estimate the crosstalk canceller frequency response, H.sup.ω such that:
(99)
where superscript .sup.(H) represents the Hermitian conjugation operator.
(100) It is to be noted that regularisation occurs naturally at the frequencies where ρ(ω)>0, which is where the magnitude frequency response of the unregularised XTC exceeds Γ dB. Otherwise, ordinary least-squares inversion is performed as there is no need for regularisation. We construct the XTC impulse response, H.sup.t, represented by its component filters h.sub.LL.sup.t=h.sub.RR.sup.t and h.sub.LR.sup.t=h.sub.RL.sup.t, by performing an n-point inverse DFT (IDFT) on the H.sup.ω component filters h.sub.LL.sup.ω=h.sub.RR.sup.ω and h.sub.LR.sup.ω=h.sub.RL.sup.ω, followed by a cyclic shift of n/2. This completes construction of this embodiment of the crosstalk canceller, H. The calculated component filter coefficients h.sub.LL.sup.t=h.sub.RR.sup.t and h.sub.LR.sup.t=h.sub.RL.sup.t of the XTC are thus loaded into the two-input two-output filter structure H. Once again, this is a one-off design process and the component filter coefficients of H need no further change.
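The symmetric design steps above (per-frequency regularised inversion, n-point IDFT, cyclic shift of n/2) can be sketched end to end. Several points are assumptions: the matrix C.sup.ω is taken to be [[c.sub.LL, c.sub.LR], [c.sub.RL, c.sub.RR]] at each bin, the per-bin inversion follows the form of EQ 46, and ρ(ω) is supplied by the caller as a precomputed list because ρ.sub.I and ρ.sub.II are defined elsewhere. All function names are hypothetical.

```python
import cmath


def dft(x, sign=-1):
    """Direct n-point DFT (sign=-1) or unscaled inverse DFT (sign=+1)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                for t in range(n)) for k in range(n)]


def matmul2(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def regularised_inverse(c, rho):
    """H = [C^H C + rho*I]^-1 C^H for one frequency bin."""
    c_h = [[complex(c[j][i]).conjugate() for j in range(2)] for i in range(2)]
    a = matmul2(c_h, c)
    a[0][0] += rho
    a[1][1] += rho
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    a_inv = [[a[1][1] / det, -a[0][1] / det],
             [-a[1][0] / det, a[0][0] / det]]
    return matmul2(a_inv, c_h)


def design_symmetric_xtc(c_direct_t, c_cross_t, rho):
    """Return (h_LL^t = h_RR^t, h_LR^t = h_RL^t) as n-tap real filters.

    rho is a list of n per-bin regularisation values; it should satisfy
    rho[k] == rho[n - k] so that the resulting filters are real.
    """
    n = len(c_direct_t)
    c_d, c_x = dft(c_direct_t), dft(c_cross_t)
    h_d, h_x = [], []
    for k in range(n):
        h = regularised_inverse([[c_d[k], c_x[k]], [c_x[k], c_d[k]]], rho[k])
        h_d.append(h[0][0])
        h_x.append(h[0][1])

    def to_taps(spectrum):
        taps = [v.real / n for v in dft(spectrum, sign=+1)]  # n-point IDFT
        shift = n // 2
        return taps[-shift:] + taps[:-shift]                 # cyclic shift n/2

    return to_taps(h_d), to_taps(h_x)
```

The cyclic shift moves the main tap from index 0 to index n/2, so that the non-causal part of the inverse, which the IDFT wraps to the end of the buffer, is placed before the main tap; the cost is a fixed latency of n/2 samples. Because C is symmetric, only two distinct filters need to be computed.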
(101) It is further to be noted that other special cases derived from the generalised playback system are possible, e.g. same plane loudspeaker placement of non-identical speakers; orthogonal plane loudspeaker placement of identical speakers, etc. Solutions for these special cases can be easily derived from the above described solution for the generalised playback geometry case and are thus to be considered within the scope of the present invention.
(102) A block diagram of an XTC module in accordance with one embodiment of the invention is shown in
(103) In the above described embodiments it is further necessary to provide software and apparatus for the one-off XTC development.
(104) The audio recording device is connected to a PC via an audio interface, and audio playback/analysis software is used to evaluate the performance of the XTC being developed. The PC also runs an XTC generator tool which generates the XTC component filters h.sub.LL.sup.t, h.sub.RR.sup.t, h.sub.LR.sup.t, and h.sub.RL.sup.t given an input parameter vector {right arrow over (u)} as described in the previous sections. The calculated component filters h.sub.LL.sup.t, h.sub.RR.sup.t, h.sub.LR.sup.t, and h.sub.RL.sup.t can be loaded into the playback device, where they are used to preprocess the original stereo audio signal in order to cancel acoustic interference. The playback device may be implemented as a prototype board/device with a digital signal processor (DSP) used to implement the XTC. It has an analog front-end which includes a DAC, a power amplifier, and two loudspeakers (
(105) Accordingly, the process of the XTC development is as follows. For a given playback device, and for a given playback scenario (e.g. watching a music video on a smartphone), define an input parameter vector {right arrow over (u)}=[r.sub.s, r.sub.h, Δr, Γ, n, f.sub.S]. For the chosen music video playback scenario the input parameter vector may take the following values: {right arrow over (u)}=[0.13 (m), 0.5 (m), 0.175 (m), 7 (dB), 512 (taps), 48 (kHz)] (this being a special case of the same plane identical loudspeakers placement). Given this vector, the XTC generator tool running on the PC generates the XTC component filters h.sub.LL.sup.t, h.sub.RR.sup.t, h.sub.LR.sup.t, and h.sub.RL.sup.t as described in the previous section. The four 512-tap component filters are loaded into the playback device and applied to the input audio. The processed audio is played through the loudspeakers and, after propagation through the spatial channel, is registered at the left and right microphones. The analog audio signal (both channels) is then passed to the stereo recording equipment, where it is amplified, sampled, quantised and recorded into an audio file. It should be noted that the HATS is used only to imitate the impact of a human head on the acoustic channel and thus on the crosstalk cancelling characteristics. The audio file is copied to the PC and loaded into the audio playback/analysis software, where its quality is analysed both subjectively and objectively.
(106) Sensitivity of the developed XTC performance to a listener's head position can be assessed by applying an (X, Y, Φ) displacement to the HATS using the moving platform. The process of playback, recording, and performance evaluation is then performed as specified above. In order to develop an XTC with different properties, for example for a different use mode, the vector {right arrow over (u)} is adjusted and the process of XTC development and performance assessment is repeated. Thus more than one XTC may be developed and stored in the playback device in respect of more than one use mode, with the appropriate XTC to use at any given time being defined simply by the use mode of the device.
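One way to hold the input parameter vector and to select a stored design by use mode is sketched below. The field names, the class name, and the "music_video" key are hypothetical labels introduced for illustration; the numeric values are those given above for the music video playback scenario.

```python
from typing import NamedTuple


class XtcDesignParameters(NamedTuple):
    """Input parameter vector u = [r_s, r_h, delta_r, Gamma, n, f_S]."""
    r_s: float        # metres
    r_h: float        # metres
    delta_r: float    # metres
    gamma_db: float   # spectral coloration level, dB
    n_taps: int       # filter length, taps
    f_s_hz: float     # sampling rate, Hz


# One entry per use mode; the key name is a hypothetical label.
DESIGNS_BY_USE_MODE = {
    "music_video": XtcDesignParameters(0.13, 0.5, 0.175, 7.0, 512, 48_000.0),
}


def parameters_for(use_mode):
    """Look up the design parameters associated with a use mode."""
    return DESIGNS_BY_USE_MODE[use_mode]
```

In a deployed device the table would more plausibly map each use mode to the precomputed filter coefficients rather than to the design parameters, since the design itself is a one-off process.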
(107) It is to be appreciated that the method and device described herein may embody the present invention in software or firmware held by any suitable computer-readable storage medium including non-transitory media, and may be executed by a general purpose processor or an application specific processor such as a digital signal processor.
(108) It will be appreciated by persons skilled in the art that numerous variations and/or modifications may be made to the invention as shown in the specific embodiments without departing from the spirit or scope of the invention as broadly described. The present embodiments are, therefore, to be considered in all respects as illustrative and not limiting or restrictive.