Sound reproduction systems
09961468 · 2018-05-01
Cpc classification
H04S2420/01
ELECTRICITY
H04R1/26
ELECTRICITY
H04S5/00
ELECTRICITY
H04S2400/05
ELECTRICITY
H04R2205/024
ELECTRICITY
International classification
H04S5/00
ELECTRICITY
H04R1/26
ELECTRICITY
Abstract
A sound reproduction system includes an electro-acoustic transducer and a transducer drive for driving the electro-acoustic transducer. The transducer drive includes a filter which is configured to reproduce at a listener's location an approximation to the local sound field that would be present at the listener's ears in recording space, taking into account the characteristics and intended position of the electro-acoustic transducer relative to the listener's ears. The electro-acoustic transducer includes a first sound emitter which provides an intermediate sound emission channel, and second and third sound emitters providing respective left and right sound emission channels. The first sound emitter is located intermediate of the second and third sound emitters. Higher frequencies from at least one of the second and third sound emitters are transmitted closer to the first sound emitter while lower frequencies are transmitted away from the first sound emitter.
Claims
1. A sound reproduction system comprising: an electro-acoustic transducer; and a transducer drive for driving the electro-acoustic transducer in response to an input sound recording, the transducer drive comprising a filter which is configured to reproduce at a listener location an approximation to the local sound field that would be present at a listener's ears in recording space, taking into account characteristics and an intended position of the electro-acoustic transducer relative to the ears of the listener, the electro-acoustic transducer comprising a central sound emitter which provides a central sound emission channel, a left sound emitter which provides a left sound emission channel and a right sound emitter which provides a right sound emission channel, the central sound emitter being located intermediate of left sound emitter and the right sound emitter, the central sound emitter, the left sound emitter and the right sound emitter each arranged to emit a range of frequencies, and both of the left sound emitter and the right sound emitter being such that different frequencies of the range are emitted from different respective azimuthal positions in a frequency distributed arrangement wherein predominantly higher frequencies of the range are transmitted closer to the central sound emitter and predominantly lower frequencies of the range are transmitted away from the central sound emitter, and the central sound emitter arranged to emit said range of frequencies emitted by one or both of the left sound emission channel and the right sound emission channel from substantially a single azimuthal location, as opposed to the frequency distributed arrangement of both of said left sound emission channel and said right sound emission channel.
2. The sound reproduction system as claimed in claim 1 in which at least one of the left sound emitter and the right sound emitter is positioned over a respective azimuthal span or region, and portions of at least one of (i) the left sound emitter and (ii) the right sound emitter having different azimuth directions emit predominantly different frequencies of sound, or predominantly different ranges of frequencies of sound.
3. The sound reproduction system as claimed in claim 1 in which at least one of the left sound emitter and the right sound emitter comprises a plurality of different positioned sound emitter devices, and in use each sound emitter device emitting a respective predominant frequency or a predominant range of frequencies of sound.
4. The sound reproduction system as claimed in claim 1 in which the central sound emitter is provided substantially central of the left sound emitter and the right sound emitter.
5. The sound reproduction system as claimed in claim 1 in which the central sound emitter is located rearwardly of the left sound emitter and the right sound emitter.
6. The sound reproduction system as claimed in claim 1 in which the central sound emitter provides a substantially non-variable frequency output with respect to the spatial extent of the central sound emitter, wherein the frequency substantially does not vary with azimuthal position and a range of frequencies is configured to be emitted therefrom.
7. The sound reproduction system as claimed in claim 1 in which one of the left sound emitter and the right sound emitter provides a substantially non-variable frequency output with respect to the spatial extent of the left sound emitter and the right sound emitter, wherein the frequency substantially does not vary with azimuthal position and a range of frequencies is configured to be emitted therefrom.
8. The sound reproduction system as claimed in claim 1 in which the head related transfer functions of a listener are taken into account.
9. The sound reproduction system as claimed in claim 1 in which the operational transducer frequency/azimuth range is determined by an equation of the form
where $0 < n < 4$, $f$ is the frequency, $c_0$ is the speed of sound, and $r$ is the equivalent distance between the ears.
10. The sound reproduction system as claimed in claim 9 where 0<n<3.9.
11. A sound reproduction system as claimed in claim 9 where 0<n<3.7.
12. The sound reproduction system as claimed in claim 9 where 0.1<n<3.9.
13. The sound reproduction system as claimed in claim 9 where 0.3<n<3.7.
14. The sound reproduction system as claimed in claim 1 in which the transducer drive comprises cross-over filters for distributing signals of the appropriate frequency range to the appropriate sets of sound emitters, the cross-over filters responding to the outputs of an inverse filter of said filter.
15. The sound reproduction system as claimed in claim 1 in which the transducer drive comprises cross-over filters for distributing signals of the appropriate frequency range to the appropriate sets of sound emitters, with an inverse filter of said filter being responsive to the outputs of the cross-over filters.
16. The sound reproduction system as claimed in claim 1, in which the filter may be configured to be a minimum norm solution of the inverse problem.
17. The sound reproduction system as claimed in claim 1, in which the filter is configured to be a pseudoinverse filter.
18. The sound reproduction system as claimed in claim 1, in which the filter is configured to comprise adaptive filters.
19. The sound reproduction system as claimed in claim 1 comprising sub-woofers for responding to very low audio frequencies.
20. The sound reproduction system as claimed in claim 1, in which the number of sound emitter devices for the central sound emitter, the left sound emitter, and the right sound emitter comprise a different number of sound emitter devices to each other.
21. The sound reproduction system as claimed in claim 1, in which the central sound emitter comprises a single sound emitter device without any cross-over filters.
22. The sound reproduction system as claimed in claim 1 comprising a conventional loudspeaker for reproducing sound in a conventional method.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
(1) Various embodiments of the invention will now be described, by way of example only, together with a more detailed presentation of prior art arrangements with reference to the accompanying drawings, which show:
DETAILED DESCRIPTION OF EXEMPLARY EMBODIMENTS
(27) The principle of binaural reproduction over loudspeaker is described below and is illustrated in
$$w = Cv \qquad (1)$$
where C is the plant matrix (a matrix of transfer functions between sources and receivers). The two signals to be synthesised at the receivers are defined by the elements of the complex vector $d = [d_1(j\omega)\; d_2(j\omega)]^T$. In the case of audio applications, these signals are usually the signals that would produce a desired virtual auditory sensation when fed to the two ears independently. They can be obtained, for example, by recording sound source signals u with a recording head (e.g. a dummy head) or by filtering the signals u with a matrix of synthesised binaural filters A.
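(28) Therefore, a filter matrix H which contains inverse filters is introduced (the inverse filter matrix) so that v=Hd where
(29) $$H = \begin{bmatrix} H_{11}(j\omega) & H_{12}(j\omega) \\ H_{21}(j\omega) & H_{22}(j\omega) \end{bmatrix} \qquad (2)$$
and thus
$$w = CHd \qquad (3)$$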
(30) The inverse filter matrix H can be designed so that the vector w is a good approximation to the vector d with a certain delay [14][15]. When the independent control at two receivers is perfect, CH becomes the identity matrix I. The inverse filter matrix H can also be designed to be a pseudoinverse of the plant matrix C. The filter matrix H can also consist of adaptive filters.
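By way of illustration only, the relations v=Hd and w=CHd can be checked numerically at a single frequency. The sketch below assumes NumPy and uses a hypothetical plant matrix and desired signal vector whose values are chosen purely for the example; it takes H as the pseudoinverse of C, which is one of the design options mentioned above.

```python
import numpy as np

# Hypothetical 2x2 plant matrix C at one frequency (illustrative values only).
C = np.array([[1.0 + 0.0j, 0.6 - 0.3j],
              [0.6 - 0.3j, 1.0 + 0.0j]])

# Inverse filter matrix taken as the Moore-Penrose pseudoinverse of the plant.
H = np.linalg.pinv(C)

# Hypothetical desired binaural signals d at the two ears.
d = np.array([1.0 + 0.0j, 0.2 + 0.1j])

v = H @ d  # transducer drive signals, v = H d
w = C @ v  # signals reproduced at the ears, w = C H d

print("CH is the identity:", np.allclose(C @ H, np.eye(2)))
print("w matches d:", np.allclose(w, d))
```

For a square, well-conditioned plant such as this one the pseudoinverse coincides with the exact inverse, so CH is the identity matrix to within rounding.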
(31) However, the system inversion involved gives rise to a number of problems such as, for example, loss of dynamic range and sensitivity to errors. A simple case involving the control of two monopole receivers with two monopole transducers (sources) under free field conditions is first considered here. The fundamental problems with regard to system inversion can be illustrated in this simple case. The geometry is illustrated in
(32) In the free field case, the plant transfer function matrix can be modelled as
(33)
where an $e^{j\omega t}$ time dependence is assumed with $k = \omega/c_0$, and where $\rho_0$ and $c_0$ are the density and sound speed.
(34) Now consider the case
(35)
i.e., the desired signals are the acoustic pressure signals which would have been produced by the closer sound source and whose values are either $D_1(j\omega)$ or $D_2(j\omega)$, without disturbance due to the other source (cross-talk). In this way, the effect of system inversion can be separated from the effects of spherical attenuation due to propagation in space, as well as ensuring a causal solution. The elements of H can be obtained from the exact inverse of C, and the magnitudes of the elements of H ($|H_{mn}(j\omega)|$) show the necessary amplification of the desired signals produced by each inverse filter in H. The maximum amplification of the source strengths can be found from the 2-norm of H (denoted as $\|H\|$), which is the largest of the singular values of H, where these singular values are denoted by $\sigma_o$ and $\sigma_i$ [13]. Thus
$$\|H\| = \max(\sigma_o, \sigma_i) \qquad (6)$$
$\sigma_o$ corresponds to the amplification factor of the out-of-phase component of the desired signals and $\sigma_i$ corresponds to the amplification factor of the in-phase component of the desired signals. Plots of $\sigma_o$, $\sigma_i$, and $\|H\|$ with respect to frequency are illustrated in
(36)
(37) The singular value $\sigma_o$ has peaks at n=0, 4, 8, . . . where the system has difficulty in reproducing the out-of-phase component of the desired signals, and $\sigma_i$ has peaks at n=2, 6, 10, . . . where the system has difficulty in reproducing the in-phase component. Around these frequencies, sound signals from the control sources interfere destructively with each other, leaving little response at the ears of the listener. In other words, the signals cancel each other. Therefore, the solution for the inverse, i.e., the amplification required to produce the desired sound pressure at each receiver, becomes very large.
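The periodic behaviour of the two singular values can be sketched numerically. The example below assumes a normalised free-field plant of the form [[1, g·e^{-jx}], [g·e^{-jx}, 1]] with x = nπ/2; the model, the path-length ratio `g`, the phase `x` and the function name are assumptions introduced for this illustration and are not taken from the equations of the description.

```python
import numpy as np

def inverse_singular_values(n, g=0.9):
    """Return (sigma_o, sigma_i) of H = C^-1 for the assumed normalised plant."""
    x = n * np.pi / 2.0              # assumed inter-path phase for index n
    cross = g * np.exp(-1j * x)      # cross-talk path relative to the direct path
    C = np.array([[1.0, cross], [cross, 1.0]])
    H = np.linalg.inv(C)
    # For this symmetric plant the singular vectors are the out-of-phase and
    # in-phase unit vectors, so applying H to each gives the labelled values.
    out_of_phase = np.array([1.0, -1.0]) / np.sqrt(2.0)
    in_phase = np.array([1.0, 1.0]) / np.sqrt(2.0)
    sigma_o = np.linalg.norm(H @ out_of_phase)
    sigma_i = np.linalg.norm(H @ in_phase)
    return sigma_o, sigma_i

for n in range(0, 9):
    sigma_o, sigma_i = inverse_singular_values(n)
    print(f"n={n}: sigma_o={sigma_o:.3f}, sigma_i={sigma_i:.3f}, "
          f"norm={max(sigma_o, sigma_i):.3f}")
```

In this idealised model the out-of-phase value is largest at n=0, 4, 8, the in-phase value is largest at n=2, 6, and the two are equal at odd n, which agrees with the behaviour described above.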
FUNDAMENTAL PROBLEMS OF PRIOR ART SYSTEMS BEFORE OPTIMAL SOURCE DISTRIBUTION
3.1 Loss of Dynamic Range
(38) In practice, since the maximum source output is given by $\|H\|_{\max}$, this must be within the range of the system in order to avoid clipping of the signals. The required amplification results directly in the loss of dynamic range illustrated in output levels and dynamic range of the systems are the same. Where $\|H\|$ is large, each transducer emits a very high sound level, most of which is cancelled by the sound from the other transducers. As a result, the levels of the synthesised binaural signals at the listener's ears are significantly smaller than those without cancellation. The given dynamic range is divided between the system inversion and the remaining dynamic range that is to be used by the binaural auditory space synthesis and, most importantly, by the sound source signal itself. Thus the signal to noise ratio of the signals w becomes low. Since the transducers are working much harder than they normally would to produce usual sound levels at the ears, non-linear distortion becomes more significant and is often audible. For the same reason, fatigue of the transducers is more severe. Conventional driver units are not designed to be used in this manner and they can be easily destroyed by fatigue.
(39) Eq. (1) implies that the system inversion (which determines v and leads to the design of the filter matrix H) is very sensitive to small errors in the assumed plant C (which is often measured, so small errors are inevitable) where the condition number of C, $\kappa(C)$, is large. In addition, the reproduced signals w are less robust to small changes in the real plant matrix C where $\kappa(C)$ is large.
(40) The condition number of C is shown in
(41) The calculated inverse filter matrix H is likely to contain large errors due to small errors in the assumed plant matrix C, and this results in large errors in the reproduced signal w at the receiver. This is because such errors are magnified by the inverse filters but are not cancelled in the plant. Even if H does not contain any errors, the reproduction of the signals at the receiver is too sensitive to small errors within the real plant matrix C to be useful.
(42) Such errors include individual differences of HRTFs [16]-[18], misalignment of the head and loudspeakers [19], approximation of filters, and regularisation, where a small error is deliberately introduced to improve the conditioning of the matrix in order to design practical filters [20]. These errors may seem small, but they are far too large in practice where $\kappa(C)$ is large.
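The trade-off involved in regularisation can be illustrated with a short sketch. The description does not say which regularisation scheme is meant, so the example below swaps in Tikhonov regularisation as one common choice; the plant values, the parameter `beta` and the function name are assumptions made for this example.

```python
import numpy as np

def regularised_inverse(C, beta):
    """Tikhonov-regularised inverse, H = (C^H C + beta I)^-1 C^H."""
    Ch = C.conj().T
    return np.linalg.solve(Ch @ C + beta * np.eye(C.shape[1]), Ch)

# Hypothetical ill-conditioned plant: cross-talk almost as strong as the direct path.
C = np.array([[1.0, 0.9], [0.9, 1.0]], dtype=complex)

H_exact = np.linalg.inv(C)
H_reg = regularised_inverse(C, beta=0.01)

gain_exact = np.linalg.norm(H_exact, 2)               # largest singular value of H
gain_reg = np.linalg.norm(H_reg, 2)
error_reg = np.linalg.norm(C @ H_reg - np.eye(2), 2)  # deviation of CH from I

print(f"exact inverse: gain {gain_exact:.2f}")
print(f"regularised:   gain {gain_reg:.2f}, reproduction error {error_reg:.2f}")
```

The regularised filter needs less amplification than the exact inverse, but CH is no longer the identity matrix, which is the deliberately introduced error referred to above.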
(43) By contrast, $\kappa(C)$ is small around the frequencies where n is an odd integer in Eq. (7). Around these frequencies, a practical and close to ideal inverse filter matrix H is easily obtained, and accurate reproduction of the intended sound signal is possible.
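The dependence of the condition number on n can be sketched in the same way. The example again assumes a normalised free-field plant [[1, g·e^{-jx}], [g·e^{-jx}, 1]] with x = nπ/2, where the model, `g` and the function name are assumptions introduced for illustration only.

```python
import numpy as np

def plant_condition_number(n, g=0.9):
    """2-norm condition number of the assumed normalised two-channel plant."""
    x = n * np.pi / 2.0              # assumed inter-path phase for index n
    cross = g * np.exp(-1j * x)      # cross-talk path relative to the direct path
    C = np.array([[1.0, cross], [cross, 1.0]])
    return np.linalg.cond(C)         # ratio of largest to smallest singular value

for n in range(0, 7):
    print(f"n={n}: condition number = {plant_condition_number(n):.2f}")
```

In this model the condition number falls to its minimum of 1 at odd n and rises to (1+g)/(1-g) at even n.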
(44)
(45) In addition, the sound radiated in directions other than that of the receiver has a peaky frequency response due to the response of the inverse filter matrix H, and this normally results in severe coloration. This contributes to coloured reverberation and makes listening in any location other than the one optimal location impractical.
(46) Equation (7) can be rewritten in terms of the source azimuth span as
(47)
(48) As seen from the analysis above, frequencies with the source span where n is an odd integer number in Eq. (8) give the best control performance as well as robustness.
(49) The Optimal Source Distribution (OSD) introduced the idea of a pair of conceptual monopole transducers whose span varies continuously as a function of frequency (
(50) As discussed above, the two-channel OSD essentially uses the frequency span region where the two singular values, representing the in-phase and out-of-phase components of the binaural reproduction process, are balanced in order to overcome the fundamental problems of conventional binaural reproduction over loudspeakers. However, a system which aims to improve this further is proposed in what follows. For convenience, we refer to it as the three channel OSD system, in contrast to the earlier OSD that will henceforth be referred to as the two channel OSD.
(51) Now we try to make use of the lowest value (6 dB, at points B in
(52) Since the condition for the in-phase component has now been relaxed, we now can use the optimal value (points B in
(53) In order to see the effect of this additional transducer, we consider the simple case again where monopole transducers are used for binaural reproduction as in section 2.2 but this time with another transducer added on the median plane. The block diagram and geometry are illustrated in
(54)
where an $e^{j\omega t}$ time dependence is assumed with $k = \omega/c_0$, and where $\rho_0$ and $c_0$ are the density and sound speed.
(55) Note that the system is under-determined in that there can be a number of choices of the inverse filter matrix which produce no error [22] [23]. Among them, the minimum norm solution would be the most straightforward choice as well as giving the best performance with regard to the fundamental problems described in Sections 3.1-3.3. Therefore, the following examples use the minimum norm solution.
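A minimal sketch of the minimum norm solution for the under-determined three-transducer case is given below. It assumes an idealised far-field plant in which the left and right paths follow the two-channel model with x = nπ/2 and the centre transducer reaches both ears equally; the model, the ratio `g`, the parameter `centre_gain` (relative sensitivity of the centre transducer) and the function names are assumptions for illustration.

```python
import numpy as np

def three_channel_plant(n, centre_gain=1.0, g=1.0):
    """Assumed 2x3 plant: rows are the ears, columns are left, centre, right."""
    x = n * np.pi / 2.0              # assumed inter-path phase for index n
    cross = g * np.exp(-1j * x)      # cross-talk path relative to the direct path
    # The common phase of the centre path is omitted; in this model it does
    # not affect the singular values.
    return np.array([[1.0, centre_gain, cross],
                     [cross, centre_gain, 1.0]])

def minimum_norm_inverse(C):
    """Minimum norm solution H = C^H (C C^H)^-1 of C H = I."""
    Ch = C.conj().T
    return Ch @ np.linalg.inv(C @ Ch)

C = three_channel_plant(n=2, centre_gain=np.sqrt(2.0))
H = minimum_norm_inverse(C)
sigma = np.linalg.svd(H, compute_uv=False)

print("CH is the identity:", np.allclose(C @ H, np.eye(2)))
print("matches pseudoinverse:", np.allclose(H, np.linalg.pinv(C)))
print("singular values of H:", sigma)
```

In this idealised model, setting `centre_gain` to the square root of 2 makes the two singular values of H equal at n=2, whereas a value of 1.0 leaves them unequal.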
(56) The 2-norm of H ($\|H\|$) and the two singular values $\sigma_o$ and $\sigma_i$ with respect to frequency are illustrated in
(57) With a third transducer for two-point reproduction (i.e. the mathematically under-determined case), the balance between the two singular values $\sigma_o$ and $\sigma_i$ can be changed independently by changing the relative sensitivity of the transducer of the centre channel with respect to those on the left and right. This is an important property which the three channel OSD possesses and the two channel OSD does not. If the sensitivity of the centre channel transducer is increased by a factor of $\sqrt{2}$, the two singular values $\sigma_o$ and $\sigma_i$ become equal to each other at n=2, 6, 10, . . . and that is shown in
(58) The singular value $\sigma_i$ at n=0, 4, 8, . . ., where all three transducers can contribute to the reproduction of the in-phase component, is always smaller than that at n=2, 6, 10, . . . . The 2-norm of H ($\|H\|$) and the two singular values $\sigma_o$ and $\sigma_i$ of the three channel OSD with respect to frequency are illustrated in
(59) The three channel OSD requires, for the transmission of the left and right channels, monopole type transducers whose position varies substantially continuously as frequency varies, similar to the case with the two channel OSD. This may, for example, be realised by exciting a substantially triangular shaped plate whose width varies along its length. The requirement of such a transducer is that a certain frequency or a certain range of frequencies of vibration is excited most at a particular position having a certain width such that sound of that frequency is radiated mostly from that position (
(60) From Eq. (7), the range of source direction is given by the frequency range of interest as can be seen from
(61) Eq. (7) can also be rewritten in terms of frequency as
(62)
(63) The smallest value of n gives the lowest frequency limit for a given source direction. Since $\sin\theta \le 1$,
(64)
i.e., the physically maximum source azimuth of $\theta_L = \theta_R = 90^\circ$ gives the low frequency limit, $f_l$, associated with this principle. A smaller value of n gives a lower low frequency limit, so the system given by n=2 is normally the most useful among those with n=2, 6, 10, . . . . The low frequency limit given by n=2 of a system designed for an average human is about $f_l = 700$ Hz, which is higher than that for the two channel OSD, where it is about 350 Hz. Below the low frequency limit of the three channel OSD, the performance gradually approaches that of the two channel OSD, becoming identical below the low frequency limit of the two channel OSD.
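By way of illustration only, the low frequency limits quoted above can be reproduced numerically. The sketch below assumes a relation of the form f = n·c0 / (4·r·sin θ), with c0 = 343 m/s and an equivalent ear distance r = 0.245 m. The relation, both numerical values and the function name `optimal_frequency` are assumptions made for this example, chosen because they are consistent with the figures of about 350 Hz and 700 Hz given above.

```python
import math

SPEED_OF_SOUND = 343.0   # m/s, assumed value for c0
EAR_DISTANCE = 0.245     # m, assumed equivalent distance between the ears (r)

def optimal_frequency(n, azimuth_deg, c0=SPEED_OF_SOUND, r=EAR_DISTANCE):
    """Frequency in Hz for index n at a source azimuth, f = n*c0 / (4*r*sin(theta))."""
    return n * c0 / (4.0 * r * math.sin(math.radians(azimuth_deg)))

# Low frequency limit: the source azimuth cannot exceed 90 degrees.
print(f"two channel (n=1) limit:   {optimal_frequency(1, 90.0):.0f} Hz")
print(f"three channel (n=2) limit: {optimal_frequency(2, 90.0):.0f} Hz")

# Narrower azimuths correspond to higher frequencies for the same n.
for azimuth in (90.0, 30.0, 10.0, 3.0):
    print(f"azimuth {azimuth:5.1f} deg -> n=2 frequency {optimal_frequency(2, azimuth):8.0f} Hz")
```

Under these assumptions the n=2 limit is exactly twice the n=1 limit, and the frequency served by a given position rises as the azimuth narrows, in line with higher frequencies being emitted closer to the centre.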
(65) In
(66) The fundamental behaviour is the same for the more realistic case where various other factors such as the Head Related Transfer Function come into effect as in the case with the two channel OSD.
(67) The discretisation of the Optimal Source Distribution can also be used for the three channel OSD in a similar way to the two channel case. In practice, whilst a monopole transducer whose position varies continuously as a function of frequency may not be easily available, it is possible to realise a practical system based on the underlying principle by discretising the transducer span. With a given span, the frequency region where the amplification is relatively small and the plant matrix C is well conditioned is relatively wide around the optimal frequency.
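The idea of serving a band of frequencies around the optimal one from a fixed transducer position can be sketched as follows. The example assumes the relation f = n·c0 / (4·r·sin θ) and lets n deviate from a centre value by a tolerance. The relation, the numerical defaults, the tolerance name `nu` and the three example azimuths are hypothetical and are not values from the description.

```python
import math

def band_for_azimuth(azimuth_deg, n_centre=2.0, nu=0.7, c0=343.0, r=0.245):
    """Return (low, high) in Hz covered by one position when n = n_centre +/- nu."""
    scale = c0 / (4.0 * r * math.sin(math.radians(azimuth_deg)))
    return (n_centre - nu) * scale, (n_centre + nu) * scale

# Hypothetical three-way discretisation of each of the left and right channels.
for azimuth in (60.0, 20.0, 6.0):
    low, high = band_for_azimuth(azimuth)
    print(f"azimuth {azimuth:4.1f} deg covers roughly {low:7.0f} Hz to {high:7.0f} Hz")
```

A wider tolerance lets each position cover a wider band, so fewer discrete positions are needed, at the cost of operating further from the optimal condition.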
(68) Therefore, by allowing n to have some width, say $\pm\nu$ ($0<\nu<2$), a certain transducer span can nevertheless be allocated to cover a certain range of frequencies where control performance and robustness of the system are still reasonably good (
(69) The difference of the slope around the ideal frequency/span relationship has advantages here again in many ways. For the same given tolerance width of n, the error will be much smaller than that in the two channel OSD. So the same level of discretisation gives a better approximation to the ideal case for the three channel OSD. For the same level of approximation, the discretisation can be coarser, hence saving resources. The maximum width of n, which is the maximum allowance for $\nu$, becomes twice that in the two channel OSD, i.e. $0<\nu<2$. In general, the performance of the discretised three channel OSD is much better due to the fact that the valley in
(70) The condition number for the case shown in
(71) Reference will now be made to
(72) Turning initially to
(80) With reference to
(81) A new binaural reproduction system has been described which overcomes the fundamental problems with system inversion by utilising three channels of transducers with variable position with respect to frequency.
(82) This system can most easily be realised in practice by discretising the theoretical continuously variable transducer span, which results in a multi-way sound control system.
(83) The three channel OSD arrangement finds application in numerous ways and in particular in the field of home audio. A particularly advantageous implementation is in the context of the transducers of portable media devices, such as mobile telephones and portable gaming devices, and so enhances the listener's experience of sound emitted thereby. Some portable media devices (such as MP3 players) are capable of being interfaced with a separate speaker arrangement (sometimes known as a docking station). Such speaker arrangements would benefit from being adapted to implement the three channel OSD arrangement.
REFERENCES
(84) [1] J. Blauert, Spatial Hearing; The Psychophysics of Human Sound Localization (MIT Press, Cambridge, Mass., 1997).
[2] H. Møller, Fundamentals of Binaural Technology, Appl. Acoust. 36, 171-218 (1992).
[3] D. R. Begault, 3-D Sound for Virtual Reality and Multimedia (AP Professional, Cambridge, Mass., 1994).
[4] M. R. Schroeder, B. S. Atal, Computer Simulation of Sound Transmission in Rooms, IEEE Intercon. Rec. Pt7, 150-155 (1963).
[5] P. Damaske, Head-related Two-channel Stereophony with Loudspeaker Reproduction, J. Acoust. Soc. Am. 50, 1109-1115 (1971).
[6] H. Hamada, N. Ikeshoji, Y. Ogura and T. Miura, Relation between Physical Characteristics of Orthostereophonic System and Horizontal Plane Localisation, Journal of the Acoustical Society of Japan, (E) 6, 143-154 (1985).
[7] J. L. Bauck and D. H. Cooper, Generalized Transaural Stereo and Applications, J. Audio Eng. Soc. 44 (9), 683-705 (1996).
[8] P. A. Nelson, O. Kirkeby, T. Takeuchi, and H. Hamada, Sound fields for the production of virtual acoustic images, J. Sound. Vib. 204 (2), 386-396 (1997).
[9] M. Miyoshi and N. Koizumi, New transaural system for teleconferencing service, Proceedings of the International Symposium on Active Control of Sound and Vibration, Acoustical Society of Japan, Apr. 9-11, (1991), Nippon-Toshi-Center, Tokyo, Japan. Pages 217-222.
[10] M. Miyoshi and Y. Kaneda, Inverse filtering of room acoustics, IEEE Transactions on Acoustics Speech and Signal Processing 36, 145-152 (1988).
[11] S. Uto, H. Hamada, T. Miura, P. A. Nelson and S. J. Elliott, Proceedings of the International Symposium on Active Control of Sound and Vibration, Acoustical Society of Japan, Apr. 9-11, (1991), Nippon-Toshi-Center, Tokyo, Japan. Pages 421-426.
[12] D. H. Cooper and J. L. Bauck, Head diffraction compensated stereo system with loudspeaker array, U.S. Pat. No. 5,333,200 (1994).
[13] T. Takeuchi and P. A. Nelson, Optimal source distribution for binaural synthesis over loudspeakers, J. Acoust. Soc. Am. 112, 2786 (2002).
[14] P. A. Nelson, F. Orduna-Bustamante, and H. Hamada, Inverse Filter Design and Equalisation Zones in Multi-Channel Sound Reproduction, IEEE Trans. Speech Audio Process. 3(3), 185-192 (1995).
[15] O. Kirkeby, P. A. Nelson, F. Orduna-Bustamante, and H. Hamada, Local Sound Field Reproduction Using Digital Signal Processing, J. Acoust. Soc. Am. 100, 1584-1593 (1996).
[16] E. M. Wenzel, M. Arruda, D. J. Kistler and F. L. Wightman, Localisation using nonindividualized head-related transfer functions, J. Acoust. Soc. Am. 94(1), 111-123 (1993).
[17] H. Møller, M. F. Sørensen, D. Hammershøi, and C. B. Jensen, Head-Related Transfer Functions on Human Subjects, J. Audio Eng. Soc. 43, 300-321 (1995).
[18] T. Takeuchi, P. A. Nelson, O. Kirkeby and H. Hamada, Influence of Individual Head Related Transfer Function on the Performance of Virtual Acoustic Imaging Systems, 104th AES Convention Preprint 4700 (P4-3), (1998).
[19] T. Takeuchi, P. A. Nelson, and H. Hamada, Robustness to Head Misalignment of Virtual Sound Imaging Systems, J. Acoust. Soc. Am. 109(3), 958-971 (2001).
[20] W. H. Press, S. A. Teukolsky, W. T. Vetterling, and B. P. Flannery, Numerical Recipes in C, Second edition (Cambridge University Press, 1992).
[21] T. Takeuchi, P. A. Nelson, O. Kirkeby and H. Hamada, The Effects of Reflections on the Performance of Virtual Acoustic Imaging Systems, pp. 955-966, in Proceedings of the Active 97, The international symposium on active control of sound and vibration, Budapest, Hungary, Aug. 21-23, (1997), OPAKFI.
[22] S. J. Elliott, C. C. Boucher, and P. A. Nelson, The Behavior of a Multiple Channel Active Control System, IEEE Trans. Signal Process. 40(5), (1992).
[23] D. J. Rossetti, M. R. Jolly, and S. C. Southward, Control Effort Weighting in Feedforward Adaptive Control Systems, J. Acoust. Soc. Am. 99(5), (1996).