GENERATING AN AUDIO SIGNAL ASSOCIATED WITH A VIRTUAL SOUND SOURCE
20230017323 · 2023-01-19
Inventors
Cpc classification
H04S5/005
ELECTRICITY
H04S7/302
ELECTRICITY
H04S2400/11
ELECTRICITY
H04S7/30
ELECTRICITY
International classification
H04S7/00
ELECTRICITY
Abstract
A method for generating an audio signal associated with a virtual sound source is disclosed. The method comprises obtaining an input audio signal x(t) and modifying the input audio signal x(t) to obtain a modified audio signal. The latter step comprises performing a signal delay operation. Optionally, modifying the input audio signal comprises a signal inverting operation and/or a signal amplification or attenuation and/or a signal feedback operation. The method further comprises generating the audio signal y(t) based on a combination, e.g. a summation, of the input audio signal x(t) and the modified audio signal.
Claims
1. A method for generating an audio signal y(t) associated with a virtual sound source, the method comprising either (i) obtaining an input audio signal x(t), and modifying the input audio signal x(t) to obtain a modified audio signal using a signal delay operation introducing a time delay; and generating the audio signal y(t) based on a combination of the input audio signal x(t), or of an inverted and/or attenuated or amplified version of the input audio signal x(t), and the modified audio signal, or the method comprising (ii) obtaining an input audio signal x(t), and generating the audio signal y(t) based on a signal feedback operation that recursively adds a modified version of the input audio signal x(t) to itself, wherein the signal feedback operation comprises a signal delay operation introducing a time delay.
2. The method according to claim 1, wherein the virtual sound source has a shape, the method comprising generating audio signal components associated with respective virtual points on the shape of the virtual sound source, wherein said generating audio signal components comprises generating a first audio signal component associated with a first virtual point on the shape of the virtual sound source and a second audio signal component associated with a second virtual point on the shape of the virtual sound source, wherein either (i) generating the first audio signal component comprises modifying the input audio signal to obtain a modified first audio signal component using a first signal delay operation introducing a first time delay and comprises generating the first audio signal component based on a combination of the input audio signal or of an inverted and/or attenuated or amplified version of the input audio signal x(t), and the modified first audio signal component, or wherein (ii) generating the first audio signal component comprises using a feedback loop that recursively adds a modified version of the input audio signal x(t) to itself, wherein the feedback loop comprises a signal delay operation introducing a first time delay and a signal inverting operation, and wherein either (i) generating the second audio signal component comprises modifying the input audio signal to obtain a modified second audio signal component using a second signal delay operation introducing a second time delay different from the first time delay and comprises generating the second audio signal component based on a combination of the input audio signal or of an inverted and/or attenuated or amplified version of the input audio signal x(t), and the modified second audio signal component, or wherein (ii) generating the second audio signal component comprises using a feedback loop that recursively adds a modified version of the input audio signal x(t) to itself, wherein the feedback loop comprises a signal delay operation introducing a second time delay and a signal inverting operation.
3. The method according to claim 2, comprising obtaining shape data representing virtual positions of the respective virtual points on the shape of the virtual sound source, and determining the first resp. second time delay based on the virtual position of the first resp. second virtual point.
4. The method according to claim 1, wherein the virtual sound source has a distance from an observer, the method comprising modifying the input audio signal using a time delay operation introducing a time delay and a signal feedback operation to obtain a first modified audio signal; generating a second modified audio signal based on a combination of the input audio signal x(t) and the first modified audio signal; and generating the audio signal y(t) based on the second modified audio signal, this step comprising attenuating the second modified audio signal.
5. The method according to claim 1, wherein the virtual sound source has a distance from an observer, the method comprising modifying the input audio signal to obtain a first modified audio signal using a signal feedback operation that recursively adds a modified version of the input audio signal to itself, wherein the signal feedback operation comprises a signal delay operation introducing a time delay, generating the audio signal y(t) based on the first modified audio signal, this step comprising a signal attenuation.
6. The method according to claim 4, wherein the introduced time delay is shorter than 0.00007 seconds, preferably shorter than 0.00005 seconds, more preferably shorter than 0.00002 seconds, most preferably approximately 0.00001 seconds.
7. The method according to claim 4, comprising attenuating the second modified audio signal in dependence of distance of the virtual sound source.
8. The method according to claim 7, wherein the signal feedback operation comprises attenuating a signal, and recursively adding the attenuated signal to the signal itself, the method further comprising controlling a degree of attenuation in the signal feedback operation and a degree of attenuation of the second modified audio signal in dependence of said distance, such that the larger the distance is, the lower the degree of attenuation in the signal feedback operation and the higher the degree of attenuation of the second modified audio signal.
9. The method according to claim 7, wherein modifying the input audio signal to obtain the first modified audio signal comprises a particular signal attenuation, the method comprising controlling a degree of attenuation of the particular signal attenuation and the degree of attenuation of the second modified audio signal in dependence of said distance, such that the larger the distance is, the lower the degree of attenuation of the particular signal attenuation and the higher the degree of attenuation of the second modified audio signal.
10. The method according to claim 1, wherein the virtual sound source is positioned at a virtual height above an observer, the method comprising modifying the input audio signal x(t) using a signal inverting operation, a signal attenuation operation and a time delay operation introducing a time delay in order to obtain a third modified audio signal, and generating the audio signal based on a combination of the input audio signal and the third modified audio signal.
11. The method according to claim 10, wherein modifying the input audio signal to obtain the third modified audio signal comprises performing a signal feedback operation.
12. The method according to claim 10, wherein said signal attenuation operation for obtaining the third modified audio signal is performed in dependence of the virtual height of the virtual sound source.
13. The method according to claim 12, wherein said signal attenuation operation is performed such that the higher the virtual sound source is positioned above the observer, the lower the degree of attenuation is.
14. The method according to claim 10, wherein the time delay that is introduced for obtaining the third modified audio signal is shorter than 0.00007 seconds, preferably shorter than 0.00005 seconds, more preferably shorter than 0.00002 seconds, most preferably approximately 0.00001 seconds.
15. The method according to claim 1, wherein the virtual sound source is positioned at a virtual depth below an observer, the method comprising modifying the input audio signal x(t) using a time delay operation introducing a time delay, a first signal attenuation operation and a signal feedback operation in order to obtain a sixth modified audio signal; and generating the audio signal based on a combination of the input audio signal and the sixth modified audio signal.
16. The method according to claim 1, wherein the virtual sound source is positioned at a virtual depth below an observer, the method comprising generating the audio signal y(t) using a signal feedback operation that recursively adds a modified version of the input audio signal to itself, wherein the signal feedback operation comprises a signal delay operation introducing a time delay and a first signal attenuation operation.
17. The method according to claim 1, wherein the virtual sound source is positioned at a virtual depth below an observer, the method comprising modifying the input audio signal to obtain a sixth modified audio signal using a signal feedback operation that recursively adds a modified version of the input audio signal to itself, wherein the signal feedback operation comprises a signal delay operation introducing a time delay and a first signal attenuation, and generating the audio signal based on a combination of the sixth modified audio signal and a time-delayed and attenuated version of the sixth modified audio signal.
18. The method according to claim 15, wherein the introduced time delay for obtaining the sixth modified audio signal is shorter than 0.00007 seconds, preferably shorter than 0.00005 seconds, more preferably shorter than 0.00002 seconds, most preferably approximately 0.00001 seconds.
19. The method according to claim 15, wherein performing the signal feedback operation comprises recursively adding an attenuated version of a signal to itself.
20. The method according to claim 15, wherein the first signal attenuation operation is performed in dependence of the virtual depth of the virtual sound source below the observer.
21. The method according to claim 20, wherein said first signal attenuation operation is performed such that the lower the virtual sound source is positioned below the observer, the lower the attenuation is.
22. The method according to claim 1, further comprising receiving a user input indicative of a shape of the virtual sound source, and/or indicative of respective virtual positions of virtual points on the shape of the virtual sound source, and/or indicative of a distance between the virtual sound source and an observer, and/or indicative of a height at which the virtual sound source is positioned above the observer, and/or indicative of a depth at which the virtual sound source is positioned below the observer.
23. The method according to claim 1, further comprising generating a user interface enabling a user to input at least one of: a shape of the virtual sound source, respective virtual positions of virtual points on the shape of the virtual sound source, a distance between the virtual sound source and an observer, a height at which the virtual sound source is positioned above the observer, a depth at which the virtual sound source is positioned below the observer.
24. A computer comprising a computer readable storage medium having computer readable program code embodied therewith, and a processor, preferably a microprocessor, coupled to the computer readable storage medium, wherein responsive to executing the computer readable program code, the processor is configured to perform the method according to claim 1.
25. A computer program or suite of computer programs comprising at least one software code portion or a computer program product storing at least one software code portion, the software code portion, when run on a computer system, being configured for executing the method according to claim 1.
26. A non-transitory computer-readable storage medium storing at least one software code portion, the software code portion, when executed or processed by a computer, is configured to perform the method according to claim 1.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0066] Aspects of the invention will be explained in greater detail by reference to exemplary embodiments shown in the drawings.
[0067]-[0088] (Descriptions of the individual figures are not reproduced in this text.)
DETAILED DESCRIPTION OF THE DRAWINGS
[0089] Sound waves inherently carry detailed information about the environment, and about the observer of sound within the environment. This disclosure describes a sound wave transformation (spatial wave transform, or SWT): a method for generating an audio signal that is perceived to have spatially coherent properties with regard to the dimensional size and shape of the reproduced sound source, its distance relative to the observer, its height above or depth below the observer, and its directionality if the source is moving towards or away from the observer.
[0090] Typically, the spatial wave transform is an algorithm executed by a computer, with a digital audio signal (e.g. a digital recording) as input and one or more modified audio signals as output, which can be played back on conventional audio playback systems. Alternatively, the transform could also be applied using analogue (non-digital) means of generating and/or processing audio signals. Playing back the modified audio signal(s) will give the observer an improved perception of the dimensional size and shape of the reproduced sound source (for instance, a recorded signal of a violin will sound as if the violin is physically present) and of the sound source's spatial distance, height and depth in relation to the observer (for instance, the violin sounds at a distinct distance from the listener, and at a height above or depth below), while masking the physical properties of the sound output medium, i.e. the loudspeaker(s) (that is, the violin does not sound as if it is coming from a speaker).
[0091]
[0092] The input audio signal x(t) may have been output by a recording process in which sounds have been recorded and optionally converted into a digital signal. In an example, a musical instrument, such as a violin, has been recorded in a studio to obtain the audio signal that is input for the method for generating the audio signal as described herein.
[0093] The input audio signal x(t) is subsequently modified to obtain a modified audio signal. The signal modification comprises a signal delay operation 4 and/or a signal inverting operation 6 and/or a signal amplification or attenuation 8 and/or a signal feedback operation 10, 12.
[0094] The signal delay operation 4 may be performed using well-known components, such as a delay line. The signal inverting operation 6 may be understood as inverting a signal such that an input signal x(t) is converted into −x(t). The amplification or attenuation 8 may be a linear amplification or attenuation, which may be understood as amplifying or attenuating a signal by a constant factor a, such that a signal x(t) is converted into a·x(t).
[0095] The signal feedback operation may be understood to comprise recursively combining a signal with an attenuated version of itself. This is schematically depicted by the attenuation operation 12 that sits in the feedback loop and the combining operation 10. Decreasing the attenuation, i.e. enlarging constant b in
[0096] Herewith, the response of different materials to vibrations can be simulated based on their density and stiffness. For instance, the response of a metal object will generate a higher Q-factor than an object of the same size and shape made out of wood.
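As a rough illustration of the signal feedback operation, the sketch below implements a recursive delay loop in plain Python. The loop topology y[n] = x[n] + b·y[n − D], the sample-domain formulation and all function names are assumptions made for illustration only; the figure that defines the actual loop is not reproduced in this text. Under these assumptions, enlarging b (i.e. decreasing the attenuation in the loop) increases the ratio between the gain at a resonance and the gain midway between resonances, which is one way to read a higher Q-factor.

```python
import cmath
import math


def feedback_delay(x, delay_samples, b):
    """Recursively add an attenuated, delayed copy of the output to the input.

    Assumed topology (not taken from the source figure):
        y[n] = x[n] + b * y[n - delay_samples], with |b| < 1 for stability.
    """
    if delay_samples < 1:
        raise ValueError("delay_samples must be at least 1")
    y = []
    for n, sample in enumerate(x):
        fed_back = b * y[n - delay_samples] if n >= delay_samples else 0.0
        y.append(sample + fed_back)
    return y


def feedback_gain(freq_hz, delay_s, b):
    """Magnitude response of the assumed loop at a single frequency."""
    phase = -2.0 * math.pi * freq_hz * delay_s
    return 1.0 / abs(1.0 - b * cmath.exp(1j * phase))


def peak_to_trough(delay_s, b):
    """Gain at a resonance (f = 1/delay) divided by the gain midway between resonances."""
    peak = feedback_gain(1.0 / delay_s, delay_s, b)
    trough = feedback_gain(0.5 / delay_s, delay_s, b)
    return peak / trough
```

For the assumed loop the ratio equals (1 + b)/(1 − b), so it grows without bound as b approaches 1.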
[0097] The combining operations 10 and 14 may be understood to combine two or more signals {x.sub.1(t), . . . , x.sub.n(t)}. The input signals may be converted into a signal y(t) as follows.
[0098] In
[0099] The transformation of the input audio signal x(t) to the audio signal y(t) may be referred to hereinafter as the Spatial Wave Transform (SWT).
[0100] The method for generating the audio signal y(t) does not require finite computational methods, such as methods involving Fast Fourier Transforms, which may limit the achievable resolution of the generated audio signal. Thus, the method disclosed herein enables high-resolution audio signals to be formed. Herein, high-resolution may be understood as a signal with spectral modifications for an infinite number of frequency components. The virtually infinite resolution is achieved because the desired spectral information does not need to be computed and modified for each individual frequency component, as would be the case in convolution or simulation models. Instead, the desired spectral modification of frequency components results from a simple summation, i.e. wave interference, of two identical audio signals with a specific time delay, amplitude and/or phase difference. This operation results in phase and amplitude differences for each frequency component in harmonic ratios, i.e. corresponding to the spectral patterns caused by resonance. The time delays relevant to the method are typically between 0.00001 and 0.02 seconds, although longer delays are not excluded.
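The harmonic pattern produced by summing a signal with a delayed copy of itself can be illustrated with a short sketch. The code below is a minimal example in plain Python, assuming a sample-domain delay and a gain of 1 for the notch positions; the function names are hypothetical and not part of the disclosure. For a delay Δt, the plain sum cancels the frequencies 1/(2Δt), 3/(2Δt), 5/(2Δt), and so on, whereas the sum with the inverted delayed copy cancels 0, 1/Δt, 2/Δt, and so on.

```python
import cmath
import math


def combine_with_delayed_copy(x, delay_samples, gain=1.0, invert=False):
    """Return y[n] = x[n] + s * gain * x[n - delay_samples], s = -1 if invert else +1."""
    sign = -1.0 if invert else 1.0
    y = []
    for n, sample in enumerate(x):
        delayed = x[n - delay_samples] if n >= delay_samples else 0.0
        y.append(sample + sign * gain * delayed)
    return y


def combined_gain(freq_hz, delay_s, gain=1.0, invert=False):
    """Magnitude of the sum of a sinusoid and its delayed (optionally inverted) copy."""
    sign = -1.0 if invert else 1.0
    return abs(1.0 + sign * gain * cmath.exp(-2j * math.pi * freq_hz * delay_s))


def notch_frequencies(delay_s, count, invert=False):
    """First `count` frequencies that cancel completely when gain == 1."""
    if invert:
        return [k / delay_s for k in range(count)]
    return [(2 * k + 1) / (2.0 * delay_s) for k in range(count)]
```

With a delay of 0.001 seconds, for example, the non-inverted sum has its first notches near 500 Hz, 1500 Hz and 2500 Hz, and doubles the amplitude at 1000 Hz.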
[0101] The generated audio signal y(t) may be presented to an observer through a conventional audio output medium, e.g. one or more loudspeakers. The generated audio signal may be delayed in time and/or attenuated before being output to the audio output medium.
[0102]
[0103] Further.
[0104] Herein,
[0105] In
[0106] The embodiment of
[0107] The embodiment of
[0108]
[0109]
[0110] It should be appreciated that the embodiments of
[0111]
[0112]
[0113]
[0114] These figures show that the spectrum of an audio signal can be modified precisely according to harmonic ratios, using a very simple operation.
[0115]
[0116]
[0117]
[0118] Optionally, the modification also comprises a signal feedback operation 18.sub.1-18.sub.n, but this is not required for adding the dimensional information of the virtual sound source to the audio signal. The depicted embodiment shows that each audio signal component y.sub.n(t) may be the result of a summation of the input audio signal x(t) and the inverted, time-delayed input audio signal. While
[0119] For a string shaped virtual sound source of 1 meter long, the time differences for 17 equidistant positioned virtual points on the string may be as follows:
TABLE-US-00001
 n    Δt (s)
 1    0.00000
 2    0.00036
 3    0.00073
 4    0.00109
 5    0.00146
 6    0.00182
 7    0.00219
 8    0.00255
 9    0.00292
10    0.00255
11    0.00219
12    0.00182
13    0.00146
14    0.00109
15    0.00073
16    0.00036
17    0.00000
[0120] These values for the introduced time delays are in accordance with Δt.sub.n=Lx.sub.n/v, wherein L indicates the length of the string, x.sub.n denotes a multiplication factor for virtual point n, and v relates to the speed of sound through a medium. For the values in the table, a value of 343 m/s was used, which is the velocity of sound waves moving through air at 20 degrees Celsius. A virtual point may be understood to be positioned on a line segment that runs from the center of the virtual sound source, e.g. the center of a string, plate or cube, to an edge of the virtual sound source. As such, the virtual point may be understood to divide the line segment into two parts, namely a first part of the line segment that runs between an end of the virtual sound source and the virtual point and a second part of the line segment that runs between the virtual point and the center of the virtual sound source. The multiplication factor may be equal to the ratio between the length of the line segment's first part and the length of the whole line segment. Accordingly, if the virtual point is positioned at an end of the sound source, the multiplication factor is zero and if the virtual point is positioned at the center of the virtual sound source, the multiplication factor is one. Thus, with these values, a user will perceive the generated audio signal as originating from a string-shaped sound source that is one meter in length, whereas the loudspeakers need not be spatially arranged in a particular manner.
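The tabulated delays for the string can be reproduced with a few lines of code. The sketch below evaluates Δt.sub.n=Lx.sub.n/v under the assumption that the multiplication factor rises linearly from 0 at either end of the string to 1 at its center, which is consistent with the values in the table for 17 equidistant points. The function names and the restriction to an odd number of points are illustrative choices, not part of the disclosure.

```python
SPEED_OF_SOUND_AIR = 343.0  # m/s in air at 20 degrees Celsius, as used for the table


def string_multiplication_factors(num_points):
    """Factor x_n for equidistant points on a string.

    Assumption: x_n rises linearly from 0 at either end to 1 at the center.
    An odd number of points is required so that one point lies at the center.
    """
    if num_points < 3 or num_points % 2 == 0:
        raise ValueError("expects an odd number of points, at least 3")
    centre = (num_points - 1) // 2
    return [1.0 - abs(i - centre) / centre for i in range(num_points)]


def string_time_delays(length_m, num_points, speed=SPEED_OF_SOUND_AIR):
    """Time delay per virtual point: delta_t_n = L * x_n / v."""
    factors = string_multiplication_factors(num_points)
    return [length_m * x / speed for x in factors]


if __name__ == "__main__":
    for n, dt in enumerate(string_time_delays(1.0, 17), start=1):
        print(f"{n:2d}  {dt:.5f}")
```

Calling string_time_delays(1.0, 17) yields the 17 values of the table to within rounding at the fifth decimal place.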
[0121] In an embodiment, the method comprises obtaining shape data representing the virtual positions of the respective virtual points on the virtual sound source's shape and determining the time delays that are to be introduced by the respective time delay operations based on the virtual positions of the respective virtual points, preferably in accordance with the above described formula.
[0122]
[0123] Although
[0124]
[0125] In an embodiment, each of the generated audio signal components may in principle be fed to all loudspeakers that are present. However, depending on the panning method that is used, some of the audio signal components may be fed to a loudspeaker with zero amplification. Herewith, effectively, such loudspeaker does not receive such audio signal component. This is depicted in
[0126]
[0127] The virtual sound source may be shaped as a set of regular polygons, or may have a shape that is non-symmetrical, irregular or organically formed.
[0128]
[0129]
[0130] In this embodiment, determining an audio signal component comprises determining a first modified audio signal component and a third modified audio signal component. Determining the first resp. third modified audio signal component may comprise using a first resp. second time delay operation and a signal inverting operation and, optionally, a first resp. second signal feedback operation.
[0131] In this example, two combinations 32 and 38 are performed per audio signal component. However, for more complex shaped virtual sound sources, such as three-dimensionally shaped sources, three or even more combination operations are performed per audio signal component. An example of this is shown in
[0132] It should be appreciated that although
[0133]
[0134] A first step comprises determining, for each virtual point, three values for the above mentioned multiplication factor x, viz. x.sub.A, x.sub.B, x.sub.C in accordance with the following formulas:
[0135] Herein R denotes the radius of a circle 52 passing through the vertices where two or more edges of the virtual sound source 50 meet. In this example, R is the radius of the circumscribed circle 52 of the square plate 50.
[0136] Further, r.sub.n.A denotes (see left illustration in
[0137] r.sub.n.B denotes (see middle illustration in
[0138] r.sub.n.C denotes (see right hand side illustration in
[0139] In a next step, the associated time delays Δt.sub.A, Δt.sub.B, Δt.sub.C are determined in accordance with Δt=Ax/v, wherein Δt.sub.B is only determined if x.sub.B is equal to or smaller than 0.25. Accordingly, for a square plate having 25 cm long edges and 25 virtual points as shown in
TABLE-US-00002
 n   x.sub.A  x.sub.B  x.sub.C  Δt.sub.A (s)  Δt.sub.B (s)  Δt.sub.C (s)
 1   0        0        0        0             0             0
 2   0        0.25     0.125    0             0.003125      0.00156
 3   0        1        0.0833   0             —             0.00104
 4   0        0.25     0.125    0             0.003125      0.00156
 5   0        0        0        0             0             0
 6   0        0.25     0.125    0             0.003125      0.00156
 7   0.25     0.25     0.0833   0.003125      0.003125      0.00104
 8   0.25     1        0.125    0.003125      —             0.00156
 9   0.25     0.25     0.0833   0.003125      0.003125      0.00104
10   0        0.25     0.125    0             0.003125      0.00156
11   0        1        0.0833   0             —             0.00104
12   0.25     1        0.125    0.003125      —             0.00156
13   0.33     1        0.167    0.004167      —             0.00208
14   0.25     1        0.125    0.003125      —             0.00156
15   0        1        0.0833   0             —             0.00104
16   0        0.25     0.125    0             0.003125      0.00156
17   0.25     0.25     0.0833   0.003125      0.003125      0.00104
18   0.25     1        0.125    0.003125      —             0.00156
19   0.25     0.25     0.0833   0.003125      0.003125      0.00104
20   0        0.25     0.125    0             0.003125      0.00156
21   0        0        0        0             0             0
22   0        0.25     0.125    0             0.003125      0.00156
23   0        1        0.0833   0             —             0.00104
24   0        0.25     0.125    0             0.003125      0.00156
25   0        0        0        0             0             0
[0140] As shown, some values of Δt.sub.A, Δt.sub.B, Δt.sub.C are zero, or not determined because x.sub.B>0.25. As a result, for each virtual point n, one or two different nonzero values are present for Δt.sub.A, Δt.sub.B, Δt.sub.C. These values are then determined to be Δt.sub.1 and Δt.sub.2. (See below table).
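The reduction of the three candidate delays to the pair Δt.sub.1, Δt.sub.2 can be sketched as follows. The selection rule used here (keep the distinct nonzero values that were determined, order them from largest to smallest, and fill any remaining slot with zero) is an assumption inferred from the tabulated values rather than a rule stated in the text, and the function name is hypothetical.

```python
def select_delays(dt_a, dt_b, dt_c):
    """Reduce (dt_a, dt_b, dt_c) to the pair (dt_1, dt_2).

    Pass None for a delay that was not determined (e.g. dt_b when x_B > 0.25).
    Assumed rule: keep distinct nonzero values, largest first, pad with zeros.
    """
    candidates = [dt for dt in (dt_a, dt_b, dt_c) if dt is not None and dt != 0]
    distinct = sorted(set(candidates), reverse=True)
    if len(distinct) > 2:
        raise ValueError("more than two distinct nonzero delays for one point")
    distinct += [0.0] * (2 - len(distinct))
    return distinct[0], distinct[1]


if __name__ == "__main__":
    # Virtual point 7 of the square plate: two equal delays and one shorter delay.
    print(select_delays(0.003125, 0.003125, 0.00104))
```

Applied to virtual point 3, for instance, where only Δt.sub.C is nonzero, the sketch returns that value as Δt.sub.1 and zero as Δt.sub.2.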
[0141] The cut-off frequency for the high pass filter for each virtual point n may be determined as
[0142] Thus, for a virtual sound source having a plate shape with a total surface area A of 625 cm² which vibrates freely on its edges and is homogeneous in its material structure, the following values for Δt and f.sub.c may be used.
TABLE-US-00003
 n   Δt.sub.1 (s)  Δt.sub.2 (s)  f.sub.c (Hz)
 1   0             0             40
 2   0.003125      0.00156       53.33
 3   0.00104       0             80
 4   0.003125      0.00156       53.33
 5   0             0             40
 6   0.003125      0.00156       53.33
 7   0.003125      0.00104       80
 8   0.003125      0.00156       53.33
 9   0.003125      0.00104       80
10   0.003125      0.00156       53.33
11   0.00104       0             80
12   0.003125      0.00156       53.33
13   0.004167      0.00208       40
14   0.003125      0.00156       53.33
15   0.00104       0             80
16   0.003125      0.00156       53.33
17   0.003125      0.00104       80
18   0.003125      0.00156       53.33
19   0.003125      0.00104       80
20   0.003125      0.00156       53.33
21   0             0             40
22   0.003125      0.00156       53.33
23   0.00104       0             80
24   0.003125      0.00156       53.33
25   0             0             40
[0143] Thus, with these values, a user will perceive the generated audio signal as originating from a plate-shaped sound source of homogeneous substance and of particular size, whereas the loudspeakers need not be spatially arranged in a particular manner.
[0144] In an embodiment, the method comprises obtaining shape data representing the virtual positions of the respective virtual points on the virtual sound source's shape and determining the time delays that are to be introduced by the respective time delay operations based on the virtual positions of the respective virtual points. If the virtual sound source is shaped as a square plate, then the time delays may be determined using the formula described above.
[0145] As with 2D shapes, for a 3D shape two or more modified audio signal components are determined for some or each of the generated audio signal components y.sub.n(t) associated with virtual points that are defined on the shape. The values of the time delays to be introduced for each virtual point are in accordance with Δt=Vx/v, wherein V is the volume of the shape, x denotes a multiplication factor for virtual point n according to the radial length r.sub.n from the centre and/or the edges of the shape to point n, and v relates to the speed of sound through a medium.
[0146] For each geometrical shape and/or different materials of heterogeneous substance or material conditions, different variations of the algorithm may apply in accordance with the relationship between spatial dimensions of the shape and the time difference value at each virtual point.
[0147] For shapes that are not regular polygons and/or are irregularly shaped, more than two modified audio signal components may be obtained for some or each of the generated audio signal components y.sub.n(t).
[0148]
[0149] The embodiment of
[0150] It should be appreciated that although
[0151]
[0152]
[0153]
[0154] In this embodiment, the input audio signal x(t) is modified using a time delay operation introducing a time delay and a signal feedback operation to obtain a first modified audio signal. Then, a second modified audio signal is generated based on a combination of the input audio signal x(t) and the first modified audio signal. The audio signal y(t) is generated by attenuating the second modified audio signal and optionally by performing a time delay operation as shown.
[0155] Preferably, the time delay that is introduced by the time delay operation performed for obtaining the first modified audio signal is as short as possible, e.g. shorter than 0.00007 seconds, preferably shorter than 0.00005 seconds, more preferably shorter than 0.00002 seconds, most preferably approximately 0.00001 seconds. In case of a digital sample rate of 96 kHz, the time delay may be 0.00001 seconds.
[0156] In dependence of the value of c together with value d, an observer will perceive different distances between himself and the virtual sound source. Herein, values in the triangles, i.e. in the attenuation or amplification operations, may be understood to indicate a constant with which a signal is multiplied. Thus, if such value is larger than 1, then a signal amplification is performed. If such value is smaller than 1, then a signal attenuation is performed. When c=0 and d=1 no distance will be perceived, and when c=1 and d=0 a maximum distance will be perceived, corresponding to a relative distance at which the sound source has become imperceptible, and thus the output of the resulting sum audio signal will be 0 (−∞ dB). For performing the signal feedback operation to determine the first modified audio signal, the value for d may relate to the value for c as d=1−cx, where the value for x is a multiplication factor equal to or smaller than 1 applied to the amount of signal feedback that influences the steepness of a high-frequency dissipation curve.
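The interplay between c and d may be illustrated with a small sketch. The code below assumes one possible arrangement that is consistent with the description: the combined signal is delayed, multiplied by c and recursively added to the input, and the result is multiplied by d. It reads "cx" as the product of c and x. The exact topology of the figure is not reproduced in this text, so the arrangement, the function name and the parameter name slope_factor (standing for the factor x) are assumptions. With a delay of one sample this behaves as a low-pass operation: for slope_factor=1 a constant input passes at unchanged level while the highest frequencies are attenuated more strongly as c grows.

```python
def add_distance(x, c, slope_factor=1.0, delay_samples=1):
    """Sketch of distance processing with loop gain c and output gain d = 1 - c * slope_factor.

    Assumed arrangement (not taken from the source figure):
        second[n] = x[n] + c * second[n - delay_samples]
        y[n]      = d * second[n]
    """
    if not 0.0 <= c <= 1.0:
        raise ValueError("c is expected to lie between 0 and 1")
    if not 0.0 <= slope_factor <= 1.0:
        raise ValueError("slope_factor is expected to lie between 0 and 1")
    d = 1.0 - c * slope_factor
    second = []  # sum of the input and the delayed, attenuated feedback signal
    y = []
    for n, sample in enumerate(x):
        first = c * second[n - delay_samples] if n >= delay_samples else 0.0
        second.append(sample + first)
        y.append(d * second[n])
    return y
```

Under these assumptions c=0 returns the input unchanged (no perceived distance) and c=1 with slope_factor=1 returns silence (maximum distance).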
[0157] In an example, the method comprises obtaining distance data representing the distance of the virtual sound source. Then, the input audio signal is attenuated in dependence of the distance of the virtual sound source in order to obtain the modified audio signal.
[0158] The optional time delay indicated by Δt.sub.2 can create a Doppler effect associated with movement of the virtual sound source. Δt.sub.2 may be determined as Δt.sub.2=L/v, wherein L is a distance between the sound source S and the observer O and v is the speed of sound through a medium.
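As a worked illustration of Δt.sub.2=L/v, the sketch below converts a source distance into a propagation delay and into a whole number of samples. The speed of 343 m/s is the value for air at 20 degrees Celsius used elsewhere in this description; the sample rate and the function names are assumptions chosen for the example. When L changes over time, the delay changes with it, and it is this changing delay that shifts the perceived pitch.

```python
def propagation_delay(distance_m, speed=343.0):
    """Delay in seconds for sound travelling distance_m: delta_t_2 = L / v."""
    return distance_m / speed


def delay_in_samples(distance_m, sample_rate_hz, speed=343.0):
    """Propagation delay rounded to the nearest whole sample."""
    return round(propagation_delay(distance_m, speed) * sample_rate_hz)


if __name__ == "__main__":
    # A source approaching from 10 m to 5 m: the delay shrinks as it gets closer.
    for distance in (10.0, 7.5, 5.0):
        print(distance, propagation_delay(distance), delay_in_samples(distance, 48000))
```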
[0159]
[0160]
[0161]
[0162]
[0163]
[0164]
[0165]
[0166]
[0167] It should be appreciated that the signal delay operation, the signal inversion operation and the signal attenuation operation may be performed in any order.
[0168] The input audio signal x(t) may be attenuated in dependence of the height to obtain the third modified audio signal, preferably such that the higher the virtual sound source is positioned above the observer, the lower the degree of attenuation is. This is shown in
[0169] The introduced time delays as depicted in
[0170] In case the virtual sound source is positioned above a listener, modifying the input audio signal to obtain the third modified audio signal optionally comprises performing a signal feedback operation. In a particular example, this step comprises recursively adding an attenuated version of a signal, e.g. the signal resulting from the time delay operation, signal attenuation operation and signal inverting operation that are performed to eventually obtain the third modified audio signal, to itself. If the signal feedback operation is performed, value f may be equal to f=e*x, where the value for x is a multiplication factor smaller than 1 applied to the amount of signal feedback that influences the steepness of a low-frequency dissipation curve. By varying value e, preferably between 0 and 1, optionally while simultaneously varying value f, a perception of height can be added to an audio signal. Herein, e=0 and f=0 correspond to no perceived height and e=1 and f<1 to a maximum perceived height, i.e. a distance above the observer where the sound source has become close to imperceptible.
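The height processing can be sketched in a similar way. The code below assumes that the third modified signal is the inverted input, attenuated by e and delayed, with an optional feedback loop of gain f=e*x that uses the same delay; the output is the sum of the input and the third modified signal. The loop placement, the shared delay, the function name and the parameter name slope_factor (standing for the factor x) are assumptions, since the corresponding figure is not reproduced in this text. With a delay of one sample and no feedback, the sum x[n] − e·x[n−1] attenuates low frequencies, in line with the low-frequency dissipation mentioned above.

```python
def add_height(x, e, slope_factor=0.0, delay_samples=1):
    """Sketch of height processing with attenuation e and feedback gain f = e * slope_factor.

    Assumed arrangement (not taken from the source figure):
        third[n] = -e * x[n - delay_samples] + f * third[n - delay_samples]
        y[n]     = x[n] + third[n]
    """
    if not 0.0 <= e <= 1.0:
        raise ValueError("e is expected to lie between 0 and 1")
    if not 0.0 <= slope_factor < 1.0:
        raise ValueError("slope_factor is expected to be at least 0 and smaller than 1")
    f = e * slope_factor
    third = []  # inverted, attenuated and delayed copy of the input, with feedback
    y = []
    for n, sample in enumerate(x):
        if n >= delay_samples:
            t = -e * x[n - delay_samples] + f * third[n - delay_samples]
        else:
            t = 0.0
        third.append(t)
        y.append(sample + t)
    return y
```

Setting e=0 leaves the input unchanged, which matches the statement that e=0 and f=0 correspond to no perceived height.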
[0171]
[0172]
[0173]
[0174]
[0175]
[0176] The introduced time delay as depicted in
[0177] When g=0 and h=0 no depth will be perceived and when g=1 and h=1 a maximum depth will be perceived between the sound source S and the observer O. For performing the signal feedback operation to determine the third modified audio signal, the value for h may relate to the value for g as h=g*x where the value for x is a multiplication factor equal to or smaller than 1 applied to the amount of signal feedback, which influences the steepness of a high-frequency dissipation curve.
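A corresponding sketch for depth is given below. It assumes that the modified signal is the input delayed and attenuated by g, with a feedback loop of gain h=g*x that uses the same delay and no signal inversion, and that the output is the sum of the input and this modified signal. As before, the loop placement, the shared delay, the function name and the parameter name slope_factor (standing for the factor x) are assumptions made for illustration.

```python
def add_depth(x, g, slope_factor=1.0, delay_samples=1):
    """Sketch of depth processing with attenuation g and feedback gain h = g * slope_factor.

    Assumed arrangement (not taken from the source figure):
        modified[n] = g * x[n - delay_samples] + h * modified[n - delay_samples]
        y[n]        = x[n] + modified[n]
    """
    if not 0.0 <= g <= 1.0:
        raise ValueError("g is expected to lie between 0 and 1")
    if not 0.0 <= slope_factor <= 1.0:
        raise ValueError("slope_factor is expected to lie between 0 and 1")
    h = g * slope_factor
    modified = []  # delayed, attenuated copy of the input, with feedback
    y = []
    for n, sample in enumerate(x):
        if n >= delay_samples:
            m = g * x[n - delay_samples] + h * modified[n - delay_samples]
        else:
            m = 0.0
        modified.append(m)
        y.append(sample + m)
    return y
```

With g=0 the input passes unchanged (no perceived depth); for an impulse input and g=0.5 the output decays as 1, 0.5, 0.25, 0.125, which shows the recursive build-up that the feedback adds.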
[0178]
[0179]
[0180]
[0181]
[0182]
[0183]
[0184] Further, it should be appreciated that building block 21 may be any of the building blocks depicted in
[0185] In the depicted embodiment, generating an audio signal component thus comprises adding dimensional information to the input audio signal, which may be performed by the steps indicated by box 72, adding distance information, which may be performed by steps indicated by box 74, and adding height information, which may be performed by steps indicated by box 76, or depth information, which may be performed by steps indicated by box 78. Further, a Doppler effect may be added to the input audio signal, for example by adding an additional time delay as shown in box 80.
[0186] Preferably, because a virtual sound source is either positioned above or below an observer, only one of the modules 76 or 78 is performed. Module 76 can be set as inactive by setting e=0 and module 78 can be set inactive by setting g=0.
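A minimal sketch of this selection is given below. It assumes that a positive vertical coordinate z means that the virtual sound source is above the observer, and that a single hypothetical control value, amount, drives whichever module is active; the function name vertical_parameters is likewise hypothetical.

```python
def vertical_parameters(z, amount):
    """Return (e, g) such that at most one of modules 76 and 78 is active.

    Assumptions (not taken from the description): z > 0 means the source
    is above the observer, and `amount` in the range 0..1 is the strength
    of the active module.
    """
    if not 0.0 <= amount <= 1.0:
        raise ValueError("amount is expected between 0 and 1")
    if z > 0:
        return (amount, 0.0)  # height module 76 active, depth module 78 off
    if z < 0:
        return (0.0, amount)  # depth module 78 active, height module 76 off
    return (0.0, 0.0)         # source level with the observer: both off


print(vertical_parameters(2.0, 0.7))
print(vertical_parameters(-1.5, 0.4))
```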
[0187]
[0192] All functional operations of a spatial wave transform are translated to front-end user properties, i.e. audible manipulations of sound in a virtual space. The application of the invention is in no way limited to the layout of this particular interface example; it can be the subject of numerous approaches in system design and can involve numerous levels of control for shaping and positioning sound sources in a virtual space. Nor is it limited to any particular platform, medium or visual design and layout.
[0193] The depicted user interface 90 comprises an input module that enables a user to control the input audio signal of a chain using input receives. The input receives may comprise multiple audio channels, received either from other chains or from external audio sources, which are combined into the audio input signal of a chain. The user interface enables a user to control the amplification of each input channel, e.g. by using gain knobs 92.
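The combination of amplified input channels into one chain input may be sketched as a weighted sum per sample. The sketch assumes linear gain values and equally long channels; the name mix_inputs is hypothetical.

```python
def mix_inputs(channels, gains):
    """Sum several input channels into one signal, one gain per channel.

    Assumptions (not taken from the description): gains are linear factors
    and all channels hold the same number of samples.
    """
    if len(channels) != len(gains):
        raise ValueError("one gain per channel is required")
    if not channels:
        return []
    length = len(channels[0])
    if any(len(c) != length for c in channels):
        raise ValueError("all channels must have the same length")
    return [
        sum(g * c[n] for c, g in zip(channels, gains))
        for n in range(length)
    ]


print(mix_inputs([[1.0, 2.0], [3.0, 4.0]], [0.5, 2.0]))
```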
[0194] The user interface 90 may further comprise an output module that enables a user to route the summed audio output signal of the chain as an audio input signal to other chains.
[0195] The user interface 90 may further comprise a virtual sound source definition section that enables a user to input parameters relating to the virtual sound source, such as its shape, e.g. by means of a drop-down menu 96, and/or whether the virtual sound source is hollow or solid and/or the scale of the virtual sound source and/or its dimensions, e.g. its Cartesian dimensions and/or a rotation and/or a resolution. The latter indicates how many virtual points are determined per unit of virtual surface area. This allows a user to control the amount of required calculations.
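To illustrate how the resolution may control the amount of required calculations, the number of virtual points can be estimated from the surface area of the shape. The sketch assumes a cuboid shape, rounding to the nearest integer and a minimum of one point; the function names are hypothetical.

```python
def cuboid_surface_area(dx, dy, dz):
    """Surface area of a cuboid with Cartesian dimensions dx, dy, dz."""
    return 2.0 * (dx * dy + dy * dz + dx * dz)


def virtual_point_count(surface_area, resolution):
    """Number of virtual points for a resolution in points per unit area.

    Assumption (not taken from the description): the count is rounded to
    the nearest integer and is never smaller than one.
    """
    if surface_area < 0 or resolution < 0:
        raise ValueError("surface area and resolution must be non-negative")
    return max(1, round(surface_area * resolution))


area = cuboid_surface_area(1.0, 2.0, 3.0)
print(area, virtual_point_count(area, 2.0), virtual_point_count(area, 4.0))
```

Doubling the resolution doubles the number of virtual points, and hence roughly the number of audio signal components to be generated.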
[0196] The input means for inputting parameters relating to rotation may be presented as endless rotational knobs for dimensions x, y and z.
[0197] The user interface 90 may further comprise a position sector that enables a user to input parameters relating to the position of the virtual sound source. The position of the shape in 3-dimensional space may be expressed in Cartesian coordinates +/−x, y, z, wherein the virtual center of the space is denoted as 0, 0, 0, and may be presented as a visual 3-dimensional field within which one can place and move a virtual object. This 3-dimensional control field may be scaled in size by adjusting the radius of the field.
[0198] The user interface 90 may further comprise an attributes section 100 that enables a user to control various parameters, such as the bandwidth and peak level of the resonance, the perceived distance, the perceived elevation and the Doppler effect.
[0199] The user interface 90 may further comprise an output section 102 that enables a user to control the output. For example, the discrete amplification of each audio signal component that is distributed to a configured number of audio output channels may be controlled. The gain of each loudspeaker may be automatically controlled by i) the modelling of the virtual sound source's shape, ii) the rotation of the shape in 3-dimensional space and iii) the position of the shape in 3-dimensional space. The method for distributing the audio signal components to the audio output channels may depend on the type of loudspeaker configuration and may be any such method known in the art.
[0200] The output section 102 may comprise a master level fader 104.
[0201] The user input that is received through the user interface may be used to determine appropriate values for the parameters according to methods described herein.
[0202]
[0203] The memory elements 1104 may include one or more physical memory devices such as, for example, local memory 1108 and one or more bulk storage devices 1110. The local memory may refer to random access memory or other non-persistent memory device(s) generally used during actual execution of the program code. A bulk storage device may be implemented as a hard drive or other persistent data storage device. The processing system 1100 may also include one or more cache memories (not shown) that provide temporary storage of at least some program code in order to reduce the number of times program code must be retrieved from the bulk storage device 1110 during execution.
[0204] Input/output (I/O) devices depicted as an input device 1112 and an output device 1114 optionally can be coupled to the data processing system. Examples of input devices may include, but are not limited to, a keyboard, a pointing device such as a mouse, or the like. Examples of output devices may include, but are not limited to, a monitor or a display, speakers, or the like. Input and/or output devices may be coupled to the data processing system either directly or through intervening I/O controllers.
[0205] In an embodiment, the input and the output devices may be implemented as a combined input/output device (illustrated in
[0206] A network adapter 1116 may also be coupled to the data processing system to enable it to become coupled to other systems, computer systems, remote network devices, and/or remote storage devices through intervening private or public networks. The network adapter may comprise a data receiver for receiving data that is transmitted by said systems, devices and/or networks to the data processing system 1100, and a data transmitter for transmitting data from the data processing system 1100 to said systems, devices and/or networks. Modems, cable modems, and Ethernet cards are examples of different types of network adapter that may be used with the data processing system 1100.
[0207] As pictured in
[0208] In one aspect of the present invention, the data processing system 1100 may represent an audio signal processing system.
[0209] Various embodiments of the invention may be implemented as a program product for use with a computer system, where the program(s) of the program product define functions of the embodiments (including the methods described herein). In one embodiment, the program(s) can be contained on a variety of non-transitory computer-readable storage media, where, as used herein, the expression “non-transitory computer readable storage media” comprises all computer-readable media, with the sole exception being a transitory, propagating signal. In another embodiment, the program(s) can be contained on a variety of transitory computer-readable storage media. Illustrative computer-readable storage media include, but are not limited to: (i) non-writable storage media (e.g., read-only memory devices within a computer such as CD-ROM disks readable by a CD-ROM drive, ROM chips or any type of solid-state non-volatile semiconductor memory) on which information is permanently stored; and (ii) writable storage media (e.g., flash memory, floppy disks within a diskette drive or hard-disk drive or any type of solid-state random-access semiconductor memory) on which alterable information is stored. The computer program may be run on the processor 1102 described herein.
[0210] The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used herein, the singular forms “a,” “an,” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms “comprises” and/or “comprising,” when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
[0211] The corresponding structures, materials, acts, and equivalents of all means or step plus function elements in the claims below are intended to include any structure, material, or act for performing the function in combination with other claimed elements as specifically claimed. The description of embodiments of the present invention has been presented for purposes of illustration, but is not intended to be exhaustive or limited to the implementations in the form disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the present invention. The embodiments were chosen and described in order to best explain the principles and some practical applications of the present invention, and to enable others of ordinary skill in the art to understand the present invention for various embodiments with various modifications as are suited to the particular use contemplated.