APPARATUS AND METHOD FOR RENDERING A SOUND SCENE COMPRISING DISCRETIZED CURVED SURFACES
20230007429 · 2023-01-05
CPC classification: H04S2420/01; H04S2400/11
Abstract
An apparatus for rendering a sound scene having reflection objects and a sound source at a sound source position, includes: a geometry data provider for providing an analysis of the reflection objects of the sound scene to determine a reflection object represented by a first polygon and a second adjacent polygon having associated a first image source position for the first polygon and a second image source position for the second polygon, wherein the first and second image source positions result in a sequence including a first visible zone related to the first image source position, an invisible zone and a second visible zone related to the second image source position; an image source position generator for generating an additional image source position such that the additional image source position is placed between the first image source position and the second image source position; and a sound renderer for rendering the sound source at the sound source position and, additionally for rendering the sound source at the first image source position, when a listener position is located within the first visible zone, for rendering the sound source at the additional image source position, when the listener position is located within the invisible zone, or for rendering the sound source at the second image source position, when the listener position is located within the second visible zone.
Claims
1. Apparatus for rendering a sound scene comprising reflection objects and a sound source at a sound source position, comprising: a geometry data provider for providing an analysis of the reflection objects of the sound scene to determine a reflection object represented by a first polygon and a second adjacent polygon having associated a first image source position for the first polygon and a second image source position for the second polygon, wherein the first and second image source positions result in a sequence comprising a first visible zone related to the first image source position, an invisible zone and a second visible zone related to the second image source position; an image source position generator for generating an additional image source position such that the additional image source position is placed between the first image source position and the second image source position; and a sound renderer for rendering the sound source at the sound source position and, additionally for rendering the sound source at the first image source position, when a listener position is located within the first visible zone, for rendering the sound source at the additional image source position, when the listener position is located within the invisible zone, or for rendering the sound source at the second image source position, when the listener position is located within the second visible zone.
2. Apparatus of claim 1, wherein the geometry data provider is configured to retrieve pre-stored information on the reflection objects stored during an initialization stage, and wherein the image source position generator is configured to generate the additional image source position in response to the pre-stored information indicating the reflection object.
3. Apparatus of claim 1, wherein the geometry data provider is configured to detect, during runtime or during an initialization stage and using geometry data on the sound scene delivered by a computer-aided design application, the reflection object.
4. Apparatus of claim 1, wherein the geometry data provider is configured to detect, during runtime or during an initialization stage, as the reflection object, an object comprising a round geometry, a curved geometry, or a geometry derived from a spline interpolation.
5. Apparatus of claim 1, wherein the geometry data provider is configured to compute an angle between two adjacent polygons of a reflection object and to mark the two adjacent polygons as a specific pair of polygons, when the angle is below a threshold, to compute a further angle between two further adjacent polygons of the reflection object and to mark the two further adjacent polygons as a further specific pair of polygons, when the further angle is below the threshold, and to detect the reflection object, when the further adjacent polygons and the adjacent polygons have an edge in common, or belong to the same corner.
6. Apparatus of claim 1, wherein the image source position generator is configured to analyze, whether the listener position is in the invisible zone, and to generate the additional image source position only when the listener position is located in the invisible zone.
7. Apparatus of claim 6, wherein the image source position generator is configured to determine a first geometrical range associated with the first polygon or a second geometrical range associated with the second polygon, or a third geometrical range between the first geometrical range and the second geometrical range, wherein the first geometrical range determines the first visible zone or wherein the second geometrical range determines the second visible zone, or wherein the third geometrical range determines the invisible zone, and wherein the first or the second geometrical range is determined such that a condition that an incidence angle from the source position to the first or the second polygon is equal to a reflection angle from the first or the second polygon is fulfilled for a position in the first or the second geometrical zone, or wherein the third geometrical range is determined such that the condition of a reflection angle being equal to the incidence angle is not fulfilled for a position in the invisible zone.
8. Apparatus of claim 6, wherein the image source position generator is configured to calculate a first frustum for the first polygon and to determine, whether the listener position is located within the first frustum, or wherein the image source position generator is configured to calculate a second frustum for the second polygon and to determine, whether the listener position is located within the second frustum, or wherein the image source position generator is configured to calculate an invisible zone frustum and to determine, whether the listener is located within the invisible zone frustum.
9. Apparatus of claim 8, wherein the image source position generator is configured to define four planes having normal vectors pointing inside the first frustum, the second frustum or the invisible zone frustum, and wherein the image source position generator is configured to determine, whether a distance of the listener position to each plane is greater than or equal to 0, and to detect that the listener is located within a frustum of the first frustum, the second frustum or the invisible zone frustum, when the distance of the listener to each plane is greater than or equal to 0.
10. Apparatus of claim 1, wherein the image source position generator is configured to calculate the additional image source position as a position between the first image source position and the second image source position.
11. Apparatus of claim 10, wherein the image source position generator is configured to calculate the additional image source position on a connection line between the first image source position and the second image source position.
12. Apparatus of claim 10, wherein the image source position generator is configured to calculate the additional image source position as a position on a circular arc with radius r1 around the reflection point, where r1 denotes the distance between the source position and the reflection point.
13. Apparatus of claim 10, wherein the image source position generator is configured to calculate the additional image source position, so that a distance between the additional image source position and the second image source position is proportional to a distance of the listener position to the second visible zone, or so that a distance between the additional image source position and the first image source position is proportional to a distance of the listener position to the first visible zone.
14. Apparatus of claim 11, wherein the image source position generator is configured to determine a reflection point using an orthogonal projection of the vector for the sound source position and an orthogonal projection of the vector for the listener position with respect to the first polygon or the second polygon or the adjacent edge between the first polygon and the second polygon, or to determine a point where the first polygon and the second polygon are connected to each other as the reflection point, and wherein the image source position generator is configured to determine an intersection point of a line connecting the listener position and the reflection point with the connection line between the first image source position and the second image source position as the additional image source position.
15. Apparatus of claim 1, wherein the image source position generator is configured to calculate the first image source position by mirroring the sound source position at a plane defined by the first polygon, or wherein the image source position generator is configured to calculate the second image source position by mirroring the sound source position at a plane defined by the second polygon.
16. Apparatus of claim 1, wherein the sound renderer is configured to render the sound source so that a sound source signal is filtered using a rendering filter defined by at least one of a distance between a corresponding image sound source position and the listener position and a delay time incurred by the distance, an absorption coefficient or a reflection coefficient associated with the first polygon or the second polygon, or a frequency-selective absorption or reflection characteristic associated with the first polygon or the second polygon.
17. Apparatus of claim 1, wherein the sound renderer is configured to render the sound source using the sound source signal, the sound source position and the listener position in a direct sound filter stage, and to render the sound source using the sound source signal, a corresponding image sound source position and the listener position as a first order reflection in a first order reflection filter stage, wherein the corresponding image sound source position comprises the first image source position, the second image source position or the additional image source position.
18. Method of rendering a sound scene comprising reflection objects and a sound source at a sound source position, comprising: providing an analysis of the reflection objects of the sound scene to determine a reflection object represented by a first polygon and a second adjacent polygon having associated a first image source position for the first polygon and a second image source position for the second polygon, wherein the first and second image source positions result in a sequence comprising a first visible zone related to the first image source position, an invisible zone and a second visible zone related to the second image source position; generating an additional image source position such that the additional image source position is placed between the first image source position and the second image source position; and rendering the sound source at the sound source position and, additionally rendering the sound source at the first image source position, when a listener position is located within the first visible zone, rendering the sound source at the additional image source position, when the listener position is located within the invisible zone, or rendering the sound source at the second image source position, when the listener position is located within the second visible zone.
19. Non-transitory digital storage medium having a computer program stored thereon to perform the method of rendering a sound scene comprising reflection objects and a sound source at a sound source position, comprising: providing an analysis of the reflection objects of the sound scene to determine a reflection object represented by a first polygon and a second adjacent polygon having associated a first image source position for the first polygon and a second image source position for the second polygon, wherein the first and second image source positions result in a sequence comprising a first visible zone related to the first image source position, an invisible zone and a second visible zone related to the second image source position; generating an additional image source position such that the additional image source position is placed between the first image source position and the second image source position; and rendering the sound source at the sound source position and, additionally rendering the sound source at the first image source position, when a listener position is located within the first visible zone, rendering the sound source at the additional image source position, when the listener position is located within the invisible zone, or rendering the sound source at the second image source position, when the listener position is located within the second visible zone, when said computer program is run by a computer.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0016] Embodiments of the present invention will be detailed subsequently referring to the appended drawings.
DETAILED DESCRIPTION OF THE INVENTION
[0030] The image source position generator relies on the source position and the listener position and, particularly because the listener position changes at runtime, the image source position generator operates at runtime. The same is true for the sound renderer 30, which likewise operates at runtime using the sound source data, the listener position and, additionally, the image source positions and, if required, the additional image source positions, i.e., when the listener is placed in an invisible zone that has to be "enlightened" by an additional image source determined by the image source position generator in accordance with the present invention.
[0031] Advantageously, the geometry data provider 10 is configured for providing an analysis of the reflection objects of the sound scene to determine a specific reflection object that is represented by a first polygon and a second adjacent polygon. The first polygon has associated a first image source position and the second polygon has associated a second image source position, where these image source positions are constructed, for example, by mirroring the sound source position at the planes defined by the first and the second polygon, respectively.
[0032] The sound renderer 30 is configured for rendering the sound source at the sound source position in order to obtain the direct sound at the listener position. Additionally, in order to also render a reflection, the sound source is rendered at the first image source position when the listener position is located within the first visible zone. In this situation, the image source position generator does not need to generate an additional image source position, since the listener position is such that artefacts due to the disco ball effect do not occur at all. The same is true when the listener position is located within the second visible zone associated with the second image source. However, when the listener is located within the invisible zone, the sound renderer uses the additional image source position and uses neither the first image source position nor the second image source position. Instead of the "classical" image sources modeling the reflections at the first and the second adjacent polygons, the sound renderer renders, for the purpose of reflection rendering, only the additional image source generated in accordance with the present invention, in order to fill up or "enlighten" the invisible zone with sound. Artefacts that would otherwise manifest as permanently switching localization, timbre and loudness are thus avoided by the inventive processing, in which the image source position generator generates the additional image source between the first and the second image source positions.
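To make this selection logic concrete, the following minimal sketch (in Python; the zone objects and their contains() test are illustrative assumptions, not the literal implementation) picks the image source used for first order reflection rendering:

```python
def select_reflection_source(listener, zones, is1, is2, additional):
    """Pick the image source used for first-order reflection rendering.

    `zones` is any object exposing first_visible / invisible /
    second_visible regions with a contains(point) test (hypothetical API).
    """
    if zones.first_visible.contains(listener):
        return is1         # classical image source of the first polygon
    if zones.second_visible.contains(listener):
        return is2         # classical image source of the second polygon
    if zones.invisible.contains(listener):
        return additional  # additional source that "enlightens" the gap
    return None            # no first-order reflection audible here
```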
[0038] Furthermore, a wall absorption/reflection behavior is modeled by means of the wall absorption or reflection coefficient α. Advantageously, the coefficient α is frequency-dependent, i.e., it represents a frequency-selective absorption or reflection curve H_w(k) and typically has a high-pass characteristic, i.e., high frequencies are reflected better than low frequencies. This behavior is accounted for in embodiments. The strength of the image source approach is that, subsequent to the construction of the image source and its description with respect to the propagation time, the distance attenuation and the wall absorption, the wall 140 can be completely removed from the sound scene and is modeled solely by the image source 120.
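As a hedged illustration of this classical construction, the following sketch mirrors the source at the wall plane and derives the propagation delay, distance attenuation and wall factor; the speed of sound, the 1/r law and the (1 − α) reflection factor are standard assumptions rather than specifics from the text:

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, standard assumption

def mirror_across_plane(point, plane_normal, plane_d):
    """Mirror `point` at the plane n.x - d = 0 (unit normal n)."""
    n = np.asarray(plane_normal, dtype=float)
    dist = np.dot(n, point) - plane_d        # signed distance to the plane
    return np.asarray(point, dtype=float) - 2.0 * dist * n

def image_source_params(source, listener, plane_normal, plane_d, alpha):
    """Image source position plus delay, 1/r attenuation and wall factor."""
    img = mirror_across_plane(source, plane_normal, plane_d)
    path = np.linalg.norm(np.asarray(listener, float) - img)  # reflected path
    delay = path / SPEED_OF_SOUND                 # propagation time in seconds
    gain = (1.0 - alpha) / max(path, 1e-9)        # wall absorption + distance
    return img, delay, gain
```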
[0040] For the additional image source position 90, the same path length, propagation time, distance attenuation and wall absorption modeling is used for the purpose of rendering the first order reflection in the invisible zone 80. In an embodiment, a reflection point 92 is determined. The reflection point 92 lies at the junction between the first polygon and the second polygon when viewed from above, and its vertical position, for example in the case of the advertising pillar, is determined by the height of the listener 130 and the height of the source 100. Advantageously, the additional image source position 90 is placed on a line connecting the listener 130 and the reflection point 92, this line being indicated at 93. Furthermore, the exact position of the additional sound source 90 in the embodiment is at the intersection point of the line 93 and the connecting line 91, which connects the image source positions 62 and 63 whose visible zones are adjacent to the invisible zone 80.
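The intersection construction can be sketched as follows; solving for the closest points of the two lines is an assumed numerical realization that coincides with the exact intersection in the planar construction described above:

```python
import numpy as np

def additional_image_source(listener, reflection_point, is1, is2):
    """Intersect the listener->reflection-point line (line 93) with the
    line joining the two image sources (line 91).

    A sketch, not the patented implementation: the two lines are
    intersected via a least-squares solve for their closest points.
    """
    L = np.asarray(listener, float)
    R = np.asarray(reflection_point, float)
    A, B = np.asarray(is1, float), np.asarray(is2, float)
    u = R - L          # direction of line 93
    v = B - A          # direction of line 91
    # Solve [u  -v] [t s]^T = A - L in the least-squares sense.
    M = np.stack([u, -v], axis=1)           # 3x2 system matrix
    t, s = np.linalg.lstsq(M, A - L, rcond=None)[0]
    s = np.clip(s, 0.0, 1.0)                # stay between the image sources
    return A + s * v
```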
[0042] Furthermore, although it is advantageous to calculate the propagation time exactly from the exact path length, other embodiments rely on an estimate of the path length, for example a modified path length of the image source position 63, or a modified path length of the other adjacent image source position 62. With respect to the wall absorption or wall reflection modeling for rendering the additional sound source position 90, either the wall absorption of one of the adjacent polygons can be used or, if they differ, an average of both absorption coefficients. Even a weighted average can be applied depending on which visible zone the listener is closer to, so that the absorption data of the wall whose visible zone is closer to the listener receives a higher weight in the weighted addition than the absorption/reflection data of the other adjacent wall whose visible zone is further away from the listener position.
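One plausible reading of this weighted averaging is inverse-distance weighting, sketched below; the text only requires that the wall whose visible zone is closer to the listener receives the higher weight:

```python
def blended_absorption(alpha1, alpha2, dist_to_zone1, dist_to_zone2, eps=1e-9):
    """Weighted average of two wall absorption coefficients.

    The wall whose visible zone is closer to the listener gets the
    larger weight; inverse-distance weighting is an assumed choice.
    """
    w1 = 1.0 / (dist_to_zone1 + eps)
    w2 = 1.0 / (dist_to_zone2 + eps)
    return (w1 * alpha1 + w2 * alpha2) / (w1 + w2)
```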
[0044] Alternatively, when step 21 determines that the user is placed within the invisible zone 80, the additional image source position 90 is calculated and used for rendering the first order reflection.
[0047] Subsequently, a further procedure for calculating the additional image source position is described. The extended image source model needs to extrapolate the image source position in the "dark zone" of the reflectors, i.e., the areas between the "bright zones" in which the image source is visible. The coverage area of the model for a given round edge is a frustum bounded by four planes, each described by a normal vector \vec{N}_k pointing inside the frustum and a distance d_k:
\vec{N}_k \cdot \vec{X} - d_k = 0.   (1)

If the distance

l_k = \vec{N}_k \cdot \vec{L} - d_k   (2)

is greater than or equal to zero for all four planes, then the listener is located within the frustum that defines the coverage area of the model for the given round edge. The invisible zone frustum is illustrated in the drawings.
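This four-plane test translates directly into code; the plane data layout is an assumption:

```python
import numpy as np

def listener_in_frustum(listener, planes):
    """Test whether the listener lies inside a frustum.

    `planes` is a sequence of (normal, d) pairs with normals pointing
    into the frustum, describing planes N.X - d = 0 as in Eqs. (1)-(2).
    """
    L = np.asarray(listener, float)
    # Inside iff the signed distance l_k = N_k.L - d_k is >= 0
    # for every bounding plane.
    return all(np.dot(np.asarray(n, float), L) - d >= 0.0 for n, d in planes)
```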
[0048] In this case, one can determine the reflection point on the round edge as follows:
Let \vec{P}_S be the orthogonal projection of the source position \vec{S} onto the edge and \vec{P}_L be the orthogonal projection of the listener position \vec{L} onto the edge. With the distances

d_S = |\vec{P}_S - \vec{S}|   (3)

d_L = |\vec{P}_L - \vec{L}|   (4)

this yields the reflection point \vec{R}.
[0049] The construction of the reflection point from these projections and distances is illustrated in the drawings.
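The final combination step is left implicit in the text; a standard construction consistent with Eqs. (3) and (4) interpolates between the two projections in proportion to the distances, sketched here as an assumption:

```python
import numpy as np

def reflection_point_on_edge(P_S, P_L, S, L):
    """Reflection point R on a round edge from the projected endpoints.

    P_S, P_L: orthogonal projections of source S and listener L onto
    the edge. The proportional interpolation below is an assumed
    reconstruction of the step the text leaves implicit; it makes the
    unfolded incidence and reflection angles equal.
    """
    P_S, P_L = np.asarray(P_S, float), np.asarray(P_L, float)
    d_S = np.linalg.norm(P_S - np.asarray(S, float))  # Eq. (3)
    d_L = np.linalg.norm(P_L - np.asarray(L, float))  # Eq. (4)
    t = d_S / (d_S + d_L)   # similar-triangles split along the edge
    return P_S + t * (P_L - P_S)
```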
[0050] The computation of the coverage area of the round corners is very similar. Here, the k adjacent planes yield k image sources which, together with the corner position, result in a frustum that is bounded by k planes. Again, if the distances of the listener to these planes are all greater than or equal to zero, the listener is located within the coverage area of the round corner. The reflection point \vec{R} is given by the corner point itself.
[0051] This situation, i.e., the invisible frustum of a round corner, is illustrated in the drawings.
[0052] For higher-order reflections, one can extend this method according to the frustum-tracing method where one splits up each frustum into sub-frustums whenever one hits a surface, round edge, or round corner.
[0054] The geometry data provider may apply a curved surface detection. The geometry data provider, also termed the geometry processor, performs the specific reflection object determination in advance, in an initialization procedure, or at runtime. If, for example, CAD software is used to export the geometry data, as much information about curvatures as possible is advantageously used by the geometry data provider. For example, if surfaces are constructed from round geometry primitives like spheres or cylinders or from spline interpolations, the geometry pre-processor/geometry data provider is advantageously implemented within the export routine of the CAD software, where it detects and uses the information from the CAD software.
[0055] If no a priori knowledge about the surface curvature is available, the geometry pre-processor or geometry data provider needs to implement a round edge and round corner detector using only the triangle or polygon mesh. For example, this can be done by computing the angle Φ between two adjacent triangles 1, 2 or 1a, 2a, as illustrated in the drawings; if the angle is below a threshold, the two triangles are marked as a pair approximating a curved surface.
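A hedged sketch of such a mesh-only detector, following the threshold-angle rule also stated in claim 5 (the normal-based angle computation and the 30° default threshold are illustrative assumptions):

```python
import numpy as np

def face_normal(tri):
    """Unit normal of a triangle given as three 3D vertices."""
    a, b, c = (np.asarray(v, float) for v in tri)
    n = np.cross(b - a, c - a)
    return n / np.linalg.norm(n)

def is_round_edge(tri1, tri2, threshold_deg=30.0):
    """Mark two adjacent triangles as approximating a curved surface
    when the angle between their normals is below the threshold.

    The 30-degree default is an illustrative assumption; the text only
    requires some threshold face angle.
    """
    cos_phi = np.clip(np.dot(face_normal(tri1), face_normal(tri2)), -1.0, 1.0)
    phi = np.degrees(np.arccos(cos_phi))
    return phi < threshold_deg
```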
[0057] Furthermore, depending on the output format required by the sound renderer 30, i.e., depending on whether the sound renderer outputs via headphones, via loudspeakers, or merely for storage or transmission in a certain format, a certain number of output adders is provided, such as a left adder 34, a right adder 35 and a center adder 36, and possibly further adders for left surround or right surround output channels, etc. While the left and right adders 34 and 35 are advantageously used for headphone reproduction in virtual reality applications, for example, other adders for loudspeaker output in a certain output format can also be used. When, for example, an output via headphones is required, the direct sound filter stage 31 applies head-related transfer functions depending on the sound source position 100 and the listener position 130. For the first order reflection filter stage, corresponding head-related transfer functions are applied, but now for the listener position 130 on the one hand and the additional sound source position 90 on the other hand. Furthermore, any specific propagation delays, path attenuations or reflection effects are also included within the head-related transfer functions in the first order reflection filter stage 32. For higher order reflection filter stages, further additional sound sources are applied as well.
[0058] If the output is intended for a loudspeaker setup, the direct sound filter stage will apply filters other than head-related transfer functions, such as filters that perform vector base amplitude panning, for example. In any case, each of the direct sound filter stage 31, the first order reflection filter stage 32 and the second order reflection filter stage 33 calculates a component for each of the adder stages 34, 35, 36 as illustrated, and the left adder 34 then calculates the output signal for the left headphone speaker, the right adder 35 calculates the signal for the right headphone speaker, and so on. In case of an output format different from headphones, the left adder 34 may deliver the output signal for the left speaker and the right adder 35 the output for the right speaker. If only two speakers are present in a two-speaker environment, the center adder 36 is not required.
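The stage-and-adder structure can be sketched schematically as follows, with per-stage filtering reduced to simple gains for brevity; all names and values are illustrative:

```python
import numpy as np

def render_output_channels(source_signal, stages, num_channels=2):
    """Mix filter-stage outputs into per-channel adders.

    `stages` is a list of callables; each maps the source signal to
    per-channel components (direct sound, first order reflection, ...).
    Each output channel adder sums the components of all stages.
    """
    channels = np.zeros((num_channels, len(source_signal)))
    for stage in stages:
        components = stage(source_signal)   # shape: (num_channels, N)
        channels += components              # adders 34, 35, ... sum up
    return channels

def direct(x):
    # stand-in for HRTF filtering of the direct sound (stage 31)
    return np.stack([0.8 * x, 0.8 * x])

def first_order(x):
    # stand-in for the first order reflection filter stage 32
    return np.stack([0.3 * x, 0.2 * x])

out = render_output_channels(np.random.randn(48000), [direct, first_order])
```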
[0059] The inventive method avoids the disco ball effect that occurs when a curved surface, approximated by a discrete triangle mesh, is auralized using the classical image sound source technique [3, 4]. The novel technique avoids invisible zones, making the reflection audible throughout. For this procedure, approximations of curved surfaces have to be identified by a threshold face angle. The novel technique is an extension of the original model, with special treatment of faces identified as representing a curvature.
[0060] Classical image sound source techniques [3, 4] do not consider that the given geometry can (partially) approximate a curved surface. This causes dark zones (silence) to be cast away from the edge points of adjacent faces.
[0061] While this invention has been described in terms of several embodiments, there are alterations, permutations, and equivalents which fall within the scope of this invention. It should also be noted that there are many alternative ways of implementing the methods and compositions of the present invention. It is therefore intended that the following appended claims be interpreted as including all such alterations, permutations and equivalents as fall within the true spirit and scope of the present invention.
REFERENCES
[0062] [1] Vorländer, M., "Auralization: Fundamentals of Acoustics, Modelling, Simulation, Algorithms and Acoustic Virtual Reality," Springer Science & Business Media, 2007.
[0063] [2] Savioja, L., and Svensson, U. P., "Overview of geometrical room acoustic modeling techniques," The Journal of the Acoustical Society of America 138.2 (2015): 708-730.
[0064] [3] Krokstad, A., Strom, S., and Sørsdal, S., "Calculating the acoustical room response by the use of a ray tracing technique," Journal of Sound and Vibration 8.1 (1968): 118-125.
[0065] [4] Allen, J. B., and Berkley, D. A., "Image method for efficiently simulating small-room acoustics," The Journal of the Acoustical Society of America 65.4 (1979): 943-950.
[0066] [5] Borish, J., "Extension of the image model to arbitrary polyhedra," The Journal of the Acoustical Society of America 75.6 (1984): 1827-1836.