Audio Apparatus and Method Therefor
20240244391 · 2024-07-18
Inventors
- Jeroen Gerardus Henricus Koppens (Nederweert, NL)
- Sam Martin Jelfs (Valkenswaard, NL)
- Patrick Kechichian (Eindhoven, NL)
CPC classification
H04S7/305
ELECTRICITY
H04S2420/01
ELECTRICITY
H04S7/302
ELECTRICITY
H04S7/30
ELECTRICITY
H04S2400/05
ELECTRICITY
G10H2210/291
PHYSICS
G10H2210/301
PHYSICS
International classification
Abstract
An audio apparatus for generating a flutter echo audio signal comprises a receiver (401) arranged to receive room metadata indicative of properties of a room. The room metadata may e.g. indicate dimensions of the room and/or acoustic reflection data for boundaries of the room. An estimator (405) determines a flutter echo estimate for the room in response to the room metadata where the flutter echo estimate is indicative of a level of a flutter echo in the room. A signal generator (403) includes a feedback delay network comprising a plurality of feedback loops. The signal generator (403) is arranged to generate the flutter echo audio signal from output signals of a set of feedback loops of the plurality of feedback loops being fed an audio source signal. An adapter (407) adapts a first parameter for a first feedback loop of the set of feedback loops in response to the flutter echo estimate.
Claims
1. An audio apparatus comprising: a receiver, wherein the receiver is arranged to receive room metadata, wherein the room metadata is indicative of a plurality of properties of a room; an estimator circuit, wherein the estimator circuit is arranged to determine a flutter echo estimate for the room in response to the room metadata, wherein the flutter echo estimate is indicative of a level of a flutter echo in the room; a signal generator circuit, wherein the signal generator circuit comprises a feedback delay network, wherein the feedback delay network comprises a plurality of feedback loops, wherein the signal generator circuit is arranged to generate a flutter echo audio signal from first output signals, wherein the first output signals are from a first portion of the plurality of feedback loops when the plurality of feedback loops are being fed an audio source signal; and an adapter circuit, wherein the adapter circuit is arranged to change a first parameter for a first feedback loop of the first portion of feedback loops in response to the flutter echo estimate.
2. The audio apparatus of claim 1, wherein the room metadata comprises dimension data for the room, wherein the flutter echo estimate is determined in response to a room dimension in a first direction relative to a room dimension in a second direction.
3. The audio apparatus of claim 1, wherein the room metadata comprises acoustic reflection data for sides of the room, wherein the flutter echo estimate is determined in response to an acoustic reflection attenuation of a first boundary of the room relative to an acoustic reflection attenuation of a second boundary of the room.
4. The audio apparatus of claim 1, wherein the adapter circuit is arranged to increase a feedback factor, wherein the feedback factor is for the first feedback loop, wherein the feedback factor is based on the flutter echo estimate.
5. The audio apparatus of claim 1, wherein each of at least a second portion of the plurality of feedback loops have a second feedback factor, wherein a third portion of the second portion are dependent on room dimensions of the room.
6. The audio apparatus of claim 1, wherein the signal generator circuit is arranged to generate a diffuse reverberation signal from outputs of a fourth portion of the plurality of feedback loops, wherein the first portion of the plurality of feedback loops does not comprise any part of the fourth portion of the plurality of feedback loops, wherein the adapter circuit is arranged to vary a fifth portion of the plurality of feedback loops in response to the flutter echo estimate, wherein the first portion of the plurality of feedback loops comprises the fifth portion of the plurality of feedback loops.
7. The audio apparatus of claim 1, wherein the signal generator circuit comprises a delay for the audio source signal, wherein the delay is applied to the audio source signal prior to a first feedback loop, wherein the first portion of the plurality of feedback loops comprises the first feedback loop, wherein the adapter circuit is arranged to change the delay in response to a position of at least one of an audio source for the audio source signal, a listener position, and a boundary of the room.
8. The audio apparatus of claim 1, wherein the first portion of the plurality of feedback loops comprises at least two feedback loops, wherein the signal generator circuit comprises a delay for the audio source signal, wherein the delay is applied to the audio source signal prior to the at least two feedback loops, wherein the delay is different for each of the at least two feedback loops.
9. The audio apparatus of claim 1, wherein the first portion of the plurality of feedback loops comprises one loop or two loops.
10. The audio apparatus of claim 1, wherein the plurality of feedback loops comprises the first portion and a sixth portion, wherein no part of the first portion of the plurality of feedback loops comprises any part of the sixth portion of the plurality of feedback loops, wherein the adapter circuit is arranged to change feedback factors for at least one of the plurality of feedback loops such that there is no feedback from a feedback loop of the first portion of the plurality of feedback loops to any feedback loop of the sixth portion of the plurality of feedback loops.
11. The audio apparatus of claim 1, further comprising: a spatial processor circuit; and wherein the spatial processor circuit is arranged to apply a spatial processing to the flutter echo signal, wherein the spatial processing is dependent on a position of at least one of a source of the audio source signal and a boundary of the room; and a combiner circuit, wherein the combiner circuit is arranged to combine a diffuse reverberation signal and the flutter echo signal after spatial processing, wherein the signal generator circuit is arranged to generate the diffuse reverberation signal.
12. The audio apparatus of claim 1, further comprising, a spatial processor circuit, wherein the spatial processor circuit is arranged to apply a spatial processing to the flutter echo signal, wherein the spatial processing is dependent on a position of at least one of a source of the audio source signal and a side of the room.
13. The audio apparatus of claim 1, further comprising a first circuit, wherein the first circuit is arranged to feed a plurality of audio source signals to the plurality of feedback loops, wherein at least one audio source signal is fed only to feedback loops of the first portion of the plurality of feedback loops.
14. The audio apparatus of claim 1, wherein the signal generator circuit comprises a gain circuit, wherein the gain circuit is applied to the audio source signal prior to a first feedback loop, wherein the first portion of the plurality of feedback loops comprises the first feedback loop, wherein the adapter circuit is arranged to change the gain circuit in response to at least one of a position of an audio source for the audio source signal, a listener position, a position of a boundary of the room, and a reflection order for an onset of the flutter echo audio signal.
15. The audio apparatus of claim 1, wherein the flutter echo audio signal represents a flutter echo between a pair of opposing boundaries of the room, wherein the signal generator circuit comprises a frequency dependent gain circuit, wherein the gain circuit is applied to the audio source signal prior to a first feedback loop, wherein the first portion of the plurality of feedback loops comprises the first feedback loop, wherein the adapter circuit is arranged to change the gain circuit in response to acoustic reflection data of the room metadata for room boundaries, wherein the acoustic reflection data is indicative of a frequency dependent acoustic property for at least one room boundary, wherein a pair of opposing room boundaries does not comprise the at least one room boundary.
16. The audio apparatus of claim 1, wherein the first portion of the plurality of feedback loops comprises at least two feedback loops, wherein each of the at least two feedback loops have different loop gains.
17. A method comprising: receiving room metadata, wherein the room metadata is indicative of a plurality of properties of a room; determining a flutter echo estimate for the room in response to the room metadata, wherein the flutter echo estimate is indicative of a level of a flutter echo in the room; generating a flutter echo audio signal from output signals of a first portion of a plurality of feedback loops when the plurality of feedback loops are fed an audio source signal, wherein a feedback delay network comprises the first portion of the plurality of feedback loops; and adapting a first parameter for a first feedback loop of the first portion of the plurality of feedback loops in response to the flutter echo estimate.
18. A computer program stored on a non transitory medium, wherein the computer program when executed on a processor performs the method as claimed in claim 17.
19. The audio apparatus of claim 4, wherein the feedback factor is increased when the flutter echo increases.
20. The method of claim 17, wherein the room metadata comprises dimension data for the room, wherein the flutter echo estimate is determined in response to a room dimension in a first direction relative to a room dimension in a second direction.
21. The method of claim 17, wherein the room metadata comprises acoustic reflection data for sides of the room, wherein the flutter echo estimate is determined in response to an acoustic reflection attenuation of a first boundary of the room relative to an acoustic reflection attenuation of a second boundary of the room.
22. The method of claim 17, wherein the adapter circuit is arranged to increase a feedback factor, wherein the feedback factor is for the first feedback loop, wherein the feedback factor is based on the flutter echo estimate.
23. The method of claim 17, wherein each of at least a second portion of the plurality of feedback loops have a second feedback factor, wherein a third portion of the second portion are dependent on room dimensions of the room.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0070] Embodiments of the invention will be described, by way of example only, with reference to the drawings.
DETAILED DESCRIPTION OF SOME EMBODIMENTS OF THE INVENTION
[0089] The following description will focus on audio processing and generation for a virtual reality application, but it will be appreciated that the described principles and concepts may be used in many other applications and embodiments.
[0090] Virtual experiences allowing a user to move around in a virtual world are becoming increasingly popular and services are being developed to satisfy such a demand.
[0091] In some systems, the VR application may be provided locally to a user by e.g. a stand-alone device that does not use, or even have any access to, any remote VR data or processing. For example, a device such as a games console may comprise a store for storing the scene data, input for receiving/generating the user pose, and a processor for generating the corresponding images from the scene data.
[0092] In other systems, the VR application may be implemented and performed remote from the user. For example, a device local to the user may detect/receive movement/pose data which is transmitted to a remote device that processes the data to generate the user pose. The remote device may then generate suitable view images and corresponding audio signals for the user pose based on scene data describing the scene. The view images and corresponding audio signals are then transmitted to the device local to the user where they are presented. For example, the remote device may directly generate a video stream (typically a stereo/3D video stream) and corresponding audio stream which is directly presented by the local device. Thus, in such an example, the local device may not perform any VR processing except for transmitting movement data and presenting received video data.
[0093] In many systems, the functionality may be distributed across a local device and remote device. For example, the local device may process received input and sensor data to generate user poses that are continuously transmitted to the remote VR device. The remote VR device may then generate the corresponding view images and corresponding audio signals and transmit these to the local device for presentation. In other systems, the remote VR device may not directly generate the view images and corresponding audio signals but may select relevant scene data and transmit this to the local device, which may then generate the view images and corresponding audio signals that are presented. For example, the remote VR device may identify the closest capture point and extract the corresponding scene data (e.g. a set of object sources and their position metadata) and transmit this to the local device. The local device may then process the received scene data to generate the images and audio signals for the specific, current user pose. The user pose will typically correspond to the head pose, and references to the user pose may typically equivalently be considered to correspond to the references to the head pose.
[0094] In many applications, especially for broadcast services, a source may transmit or stream scene data in the form of an image (including video) and audio representation of the scene which is independent of the user pose. For example, signals and metadata corresponding to audio sources within the confines of a certain virtual room may be transmitted or streamed to a plurality of clients. The individual clients may then locally synthesize audio signals corresponding to the current user pose. Similarly, the source may transmit a general description of the audio environment including describing audio sources in the environment and acoustic characteristics of the environment. An audio representation may then be generated locally and presented to the user, for example using binaural rendering and processing.
[0095]
[0096] The VR server 303 may for example support a broadcast experience by transmitting an image signal comprising an image representation in the form of image data that can be used by the client devices to locally synthesize view images corresponding to the appropriate user poses (a pose refers to a position and/or orientation). Similarly, the VR server 303 may transmit an audio representation of the scene allowing the audio to be locally synthesized for the user poses. Specifically, as the user moves around in the virtual environment, the image and audio synthesized and presented to the user is updated to reflect the current (virtual) position and orientation of the user in the (virtual) environment.
[0097] In many applications, such as that of
[0098] In some embodiments, a model representing a scene may for example be stored locally and may be used locally to synthesize appropriate images and audio. For example, an audio model of a room may include an indication of properties of audio sources that can be heard in the room as well as acoustic properties of the room. The model data may then be used to synthesize the appropriate audio for a specific position.
[0099] How the audio scene is represented, and how this representation is used to generate audio, is a critical question. Audio rendering aimed at providing natural and realistic effects to a listener typically includes rendering of an acoustic environment. For many environments, this includes the representation and rendering of diffuse reverberation present in the environment, such as in a room. The rendering and representation of such diffuse reverberation has been found to have a significant effect on the perception of the environment, such as on whether the audio is perceived to represent a natural and realistic environment. In the following, advantageous approaches will be described for representing an audio scene and for rendering audio, and in particular for augmenting diffuse reverberation audio, based on this representation.
[0100] The approach will be described with reference to an audio apparatus as illustrated in
[0101] In the example, the audio apparatus is specifically arranged to generate an audio signal which represents how an audio source may be perceived in the current listening environment. It comprises functionality for generating direct and early reflection audio signal components as well as a diffuse reverberation audio signal component. The audio apparatus may thus receive one or more audio source signals and process one, some, or all of these to generate corresponding output signals that include the different components reflecting the behavior of the acoustic environment.
[0102] In addition, the apparatus is arranged to generate a flutter echo audio signal, which is dependent on room metadata that is indicative of properties of a room. If the acoustic environment is a room, this room may be characterized by room metadata and the audio apparatus may be arranged to generate a flutter echo audio signal that may emulate flutter echoes, which may occur in such a room. The flutter echo audio signal may be an additional audio component that is combined with the direct sound, early reflections, and/or diffuse reverberation audio components to provide a more accurate and natural perceived acoustic environment (although it will be appreciated that in some embodiments, only a flutter echo audio signal is generated). Further, as flutter echo typically is very specific to individual rooms, and indeed tends to only be significant or noticeable for some room types/properties, the audio apparatus may specifically provide a flutter echo audio signal when appropriate for the specific room, and typically with the flutter echo audio signal being adapted to reflect these specific conditions. In particular, in many embodiments, the generation of a flutter echo audio signal may be conditional on the room metadata and a flutter echo audio signal may only be generated if the room metadata meets a specific criterion.
[0103] For some room types and properties, opposing (and specifically parallel) boundaries/walls of a room may, in addition to assisting in generating possible early reflections and diffuse reverberation, also cause recurrent echoes at a fixed rate. Such effects may be perceived as a flutter echo reflecting sound bouncing back and forth between opposing walls with the energy decaying as the order of the reflections increases. Flutter echoes may comprise many frequencies (and specifically e.g. all audio frequencies) and are not limited to e.g. standing wave frequencies as known for room modes. They tend to be most noticeable for mid and high frequencies.
[0104] For a flutter echo, the reflected sound is essentially returning from a reflecting wall at a fixed rate with a slightly lower level. The rate of the echo depends on the distance (i.e. time-of-flight) between the walls causing the echo. The level reduction depends on distance attenuation and reflection characteristics of the involved walls. These parameters are typically frequency dependent.
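As a rough illustration of the relationships in paragraph [0104], the following sketch computes the echo repetition interval from the wall spacing and a level change per round trip from two reflection coefficients. The speed of sound value, the function names, and the use of simple broadband pressure reflection coefficients are assumptions made for illustration; distance attenuation and frequency dependence are left out, and the source does not give these formulas.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at about 20 degrees C


def round_trip_time_s(wall_distance_m: float) -> float:
    """Time for sound to travel from one wall to the opposite wall and back."""
    return 2.0 * wall_distance_m / SPEED_OF_SOUND_M_S


def round_trip_level_change_db(refl_a: float, refl_b: float) -> float:
    """Level change per round trip caused by one reflection off each wall.

    refl_a and refl_b are assumed broadband pressure reflection
    coefficients in (0, 1]; distance attenuation is ignored here.
    """
    return 20.0 * math.log10(refl_a * refl_b)


if __name__ == "__main__":
    print(f"round trip: {round_trip_time_s(3.43) * 1000:.1f} ms")
    print(f"level change: {round_trip_level_change_db(0.9, 0.9):.2f} dB")
```

Under these assumptions, a larger wall spacing gives a slower echo rate, and more reflective walls give a smaller level reduction per repetition.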
[0105] Flutter echo is an acoustic feature that may occur in many rooms where the specific room properties allow for suitable reflections, such as e.g. corridors, stairwells or rooms with very different material properties on different boundaries. Including an emulation of this acoustic effect may provide a compelling experience and create more immersion for the user. Nevertheless, commonly used methods cannot and do not perform such emulation.
[0106] The audio apparatus of
[0107] The apparatus specifically generates the flutter echo audio signal using a feedback delay network. Such a feedback delay network may also be used by a parametric reverberator to generate a diffuse reverberation and the functions may thus reuse the same functionality. Such an approach may provide for reduced complexity and/or facilitated operation and may for example in some embodiments allow a dynamic and flexible allocation of resources between the diffuse reverberation and the flutter echo simulation depending on the specific room properties. For the existing structure of the feedback delay network in the parametric reverberator, the approach of
[0108] The audio apparatus of
[0109] The room metadata may specifically comprise data characterizing dimensions of the room, such as the three dimensions of a rectangular room. In some embodiments only one or two dimensions of a room may be represented by the room metadata. The remaining dimension(s) may e.g. be predetermined or assumed dimensions, for example the room metadata may indicate the width and length of a room and the audio apparatus may assume a standard height. In some embodiments, absolute dimension data may be provided whereas other embodiments may alternatively or additionally employ relative dimension data information. In some embodiments, a room outline may for example be provided which not only indicates e.g. a distance between sides/boundaries/walls of the room but also the layout of the room.
[0110] Dimension data may be provided in different ways in different embodiments. For example, the room metadata may include distances in e.g. meters, room volume with dimension ratios, time of flight durations for each dimension, two dimensional or three dimensional data, a mesh, etc.
[0111] In some embodiments, the room metadata may include acoustic reflection data, such as e.g. a reflection coefficient or absorption coefficient for one or more walls of the room, and in many cases for all walls/boundaries of the room.
[0112] Such information may be provided as an acoustic absorption, transmission, coupling, and/or diffusion coefficient for each of the walls of the room.
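One possible in-memory form of the room metadata described in paragraphs [0109] to [0112] is sketched below. The class name, field names, boundary labels, default height, and speed of sound are all hypothetical choices for illustration; the source does not define a data format.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

SPEED_OF_SOUND_M_S = 343.0  # assumed value, not taken from the source


@dataclass
class RoomMetadata:
    """Hypothetical container for room dimensions and boundary data."""

    width_m: Optional[float] = None
    length_m: Optional[float] = None
    height_m: Optional[float] = None
    # Reflection coefficient per boundary, e.g. {"left": 0.9, "right": 0.9}.
    reflection: Dict[str, float] = field(default_factory=dict)

    def resolved_height_m(self, default_height_m: float = 2.5) -> float:
        """Use the signalled height, or fall back to an assumed standard one."""
        return self.height_m if self.height_m is not None else default_height_m


def time_of_flight_to_distance_m(time_of_flight_s: float) -> float:
    """Convert a one-way time of flight between two walls into a distance."""
    return time_of_flight_s * SPEED_OF_SOUND_M_S
```

A room signalled with only width and length, for example `RoomMetadata(width_m=3.0, length_m=10.0)`, would then resolve to the assumed default height.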
[0113] In addition to the room metadata, the receiver 401 may receive one or more audio source signals representing audio of audio sources in the room to be rendered. In many embodiments, the audio sources may be represented by audio objects, but it will be appreciated that the specific audio source signals will depend on the specific embodiment and may, for example, be channel sources or Higher Order Ambisonics (HOA) sources. The audio apparatus is arranged to generate an output signal for one or more of the received audio source signals/objects, and typically will generate an output signal including all audio sources. In many cases an output signal will be generated from a subset of all audio sources that have position metadata indicating that they are inside the room. The audio apparatus may specifically process all the received audio source signals to generate output signals that reflect the acoustic properties of the room including direct sound paths, early reflections, diffuse reverberation, and flutter echoes. The processing may for example be applied to each audio source signal sequentially or in parallel. The resulting output signals may be combined to generate a single rendering signal. For example, a binaural stereo signal may be generated by binaurally processing (at least parts of) the generated output signals for each source and then combining the binaural signals into a single output stereo signal.
[0114] It will be appreciated that the described approach may be applied to an audio apparatus that only generates a flutter echo audio signal and which does not e.g. generate any direct, early reflection, and/or diffuse reverberation signal components. However, the following description will focus on embodiments in which the audio apparatus is arranged to simulate a range of acoustic effects of typical acoustic environments.
[0115] The audio apparatus comprises a signal generator 403 which is arranged to generate one or more output signals from one or more (and typically all) received audio source signals. The signal generator 403 in the present example will generate the output signal(s) to reflect the intended acoustic environment.
[0116]
[0117] In many embodiments, the renderer 501 may also generate the direct path signal based on occluding or diffracting (virtual) elements that are in between the source and user positions.
[0118] In many embodiments, the path renderer 501 may also generate further signal components for individual paths where these include one or more reflections. This may for example be done by evaluating reflections of walls, ceiling etc. as will be known to the skilled person. The path renderer 501 may thus also generate the early reflection components. The direct path and reflected path components may be combined into a single output signal for each path renderer and thus a single signal representing the direct path and early/discrete reflections may be generated for each audio source.
[0119] In some embodiments, the output audio signal for each audio source may be a binaural signal, e.g. generated by applying HRTF or HRIR filters based on relative (angular) positions of the audio source and listener, and thus each output signal may include both a left ear and a right ear (sub)signal.
[0120] The output signals from the path renderers 501 are provided to a combiner 503, which combines the signals from the different path renderers 501 to generate a single combined signal. In many embodiments, a binaural output signal may be generated and the combiner may perform a combination, such as a weighted combination, of the individual signals from the path renderers 501, i.e. all the right ear signals from the path renderers 501 may be added together to generate the combined right ear signals and all the left ear signals from the path renderers 501 may be added together to generate the combined left ear signals.
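The weighted combination described in paragraph [0120] can be sketched as a per-ear sum. The function name, the `Stereo` type alias, and the representation of each signal as a pair of sample lists are assumptions for illustration only.

```python
from typing import List, Sequence, Tuple

# (left ear samples, right ear samples); a hypothetical representation.
Stereo = Tuple[List[float], List[float]]


def combine_binaural(signals: Sequence[Stereo], weights: Sequence[float]) -> Stereo:
    """Weighted sum of per-source binaural signals, ear by ear."""
    if len(signals) != len(weights):
        raise ValueError("one weight per signal is required")
    length = max((max(len(left), len(right)) for left, right in signals), default=0)
    out_left = [0.0] * length
    out_right = [0.0] * length
    for (left, right), weight in zip(signals, weights):
        # Left ear signals only ever contribute to the combined left ear signal.
        for n, sample in enumerate(left):
            out_left[n] += weight * sample
        # Likewise for the right ear.
        for n, sample in enumerate(right):
            out_right[n] += weight * sample
    return out_left, out_right
```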
[0121] It is appreciated that binaural rendering can be replaced by rendering to loudspeaker configurations (e.g. 2.0, 5.1, 7.1, 9.1.4, 22.2) using panning algorithms such as VBAP, generating 2 or more loudspeaker signals. The combiner 503 would in most such embodiments combine all contributions to each loudspeaker signal in the loudspeaker configuration.
[0122] The path renderers and combiner may be implemented in any suitable way including typically as executable code for processing on a suitable computational resource, such as a microcontroller, microprocessor, digital signal processor, or central processing unit including supporting circuitry such as memory etc. It will be appreciated that the plurality of path renderers may be implemented as parallel functional units, such as e.g. a bank of dedicated processing units, or may be implemented as repeated operations for each audio source. Typically, the same algorithm/code is executed for each audio source/signal.
[0123] In addition to the individual path audio components, the audio apparatus is further arranged to generate a signal component representing the diffuse reverberation in the environment. The diffuse reverberation signal is (efficiently) generated by combining the source signals into a downmix signal and then applying a reverberation algorithm to the downmix signal to generate the diffuse reverberation signal.
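The downmix-then-reverberate structure of paragraph [0123] can be sketched as follows. The function names and the idea of passing the reverberation algorithm in as a callable are illustrative assumptions; the source does not specify the downmix gains or the reverberator interface.

```python
from typing import Callable, List, Sequence


def downmix(sources: Sequence[Sequence[float]], gains: Sequence[float]) -> List[float]:
    """Weighted sum of all source signals into one mono downmix."""
    if len(sources) != len(gains):
        raise ValueError("one gain per source is required")
    length = max((len(s) for s in sources), default=0)
    mix = [0.0] * length
    for signal, gain in zip(sources, gains):
        for n, sample in enumerate(signal):
            mix[n] += gain * sample
    return mix


def render_diffuse(
    sources: Sequence[Sequence[float]],
    gains: Sequence[float],
    reverberator: Callable[[List[float]], List[float]],
) -> List[float]:
    """Apply a single reverberation algorithm to the downmix of all sources."""
    return reverberator(downmix(sources, gains))
```

The efficiency comes from running the reverberation algorithm once on the downmix rather than once per source.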
[0124] The audio apparatus of
[0125] An example of a suitable reverberator is the Jot reverberator illustrated in
[0126] The audio apparatus further comprises an echo signal generator 509, which is arranged to generate a flutter echo audio signal (and in many embodiments a plurality of flutter echo audio signals may be generated). The echo signal generator 509 receives the input audio source signal(s) and generates one or more flutter echo audio signals that are fed to the combiner 503, where they are combined with the other generated signal components to provide an output signal which reflects the acoustic properties of the room being simulated.
[0127] The echo signal generator 509, and thus the signal generator 403, comprises a feedback delay network with a plurality of feedback loops.
[0128] An example of such a feedback delay network of the echo signal generator 509 is illustrated in
[0129] Feedback delay networks are typically based on feedback loops with different delays in them. Input signals are inserted in the loops and with appropriate feedback gains, the signals are fed back into the loops. Output signals are extracted by combining signals in the loops. Signals fed in are therefore continuously repeated with different delays. Using delays that are mutually prime and having a feedback matrix that mixes signals between loops can create a pattern that is similar to reverberation in real spaces, and is particularly suitable for generating diffuse reverberation as in the example of a Jot or other parametric reverberator.
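A minimal sketch of the generic feedback delay network structure described in paragraph [0129] is given below, using plain Python lists. It is not the network of the described apparatus: the function name, the argument layout, and the convention that `feedback[i][j]` is the gain from the output of delay line `j` into loop `i` are assumptions, and no loop filters are included.

```python
from typing import List, Sequence


def fdn_process(
    x: Sequence[float],
    delays: Sequence[int],
    feedback: Sequence[Sequence[float]],
    in_gains: Sequence[float],
    out_gains: Sequence[float],
) -> List[float]:
    """Run a mono signal through a small feedback delay network.

    delays[i]      delay of loop i in samples (must be >= 1)
    feedback[i][j] gain from the output of delay line j into loop i
    in_gains[i]    gain applied to the input when feeding loop i
    out_gains[i]   weight of loop i in the output mix
    """
    n_loops = len(delays)
    lines = [[0.0] * d for d in delays]  # one circular buffer per loop
    pos = [0] * n_loops
    y: List[float] = []
    for sample in x:
        # Read the oldest sample from every delay line.
        taps = [lines[i][pos[i]] for i in range(n_loops)]
        # The output is extracted by combining the signals in the loops.
        y.append(sum(out_gains[i] * taps[i] for i in range(n_loops)))
        # Write the input plus the fed-back taps, then advance each buffer.
        for i in range(n_loops):
            fed_back = sum(feedback[i][j] * taps[j] for j in range(n_loops))
            lines[i][pos[i]] = in_gains[i] * sample + fed_back
            pos[i] = (pos[i] + 1) % delays[i]
    return y
```

With a single loop of delay 3 and a self-feedback of 0.5, an impulse comes back every 3 samples at half the previous level, which is the regular, decaying pattern associated with a flutter echo. With several loops, mutually prime delays, and a mixing feedback matrix, the repetitions interleave into a denser pattern closer to diffuse reverberation.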
[0130] The absolute values of the elements in the feedback matrix are designed to be below one in order to achieve a stable, decaying impulse response. The coefficients can be set in combination with the delays to achieve a desired reverberation time (T60). In many implementations, additional gains or filters are included in the loops. These filters can control the attenuation instead of the matrix. Using filters has the benefit that the decaying response can be different for different frequencies.
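One common way of setting a loop gain together with the loop delay to reach a target T60 is sketched below. This relation is a standard one for a single recirculating delay and is offered as an assumption; the source does not state which formula is used. It follows from requiring the amplitude to drop by 60 dB (a factor of 10^-3) after T60 seconds, during which the signal passes through the loop T60 * fs / d times.

```python
def loop_gain_for_t60(delay_samples: int, sample_rate_hz: float, t60_s: float) -> float:
    """Gain per pass through a loop so that the level falls 60 dB in t60_s.

    Solves gain ** (t60_s * sample_rate_hz / delay_samples) == 10 ** -3.
    """
    if delay_samples <= 0 or sample_rate_hz <= 0.0 or t60_s <= 0.0:
        raise ValueError("delay, sample rate and T60 must be positive")
    return 10.0 ** (-3.0 * delay_samples / (sample_rate_hz * t60_s))
```

Longer delays need a smaller gain per pass for the same T60, because the signal passes through the loop fewer times per second.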
[0131] In the audio apparatus, such a feedback delay network may be used to generate the flutter echo audio signal, and in many embodiments a feedback delay network may be used to generate both the flutter echo audio signal and diffuse reverberation. In particular, the same feedback delay network may be used for both with the parameter values being determined to provide the desired effect. Specifically, when no flutter echo is to be generated, all feedback loops of the feedback delay network may be used to generate diffuse reverberation components and the parameters may be set accordingly. If a flutter echo audio signal is to be generated, one or more (typically only a few, such as no more than two or three) feedback loops are used to generate the flutter echo audio signal and the remaining feedback loops are used to generate the diffuse reverberation signal. The reassigned feedback loops are then set up with suitable parameters for generating a flutter echo audio signal. In many embodiments, a total of e.g. 8-20 feedback loops may be provided with no more than three of these being used for generation of the flutter echo audio signal when appropriate.
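The reassignment of loops in paragraph [0131] can be illustrated with the sketch below. The function names, the choice of the lowest indices for the flutter loops, and the zeroing of cross-coupling in both directions are assumptions; the matrix convention assumed here is that `feedback[i][j]` is the gain from loop `j` into loop `i`.

```python
from typing import List, Sequence, Tuple


def allocate_loops(n_loops: int, n_flutter: int) -> Tuple[List[int], List[int]]:
    """Split loop indices into (flutter echo loops, diffuse reverberation loops)."""
    if not 0 <= n_flutter <= n_loops:
        raise ValueError("n_flutter must be between 0 and n_loops")
    return list(range(n_flutter)), list(range(n_flutter, n_loops))


def decouple_flutter_loops(
    feedback: Sequence[Sequence[float]], flutter: Sequence[int]
) -> List[List[float]]:
    """Copy of the feedback matrix with flutter/diffuse cross-coupling set to zero."""
    flutter_set = set(flutter)
    out = [list(row) for row in feedback]
    for i in range(len(out)):
        for j in range(len(out[i])):
            # Zero every entry that links a flutter loop with a diffuse loop.
            if (i in flutter_set) != (j in flutter_set):
                out[i][j] = 0.0
    return out
```

With `n_flutter` set to 0, all loops remain available for diffuse reverberation, which matches the case where no flutter echo is to be generated.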
[0132] As a specific example, the approach may provide a way to include flutter echo simulation using the existing structure of the feedback delay network in the parametric reverberator generating the diffuse reverberation. This may add further characteristic features of a room's acoustics to the set of simulation tools, providing a more realistic modelling of common rooms in a virtual rendering.
[0133] The feedback delay network may thus be common to the echo signal generator 509 and to the reverberator 507.
[0134] In the example of
[0135] The audio apparatus is arranged to adapt the flutter echo audio signal generation. In particular, in many embodiments, the audio apparatus may be arranged to adapt a degree or level of flutter echo dependent on the room properties of the simulated room, and indeed in many embodiments the audio apparatus may be able to adapt whether a flutter echo audio signal is generated or not depending on the room properties. Thus, the flutter echo simulation is not merely a static generation of a flutter echo audio signal that provides a flutter echo effect but is rather a dynamically adapted flutter echo generation that depends on room properties, and especially rather than always generating a flutter echo effect, this may in many embodiments only be done when it is determined that flutter echoes are likely to be significant in the specific room.
[0136] The audio apparatus comprises an estimator 405 which is arranged to determine a flutter echo estimate for the room based on the received room metadata. The flutter echo estimate is indicative of a level/degree/amount/prevalence of flutter echo in the room.
[0137] The exact approach and algorithm or function for determining the flutter echo estimate may differ between different embodiments and may depend on the exact performance and operation desired for the individual application. In many embodiments, the flutter echo estimate may be generated to be indicative of an increasing level of flutter echo for the room metadata being indicative of reflections between one pair of opposing boundaries/walls being higher than for other pairs of boundaries/walls. This may for example be the case if the pair of opposing walls are substantially further apart from each other than other pairs of opposing walls and/or if the combined reflection attenuation for the pair of opposing walls is lower than for other pairs of walls. In such cases, the echoes occurring between the pair of opposing walls may be substantially stronger than other reflection paths that occur between walls and this may lead to more significant flutter echoes (generated by the pair of opposing walls) relative to other reflections creating e.g. the diffuse reverberation. Specifically, these flutter echoes may decay slower than other reflections creating, e.g., the diffuse reverberation. This may lead to more significant flutter echoes after a certain amount of time after emission by the source, e.g. 30 ms.
[0138] The estimator 405 is coupled to an adapter 407, which is arranged to adapt a parameter of at least one of the feedback loops of the feedback delay network in response to the flutter echo estimate. In many embodiments, the parameter may be a feedback factor (which may be frequency dependent) for the loop to itself, a feedback factor (which may be frequency dependent) for the loop to another loop of the feedback delay network, a feedback factor (which may be frequency dependent) from another loop to this loop, a loop gain/weight, a loop delay, a loop transfer function, and/or an extraction coefficient/weight for generating an output signal.
[0139] In many embodiments, a common feedback delay network may be used for the generation of diffuse reverberation and for the generation of the flutter echo signal. In such cases, feedback loops may be dynamically allocated to be used either for diffuse reverberation generation or for flutter echo audio signal generation and this may be done by adapting the parameters of the loops to be suitable for the diffuse reverberation or for the flutter echo audio signal. Thus, in many embodiments, the adapter 407 may for at least one feedback loop be arranged to switch between parameter values for generating a diffuse reverberation signal to parameters for generating a flutter echo audio signal in response to the flutter echo estimate.
[0140] The audio apparatus is accordingly arranged to determine the degree of a flutter echo that is considered to be present in the room and may set up the feedback loops of the feedback delay network to generate a flutter echo audio signal corresponding to this flutter echo.
[0141] The approach may provide an improved acoustic simulation in many embodiments and may in particular provide more naturally sounding audio when simulating rooms having particular characteristics resulting in specific flutter echoes being significant, without sacrificing performance for rooms in which flutter echo may not be significant or even noticeable.
[0142] The main driving factor defining a reverberation response is a sound wave's traveled distance. It causes attenuation and delay. However, each reflection on a surface causes an additional attenuation without adding any delay. Therefore, repetitive reflections in a small room dimension decay faster than for a large room dimension. Flutter echo will decay faster in short room dimensions than in large ones.
[0143] The flutter-echo decay-rate is most often in line with the room's reverberation time T60, as the different dimensions of the room are roughly similar. This means the flutter echo is mixed with the other reflections that take different paths across multiple dimensions. These are causing a less regular reflection behavior. Due to the similar decay characteristics, the flutter echo will not be particularly noticeable in many situations and it is not considered for typical current approaches.
[0144] However, when one room dimension clearly deviates from the others by being much larger, there will be flutter echo in this dimension that deviates significantly from most of the reflection rates in the room. It will decay more slowly than the other reflection paths, because there will be fewer reflection interactions with the room boundaries. This makes it stand out from the rest of the reverberation, since fewer reflections result in less attenuation over time, and it accordingly becomes more audible. An example of a room impulse response showing flutter echoes is illustrated in
[0145] Similarly, flutter echo can stand out in the reverberation response when two parallel walls are significantly more reflective than other walls in the room. This makes the flutter echo in this dimension decay slower because each interaction with a wall is less destructive than in flutter echo in other dimensions and the reflection paths crossing multiple dimensions.
[0146] As described, flutter echo may result from the repetitive bouncing of a sound wave between two parallel surfaces. Such echoes tend to exist in all rooms, but stand out more in some rooms depending on their shape or their boundaries' relative material properties.
[0147] In the example, the estimator 405 may generate the flutter echo audio signal to reflect the difference in room dimensions. The room metadata may include dimension data for the room and the adapter 407 may determine the flutter echo estimate based on a room dimension in a first direction relative to a room dimension in a second direction. For example, the horizontal dimensions between the two parallel pairs of walls in a rectangular room may be determined from information of the size of the room indicated by the room metadata. The ratio of the longest dimension and the shortest dimension (or second longest dimension) may then be determined and used as an indication of how strong the flutter echo is, i.e. the ratio may be used directly as the flutter echo estimate.
[0148] The adapter 407 may then e.g. compare the flutter echo estimate in the form of the ratio to a threshold, and if the threshold is exceeded, it may configure some of the feedback loops of the feedback delay network to generate a flutter echo audio signal, and if it is below the threshold, it may instead configure the loops to contribute to the generation of the diffuse reverberation (and thus no flutter echo audio signal is generated). In other embodiments, a more gradual approach is used, such as for example by permanently using one or more feedback loops to generate a flutter echo audio signal, but with this having an amplitude that is a monotonically increasing function of the ratio/flutter echo estimate.
[0149] Alternatively or additionally, the adapter 407 may in some embodiments determine the flutter echo audio signal in response to variations in acoustic reflection attenuation for sides/boundaries/walls of the room. The room metadata may include acoustic reflection attenuation for walls of the room and the flutter echo estimate may be generated to reflect the variation of these. Specifically, the flutter echo estimate may be generated in response to a difference between a combined acoustic reflection attenuation for a pair of opposing sides of the room relative to a combined acoustic reflection attenuation for other pairs of opposing sides of the room. E.g. a ratio between such combined acoustic reflection attenuations may be determined and the flutter echo estimate may be generated directly as this ratio. The higher the difference, the higher the flutter echo estimate. As described for the dimension example, the adapter 407 may proceed to adapt the operation based on the ratio.
[0150] It will be appreciated that in many embodiments, the flutter echo estimate may be generated as a combination of different considerations and that specifically in many embodiments both room dimensions and acoustic reflection attenuations of the walls/sides of the room may be considered when generating the flutter echo estimate.
[0151] As mentioned, one potential cause for noticeable flutter echoes is a room with one deviating dimension being substantially longer than the other dimension(s), such as for a corridor. In such a case, the echoes of the two opposing walls in the deviating dimension will have longer path lengths that give rise to the flutter echo standing out from the rest of the Room Impulse Response (RIR). However, the reflecting paths fully orthogonal to the walls may be supplemented by reflection paths with additional reflections on the other boundaries in the short dimensions, but with the extension in the sideways direction being relatively small.
[0152] As a result, the path lengths of a significant portion of early reflections are dominated by the distance in the deviating dimension. This effect becomes stronger for higher reflection orders. If mirrored sources are spread e.g. by about 40 m in one dimension, a spread of e.g. about 4 m in another dimension does not add much more distance. Therefore, multiple reflections of different orders will be grouped close to each other in the RIR with only slightly different lags.
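The figures in the preceding paragraph can be checked with a short calculation. The following is a minimal sketch, assuming a speed of sound of 343 m/s and a 48 kHz sample rate (neither value is stated for this example in the text); the function names are illustrative only.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value
SAMPLE_RATE = 48000     # Hz, assumed value


def extra_path(long_offset_m, side_offset_m):
    """Extra distance caused by a sideways offset of a mirrored source."""
    return math.hypot(long_offset_m, side_offset_m) - long_offset_m


def lag_in_samples(extra_m, sample_rate=SAMPLE_RATE, c=SPEED_OF_SOUND):
    """Arrival-time difference for the extra distance, in samples."""
    return extra_m / c * sample_rate


extra = extra_path(40.0, 4.0)
print(f"extra path: {extra:.3f} m, lag: {lag_in_samples(extra):.1f} samples")
```

With a 40 m offset in the long dimension, a 4 m sideways offset adds only about 0.2 m of path length, and the same sideways offset adds even less at larger long-dimension offsets, which is why higher-order reflections cluster more tightly.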
[0153] This means that flutter echo is not purely caused by a sound wave bouncing back and forth between two parallel surfaces. That effect just causes the first, and strongest, reflection of a sequence of reflections. More reflections may follow, representing one or more shallow, additional reflections on one of the long boundaries. These cause the clearly visible recurring bursts of concentrated energy in the RIR. This may result in flutter echoes that are not only a single echo reflection but with each echo essentially including a sequence of compound reflections.
[0154] Towards higher orders of the main flutter echo, the other reflections with similar distances will become more densely compressed in time. I.e. the lengths of the paths that bounce once or twice on a long room boundary will be closer to the path length without reflections on the long boundaries than for lower orders. Examples of such compound flutter echoes are illustrated in
[0155] Working in the digital domain, this means that at some point multiple reflections contribute to the same (discrete) filter delay. Their contributions add up and make the impulse response amplitude of these bursts larger than what they would be with an infinite sample-rate.
[0156] In the specific example, the audio apparatus implements an approach for adding simulation of flutter echoes by using the existing framework of a parametric reverberator. The overall complexity of the audio apparatus may thus not change substantially.
[0157] The audio apparatus may base the operation on room metadata descriptive of:
[0158] room dimensions,
[0159] locations and orientations of the room boundaries, and/or
[0160] material properties related to the room boundaries.
[0161] Based on the metadata, the estimator 405 may first determine whether flutter echo is a likely audible acoustic property of the room where the user is located. For example, this may be considered the case when one dimension is significantly larger than the other two, or the reflective properties of the material on walls in one dimension are significantly stronger than in the other two. A flutter echo estimate may be generated that reflects this.
[0162] The adapter 407 may adapt the operation of the signal generator 403 in response to the flutter echo estimate. If this indicates that flutter echo is significant, the configuration parameters of the feedback delay network of the parametric reverberator are modified so that one or more of its feedback loops will model the flutter echo.
[0163] The adapter 407 may then proceed to set the loop delay to be proportionate to the room dimension in which the flutter echo occurs, to set the loop filter to correspond to the (combined) material properties of the walls involved with the flutter echo, and to adapt the feedback matrix to isolate the loops from the remaining regular feedback loops. Thus, a number of parameters of the feedback loops may be set to emulate the flutter echo.
[0164] Thus, in some embodiments, a flutter echo estimate may be generated and evaluated to determine whether to simulate the flutter echo or not. It is only necessary when the flutter echo would be audible. Typically, there are two potential main root causes for audible flutter echo:
[0165] a room dimension is significantly larger than the other two,
[0166] the reflective properties of the material on walls in one dimension are significantly stronger than in the other two.
[0167] Also, combinations of the above could cause flutter, for example when two dimensions are significantly larger than the third, but one of these is much less reflective than the other.
[0168] A room dimension may e.g. be considered significantly larger than the other two when it is twice as big as the maximum of the other two dimensions. An alternative criterion may be when one room dimension is at least 3.1 times as long as the average dimension of the other two dimensions. In some embodiments, it may be when a room dimension is at least 50% longer than the average of all three room dimensions.
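The three example criteria above can be sketched as follows. This is a minimal illustration, assuming the room is described by three dimensions in metres; the function name and the criterion labels are hypothetical and not part of the described apparatus.

```python
def dominant_dimension(dims, criterion="twice_max"):
    """Return the index of a dimension that is 'significantly larger'
    than the other two under the chosen criterion, or None."""
    for i, d in enumerate(dims):
        others = [x for j, x in enumerate(dims) if j != i]
        if criterion == "twice_max":
            # at least twice the maximum of the other two dimensions
            ok = d >= 2.0 * max(others)
        elif criterion == "avg_others":
            # at least 3.1 times the average of the other two dimensions
            ok = d >= 3.1 * (sum(others) / len(others))
        elif criterion == "avg_all":
            # at least 50% longer than the average of all three dimensions
            ok = d >= 1.5 * (sum(dims) / len(dims))
        else:
            raise ValueError(f"unknown criterion: {criterion}")
        if ok:
            return i
    return None
```

For example, a 20 m by 4 m by 3 m corridor qualifies in its first dimension under the "twice the maximum" criterion, whereas a 5 m by 4 m by 3 m room does not qualify at all.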
[0169] If a room is not a rectangular cuboid (shoebox), the dimensions may be set to the outer limits of the geometry in all three dimensions.
[0170] Alternatively, a room may be eligible for flutter echo simulation if the material properties of room boundaries in one dimension are significantly different from those in other dimensions. The reflection may be represented by a parameter reflecting the acoustic reflection attenuation such as a reflection or absorption coefficient. For example, the room may be eligible if the average reflection coefficient (a value between 0, non-reflective, and 1, fully reflective) of both walls in one room dimension is at least 0.2 higher than the maximum average reflection coefficient of both walls in the two other directions. Similarly, the average reflection coefficient of each wall pair may be compared to the average of all walls or the average of the two other wall pairs, for example requiring the average reflection coefficient to be at least 20% larger than the overall average. Additionally, a minimum required reflection coefficient may be introduced, e.g. the average reflection coefficient must be at least 0.67.
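The reflection-coefficient criterion can be sketched in the same way. The following is a minimal illustration, assuming one broadband reflection coefficient per wall, the 0.2 margin and the 0.67 minimum from the examples above; the function name and argument layout are hypothetical.

```python
def reflective_pair(pair_coeffs, margin=0.2, min_coeff=0.67):
    """pair_coeffs holds three (r_a, r_b) tuples, one per room dimension,
    with reflection coefficients in [0, 1]. Returns the index of the wall
    pair whose average coefficient exceeds the maximum of the other two
    averages by at least `margin` and is at least `min_coeff`, else None."""
    averages = [(a + b) / 2.0 for a, b in pair_coeffs]
    for i, avg in enumerate(averages):
        others = [x for j, x in enumerate(averages) if j != i]
        if avg >= max(others) + margin and avg >= min_coeff:
            return i
    return None
```

A pair of highly reflective walls (e.g. 0.9 each) facing pairs averaging 0.55 and 0.45 would be selected, while a pair averaging 0.6 would be rejected by the minimum-coefficient condition even if the other walls were much more absorbent.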
[0171] In other embodiments, absorption coefficients may be used to reflect the acoustic reflection attenuation, and these may be required to be smaller in the candidate flutter dimension than in the other dimensions. For example, an average absorption coefficient smaller than 85% of the average absorption coefficients of the wall pairs in other dimensions may be required.
[0172] Reflection (or absorption) coefficients are often frequency-dependent. They may be averaged over all frequencies or over a subset of frequencies. Additionally, averaging may happen over wall segments with different material properties.
[0173] Thus, a flutter echo estimate may be generated to reflect such parameters and the adapter 407 may determine whether to simulate the flutter echo or not based on whether the flutter echo estimate meets a suitable criterion.
[0174] The flutter echo estimate, and specifically deciding whether flutter echo will be simulated or not, may include a consideration of the combination of room dimension and material properties. E.g. either of separate criteria being met may cause flutter echo to be simulated. Other embodiments may only simulate the flutter echo when both a room dimension is significantly larger and the corresponding average material properties are significantly different. Optionally, or alternatively, the reflection coefficients of a candidate flutter dimension may additionally be required to be at least a minimum value.
[0175] In some embodiments, the dimension and material properties are combined into an estimated decay time (e.g. T60). If the estimated one dimensional decay time of one dimension is at least 30% longer than the maximum of the one dimensional decay times in the other two dimensions, flutter echo may be simulated in that dimension. In other embodiments the decay time may need to be at least 0.5 seconds longer than in the other dimensions.
[0176] The decay time can be estimated from the dimension and corresponding walls' average reflection coefficient. In the time it takes a sound wave to travel back and forth the room in that dimension, it attenuates due to the distance it traveled and two reflections on the walls. As an example, an estimated T60 decay time may be calculated according to:
[0177] This determines the attenuation in one back and forth path in the room dimension with size $D$. Using a source's reference distance $d_{\mathrm{ref}}$ and the average reflection coefficient
[0178] In other embodiments, estimated one-dimensional decay times may be compared to overall room decay times, e.g. if the one-dimensional T60 is 10% longer than that estimated for the entire room. Overall room T60 can be estimated with equations such as a Sabine or Norris-Eyring formula.
[0179] The decision whether flutter echo should be simulated may also be a soft decision. By, e.g., choosing a low threshold where flutter echo is likely just inaudible and a high threshold where the flutter echo is likely audible, any cases in between these thresholds would result in a confidence between 0 and 1. A weight w=0 corresponds to no audible flutter echo and w=1 corresponds to full confidence that flutter echo is audible.
[0180] For example, if the 1-dimensional decay time in dimension 1, $T60_{\mathrm{est}}^{(1)}$, is compared against the average of all decay times in the room, there may be thresholds at 110% and at 150%: below 110% no flutter echo is simulated, above 150% the confidence is 1, and between the thresholds the confidence increases linearly from 0 to 1.
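The soft decision with the 110% and 150% thresholds amounts to a clamped linear ramp. A minimal sketch, assuming the one-dimensional decay time and the reference decay time (e.g. the average over the room) are already available in seconds; the function and parameter names are hypothetical.

```python
def flutter_confidence(t60_dim, t60_reference, low=1.10, high=1.50):
    """Weight w in [0, 1]: 0 at or below the low threshold, 1 at or above
    the high threshold, linear in between. Thresholds are ratios of the
    one-dimensional decay time to the reference decay time."""
    ratio = t60_dim / t60_reference
    if ratio <= low:
        return 0.0
    if ratio >= high:
        return 1.0
    return (ratio - low) / (high - low)
```

A one-dimensional decay time that is 130% of the reference lies halfway between the thresholds and therefore yields a weight of about 0.5.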
[0181] In some embodiments, the room characteristics may not directly be available but may e.g. be characterized by a Room Impulse Response. In some embodiments, the room metadata may include a RIR and the estimator 405 may be arranged to generate the flutter echo estimate in response to the RIR. In this example, the parameters of the feedback delay network may be determined from a flutter echo estimate generated from the RIR. Measuring impulse responses is more amenable to rooms with arbitrary shapes that deviate from a rectangular shoe-box model.
[0182] In such an embodiment, the presence of flutter echo can be measured using a smoothed version of the magnitude squared IR ($e_{\mathrm{smooth}}(n)$). By applying minimum tracking to the IR ($e_{\mathrm{min}}(n)$), any flutter echo components may be isolated. This is because discernible flutter echo will decay more slowly than the remaining reverberant reflections, and tracking the minimum approximates the reverb decay envelope. An example of this is shown in
[0183] Subtracting the two signals isolates the flutter echo components if any exist. If the energy of this signal exceeds a certain threshold, it may be determined that flutter echo is present. This decision may also be represented as a percentage of the reverberation, i.e.
[0184] As another example, the difference between the two echograms, $e_{\mathrm{smooth}}(n) - e_{\mathrm{min}}(n)$, may be used to derive properties related to the delay and decay of the flutter echo, and used to configure the feedback delay network.
[0185] In some embodiments, a peak-picking algorithm can be used to extract local maxima and their timestamps. The decay rate of these echoes can be determined by fitting an exponential decay model to the peaks. Together, the decay rate and timestamps can be used to determine parameters for the feedback loops.
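The peak picking and exponential fit can be sketched as below. This is a minimal illustration and not necessarily the algorithm used in any embodiment: it assumes a non-negative input sequence (such as the echogram difference), a simple three-point local-maximum test, and a least-squares fit on the logarithm of the peak values. All function names are hypothetical.

```python
import math


def pick_peaks(signal, threshold=0.0):
    """Return (sample_index, value) pairs of local maxima above threshold."""
    return [(n, signal[n]) for n in range(1, len(signal) - 1)
            if signal[n] > threshold
            and signal[n] > signal[n - 1]
            and signal[n] >= signal[n + 1]]


def fit_exponential_decay(peaks, sample_rate):
    """Fit value = a * exp(-k * t) to the peaks by linear regression on
    log(value). Needs at least two peaks. Returns (a, k), k in 1/s."""
    ts = [n / sample_rate for n, _ in peaks]
    ys = [math.log(v) for _, v in peaks]
    t_mean = sum(ts) / len(ts)
    y_mean = sum(ys) / len(ys)
    sxx = sum((t - t_mean) ** 2 for t in ts)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(ts, ys))
    slope = sxy / sxx
    return math.exp(y_mean - slope * t_mean), -slope


def mean_peak_spacing(peaks, sample_rate):
    """Average time between consecutive peaks in seconds."""
    idx = [n for n, _ in peaks]
    gaps = [b - a for a, b in zip(idx, idx[1:])]
    return sum(gaps) / len(gaps) / sample_rate
```

The mean peak spacing is a candidate for the loop delay, and the fitted decay rate indicates how much attenuation the loop filter should apply per pass.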
[0186] The adapter 407 may be arranged to adapt parameters in different ways in different embodiments depending on the desired performance. The parameters for generating the flutter echo audio signal may be substantially different from the parameters used by feedback loops when generating diffuse reverberation.
[0187] The delays in a feedback delay network for generating reverberation are typically chosen relatively small, such that they create a fast build-up of reflection density. For example, an average of 12 ms is often used but for high bandwidth signals (e.g. 48 kHz) this is typically even smaller.
[0188] The choice of delay is often dependent on the reverberation time (T60). Although this is usually positively correlated with a room's dimensions, the material properties of the room boundaries also have a significant effect on the T60: the material properties add attenuation to the RIR, on top of the distance attenuation, without adding latency, and the room dimensions determine the rate at which these attenuations occur in the RIR. Hence, the configuration of parametric reverberators is mainly determined by the overall reverb property, T60, and the desire to quickly reach a minimum reflection density to accurately model a room (for example: 1,000-10,000 reflections per second).
[0189] In contrast, when a feedback loop is configured to generate the flutter echo audio signal, the adapter 407 may select a loop delay that corresponds to the room dimensions in order to simulate the rate of the flutter echo. The loop filter that normally simulates the overall reverb slope, T60, may instead be chosen to correspond with the average material properties of the walls involved with the flutter echo to simulate the effect of the walls at each reflection.
[0190] The feedback matrix may in many embodiments be adjusted to keep the flutter echo separate from the diffuse reverb generation, so that the consistent recurrence of the flutter echo is simulated. If multiple different flutter echoes exist in a room, multiple feedback loops can be repurposed in a similar fashion.
[0191] In many embodiments, the adapter 407 may be arranged to increase a feedback factor/gain from the first feedback loop to itself for the flutter echo estimate being indicative of an increasing level of flutter echo. For an increasing degree of flutter echo, the feedback from a given feedback loop to itself may be increased. Alternatively or typically additionally, the adapter 407 may be arranged to decrease a feedback factor from the first feedback loop to a second feedback loop of the plurality of feedback loops for the flutter echo estimate being indicative of an increasing level of flutter echo. The second feedback loop may be a feedback loop that is not configured to be used for flutter echo generation but instead is used for generation of diffuse reverberation.
[0192] In some examples, a feedback loop used for generating a flutter echo may only feed back to itself. In some examples, a feedback loop used for generating a flutter echo may not feed back to any other feedback loop configured to generate a flutter echo. In some examples, a feedback loop used for generating a flutter echo may only receive a feedback signal from itself (out of the set of feedback loops used for generating the flutter echo or possibly out of all feedback loops of the feedback delay network).
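The isolation of flutter loops can be pictured as a block-diagonal feedback matrix. The following is a minimal sketch under two assumptions that are not taken from the text: the flutter loops use a unit self-feedback entry (leaving all attenuation to the loop filter), and the diffuse loops use a Householder matrix, which is one common lossless choice for feedback delay networks. The function names are hypothetical.

```python
def householder(n):
    """n-by-n Householder matrix I - (2/n) * ones (assumed diffuse block)."""
    return [[(1.0 if i == j else 0.0) - 2.0 / n for j in range(n)]
            for i in range(n)]


def isolate_flutter_loops(num_loops, num_flutter):
    """Feedback matrix in which the first `num_flutter` loops feed back
    only to themselves and the remaining loops share a diffuse block."""
    n_diffuse = num_loops - num_flutter
    diffuse = householder(n_diffuse)
    matrix = [[0.0] * num_loops for _ in range(num_loops)]
    for i in range(num_flutter):
        matrix[i][i] = 1.0  # assumed unit gain; loop filter attenuates
    for i in range(n_diffuse):
        for j in range(n_diffuse):
            matrix[num_flutter + i][num_flutter + j] = diffuse[i][j]
    return matrix
```

With eight loops and one flutter loop, the first row and the first column are zero except for the diagonal entry, so the flutter loop neither sends signal to nor receives signal from the seven diffuse loops.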
[0193] The adaptation may for example be a gradual adaptation but in other embodiments the adaptation may for example be a step function. For example, if the flutter echo estimate is indicative of the flutter echo not being significant, a suitable feedback factor may be relatively low as the feedback loop may be mainly used to contribute to the diffuse reverberation in which case the feedback from a given loop is increasingly distributed to different loops to reflect the many different reflections making up the diffuse echo. However, if the flutter echo estimate indicates that flutter echo is significant, the feedback factor of the loop may be increased and the feedback factor to other loops may be reduced to reflect an increasing amount of periodic reflection corresponding to a typical flutter echo.
[0194] Some examples of the adaptation may be described below with reference to the example of
with D being the distance between the walls and c being the speed of sound), but with four different offsets, depending on where the user and source are between the walls.
[0195] In some low complexity embodiments, the audio apparatus may simplify this by repurposing only a single reverberator loop, using a delay corresponding to a single path-length between the walls
This corresponds with the listener and source being in the middle of the room where both the dotted- and the solid line reflections reach the listener simultaneously at a fixed rate.
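Converting a wall-to-wall distance into a loop delay is a one-line calculation. A minimal sketch, assuming a speed of sound of 343 m/s and rounding to whole samples (both are assumptions for illustration); the `round_trip` flag selects between the double path length and the single path length used in the low complexity case.

```python
SPEED_OF_SOUND = 343.0  # m/s, assumed value


def loop_delay_samples(wall_distance_m, sample_rate, round_trip=True,
                       c=SPEED_OF_SOUND):
    """Loop delay in samples for walls `wall_distance_m` apart.
    round_trip=True uses the path 2*D, round_trip=False uses D."""
    path = 2.0 * wall_distance_m if round_trip else wall_distance_m
    return round(path / c * sample_rate)
```

For walls 34.3 m apart at a 48 kHz sample rate, the round trip takes 0.2 s, which corresponds to 9600 samples, and the single path to 4800 samples.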
[0196] The feedback matrix for a reverberator with N loops, using the first loop for the flutter echo could be defined as:
where $A_d^{(N-1)}$ is a regular feedback matrix for diffuse reverb modelling only on $N-1$ loops. The reverberator's T60 filter (or loop filter) in the flutter loop may simulate the average reflection characteristics of the walls. For example:
where $G_d(x)$ is a function that returns the distance attenuation for a path length of $x$ meters, which might be a frequency dependent attenuation,
[0197] The function $G_d(x)$ provides distance attenuation for a sound-wave propagating $x$ meters. This can be a simple attenuation based on an omnidirectional source where its energy is spread over a sphere with radius $x$. It is well known that every doubling of the distance (i.e. radius) causes a 6 dB attenuation. In many embodiments a reference distance may be used as the distance for which the source signal is defined, where the distance attenuation is considered to be included in the signal and for which the additional distance attenuation from $G_d(x)$ equals 0 dB.
[0198] Additionally, other aspects may be added to $G_d(x)$, such as the effect of air absorption $G_{\mathrm{abs}}(x)$. This effect typically becomes more significant at greater distances, and tends to be frequency-dependent. Typically, the effect of air absorption is quite small, especially when considered for realistic room dimensions $D$.
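The simple distance attenuation described above follows the inverse distance law for amplitude. A minimal sketch, assuming a frequency-independent gain and a default reference distance of 1 m (the default value is an assumption); the function names are hypothetical.

```python
import math


def distance_gain(x_m, d_ref_m=1.0):
    """Linear amplitude gain for a path of x_m metres, relative to the
    reference distance d_ref_m at which the gain is 1 (0 dB)."""
    if x_m <= 0.0 or d_ref_m <= 0.0:
        raise ValueError("distances must be positive")
    return d_ref_m / x_m


def distance_gain_db(x_m, d_ref_m=1.0):
    """The same gain expressed in decibels."""
    return 20.0 * math.log10(distance_gain(x_m, d_ref_m))
```

At the reference distance the gain is 0 dB, and each doubling of the distance lowers it by approximately 6 dB.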
[0199] The described embodiments may use
[0200] Reflection coefficients may not be averaged but may e.g. be adapted to the lateral position of the source in front of the wall, i.e. where most of the flutter echoes will occur. In such embodiments, multiple sources may be grouped in separate loops according to their associated reflection coefficient.
[0201] As another example, the audio apparatus may be adapted to use four isolated loops with delays corresponding with the double path length ($\tau_r$) when emulating the flutter echo of
[0202] The feedback matrix in this case could be defined as:
where the first four loops are dedicated to the flutter echoes.
[0203] The reverberator's loop filters in the flutter loop would both simulate the average reflection characteristics of the walls. For example:
[0204] Thus, each loop filter now simulates the attenuation resulting from the sound wave propagating through the medium (e.g. air) twice the wall-to-wall distance, and the reflections on both of the walls.
[0205] The advantage of this embodiment is that it simulates the asymmetry in the two loops similar to how it would be in a real room. Adjusting the pre-delay can be used to adapt the asymmetry to the user's position in the room, without having to update the parameters of the feedback delay network itself.
[0206] For example, if the listener is 30% of the wall-to-wall distance from wall 1101 and the source 15% of the wall-to-wall distance from wall 1102, the pre-delays could be set as follows:
based on the four first path-lengths from the source to the listener. Alternatively, the delays may be minimized to only reflect the path length differences, i.e.,
[0207] The previous embodiment may in some cases be simplified by combining the pre-delayed signals prior to feeding the combined signal to a single feedback loop, as e.g. illustrated in the example of
[0208] The feedback matrix may be:
[0209] And the loop filter would be the same as in the previous embodiment:
[0210] The loop simulates the path-length attenuation and reflections on two walls, but the pre-delay structure takes care of generating the offsets within the signal. Delays would be the same as in the previous example.
[0211] The pre-delay structure can also be extended to include gains or filters simulating the distance attenuation and reflections off the walls in these first paths. Such filters could also include additional filtering and/or attenuation simulating earlier propagation and reflections of the flutter echo, as the simulation in the feedback loops does not represent the first few reflection orders. However, such effects are typically already incorporated in the regular reverb pre-mixing and its coloration filters.
[0212] The separate input signal may also be obtained from a single tapped delay line. Typically, a parametric reverberator used in combination with direct path rendering and early reflections rendering includes a pre-delay for its normal operation, controlling where the reverb starts in relation to the direct path and early reflections. If this pre-delay is long enough, the delay buffer could be used as the tapped delay line. In this case, the flutter echo would start earlier, but this could be compensated in the early reflections modelling.
[0213] As another example, a set of feedback loops comprising two interacting loops, causing the signal to swap loops on every iteration, may be used. Using the following feedback matrix would achieve this in the first two loops.
[0214] The delays in this embodiment could be set for an arbitrary listener position to create a regular but non-symmetric pattern that is more in line with realistic scenarios. Alternatively, the delays could be adjusted according to the user's position between the walls. For example, if the user is 30% of the wall-to-wall distance from wall 1101, a first delay could be
and a second delay could be
[0215] Similar to the previous embodiment, a pre-delay structure can be used to create the missing offset due to the signal bouncing in two directions. This could be done with two delays corresponding with the first two paths (Delay 1 and Delay 2 in the above).
[0216] A particular advantage of this approach is that the two loop filters can simulate each wall separately, i.e. a first filter related to the first delay $\tau_{r2}$ would have a frequency response:
[0217] Similarly, a second filter associated to the second delay $\tau_{r2}$ would have a frequency response:
where M
[0218] A possibility with such embodiments is that, when the flutter loops are excluded from the regular extraction matrix to generate the diffuse reverb tail, they can be extracted to separate outputs for rendering with dedicated HRTF pairs.
[0219] In some embodiments, the signal generator 403 comprises a gain for the audio source signal prior to being fed to the feedback loop(s) of the feedback delay network and the adapter 407 is arranged to adapt the gain in response to a position for an audio source for the audio source signal. This may specifically, but not necessarily, be combined with the pre-delays previously described, and specifically each delay of the circuits shown in
[0220] The received data may include the audio signal representing the audio source as well as a position of the audio source and this position may be used to adapt the gain. Specifically, the gain may be adapted based on the position of the audio source relative to a wall/boundary/side of the room. Typically, the gain may be adapted based on a distance from the audio source to a wall (typically a nearest wall) being a reflecting wall for the flutter echo. The pre-gain may be used to adapt the relative strength/level of the overall flutter echo effect, and may specifically be used to adapt the level to reflect the strength of the signal when first being reflected.
[0221] In some embodiments the pre-gain may be adapted based on a distance of a listener/user. Specifically, it may be adapted based on the relative distance from the listener to the source, or on the distance from the source to the listener via at least one reflection on a reflecting wall for the flutter echo.
[0222] Further, in many embodiments, the first reflections may be represented by the early reflection simulations and the flutter echo signal generator 403 may only be used to represent further reflections of the flutter echoes. For example, the flutter echo signal generator 403 may be used to generate flutter echo components corresponding to the fourth or later reflections. In such cases, the sound being reflected has already been attenuated by the previous reflections including both the distance attenuation and reflection attenuation. Such effects may alternatively or additionally be represented by the pre-gain.
[0223] In some embodiments the adapter 407 may be arranged to adapt the gain in response to a distance between two walls/sides/boundaries of the room (and specifically walls/boundaries/sides that cause the flutter echo). In some embodiments the adapter 407 may be arranged to adapt the gain in response to an acoustic reflection attenuation for at least one wall/side/boundary of the room (and specifically a wall/boundary/side that causes the flutter echo). In some embodiments, the adapter 407 may be arranged to adapt the gain in response to a number of initial flutter echo reflections not emulated by the set of feedback loops of the feedback delay network allocated to flutter echo simulation.
[0224] Specifically, the (distance) gain component of the loop-filter may represent attenuation with respect to the adjustment in a previous loop pass (reflection), and the pre-gain may be used to adapt the input signal level, i.e. the level at the onset of reflections that are being simulated.
[0225] Signals may often be represented at a level corresponding to a certain reference distance. Before inserting the signal into the loop(s), a compensation/pre gain may specifically be employed to match the signal's level to the distance it has already travelled, i.e. to represent the initial distance gain. For example, the feedback delay network-based simulation may be configured to represent the flutter echo from its 4.sup.th order (because the first three are represented by early reflections modelling by another algorithm). In this particular example, with reference to
where the two occurrences of the number 3 represent the three previous iterations,
[0226] A feedback loop may have an overall loop gain set to reflect attenuation of a reflection path (which dependent on the specific approach may include one or more reflections). The loop gain may be set by the loop filter and/or the feedback factor (the feedback matrix). In the described examples, the feedback factor for a loop to itself is set to one and the loop gain (less than 1) is determined by the loop filter. The loop gain/attenuation are typically frequency dependent, and the frequency dependency is typically implemented by the use of a suitable loop filter.
[0227] Different approaches for determining a loop gain/attenuation G.sub.d may be used in different embodiments.
[0228] Typically, the loop filter(s) include two main components: material properties (for example a reflection coefficient) and a distance-related gain. Each loop filter may represent one or more reflection coefficients corresponding to reflections on one or two walls and a distance gain corresponding to a travelled distance consistent with the reflections represented by the average reflection coefficients.
[0229] Because the loop-related distance with respect to the reference distance keeps increasing, the required distance attenuation component should become less strong with every iteration. E.g.:
where x becomes larger with every iteration.
[0230] This means that consecutive reflections decay faster than exponentially, and that this may sometimes not be accurately simulated with a single feedback loop. The filters in the feedback loops may be constant due to the recursive character. Any processed sample may include components of many different iterations.
[0231] Isolating the energy dispersion component (the most significant component), distance attenuation (for signal amplitudes) is:
Say that every iteration corresponds with a travelled distance D. With every iteration, the distance d increases with this fixed distance D, which makes the additional attenuation with respect to the previous iteration be:
[0232] The problem is that d is increasing every iteration, and a fixed gain per iteration is needed. Representing the distance differently, where d is a multiple of D, we get a similar result that shows a further simplification:
[0233] It can be seen that the effect of distance attenuation in each iteration is not so much dependent on the actual distance travelled, but the increase relative to what has already been travelled (in line with the rule-of-thumb that attenuation is 6 dB for every doubling of the distance).
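The equations of paragraphs [0231] and [0232] are not reproduced in this text. The following is a reconstruction sketch consistent with the surrounding prose, assuming amplitude attenuation inversely proportional to the travelled distance; the symbols g and n are introduced here for illustration only.

```latex
g(d) \propto \frac{1}{d},
\qquad
\frac{g(d + D)}{g(d)} = \frac{d}{d + D},
\qquad
d = nD \;\Rightarrow\; \frac{g\big((n+1)D\big)}{g(nD)} = \frac{n}{n + 1}
```

Under this assumption the additional attenuation depends only on n, the number of loop lengths already travelled. For n = 1 the ratio is 1/2 (about 6 dB, one doubling of the distance), whereas for n = 9 it is 0.9 (about 0.9 dB).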
[0234] The distance gain in the first iterations has quite a big impact, since the distance corresponding with one iteration is relatively small compared to the overall travelled distance. Quickly, the dynamic effect of the distance attenuation in each iteration reduces (i.e. changes less between iterations). As a result, the decay approaches an exponential shape.
[0235] With the distance gain approaching 1, the per-iteration gain stabilizes towards the average reflection coefficient of the flutter boundaries' material properties. Simulating the flutter echo, different embodiments may choose different approaches. For example, the average reflection coefficient may be chosen to simulate the decay at higher orders. Alternatively, a steeper decay may be used to simulate the decay at lower orders. Or a value in between may be beneficial in most implementations so as to not have a decay that is too steep or too shallow. Accurate simulation of the slope at high orders may in many cases be unnecessary because it will be inaudible to the listener. A good trade-off may be made by choosing the slope corresponding with, for example, the 5.sup.th iteration.
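A minimal sketch of this trade-off, assuming 1/d amplitude spreading so that the per-iteration distance gain is n/(n+1), and assuming the fixed loop gain is the product of that distance gain and the average reflection coefficient. The function names and the indexing of iterations are assumptions of the sketch, not taken from the description.

```python
import math


def distance_gain(iteration):
    """Extra amplitude attenuation of one pass relative to the previous pass.

    Assumes 1/d spreading and that `iteration` loop lengths have already
    been travelled before this pass, which gives n / (n + 1).
    """
    if iteration < 1:
        raise ValueError("iteration must be >= 1")
    return iteration / (iteration + 1)


def fixed_loop_gain(avg_reflection_coeff, reference_iteration=5):
    """Constant loop gain that matches the ideal slope at the reference iteration."""
    return avg_reflection_coeff * distance_gain(reference_iteration)


def to_db(gain):
    """Convert an amplitude gain to decibels."""
    return 20.0 * math.log10(gain)
```

With these assumptions, `distance_gain` approaches 1 for large iteration counts, so `fixed_loop_gain` tends towards the average reflection coefficient, in line with the behaviour described above.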
[0236] As discussed above, many embodiments may adjust the input level of the signals that are inserted into the flutter loop. In addition to the compensation for reference distance, and reflection orders simulated differently, the input gain may be beneficially adjusted to the trade-off chosen for the attenuation gain G.sub.d. Choosing a relatively slow decay may cause the flutter echo to be too pronounced, whereas with a relatively steep decay the flutter may become inaudible where an accurately simulated flutter echo would still be audible.
[0237] With a relatively slow decay configured, the initial level may therefore be further lowered to avoid it being too pronounced. The additional attenuation essentially compensates for the faster decay at early iterations that are not accurately modelled in the recursive process. As a result, the stronger first reflections may not be accurately modeled. In many cases, these would have been (largely) masked by the reverberation anyway.
[0238] As an example, based on the model from
[0239] The initial input gain may be configured to represent the first reflection order:
[0240] The compensation for the missed attenuation in the first I=9 iterations may be included according to:
where
represents the cumulative effect of the attenuations in the first I=9 iterations according to ideal modelling, excluding the material properties, whereas
represents the attenuations that will actually be applied using the slope corresponding to I+1=10, excluding the material properties. The material properties can be excluded because they are present equally in both elements of the fraction.
[0241] This compensation ensures a match of both slope and level at the 10.sup.th iteration. It can be stored for different decay reference iterations J in a look-up table.
[0242] In some cases, the level may not need to be matched to the same iteration and a trade-off may be used:
where ? is the trade-off parameter with a value between 0 and 1. A value of 1 means a compensation as described above and a value of 0 results in no compensation.
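The compensation described in paragraphs [0240] to [0242] can be sketched as follows. The original expressions are not reproduced in this text, so the sketch assumes a per-iteration distance gain of n/(n+1) and, for the trade-off, a power-law blend; both are assumptions, as are the function and parameter names.

```python
import math


def distance_gain(iteration):
    """Per-pass distance gain n / (n + 1), assuming 1/d amplitude spreading."""
    return iteration / (iteration + 1)


def input_compensation(skipped_iterations, tradeoff=1.0):
    """Input-gain correction when the first I passes use a fixed slope.

    The numerator is the ideal cumulative distance attenuation over the
    first I passes. The denominator is what a constant loop gain taken at
    pass I + 1 applies over the same I passes. Material properties appear
    equally in both and are therefore left out.
    `tradeoff` in [0, 1] selects between no compensation (0) and full
    compensation (1); the power-law blend is an assumption of this sketch.
    """
    if not 0.0 <= tradeoff <= 1.0:
        raise ValueError("tradeoff must lie between 0 and 1")
    ideal = math.prod(
        distance_gain(i) for i in range(1, skipped_iterations + 1)
    )
    applied = distance_gain(skipped_iterations + 1) ** skipped_iterations
    return (ideal / applied) ** tradeoff
```

For I=9 and full compensation this sketch yields a factor of roughly 0.24, i.e. the input level is lowered, which is consistent with the statement that a slowly decaying loop needs a reduced initial level. Values for different reference iterations could be precomputed into a look-up table.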
[0243] In some applications, both the low order reflections with higher levels as well as the lower levels for medium- and higher order reflections are preferred to be simulated. This could be possible in rooms with relatively low diffuse reverb energy (e.g. highly absorbent boundaries, except those involved in the flutter echo). Such applications can employ an embodiment where two or more loops simulate different decay rates with the same delays.
[0244] If the delay on the input signals and inside the flutter feedback loops are equal, the reflections are created at the same time lags. A first flutter loop may be configured with a steep decay and a relatively large input gain, while a second flutter loop may be configured with a slow decay and a relatively small input gain. When the two are combined by the output circuit, the joint effect may more closely resemble an accurate simulation with iteration-dependent loop gains.
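The combination of a steep and a slow flutter loop with equal delays can be illustrated by a minimal sketch. All gains, the delay and the idealised impulse response (one echo per loop pass, starting at lag zero) are hypothetical values chosen for illustration.

```python
def loop_impulse_response(input_gain, loop_gain, delay_samples, num_echoes):
    """Idealised impulse response of one feedback loop.

    Echo n appears at lag n * delay_samples with amplitude
    input_gain * loop_gain ** n; all other samples are zero.
    """
    response = [0.0] * (delay_samples * num_echoes + 1)
    for n in range(num_echoes + 1):
        response[n * delay_samples] = input_gain * loop_gain ** n
    return response


def combine(*responses):
    """Sum equal-length loop outputs, as a simple output circuit might."""
    return [sum(samples) for samples in zip(*responses)]


# Hypothetical parameters: a steep, loud loop plus a slow, quiet loop.
steep = loop_impulse_response(
    input_gain=0.8, loop_gain=0.5, delay_samples=4, num_echoes=10
)
slow = loop_impulse_response(
    input_gain=0.2, loop_gain=0.9, delay_samples=4, num_echoes=10
)
combined = combine(steep, slow)
```

In the combined response the echo-to-echo ratio starts well below the slow loop gain and rises towards it at later echoes, which mimics an iteration-dependent loop gain without changing the echo timing.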
[0245] In some embodiments, the set of feedback loops allocated to generate the flutter echo may accordingly comprise at least two feedback loops that have different loop gains. However, the at least two feedback loops may have the same delay.
[0246] The above embodiments configure the loop filters according to the material properties of the walls between which the flutter echoes occur. These filters can be extended to include the effects of the shallow reflections on the long room boundaries.
[0247] The material properties of the boundaries in the flutter dimension (the short boundaries) do not have an impact on the energy ratio of the first reflection with respect to the overall compound reflection. However, they do affect how fast consecutive compound reflections decay.
[0248] Conversely, the material properties of the long boundaries (i.e. not along the flutter dimension) determine how quickly each individual compound reflection decays, and hence the energy ratio between the first reflection and the overall compound reflection. The decay of the first response amplitudes in consecutive compound reflections is not affected by these material properties.
[0249] As described, within the RIR, these responses will change with the order of the flutter echo they contribute to, compressing the individual reflections in time. However, essentially there are additional contributions with one, two or more additional material properties. The main effect is that this increases the energy in the individual flutter echo and its coloration. The coloration is affected by adding contributions with additional frequency dependent material properties, but in theory also due to the delayed reflections causing comb-filter effects. However, due to the many repetitions at different delays, the comb-filter effect is not substantial.
[0250] The compound reflections may be modelled by a single reflection. The loop filter H.sub.? can be set to represent a single pulse with the spectral response matching that of a compound reflection.
[0251] The net effect of the compound reflection also constitutes a larger energy than a single reflection, and this total energy should also be represented in the single reflection. Due to the delays, the amplitudes of the individual responses typically do not add up coherently.
[0252] The energy of the compound reflection can be approximated by:
where K may be infinity or limited to a certain duration after the direct response, e.g. 50 ms. Typically, higher K's don't contribute significantly due to the exponential behavior.
[0253] The above equation for E.sub.c ignores the distance attenuation. This contribution is relatively low. It also makes the energy ratio between initial amplitude and compound reflection energy independent of the flutter order.
[0254] In an alternative embodiment, a separate loop with a very short delay can simulate the tails to the main flutter response. This loop is only fed by the main flutter loop(s) and has no direct signal input (b.sub.i=0), but does feed back into itself. The short delay could be dependent on the shortest dimension of the room. The attenuation by the filter would be the average reflection coefficient of the long room boundaries (e.g.
[0255] Another alternative is to use a sparse IIR as the loop filter in the flutter loop that simulates the fast decaying response of the compound reflection.
[0256] In many embodiments, the audio apparatus may be arranged to feed a plurality of audio source signals to the feedback delay network, and specifically may be arranged to feed a plurality of audio source signals to the set of feedback loops that generate the flutter echo audio signal. The audio apparatus may for example receive audio source signals for a plurality of audio sources in the room and a plurality (and possibly all) of these signals may be fed to the set of feedback loops generating the flutter echo audio signal. The plurality of signals may for example be combined into a combined signal, which may then be fed to the set of feedback loops. Each signal may be subjected to a delay and/or gain adjustment prior to being combined with other signals. The gain and/or delay for each signal may for example be adapted to reflect an initial and/or relative signal level and/or arrival time for the individual signal (relative to other signals). In some embodiments, e.g. the gain and/or delay may be common for some or possibly all source signals fed to the set of feedback loops.
[0257] The previously described embodiments may allow accurate simulation of the offsets between individual reflections. This may provide a particularly realistic rendering. The described approaches have focused on generating flutter echo for a single source and the loop parameter properties etc. may depend on specific characteristics of the source, such as the position. However, there is often more than one source in a simulated room that generates a flutter echo. In such cases, each source may be simulated by its own, dedicated feedback loop(s) etc. These could be implemented with separate parallel paths to e.g. the pre-mixing and pre-delay prior to the feedback delay network.
[0258] In many applications, such a level of accuracy is however not required. The parameters may be set to suitable values (e.g. arbitrarily or artistically chosen values). In some embodiments they may be chosen equally for all simulated sources. For example, the approach illustrated in
[0259] The input to the feedback loops of the feedback delay network may be mathematically expressed as:
where x is an audio source signal (a mono signal) and X.sub.L the output signal vector corresponding to the P signals to be injected into the P feedback loops.
[0260] Some embodiments require or benefit from separate inputs to the feedback loops. This can be achieved by extending the input gain vector b to a matrix B that takes into account more than one signal and maps it to the different loops.
[0261] For example, the inputs provided to a feedback delay network with five feedback loops (P=5), could be processed by an input matrix B:
where x.sub.1 is the first input signal, x.sub.2 the second, and the first feedback loop is a feedback loop used for flutter echo generation and with the remaining four feedback loops being used to generate diffuse reverberation.
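A minimal sketch of such an input matrix for P=5 loops and two input signals. The matrix entries are not reproduced in this text, so the values below are purely hypothetical: the first loop (flutter) is fed only by x.sub.1 and the four diffuse loops only by x.sub.2.

```python
def apply_input_matrix(input_matrix, source_samples):
    """Map one sample per source signal onto one sample per feedback loop."""
    return [
        sum(gain * sample for gain, sample in zip(row, source_samples))
        for row in input_matrix
    ]


# Hypothetical 5x2 matrix: rows are feedback loops, columns are input signals.
# Row 0 is the flutter loop, fed only by x1; rows 1 to 4 are diffuse loops
# fed only by x2.
B = [
    [1.0, 0.0],
    [0.0, 1.0],
    [0.0, 1.0],
    [0.0, 1.0],
    [0.0, 1.0],
]

loop_inputs = apply_input_matrix(B, [0.5, -0.25])
```

Replacing a 1.0 by a smaller value, such as the factor 0.9 mentioned for element b.sub.11, would fold a distance attenuation into the same mapping.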
[0262] Alternatively, in line with
where the factor 0.9 in element b.sub.11 represents an attenuation corresponding, for example, to distance attenuation associated with Delay 1.
[0263] The delays create the different offsets for the P different paths from the source to the listener. Typically, P=4 per flutter dimension in a shoebox-shaped room. The delays can be chosen to represent the relative offsets to the smallest offset, where the common offset is disregarded. In other embodiments all delays may be set to the absolute offset, potentially dynamically adjusting to the listener position.
[0264] The delays may also be adjusted commonly to achieve an additional common delay component for the flutter echo. Such a common delay component may be useful to control the offset of the flutter echo simulated by the parametric reverberator with respect to early reflections simulated by other means, for example in order to ensure an appropriate latency between the last early reflection associated with the flutter dimension and the first simulated flutter echo response from the feedback delay network.
[0265] In some cases, it may be advantageous to start flutter echoes earlier than the diffuse late reverberation part. In these embodiments, the inputs to the flutter loops may bypass the pre-delay and only pass through the dedicated flutter delays that control the start of the flutter echo simulation in relation to the source's emission. For example, the separately generated early reflections may exclude all reflections related to the flutter dimension and instead simulate these with the feedback delay network only.
[0266] In other embodiments, early reflection signals may be generated and fed into the flutter echo feedback loops of the feedback delay network. The early reflection signals may only include the reflections in the flutter dimension.
[0267] In some embodiments, the audio apparatus may be arranged such that at least one audio source signal is fed only to feedback loops that are used for flutter echo audio signal generation.
[0268] In some embodiments, the audio apparatus may comprise a spatial processor, which is arranged to apply a spatial processing to the flutter echo signal where the spatial processing is dependent on a position of the source of the audio source signal and/or a side of the room.
[0269] The spatial processing may be a processing that may modify or create a spatial cue for the flutter echo audio signal. In particular, the spatial processor may be arranged to perform a binaural processing of the flutter echo audio signal as e.g. illustrated in
[0270] In some embodiments, the spatially processed flutter echo audio signal may be combined with other generated audio components and it may specifically be combined with the diffuse reverberation generated by other feedback loops of the feedback delay network. However, this diffuse reverberation may not be subjected to the spatial processing as it is generally a distributed sound.
[0271] Thus, in some embodiments, the audio apparatus comprises a combiner for combining a spatially processed flutter echo audio signal with a (non-spatially processed) diffuse reverberation signal. In the example of
[0272] In many embodiments, the feedback delay network may generate the flutter echo audio signal by combining the output signals of the feedback loops that are used for generating the flutter echo audio signal. Similarly, the diffuse reverberation signal may be generated by combining output signals of the feedback loops that are used for generating the reverberation.
[0273] Typically, the feedback loops of the feedback delay network are used either for reverberation generation or for flutter echo generation.
[0274] In most embodiments, the adapter 407 may be arranged to assign a set of feedback loops to flutter echo audio signal generation with the remaining feedback loops being used for reverberation generation. In such cases, the adapter 407 may typically be arranged to keep the loops separate.
[0275] Specifically, the adapter 407 may adapt feedback factors for the feedback loops such that there is no feedback from a feedback loop of the set of feedback loops used to generate the flutter echo audio signal to any other feedback loop, and vice versa. Specifically, it may set all feedback coefficients of the feedback matrix relating to a feedback between two loops belonging to the two different sets to zero.
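The separation of the two sets of loops can be sketched as a masking of the feedback matrix. The function name and the list-of-lists matrix representation are assumptions made for illustration.

```python
def decouple_flutter_loops(feedback_matrix, flutter_loops):
    """Return a copy with all coupling between flutter and other loops zeroed.

    An entry is kept only when its row and column belong to the same set,
    i.e. both are flutter loops or both are reverberation loops.
    """
    flutter = set(flutter_loops)
    size = len(feedback_matrix)
    return [
        [
            feedback_matrix[row][col]
            if (row in flutter) == (col in flutter)
            else 0.0
            for col in range(size)
        ]
        for row in range(size)
    ]


# Hypothetical 3x3 feedback matrix in which loop 0 is the flutter loop.
A = [
    [1.0, 0.2, 0.3],
    [0.4, 0.5, 0.6],
    [0.7, 0.8, 0.9],
]
A_separated = decouple_flutter_loops(A, flutter_loops=[0])
```

The result is block-diagonal with respect to the two sets, so energy circulating in a flutter loop never leaks into the reverberation loops, and vice versa.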
[0276] Similarly, when generating the output signals, the flutter echo audio signal may be generated by a combination of output signals of only the feedback loops of the set of feedback loops that are used for generation of the flutter echo audio signal and the reverberation signal may be generated by a combination of output signals of only the feedback loops of the set of feedback loops that are not used for generation of the flutter echo audio signal.
[0277] In many embodiments, the output signals of flutter feedback loops may be processed in the same way as the other feedback loops by generating output signals using a weighted combination that may be represented by an extraction matrix C. This may e.g. include applying correlation and/or coloration filters as known from generation of diffuse reverberation. The resulting flutter echoes will in this case not originate from a specific direction.
[0278] However, in embodiments where the flutter echo(es) is(are) desired to be directional, the flutter echo feedback loop output signal(s) may be extracted separately for alternative processing (as in the example of
[0279] The first and second output signals generated by the extraction matrix can be processed normally by the rest of the parametric reverberator functionality. The third and fourth output signals could be processed separately, for example with different HRTF pairs corresponding to the opposing directions of both walls. These may be adaptive depending on the user's orientation.
[0280] This may be particularly advantageous in relation to embodiments where each wall is simulated in a separate loop. The first loop simulates wall 1105, and the second loop simulates wall 1107. The HRTF pair for the third output signal may correspond with the direction of wall 1105, respective to the listener, and similarly for the fourth output signal the HRTF pair may correspond with the direction of wall 1107.
[0281] For example, in
[0282] When soft decisions have been made on whether to generate a flutter echo audio signal, the rendering of flutter echoes may be advantageously adapted to the soft decision. For example, if the soft decision results in a flutter echo estimate that includes (or consists in) a confidence value ? between 0 and 1, this may control the rendering between no flutter echo effect at confidence 0 and full flutter echo effect at confidence 1.
[0283] In a particularly simple implementation, the extraction matrix elements associated with the flutter echo are multiplied by the confidence value. As a consequence, the flutter echo level will be lower if the confidence is lower. The confidence value may also be modified, for example, to achieve a non-linear behavior with respect to the confidence. E.g.
[0284] Similarly, the confidence value can be used to modify the corresponding elements in the feedback matrix. This has the effect that the flutter echo dies out more quickly because the additional attenuation will be applied at every iteration. The confidence value may also be modified, for example, to achieve a non-linear behavior with respect to the confidence. E.g.:
[0285] In other embodiments the parametric reverberator may cross-fade between the diffuse and flutter echo schemes described above and a normal diffuse reverberator. A simple implementation of this may cross-fade the feedback matrices for the two schemes controlled by the confidence value.
[0286] As an effect, there is some bleeding of the diffuse reverb generation into the flutter echo generation and vice versa. This makes the flutter echo more diffuse as the confidence value decreases.
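The confidence-controlled rendering of paragraphs [0283] and [0285] can be sketched as follows. The sketch assumes an extraction matrix whose columns correspond to feedback loops and a linear cross-fade between the two feedback matrices; both conventions, and all names, are assumptions.

```python
def scale_flutter_extraction(extraction_matrix, flutter_loops, confidence):
    """Scale the extraction-matrix columns of the flutter loops by the confidence."""
    flutter = set(flutter_loops)
    return [
        [
            value * confidence if col in flutter else value
            for col, value in enumerate(row)
        ]
        for row in extraction_matrix
    ]


def crossfade_feedback(flutter_scheme, diffuse_scheme, confidence):
    """Linear cross-fade of two feedback matrices of equal size.

    Confidence 1 returns the flutter scheme, confidence 0 the normal
    diffuse scheme; values in between mix the two element by element.
    """
    return [
        [
            confidence * f + (1.0 - confidence) * d
            for f, d in zip(f_row, d_row)
        ]
        for f_row, d_row in zip(flutter_scheme, diffuse_scheme)
    ]


# Hypothetical 2x2 example: decoupled loops versus a fully mixing matrix.
flutter_matrix = [[1.0, 0.0], [0.0, 1.0]]
diffuse_matrix = [[0.5, 0.5], [0.5, -0.5]]
halfway = crossfade_feedback(flutter_matrix, diffuse_matrix, 0.5)
```

At intermediate confidence the off-diagonal entries become non-zero, which is the bleeding between flutter and diffuse loops mentioned above.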
[0287] Other such embodiments may additionally cross-fade other aspects of the feedback loops. This may only affect the flutter loops. Delays may be modified and/or the loop filter target spectra may be cross-faded.
[0288] It should be noted that multiple flutter echo instances may occur in a room, with different reflection rates. In some cases, there may be multiple dimensions in which there are strong reflections. In oddly shaped rooms there may be staggered surfaces in the flutter direction.
[0289] In such cases, the additional flutter echo instances may be treated as described above, using additional feedback loops. Thus, the described approach may be copied for multiple flutter echo audio signal generations. If too many feedback loops are needed for flutter echo simulation, it may be beneficial to increase the number of feedback loops in the feedback delay network structure. Typically, if the number of loops for the reverberation processing is less than eight, quality may suffer.
[0290] It will be appreciated that the above description for clarity has described embodiments of the invention with reference to different functional circuits, units and processors. However, it will be apparent that any suitable distribution of functionality between different functional circuits, units or processors may be used without detracting from the invention. For example, functionality illustrated to be performed by separate processors or controllers may be performed by the same processor or controllers. Hence, references to specific functional units or circuits are only to be seen as references to suitable means for providing the described functionality rather than indicative of a strict logical or physical structure or organization.
[0291] The invention can be implemented in any suitable form including hardware, software, firmware or any combination of these. The invention may optionally be implemented at least partly as computer software running on one or more data processors and/or digital signal processors. The elements and components of an embodiment of the invention may be physically, functionally and logically implemented in any suitable way. Indeed, the functionality may be implemented in a single unit, in a plurality of units or as part of other functional units. As such, the invention may be implemented in a single unit or may be physically and functionally distributed between different units, circuits and processors.
[0292] Although the present invention has been described in connection with some embodiments, it is not intended to be limited to the specific form set forth herein. Rather, the scope of the present invention is limited only by the accompanying claims. Additionally, although a feature may appear to be described in connection with particular embodiments, one skilled in the art would recognize that various features of the described embodiments may be combined in accordance with the invention. In the claims, the term comprising does not exclude the presence of other elements or steps.
[0293] Furthermore, although individually listed, a plurality of means, elements, circuits or method steps may be implemented by e.g. a single circuit, unit or processor. Additionally, although individual features may be included in different claims, these may possibly be advantageously combined, and the inclusion in different claims does not imply that a combination of features is not feasible and/or advantageous. Also, the inclusion of a feature in one category of claims does not imply a limitation to this category but rather indicates that the feature is equally applicable to other claim categories as appropriate.
[0294] Furthermore, the order of features in the claims does not imply any specific order in which the features must be worked and in particular the order of individual steps in a method claim does not imply that the steps must be performed in this order. Rather, the steps may be performed in any suitable order. In addition, singular references do not exclude a plurality. Thus references to a, an, first, second etc. do not preclude a plurality. Reference signs in the claims are provided merely as a clarifying example and shall not be construed as limiting the scope of the claims in any way.