Audio reproduction system and method for reproducing audio data of at least one audio object
09807533 · 2017-10-31
Cpc classification
H04S2420/01
ELECTRICITY
H04S2420/13
ELECTRICITY
H04S5/005
ELECTRICITY
H04S2400/11
ELECTRICITY
H04S7/30
ELECTRICITY
H04S2420/11
ELECTRICITY
International classification
H04S3/00
ELECTRICITY
H04S5/00
ELECTRICITY
Abstract
An audio reproduction system for reproducing audio data of at least one audio object and/or at least one sound source of an acoustic scene in a given environment comprising: at least two audio systems acting distantly apart from each other, wherein one of the audio systems is adapted to reproduce the audio object and/or the sound source in a first distance range to a listener and another of the audio systems is adapted to reproduce the audio object and/or the sound source in a second distance range to the listener, wherein the first and second distance ranges are different and possibly spaced apart from each other or placed adjacent to each other; and a panning information provider adapted to process at least one input to generate at least one panning information for each audio system to drive the at least two audio systems.
Claims
1. A method for reproducing audio data of at least one audio object and/or at least one sound source of an acoustic scene in a given environment by at least two audio systems acting distantly apart from each other comprising the following steps: reproducing by one of the audio systems audio signals corresponding to the audio object and/or the sound source arranged in at least one first distance range to a listener and reproducing by another of the audio systems audio signals corresponding to the audio object and/or the sound source arranged in at least one second distant range to the listener, wherein the first and second distant ranges are different and possibly spaced apart from each other or placed adjacent to each other; and processing by a panning information provider of at least one input to generate at least one panning information for each audio system to differently drive the at least two audio systems, in particular to differently generate audio signals, wherein as an input a position data of the position of the audio object and/or of the sound source in the acoustic scene are provided; and wherein as the panning information at least one parameter, in particular a signal intensity and/or an angular position for the same audio object and/or the same sound source are generated for each audio system to differently drive the at least two audio systems, in particular to differently generate audio signals in such a manner that the same audio object and/or the same sound source is panned within at least one distance range and/or between two of the distance ranges of the audio systems; and wherein the panning information is determined by at least one given distance effect function which represents the distance effect functions of the respective audio object and/or the respective sound source in a transfer range between the at least two distance ranges of the audio systems and/or within one of the distance ranges.
2. The method according to claim 1, wherein as another input at least a metadata of the acoustic scene, of the environment, the audio object, the sound source and/or an effect slider are provided.
3. The method according to claim 2, wherein at least one parameter of the panning information, in particular the signal intensity and/or an angular position of the same audio object and/or the same sound source for the at least two audio systems, are extracted from the metadata and/or the configuration settings of the audio systems and/or the audio data.
4. The method according to claim 2, wherein the panning information are extracted from the metadata of the respective audio object and/or a time and/or a spot in the environment, in particular in a game scenario or in a room.
5. The method for reproducing audio data according to claim 1, wherein the audio data corresponds to interactive gaming scenarios, software scenarios, theatre scenarios, music scenarios, concert scenarios, movie scenarios, process monitoring scenarios and/or manufacturing scenarios.
6. The method according to claim 1, wherein the angular position of the same audio object and/or the same sound source for the at least two audio systems are equal.
7. A method for reproducing audio data of at least one audio object and/or at least one sound source of an acoustic scene in a given environment by at least two audio systems acting distantly apart from each other comprising the following steps: reproducing by one of the audio systems audio signals corresponding to the audio object and/or the sound source arranged in at least one first distance range to a listener and reproducing by another of the audio systems audio signals corresponding to the audio object and/or the sound source arranged in at least one second distant range to the listener, wherein the first and second distant ranges are different and possibly spaced apart from each other or placed adjacent to each other; and processing by a panning information provider of at least one input to generate at least one panning information for each audio system to differently drive the at least two audio systems, in particular to differently generate audio signals, wherein as an input a position data of the position of the audio object and/or of the sound source in the acoustic scene are provided; wherein as the panning information at least one parameter, in particular a signal intensity and/or an angular position for the same audio object and/or the same sound source are generated for each audio system to differently drive the at least two audio systems, in particular to differently generate audio signals in such a manner that the same audio object and/or the same sound source is panned within at least one distance range and/or between two of the distance ranges of the audio systems; wherein as another input at least a metadata of the acoustic scene, of the environment, the audio object, the sound source and/or an effect slider are provided and wherein number and/or dimensions of the distant ranges and/or transfer ranges are extracted from the configuration settings, distance ranges definitions and/or from the metadata.
8. The method for reproducing audio data according to claim 7, wherein the audio data corresponds to interactive gaming scenarios, software scenarios, theatre scenarios, music scenarios, concert scenarios, movie scenarios, process monitoring scenarios and/or manufacturing scenarios.
9. A non-transitory computer-readable recording medium having a computer program for executing the method according to claim 7.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
(1) The present invention will become more fully understood from the detailed description given hereinbelow and the accompanying drawings, which are given by way of illustration only and thus are not limitative of the present invention:
(2)
(3)
(4)
(5)
(6)
(7)
(8)
(9)
(10)
(11)
(12) Corresponding parts are marked with the same reference symbols in all figures.
DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS
(13)
(14) The environment 1 may be a real or virtual space, e.g. a living room or a space in a game or in a movie or in a software scenario or in a plant or facility. The acoustic scene 2 may be a real or virtual scene, e.g. an audio object Ox, a sound source Sy, a game scene, a movie scene, a technical process, in the environment 1.
(15) The acoustic scene 2 comprises at least one audio object Ox, e.g., voices of persons, wind, noises of audio objects, generated in the virtual environment 1. Additionally or alternatively, the acoustic scene 2 comprises at least one sound source Sy, e.g. loudspeakers, generated in the environment 1. In other words: the acoustic scene 2 is created by the audio reproduction of the at least one audio object Ox and/or the sound source Sy in the respective audio ranges C0 to C1 and D1 to D2 in the environment 1.
(16) Depending on the kind and/or the number of available audio systems 3.1 to 3.4, at least one audio system 3.1 to 3.4 is assigned to one of the distance ranges C0 to C1 and D1 to D2 to create sound effects in the respective distance ranges C0 to C1 and D1 to D2, in particular to reproduce the at least one audio object Ox and/or the sound source Sy in at least one of the distance ranges C0 to C1, D1 to D2.
(17) For instance, a first audio system 3.1 is assigned to a first close range C0, a second audio system 3.2 is assigned to a second close range C1, a third audio system 3.3 is assigned to a first distant range D1 and a fourth audio system 3.4 is assigned to a second distant range D2 wherein all ranges C0, C1, D1 and D2 are placed adjacent to each other.
(18)
(19) The audio systems 3.1 to 3.4 are designed as audio systems which create sound effects of an audio object Ox and/or a sound source Sy in close as well as in distant ranges C0 to C1, D1 to D2 of the environment 1 of the listener L. The audio systems 3.1 to 3.4 may be a virtual or real surround system, a headphone assembly, a proximity audio system, e.g. sound bars.
(20) The panning information provider 4 processes at least one input IP1 to IP4 to generate at least one parameter of at least one panning information PI, PI(3.1) to PI(3.4) for each audio system 3.1 to 3.4 to differently drive the audio systems 3.1 to 3.4. One possible parameter of panning information PI is an angular position α of the audio object Ox and/or the sound source Sy. Another parameter of panning information PI is an intensity I of the audio object Ox and/or the sound source Sy.
(21) In a simple embodiment, the audio reproduction system 3 comprises only two audio systems 3.1 and 3.2 which are adapted to commonly interact to create the acoustic scene 2.
(22) As an input IP1, position data P(Ox), P(Sy) of the position of the audio object Ox and/or of the sound source Sy, e.g. their distance and angular position relative to the listener L in the environment 1, are provided.
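The position data described above (distance and angular position relative to the listener L) could be derived from scene coordinates roughly as follows. This is a minimal sketch under stated assumptions: the source does not define a coordinate system, so the 2D Cartesian coordinates, the angle convention (degrees, counter-clockwise from the positive x axis) and the function name `relative_position` are all hypothetical.

```python
import math


def relative_position(obj_xy, listener_xy=(0.0, 0.0)):
    """Return (r, alpha) of an object relative to the listener.

    Assumptions (not from the source): 2D Cartesian coordinates in metres,
    alpha in degrees in [0, 360), measured counter-clockwise from the x axis.
    """
    dx = obj_xy[0] - listener_xy[0]
    dy = obj_xy[1] - listener_xy[1]
    r = math.hypot(dx, dy)                              # distance r to the listener
    alpha = math.degrees(math.atan2(dy, dx)) % 360.0    # angular position
    return r, alpha
```

For an object at (3, 4) with the listener at the origin, the distance r is 5 m; the pair (r, alpha) is then the kind of input IP1 the panning information provider 4 would receive.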
(23) Additionally, as another input IP2, basic metadata, in particular metadata MD(1, 2, Ox, Sy, ES) of the acoustic scene 2, the environment 1, the audio object Ox, the sound source Sy and/or the effect slider ES are provided.
(24) Furthermore, the metadata MD(Ox, Sy) of the audio object Ox and/or the sound source Sy may be described more precisely by other data: the distance ranges C0 to C1, T1, D1 to D2 may be defined as distance range data DRD or distance effect functions, a motion path MP may be defined as motion path data MPD, a random position area A to B may be defined by random position area data, and/or effects, time, events and groups may be defined by parameters and/or functions.
(25) Additionally, as another input IP3, configuration settings CS of the audio reproduction system 3, in particular of the audio systems 3.1 to 3.4, are provided, e.g. the kind of the audio systems (virtual or real) and the number and/or position of the loudspeakers of the audio systems, e.g. the position of the loudspeakers relative to the listener L.
(26) Moreover, as another input IP4 audio data AD(Ox), AD(Sy) of the audio object Ox and/or of the sound source Sy, are provided.
(27) The panning information provider 4 processes the input data of at least one of the above described inputs IP1 to IP4 to generate, as panning information PI, PI(3.1 to 3.4), at least one parameter, in particular a signal intensity I(3.1 to 3.4, Ox, Sy) and/or an angular position α(3.1 to 3.4, Ox, Sy) of the same audio object Ox and/or the same sound source Sy for each audio system 3.1 to 3.4, to differently drive those audio systems 3.1 to 3.4 in such a manner that the same audio object Ox and/or the same sound source Sy is panned in the acoustic scene 2 between the inner border of the inner audio range C0 and the outer border of the outer audio range D2 within the respective audio ranges C0 to C1, D1 to D2 of the audio systems 3.1 to 3.4.
(28) In particular, at least one of the audio systems 3.1 reproduces the audio object Ox and/or the sound source Sy in at least one first close range C0 to a listener L and another of the audio systems 3.2 reproduces the audio object Ox and/or the sound source Sy in at least one second distant range D1 to the listener L. In the case that both audio systems 3.1 and 3.2 reproduce the same audio object Ox and/or the same sound source Sy, then that audio object Ox and/or that sound source Sy is panned in a transfer range T1 between the close range C0 and the distant range D1 as it is shown in
(29) Preferably, the angular positions α(3.1 to 3.4, Ox, Sy) of the same audio object Ox and/or the same sound source Sy are equal for the audio systems 3.1 to 3.4, so that the audio object Ox and/or the sound source Sy appears to pan in the same direction. Alternatively, the angular positions α(3.1 to 3.4, Ox, Sy) may differ to achieve special sound effects.
(30) In a further embodiment, the parameters of the panning information PI, in particular the signal intensity I of the same audio object Ox and/or the same sound source Sy for the audio systems 3.1 to 3.4, are extracted from metadata MD and/or the configuration settings CS of the audio systems 3.1 to 3.4.
(31) The panning information provider 4 is a computer-readable recording medium having a computer program for executing the method described above. The audio reproduction system 3 in combination with the panning information provider 4 may be used for executing the described method in interactive gaming scenarios, software scenarios or movie scenarios and/or other scenarios, e.g. process monitoring scenarios, manufacturing scenarios.
(32)
(33) The panning of the audio object Ox and/or the sound source Sy within the transfer range T1 and thus between the close range C0 and the distant range D1 is created by both audio systems 3.1 and 3.2. In particular, each audio system 3.1 and 3.2 is controlled by the extracted parameters of the panning information PI(3.1, 3.2), in particular a given angular position α(3.1, Ox, Sy), α(3.2, Ox, Sy) and a given intensity I(3.1, Ox, Sy), I(3.2, Ox, Sy), of the same audio object Ox or the same sound source Sy to respectively reproduce the same audio object Ox or the same sound source Sy in such a manner that it sounds as if this audio object Ox or this sound source Sy is located in a respective direction and at a respective distance within the transfer range T1 relative to the position X of the listener L.
(34)
(35) The distance effect functions e(3.1, 3.2), which serve as the intensities I(3.1, 3.2), are subdivided into other given distance effect functions g0, h0, i0 used to control the respective audio systems 3.1 and 3.2 for creating the distance ranges C0, T1 and D1.
(36) Alternatively, the distance effect functions e may be prioritized or adapted to ensure special sound effects at least in the transfer range T1, wherein the audio systems 3.1 to 3.2 will be alternatively or additionally controlled by the distance effect functions e(3.1) and e(3.2) to create at least the transfer zone T1 as it is shown in
(37) In the shown embodiment, the panning information PI, namely the distance effect functions e(3.1) and e(3.2) are extracted or determined from given or predefined distance effect functions g0, h0 and i0 depending on the distances r of the reproducing audio object Ox/the sound source Sy to the listener L for panning that audio object Ox and/or that sound source Sy at least in one of the audio ranges C0, T1 and/or D1.
(38) In particular, according to the extracted panning information PI, namely the distance effect functions e(3.1) and e(3.2), the sound effects of the audio object Ox and/or the sound source Sy are respectively reproduced by the first audio system 3.1 and/or second audio system 3.2 at least in a given distance r to the position X of the listener L within at least one of the distance ranges C0, T1 and/or D1 and with a respective intensity I corresponding to the extracted distance effect functions e(3.1) and e(3.2).
(39) As it is shown in
(40) In this embodiment the at least two audio systems 3.1, 3.2 in conjunction create all audio ranges C0, T1, D1 according to the effect intensities e extracted from the distance effect functions g0, h0 and i0.
(41) In particular, the same audio object Ox and/or the same sound source Sy is reproduced as follows. At a distance r of up to r1=3 m from the listener L, the audio system 3.1 creating the proximity area will be driven by the linear function g0(3.1) with a constant effect intensity e(3.1)=g0(3.1)=e2 of 100%, and the audio system 3.2 creating the distant area will be driven by the linear function g0(3.2) with a constant effect intensity e(3.2)=g0(3.2)=e1 of 0%. In the area between the distance r1 and the distance r2, and thus between 3 m and 5 m from the listener L, the audio system 3.1 creating the proximity area will preferably also be driven by a linear distance effect function h0(3.1) with a monotonically decreasing effect intensity from e(3.1, r1)=h0(3.1, r1)=e2 of 100% to e(3.1, r2)=h0(3.1, r2)=e1 of 0%, and the audio system 3.2 creating the distant area will be driven by the linear distance effect function h0(3.2) with a monotonically increasing effect intensity from e(3.2, r1)=h0(3.2, r1)=e1 of 0% to e(3.2, r2)=h0(3.2, r2)=e2 of 100%. Alternatively, the distance effect functions e(3.1) to e(3.2) may be extracted from nonlinear functions h1 to hx in the same manner. At a distance r greater than r2=5 m from the listener L, the audio system 3.1 creating the proximity area will be driven by the linear distance effect function i0(3.1) with a constant effect intensity e(3.1)=i0(3.1)=e1 of 0%, and the audio system 3.2 creating the distant area will be driven by the linear distance effect function i0(3.2) with a constant effect intensity e(3.2)=i0(3.2)=e2 of 100%.
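The piecewise behaviour of this example can be sketched in Python. This is an illustration only, not the patented implementation: the function name `effect_intensities` and its parameter names are hypothetical, intensities are written as fractions (1.0 = 100%), and the defaults r1 = 3 m, r2 = 5 m, e1 = 0% and e2 = 100% are taken from the example above with the linear function h0 in the transfer range.

```python
def effect_intensities(r, r1=3.0, r2=5.0, e1=0.0, e2=1.0):
    """Return (e(3.1), e(3.2)) for an object at distance r in metres.

    Hypothetical helper: g0 applies up to r1, the linear h0 between
    r1 and r2, and i0 beyond r2, following the example in the text.
    """
    if r <= r1:                       # close range C0: g0 is constant
        return e2, e1
    if r >= r2:                       # distant range D1: i0 is constant
        return e1, e2
    t = (r - r1) / (r2 - r1)          # 0 at r1, 1 at r2 (transfer range T1)
    near = e2 + (e1 - e2) * t         # h0(3.1): falls from e2 to e1
    far = e1 + (e2 - e1) * t          # h0(3.2): rises from e1 to e2
    return near, far
```

With these defaults, an object at 4 m, halfway through the transfer range T1, drives both audio systems at 50%, while an object at 1 m is reproduced by audio system 3.1 alone.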
(42)
(43)
(44) The transfer range T1 is subdivided by a circumferential structure Z which is in a given distance r3 to the listener L. Further distances r4 and r5 are determined, wherein the distance r4 represents the distance from the circumferential structure Z to the outer surface of the close range C0 and the distance r5 represents the distance from the circumferential structure Z to the inner surface of the distant range D1.
(45) In particular, the audio system 3.1 in conjunction with the audio system 3.2 is controlled by at least one parameter of the panning information PI, in particular a given angular position α(3.1) and/or a given intensity I(3.1), of the audio object Ox or the sound source Sy, which is respectively reproduced and panned in such a manner that this audio object Ox(r4, r5) or this sound source Sy(r4, r5) seems to be located in a respective direction and at respective distances r4, r5 within the transfer range T1 relative to the position X of the listener L.
(46) Additionally, the audio system 3.2 in conjunction with the audio system 3.1 is controlled by at least another parameter of the panning information PI, in particular a given angular position α(3.2) and/or a given intensity I(3.2), of the audio object Ox or the sound source Sy, which is respectively reproduced and panned in such a manner that this audio object Ox(r4, r5) or this sound source Sy(r4, r5) seems to be located in a respective direction and at respective distances r4, r5 within the transfer range T1 relative to the position X of the listener L.
(47)
(48) The outer and/or the inner circumferential shapes of the ranges C0 and D1 are irregular and thus differ from each other. The panning of the audio object Ox and/or the sound source Sy within the transfer range T1 and thus between the close range C0 and the distant range D1 is created by both audio systems 3.1 and 3.2 analogous to the embodiment of
(49)
(50) According to the position, and thus to the distance r1, r2, of the audio object Ox and/or the sound source Sy relative to the position X of the listener L, the distance effect functions e used to control the available audio systems 3.1 and 3.2 may be extracted from other given or predefined linear and/or non-linear distance effect functions g0, h0 to hx and i0 for an automatic panning of the audio object Ox/sound source Sy. For an audio object Ox/a sound source Sy moving at a distance between 3 m and 5 m, the distance effect functions e will be extracted from one of the predefined linear and/or non-linear distance effect functions h0 to hx; for an object at a distance of less than 3 m, from the predefined distance effect functions g0; and for an object at a distance greater than 5 m, from the predefined distance effect functions i0.
(51) In this embodiment the at least two audio systems 3.1, 3.2 in conjunction create all distance ranges C0, T1, D1 according to the effect intensities e extracted from the distance effect functions g0, h0 to hx and i0.
(52) Generally, the sum of the distance effect functions e(3.1) to e(3.n) is 100%. For instance, in the case that the audio reproduction system 3 comprises two audio systems 3.1, 3.2 then two distance effect functions e(3.1) and e(3.2) are provided as follows:
e(3.1)+e(3.2)=100% [0]
(53) In this embodiment, only one distance effect function, for example e(3.2), needs to be provided, as the other distance effect function e(3.1) may be derived from it.
(54) In particular, the same audio object Ox and/or the same sound source Sy is reproduced as follows. At a distance r of up to r1=3 m from the listener L, the audio system 3.1 creating the proximity area will be driven by the linear distance effect function g0(3.1) with a constant effect intensity e(3.1)=1-g0(3.2)=1-e1 of 70%, and the audio system 3.2 creating the distant area will be driven by the linear distance effect function g0(3.2) with a constant effect intensity e(3.2)=g0(3.2)=e1 of 30%. In the area between the distance r1 and the distance r2, and thus between 3 m and 5 m from the listener L, the audio system 3.1 creating the proximity area will preferably also be driven by a linear distance effect function h0(3.1) with a monotonically decreasing effect intensity from e(3.1, r1)=1-e(3.2, r1)=1-e1 of 70% to e(3.1, r2)=1-e(3.2, r2)=1-e2 of 20%, and the audio system 3.2 creating the distant area will be driven by the linear distance effect function h0(3.2) with a monotonically increasing effect intensity from e(3.2, r1)=h0(3.2, r1)=e1 of 30% to e(3.2, r2)=h0(3.2, r2)=e2 of 80%. Alternatively, the effect intensities e(3.1) to e(3.2) may be extracted from non-linear functions h1 to hx in the same manner, e.g. to achieve special sound effects in the panning area. At a distance r greater than r2=5 m from the listener L, the audio system 3.1 creating the proximity area will be driven by the linear distance effect function i0(3.1) with a constant effect intensity e(3.1)=1-i0(3.2)=1-e2 of 20%, and the audio system 3.2 creating the distant area will be driven by the linear distance effect function i0(3.2) with a constant effect intensity e(3.2)=i0(3.2)=e2 of 80%.
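The following sketch illustrates how e(3.1) might be derived from a single supplied function e(3.2) so that equation [0] always holds, using the 30% and 80% values of this example. The names `e_far`, `e_near` and `smoothstep` are hypothetical, and the smoothstep curve is only an assumed stand-in for a non-linear function hx; the text does not specify any particular non-linear shape.

```python
def smoothstep(t):
    """Assumed example of a non-linear transfer curve (not from the source)."""
    return t * t * (3.0 - 2.0 * t)


def e_far(r, r1=3.0, r2=5.0, e1=0.3, e2=0.8, shape=None):
    """Effect intensity e(3.2) of the distant audio system at distance r."""
    if r <= r1:
        return e1                     # g0(3.2): constant 30%
    if r >= r2:
        return e2                     # i0(3.2): constant 80%
    t = (r - r1) / (r2 - r1)          # position inside the transfer range T1
    if shape is not None:
        t = shape(t)                  # optional non-linear h1..hx
    return e1 + (e2 - e1) * t         # h0(3.2) when shape is None


def e_near(r, **kwargs):
    """e(3.1) derived from e(3.2), so both always sum to 100%."""
    return 1.0 - e_far(r, **kwargs)
```

Because `e_near` is defined as the complement of `e_far`, the sum stays at 100% for any curve passed as `shape`, which is the reason only one distance effect function needs to be stored.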
(55)
(56)
(57)
(58) For example the acoustic scene 2 may be amended by adapting functions of a number of effect sliders ES shown in
(59) In one possible embodiment the distances r1, r2 of the distance ranges C0 and D1 and thus the inner and outer distances of the transfer range T1 may be slidable according to arrows P1.
(60) According to this embodiment, the close range C0 and the transfer range T1 do not describe a full circle. Instead, the close range C0 and the transfer range T1 are designed as a circular segment around the ear area of the listener L, wherein the circular segment is also changeable. In particular, the angle of the circular segment may be amended by sliding a respective effect slider ES or by another control function according to arrows P2.
(61) In other words: The transfer zone or area between the two distance ranges C0 and D1 may be adapted by an adapting function, in particular a further scaling factor for the radius of the distance ranges C0, T1, D1 and/or the angle of circular segments.
(62)
(63) In particular, an operator OP or a programmable operator function controlling an area from 0° to 360° may be used to freely amend the transfer range T1 in such a manner that a position of the angle leg of the transfer range T1 may be moved, in particular rotated to achieve arbitrary distance ranges C0, T1, D1, in particular close range C0 and transfer range T1 as it is shown in
(64)
(65) The effect slider ES enables an adapting function, in particular a scaling factor f for adapting parameters of the panning information PI. For example, the effect slider ES may be designed for amending basic definitions such as an audio object Ox, a sound source Sy and/or a group of them. Furthermore, other definitions, in particular distances r, intensities I, the time, metadata MD, motion path data MPD, distance range data DRD, distance effect functions e(3.1 to 3.n), circumferential structure Z, position data P etc., may also be amended by another effect slider ES to respectively drive the audio systems 3.1, 3.2.
(66) For example, the effect slider ES enables an additional assignment of a time, a position, a drama and/or other properties and/or events and/or states to at least one audio object Ox and/or sound source Sy and/or to a group of audio objects Ox and/or sound sources Sy by setting of the respective effect slider ES to adapt at least one of the parameters of the panning information, e.g. the distance effect functions e, the intensities I and/or the angles α.
(67) In a possible embodiment, the scaling factor f may be used for adapting the distance effect functions e(3.1) to e(3.2) in the area between effect intensity e1 and e2 of
For all f≧0 and f≦0.5: e1′=e1 [1]
e2′=e1+(e2−e1)*2*f [2]
For all f>0.5 and f≦1: e1′=e1+(e2−e1)*(f−0.5)*2 [3]
e2′=e2 [4]
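Equations [1] to [4] can be written as a small Python function. This is a sketch for illustration: the function name `scale_between` is hypothetical, intensities are expressed as fractions (0.3 for 30%), and the range check on f is an added assumption rather than something the text prescribes.

```python
def scale_between(e1, e2, f):
    """Apply equations [1]-[4]; return (e1', e2') for a slider value f in [0, 1]."""
    if not 0.0 <= f <= 1.0:
        raise ValueError("f must lie in [0, 1]")    # assumption: reject out-of-range f
    if f <= 0.5:
        # [1] e1' = e1, [2] e2' = e1 + (e2 - e1) * 2 * f
        return e1, e1 + (e2 - e1) * 2 * f
    # [3] e1' = e1 + (e2 - e1) * (f - 0.5) * 2, [4] e2' = e2
    return e1 + (e2 - e1) * (f - 0.5) * 2, e2
```

Read this way, f = 0 collapses both intensities to e1, f = 0.5 leaves e1 and e2 unchanged, and f = 1 raises both to e2, so the slider only moves the intensities within the band between e1 and e2.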
(68) In another embodiment, the scaling factor f may be used for adapting the distance effect functions e(3.1) to e(3.2) over the whole distance area from 0% (position of the listener L) to 100% (maximum distance) as follows:
For all f≧0 and f≦0.5: e1′=e1*2*f [5]
e2′=e2*2*f [6]
For all f>0.5 and f≦1: e1′=e1+(1−e1)*(f−0.5)*2 [7]
e2′=e2+(1−e2)*(f−0.5)*2 [8]
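Equations [5] to [8] differ from the previous variant in that the scaled intensities can leave the band between e1 and e2 and span the whole range from 0% to 100%. A minimal Python sketch follows; the function name `scale_whole` is hypothetical and intensities are again assumed to be fractions between 0 and 1.

```python
def scale_whole(e1, e2, f):
    """Apply equations [5]-[8]; return (e1', e2') for a slider value f in [0, 1]."""
    if not 0.0 <= f <= 1.0:
        raise ValueError("f must lie in [0, 1]")    # assumption: reject out-of-range f
    if f <= 0.5:
        # [5] e1' = e1 * 2 * f, [6] e2' = e2 * 2 * f
        return e1 * 2 * f, e2 * 2 * f
    g = (f - 0.5) * 2                               # 0 at f = 0.5, 1 at f = 1
    # [7] e1' = e1 + (1 - e1) * g, [8] e2' = e2 + (1 - e2) * g
    return e1 + (1 - e1) * g, e2 + (1 - e2) * g
```

Here f = 0 scales both intensities down to 0%, f = 0.5 returns e1 and e2 unchanged, and f = 1 pushes both up to 100%.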
(69) The effect slider ES may be designed as a mechanical slider of the audio reproduction system 3 and/or a sound machine and/or a monitoring system. Alternatively, the effect slider ES may be designed as a computer-implemented slider on a screen. Furthermore, the audio reproduction system 3 may comprise a plurality of effect sliders ES.
(70)
(71) As an example shown in
(72) As it is shown in
(73) The adapter 5 processes the motion path data MPD according to e.g. given fixed and/or random positions or a path function to adapt the position data P(Ox, Sy) which are fed to the panning information provider 4 which generates the adapted panning information PI, in particular the adapted parameter of the panning information PI.
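One way the adapter 5 might turn motion path data MPD into position data is linear interpolation between fixed positions. The sketch below is an assumption throughout: the text does not define a data format, so the representation of the path as time-sorted `(time, x, y)` waypoints and the function name `position_on_path` are hypothetical.

```python
from bisect import bisect_right


def position_on_path(path, t):
    """Return the (x, y) position at time t on a waypoint path.

    Assumed format (not from the source): `path` is a list of
    (time, x, y) tuples sorted by strictly increasing time.
    """
    times = [p[0] for p in path]
    if t <= times[0]:
        return path[0][1:]            # before the path starts: first waypoint
    if t >= times[-1]:
        return path[-1][1:]           # after the path ends: last waypoint
    i = bisect_right(times, t)        # path[i - 1] and path[i] bracket t
    t0, x0, y0 = path[i - 1]
    t1, x1, y1 = path[i]
    w = (t - t0) / (t1 - t0)          # interpolation weight in [0, 1)
    return (x0 + (x1 - x0) * w, y0 + (y1 - y0) * w)
```

The interpolated position would then be fed to the panning information provider 4 as adapted position data P(Ox, Sy); a random-position variant or a closed-form path function could replace the interpolation step.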
(74) Additionally, distance range data DRD, e.g. shape, distances r and angles of the audio ranges C0 to C1, T1, D1 to D2, may be fed to the panning information provider 4, which processes and considers them when generating the panning information, e.g. by using simple logic and/or formulas and equations.
(75)
(76) For example, an audio object Ox defined by object data OD as a bee or a noise can sound relative to the listener L and can also follow the motion of the listener L according to motion path data MPD. The reproduction of the audio object Ox according to the motion path data MPD may be prioritized with respect to defined audio ranges C0 to C1, T1, D1 to D2. In other words: the reproduction of the audio object Ox based on motion path data MPD can be provided with or without using the audio ranges C0 to C1, T1, D1 to D2. Such a reproduction enables immersive 2D and/or 3D live sound effects.
(77)
(78)
(79)
LIST OF REFERENCES
(80)
1 environment
2 acoustic scene
3 audio reproduction system
3.1 to 3.4 audio system
4 panning information provider
A to B random position areas
AD audio data
C0 . . . Cm close range
CS configuration settings
D1 . . . Dn distant range
DRD distance range data
e1, e2 effect intensities
e(3.1), e(3.2), g0, h1 . . . hx, i0 distance effect functions
ES effect slider
I intensity
IP1 . . . IP5 inputs
L listener
MD metadata
MP motion path
MPD motion path data
Ox audio object
P position data
PI panning information
P0 to P5 arrows
r1 to r5 distance
S1 to S4 steps
Sy sound source
T1 transfer range
Z circumferential structure
α angular position