Audio encoder device and an audio decoder device having efficient gain coding in dynamic range control
RE049107 · 2022-06-14
Inventors
- Fabian Kuech (Erlangen, DE)
- Christian Uhle (Ursensollen, DE)
- Michael Kratschmer (Fuerth, DE)
- Bernhard Neugebauer (Buckenhof, DE)
- Michael Meier (Aurachtal, DE)
- Stephan Schreiner (Birgland, DE)
Cpc classification
G10L19/167
PHYSICS
Abstract
An audio encoder device includes an audio encoder configured for producing an encoded audio bitstream from an audio signal having consecutive audio frames; a dynamic range control encoder configured for producing an encoded dynamic range control bitstream from .[.an.]. .Iadd.a .Iaddend.dynamic range control sequence corresponding to the audio signal and having consecutive dynamic range control frames, wherein each dynamic range control frame of the dynamic range control frames has one or more nodes, wherein each node of the one or more nodes has gain information for the audio signal and time information indicating to which point in time the gain information corresponds.
Claims
1. An audio encoder device comprising: an audio encoder configured for producing an encoded audio bitstream from an audio signal comprising consecutive audio frames; a dynamic range control encoder configured for producing an encoded dynamic range control bitstream from .[.an.]. .Iadd.a .Iaddend.dynamic range control sequence corresponding to the audio signal and comprising consecutive dynamic range control frames, wherein each dynamic range control frame of the dynamic range control frames comprises one or more nodes, wherein each node of the one or more nodes comprises gain information for the audio signal and time information indicating to which point in time the gain information corresponds; wherein the dynamic range control encoder is configured in such a way that the encoded dynamic range control bitstream comprises for each dynamic range control frame of the dynamic range control frames a corresponding bitstream portion; wherein the dynamic range control encoder is configured for executing a shift procedure, wherein one or more nodes of the nodes of one reference dynamic range control frame of the dynamic range control frames are selected as shifted nodes, wherein a bit representation of each of the one or more shifted nodes of the one reference dynamic range control frame is embedded in the bitstream portion corresponding to the dynamic range control frame subsequent to the one reference dynamic range control frame, wherein a bit representation of each remaining node of the nodes of the one reference dynamic range control frame of the dynamic range control frames is embedded into the bitstream portion corresponding to the one reference dynamic range control frame.
2. The audio encoder device according to claim 1, wherein the shift procedure is initiated in case that a number of the nodes of the reference dynamic range control frame is greater than a predefined threshold value.
3. The audio encoder device according to claim 1, wherein the shift procedure is initiated in case that a sum of a number of the nodes of the reference dynamic range control frame and a number of shifted nodes from the dynamic range control frame preceding the reference dynamic range control frame to be embedded in the bitstream portion corresponding to the reference dynamic range control frame is greater than a predefined threshold value.
4. The audio encoder device according to claim 1, wherein the shift procedure is initiated in case that a sum of a number of the nodes of the reference dynamic range control frame and a number of shifted nodes from the dynamic range control frame preceding the reference dynamic range control frame to be embedded in the bitstream portion corresponding to the reference dynamic range control frame is greater than a number of the nodes of the dynamic range control frame subsequent to the reference dynamic range control frame.
5. The audio encoder device according to claim 1, wherein the time information of the one or more nodes is represented in such a way that the one or more shifted nodes may be identified by using the time information.
6. The audio encoder device according to claim 5, wherein the time information of the one or more shifted nodes is represented by a sum of a time difference from a beginning of the dynamic range control frame to which the respective node belongs to the temporal position of the respective node within the dynamic range control frame to which the respective node belongs and an offset value being greater than or equal to a temporal size of the dynamic range control frame subsequent to the respective dynamic range control frame.
7. The audio encoder device according to claim 1, wherein the gain information of the bit representation of the shifted node, which is at a first position of the bitstream portion corresponding to the dynamic range control frame subsequent to the reference dynamic range control frame, is represented by an absolute gain value and wherein the gain information of each bit representation of the shifted nodes at a position after the bit representation of the node, which is at the first position of the bitstream portion corresponding to the dynamic range control frame subsequent to the reference dynamic range control frame, is represented by a relative gain value which is equal to a difference of a gain value of the bit representation of the respective shifted node and a gain value of the bit representation of the node, which precedes the bit representation of the respective node.
8. The audio encoder device according to claim 1, wherein, in case that the bit representations of one or more shifted nodes of the reference dynamic range control frame are embedded in the bitstream portion corresponding to the dynamic range control frame subsequent to the reference dynamic range control frame, the gain information of the bit representation of the node of the subsequent dynamic range control frame at a first position of the bitstream portion corresponding to the dynamic range control frame subsequent to the reference dynamic range control frame after the one or more positions of the bit representations of the one or more shifted nodes is represented by a relative gain value which is equal to a difference of a gain value of the bit representation of the respective node and a gain value of the bit representation of the shifted node, which precedes the bit representation of the respective node.
9. The audio encoder device according to claim 1, wherein a temporal size of the audio frames is equal to a temporal size of the dynamic range control frames.
10. The audio encoder device according to claim 1, wherein the one or more nodes of one of the dynamic range control frames are selected from a uniform time grid.
11. The audio encoder device according to claim 1, wherein each node of the one or more nodes comprises slope information.
12. The audio encoder device according to claim 1, wherein the dynamic range control encoder is configured for encoding the nodes using an entropy encoding technique.
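The shift procedure of claim 1, together with the time-information convention of claims 5 and 6, can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the names `Node`, `shift_nodes`, `frame_size` and `max_nodes` are hypothetical, the choice to shift the last nodes of a frame is an assumption (the claims leave the selection open), the trigger is a simple node budget per bitstream portion, and the offset is taken to be exactly one frame size, which is the smallest value claim 6 permits when all frames have the same temporal size.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Node:
    time: int       # position inside its own frame, in samples (0 <= time < frame_size)
    gain_db: float  # gain value in dB


def shift_nodes(frames: List[List[Node]], frame_size: int, max_nodes: int) -> List[List[Node]]:
    """Return one bitstream portion (list of nodes) per DRC frame.

    Nodes that do not fit into the portion of their own frame are shifted
    into the portion of the subsequent frame. A shifted node keeps its gain,
    but its time is increased by frame_size so that it can be told apart
    from the regular nodes of that subsequent frame.
    """
    portions: List[List[Node]] = []
    carried: List[Node] = []  # shifted nodes coming from the preceding frame
    for nodes in frames:
        # Assumption: a soft budget of max_nodes per portion, shared between
        # carried-in nodes and the frame's own nodes.
        budget = max(max_nodes - len(carried), 0)
        remaining, shifted = nodes[:budget], nodes[budget:]
        # Shifted nodes of the preceding frame come first in the portion.
        portions.append(carried + remaining)
        carried = [Node(n.time + frame_size, n.gain_db) for n in shifted]
    if carried:
        # Assumption: nodes shifted out of the last frame get an extra portion.
        portions.append(carried)
    return portions
```

With `frame_size=1024` and `max_nodes=2`, a frame holding three nodes keeps its first two nodes in its own portion, and the third node appears at the start of the next portion with its time increased by 1024.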
13. An audio decoder device comprising: an audio decoder configured for decoding an encoded audio bitstream in order to reproduce an audio signal comprising consecutive audio frames; a dynamic range control decoder configured for decoding an encoded dynamic range control bitstream in order to reproduce .[.an.]. .Iadd.a .Iaddend.dynamic range control sequence corresponding to the audio signal and comprising consecutive dynamic range control frames; wherein the encoded dynamic range control bitstream comprises for each dynamic range control frame of the dynamic range control frames a corresponding bitstream portion; wherein the encoded dynamic range control bitstream comprises bit representations of nodes, wherein each bit representation of one node of the nodes comprises gain information for the audio signal and time information indicating to which point in time the gain information corresponds; wherein the encoded dynamic range control bit stream comprises bit representations of shifted nodes selected from the nodes of one reference dynamic range control frame of the dynamic range control frames, which are embedded in a bitstream portion corresponding to the dynamic range control frame subsequent to the one reference dynamic range control frame, wherein the bit representation of each remaining node of the nodes of the one reference dynamic range control frame of the dynamic range control frames is embedded into the bitstream portion corresponding to the one reference dynamic range control frame; and wherein the dynamic range control decoder is configured for decoding the bit representation of each remaining node of the remaining nodes of the one reference dynamic range control frame of the dynamic range control frames in order to reproduce each remaining node of the one reference dynamic range control frame of the dynamic range control frames, for decoding the bit representation of each shifted node of the shifted nodes selected from the nodes of the one reference dynamic range control frame of the dynamic range control frames in order to reproduce each shifted node of the shifted nodes selected from the nodes of the one reference dynamic range control frame of the dynamic range control frames and for combining the reproduced remaining nodes and the reproduced shifted nodes in order to reconstruct the reference dynamic range control frame.
14. The audio decoder device according to claim 13, wherein the dynamic range control decoder is configured for identifying the one or more shifted nodes by using the time information.
15. The audio decoder device according to claim 13, wherein the dynamic range control decoder is configured for decoding the time information of the one or more shifted nodes, which is represented by a sum of a time difference from a beginning of the dynamic range control frame to which the respective node belongs to the temporal position of the respective node within the dynamic range control frame to which the respective node belongs and an offset value being greater than or equal to a temporal size of the dynamic range control frame subsequent to the respective dynamic range control frame.
16. The audio decoder device according to claim 13, wherein the dynamic range control decoder is configured for decoding the gain information of the bit representation of the shifted node, which is at a first position of the bitstream portion corresponding to the dynamic range control frame subsequent to the reference dynamic range control frame, the gain information being represented by an absolute gain value, and wherein the gain information of each bit representation of the shifted nodes at a position after the bit representation of the node, which is at the first position of the bitstream portion corresponding to the dynamic range control frame subsequent to the reference dynamic range control frame, is represented by a relative gain value which is equal to a difference of a gain value of the bit representation B′₂ of the respective shifted node B₂ and a gain value of the bit representation of the node, which precedes the bit representation of the respective node.
17. The audio decoder device according to claim 13, wherein the dynamic range control decoder is configured for decoding the gain information of the bit representation of the node of the subsequent dynamic range control frame at a first position of the bitstream portion corresponding to the dynamic range control frame subsequent to the reference dynamic range control frame after the one or more positions of the bit representations of the one or more shifted nodes, the gain information being represented by a relative gain value which is equal to a difference of a gain value of the bit representation of the respective node and a gain value of the bit representation of the shifted node, which precedes the bit representation of the respective node.
18. The audio decoder device according to claim 13, wherein a temporal size of the audio frames is equal to a temporal size of the dynamic range control frames.
19. The audio decoder device according to claim 13, wherein the one or more nodes of one of the dynamic range control frames are selected from a uniform time grid.
20. The audio decoder device according to claim 13, wherein each node of the one or more nodes comprises slope information.
21. The audio decoder device according to claim 13, wherein the dynamic range control decoder is configured for decoding the bit representations of the nodes using an entropy decoding technique.
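The decoder-side reconstruction of claims 13 to 15 can be sketched in Python as follows. This is an illustration under stated assumptions rather than the claimed implementation: the function name `reconstruct_frames` is hypothetical, nodes are modelled as plain `(time, gain_db)` tuples, all frames are assumed to share one `frame_size`, and the offset added to a shifted node is assumed to be exactly one frame size, so that any decoded time greater than or equal to `frame_size` marks a node shifted from the preceding frame.

```python
from typing import List, Tuple

NodeTuple = Tuple[int, float]  # (time in samples, gain in dB)


def reconstruct_frames(portions: List[List[NodeTuple]], frame_size: int) -> List[List[NodeTuple]]:
    """Rebuild the DRC frames from their bitstream portions.

    A node whose decoded time is >= frame_size is treated as a shifted node:
    the offset is removed and the node is returned to the preceding frame,
    where it is combined with that frame's remaining nodes.
    """
    frames: List[List[NodeTuple]] = [[] for _ in portions]
    for index, portion in enumerate(portions):
        for time, gain_db in portion:
            if time >= frame_size:
                if index == 0:
                    raise ValueError("shifted node found in the first portion")
                frames[index - 1].append((time - frame_size, gain_db))
            else:
                frames[index].append((time, gain_db))
    for frame in frames:
        frame.sort(key=lambda node: node[0])  # restore temporal order
    return frames
```

Note that a frame can only be completed once the portion of the subsequent frame has been read, so a decoder following this sketch needs one frame of look-ahead for the gain sequence.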
22. A system comprising an audio encoder device comprising: an audio encoder configured for producing an encoded audio bitstream from an audio signal comprising consecutive audio frames; a dynamic range control encoder configured for producing an encoded dynamic range control bitstream from .[.an.]. .Iadd.a .Iaddend.dynamic range control sequence corresponding to the audio signal and comprising consecutive dynamic range control frames, wherein each dynamic range control frame of the dynamic range control frames comprises one or more nodes, wherein each node of the one or more nodes comprises gain information for the audio signal and time information indicating to which point in time the gain information corresponds; wherein the dynamic range control encoder is configured in such a way that the encoded dynamic range control bitstream comprises for each dynamic range control frame of the dynamic range control frames a corresponding bitstream portion; wherein the dynamic range control encoder is configured for executing a shift procedure, wherein one or more nodes of the nodes of one reference dynamic range control frame of the dynamic range control frames are selected as shifted nodes, wherein a bit representation of each of the one or more shifted nodes of the one reference dynamic range control frame is embedded in the bitstream portion corresponding to the dynamic range control frame subsequent to the one reference dynamic range control frame, wherein a bit representation of each remaining node of the nodes of the one reference dynamic range control frame of the dynamic range control frames is embedded into the bitstream portion corresponding to the one reference dynamic range control frame, and an audio decoder device according to claim 13.
23. A method for operating an audio encoder, the method comprising: producing an encoded audio bitstream from an audio signal comprising consecutive audio frames; producing an encoded dynamic range control bitstream from .[.an.]. .Iadd.a .Iaddend.dynamic range control sequence corresponding to the audio signal and comprising consecutive dynamic range control frames, wherein each dynamic range control frame of the dynamic range control frames comprises one or more nodes, wherein each node of the one or more nodes comprises gain information for the audio signal and time information indicating to which point in time the gain information corresponds.Iadd.; .Iaddend. wherein the encoded dynamic range control bitstream comprises for each dynamic range control frame of the dynamic range control frames a corresponding bitstream portion; executing a shift procedure, wherein one or more nodes of the nodes of one reference dynamic range control frame of the dynamic range control frames are selected as shifted nodes, wherein a bit representation of each of the one or more shifted nodes of the one reference dynamic range control frame is embedded in the bitstream portion corresponding to the dynamic range control frame subsequent to the one reference dynamic range control frame, wherein a bit representation of each remaining node of the nodes of the one reference dynamic range control frame of the dynamic range control frames is embedded into the bitstream portion corresponding to the one reference dynamic range control frame.
24. A method for operating an audio decoder, the method comprising: decoding an encoded audio bitstream in order to reproduce an audio signal comprising consecutive audio frames; decoding an encoded dynamic range control bitstream in order to reproduce .[.an.]. .Iadd.a .Iaddend.dynamic range control sequence corresponding to the audio signal and comprising consecutive dynamic range control frames; wherein the encoded dynamic range control bitstream comprises for each dynamic range control frame of the dynamic range control frames a corresponding bitstream portion; wherein the encoded dynamic range control bitstream comprises bit representations of nodes, wherein each bit representation of one node of the nodes comprises gain information for the audio signal and time information indicating to which point in time the gain information corresponds; wherein the encoded dynamic range control bit stream comprises bit representations of shifted nodes selected from the nodes of one reference dynamic range control frame of the dynamic range control frames, which are embedded in a bitstream portion corresponding to the dynamic range control frame subsequent to the one reference dynamic range control frame, wherein the bit representation of each remaining node of the nodes of the one reference dynamic range control frame of the dynamic range control frames is embedded into the bitstream portion corresponding to the one reference dynamic range control frame; and wherein the bit representation of each remaining node of the remaining nodes of the one reference dynamic range control frame of the dynamic range control frames is decoded in order to reproduce each remaining node of the one reference dynamic range control frame of the dynamic range control frames; wherein the bit representation of each shifted node of the shifted nodes selected from the nodes of the one reference dynamic range control frame of the dynamic range control frames is decoded in order to reproduce each shifted node of the shifted nodes selected from the nodes of the one reference dynamic range control frame of the dynamic range control frames; and wherein the reproduced remaining nodes and the reproduced shifted nodes are combined in order to reconstruct the reference dynamic range control frame.
25. A non-transitory digital storage medium having a computer program stored thereon to perform the method for operating an audio encoder, the method comprising: producing an encoded audio bitstream from an audio signal comprising consecutive audio frames; producing an encoded dynamic range control bitstream from .[.an.]. .Iadd.a .Iaddend.dynamic range control sequence corresponding to the audio signal and comprising consecutive dynamic range control frames, wherein each dynamic range control frame of the dynamic range control frames comprises one or more nodes, wherein each node of the one or more nodes comprises gain information for the audio signal and time information indicating to which point in time the gain information corresponds.Iadd.; .Iaddend. wherein the encoded dynamic range control bitstream comprises for each dynamic range control frame of the dynamic range control frames a corresponding bitstream portion; executing a shift procedure, wherein one or more nodes of the nodes of one reference dynamic range control frame of the dynamic range control frames are selected as shifted nodes, wherein a bit representation of each of the one or more shifted nodes of the one reference dynamic range control frame is embedded in the bitstream portion corresponding to the dynamic range control frame subsequent to the one reference dynamic range control frame, wherein a bit representation of each remaining node of the nodes of the one reference dynamic range control frame of the dynamic range control frames is embedded into the bitstream portion corresponding to the one reference dynamic range control frame, when said computer program is run by a computer.
26. A non-transitory digital storage medium having a computer program stored thereon to perform the method for operating an audio decoder, the method comprising: decoding an encoded audio bitstream in order to reproduce an audio signal comprising consecutive audio frames; decoding an encoded dynamic range control bitstream in order to reproduce .[.an.]. .Iadd.a .Iaddend.dynamic range control sequence corresponding to the audio signal and comprising consecutive dynamic range control frames; wherein the encoded dynamic range control bitstream comprises for each dynamic range control frame of the dynamic range control frames a corresponding bitstream portion; wherein the encoded dynamic range control bitstream comprises bit representations of nodes, wherein each bit representation of one node of the nodes comprises gain information for the audio signal and time information indicating to which point in time the gain information corresponds; wherein the encoded dynamic range control bit stream comprises bit representations of shifted nodes selected from the nodes of one reference dynamic range control frame of the dynamic range control frames, which are embedded in a bitstream portion corresponding to the dynamic range control frame subsequent to the one reference dynamic range control frame, wherein the bit representation of each remaining node of the nodes of the one reference dynamic range control frame of the dynamic range control frames is embedded into the bitstream portion corresponding to the one reference dynamic range control frame; and wherein the bit representation of each remaining node of the remaining nodes of the one reference dynamic range control frame of the dynamic range control frames is decoded in order to reproduce each remaining node of the one reference dynamic range control frame of the dynamic range control frames; wherein the bit representation of each shifted node of the shifted nodes selected from the nodes of the one reference dynamic range control frame of the dynamic range control frames is decoded in order to reproduce each shifted node of the shifted nodes selected from the nodes of the one reference dynamic range control frame of the dynamic range control frames; and wherein the reproduced remaining nodes and the reproduced shifted nodes are combined in order to reconstruct the reference dynamic range control frame, when said computer program is run by a computer.
.Iadd.27. An audio encoder device comprising: an audio encoder configured for producing an encoded audio bitstream from an audio signal comprising consecutive audio frames; a dynamic range control encoder configured for producing an encoded dynamic range control bitstream from a dynamic range control sequence corresponding to the audio signal and comprising consecutive dynamic range control frames, wherein each dynamic range control frame of the dynamic range control frames comprises one or more nodes, wherein each node of the one or more nodes comprises gain information for the audio signal and time information indicating to which point in time the gain information corresponds; wherein the dynamic range control encoder is configured in such a way that the encoded dynamic range control bitstream comprises for each dynamic range control frame of the dynamic range control frames a corresponding bitstream portion; wherein the dynamic range control encoder is configured for executing a shift procedure, wherein one or more nodes of the nodes of one reference dynamic range control frame of the dynamic range control frames are selected as shifted nodes, wherein a bit representation of each of the one or more shifted nodes of the one reference dynamic range control frame is embedded in the bitstream portion corresponding to the dynamic range control frame subsequent to the one reference dynamic range control frame..Iaddend.
.Iadd.28. The audio encoder device according to claim 27, wherein a bit representation of each remaining node of the nodes of the one reference dynamic range control frame of the dynamic range control frames is embedded into the bitstream portion corresponding to the one reference dynamic range control frame..Iaddend.
.Iadd.29. The audio encoder device according to claim 27, wherein the shift procedure is initiated in case that a number of the nodes of the reference dynamic range control frame is greater than a predefined threshold value..Iaddend.
.Iadd.30. The audio encoder device according to claim 27, wherein the shift procedure is initiated in case that a sum of a number of the nodes of the reference dynamic range control frame and a number of shifted nodes from the dynamic range control frame preceding the reference dynamic range control frame to be embedded in the bitstream portion corresponding to the reference dynamic range control frame is greater than a predefined threshold value..Iaddend.
.Iadd.31. The audio encoder device according to claim 27, wherein the shift procedure is initiated in case that a sum of a number of the nodes of the reference dynamic range control frame and a number of shifted nodes from the dynamic range control frame preceding the reference dynamic range control frame to be embedded in the bitstream portion corresponding to the reference dynamic range control frame is greater than a number of the nodes of the dynamic range control frame subsequent to the reference dynamic range control frame..Iaddend.
.Iadd.32. The audio encoder device according to claim 27, wherein the time information of the one or more nodes is represented in such a way that the one or more shifted nodes may be identified by using the time information..Iaddend.
.Iadd.33. The audio encoder device according to claim 32, wherein the time information of the one or more shifted nodes is represented by a sum of a time difference from a beginning of the dynamic range control frame to which the respective node belongs to the temporal position of the respective node within the dynamic range control frame to which the respective node belongs and an offset value being greater than or equal to a temporal size of the dynamic range control frame subsequent to the respective dynamic range control frame..Iaddend.
.Iadd.34. The audio encoder device according to claim 27, wherein the gain information of the bit representation of the shifted node, which is at a first position of the bitstream portion corresponding to the dynamic range control frame subsequent to the reference dynamic range control frame, is represented by an absolute gain value and wherein the gain information of each bit representation of the shifted nodes at a position after the bit representation of the node, which is at the first position of the bitstream portion corresponding to the dynamic range control frame subsequent to the reference dynamic range control frame, is represented by a relative gain value which is equal to a difference of a gain value of the bit representation of the respective shifted node and a gain value of the bit representation of the node, which precedes the bit representation of the respective node..Iaddend.
.Iadd.35. The audio encoder device according to claim 27, wherein, in case that the bit representations of one or more shifted nodes of the reference dynamic range control frame are embedded in the bitstream portion corresponding to the dynamic range control frame subsequent to the reference dynamic range control frame, the gain information of the bit representation of the node of the subsequent dynamic range control frame at a first position of the bitstream portion corresponding to the dynamic range control frame subsequent to the reference dynamic range control frame after the one or more positions of the bit representations of the one or more shifted nodes is represented by a relative gain value which is equal to a difference of a gain value of the bit representation of the respective node and a gain value of the bit representation of the shifted node, which precedes the bit representation of the respective node..Iaddend.
.Iadd.36. The audio encoder device according to claim 27, wherein a temporal size of the audio frames is equal to a temporal size of the dynamic range control frames..Iaddend.
.Iadd.37. The audio encoder device according to claim 27, wherein the one or more nodes of one of the dynamic range control frames are selected from a uniform time grid..Iaddend.
.Iadd.38. The audio encoder device according to claim 27, wherein each node of the one or more nodes comprises slope information..Iaddend.
.Iadd.39. The audio encoder device according to claim 27, wherein the dynamic range control encoder is configured for encoding the nodes using an entropy encoding technique..Iaddend.
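The three alternative conditions for initiating the shift procedure in claims 29 to 31 can be written as simple predicates. The Python sketch below is illustrative only: the function and parameter names are hypothetical, and the node counts are assumed to be known to the encoder before the reference frame is written.

```python
def exceeds_threshold(n_reference: int, threshold: int) -> bool:
    """Claim 29: the reference frame alone has more nodes than a predefined threshold."""
    return n_reference > threshold


def exceeds_threshold_with_carry(n_reference: int, n_carried_in: int, threshold: int) -> bool:
    """Claim 30: reference-frame nodes plus nodes shifted in from the
    preceding frame exceed a predefined threshold."""
    return n_reference + n_carried_in > threshold


def exceeds_subsequent(n_reference: int, n_carried_in: int, n_subsequent: int) -> bool:
    """Claim 31: reference-frame nodes plus nodes shifted in from the
    preceding frame exceed the node count of the subsequent frame."""
    return n_reference + n_carried_in > n_subsequent
```

The third predicate needs the node count of the subsequent frame, so an encoder using it would have to look one frame ahead in the gain sequence.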
.Iadd.40. An audio decoder device comprising: an audio decoder configured for decoding an encoded audio bitstream in order to reproduce an audio signal comprising consecutive audio frames; a dynamic range control decoder configured for decoding an encoded dynamic range control bitstream in order to reproduce a dynamic range control sequence corresponding to the audio signal and comprising consecutive dynamic range control frames; wherein the encoded dynamic range control bitstream comprises for each dynamic range control frame of the dynamic range control frames a corresponding bitstream portion; wherein the encoded dynamic range control bitstream comprises bit representations of nodes, wherein each bit representation of one node of the nodes comprises gain information for the audio signal and time information indicating to which point in time the gain information corresponds; wherein the encoded dynamic range control bit stream comprises bit representations of shifted nodes selected from the nodes of one reference dynamic range control frame of the dynamic range control frames, which are embedded in a bitstream portion corresponding to the dynamic range control frame subsequent to the one reference dynamic range control frame; and wherein the dynamic range control decoder is configured for decoding the bit representation of each shifted node of the shifted nodes selected from the nodes of the one reference dynamic range control frame of the dynamic range control frames in order to reproduce each shifted node of the shifted nodes selected from the nodes of the one reference dynamic range control frame of the dynamic range control frames..Iaddend.
.Iadd.41. The audio decoder device according to claim 40, wherein the bit representation of each remaining node of the nodes of the one reference dynamic range control frame of the dynamic range control frames is embedded into the bitstream portion corresponding to the one reference dynamic range control frame, wherein the dynamic range control decoder is configured for decoding the bit representation of each remaining node of the remaining nodes of the one reference dynamic range control frame of the dynamic range control frames in order to reproduce each remaining node of the one reference dynamic range control frame of the dynamic range control frames and for combining the reproduced remaining nodes and the reproduced shifted nodes in order to reconstruct the reference dynamic range control frame..Iaddend.
.Iadd.42. The audio decoder device according to claim 40, wherein the dynamic range control decoder is configured for identifying the one or more shifted nodes by using the time information..Iaddend.
.Iadd.43. The audio decoder device according to claim 40, wherein the dynamic range control decoder is configured for decoding the time information of the one or more shifted nodes, which is represented by a sum of a time difference from a beginning of the dynamic range control frame to which the respective node belongs to the temporal position of the respective node within the dynamic range control frame to which the respective node belongs and an offset value being greater than or equal to a temporal size of the dynamic range control frame subsequent to the respective dynamic range control frame..Iaddend.
.Iadd.44. The audio decoder device according to claim 40, wherein the dynamic range control decoder is configured for decoding the gain information of the bit representation of the shifted node, which is at a first position of the bitstream portion corresponding to the dynamic range control frame subsequent to the reference dynamic range control frame, is represented by an absolute gain value and wherein the gain information of each bit representation of the shifted nodes at a position after the bit representation of the node, which is at the first position of the bitstream portion corresponding to the dynamic range control frame subsequent to the reference dynamic range control frame, is represented by a relative gain value which is equal to a difference of a gain value of the bit representation B′.sub.2 of the respective shifted node B.sub.2 and a gain value of the bit representation of the node, which precedes the bit representation of the respective node..Iaddend.
.Iadd.45. The audio decoder device according to claim 40, wherein the dynamic range control decoder is configured for decoding the gain information of the bit representation of the node of the subsequent dynamic range control frame at a first position of the bitstream portion corresponding to the dynamic range control frame subsequent to the reference dynamic range control frame after the one or more positions of the bit representations of the one or more shifted nodes is represented by a relative gain value which is equal to a difference of a gain value of the bit representation of the respective node and a gain value of the bit representation of the shifted node, which precedes the bit representation of the respective node..Iaddend.
.Iadd.46. The audio decoder device according to claim 40, wherein a temporal size of the audio frames is equal to a temporal size of the dynamic range control frames..Iaddend.
.Iadd.47. The audio decoder device according to claim 40, wherein the one or more nodes of one of the dynamic range control frames are selected from a uniform time grid..Iaddend.
.Iadd.48. The audio decoder device according to claim 40, wherein each node of the one or more nodes comprises slope information..Iaddend.
.Iadd.49. The audio decoder device according to claim 40, wherein the dynamic range control decoder is configured for decoding the bit representations of the nodes using an entropy decoding technique..Iaddend.
.Iadd.50. A system comprising an audio encoder device comprising: an audio encoder configured for producing an encoded audio bitstream from an audio signal comprising consecutive audio frames; a dynamic range control encoder configured for producing an encoded dynamic range control bitstream from a dynamic range control sequence corresponding to the audio signal and comprising consecutive dynamic range control frames, wherein each dynamic range control frame of the dynamic range control frames comprises one or more nodes, wherein each node of the one or more nodes comprises gain information for the audio signal and time information indicating to which point in time the gain information corresponds; wherein the dynamic range control encoder is configured in such way that the encoded dynamic range control bitstream comprises for each dynamic range control frame of the dynamic range control frames a corresponding bitstream portion; wherein the dynamic range control encoder is configured for executing a shift procedure, wherein one or more nodes of the nodes of one reference dynamic range control frame of the dynamic range control frames are selected as shifted nodes, wherein a bit representation of each of the one or more shifted nodes of the one reference dynamic range control frame is embedded in the bitstream portion corresponding to the dynamic range control frame subsequent to the one reference dynamic range control frame, and an audio decoder device according to claim 40..Iaddend.
.Iadd.51. A method for operating an audio encoder, the method comprising: producing an encoded audio bitstream from an audio signal comprising consecutive audio frames; producing an encoded dynamic range control bitstream from a dynamic range control sequence corresponding to the audio signal and comprising consecutive dynamic range control frames, wherein each dynamic range control frame of the dynamic range control frames comprises one or more nodes, wherein each node of the one or more nodes comprises gain information for the audio signal and time information indicating to which point in time the gain information corresponds; wherein the encoded dynamic range control bitstream comprises for each dynamic range control frame of the dynamic range control frames a corresponding bitstream portion; executing a shift procedure, wherein one or more nodes of the nodes of one reference dynamic range control frame of the dynamic range control frames are selected as shifted nodes, wherein a bit representation of each of the one or more shifted nodes of the one reference dynamic range control frame is embedded in the bitstream portion corresponding to the dynamic range control frame subsequent to the one reference dynamic range control frame..Iaddend.
.Iadd.52. A method for operating an audio decoder, the method comprising: decoding an encoded audio bitstream in order to reproduce an audio signal comprising consecutive audio frames; decoding an encoded dynamic range control bitstream in order to reproduce a dynamic range control sequence corresponding to the audio signal and comprising consecutive dynamic range control frames; wherein the encoded dynamic range control bitstream comprises for each dynamic range control frame of the dynamic range control frames a corresponding bitstream portion; wherein the encoded dynamic range control bitstream comprises bit representations of nodes, wherein each bit representation of one node of the nodes comprises gain information for the audio signal and time information indicating to which point in time the gain information corresponds; wherein the encoded dynamic range control bit stream comprises bit representations of shifted nodes selected from the nodes of one reference dynamic range control frame of the dynamic range control frames, which are embedded in a bitstream portion corresponding to the dynamic range control frame subsequent to the one reference dynamic range control frame, wherein the bit representation of each shifted node of the shifted nodes selected from the nodes of the one reference dynamic range control frame of the dynamic range control frames is decoded in order to reproduce each shifted node of the shifted nodes selected from the nodes of the one reference dynamic range control frame of the dynamic range control frames..Iaddend.
.Iadd.53. A non-transitory digital storage medium having a computer program stored thereon to perform the method for operating an audio encoder, the method comprising: producing an encoded audio bitstream from an audio signal comprising consecutive audio frames; producing an encoded dynamic range control bitstream from a dynamic range control sequence corresponding to the audio signal and comprising consecutive dynamic range control frames, wherein each dynamic range control frame of the dynamic range control frames comprises one or more nodes, wherein each node of the one or more nodes comprises gain information for the audio signal and time information indicating to which point in time the gain information corresponds; wherein the encoded dynamic range control bitstream comprises for each dynamic range control frame of the dynamic range control frames a corresponding bitstream portion; executing a shift procedure, wherein one or more nodes of the nodes of one reference dynamic range control frame of the dynamic range control frames are selected as shifted nodes, wherein a bit representation of each of the one or more shifted nodes of the one reference dynamic range control frame is embedded in the bitstream portion corresponding to the dynamic range control frame subsequent to the one reference dynamic range control frame, when said computer program is run by a computer..Iaddend.
.Iadd.54. A non-transitory digital storage medium having a computer program stored thereon to perform the method for operating an audio decoder, the method comprising: decoding an encoded audio bitstream in order to reproduce an audio signal comprising consecutive audio frames; decoding an encoded dynamic range control bitstream in order to reproduce a dynamic range control sequence corresponding to the audio signal and comprising consecutive dynamic range control frames; wherein the encoded dynamic range control bitstream comprises for each dynamic range control frame of the dynamic range control frames a corresponding bitstream portion; wherein the encoded dynamic range control bitstream comprises bit representations of nodes, wherein each bit representation of one node of the nodes comprises gain information for the audio signal and time information indicating to which point in time the gain information corresponds; wherein the encoded dynamic range control bit stream comprises bit representations of shifted nodes selected from the nodes of one reference dynamic range control frame of the dynamic range control frames, which are embedded in a bitstream portion corresponding to the dynamic range control frame subsequent to the one reference dynamic range control frame, wherein the bit representation of each shifted node of the shifted nodes selected from the nodes of the one reference dynamic range control frame of the dynamic range control frames is decoded in order to reproduce each shifted node of the shifted nodes selected from the nodes of the one reference dynamic range control frame of the dynamic range control frames; when said computer program is run by a computer..Iaddend.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
(1) Embodiments of the present invention will be detailed subsequently referring to the appended drawings, in which:
(2)
(3)
(4)
(5)
(6)
(7)
(8)
(9)
(10)
DETAILED DESCRIPTION OF THE INVENTION
(11)
(12) an audio encoder 2 configured for producing an encoded audio bitstream ABS from an audio signal AS comprising consecutive audio frames AFP, AFR, AFS;
(13) a dynamic range control encoder 3 configured for producing an encoded dynamic range control bitstream DBS from a dynamic range control sequence DS corresponding to the audio signal AS and comprising consecutive dynamic range control frames DFP, DFR, DFS, wherein each dynamic range control frame DFP, DFR, DFS of the dynamic range control frames DFP, DFR, DFS comprises one or more nodes A.sub.0 . . . A.sub.5; B.sub.0 . . . B.sub.2; C.sub.0, wherein each node of the one or more nodes A.sub.0 . . . A.sub.5; B.sub.0 . . . B.sub.2; C.sub.0 comprises gain information GA.sub.0 . . . GA.sub.5; GB.sub.0 . . . GB.sub.2; GC.sub.0 for the audio signal AS and time information TA.sub.0 . . . TA.sub.5; TB.sub.0 . . . TB.sub.2; TC.sub.0 indicating to which point in time the gain information GA.sub.0 . . . GA.sub.5; GB.sub.0 . . . GB.sub.2; GC.sub.0 corresponds;
(14) wherein the dynamic range control encoder 3 is configured in such way that the encoded dynamic range control bitstream DBS comprises for each dynamic range control frame DFP, DFR, DFS of the dynamic range control frames DFP, DFR, DFS a corresponding bitstream portion DFP′, DFR′, DFS′;
(15) wherein the dynamic range control encoder 3 is configured for executing a shift procedure, wherein one or more nodes B.sub.1, B.sub.2 of the nodes B.sub.0 . . . B.sub.2 of one reference dynamic range control frame DFR of the dynamic range control frames DFP, DFR, DFS are selected as shifted nodes B.sub.1, B.sub.2, wherein a bit representation B′.sub.1, B′.sub.2 of each of the one or more shifted nodes B.sub.1, B.sub.2 of the one reference dynamic range control frame DFR is embedded in the bitstream portion DFS′ corresponding to the dynamic range control frame DFS subsequent to the one reference dynamic range control frame DFR, wherein a bit representation B′.sub.0 of each remaining node B.sub.0 of the nodes B.sub.0 . . . B.sub.2 of the one reference dynamic range control frame DFR of the dynamic range control frames DFP, DFR, DFS is embedded into the bitstream portion DFR′ corresponding to the one reference dynamic range control frame DFR. The invention allows controlling the peak bitrate that may be used for a reference dynamic range control frame DFR without changing the resulting bitstream sequence DBS compared to the case where the proposed method is not used. The proposed approach exploits the inherent delay of one frame introduced by state-of-the-art audio coders to reduce peaks in the number of nodes within one frame by distributing the transmission of some of the nodes to the subsequent dynamic range control frame. The details of the proposed method are presented in the following.
(16) As explained above, when combined with an audio coding scheme that introduces a frame delay relative to the dynamic range control gains, the decoded dynamic range control gains are delayed by one frame before being applied to the audio signal. This means that the nodes of the reference dynamic range control frame are applied to the valid audio decoder output at the dynamic range control frame subsequent to the reference dynamic range control frame. This implies that in the default delay mode it is sufficient to transmit the nodes of the reference dynamic range control frame together with the nodes of the dynamic range control frame subsequent to the reference dynamic range control frame and apply the corresponding dynamic range control gains without a delay directly to the corresponding audio output signal at the decoder.
(17) This fact is exploited in the invention in order to reduce the maximum number of nodes transmitted within one dynamic range control frame. According to the invention some of the nodes of the reference dynamic range control frame are shifted to the subsequent dynamic range control frame, which may be done before encoding. As will be discussed in the following, the shifted nodes may be “preceding” the first node in the subsequent dynamic range control frame only for the encoding of the gain differences and the slope information. For the coding of the time difference information, a different method may be applied.
(18) In the example shown in
(19) According to an advantageous embodiment of the invention a temporal size of the audio frames AFP, AFR, AFS is equal to a temporal size of the dynamic range control frames DFP, DFR, DFS.
(20) According to an advantageous embodiment of the invention the one or more nodes A.sub.0 . . . A.sub.5; B.sub.0 . . . B.sub.2; C.sub.0 of one of the dynamic range control frames DFP, DFR, DFS are selected from a uniform time grid.
(21) According to an advantageous embodiment of the invention the dynamic range control encoder 3 is configured for encoding the nodes A.sub.0 . . . A.sub.5; B.sub.0 . . . B.sub.2; C.sub.0 using an entropy encoding technique.
(22) In a further aspect the invention provides a method for operating an audio encoder 1, the method comprises the steps:
(23) producing an encoded audio bitstream ABS from an audio signal AS comprising consecutive audio frames AFP, AFR, AFS;
(24) producing an encoded dynamic range control bitstream DBS from a dynamic range control sequence DS corresponding to the audio signal AS and comprising consecutive dynamic range control frames DFP, DFR, DFS, wherein each dynamic range control frame DFP, DFR, DFS of the dynamic range control frames DFP, DFR, DFS comprises one or more nodes A.sub.0 . . . A.sub.5; B.sub.0 . . . B.sub.2; C.sub.0, wherein each node of the one or more nodes A.sub.0 . . . A.sub.5; B.sub.0 . . . B.sub.2; C.sub.0 comprises gain information GA.sub.0 . . . GA.sub.5; GB.sub.0 . . . GB.sub.2; GC.sub.0 for the audio signal AS and time information TA.sub.0 . . . TA.sub.5; TB.sub.0 . . . TB.sub.2; TC.sub.0 indicating to which point in time the gain information corresponds;
(25) wherein the encoded dynamic range control bitstream DBS comprises for each dynamic range control frame DFP, DFR, DFS of the dynamic range control frames DFP, DFR, DFS a corresponding bitstream portion DFP′, DFR′, DFS′;
(26) executing a shift procedure, wherein one or more nodes B.sub.1, B.sub.2 of the nodes B.sub.0 . . . B.sub.2 of one reference dynamic range control frame DFR of the dynamic range control frames DFP, DFR, DFS are selected as shifted nodes B.sub.1, B.sub.2, wherein a bit representation B′.sub.1, B′.sub.2 of each of the one or more shifted nodes B.sub.1, B.sub.2 of the one reference dynamic range control frame DFR is embedded in the bitstream portion DFS′ corresponding to the dynamic range control frame DFS subsequent to the one reference dynamic range control frame DFR, wherein a bit representation B′.sub.0 of each remaining node B.sub.0 of the nodes B.sub.0 . . . B.sub.2 of the one reference dynamic range control frame DFR of the dynamic range control frames DFP, DFR, DFS is embedded into the bitstream portion DFR′ corresponding to the one reference dynamic range control frame DFR.
(27)
(28) The process of applying DRC to a signal can be expressed by a simple multiplication of the audio signal x(k) by a time-variant gain value g(k):
y(k) = g(k) · x(k) (1)
(29) where k denotes a sample time index. The value of the gain g(k) is computed, e.g., based on a short-term estimate of the root mean square of the input signal x(k). More details about strategies to determine suitable gain values are discussed in [1]. In the following we refer to the time-variant gains g(k) as a gain sequence.
(30) The invention refers to an application scenario where both the audio signal AS and the dynamic range control sequence DS are coded and transmitted. In this case, the dynamic range control gains are not directly applied to the audio signal AS, but encoded and transmitted together with the encoded audio signal ABS. At the decoder 4, both the audio signal AS and the dynamic range control sequence DS are decoded and the dynamic range control information is applied to the corresponding audio signal AS.
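The sample-wise multiplication of equation (1) can be sketched as follows. This is a minimal illustration only: the function name `apply_drc` and the use of plain Python lists for x(k) and g(k) are assumptions made for this example and are not part of the described devices.

```python
def apply_drc(x, g):
    """Apply a time-variant gain sequence g(k) to a signal x(k): y(k) = g(k) * x(k).

    Both arguments are sequences of equal length (one gain per sample).
    The name and the list-based signal representation are hypothetical.
    """
    if len(x) != len(g):
        raise ValueError("signal and gain sequence must have the same length")
    return [gk * xk for gk, xk in zip(g, x)]


# Example: three samples, each scaled by its own gain value.
y = apply_drc([1.0, -2.0, 4.0], [0.5, 0.5, 0.25])
```

In the scenario described here the gains would not be applied at the encoder; the same multiplication would be carried out at the decoder once the gain sequence has been reconstructed from the transmitted nodes.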
(31) In one aspect the invention provides a system comprising an audio encoder device 1 according to the invention and an audio decoder device 4 according to the invention.
(32)
(33) In principle, the dynamic range control encoder/decoder chain can be operated in two modes. The so-called full-frame mode refers to the case where after decoding of a received dynamic range control bitstream, corresponding to a specific dynamic range control frame, the gains at each sample position of the dynamic range control frame can be immediately determined after interpolation based on the decoded nodes. This implies that a node has to be transmitted at each frame border, i.e., at the sample position corresponding to the last sample of the dynamic range control frame. If the dynamic range control frame length is N this means the last transmitted node has to be located at the sample position N within that frame. This is illustrated at the top in
(34) The second mode is referred to as “delay mode” and it is illustrated in the lower part “B” of
(35)
(36)
(37) According to an advantageous embodiment of the invention the shift procedure is initiated in case that a number of the nodes B.sub.0 . . . B.sub.2 of the reference dynamic range control frame DFR is greater than a predefined threshold value.
(38) According to an advantageous embodiment of the invention the shift procedure is initiated in case that a sum of a number of the nodes B.sub.0 . . . B.sub.2 of the reference dynamic range control frame DFR and a number of shifted nodes A.sub.4, A.sub.5 from the dynamic range control frame DFP preceding the reference dynamic range control frame DFR to be embedded in the bitstream portion DFR′ corresponding to the reference dynamic range control frame DFR is greater than a predefined threshold value.
(39) According to an advantageous embodiment of the invention the shift procedure is initiated in case that a sum of a number of the nodes B.sub.0 . . . B.sub.2 of the reference dynamic range control frame DFR and a number of shifted nodes A.sub.4, A.sub.5 from the dynamic range control frame DFP preceding the reference dynamic range control frame DFR to be embedded in the bitstream portion DFR′ corresponding to the reference dynamic range control frame DFR is greater than a number of the nodes C.sub.0 of the dynamic range control frame DFS subsequent to the reference dynamic range control frame DFR.
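One possible reading of the threshold-based initiation described in paragraph (38) is sketched below. The sketch rests on several assumptions that the text does not fix: that the trailing nodes of a frame are the ones selected for shifting, that exactly as many nodes are shifted as exceed the threshold, and that nodes shifted in from the preceding frame are always transmitted in full. The names `distribute_nodes` and `max_nodes` are hypothetical.

```python
def distribute_nodes(frames, max_nodes):
    """Distribute nodes over bitstream portions using a node reservoir.

    frames    : list of node lists, one list per dynamic range control frame
    max_nodes : hypothetical threshold on the number of nodes per portion

    Returns (portions, leftover). Each portion records the nodes shifted in
    from the preceding frame and the frame's own nodes that stay in place.
    A node is shifted by at most one frame.
    """
    portions = []
    carried = []  # nodes shifted out of the preceding frame
    for nodes in frames:
        # Room left for the frame's own nodes after the shifted-in nodes.
        budget = max(max_nodes - len(carried), 0)
        kept, shifted = nodes[:budget], nodes[budget:]
        portions.append({"shifted_in": carried, "own": kept})
        carried = shifted
    return portions, carried


frames = [["A0", "A1", "A2", "A3", "A4", "A5"], ["B0", "B1", "B2"], ["C0"]]
portions, leftover = distribute_nodes(frames, max_nodes=4)
```

With this particular threshold the split differs from the one used in the examples of the description; the point of the sketch is only that the peak of six nodes in the first frame is spread over two bitstream portions. Any leftover nodes after the final frame would still need to be transmitted.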
(40) As explained above, when combined with an audio coding scheme that introduces a frame delay relative to the dynamic range control frames, the decoded dynamic range control gains are delayed by one frame before being applied to the audio signal. Considering the left-hand side in
(41) This fact is exploited in the proposed method to reduce the maximum number of nodes transmitted within one frame. This is illustrated on the right-hand side in
(42)
(43) According to an advantageous embodiment of the invention the time information TA.sub.0 . . . TA.sub.5; TB.sub.0 . . . TB.sub.2; TC.sub.0 of the one or more nodes A.sub.0 . . . A.sub.5; B.sub.0 . . . B.sub.2; C.sub.0 is represented in such way that the one or more shifted nodes A.sub.4, A.sub.5; B.sub.1, B.sub.2 may be identified by using the time information TA.sub.4, TA.sub.5; TB.sub.1, TB.sub.2.
(44) According to an advantageous embodiment of the invention the time information TA.sub.4, TA.sub.5; TB.sub.1, TB.sub.2 of the one or more shifted nodes A.sub.4, A.sub.5; B.sub.1, B.sub.2 is represented by a sum of a time difference t_A.sub.4, t_A.sub.5; t_B.sub.1, t_B.sub.2 from a beginning of the dynamic range control frame DFP; DFR to which the respective node A.sub.4, A.sub.5; B.sub.1, B.sub.2 belongs to the temporal position of the respective node A.sub.4, A.sub.5; B.sub.1, B.sub.2 within the dynamic range control frame DFP; DFR to which the respective node A.sub.4, A.sub.5; B.sub.1, B.sub.2 belongs and an offset value drcFrameSize being greater than or equal to a temporal size of the dynamic range control frame DFR; DFS subsequent to the respective dynamic range control frame DFP; DFR.
(45) First we consider the encoding of the time differences between pairs of nodes. In
(46) The temporal position information of a node is encoded in a differential way, i.e., relative to the position of the previous node. If a node is the first node within a frame, the time difference is determined relative to the beginning of a frame. The left-hand side of
(47) Let us now consider the encoding of the node position for the proposed node reservoir technique using node shifting. For the example shown on the right-hand side of
(48) Next, we consider the computation of the time difference information that is actually encoded for the situation shown on the right-hand side of
(49) The method for decoding the temporal position information can be summarized as follows. The decoder extracts the time difference information of a node based on the corresponding code word from the bitstream. The time information is obtained by adding the time difference information to the time information of the previous node. If the resulting sample position is larger than drcFrameSize the decoder knows that the present node has to be processed as if it were the last node in the previous frame, i.e., it has to be appended to the nodes decoded in the previous frame. The correct sample position is determined by subtracting the offset value drcFrameSize from the decoded time value. The same processing steps are applied in an analogous way if more shifted nodes occur in a decoded frame.
(50) After decoding and correcting the time information of an entire frame, the decoder knows how many nodes have been shifted back to the previous frame (without explicitly providing this information at the encoder) and on which sample position they are located within the previous frame. The information about the number of shifted nodes will be further exploited in the context of decoding gain and slope information described below.
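The decoding steps summarized in paragraphs (49) and (50) can be sketched as follows. This is an illustrative sketch, not the claimed decoder. It assumes that the entropy-decoded time differences of one bitstream portion are already available as integers, that the first difference is relative to the beginning of the frame, and that differences are accumulated on the uncorrected time values. The function names are hypothetical; only `drcFrameSize` (written `drc_frame_size` here) comes from the description.

```python
def encode_shifted_time(position_in_frame, drc_frame_size):
    """Time value of a shifted node: its position within its own frame
    plus the offset drcFrameSize, as described in paragraph (44)."""
    return position_in_frame + drc_frame_size


def decode_frame_times(time_deltas, drc_frame_size):
    """Recover sample positions from the time differences of one portion.

    Returns (current, shifted): positions of nodes belonging to the decoded
    frame, and corrected positions of nodes that have to be appended to the
    previous frame. The number of shifted nodes is len(shifted); it is
    derived from the time values alone.
    """
    current, shifted = [], []
    position = 0  # beginning of the frame
    for delta in time_deltas:
        position += delta
        if position > drc_frame_size:
            # Shifted node: remove the offset to get its true position.
            shifted.append(position - drc_frame_size)
        else:
            current.append(position)
    return current, shifted


# One regular node at sample 100, then two shifted nodes whose true
# positions in the previous frame are 300 and 500 (frame size 1024).
current, shifted = decode_frame_times([100, 1224, 200], 1024)
```

The comparison follows the wording of paragraph (49) ("larger than drcFrameSize"); how a node located exactly at the frame border is treated depends on the indexing convention, which this sketch does not settle.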
(51)
(52) According to an advantageous embodiment of the invention the gain information GB.sub.1 of the bit representation B′.sub.1 of the shifted node B.sub.1, which is at a first position of the bitstream portion DFS′ corresponding to the dynamic range control frame DFS subsequent to the reference dynamic range control frame DFR, is represented by an absolute gain value g_B.sub.1 and wherein the gain information GB.sub.2 of each bit representation B′.sub.2 of the shifted nodes B.sub.2 at a position after the bit representation B′.sub.1 of the node B.sub.1, which is at the first position of the bitstream portion DFS′ corresponding to the dynamic range control frame DFS subsequent to the reference dynamic range control frame DFR, is represented by a relative gain value which is equal to a difference of a gain value g_B.sub.2 of the bit representation B′.sub.2 of the respective shifted node B.sub.2 and the gain value g_B.sub.1 of the bit representation B′.sub.1 of the .[.nodeB.sub.1.]. .Iadd.node B.sub.1.Iaddend., which precedes the bit representation B′.sub.2 of the respective .[.nodeB.sub.2.]. .Iadd.node B.sub.2.Iaddend..
(53) According to an advantageous embodiment of the invention, in case that the bit representations B′.sub.1, B′.sub.2 of one or more shifted nodes B.sub.1, B.sub.2 of the reference dynamic range control frame DFR is embedded in the bitstream portion DFS′ corresponding to the dynamic range control frame DFS subsequent to the reference dynamic range control frame DFR, the gain information GC.sub.0 of the bit representation C′.sub.0 of the node C.sub.0 of the subsequent dynamic range control frame DFS at a first position of the bitstream portion DFS′ corresponding to the dynamic range control frame DFS subsequent to the reference dynamic range control frame DFR after the one or more positions of the bit representations B′.sub.1, B′.sub.2 of the one or more shifted nodes B.sub.1, B.sub.2 is represented by a relative gain value which is equal to a difference of a gain value g_C.sub.0 of the bit representation C′.sub.0 of the respective node C.sub.0 and a gain value g_B.sub.2 of the bit representation B′.sub.2 of the shifted .[.nodeB.sub.2.]. .Iadd.node B.sub.2.Iaddend., which precedes the bit representation C′.sub.0 of the respective node C.sub.0.
(54) In
(55) First, the differential gain value for the node A.sub.4 is considered. For the approach without node reservoir, depicted on the left-hand side of
(56) For the situation shown on the right-hand side, where the node A.sub.4 has been shifted to the next frame n+1, the value of the encoded gain information is different. As can be seen, after being shifted, the node A.sub.4 becomes the first node in frame n+1 with respect to encoding the gain differences. Thus, its gain value is not encoded in a differential way, but the specific coding of initial gain values is applied as described above. The differential gain value of A.sub.5 will remain the same for both situations shown on the left- and the right-hand side. Since node B.sub.0 now follows node A.sub.5 if the node reservoir is used, its gain information will be determined from the difference of the gains of node B.sub.0 and A.sub.5, i.e., gainDelta_B.sub.0=g_B.sub.0−g_A.sub.5. Note that only the way the gain differences are determined changes when applying the node reservoir technique, whereas the reconstructed values of the gains remain the same for each node. After decoding the entire gain-related information of the frames n and n+1, the obtained gain values for the nodes A.sub.0 to B.sub.0 are identical to those obtained in the left-hand side, and the nodes can be computed “in time” for application of the DRC gains to the corresponding audio frame.
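The differential gain coding within one bitstream portion, with the shifted nodes placed first, can be sketched as follows. This is a simplified illustration: gains are treated as integers (for example quantizer step indices), the first value is stored as is instead of using the specific coding of initial gain values, and no entropy coding is applied. The function names are hypothetical.

```python
def encode_gains(gains):
    """Encode the gains of one bitstream portion.

    gains : gain values in transmission order (shifted nodes first).
    The first value is kept as an absolute value; every later value is
    coded as the difference to the gain of the preceding node.
    """
    if not gains:
        return []
    return [gains[0]] + [cur - prev for prev, cur in zip(gains, gains[1:])]


def decode_gains(codes):
    """Invert encode_gains by accumulating the differences."""
    gains = []
    for i, code in enumerate(codes):
        gains.append(code if i == 0 else gains[-1] + code)
    return gains


# Portion for frame n+1 with shifted nodes A4, A5 followed by B0:
# g_A4 = -3, g_A5 = -5, g_B0 = -2 (arbitrary example values).
codes = encode_gains([-3, -5, -2])   # gainDelta_B0 = g_B0 - g_A5 = 3
restored = decode_gains(codes)
```

Only the code words change with the node order; the reconstructed gain of each node is the same as it would be without the node reservoir.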
(57) As discussed in the previous paragraph, the number of shifted nodes and their sample position within the previous frame are known after decoding the time difference information. As illustrated on the right-hand side of
(58)
(59) According to an advantageous embodiment of the invention each node of the one or more nodes A.sub.0 . . . A.sub.5; B.sub.0 . . . B.sub.2; C.sub.0 comprises slope information SA.sub.0 . . . SA.sub.5; SB.sub.0 . . . SB.sub.2; SC.sub.0.
(60) Next, the coding of slope information is considered, which is illustrated in
(61) After all node information received for frame n+1 has been decoded and, if applicable, shifted back to the preceding frame n, the gain interpolation for frame n using splines or linear interpolation can be performed in the common way and the gain values are applied to the corresponding audio frame.
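The linear variant of the gain interpolation mentioned above can be sketched as follows. The sketch assumes that each reconstructed node is a pair of sample position and gain value, that positions are counted from 0 within the frame, and that the gain is held constant before the first and after the last node; none of these conventions is fixed by the text, and slope information (used for spline interpolation) is ignored. The function name is hypothetical.

```python
def interpolate_gains(nodes, frame_size):
    """Linearly interpolate per-sample gains from (position, gain) nodes.

    nodes      : list of (sample_position, gain) pairs of one frame
    frame_size : number of samples in the frame
    """
    if not nodes:
        raise ValueError("at least one node is required")
    nodes = sorted(nodes)
    gains = []
    for k in range(frame_size):
        if k <= nodes[0][0]:
            gains.append(nodes[0][1])       # hold before the first node
        elif k >= nodes[-1][0]:
            gains.append(nodes[-1][1])      # hold after the last node
        else:
            for (t0, g0), (t1, g1) in zip(nodes, nodes[1:]):
                if t0 <= k <= t1:
                    gains.append(g0 + (g1 - g0) * (k - t0) / (t1 - t0))
                    break
    return gains


# Two nodes: gain 1.0 at sample 0 and gain 0.0 at sample 4.
g = interpolate_gains([(0, 1.0), (4, 0.0)], frame_size=5)
```

The resulting per-sample gains would then be multiplied with the decoded audio samples of the corresponding audio frame.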
(62)
(63) an audio decoder 5 configured for decoding an encoded audio bitstream ABS in order to reproduce an audio signal AS comprising consecutive audio frames AFP, AFR, AFS;
(64) a dynamic range control decoder 6 configured for decoding an encoded dynamic range control bitstream DBS in order to reproduce a dynamic range control sequence DS corresponding to the audio signal AS and comprising consecutive dynamic range control frames DFP, DFR, DFS;
(65) wherein the encoded dynamic range control bitstream DBS comprises for each dynamic range control frame DFP, DFR, DFS of the dynamic range control frames a corresponding bitstream portion DFP′, DFR′, DFS′;
(66) wherein the encoded dynamic range control bitstream DBS comprises bit representations A′.sub.0 . . . A′.sub.5; B′.sub.0 . . . B′.sub.2; C′.sub.0 of nodes A.sub.0 . . . A.sub.5; B.sub.0 . . . B.sub.2; C.sub.0; wherein each bit representation of one node of the nodes comprises gain information GA.sub.0 . . . GA.sub.5; GB.sub.0 . . . GB.sub.2; GC.sub.0 for the audio signal AS and time information TA.sub.0 . . . TA.sub.5; TB.sub.0 . . . TB.sub.2; TC.sub.0 indicating to which point in time the gain information GA.sub.0 . . . GA.sub.5; GB.sub.0 . . . GB.sub.2; GC.sub.0 corresponds;
(67) wherein the encoded dynamic range control bit stream DBS comprises bit representations B′.sub.1, B′.sub.2 of shifted nodes B.sub.1, B.sub.2 selected from the nodes B.sub.0 . . . B.sub.2 of one reference dynamic range control frame DFR of the dynamic range control frames DFP, DFR, DFS, which are embedded in a bitstream portion corresponding to the dynamic range control frame DFS subsequent to the one reference dynamic range control frame DFR, wherein the bit representation B′.sub.0 of each remaining node B.sub.0 of the nodes B.sub.0 . . . B.sub.2 of the one reference dynamic range control frame DFR of the dynamic range control frames DFP, DFR, DFS is embedded into the bitstream portion DFR′ corresponding to the one reference dynamic range control frame DFR; and
(68) wherein the dynamic range control decoder 6 is configured for decoding the bit representation B′.sub.0 of each remaining node B.sub.0 of the remaining nodes B.sub.0 of the one reference dynamic range control frame DFR of the dynamic range control frames DFP, DFR, DFS in order to reproduce each remaining node B.sub.0 of the one reference dynamic range control frame DFR of the dynamic range control frames DFP, DFR, DFS, for decoding the bit representation B′.sub.1, B′.sub.2 of each shifted node B.sub.1, B.sub.2 of the shifted nodes B.sub.1, B.sub.2 selected from the nodes B.sub.0 . . . B.sub.2 of the one reference dynamic range control frame DFR of the dynamic range control frames DFP, DFR, DFS in order to reproduce each shifted node B.sub.1, B.sub.2 of the shifted nodes B.sub.1, B.sub.2 selected from the nodes of the one reference dynamic range control frame DFR of the dynamic range control frames DFP, DFR, DFS and for combining the reproduced remaining nodes B.sub.0 and the reproduced shifted nodes B.sub.1, B.sub.2 in order to reconstruct the reference dynamic range control frame DFR.
(69) According to an advantageous embodiment of the invention the dynamic range control decoder 6 is configured for identifying the one or more shifted nodes A.sub.4, A.sub.5; B.sub.1, B.sub.2 by using the time information TA.sub.4, TA.sub.5; TB.sub.1, TB.sub.2.
(70) According to an advantageous embodiment of the invention the dynamic range control decoder 6 is configured for decoding the time information TA.sub.4, TA.sub.5; TB.sub.1, TB.sub.2 of the one or more shifted nodes A.sub.4, A.sub.5; B.sub.1, B.sub.2, which is represented by a sum of a time difference t_A.sub.4, t_A.sub.5; t_B.sub.1, t_B.sub.2 from a beginning of the dynamic range control frame DFP; DFR to which the respective node A.sub.4, A.sub.5; B.sub.1, B.sub.2 belongs to the temporal position of the respective node A.sub.4, A.sub.5; B.sub.1, B.sub.2 within the dynamic range control frame DFP; DFR to which the respective node A.sub.4, A.sub.5; B.sub.1, B.sub.2 belongs and an offset value drcFrameSize being greater than or equal to a temporal size of the dynamic range control frame DFR; DFS subsequent to the respective dynamic range control frame DFP; DFR.
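The time coding of paragraphs (69) and (70) can be illustrated with a minimal sketch. It assumes integer time units, an offset exactly equal to an illustrative frame size `DRC_FRAME_SIZE`, and a decoder rule that treats any coded time at or above the offset as belonging to a shifted node. The function names are hypothetical and are not taken from the described embodiments.

```python
from typing import Tuple

DRC_FRAME_SIZE = 1024  # assumed frame size in time units, used as the offset value


def encode_time(t_in_frame: int, shifted: bool) -> int:
    """Return the coded time value of a node.

    A shifted node is coded as its position inside its own frame plus the
    offset, so the value lies outside the range of ordinary positions.
    """
    if not 0 <= t_in_frame < DRC_FRAME_SIZE:
        raise ValueError("node position must lie inside its frame")
    return t_in_frame + DRC_FRAME_SIZE if shifted else t_in_frame


def decode_time(coded: int) -> Tuple[int, bool]:
    """Return (position inside the node's own frame, is_shifted)."""
    if coded >= DRC_FRAME_SIZE:
        # Assumed rule: a value at or above the offset marks a shifted node.
        return coded - DRC_FRAME_SIZE, True
    return coded, False
```

Under these assumptions no extra flag is needed to mark a shifted node, since the coded time value alone tells the decoder whether the node belongs to the preceding frame.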
(71) According to an advantageous embodiment of the invention the dynamic range control decoder 6 is configured for decoding the gain information GB.sub.1 of the bit representation B′.sub.1 of the shifted node B.sub.1, which is at a first position of the bitstream portion DFS′ corresponding to the dynamic range control frame DFS subsequent to the reference dynamic range control frame DFR, wherein this gain information GB.sub.1 is represented by an absolute gain value g_B.sub.1, and wherein the gain information GB.sub.2 of each bit representation B′.sub.2 of the shifted nodes B.sub.2 at a position after the bit representation B′.sub.1 of the node B.sub.1, which is at the first position of the bitstream portion DFS′ corresponding to the dynamic range control frame DFS subsequent to the reference dynamic range control frame DFR, is represented by a relative gain value which is equal to a difference of a gain value g_B.sub.2 of the bit representation B′.sub.2 of the respective shifted node B.sub.2 and the gain value g_B.sub.1 of the bit representation B′.sub.1 of the node B.sub.1, which precedes the bit representation B′.sub.2 of the respective node B.sub.2.
(72) According to an advantageous embodiment of the invention the dynamic range control decoder 6 is configured for decoding the gain information GC.sub.0 of the bit representation C′.sub.0 of the node C.sub.0 of the subsequent dynamic range control frame DFS at a first position of the bitstream portion DFS′ corresponding to the dynamic range control frame DFS subsequent to the reference dynamic range control frame DFR after the one or more positions of the bit representations B′.sub.1, B′.sub.2 of the one or more shifted nodes B.sub.1, B.sub.2, wherein this gain information GC.sub.0 is represented by a relative gain value which is equal to a difference of a gain value g_C.sub.0 of the bit representation C′.sub.0 of the respective node C.sub.0 and the gain value g_B.sub.2 of the bit representation B′.sub.2 of the shifted node B.sub.2, which precedes the bit representation C′.sub.0 of the respective node C.sub.0.
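The gain coding of paragraphs (71) and (72) amounts to differential coding within one bitstream portion: the first bit representation carries an absolute gain value, and every later one carries the difference to the gain value of the bit representation preceding it. The following sketch illustrates this under simplifying assumptions: gains are plain floating-point values in dB, and quantization and entropy coding are omitted. The function names and numeric values are illustrative only.

```python
from typing import List, Optional


def encode_gains(gains: List[float]) -> List[float]:
    """Code the first gain absolutely and each later gain as the difference
    to the gain of the node preceding it in the bitstream portion."""
    coded: List[float] = []
    previous: Optional[float] = None
    for g in gains:
        coded.append(g if previous is None else g - previous)
        previous = g
    return coded


def decode_gains(coded: List[float]) -> List[float]:
    """Invert encode_gains by accumulating the differences."""
    gains: List[float] = []
    for i, value in enumerate(coded):
        gains.append(value if i == 0 else gains[-1] + value)
    return gains


# Bitstream portion DFS' in coding order: shifted nodes B1, B2, then node C0.
portion = [-3.0, -4.5, -2.0]   # g_B1, g_B2, g_C0 (illustrative values in dB)
coded = encode_gains(portion)  # absolute g_B1, then g_B2 - g_B1, then g_C0 - g_B2
```

In this sketch the node C.sub.0 is coded relative to the shifted node B.sub.2 simply because B′.sub.2 precedes C′.sub.0 in the same bitstream portion, so the shifted and non-shifted nodes need no separate treatment.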
(73) According to an advantageous embodiment of the invention a temporal size of the audio frames AFP, AFR, AFS is equal to a temporal size of the dynamic range control frames DFP, DFR, DFS.
(74) According to an advantageous embodiment of the invention the one or more nodes A.sub.0 . . . A.sub.5; B.sub.0 . . . B.sub.2; C.sub.0 of one of the dynamic range control frames DFP, DFR, DFS are selected from a uniform time grid.
(75) According to an advantageous embodiment of the invention each node A.sub.0 . . . A.sub.5; B.sub.0 . . . B.sub.2; C.sub.0 of the one or more nodes A.sub.0 . . . A.sub.5; B.sub.0 . . . B.sub.2; C.sub.0 comprises slope information SA.sub.0 . . . SA.sub.5; SB.sub.0 . . . SB.sub.2; SC.sub.0.
(76) According to an advantageous embodiment of the invention the dynamic range control decoder 6 is configured for decoding the bit representations A′.sub.0 . . . A′.sub.5; B′.sub.0 . . . B′.sub.2; C′.sub.0 of the nodes using an entropy decoding technique.
(77) In another aspect the invention provides a method for operating an audio decoder, the method comprising the steps of:
(78) decoding an encoded audio bitstream ABS in order to reproduce an audio signal AS comprising consecutive audio frames AFP, AFR, AFS;
(79) decoding an encoded dynamic range control bitstream DBS in order to reproduce a dynamic range control sequence DS corresponding to the audio signal AS and comprising consecutive dynamic range control frames DFP, DFR, DFS;
(80) wherein the encoded dynamic range control bitstream DBS comprises for each dynamic range control frame DFP, DFR, DFS of the dynamic range control frames a corresponding bitstream portion DFP′, DFR′, DFS′;
(81) wherein the encoded dynamic range control bitstream DBS comprises bit representations A′.sub.0 . . . A′.sub.5; B′.sub.0 . . . B′.sub.2; C′.sub.0 of nodes A.sub.0 . . . A.sub.5; B.sub.0 . . . B.sub.2; C.sub.0, wherein each bit representation of one node of the nodes comprises gain information GA.sub.0 . . . GA.sub.5; GB.sub.0 . . . GB.sub.2; GC.sub.0 for the audio signal AS and time information TA.sub.0 . . . TA.sub.5; TB.sub.0 . . . TB.sub.2; TC.sub.0 indicating to which point in time the gain information GA.sub.0 . . . GA.sub.5; GB.sub.0 . . . GB.sub.2; GC.sub.0 corresponds;
(82) wherein the encoded dynamic range control bit stream DBS comprises bit representations B′.sub.1, B′.sub.2 of shifted nodes B.sub.1, B.sub.2 selected from the nodes B.sub.0 . . . B.sub.2 of one reference dynamic range control frame DFR of the dynamic range control frames DFP, DFR, DFS, which are embedded in a bitstream portion corresponding to the dynamic range control frame DFS subsequent to the one reference dynamic range control frame DFR, wherein the bit representation B′.sub.0 of each remaining node B.sub.0 of the nodes B.sub.0 . . . B.sub.2 of the one reference dynamic range control frame DFR of the dynamic range control frames DFP, DFR, DFS is embedded into the bitstream portion DFR′ corresponding to the one reference dynamic range control frame DFR; and
(83) wherein the bit representation B′.sub.0 of each remaining node B.sub.0 of the remaining nodes B.sub.0 of the one reference dynamic range control frame DFR of the dynamic range control frames DFP, DFR, DFS is decoded in order to reproduce each remaining node B.sub.0 of the one reference dynamic range control frame DFR of the dynamic range control frames DFP, DFR, DFS;
(84) wherein the bit representation B′.sub.1, B′.sub.2 of each shifted node B.sub.1, B.sub.2 of the shifted nodes B.sub.1, B.sub.2 selected from the nodes B.sub.0 . . . B.sub.2 of the one reference dynamic range control frame DFR of the dynamic range control frames DFP, DFR, DFS is decoded in order to reproduce each shifted node B.sub.1, B.sub.2 of the shifted nodes B.sub.1, B.sub.2 selected from the nodes of the one reference dynamic range control frame DFR of the dynamic range control frames DFP, DFR, DFS; and
(85) wherein the reproduced remaining nodes B.sub.0 and the reproduced shifted nodes B.sub.1, B.sub.2 are combined in order to reconstruct the reference dynamic range control frame DFR.
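The reconstruction steps of the method above can be sketched as follows. This is a minimal illustration, not the claimed implementation: it assumes that each bitstream portion has already been decoded into (coded time, gain) pairs, that a coded time at or above an assumed offset `DRC_FRAME_SIZE` marks a node shifted from the preceding frame, and that gains are already absolute. All names and values are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple

DRC_FRAME_SIZE = 1024  # assumed frame size, also used as the time offset


@dataclass(frozen=True)
class Node:
    time: int    # position inside the frame the node belongs to
    gain: float  # absolute gain value in dB


def split_portion(portion: List[Tuple[int, float]]) -> Tuple[List[Node], List[Node]]:
    """Split the decoded pairs of one bitstream portion into
    (nodes shifted from the preceding frame, nodes of this frame)."""
    shifted: List[Node] = []
    own: List[Node] = []
    for coded_time, gain in portion:
        if coded_time >= DRC_FRAME_SIZE:
            shifted.append(Node(coded_time - DRC_FRAME_SIZE, gain))
        else:
            own.append(Node(coded_time, gain))
    return shifted, own


def reconstruct_reference_frame(portion_ref: List[Tuple[int, float]],
                                portion_next: List[Tuple[int, float]]) -> List[Node]:
    """Combine the remaining nodes carried in the reference frame's own
    portion with the shifted nodes carried in the subsequent portion."""
    _, remaining = split_portion(portion_ref)
    shifted, _ = split_portion(portion_next)
    return sorted(remaining + shifted, key=lambda n: n.time)


# DFR' carries B0; DFS' carries the shifted nodes B1, B2 followed by C0.
dfr_portion = [(100, -3.0)]
dfs_portion = [(1024 + 600, -4.5), (1024 + 900, -6.0), (200, -2.0)]
frame_dfr = reconstruct_reference_frame(dfr_portion, dfs_portion)
```

A consequence visible in the sketch is that the reference frame DFR can only be completed once the bitstream portion DFS′ of the subsequent frame has been received.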
(86) With respect to the decoder, the encoder and the methods of the described embodiments the following shall be mentioned:
(87) Although some aspects have been described in the context of an apparatus, it is clear that these aspects also represent a description of the corresponding method, where a block or device corresponds to a method step or a feature of a method step. Analogously, aspects described in the context of a method step also represent a description of a corresponding block or item or feature of a corresponding apparatus.
(88) Depending on certain implementation requirements, embodiments of the invention can be implemented in hardware or in software. The implementation can be performed using a digital storage medium, for example a floppy disk, a DVD, a CD, a ROM, a PROM, an EPROM, an EEPROM or a FLASH memory, having electronically readable control signals stored thereon, which cooperate (or are capable of cooperating) with a programmable computer system such that the respective method is performed.
(89) Some embodiments according to the invention comprise a data carrier having electronically readable control signals, which are capable of cooperating with a programmable computer system such that one of the methods described herein is performed.
(90) Generally, embodiments of the present invention can be implemented as a computer program product with a program code, the program code being operative for performing one of the methods when the computer program product runs on a computer. The program code may for example be stored on a machine readable carrier.
(91) Other embodiments comprise the computer program for performing one of the methods described herein, which is stored on a machine readable carrier or a non-transitory storage medium.
(92) In other words, an embodiment of the inventive method is, therefore, a computer program having a program code for performing one of the methods described herein, when the computer program runs on a computer.
(93) A further embodiment of the inventive methods is, therefore, a data carrier (or a digital storage medium, or a computer-readable medium) comprising, recorded thereon, the computer program for performing one of the methods described herein.
(94) A further embodiment of the inventive method is, therefore, a data stream or a sequence of signals representing the computer program for performing one of the methods described herein. The data stream or the sequence of signals may be configured, for example, to be transferred via a data communication connection, for example via the Internet.
(95) A further embodiment comprises a processing means, for example a computer, or a programmable logic device, configured or adapted to perform one of the methods described herein.
(96) A further embodiment comprises a computer having installed thereon the computer program for performing one of the methods described herein.
(97) In some embodiments, a programmable logic device (for example a field programmable gate array) may be used to perform some or all of the functionalities of the methods described herein. In some embodiments, a field programmable gate array may cooperate with a microprocessor in order to perform one of the methods described herein. Generally, the methods are advantageously performed by any hardware apparatus.
(98) While this invention has been described in terms of several embodiments, there are alterations, permutations, and equivalents which fall within the scope of this invention. It should also be noted that there are many alternative ways of implementing the methods and compositions of the present invention. It is therefore intended that the following appended claims be interpreted as including all such alterations, permutations and equivalents as fall within the true spirit and scope of the present invention.
REFERENCES
(99) [1] D. Giannoulis, M. Massberg, J. D. Reiss, “Digital Dynamic Range Compressor Design—A Tutorial and Analysis,” J. Audio Engineering Society, Vol. 60, No. 6, June 2012.