METHOD AND APPARATUS FOR CODING OR DECODING SUBBAND CONFIGURATION DATA FOR SUBBAND GROUPS
20170243592 · 2017-08-24
Abstract
For an efficient encoding of subband configuration data the first, penultimate and last subband groups are treated differently than the other subband groups. Further, subband group bandwidth difference values are used in the encoding. The number of subband groups N.sub.SB is coded using a fixed number of bits representing N.sub.SB−1. The bandwidth value B.sub.SB[1] of the first subband group is coded using a unary code representing B.sub.SB[1]−1. No bandwidth value B.sub.SB[g] is coded for the last subband g=N.sub.SB. For subband groups g=2, . . . , N.sub.SB−2 bandwidth difference values ΔB.sub.SB [g]=B.sub.SB [g]−B.sub.SB[g−1] are coded using a unary code, and the bandwidth difference value ΔB.sub.SB[N.sub.SB−1] for subband group g=N.sub.SB−1 is coded using a fixed number of bits.
Claims
1. Method for coding subband configuration data (N.sub.SB, G.sub.1 . . . G.sub.N.sub.
2. Method according to claim 1, wherein a subband configuration data block (s.sub.SBconfig) includes a configuration value (configIdx) that determines whether: a first predefined combination of number of subband groups and related subband group widths represents said subband configuration data, or a different second predefined combination of number of subband groups and related subband group widths represents said subband configuration data, or optionally further predefined combinations of number of subband groups and related subband group widths represents said subband configuration data, or subband configuration data are coded according to the method of claim 1, wherein in case N.sub.SB=0 no subband configuration data is generated.
3. Apparatus for coding subband configuration data (N.sub.SB, G.sub.1 . . . G.sub.N.sub.
4. Apparatus according to claim 3, wherein a subband configuration data block (s.sub.SBconfig) includes a configuration value (configIdx) that determines whether: a first predefined combination of number of subband groups and related subband group widths represents said subband configuration data, or a different second predefined combination of number of subband groups and related subband group widths represents said subband configuration data, or optionally further predefined combinations of number of subband groups and related subband group widths represents said subband configuration data, or subband configuration data are coded based on a number of subband groups N.sub.SB with a fixed number of bits (N.sub.b,SB) representing N.sub.SB−1, where if N.sub.SB>1, coding for a first subband group g=1 a bandwidth value B.sub.SB[1] with a unary code representing B.sub.SB[1]−1; if N.sub.SB=3, in addition to coding said bandwidth value B.sub.SB[1] for said first subband group g=1, coding for subband group g=2 a bandwidth difference value ΔB.sub.SB[2]=B.sub.SB[2]−B.sub.SB[1] with a fixed number of bits (N.sub.b,lastDiff); if N.sub.SB>3, in addition to coding said bandwidth value B.sub.SB[1] for said first subband group g=1, coding for subband groups g=2, . . . , N.sub.SB−2 a corresponding number of bandwidth difference values ΔB.sub.SB[g]=B.sub.SB[g]−B.sub.SB[g−1] with a unary code, and coding for subband group g=N.sub.SB−1 a bandwidth difference value ΔB.sub.SB[N.sub.SB−1]=B.sub.SB[N.sub.SB−1]−B.sub.SB[N.sub.SB−2] with a fixed number of bits (N.sub.b,lastDiff), wherein a bandwidth value for a subband group is expressed as number of adjacent original subbands, wherein for subband g=N.sub.SB no corresponding value is included in the coded subband configuration data, and wherein in case N.sub.SB=0 no subband configuration data is generated.
5. Method for decoding coded subband configuration data (s.sub.SBconfig) for subband groups (g) valid for one or more frames of a coded audio signal, which subband configuration data are data which were coded according to claim 1 and which were arranged as a sequence of said coded number of subband groups and said coded bandwidth value for said first subband group and possibly one or more coded bandwidth difference values, wherein each subband group is equal to one original subband or is a combination of two or more adjacent original subbands, the bandwidth of a following subband group is greater than or equal to the bandwidth of a current subband group, and the number of original subbands N.sub.FB is predefined, characterised by: determining the number of subband groups N.sub.SB by adding ‘1’ to a decoded version of the coded number of subband groups; determining for the first subband group g=1 a bandwidth value B.sub.SB[1] by adding ‘1’ to a decoded version of the corresponding coded bandwidth value; if N.sub.SB=3, in addition to determining said bandwidth value B.sub.SB[1] for said first subband group g=1, decoding (73) for subband group g=2 from the coded version of bandwidth difference value ΔB.sub.SB[2] a bandwidth value B.sub.SB[2]=ΔB.sub.SB[2]+B.sub.SB[1]; if N.sub.SB>3, in addition to determining said bandwidth value B.sub.SB[1] for said first subband group g=1, decoding for subband groups g=2, . . . , N.sub.SB−2 from the coded version of bandwidth difference values ΔB.sub.SB[g] bandwidth values B.sub.SB[g]=ΔB.sub.SB[g]+B.sub.SB[g−1], and decoding for subband group g=N.sub.SB−1 from the coded version of bandwidth difference value ΔB.sub.SB[N.sub.SB−1] a bandwidth value B.sub.SB[N.sub.SB−1]=ΔB.sub.SB[N.sub.SB−1]+B.sub.SB[N.sub.SB−2], determining the bandwidth value B.sub.SB[N.sub.SB] for subband g=N.sub.SB by subtracting the bandwidths B.sub.SB[1] to B.sub.SB[N.sub.SB−1] from N.sub.FB, wherein a bandwidth value for a subband group is expressed as number of adjacent original subbands.
6. Method according to claim 5, wherein a subband configuration data block (s.sub.SBconfig) includes a configuration value (configIdx) that determines whether: a first predefined combination of number of subband groups and related subband group widths represents said subband configuration data, or a different second predefined combination of number of subband groups and related subband group widths represents said subband configuration data, or optionally further predefined combinations of number of subband groups and related subband group widths represents said subband configuration data, or subband configuration data were coded based on a number of subband groups N.sub.SB with a fixed number of bits (N.sub.b,SB) representing N.sub.SB−1, where if N.sub.SB>1, coding for a first subband group g=1 a bandwidth value B.sub.SB[1] with a unary code representing B.sub.SB[1]−1; if N.sub.SB=3, in addition to coding said bandwidth value B.sub.SB[1] for said first subband group g=1, coding for subband group g=2 a bandwidth difference value ΔB.sub.SB[2]=B.sub.SB[2]−B.sub.SB[1] with a fixed number of bits (N.sub.b,lastDiff); if N.sub.SB>3, in addition to coding said bandwidth value B.sub.SB[1] for said first subband group g=1, coding for subband groups g=2, . . . , N.sub.SB−2 a corresponding number of bandwidth difference values ΔB.sub.SB[g]=B.sub.SB[g]−B.sub.SB[g−1] with a unary code, and coding for subband group g=N.sub.SB−1 a bandwidth difference value ΔB.sub.SB[N.sub.SB−1]=B.sub.SB[N.sub.SB−1]−B.sub.SB[N.sub.SB−2] with a fixed number of bits (N.sub.b,lastDiff), wherein a bandwidth value for a subband group is expressed as number of adjacent original subbands, wherein for subband g=N.sub.SB no corresponding value is included in the coded subband configuration data, and wherein only in case N.sub.SB≠0 the method according to claim 5 is carried out.
7. Apparatus for decoding coded subband configuration data (s.sub.SBconfig) for subband groups (g) valid for one or more frames of a coded audio signal, which subband configuration data are data which were coded according to claim 1 and which were arranged as a sequence of said coded number of subband groups and said coded bandwidth value for said first subband group and possibly one or more coded bandwidth difference values, wherein each subband group is equal to one original subband or is a combination of two or more adjacent original subbands, the bandwidth of a following subband group is greater than or equal to the bandwidth of a current subband group, and the number of original subbands N.sub.FB is predefined, said apparatus including means adapted to: determining the number of subband groups N.sub.SB by adding ‘1’ to a decoded version of the coded number of subband groups; determining for the first subband group g=1 a bandwidth value B.sub.SB[1] by adding ‘1’ to a decoded version of the corresponding coded bandwidth value; if N.sub.SB=3, in addition to determining said bandwidth value B.sub.SB[1] for said first subband group g=1, decoding for subband group g=2 from the coded version of bandwidth difference value ΔB.sub.SB[2] a bandwidth value B.sub.SB[2]=ΔB.sub.SB[2]+B.sub.SB[1]; if N.sub.SB>3, in addition to determining said bandwidth value B.sub.SB[1] for said first subband group g=1, decoding for subband groups g=2, . . . , N.sub.SB−2 from the coded version of bandwidth difference values ΔB.sub.SB[g] bandwidth values B.sub.SB[g]=ΔB.sub.SB[g]+B.sub.SB[g−1], and decoding for subband group g=N.sub.SB−1 from the coded version of bandwidth difference value ΔB.sub.SB[N.sub.SB−1] a bandwidth value B.sub.SB[N.sub.SB−1]=ΔB.sub.SB[N.sub.SB−1]+B.sub.SB[N.sub.SB−2], determining the bandwidth value B.sub.SB[N.sub.SB] for subband g=N.sub.SB by subtracting the bandwidths B.sub.SB[1] to B.sub.SB[N.sub.SB−1] from N.sub.FB, wherein a bandwidth value for a subband group is expressed as number of adjacent original subbands.
8. Apparatus according to claim 7, wherein a subband configuration data block (s.sub.SBconfig) includes a configuration value (configIdx) that determines whether: a first predefined combination of number of subband groups and related subband group widths represents said subband configuration data, or a different second predefined combination of number of subband groups and related subband group widths represents said subband configuration data, or optionally further predefined combinations of number of subband groups and related subband group widths represents said subband configuration data, or subband configuration data were coded according to the method of claim 1, wherein only in case N.sub.SB≠0 the apparatus operates according to claim 7.
9. Digital compressed audio signal that contains subband configuration data encoded according to the method of claim 1.
10. Digital compressed audio signal that contains multiple sets of different subband configuration data encoded according to the method of claim 1.
11. Storage medium that contains or stores, or has recorded on it, a digital compressed audio signal according to claim 9.
12. Computer program product comprising instructions which, when carried out on a computer, perform the method according to claim 1.
Description
BRIEF DESCRIPTION OF DRAWINGS
[0032] Exemplary embodiments of the invention are described with reference to the accompanying drawings, which show in:
[0033]
[0034]
[0035]
[0036]
[0037]
[0038]
[0039]
DESCRIPTION OF EMBODIMENTS
[0040] Even if not explicitly described, the following embodiments may be employed in any combination or sub-combination.
[0041]
[0042] The invention deals with the efficient coding of subband configurations, which includes the number of subband groups and the mapping of original subbands to subband groups. In case an audio encoder can operate with different subband configurations (i.e. different number of subbands and different bandwidths of these subbands), these subband configurations are transferred or transmitted to the audio decoder side.
[0043] In a different embodiment the subband configuration is changing over time (for example dependent on an analysis of the audio input signal).
[0044] It has to be ensured in both cases that both encoder and decoder use the same subband configuration. For streaming formats this kind of information is sent at the beginning of each streaming block where a decoding can be started.
[0045] It is assumed that the configuration and operation mode (e.g. QMF) of the original analysis filter bank 11 in the encoder is fixed and is known to the decoder. The number of subbands of the analysis filter bank 11 is denoted by N.sub.FB and need not be transferred to the decoder side. The number of combined subbands or subband groups used for the audio processing is denoted by N.sub.SB. The index used for these combined subbands or subband groups is g=1, . . . , N.sub.SB.
[0046] The gth subband group is defined by a data set G.sub.g that contains the subband indices of the analysis filter bank 11. For example (cf.
G.sub.1={1}, G.sub.2={2,3,4}, G.sub.3={5,6,7,8} (1)
[0047] It is assumed that all subband groups cover all subbands of the original filter bank 11 in the frequency range from 0 Hz up to the Nyquist frequency. Therefore the subband groups are fully described by their bandwidths expressed in number of original filter bank subbands per subband group. These numbers for bandwidths are denoted by B.sub.SB[g], and the sum of all these bandwidths is equal to the number of bands of the original filter bank 11:
$$\sum_{g=1}^{N_{SB}} B_{SB}[g] = N_{FB} \qquad (2)$$
[0048] The values that need to be transferred to the decoder side are:
[0049] the number of subband groups N.sub.SB;
[0050] the bandwidths of subband groups B.sub.SB[g] for g=1, . . . , N.sub.SB−1, whereby the bandwidth of the last subband group need not be transferred because the subband groups are assumed to cover the complete frequency range.
[0051] The combination of these values is called subband configuration data.
[0052] Using equation (2), the bandwidth of the last subband group can be computed from the other bandwidths by
$$B_{SB}[N_{SB}] = N_{FB} - \sum_{g=1}^{N_{SB}-1} B_{SB}[g] \qquad (3)$$
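As a minimal sketch of equation (3), the following Python function recovers the bandwidth of the last subband group from the transferred bandwidths. The function name and the error check are illustrative assumptions and are not part of the described processing.

```python
def last_group_bandwidth(bandwidths_without_last, n_fb=64):
    """Return B_SB[N_SB] given B_SB[1..N_SB-1], following equation (3)."""
    remaining = n_fb - sum(bandwidths_without_last)
    if remaining < 1:
        # Assumption: every subband group contains at least one original subband.
        raise ValueError("transferred bandwidths leave no room for the last group")
    return remaining


# Three transferred bandwidths out of N_FB = 64 original subbands.
print(last_group_bandwidth([1, 1, 5]))
```

With N.sub.FB=64 and transferred bandwidths 1, 1 and 5, the remaining 57 original subbands form the last group.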
[0053] One way of coding the subband configuration could be as follows:
[0054] The number of used subband groups N.sub.SB is coded with a fixed number of bits N.sub.b,SB. For determining this number of bits, a maximum number of subbands is defined. As an example N.sub.b,SB=5 bits could be used for coding N.sub.SB∈[0,31].
[0055] The bandwidths B.sub.SB[g] for groups g=1, . . . , N.sub.SB−1 are coded with N.sub.b,BW bits each. The maximum bandwidth of each subband group is N.sub.FB and the coding of the bandwidth would require N.sub.b,BW=⌈log.sub.2(N.sub.FB)⌉ bits for each subband group.
[0056] As an example with N.sub.FB=64, N.sub.SB=4 and N.sub.b,SB=5 this approach would require N.sub.b,SB+(N.sub.SB−1)·N.sub.b,BW=5+3·6=23 bits for transferring the subband configuration data.
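The bit count of this straightforward approach can be reproduced with a few lines of Python. This is a sketch only; the function name is an assumption introduced for illustration.

```python
import math


def naive_config_bits(n_fb, n_sb, n_b_sb=5):
    """Bits for fixed-length coding: N_b,SB + (N_SB - 1) * ceil(log2(N_FB))."""
    n_b_bw = math.ceil(math.log2(n_fb))  # bits per transferred bandwidth value
    return n_b_sb + (n_sb - 1) * n_b_bw


print(naive_config_bits(n_fb=64, n_sb=4))  # 5 + 3 * 6 = 23
```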
[0057] Advantageously, the required number of bits for transferring a subband configuration can be reduced by using the following improved processing. It uses a value configIdx coded with 2 bits that describes three typical subband configurations for configIdx∈{0,1,2}. For configIdx=3 an adapted coding of the subband configuration data is used. For the three pre-defined subband configurations the following values are selected:
[0058] number of subband groups;
[0059] for each subband group the bandwidths of this subband group.
[0060] Table 1 shows an example of filter bank subband configurations for N.sub.FB=64 encoded with a 2-bit value. Instead of N.sub.FB=64, N.sub.FB=32 or N.sub.FB=128 can be used. The configurations with configIdx∈{0,1,2} are defined in the same way in both encoder and decoder. A zero value for N.sub.SB can also be used for indicating that the configuration data processing described below is not used at all. This way the corresponding coding tool can be disabled.
TABLE 1
configIdx   numOfSubbandsTable[configIdx]          subbandWidthTable[configIdx]
            (number of subband groups N.sub.SB)    (subband group widths B.sub.SB)
0           0                                      [ ]
1           4                                      [1 1 5 57]
2           8                                      [1 1 1 2 2 5 10 42]
3           defined by other coding scheme
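The pre-defined configurations of Table 1 could be held in two lookup tables shared by encoder and decoder. The sketch below assumes N.sub.FB=64; the Python constant names mirror the table headers, and the helper function is hypothetical.

```python
# Table 1 for N_FB = 64; index 3 is absent because it selects the adapted coding.
NUM_OF_SUBBANDS_TABLE = [0, 4, 8]
SUBBAND_WIDTH_TABLE = [
    [],
    [1, 1, 5, 57],
    [1, 1, 1, 2, 2, 5, 10, 42],
]


def predefined_config(config_idx):
    """Return (N_SB, B_SB) for configIdx in {0, 1, 2}."""
    if config_idx not in (0, 1, 2):
        raise ValueError("configIdx 3 is defined by the adapted coding scheme")
    return NUM_OF_SUBBANDS_TABLE[config_idx], list(SUBBAND_WIDTH_TABLE[config_idx])


n_sb, widths = predefined_config(2)
print(n_sb, widths, sum(widths))
```

For the non-empty entries the widths sum to 64, which is the consistency condition of equation (2).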
Bandwidth Coding Adapted to Typical Subband Configurations
[0061] As mentioned above in connection with the Traunmüller and Zwicker/Fastl publications, there exist different scales (e.g. Bark scale) for the frequency axis that approximate the properties of human hearing. These frequency scales share the property of increasing subband widths with increasing frequency, such that at lower frequencies a better frequency resolution is obtained. The subband widths can be coded by transferring the bandwidth differences
ΔB.sub.SB[g]=B.sub.SB[g]−B.sub.SB[g−1]; g=2, . . . ,N.sub.SB−1. (4)
[0062] For the considered subband properties these bandwidth differences are then always non-negative.
[0063] Therefore, a subband configuration can also be defined by:
[0064] number of used subband groups N.sub.SB;
[0065] bandwidth B.sub.SB[1] for the first subband group g=1;
[0066] bandwidth differences ΔB.sub.SB[g] for subband groups g=2, . . . , N.sub.SB−1.
[0067] From the bandwidth differences the bandwidths B.sub.SB[g] for subband groups g=2, . . . , N.sub.SB−1 can be reconstructed, for instance as shown in table 4 following line CodedBwFirstSubband.
[0068] The last subband group bandwidth B.sub.SB[N.sub.SB] can be reconstructed by using equation (3).
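The relation between bandwidths and bandwidth differences in equation (4) can be sketched as a pair of helper functions. Both function names are assumptions; the list passed in holds B.sub.SB[1] to B.sub.SB[N.sub.SB−1], i.e. without the last group.

```python
def to_differences(bandwidths_without_last):
    """Split B_SB[1..N_SB-1] into B_SB[1] and the differences of equation (4)."""
    b = bandwidths_without_last
    return b[0], [b[i] - b[i - 1] for i in range(1, len(b))]


def from_differences(first, diffs):
    """Rebuild B_SB[1..N_SB-1] by accumulating the differences."""
    out = [first]
    for d in diffs:
        out.append(out[-1] + d)
    return out


first, diffs = to_differences([1, 1, 1, 2, 2, 5, 10])
print(first, diffs)                     # 1 [0, 0, 1, 0, 3, 5]
print(from_differences(first, diffs))   # [1, 1, 1, 2, 2, 5, 10]
```

Because the bandwidths do not decrease with frequency, every difference is non-negative, which is what makes a unary code applicable.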
Statistical Analysis of Typical Subband Group Widths
[0069] For a statistical analysis of the subband group bandwidths and bandwidth differences, example subband configurations for a QMF filter bank with N.sub.FB=64 subbands and with N.sub.SB=2, . . . , 20 subband groups that approximate a Bark scale were analysed. The subband groups were defined based on the conversion defined in the above-mentioned Traunmüller publication between z in Bark and f in Hz, which is given by
[0070] In more detail, the subband groups are obtained by:
[0071] creating equally spaced band edges on the Bark scale for the number of desired subband groups;
[0072] converting these values back to the frequency scale, which converted values are the desired band edges of the subband groups;
[0073] finding the centre frequencies of the original QMF subbands that lie inside the desired subbands;
[0074] applying some postprocessing in order to achieve increasing bandwidths of the subband groups.
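The first three steps can be sketched as follows. Several points are assumptions rather than statements of the described method: the conversion uses the commonly cited form of Traunmüller's approximation, z = 26.81·f/(1960+f) − 0.53, the sampling rate is taken as 48 kHz, the QMF subbands are assumed to be uniformly spaced, and the postprocessing step is omitted because it is not specified. The resulting widths therefore need not match the values listed for the analysed configurations.

```python
def hz_to_bark(f):
    # Assumed form of the Traunmueller approximation (f in Hz, z in Bark).
    return 26.81 * f / (1960.0 + f) - 0.53


def bark_to_hz(z):
    # Algebraic inverse of hz_to_bark.
    return 1960.0 * (z + 0.53) / (26.28 - z)


def raw_group_widths(n_sb, n_fb=64, fs=48000.0):
    """Count uniformly spaced subband centres between Bark-equidistant edges."""
    nyquist = fs / 2.0
    z_lo, z_hi = hz_to_bark(0.0), hz_to_bark(nyquist)
    # Step 1 and 2: equally spaced edges in Bark, converted back to Hz.
    edges = [bark_to_hz(z_lo + (z_hi - z_lo) * g / n_sb) for g in range(n_sb + 1)]
    edges[-1] = nyquist  # guard against rounding in the round trip
    # Step 3: assign each original subband by its centre frequency.
    widths = [0] * n_sb
    for i in range(n_fb):
        centre = (i + 0.5) * nyquist / n_fb
        for g in range(n_sb):
            if edges[g] <= centre < edges[g + 1]:
                widths[g] += 1
                break
    return widths


print(raw_group_widths(4))
```

Without postprocessing, a group at low frequencies may end up empty when many groups are requested, which is one reason a further adjustment step is needed.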
[0075] The resulting bandwidths of the subband groups, dependent on the number of subband groups, are given in table 2:
TABLE 2
N.sub.SB   B.sub.SB[1], . . . , B.sub.SB[N.sub.SB − 1]
2          [5]
3          [2 7]
4          [2 3 7]
5          [1 2 4 8]
6          [1 1 3 4 9]
7          [1 1 2 2 4 10]
8          [1 1 1 2 2 5 10]
9          [1 1 1 2 2 3 5 11]
10         [1 1 1 1 2 2 3 6 11]
11         [1 1 1 1 1 2 3 3 6 12]
12         [1 1 1 1 1 1 2 2 4 6 12]
13         [1 1 1 1 1 1 1 2 3 4 6 12]
14         [1 1 1 1 1 1 1 2 2 3 4 6 12]
15         [1 1 1 1 1 1 1 1 2 2 3 5 6 12]
16         [1 1 1 1 1 1 1 1 1 2 2 4 4 7 12]
17         [1 1 1 1 1 1 1 1 1 2 2 2 4 4 7 12]
18         [1 1 1 1 1 1 1 1 1 1 2 2 2 4 4 7 12]
19         [1 1 1 1 1 1 1 1 1 1 1 2 2 3 3 5 7 11]
20         [1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 4 5 7 11]
[0076] The bandwidth B.sub.SB[N.sub.SB] is omitted in table 2 because it is the remaining bandwidth that adds up to a total bandwidth of 64 subbands.
[0077]
[0078]
[0079] In
[0080] As mentioned above, for the last subband group g=N.sub.SB no bandwidth difference ΔB.sub.SB[N.sub.SB] needs to be transferred.
Improved Coding Processing
[0081] Based on the statistical analysis, the following improved coding processing is carried out:
[0082] coding of the number of subband groups:
CodedNumberOfSubbands=N.sub.SB−1 (7)
[0083] is coded with a fixed number of bits N.sub.b,SB;
[0084] if the number of subband groups N.sub.SB is one, nothing else is transferred because this case is identical to a broadband processing;
[0085] coding of the bandwidth value B.sub.SB[1] of the first subband group. As B.sub.SB[1]≥1,
CodedBwFirstSubband=B.sub.SB[1]−1 (8)
[0086] is coded with a unary code;
[0087] the following bandwidth values need only be transferred if N.sub.SB>2:
[0088] subband groups g=2, . . . , N.sub.SB−2: bandwidth difference values ΔB.sub.SB[g] are each coded with a unary code;
[0089] subband group g=N.sub.SB−1: the bandwidth difference value ΔB.sub.SB[N.sub.SB−1] is coded with a fixed number of bits N.sub.b,lastDiff;
[0090] subband group g=N.sub.SB: no value or coded value is transferred.
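The steps above can be sketched as a small encoder that emits a string of '0' and '1' characters. Two points are assumptions: the unary code is taken as n one-bits followed by a terminating zero-bit (the polarity is not specified), and the 2-bit configIdx prefix is left out so that only the adapted coding itself is shown. Function names are illustrative.

```python
def unary(n):
    """Assumed unary convention: n ones followed by a terminating zero."""
    if n < 0:
        raise ValueError("unary code needs a non-negative value")
    return "1" * n + "0"


def fixed(value, n_bits):
    """Unsigned integer with a fixed number of bits, MSB first."""
    if not 0 <= value < (1 << n_bits):
        raise ValueError("value does not fit into the fixed word length")
    return format(value, f"0{n_bits}b")


def encode_subband_config(bandwidths, n_b_sb=5, n_b_last_diff=3):
    """bandwidths holds B_SB[1..N_SB], including the last group."""
    n_sb = len(bandwidths)
    if n_sb < 1:
        raise ValueError("at least one subband group is required")
    bits = fixed(n_sb - 1, n_b_sb)                 # CodedNumberOfSubbands
    if n_sb > 1:
        bits += unary(bandwidths[0] - 1)           # CodedBwFirstSubband
    if n_sb > 2:
        for g in range(1, n_sb - 2):               # groups 2..N_SB-2 (0-based index)
            bits += unary(bandwidths[g] - bandwidths[g - 1])
        g = n_sb - 2                               # group N_SB-1 (0-based index)
        bits += fixed(bandwidths[g] - bandwidths[g - 1], n_b_last_diff)
    return bits                                    # last group is never written


print(encode_subband_config([2, 3, 7, 52]))
```

For the bandwidths 2, 3, 7 and 52 this produces 5 + 2 + 2 + 3 = 12 bits under the stated unary convention.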
[0091] The coding scheme bitstream syntax is shown in table 3 as pseudo-code for transfer of subband configuration data. Data in bold are written to the bitstream and represent a subband configuration data block (s.sub.SBconfig).
TABLE 3
Syntax                                             No. of bits        Type
configIdx                                          2                  unsigned int
if (configIdx == 3) {
  CodedNumberOfSubbands (i.e. N.sub.SB − 1)        N.sub.b,SB         unsigned int
  if (CodedNumberOfSubbands > 0) {
    CodedBwFirstSubband                            (dynamic)          unary code
    if (CodedNumberOfSubbands > 1) {
      if (CodedNumberOfSubbands > 2) {
        for g = 2 to N.sub.SB − 2 {
          ΔB.sub.SB[g]                             (dynamic)          unary code
        }
      }
      ΔB.sub.SB[N.sub.SB − 1]                      N.sub.b,lastDiff   unsigned int
    }
  }
}
[0092] The inventors have found that, for N.sub.FB=64, sufficient bit widths (i.e. word lengths) are N.sub.b,SB=5 and N.sub.b,lastDiff=3.
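The chosen word lengths can be checked against the analysed configurations. The sketch below copies the bandwidth lists of Table 2 and determines the largest value that would have to fit into each fixed-length field; the variable names are assumptions.

```python
# B_SB[1..N_SB-1] per number of subband groups N_SB, copied from Table 2.
TABLE_2 = {
    2: [5],
    3: [2, 7],
    4: [2, 3, 7],
    5: [1, 2, 4, 8],
    6: [1, 1, 3, 4, 9],
    7: [1, 1, 2, 2, 4, 10],
    8: [1, 1, 1, 2, 2, 5, 10],
    9: [1, 1, 1, 2, 2, 3, 5, 11],
    10: [1, 1, 1, 1, 2, 2, 3, 6, 11],
    11: [1, 1, 1, 1, 1, 2, 3, 3, 6, 12],
    12: [1, 1, 1, 1, 1, 1, 2, 2, 4, 6, 12],
    13: [1, 1, 1, 1, 1, 1, 1, 2, 3, 4, 6, 12],
    14: [1, 1, 1, 1, 1, 1, 1, 2, 2, 3, 4, 6, 12],
    15: [1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 3, 5, 6, 12],
    16: [1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 4, 4, 7, 12],
    17: [1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 4, 4, 7, 12],
    18: [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 4, 4, 7, 12],
    19: [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 3, 3, 5, 7, 11],
    20: [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 4, 5, 7, 11],
}

# Largest difference coded with N_b,lastDiff bits (only present for N_SB > 2).
max_last_diff = max(bw[-1] - bw[-2] for bw in TABLE_2.values() if len(bw) >= 2)
# Largest CodedNumberOfSubbands = N_SB - 1 coded with N_b,SB bits.
max_coded_count = max(TABLE_2) - 1

print(max_last_diff, max_last_diff < 2 ** 3)      # fits into 3 bits
print(max_coded_count, max_coded_count < 2 ** 5)  # fits into 5 bits
```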
[0093] Table 4 shows decoding of the transferred subband configuration data, by reading these data from the bitstream received at decoder side (data in bold are read from the bitstream), and reconstruction of the bandwidth values B.sub.SB[g]:
TABLE 4
Syntax                                                     No. of bits        Type
configIdx                                                  2                  unsigned int
if (configIdx < 3) {
  N.sub.SB = numOfSubbandsTable[configIdx]
  B.sub.SB = subbandWidthTable[configIdx]
} else {
  CodedNumberOfSubbands                                    N.sub.b,SB         unsigned int
  N.sub.SB = CodedNumberOfSubbands + 1
  B.sub.total = 0
  if (N.sub.SB > 1) {
    CodedBwFirstSubband                                    (dynamic)          unary code
    B.sub.SB[1] = CodedBwFirstSubband + 1
    B.sub.total = B.sub.total + B.sub.SB[1]
    if (N.sub.SB > 2) {
      if (N.sub.SB > 3) {
        for g = 2 to N.sub.SB − 2 {
          ΔB.sub.SB[g]                                     (dynamic)          unary code
          B.sub.SB[g] = ΔB.sub.SB[g] + B.sub.SB[g − 1]
          B.sub.total = B.sub.total + B.sub.SB[g]
        }
      }
      g = N.sub.SB − 1
      ΔB.sub.SB[g]                                         N.sub.b,lastDiff   unsigned int
      B.sub.SB[g] = ΔB.sub.SB[g] + B.sub.SB[g − 1]
      B.sub.total = B.sub.total + B.sub.SB[g]
    }
  }
  B.sub.SB[N.sub.SB] = N.sub.FB − B.sub.total
}
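A decoder following this syntax might look like the Python sketch below. It reads from a string of '0' and '1' characters; the `BitReader` class, the function name and the unary convention (ones terminated by a zero) are assumptions made for illustration, and the pre-defined configurations are those of the N.sub.FB=64 example.

```python
class BitReader:
    """Hypothetical helper that reads bits from a '0'/'1' string."""

    def __init__(self, bits):
        self.bits = bits
        self.pos = 0

    def read_uint(self, n_bits):
        chunk = self.bits[self.pos:self.pos + n_bits]
        if len(chunk) != n_bits:
            raise ValueError("bitstream too short")
        self.pos += n_bits
        return int(chunk, 2)

    def read_unary(self):
        # Assumed convention: count ones until the terminating zero.
        n = 0
        while self.read_uint(1) == 1:
            n += 1
        return n


PREDEFINED = {0: [], 1: [1, 1, 5, 57], 2: [1, 1, 1, 2, 2, 5, 10, 42]}


def decode_subband_config(bits, n_fb=64, n_b_sb=5, n_b_last_diff=3):
    """Return the list B_SB[1..N_SB] reconstructed from the bitstream."""
    reader = BitReader(bits)
    config_idx = reader.read_uint(2)
    if config_idx < 3:
        return list(PREDEFINED[config_idx])
    n_sb = reader.read_uint(n_b_sb) + 1
    bw = []
    if n_sb > 1:
        bw.append(reader.read_unary() + 1)            # B_SB[1]
        if n_sb > 2:
            for _ in range(2, n_sb - 1):              # groups 2..N_SB-2
                bw.append(bw[-1] + reader.read_unary())
            bw.append(bw[-1] + reader.read_uint(n_b_last_diff))  # group N_SB-1
    bw.append(n_fb - sum(bw))                         # last group, never transmitted
    return bw


print(decode_subband_config("11" + "000111010100"))
```

The running total B.sub.total of the syntax table is replaced here by `sum(bw)`, which yields the same value for the last group.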
[0094] The reconstruction of subband index set G.sub.g from the reconstructed bandwidth values B.sub.SB[g] for all subband groups is shown in pseudo code in table 5:
TABLE 5
i = 0
for g = 1 to N.sub.SB {
  G.sub.g = { }
  for b = 1 to B.sub.SB[g] {
    i = i + 1
    G.sub.g = G.sub.g ∪ {i}
  }
}
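The same reconstruction can be written compactly in Python. This is a sketch with an assumed function name; subband indices are 1-based as in the pseudo code.

```python
def subband_index_sets(bandwidths):
    """Return the sets G_1..G_N_SB of 1-based original subband indices."""
    groups = []
    i = 0  # index of the last original subband already assigned
    for width in bandwidths:
        groups.append(set(range(i + 1, i + width + 1)))
        i += width
    return groups


print(subband_index_sets([1, 3, 4]))  # [{1}, {2, 3, 4}, {5, 6, 7, 8}]
```

The bandwidths 1, 3 and 4 reproduce the index sets of the example in equation (1).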
Results for the Improved Coding Processing
[0095] The number of required bits for coding the subband configurations is simulated for a QMF filter bank with N.sub.FB=64 subbands and with N.sub.SB=2, . . . , 20 subband groups with the configurations given in table 2.
[0096] In comparison with the total of 23 bits example in the paragraph following equation (3), the improved processing requires 12 bits only.
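The 12-bit figure can be reproduced for the N.sub.SB=4 configuration with bandwidths 2, 3 and 7, assuming that a unary code for the value n occupies n+1 bits and that the 2-bit configIdx is not counted (it is likewise absent from the 23-bit figure). The function name is an assumption.

```python
def improved_config_bits(bw_without_last, n_b_sb=5, n_b_last_diff=3):
    """Bit count of the adapted coding for B_SB[1..N_SB-1], without configIdx."""
    bw = bw_without_last
    n_sb = len(bw) + 1
    bits = n_b_sb                                  # CodedNumberOfSubbands
    if n_sb > 1:
        bits += bw[0]                              # unary code of B_SB[1]-1: B_SB[1] bits
    if n_sb > 2:
        for g in range(1, n_sb - 2):               # groups 2..N_SB-2 (0-based index)
            bits += (bw[g] - bw[g - 1]) + 1        # unary code of the difference
        bits += n_b_last_diff                      # group N_SB-1, fixed length
    return bits


print(improved_config_bits([2, 3, 7]))  # 5 + 2 + 2 + 3 = 12
```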
[0097] The improved subband configuration coding processing clearly outperforms the alternative approaches.
[0098] An example encoder including generation of corresponding encoded subband configuration data is shown in
[0099] In
[0100] In the decoder in
[0101] In a different embodiment the original subbands do not have equal widths. Further, instead of having a number of original subbands that is a power of ‘2’, any other integer numbers of original subbands could be used. In both cases the described processing can be used in a corresponding manner.
[0102] In a further embodiment a compressed audio signal contains multiple sets of different subband configuration data encoded as described above, which serve for applying different coding tools used for coding that audio signal, e.g. directional signal parts and ambient signal parts of a Higher Order Ambisonics audio signal or any other 3D audio signal, or different channels of a multi-channel audio signal.
[0103] In a further embodiment the processed subband signals x̂(k,i) may not be transferred to the decoder side, but at decoder side the subband signals are computed by an analysis filter bank from another transferred signal. Then the subband group side information s(k,g) is used in the decoder for further processing.
[0104] The described processing can be carried out by a single processor or electronic circuit, or by several processors or electronic circuits operating in parallel and/or operating on different parts of the complete processing.
[0105] The instructions for operating the processor or the processors according to the described processing can be stored in one or more memories. The at least one processor is configured to carry out these instructions.