AUDIO CODING DEVICE, AUDIO CODING METHOD, AUDIO CODING PROGRAM, AUDIO DECODING DEVICE, AUDIO DECODING METHOD, AND AUDIO DECODING PROGRAM
20220059108 · 2022-02-24
Assignee
Inventors
CPC classification: G10L19/09; G10L19/005 (Physics)
International classification: G10L19/125; G10L19/005 (Physics)
Abstract
An audio signal transmission device for encoding an audio signal includes an audio encoding unit that encodes an audio signal and a side information encoding unit that calculates and encodes side information from a look-ahead signal. An audio signal receiving device for decoding an audio code and outputting an audio signal includes: an audio code buffer that detects packet loss based on a received state of an audio packet, an audio parameter decoding unit that decodes an audio code when an audio packet is correctly received, a side information decoding unit that decodes a side information code when an audio packet is correctly received, a side information accumulation unit that accumulates side information obtained by decoding a side information code, an audio parameter missing processing unit that outputs an audio parameter upon detection of audio packet loss, and an audio synthesis unit that synthesizes decoded audio from the audio parameter.
Claims
1-21. (canceled)
22. An audio encoding method by an audio encoding device for encoding an audio signal, comprising: an audio encoding step of encoding an audio signal; and a side information encoding step of calculating side information from a look-ahead signal for calculating a predicted value of an audio parameter to synthesize a decoded audio, and encoding the side information, wherein the side information contains information indicative of availability of the side information, and wherein the side information is adopted as an audio parameter on the decoding side when the reliability of the predicted value of the calculated audio parameter is low.
23. The audio encoding method according to claim 22, wherein the side information is indicative of a pitch lag included in the look-ahead signal.
24. An audio encoding device for encoding an audio signal, the audio encoding device comprising: an audio encoder configured to encode the audio signal; and a side information encoder configured to calculate side information from a look-ahead signal for calculating a predicted value of an audio parameter to synthesize a decoded audio, and encode the side information, wherein the side information contains information indicative of availability of the side information, and wherein the side information is adopted as an audio parameter on the decoding side when the reliability of the predicted value of the calculated audio parameter is low.
Description
BRIEF DESCRIPTION OF DRAWINGS
DESCRIPTION OF EMBODIMENTS
[0083] Embodiments of the audio coding system are described hereinafter with reference to the attached drawings. Note that, where possible, the same elements are denoted by the same reference numerals and redundant description thereof is omitted.
[0084] An embodiment of the audio coding system relates to an encoder and a decoder that implement “packet loss concealment technology using side information” that encodes and transmits side information calculated on the encoder side for use in packet loss concealment on the decoder side.
[0085] In the embodiments of the audio coding system, the side information that is used for packet loss concealment is contained in a previous packet.
[0086] Because the side information is contained in a previous packet, it is possible to perform decoding without waiting for a packet that arrives after a packet to be decoded. Further, when packet loss is detected, because the side information for a frame to be concealed is obtained from the previous packet, it is possible to implement highly accurate packet loss concealment without waiting for the next packet.
[0087] In addition, by transmitting parameters for CELP encoding in a look-ahead signal as the side information, it is possible to reduce the inconsistency of adaptive codebooks even in the event of packet loss.
[0088] The embodiments of the audio coding system can include an audio signal transmitting device (audio encoding device) and an audio signal receiving device (audio decoding device). A functional configuration example of an audio signal transmitting device (such as an audio encoding device) is shown in
[0089] As shown in
[0090] The audio signal transmitting device encodes an audio signal for each frame and can transmit the audio signal by the example procedure shown in
[0091] The audio encoding unit 111 can calculate audio parameters for a frame to be encoded and output an audio code (Step S131 in
[0092] The side information encoding unit 112 can calculate audio parameters for a look-ahead signal and output a side information code (Step S132 in
[0093] It is determined whether the audio signal ends, and the above steps can be repeated until the audio signal ends (Step S133 in
[0094] The audio signal receiving device decodes a received audio packet and outputs an audio signal by the example procedure shown in
[0095] The audio code buffer 121 waits for the arrival of an audio packet and accumulates an audio code. When the audio packet has correctly arrived, the processing is switched to the audio parameter decoding unit 122. On the other hand, when the audio packet has not correctly arrived, the processing is switched to the audio parameter missing processing unit 123 (Step S141 in
[0096] <When Audio Packet is Correctly Received>
[0097] The audio parameter decoding unit 122 decodes the audio code and outputs audio parameters (Step S142 in
[0098] The side information decoding unit 125 decodes the side information code and outputs side information. The outputted side information is sent to the side information accumulation unit 126 (Step S143 in
[0099] The audio synthesis unit 124 synthesizes an audio signal from the audio parameters output from the audio parameter decoding unit 122 and outputs the synthesized audio signal (Step S144 in
[0100] The audio parameter missing processing unit 123 accumulates the audio parameters output from the audio parameter decoding unit 122 in preparation for packet loss (Step S145 in
[0101] The audio code buffer 121 determines whether the transmission of audio packets has ended, and when the transmission of audio packets has ended, stops the processing. While the transmission of audio packets continues, the above Steps S141 to S146 are repeated (Step S147 in
[0102] <When Audio Packet is Lost>
[0103] The audio parameter missing processing unit 123 reads the side information from the side information accumulation unit 126 and carries out prediction for the parameter(s) not contained in the side information and thereby outputs the audio parameters (Step S146 in
[0104] The audio synthesis unit 124 synthesizes an audio signal from the audio parameters output from the audio parameter missing processing unit 123 and outputs the synthesized audio signal (Step S144 in
[0105] The audio parameter missing processing unit 123 accumulates the audio parameters output from the audio parameter missing processing unit 123 in preparation for packet loss (Step S145 in
[0106] The audio code buffer 121 determines whether the transmission of audio packets has ended, and when the transmission of audio packets has ended, stops the processing. While the transmission of audio packets continues, the above Steps S141 to S146 are repeated (Step S147 in
Example 1
[0107] This example describes a case where a pitch lag is transmitted as the side information and is used at the decoding end for generation of a packet loss concealment signal.
[0108] The functional configuration example of the audio signal transmitting device is shown in
[0109] <Transmitting End>
[0110] In the audio signal transmitting device, an input audio signal is sent to the audio encoding unit 111.
[0111] The audio encoding unit 111 encodes a frame to be encoded by CELP encoding (Step 131 in
[0112] The side information encoding unit 112 calculates a side information code using the parameters calculated by the audio encoding unit 111 and the look-ahead signal (Step 132 in
[0113] The LP coefficient calculation unit 151 calculates an LP coefficient using the ISF parameter calculated by the audio encoding unit 111 and the ISF parameter calculated in the past several frames (Step 161 in
[0114] First, the buffer is updated using the ISF parameter obtained from the audio encoding unit 111 (Step 171 in
where ω.sub.i.sup.(−j) is the ISF parameter, stored in the buffer, for the frame preceding the current one by j frames. Further, ω.sub.i.sup.C is the ISF parameter during the speech period, calculated in advance by learning or the like. β is a constant, which may be a value such as 0.75, for example, though not limited thereto. Likewise, α is a constant, which may be a value such as 0.9, for example, though not limited thereto. ω.sub.i.sup.C, α and β may be varied by the index representing the characteristics of the frame to be encoded, as in the ISF concealment described in ITU-T G.718, for example.
[0115] In addition, the values of i are arranged so that {dot over (ω)}.sub.i satisfies 0<{dot over (ω)}.sub.0<{dot over (ω)}.sub.1< . . . {dot over (ω)}.sub.14, and the values of {dot over (ω)}.sub.i can be adjusted so that adjacent values are not too close to each other. As a procedure to adjust the value of {dot over (ω)}.sub.i, ITU-T G.718 (Equation 151) may be used, for example (Step 173 in
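The buffer-based prediction and spacing adjustment described in paragraphs [0114] and [0115] can be sketched roughly as follows. The weighted-combination form of the prediction, the constant values, and the minimum gap are illustrative assumptions; the patent's actual equation (involving α, β, and several past frames) and the exact G.718 spacing rule are not reproduced here.

```python
def predict_isf(prev_isf, mean_isf, alpha=0.9, min_gap=0.005):
    """Predict an ISF vector from the previous frame's ISF and a learned
    speech-period ISF, then keep adjacent values apart (cf. ITU-T G.718
    Equation 151). All parameter values here are illustrative."""
    # Assumed weighted combination; the patent's equation also involves a
    # constant beta and ISF parameters from several past frames.
    pred = [alpha * p + (1.0 - alpha) * m for p, m in zip(prev_isf, mean_isf)]
    # Arrange in increasing order, then enforce a minimum gap between
    # adjacent values so they are "not too close".
    pred.sort()
    for i in range(1, len(pred)):
        if pred[i] - pred[i - 1] < min_gap:
            pred[i] = pred[i - 1] + min_gap
    return pred
```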
[0116] After that, the ISF parameter {dot over (ω)}.sub.i is converted into an ISP parameter and interpolation can be performed for each sub-frame. As an example method of calculating the ISP parameter from the ISF parameter, the method described in the section 6.4.4 in ITU-T G.718 may be used, and as a method of interpolation, the procedure described in the section 6.8.3 in ITU-T G.718 may be used (Step 174 in
[0117] Then, the ISP parameter for each sub-frame is converted into an LP coefficient {dot over (α)}.sup.i.sub.j(0<i≤P, 0≤j<M.sub.la). The number of sub-frames contained in the look-ahead signal is M.sub.la. For the conversion from the ISP parameter to the LP coefficient, in an example, the procedure described in the section 6.4.5 in ITU-T G.718 may be used (Step 175 in
[0118] The target signal calculation unit 152 calculates a target signal x(n) and an impulse response h(n) by using the LP coefficient {dot over (α)}.sup.i.sub.j (Step 162 in
[0119] First, a residual signal r(n) of the look-ahead signal S.sub.pre.sup.l(n)(0≤n<L′) is calculated using the LP coefficient according to the following equation (Step 181 in
[0120] Note that L′ indicates the number of samples of a sub-frame, and L indicates the number of samples of the frame to be encoded s.sub.pre(n)(0≤n<L). Accordingly, s.sub.pre.sup.l(n−p)=s.sub.pre(n+L−p) is satisfied.
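A minimal sketch of the residual calculation in paragraph [0119]. The analysis-filter sign convention A(z) = 1 + Σ a.sub.i z.sup.−i is an assumption; the `history` argument supplies the P samples that precede the look-ahead signal, consistent with the relation s.sub.pre.sup.l(n−p)=s.sub.pre(n+L−p) above.

```python
def lp_residual(signal, lp_coeffs, history):
    """Residual r(n) = s(n) + sum_{i=1..P} a_i * s(n - i), with `history`
    supplying the P samples preceding `signal`. The sign convention
    A(z) = 1 + sum a_i z^-i is an assumption."""
    p = len(lp_coeffs)
    ext = list(history[-p:]) + list(signal)  # prepend the P past samples
    residual = []
    for n in range(len(signal)):
        r = ext[n + p]
        for i, a in enumerate(lp_coeffs, start=1):
            r += a * ext[n + p - i]
        residual.append(r)
    return residual
```

With a first-order predictor a.sub.1 = −1 (predict the previous sample), a constant signal yields a zero residual, as expected.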
[0121] In addition, the target signal x(n)(0≤n<L′) is calculated by the following equations (Step 182 in
where the perceptual weighting coefficient γ=0.68. The value of the perceptual weighting coefficient may be set differently according to the design policy of the audio encoding.
[0122] Then, the impulse response h(n)(0<n<L′) is calculated by the following equations (Step 183 in
[0123] The pitch lag calculation unit 153 calculates a pitch lag for each sub-frame by calculating k that maximizes the following equation (Step 163 in
Note that y.sub.k(n) is obtained by convoluting the impulse response with the linear prediction residual. Int(i) indicates an interpolation filter. The details of an example of an interpolation filter are described in the section 6.8.4.1.4.1 in ITU-T G.718. As a matter of course, v′(n)=u(n+N.sub.adapt−T.sub.p+i) may be employed without using the interpolation filter.
[0124] Although the pitch lag is calculated as an integer by the above-described method, its accuracy may be increased to fractional (sub-sample) precision by interpolating the above T.sub.k.
[0125] A procedure to calculate the fractional pitch lag by interpolation can be performed, for example, by the processing method described in the section 6.8.4.1.4.1 in ITU-T G.718.
[0126] The adaptive codebook calculation unit 154 calculates an adaptive codebook vector v′(n) and a long-term prediction parameter from the pitch lag T.sub.p and the adaptive codebook u(n) stored in the adaptive codebook buffer 156 according to the following equation (Step 164 in
For the details of an example procedure to calculate the long-term prediction parameter, the method described in the section 5.7 in 3GPP TS26-190 may be used.
[0127] The excitation vector synthesis unit 155 multiplies the adaptive codebook vector v′(n) by a predetermined adaptive codebook gain g.sub.p.sup.C and outputs an excitation signal vector according to the following equation (Step 165 in
e(n)=g.sub.p.sup.C.Math.v′(n) Equation 15
Although the value of the adaptive codebook gain g.sub.p.sup.C may be 1.0 or the like, for example, a value obtained in advance by learning may be used, or it may be varied by the index representing the characteristics of the frame to be encoded.
[0128] Then, the state of the adaptive codebook u(n) stored in the adaptive codebook buffer 156 is updated by the excitation signal vector according to the following equations (Step 166 in
u(n)=u(n+L) (0≤n<N−L) Equation 16
u(n+N−L)=e(n) (0≤n<L) Equation 17
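Equations 16 and 17 amount to a shift-and-append update of the adaptive codebook buffer, for example:

```python
def update_adaptive_codebook(u, e):
    """Equations 16 and 17: shift the length-N buffer u left by L = len(e)
    samples (u(n) = u(n + L)) and write the new excitation vector into the
    freed tail (u(n + N - L) = e(n))."""
    L = len(e)
    return u[L:] + list(e)
```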
[0129] The synthesis filter 157 synthesizes a decoded signal according to the following equation by linear prediction inverse filtering using the excitation signal vector as an excitation source (Step 167 in
[0130] The above-described Steps 162 to 167 in
[0131] The pitch lag encoding unit 158 encodes the pitch lag T.sub.p.sup.(j) (0≤j<M.sub.la) that is calculated in the look-ahead signal (Step 169 in
[0132] Encoding may be performed by one of the following methods, for example, although any encoding method may be used.
1. A method that performs binary encoding, scalar quantization, vector quantization or arithmetic encoding on a part or the whole of the pitch lag T.sub.p.sup.(j) (0≤j<M.sub.la) and transmits the result.
2. A method that performs binary encoding, scalar quantization, vector quantization or arithmetic encoding on a part or the whole of a difference T.sub.p.sup.(j)−T.sub.p.sup.(j-1) (0≤j<M.sub.la) from the pitch lag of the previous sub-frame and transmits the result, where T.sub.p.sup.(j-1) is the pitch lag of the last sub-frame in the frame to be encoded.
3. A method that performs vector quantization or arithmetic encoding jointly on a part or the whole of the pitch lag T.sub.p.sup.(j) (0≤j<M.sub.la) and a part or the whole of the pitch lag calculated for the frame to be encoded, and transmits the result.
4. A method that selects one of a number of predetermined interpolation methods based on a part or the whole of the pitch lag T.sub.p.sup.(j) (0≤j<M.sub.la) and transmits an index indicative of the selected interpolation method. At this time, the pitch lags of a plurality of sub-frames used for audio synthesis in the past may also be used for selection of the interpolation method.
[0133] For scalar quantization and vector quantization, a codebook determined empirically or a codebook calculated in advance by learning may be used. Further, a method that performs encoding after adding an offset value to the above pitch lag may also be included.
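As one concrete illustration of method 2 above (differential encoding against the previous sub-frame's lag, with an added offset as mentioned in paragraph [0133]), under assumed bit-width and offset values:

```python
def encode_pitch_lags_delta(lags, prev_lag, n_bits=4, offset=8):
    """Scalar-quantize each lag's difference from the preceding sub-frame's
    lag (method 2). n_bits and offset are illustrative assumptions; the
    offset maps signed differences into the non-negative code range."""
    codes, last = [], prev_lag
    for lag in lags:
        code = max(0, min((1 << n_bits) - 1, lag - last + offset))
        codes.append(code)
        last = last + code - offset  # local decoding keeps both ends in sync
    return codes

def decode_pitch_lags_delta(codes, prev_lag, offset=8):
    """Inverse of the encoder: accumulate decoded differences."""
    lags, last = [], prev_lag
    for code in codes:
        last = last + code - offset
        lags.append(last)
    return lags
```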
[0134] <Decoding End>
[0135] As shown in
[0136] The audio code buffer 121 determines whether a packet is correctly received or not. When the audio code buffer 121 determines that a packet is correctly received, the processing is switched to the audio parameter decoding unit 122 and the side information decoding unit 125. On the other hand, when the audio code buffer 121 determines that a packet is not correctly received, the processing is switched to the audio parameter missing processing unit 123 (Step 141 in
[0137] <When Packet is Correctly Received>
[0138] The audio parameter decoding unit 122 decodes the received audio code and calculates audio parameters required to synthesize the audio for the frame to be encoded (ISP parameter and corresponding ISF parameter, pitch lag, long-term prediction parameter, adaptive codebook, adaptive codebook gain, fixed codebook gain, fixed codebook vector etc.) (Step 142 in
[0139] The side information decoding unit 125 decodes the side information code, calculates a pitch lag {circumflex over (T)}.sub.p.sup.(j) (0≤j<M.sub.la) and stores it in the side information accumulation unit 126. The side information decoding unit 125 decodes the side information code by using the decoding method corresponding to the encoding method used at the encoding end (Step 143 in
[0140] The audio synthesis unit 124 synthesizes the audio signal corresponding to the frame to be encoded based on the parameters output from the audio parameter decoding unit 122 (Step 144 in
[0141] An LP coefficient calculation unit 1121 converts an ISF parameter into an ISP parameter and then performs interpolation processing, and thereby obtains an ISP coefficient for each sub-frame. The LP coefficient calculation unit 1121 then converts the ISP coefficient into a linear prediction coefficient (LP coefficient) and thereby obtains an LP coefficient for each sub-frame (Step 11301 in
[0142] An adaptive codebook calculation unit 1123 calculates an adaptive codebook vector by using the pitch lag, a long-term prediction parameter and an adaptive codebook 1122 (Step 11302 in
The adaptive codebook vector is calculated by interpolating the adaptive codebook u(n) using FIR filter Int(i). The length of the adaptive codebook is N.sub.adapt. The filter Int(i) that is used for the interpolation is the same as the interpolation filter of
This is the FIR filter with a predetermined length 2l+1, and L′ is the number of samples of the sub-frame. As at the encoding end, it is not necessary to use a filter for the interpolation.
[0143] The adaptive codebook calculation unit 1123 carries out filtering on the adaptive codebook vector according to the value of the long-term prediction parameter (Step 11303 in
v(n)=0.18v′(n−1)+0.64v′(n)+0.18v′(n+1) Equation 21
[0144] On the other hand, when the long-term prediction parameter has a value indicating no filtering is needed, filtering is not performed, and v(n)=v′(n) is established.
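The filtering decision of paragraphs [0143] and [0144] might be sketched as follows. The handling of the first and last samples here is an assumption (the codec can draw on neighboring excitation samples instead):

```python
def ltp_filter(v, apply_filter):
    """Equation 21 when the long-term prediction parameter indicates
    filtering; v(n) = v'(n) otherwise. Boundary samples are repeated here
    (an assumption; the codec has neighboring excitation samples)."""
    if not apply_filter:
        return list(v)
    out = []
    for n in range(len(v)):
        prev = v[n - 1] if n > 0 else v[0]
        nxt = v[n + 1] if n + 1 < len(v) else v[-1]
        out.append(0.18 * prev + 0.64 * v[n] + 0.18 * nxt)
    return out
```

Note that the three taps sum to 1.0, so the filter smooths without changing the level of a locally constant excitation.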
[0145] An excitation vector synthesis unit 1124 multiplies the adaptive codebook vector by an adaptive codebook gain g.sub.p (Step 11304 in
e(n)=g.sub.p.Math.v′(n)+g.sub.c.Math.c(n) Equation 22
[0146] A post filter 1125 performs post processing such as pitch enhancement, noise enhancement and low-frequency enhancement, for example, on the excitation signal vector. Example details of techniques such as pitch enhancement, noise enhancement and low-frequency enhancement are described in the section 6.1 in 3GPP TS26-190 (Step 11307 in
[0147] The adaptive codebook 1122 updates the state by an excitation signal vector according to the following equations (Step 11308 in
u(n)=u(n+L) (0≤n<N−L) Equation 23
u(n+N−L)=e(n) (0≤n<L) Equation 24
[0148] A synthesis filter 1126 synthesizes a decoded signal according to the following equation by linear prediction inverse filtering using the excitation signal vector as an excitation source (Step 11309 in
[0149] A perceptual weighting inverse filter 1127 applies perceptual weighting inverse filtering to the decoded signal according to the following equation (Step 11310 in
ŝ(n)=ŝ(n)+β.Math.ŝ(n−1) Equation 26
The value of β is typically 0.68 or the like, though not limited to this value.
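Applied recursively with the already-filtered previous sample, Equation 26 is the usual de-emphasis form 1/(1−βz.sup.−1); a minimal sketch, assuming a zero initial filter state:

```python
def deemphasis(signal, beta=0.68):
    """Equation 26 applied sample by sample: s(n) = s(n) + beta * s(n-1),
    where s(n-1) is the already-filtered previous sample. The initial
    filter state is assumed to be zero."""
    out, prev = [], 0.0
    for x in signal:
        prev = x + beta * prev
        out.append(prev)
    return out
```

A unit impulse therefore decays geometrically by β per sample, which is what makes this the inverse of a one-tap pre-emphasis filter.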
[0150] The audio parameter missing processing unit 123 stores the audio parameters (ISF parameter, pitch lag, adaptive codebook gain, fixed codebook gain) used in the audio synthesis unit 124 into the buffer (Step 145 in
[0151] <When Packet Loss is Detected>
[0152] The audio parameter missing processing unit 123 reads a pitch lag {circumflex over (T)}.sub.p.sup.(j) (0≤j<M.sub.la) from the side information accumulation unit 126 and predicts audio parameters. The functional configuration example of the audio parameter missing processing unit 123 is shown in the example of
[0153] An ISF prediction unit 191 calculates an ISF parameter using the ISF parameter for the previous frame and the ISF parameter calculated for the past several frames (Step 1101 in
[0154] First, the buffer is updated using the ISF parameter of the immediately previous frame (Step 171 in
where {dot over (ω)}.sub.i.sup.(−j) is the ISF parameter, stored in the buffer, which is for the frame preceding by j-number of frames. Further, {dot over (ω)}.sub.i.sup.C, α, and β are the same values as those used at the encoding end.
[0155] In addition, the values of i are arranged so that {dot over (ω)}.sub.i satisfies 0<{dot over (ω)}.sub.0<{dot over (ω)}.sub.1< . . . {dot over (ω)}.sub.14, and the values of {dot over (ω)}.sub.i are adjusted so that adjacent values are not too close to each other. As an example procedure to adjust the value of {dot over (ω)}.sub.i, ITU-T G.718 (Equation 151) may be used (Step 173 in
[0156] A pitch lag prediction unit 192 decodes the side information code from the side information accumulation unit 126 and thereby obtains a pitch lag {circumflex over (T)}.sub.p.sup.(i) (0≤i<M.sub.la). Further, by using a pitch lag {circumflex over (T)}.sub.p.sup.(−j) (0≤j<J) used for the past decoding, the pitch lag prediction unit 192 outputs a pitch lag {circumflex over (T)}.sub.p.sup.(i)(M.sub.la≤i<M). The number of sub-frames contained in one frame is M, and the number of pitch lags contained in the side information is M.sub.la. For the prediction of the pitch lag {circumflex over (T)}.sub.p.sup.(i)(M.sub.la≤i<M), the procedure described in, for example, section 7.11.1.3 in ITU-T G.718 may be used (Step 1102 in
[0157] An adaptive codebook gain prediction unit 193 outputs an adaptive codebook gain g.sub.p.sup.(i)(M.sub.la≤i<M) by using a predetermined adaptive codebook gain g.sub.p.sup.C and an adaptive codebook gain g.sub.p.sup.(j) (0≤j<J) used in the past decoding. The number of sub-frames contained in one frame is M, and the number of pitch lags contained in the side information is M.sub.la. For the prediction of the adaptive codebook gain g.sub.p.sup.(i)(M.sub.la≤i<M), the procedure described in, for example, section 7.11.2.5.3 in ITU-T G.718 may be used (Step 1103 in
[0158] A fixed codebook gain prediction unit 194 outputs a fixed codebook gain g.sub.c.sup.(i) (0≤i<M) by using a fixed codebook gain g.sub.c.sup.(j) (0≤j<J) used in the past decoding. The number of sub-frames contained in one frame is M. For the prediction of the fixed codebook gain g.sub.c.sup.(i) (0≤i<M), the procedure described in the section 7.11.2.6 in ITU-T G.718 may be used, for example (Step 1104 in
[0159] A noise signal generation unit 195 outputs a noise vector, such as a white noise, with a length of L (Step 1105 in
[0160] The audio synthesis unit 124 synthesizes a decoded signal based on the audio parameters output from the audio parameter missing processing unit 123 (Step 144 in
[0161] The audio parameter missing processing unit 123 stores the audio parameters (ISF parameter, pitch lag, adaptive codebook gain, fixed codebook gain) used in the audio synthesis unit 124 into the buffer (Step 145 in
[0162] Although the case of encoding and transmitting the side information for all sub-frames contained in the look-ahead signal is described in the above example, the configuration that transmits only the side information for a specific sub-frame may be employed.
Alternative Example 1-1
[0163] As an alternative example of the previously discussed example 1, an example that adds a pitch gain to the side information is described hereinafter. A difference between the alternative example 1-1 and the example 1 is the operation of the excitation vector synthesis unit 155, and therefore description of the other parts is omitted.
[0164] <Encoding End>
[0165] The procedure of the excitation vector synthesis unit 155 is shown in the example of
[0166] An adaptive codebook gain g.sub.p.sup.C is calculated from the adaptive codebook vector v′(n) and the target signal x(n) according to the following equation (Step 1111 in
where y(n) is a signal y(n)=v′(n)*h(n) that is obtained by convoluting the impulse response with the adaptive codebook vector.
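A sketch of the gain calculation in paragraph [0166], assuming the standard CELP least-squares closed form g.sub.p = ⟨x, y⟩/⟨y, y⟩ (the patent's own equation is not reproduced above):

```python
def adaptive_codebook_gain(target, filtered_adaptive):
    """g_p = <x, y> / <y, y>, with y(n) = v'(n) * h(n) the adaptive codebook
    vector convolved with the impulse response. This closed form is an
    assumption consistent with the surrounding description."""
    num = sum(x * y for x, y in zip(target, filtered_adaptive))
    den = sum(y * y for y in filtered_adaptive)
    return num / den if den > 0.0 else 0.0
```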
[0167] The calculated adaptive codebook gain is encoded and contained in the side information code (Step 1112 in
[0168] By multiplying the adaptive codebook vector by an adaptive codebook gain ĝ.sub.p obtained by decoding the code calculated in the encoding of the adaptive codebook gain, an excitation vector is calculated according to the following equation (Step 1113 in
e(n)=ĝ.sub.p.Math.v′(n) Equation 30
[0169] <Decoding End>
[0170] The excitation vector synthesis unit 155 multiplies the adaptive codebook vector v′(n) by an adaptive codebook gain ĝ.sub.p obtained by decoding the side information code and outputs an excitation signal vector according to the following equation (Step 165 in
e(n)=ĝ.sub.p.Math.v′(n) Equation 31
Alternative Example 1-2
[0171] As an alternative example of the example 1, an example that adds a flag for determination of use of the side information to the side information is described hereinafter.
[0172] <Encoding End>
[0173] The functional configuration example of the side information encoding unit is shown in
[0174] The side information output determination unit 1128 calculates segmental SNR of the decoded signal and the look-ahead signal according to the following equation, and only when segmental SNR exceeds a threshold, sets the value of the flag to ON and adds it to the side information.
On the other hand, when the segmental SNR does not exceed the threshold, the side information output determination unit 1128 sets the value of the flag to OFF and adds it to the side information (Step 1131 in
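The flag decision of paragraph [0174] can be sketched as follows. The segmental SNR formula (per-segment dB average), the segment length, and the threshold are illustrative assumptions, since the patent's equation is not reproduced above:

```python
import math

def side_info_flag(lookahead, decoded, seg_len=40, threshold_db=0.0):
    """Set the flag ON only when the segmental SNR between the look-ahead
    signal and the decoded signal exceeds a threshold. Segment length and
    threshold values are assumptions."""
    snrs = []
    for start in range(0, len(lookahead), seg_len):
        s = lookahead[start:start + seg_len]
        d = decoded[start:start + seg_len]
        sig = sum(x * x for x in s)
        err = sum((x - y) ** 2 for x, y in zip(s, d))
        # Small constants guard against log of zero.
        snrs.append(10.0 * math.log10((sig + 1e-12) / (err + 1e-12)))
    return sum(snrs) / len(snrs) > threshold_db
```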
[0175] <Decoding End>
[0176] The side information decoding unit decodes the flag contained in the side information code. When the value of the flag is ON, the audio parameter missing processing unit calculates a decoded signal by the same procedure as in the example 1. On the other hand, when the value of the flag is OFF, it calculates a decoded signal by the packet loss concealment technique without using side information (Step 1151 in
Example 2
[0177] In this example, the decoded audio of the look-ahead signal part is also used when a packet is correctly received. For purposes of this discussion, the number of sub-frames contained in one frame is M, and the length of the look-ahead signal is M′ sub-frames.
[0178] <Encoding End>
[0179] As shown in the example of
[0180] The error signal encoding unit 214 reads a concealment signal for one sub-frame from the concealment signal accumulation unit 213, subtracts it from the audio signal and thereby calculates an error signal (Step 221 in
[0181] The error signal encoding unit 214 encodes the error signal. As a specific example procedure, AVQ described in the section 6.8.4.1.5 in ITU-T G.718, can be used. In the encoding of the error signal, local decoding is performed, and a decoded error signal is output (Step 222 in
[0182] By adding the decoded error signal to the concealment signal, a decoded signal for one sub-frame is output (Step 223 in
[0183] The above Steps 221 to 223 are repeated for M′ sub-frames until the end of the concealment signal.
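Steps 221 to 223 form a subtract-encode-add loop per sub-frame. In this sketch, a toy uniform scalar quantizer stands in for the AVQ encoding of ITU-T G.718 section 6.8.4.1.5 (an assumption for illustration only):

```python
def encode_lookahead_subframe(audio, concealment, step=0.05):
    """Steps 221-223: error = audio - concealment (Step 221); encode the
    error with a toy uniform scalar quantizer standing in for AVQ and
    locally decode it (Step 222); add the decoded error back to the
    concealment signal to obtain the decoded signal (Step 223)."""
    error = [a - c for a, c in zip(audio, concealment)]
    codes = [round(e / step) for e in error]    # encode
    decoded_error = [q * step for q in codes]   # local decoding
    decoded = [c + de for c, de in zip(concealment, decoded_error)]
    return codes, decoded
```

Because the decoder repeats the same add-back step on the same concealment signal, encoder and decoder stay synchronized up to the quantization error.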
[0184] An example functional configuration of the main encoding unit 211 is shown in
[0185] The ISF encoding unit 2011 obtains an LP coefficient by applying the Levinson-Durbin method to the frame to be encoded and the look-ahead signal. The ISF encoding unit 2011 then converts the LP coefficient into an ISF parameter and encodes the ISF parameter. The ISF encoding unit 2011 then decodes the code and obtains a decoded ISF parameter. Finally, the ISF encoding unit 2011 interpolates the decoded ISF parameter and obtains a decoded LP coefficient for each sub-frame. The procedures of the Levinson-Durbin method and the conversion from the LP coefficient to the ISF parameter are the same as in the example 1. Further, for the encoding of the ISF parameter, the procedure described in, for example, section 6.8.2 in ITU-T G.718 can be used. An index obtained by encoding the ISF parameter, the decoded ISF parameter, and the decoded LP coefficient (which is obtained by converting the decoded ISF parameter into the LP coefficient) can be obtained by the ISF encoding unit 2011 (Step 224 in
[0186] The detailed procedure of the target signal calculation unit 2012 is the same as in Step 162 in
[0187] The pitch lag calculation unit 2013 refers to the adaptive codebook buffer and calculates a pitch lag and a long-term prediction parameter by using the target signal. The detailed procedure of the calculation of the pitch lag and the long-term prediction parameter is the same as in the example 1 (Step 226 in
[0188] The adaptive codebook calculation unit 2014 calculates an adaptive codebook vector by using the pitch lag and the long-term prediction parameter calculated by the pitch lag calculation unit 2013. The detailed procedure of the adaptive codebook calculation unit 2014 is the same as in the example 1 (Step 227 in
[0189] The fixed codebook calculation unit 2015 calculates a fixed codebook vector and an index obtained by encoding the fixed codebook vector by using the target signal and the adaptive codebook vector. The detailed procedure is the same as the procedure of AVQ used in the error signal encoding unit 214 (Step 228 in
[0190] The gain calculation unit 2016 calculates an adaptive codebook gain, a fixed codebook gain and an index obtained by encoding these two gains using the target signal, the adaptive codebook vector and the fixed codebook vector. A detailed procedure which can be used is described in, for example, section 6.8.4.1.6 in ITU-T G.718 (Step 229 in
[0191] The excitation vector calculation unit 2017 calculates an excitation vector by adding the adaptive codebook vector and the fixed codebook vector to which the gain is applied. The detailed procedure is the same as in example 1. Further, the excitation vector calculation unit 2017 updates the state of the adaptive codebook buffer 2019 by using the excitation vector. The detailed procedure is the same as in the example 1 (Step 2210 in
[0192] The synthesis filter 2018 synthesizes a decoded signal by using the decoded LP coefficient and the excitation vector (Step 2211 in
[0193] The above Steps 224 to 2211 are repeated for the M−M′ sub-frames until the end of the frame to be encoded.
[0194] The side information encoding unit 212 calculates the side information for the look-ahead signal M′ sub-frame. A specific procedure is the same as in the example 1 (Step 2212 in
[0195] In addition to the procedure of the example 1, the decoded signal output by the synthesis filter 157 of the side information encoding unit 212 is accumulated in the concealment signal accumulation unit 213 in the example 2 (Step 2213 in
[0196] <Decoding Unit>
[0197] As shown in
[0198] The audio code buffer 231 determines whether a packet is correctly received or not. When the audio code buffer 231 determines that a packet is correctly received, the processing is switched to the audio parameter decoding unit 232, the side information decoding unit 235 and the error signal decoding unit 237. On the other hand, when the audio code buffer 231 determines that a packet is not correctly received, the processing is switched to the audio parameter missing processing unit 233 (Step 241 in
[0199] <When Packet is Correctly Received>
[0200] The error signal decoding unit 237 decodes an error signal code and obtains a decoded error signal. As a specific example, a decoding method corresponding to the method used at the encoding end, such as the AVQ described in section 7.1.2.1.2 of ITU-T G.718, can be used (Step 242 in
[0201] A look-ahead excitation vector synthesis unit 2318 reads a concealment signal for one sub-frame from the concealment signal accumulation unit 238 and adds the concealment signal to the decoded error signal, and thereby outputs a decoded signal for one sub-frame (Step 243 in
[0202] The above Steps 241 to 243 are repeated for M′ sub-frames until the end of the concealment signal.
[0203] The audio parameter decoding unit 232 includes an ISF decoding unit 2211, a pitch lag decoding unit 2212, a gain decoding unit 2213, and a fixed codebook decoding unit 2214. The functional configuration example of the audio parameter decoding unit 232 is shown in
[0204] The ISF decoding unit 2211 decodes the ISF code and converts it into an LP coefficient and thereby obtains a decoded LP coefficient. For example, the procedure described in the section 7.1.1 in ITU-T G.718 is used (Step 244 in
[0205] The pitch lag decoding unit 2212 decodes a pitch lag code and obtains a pitch lag and a long-term prediction parameter (Step 245 in
[0206] The gain decoding unit 2213 decodes a gain code and obtains an adaptive codebook gain and a fixed codebook gain. An example detailed procedure is described in the section 7.1.2.1.3 in ITU-T G.718 (Step 246 in
[0207] An adaptive codebook calculation unit 2313 calculates an adaptive codebook vector by using the pitch lag and the long-term prediction parameter. The detailed procedure of the adaptive codebook calculation unit 2313 is as described in the example 1 (Step 247 in
[0208] The fixed codebook decoding unit 2214 decodes a fixed codebook code and calculates a fixed codebook vector. The detailed procedure is as described in the section 7.1.2.1.2 in ITU-T G.718 (Step 248 in
[0209] An excitation vector synthesis unit 2314 calculates an excitation vector by adding the adaptive codebook vector and the fixed codebook vector to which the gain is applied. Further, the excitation vector synthesis unit 2314 updates the adaptive codebook buffer by using the excitation vector (Step 249 in
[0210] A synthesis filter 2316 synthesizes a decoded signal by using the decoded LP coefficient and the excitation vector (Step 2410 in
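The synthesis filter step above is standard CELP machinery: the decoded signal is the excitation passed through the all-pole filter 1/A(z) built from the decoded LP coefficients. A minimal Python sketch of that general technique (not the patent's or G.718's exact implementation; the function name and direct-form recursion are assumptions) is:

```python
def lp_synthesis(excitation, lp_coeffs, memory=None):
    """All-pole synthesis filter 1/A(z): s[n] = e[n] - sum_i a_i * s[n-i]."""
    p = len(lp_coeffs)
    mem = list(memory) if memory is not None else [0.0] * p  # filter state
    out = []
    for e in excitation:
        s = e - sum(a * m for a, m in zip(lp_coeffs, mem))
        out.append(s)
        mem = [s] + mem[:-1]  # shift the newest sample into the state
    return out, mem
```

Returning the state lets the caller carry filter memory across sub-frames, mirroring how the synthesis filter is run per sub-frame in the text.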
[0211] The above Steps 244 to 2410 are repeated for M-M′ sub-frames until the end of the frame to be encoded.
[0212] The functional configuration of the side information decoding unit 235 is the same as in the example 1. The side information decoding unit 235 decodes the side information code and calculates a pitch lag (Step 2411 in
[0213] The functional configuration of the audio parameter missing processing unit 233 is the same as in the example 1.
[0214] The ISF prediction unit 191 predicts an ISF parameter using the ISF parameter for the previous frame and converts the predicted ISF parameter into an LP coefficient. The procedure is the same as in Steps 172, 173 and 174 of the example 1 shown in
[0215] The adaptive codebook calculation unit 2313 calculates an adaptive codebook vector by using the pitch lag output from the side information decoding unit 235 and an adaptive codebook 2312 (Step 2413 in
[0216] The adaptive codebook gain prediction unit 193 outputs an adaptive codebook gain.
[0217] A specific procedure is the same as in Step 1103 in
[0218] The fixed codebook gain prediction unit 194 outputs a fixed codebook gain. A specific procedure is the same as in Step 1104 in
[0219] The noise signal generation unit 195 outputs a noise, such as a white noise as a fixed codebook vector. The procedure is the same as in Step 1105 in
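As a sketch of this concealment step, the fixed codebook contribution can be replaced by scaled white noise; the uniform distribution, the seeding, and the function name below are illustrative assumptions, not the patent's specification:

```python
import random

def noise_fixed_codebook(length, gain=1.0, seed=0):
    """Concealment fixed codebook vector: white noise scaled by `gain`."""
    rng = random.Random(seed)  # seeded only to make this sketch reproducible
    return [gain * rng.uniform(-1.0, 1.0) for _ in range(length)]
```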
[0220] The excitation vector synthesis unit 2314 applies gain to each of the adaptive codebook vector and the fixed codebook vector and adds them together and thereby calculates an excitation vector. Further, the excitation vector synthesis unit 2314 updates the adaptive codebook buffer using the excitation vector (Step 2417 in
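The two operations in this step — applying the gains to the two codebook contributions and refreshing the adaptive codebook memory — can be sketched as follows (function names are assumptions; the buffer is modeled as a flat list of past excitation samples):

```python
def synthesize_excitation(adaptive_vec, fixed_vec, g_pitch, g_code):
    """e(n) = g_pitch * v(n) + g_code * c(n): sum of the gained contributions."""
    return [g_pitch * v + g_code * c for v, c in zip(adaptive_vec, fixed_vec)]

def update_adaptive_codebook(buffer, excitation):
    """Shift the buffer left and append the newest excitation samples."""
    return buffer[len(excitation):] + excitation
```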
[0221] The synthesis filter 2316 calculates a decoded signal using the above-described LP coefficient and the excitation vector. The synthesis filter 2316 then updates the concealment signal accumulation unit 238 using the calculated decoded signal (Step 2418 in
[0222] The above steps are repeated for M′ sub-frames, and the decoded signal is output as the audio signal.
[0223] <When a Packet is Lost>
[0224] A concealment signal for one sub-frame is read from the concealment signal accumulation unit and is used as the decoded signal (Step 2419 in
[0225] The above is repeated for M′ sub-frames.
[0226] The ISF prediction unit 191 predicts an ISF parameter (Step 2420 in
[0227] The pitch lag prediction unit 192 outputs a predicted pitch lag by using the pitch lag used in the past decoding (Step 2421 in
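The patent does not spell out the predictor here, but a simple illustration of predicting a pitch lag from past decoded lags — linear extrapolation with a clamped step, purely an assumed heuristic — is:

```python
def predict_pitch_lag(past_lags, max_delta=5):
    """Extrapolate the next pitch lag from the two most recent decoded lags,
    clamping the per-frame change to at most `max_delta` samples."""
    if len(past_lags) < 2:
        return past_lags[-1]  # not enough history: repeat the last lag
    trend = past_lags[-1] - past_lags[-2]
    trend = max(-max_delta, min(max_delta, trend))
    return past_lags[-1] + trend
```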
[0228] The operations of the adaptive codebook gain prediction unit 193, the fixed codebook gain prediction unit 194, the noise signal generation unit 195 and the audio synthesis unit 234 are the same as in the example 1 (Step 2422 in
[0229] The above steps are repeated for M sub-frames, and the decoded signal for M-M′ sub-frames is output as the audio signal, and the concealment signal accumulation unit 238 is updated by the decoded signal for the remaining M′ sub-frames.
Example 3
[0230] A case of using glottal pulse synchronization in the calculation of an adaptive codebook vector is described hereinafter.
[0231] <Encoding End>
[0232] The functional configuration of the audio signal transmitting device is the same as in example 1 except for the functional configuration and procedure of the side information encoding unit; therefore, only the operation of the side information encoding unit is described below.
[0233] The side information encoding unit includes an LP coefficient calculation unit 311, a pitch lag prediction unit 312, a pitch lag selection unit 313, a pitch lag encoding unit 314, and an adaptive codebook buffer 315. The functional configuration of an example of the side information encoding unit is shown in
[0234] The LP coefficient calculation unit 311 is the same as the LP coefficient calculation unit in example 1 and thus will not be redundantly described (Step 321 in
[0235] The pitch lag prediction unit 312 calculates a pitch lag predicted value T̂_p using the pitch lag obtained from the audio encoding unit (Step 322 in
[0236] Then, the pitch lag selection unit 313 determines a pitch lag to be transmitted as the side information (Step 323 in
[0237] First, a pitch lag codebook is generated from the pitch lag predicted value T̂_p and the values of the past pitch lags T̂_p^(−j) (0≤j<J) according to the following equations (Step 331 in
<When T̂_p − T̂_p^(−1) ≤ 0>
Here, T̂_p^(−1) is the value of the pitch lag one sub-frame before, I is the number of indexes in the codebook, δ_j is a predetermined step width, and ρ is a predetermined constant.
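The generating equations themselves are not reproduced above, but the idea — a small set of candidate lags spaced a step width apart around a point between the predicted lag and the previous lag — can be sketched as follows; the exact centring rule, the constant-step simplification of δ_j, and the function name are assumptions:

```python
def build_pitch_lag_codebook(t_pred, t_prev, num_indexes, step, rho):
    """Candidate lags spaced `step` apart, centred between the predicted lag
    t_pred and the previous lag t_prev (pulled toward t_prev by rho)."""
    centre = t_pred + rho * (t_prev - t_pred)
    half = num_indexes // 2
    return [centre + (j - half) * step for j in range(num_indexes)]
```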
[0238] Then, by using the adaptive codebook and the pitch lag predicted value T̂_p, an initial excitation vector u_0(n) is generated according to the following equation (Step 332 in
The procedure of calculating the initial excitation vector can be, for example, similar to equations (607) and (608) in ITU-T G.718.
[0239] Then, glottal pulse synchronization is applied to the initial excitation vector by using all candidate pitch lags T̂_C^j (0≤j<J) in the pitch lag codebook to thereby generate candidate adaptive codebook vectors u^j(n) (0≤j<I) (Step 333 in
[0240] For each candidate adaptive codebook vector u^j(n) (0≤j<I), a rate scale is calculated (Step 334 in
[0241] Instead of performing inverse filtering, segmental SNR may be calculated in the region of the adaptive codebook vector by using a residual signal according to the following equation.
In this case, a residual signal r(n) of the look-ahead signal s(n) (0<n<L′) is calculated by using the LP coefficient (Step 181 in
[0242] An index corresponding to the largest rate scale calculated in Step 334 is selected, and a pitch lag corresponding to the index is calculated (Step 335 in
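The selection in Steps 334 and 335 can be illustrated with segmental SNR as the rate scale, as the text permits; the function names and the use of a plain SNR over the whole vector are simplifying assumptions:

```python
import math

def segmental_snr(reference, candidate):
    """SNR in dB between a reference residual and one candidate vector."""
    sig = sum(r * r for r in reference)
    err = sum((r - c) ** 2 for r, c in zip(reference, candidate))
    if err == 0.0:
        return float('inf')  # exact match
    return 10.0 * math.log10(sig / err)

def select_best_candidate(reference, candidates):
    """Return the index of the candidate that maximises the rate scale."""
    return max(range(len(candidates)),
               key=lambda i: segmental_snr(reference, candidates[i]))
```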
[0243] <Decoding End>
[0244] The functional configuration of the audio signal receiving device is the same as in the example 1. Differences from the example 1 are the functional configuration and the procedure of the audio parameter missing processing unit 123, the side information decoding unit 125 and the side information accumulation unit 126, and only those are described hereinbelow.
[0245] <When Packet is Correctly Received>
[0246] The side information decoding unit 125 decodes the side information code, calculates a pitch lag T̂_C^idx, and stores it into the side information accumulation unit 126. The example procedure of the side information decoding unit 125 is shown in
[0247] In the calculation of the pitch lag, the pitch lag prediction unit 312 first calculates a pitch lag predicted value T̂_p by using the pitch lag obtained from the audio decoding unit (Step 341 in
[0248] Then, a pitch lag codebook is generated from the pitch lag predicted value T̂_p and the values of the past pitch lags T̂_p^(−j) (0≤j<J), according to the following equations (Step 342 in
<When T̂_p − T̂_p^(−1) ≥ 0>
The procedure is the same as in Step 331 in
[0249] Then, by referring to the pitch lag codebook, the pitch lag T̂_C^idx corresponding to the index idx transmitted as part of the side information is calculated and stored in the side information accumulation unit 126 (Step 343 in
[0250] <When Packet Loss is Detected>
[0251] Although the functional configuration of the audio synthesis unit is also the same as in the example 1 (which is the same as in
[0252] The audio parameter missing processing unit 123 reads the pitch lag from the side information accumulation unit 126 and calculates a pitch lag predicted value according to the following equation, and uses the calculated pitch lag predicted value instead of the output of the pitch lag prediction unit 192.
T̂_p = T̂_p^(−1) + κ·(T̂_C^idx − T̂_p^(−1))   (Equation 42)
where κ is a predetermined constant.
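Equation 42 is a simple convex step from the previous frame's lag toward the side-information lag. A direct transcription (with an illustrative default for the constant κ, which the patent leaves unspecified) is:

```python
def smoothed_pitch_lag(t_prev, t_side, kappa=0.5):
    """Equation 42: T_p = T_p_prev + kappa * (T_side - T_p_prev)."""
    return t_prev + kappa * (t_side - t_prev)
```

With κ=0 the side information is ignored entirely; with κ=1 it is adopted outright.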
[0253] Then, by using the adaptive codebook and the pitch lag predicted value T̂_p, an initial excitation vector u_0(n) is generated according to the following equation (Step 332 in
[0254] Then, glottal pulse synchronization is applied to the initial excitation vector by using the pitch lag T̂_C^idx to thereby generate an adaptive codebook vector u(n). For the glottal pulse synchronization, the same procedure as in Step 333 of
[0255] Hereinafter, an audio encoding program 70 that causes a computer having a processor to execute at least part of the above-described processing by the audio signal transmitting device is described. As shown in
[0256] The audio encoding program 70 includes functionality for an audio encoding module 700 and a side information encoding module 701. The functions implemented by executing the audio encoding module 700 and the side information encoding module 701 with a processor and/or other circuitry can be the same as at least some of the functions of the audio encoding unit 111 and the side information encoding unit 112 in the audio signal transmitting device described above, respectively.
[0257] Note that a part or the whole of the audio encoding program 70 may be transmitted through a transmission medium such as a communication line, and received and stored (including being installed) by another device. Further, the modules of the audio encoding program 70 may be installed, on a computer-readable medium, not in one computer but in any of a plurality of computers. In this case, the above-described processing of the audio encoding program 70 is performed by a computer system composed of the plurality of computers and corresponding processors.
[0258] Hereinafter, an audio decoding program 90 that causes a computer having a processor to execute at least part of the above-described processing by the audio signal receiving device is described. As shown in
[0259] The audio decoding program 90 includes functionality for an audio code buffer module 900, an audio parameter decoding module 901, a side information decoding module 902, a side information accumulation module 903, an audio parameter missing processing module 904, and an audio synthesis module 905. The functions implemented by executing the audio code buffer module 900, the audio parameter decoding module 901, the side information decoding module 902, the side information accumulation module 903, an audio parameter missing processing module 904 and the audio synthesis module 905 with a processor and/or other circuitry can be the same as at least some of the functions of the audio code buffer 231, the audio parameter decoding unit 232, the side information decoding unit 235, the side information accumulation unit 236, the audio parameter missing processing unit 233 and the audio synthesis unit 234 described above, respectively.
[0260] Note that a part or the whole of the audio decoding program 90 may be transmitted through a transmission medium such as a communication line, and received and stored (including being installed) by another device. Further, the modules of the audio decoding program 90 may be installed, on a computer-readable medium, not in one computer but in any of a plurality of computers. In this case, the above-described processing of the audio decoding program 90 is performed by a computer system composed of the plurality of computers and corresponding processors.
Example 4
[0261] An example that uses side information for pitch lag prediction at the decoding end is described hereinafter.
[0262] <Encoding End>
[0263] The functional configuration of the audio signal transmitting device is the same as in example 1. The functional configuration and the procedure differ only in the side information encoding unit 112, and therefore only the operation of the side information encoding unit 112 is described hereinbelow.
[0264] The functional configuration of an example of the side information encoding unit 112 is shown in
[0265] The LP coefficient calculation unit 511 is the same as the LP coefficient calculation unit 151 in example 1 shown in
[0266] The residual signal calculation unit 512 calculates a residual signal by the same processing as in Step 181 in example 1 shown in
[0267] The pitch lag calculation unit 513 calculates a pitch lag for each sub-frame by calculating k that maximizes the following equation (Step 163 in
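Although the maximised expression is not reproduced above, a standard open-loop criterion of this kind — normalised cross-correlation of the residual over a lag range — can be sketched as follows (the exact normalisation used in the patent is an assumption):

```python
import math

def pitch_lag_by_autocorr(r, k_min, k_max):
    """Return the lag k in [k_min, k_max] maximising the normalised
    correlation: sum r(n) r(n-k) / sqrt(sum r(n-k)^2)."""
    best_k, best_score = k_min, float('-inf')
    for k in range(k_min, k_max + 1):
        num = sum(r[n] * r[n - k] for n in range(k, len(r)))
        den = math.sqrt(sum(r[n - k] ** 2 for n in range(k, len(r)))) or 1.0
        score = num / den
        if score > best_score:
            best_k, best_score = k, score
    return best_k
```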
[0268] The adaptive codebook calculation unit 514 calculates an adaptive codebook vector v′(n) from the pitch lag T_p and the adaptive codebook u(n). The length of the adaptive codebook is N_adapt (Step 164 in
v′(n) = u(n + N_adapt − T_p)   (Equation 44)
[0269] The adaptive codebook buffer 515 updates its state with the adaptive codebook vector v′(n) (Step 166 in
u(n) = u(n + L′)   (0 ≤ n < N − L′)   (Equation 45)
u(n + N − L′) = v′(n)   (0 ≤ n < L)   (Equation 46)
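Equations 44 to 46 map directly onto list indexing: Equation 44 reads the adaptive codebook vector out of the tail of the buffer, and Equations 45 and 46 shift the buffer by L′ and write the new vector into the freed tail. A sketch (function names assumed; Equation 44's indexing assumes T_p is at least the sub-frame length, so no wrap-around is needed):

```python
def adaptive_codebook_vector(u, t_p, length):
    """Equation 44: v'(n) = u(n + N_adapt - T_p) for 0 <= n < length."""
    n_adapt = len(u)
    return [u[n + n_adapt - t_p] for n in range(length)]

def update_buffer(u, v_prime, l_prime):
    """Equations 45-46: drop the oldest L' samples, append v' at the tail."""
    return u[l_prime:] + v_prime[:l_prime]
```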
[0270] The pitch lag encoding unit 516 is the same as that in example 1 and thus not redundantly described (Step 169 in
[0271] <Decoding End>
[0272] The audio signal receiving device includes the audio code buffer 121, the audio parameter decoding unit 122, the audio parameter missing processing unit 123, the audio synthesis unit 124, the side information decoding unit 125, and the side information accumulation unit 126, just like in example 1. The procedure of the audio signal receiving device is as shown in
[0273] The operation of the audio code buffer 121 is the same as in example 1.
[0274] <When Packet is Correctly Received>
[0275] The operation of the audio parameter decoding unit 122 is the same as in the example 1.
[0276] The side information decoding unit 125 decodes the side information code, calculates a pitch lag T̂_p^(j) (0≤j<M_la), and stores it into the side information accumulation unit 126. The side information decoding unit 125 decodes the side information code by using the decoding method corresponding to the encoding method used at the encoding end.
[0277] The audio synthesis unit 124 is the same as that of example 1.
[0278] <When Packet Loss is Detected>
[0279] The ISF prediction unit 191 of the audio parameter missing processing unit 123 (see
[0280] An example procedure of the pitch lag prediction unit 192 is shown in
[0281] In the prediction of the pitch lag T̂_p^(i) (M_la≤i<M), the pitch lag prediction unit 192 may predict the pitch lag T̂_p^(i) (M_la≤i<M) by using the pitch lags T̂_p^(−j) (1≤j<J) used in the past decoding and the pitch lags T̂_p^(i) (0≤i<M_la). Further, T̂_p^(i)=T̂_p^(M
[0282] Further, the pitch lag prediction unit 192 may establish T̂_p^(i)=T̂_p^(M
[0283] The adaptive codebook gain prediction unit 193 and the fixed codebook gain prediction unit 194 are the same as those of the example 1.
[0284] The noise signal generation unit 195 is the same as that of the example 1.
[0285] The audio synthesis unit 124 synthesizes, from the parameters output from the audio parameter missing processing unit 123, an audio signal corresponding to the frame to be encoded.
[0286] The LP coefficient calculation unit 1121 of the audio synthesis unit 124 (see
[0287] The adaptive codebook calculation unit 1123 calculates an adaptive codebook vector in the same manner as in example 1. The adaptive codebook calculation unit 1123 may or may not perform filtering on the adaptive codebook vector. Specifically, the adaptive codebook vector is calculated using the following equation, where the filtering coefficients are f_i.
v(n) = f_{−1}·v′(n−1) + f_0·v′(n) + f_1·v′(n+1)   (Equation 47)
In the case of decoding a value that does not indicate filtering, v(n)=v′(n) is established (adaptive codebook calculation step A).
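Equation 47 is a 3-tap FIR smoothing of the adaptive codebook vector. In the sketch below the input carries one extra sample on each side so that v′(n−1) and v′(n+1) exist at the edges; the default coefficients are illustrative, not values taken from the patent:

```python
def filtered_adaptive_codebook(v_ext, coeffs=(0.18, 0.64, 0.18)):
    """Equation 47: v(n) = f_-1 v'(n-1) + f_0 v'(n) + f_1 v'(n+1).
    `v_ext` carries one extra sample of context at each end."""
    f_m1, f_0, f_1 = coeffs
    return [f_m1 * v_ext[n - 1] + f_0 * v_ext[n] + f_1 * v_ext[n + 1]
            for n in range(1, len(v_ext) - 1)]
```

With coeffs=(0, 1, 0) the filter degenerates to v(n)=v′(n), the no-filtering case mentioned in the text.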
[0288] The adaptive codebook calculation unit 1123 may calculate an adaptive codebook vector in the following procedure (adaptive codebook calculation step B).
[0289] An initial adaptive codebook vector is calculated using the pitch lag and the adaptive codebook 1122.
v(n) = f_{−1}·v′(n−1) + f_0·v′(n) + f_1·v′(n+1)   (Equation 48)
v(n)=v′(n) may be established according to a design policy.
[0290] Then, glottal pulse synchronization is applied to the initial adaptive codebook vector. For the glottal pulse synchronization, a procedure similar to the case where a pulse position is not available, as described, for example, in section 7.11.2.5 of ITU-T G.718, can be used. Note, however, that u(n) in ITU-T G.718 corresponds to v(n) in the described embodiment(s), the extrapolated pitch corresponds to T̂_p^(M−1) in the described embodiment(s), and the last reliable pitch (T_c) corresponds to T̂_p^(M
[0291] Further, in the case where the pitch lag prediction unit 192 outputs the above-described instruction information for the predicted value, when the instruction information indicates that the pitch lag transmitted as the side information should not be used as the predicted value (NO in Step 4082 in
[0292] The excitation vector synthesis unit 1124 outputs an excitation vector in the same manner as in example 1 (Step 11306 in
[0293] The post filter 1125 performs post processing on the synthesis signal in the same manner as in the example 1.
[0294] The adaptive codebook 1122 updates the state by using the excitation signal vector in the same manner as in the example 1 (Step 11308 in
[0295] The synthesis filter 1126 synthesizes a decoded signal in the same manner as in the example 1 (Step 11309 in
[0296] The perceptual weighting inverse filter 1127 applies a perceptual weighting inverse filter in the same manner as in example 1.
[0297] The audio parameter missing processing unit 123 stores the audio parameters (ISF parameter, pitch lag, adaptive codebook gain, fixed codebook gain) used in the audio synthesis unit 124 into the buffer in the same manner as in the example 1 (Step 145 in
Example 5
[0298] In this embodiment, a configuration is described in which a pitch lag is transmitted as side information only in a specific frame class, and otherwise a pitch lag is not transmitted.
[0299] <Transmitting End>
[0300] In the audio signal transmitting device, an input audio signal is sent to the audio encoding unit 111.
[0301] The audio encoding unit 111 in this example calculates an index representing the characteristics of a frame to be encoded and transmits the index to the side information encoding unit 112. The other operations are the same as in example 1.
[0302] In the side information encoding unit 112, a difference from the examples 1 to 4 is only with regard to the pitch lag encoding unit 158, and therefore the operation of the pitch lag encoding unit 158 is described hereinbelow. The configuration of the side information encoding unit 112 in the example 5 is shown in
[0303] The procedure of the pitch lag encoding unit 158 is shown in the example of
[0304] When the number of bits to be assigned to the side information is 1 bit (No in Step S022 in
[0305] On the other hand, when the number of bits to be assigned to the side information is B bits (Yes in Step S022 in
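The 1-bit/B-bit split described above can be sketched as a flag bit followed, only when side information is present, by a B−1-bit pitch-lag index; the exact packing order and the function names are assumptions:

```python
def encode_side_info(transmit, lag_index=0, b_bits=4):
    """Pack 1 flag bit, then (B-1) index bits when side info is present."""
    if not transmit:
        return [0]            # lone flag bit: no side information follows
    bits = [1]                # flag bit: side information follows
    for i in reversed(range(b_bits - 1)):
        bits.append((lag_index >> i) & 1)  # MSB-first index bits
    return bits

def decode_side_info(bits, b_bits=4):
    """Mirror of encode_side_info; returns None when the flag says 'not sent'."""
    if bits[0] == 0:
        return None
    idx = 0
    for b in bits[1:b_bits]:
        idx = (idx << 1) | b
    return idx
```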
[0306] <Decoding End>
[0307] The audio signal receiving device includes the audio code buffer 121, the audio parameter decoding unit 122, the audio parameter missing processing unit 123, the audio synthesis unit 124, the side information decoding unit 125, and the side information accumulation unit 126, just like in example 1. The procedure of the audio signal receiving device is as shown in
[0308] The operation of the audio code buffer 121 is the same as in example 1.
[0309] <When Packet is Correctly Received>
[0310] The operation of the audio parameter decoding unit 122 is the same as in example 1.
[0311] The procedure of the side information decoding unit 125 is shown in the example of
[0312] On the other hand, when the side information index indicates transmission of the side information, the side information decoding unit 125 further decodes B−1 bits, calculates a pitch lag T̂_p^(j) (0≤j<M_la), and stores the calculated pitch lag in the side information accumulation unit 126 (Step S033 in
[0313] The audio synthesis unit 124 is the same as that of example 1.
[0314] <When Packet Loss is Detected>
[0315] The ISF prediction unit 191 of the audio parameter missing processing unit 123 (see
[0316] The procedure of the pitch lag prediction unit 192 is shown in the example of
[0317] <When the Side Information Index is a Value Indicating Transmission of Side Information>
[0318] In the same manner as in example 1, the side information code is read from the side information accumulation unit 126 to obtain a pitch lag T̂_p^(i) (0≤i<M_la) (Step S043 in
[0319] Further, the pitch lag prediction unit 192 may establish T̂_p^(i)=T̂_p^(M
[0320] <When the Side Information Index is a Value Indicating Non-Transmission of Side Information>
[0321] In the prediction of the pitch lag T̂_p^(i) (M_la≤i<M), the pitch lag prediction unit 192 predicts the pitch lag T̂_p^(i) (0≤i<M) by using the pitch lags T̂_p^(−j) (1≤j<J) used in the past decoding (Step S048 in
[0322] Further, the pitch lag prediction unit 192 may establish T̂_p^(i)=T̂_p^(−1) only when the reliability of the pitch lag predicted value is low (Step S049 in
[0323] The adaptive codebook gain prediction unit 193 and the fixed codebook gain prediction unit 194 are the same as those of example 1.
[0324] The noise signal generation unit 195 is the same as that of the example 1.
[0325] The audio synthesis unit 124 synthesizes, from the parameters output from the audio parameter missing processing unit 123, an audio signal which corresponds to the frame to be encoded.
[0326] The LP coefficient calculation unit 1121 of the audio synthesis unit 124 (see
[0327] The procedure of the adaptive codebook calculation unit 1123 is shown in the example of
v(n) = f_{−1}·v′(n−1) + f_0·v′(n) + f_1·v′(n+1)   (Equation 49)
Note that v(n)=v′(n) may be established according to the design policy.
[0328] By referring to the pitch lag instruction information, when the reliability of the predicted value is high (NO in Step S052 in
[0329] First, the initial adaptive codebook vector is calculated using the pitch lag and the adaptive codebook 1122 (Step S053 in
v(n) = f_{−1}·v′(n−1) + f_0·v′(n) + f_1·v′(n+1)   (Equation 50)
v(n)=v′(n) may be established according to the design policy.
[0330] Then, glottal pulse synchronization is applied to the initial adaptive codebook vector. For the glottal pulse synchronization, a similar procedure can be used as in the example of the case where a pulse position is not available in section 7.11.2.5 in ITU-T G.718 (Step S054 in
[0331] The excitation vector synthesis unit 1124 outputs an excitation signal vector in the same manner as in the example 1 (Step 11306 in
[0332] The post filter 1125 performs post processing on the synthesis signal in the same manner as in example 1.
[0333] The adaptive codebook 1122 updates the state using the excitation signal vector in the same manner as in the example 1 (Step 11308 in
[0334] The synthesis filter 1126 synthesizes a decoded signal in the same manner as in example 1 (Step 11309 in
[0335] The perceptual weighting inverse filter 1127 applies a perceptual weighting inverse filter in the same manner as in example 1.
[0336] The audio parameter missing processing unit 123 stores the audio parameters (ISF parameter, pitch lag, adaptive codebook gain, fixed codebook gain) used in the audio synthesis unit 124 into the buffer in the same manner as in example 1 (Step 145 in
REFERENCE SIGNS LIST
[0337] 60, 80 . . . storage medium, 61, 81 . . . program storage area, 70 . . . audio encoding program, 90 . . . audio decoding program, 111 . . . audio encoding unit, 112 . . . side information encoding unit, 121, 231 . . . audio code buffer, 122, 232 . . . audio parameter decoding unit, 123, 233 . . . audio parameter missing processing unit, 124, 234 . . . audio synthesis unit, 125, 235 . . . side information decoding unit, 126, 236 . . . side information accumulation unit, 151, 511, 1121 . . . LP coefficient calculation unit, 152, 2012 . . . target signal calculation unit, 153, 513, 2013 . . . pitch lag calculation unit, 154, 1123, 514, 2014, 2313 . . . adaptive codebook calculation unit, 155, 1124, 2314 . . . excitation vector synthesis unit, 156, 315, 515, 2019 . . . adaptive codebook buffer, 157, 1126, 2018, 2316 . . . synthesis filter, 158, 516 . . . pitch lag encoding unit, 191 . . . ISF prediction unit, 192 . . . pitch lag prediction unit, 193 . . . adaptive codebook gain prediction unit, 194 . . . fixed codebook gain prediction unit, 195 . . . noise signal generation unit, 211 . . . main encoding unit, 212 . . . side information encoding unit, 213, 238 . . . concealment signal accumulation unit, 214 . . . error signal encoding unit, 237 . . . error signal decoding unit, 311 . . . LP coefficient calculation unit, 312 . . . pitch lag prediction unit, 313 . . . pitch lag selection unit, 314 . . . pitch lag encoding unit, 512 . . . residual signal calculation unit, 700 . . . audio encoding module, 701 . . . side information encoding module, 900 . . . audio parameter decoding module, 901 . . . audio parameter missing processing module, 902 . . . audio synthesis module, 903 . . . side information decoding module, 1128 . . . side information output determination unit, 1122, 2312 . . . adaptive codebook, 1125 . . . post filter, 1127 . . . perceptual weighting inverse filter, 2011 . . . ISF encoding unit, 2015 . . . fixed codebook calculation unit, 2016 . . . 
gain calculation unit, 2017 . . . excitation vector calculation unit, 2211 . . . ISF decoding unit, 2212 . . . pitch lag decoding unit, 2213 . . . gain decoding unit, 2214 . . . fixed codebook decoding unit, 2318 . . . look-ahead excitation vector synthesis unit