Encoding apparatus and decoding apparatus for transforming between modified discrete cosine transform-based coder and different coder
09773505 · 2017-09-26
Assignee
- Electronics And Telecommunications Research Institute (Daejeon, KR)
- KWANGWOON UNIVERSITY INDUSTRY-ACADEMIC COLLABORATION FOUNDATION (Seoul, KR)
Inventors
- Seung Kwon BEACK (Daejeon, KR)
- Tae Jin Lee (Daejeon, KR)
- Min Je Kim (Daejeon, KR)
- Dae Young Jang (Daejeon, KR)
- Kyeongok Kang (Daejeon, KR)
- Jin Woo Hong (Daejeon, KR)
- Ho Chong Park (Seongnam-si, KR)
- Young-cheol Park (Wonju-si, KR)
CPC classification
International classification
- G10L19/00 (PHYSICS)
- G10L19/20
- G10L19/02
Abstract
An encoding apparatus and a decoding apparatus in a transform between a Modified Discrete Cosine Transform (MDCT)-based coder and a different coder are provided. The encoding apparatus may encode additional information to restore an input signal encoded according to the MDCT-based coding scheme, when switching occurs between the MDCT-based coder and the different coder. Accordingly, an unnecessary bitstream may be prevented from being generated, and minimum additional information may be encoded.
Claims
1. An encoding apparatus, comprising: a first encoder configured to encode a previous frame for a speech characteristic signal in an input signal according to Code Excited Linear Prediction (CELP); and a second encoder configured to encode a current frame for an audio characteristic signal in the input signal according to a Modified Discrete Cosine Transform (MDCT), wherein, when switching occurs from the previous frame for the speech characteristic signal to the current frame for the audio characteristic signal in the input signal, the first encoder encodes additional MDCT information extracted from the previous frame, the first encoder encodes the additional MDCT information in the speech characteristic signal for an overlap-add operation between the previous frame and the current frame, the current frame is decoded according to the MDCT by applying a first window to the additional MDCT information, applying a second window to the current frame, and performing, in a decoding apparatus, an overlap-add between the current frame to which the second window is applied and the additional MDCT information to which the first window is applied, the first window is applied to the additional MDCT information to remove time domain aliasing generated during the MDCT, and the additional MDCT information is extracted from a delayed block in the previous frame with respect to a block of the current frame.
2. A decoding apparatus, comprising: a first decoder configured to decode a previous frame for a speech characteristic signal in an input signal encoded according to Code Excited Linear Prediction (CELP); and a second decoder configured to decode a current frame for an audio characteristic signal in the input signal encoded according to a Modified Discrete Cosine Transform (MDCT), wherein, when switching occurs from the previous frame for the speech characteristic signal to the current frame for the audio characteristic signal in the input signal, the first decoder decodes additional MDCT information extracted from the previous frame, the second decoder decodes the current frame for the audio characteristic signal by performing an overlap-add operation according to the MDCT between the previous frame and the current frame, the additional MDCT information is determined in the speech characteristic signal for the overlap-add operation between the previous frame and the current frame, the current frame is decoded according to the MDCT by applying a first window to the additional MDCT information, applying a second window to the current frame, and performing an overlap-add between the current frame to which the second window is applied and the additional MDCT information to which the first window is applied, the first window is applied to the additional MDCT information to remove time domain aliasing generated during the MDCT, and the additional MDCT information is extracted from a delayed block in the previous frame with respect to a block of the current frame.
3. An encoding apparatus, comprising: a first encoder configured to encode a previous frame for a speech characteristic signal in an input signal according to Code Excited Linear Prediction (CELP); a second encoder configured to encode a current frame for an audio characteristic signal in the input signal according to a Modified Discrete Cosine Transform (MDCT); and a block delay circuit configured to delay a previous block with respect to a first block to be encoded by the second encoder when switching occurs between the speech characteristic signal and the audio characteristic signal in the input signal, wherein, when the switching occurs from the previous frame for the speech characteristic signal to the current frame for the audio characteristic signal, the first encoder encodes additional MDCT information extracted from the previous frame to be processed based on the CELP, and wherein the additional MDCT information is used to decode the current frame for the audio characteristic signal according to the MDCT by performing an overlap-add operation between the previous frame and the current frame at a folding point in a decoding process.
Description
BRIEF DESCRIPTION OF DRAWINGS
BEST MODE FOR CARRYING OUT THE INVENTION
(19) Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like reference numerals refer to the like elements throughout. The embodiments are described below in order to explain the present invention by referring to the figures.
(21) The encoding apparatus 101 may generate a bitstream by encoding an input signal for each block. In this instance, the encoding apparatus 101 may encode a speech characteristic signal and an audio characteristic signal. The speech characteristic signal may have a similar characteristic to a voice signal, and the audio characteristic signal may have a similar characteristic to an audio signal. The bitstream with respect to an input signal may be generated as a result of the encoding, and be transmitted to the decoding apparatus 102. The decoding apparatus 102 may generate an output signal by decoding the bitstream, and thereby may restore the encoded input signal.
(22) Specifically, the encoding apparatus 101 may analyze a state of the continuously inputted signal, and switch to enable an encoding scheme corresponding to the characteristic of the input signal to be applied according to a result of the analysis. Accordingly, the encoding apparatus 101 may encode blocks where a coding scheme is applied. For example, the encoding apparatus 101 may encode the speech characteristic signal according to a Code Excited Linear Prediction (CELP) scheme, and encode the audio characteristic signal according to a Modified Discrete Cosine Transform (MDCT) scheme. Conversely, the decoding apparatus 102 may restore the input signal by decoding the input signal, encoded according to the CELP scheme, according to the CELP scheme and by decoding the input signal, encoded according to the MDCT scheme, according to the MDCT scheme.
(23) In this instance, when the input signal is switched to the audio characteristic signal from the speech characteristic signal, the encoding apparatus 101 may encode by switching from the CELP scheme to the MDCT scheme. Since the encoding is performed for each block, a blocking artifact may be generated. In this instance, the decoding apparatus 102 may remove the blocking artifact through an overlap-add operation among blocks.
(24) Also, when a current block of the input signal is encoded according to the MDCT scheme, MDCT information of a previous block is required to restore the input signal. However, when the previous block is encoded according to the CELP scheme, since MDCT information of the previous block does not exist, the current block may not be restored according to the MDCT scheme. Accordingly, additional MDCT information of the previous block is required. Also, the encoding apparatus 101 may reduce the additional MDCT information, and thereby may prevent a bit rate from increasing.
(27) The block delay unit 201 may delay an input signal for each block. The input signal may be processed for each block for encoding. The block delay unit 201 may delay back (−) or delay ahead (+) the inputted current block.
(28) The state analysis unit 202 may determine a characteristic of the input signal. For example, the state analysis unit 202 may determine whether the input signal is a speech characteristic signal or an audio characteristic signal. In this instance, the state analysis unit 202 may output a control parameter. The control parameter may be used to determine which encoding scheme is used to encode the current block of the input signal.
(29) For example, the state analysis unit 202 may analyze the characteristic of the input signal, and determine, as the speech characteristic signal, a signal period corresponding to (1) a steady-harmonic (SH) state showing a clear and stable harmonic component, (2) a low steady harmonic (LSH) state showing a strong steady characteristic in a low frequency bandwidth and showing a harmonic component of a relatively long period, and (3) a steady-noise (SN) state which is a white noise state. Also, the state analysis unit 202 may analyze the characteristic of the input signal, and determine, as the audio characteristic signal, a signal period corresponding to (4) a complex-harmonic (CH) state showing a complex harmonic structure where various tone components are combined, and (5) a complex-noisy (CN) state including unstable noise components. Here, the signal period may correspond to a block unit of the input signal.
(30) The signal cutting unit 203 may extract a sub-set from the block-unit input signal.
(31) The first encoding unit 204 may encode the speech characteristic signal from among input signals of the block unit. For example, the first encoding unit 204 may encode the speech characteristic signal in a time domain according to Linear Predictive Coding (LPC). In this instance, the first encoding unit 204 may encode the speech characteristic signal according to a CELP-based coding scheme.
(32) The second encoding unit 205 may encode the audio characteristic signal from among the input signals of the block unit. For example, the second encoding unit 205 may transform the audio characteristic signal from the time domain to the frequency domain to perform encoding. In this instance, the second encoding unit 205 may encode the audio characteristic signal according to an MDCT-based coding scheme. A result of the first encoding unit 204 and a result of the second encoding unit 205 may each be generated as a bitstream, and the bitstreams generated by the encoding units may be combined into a single bitstream through a bitstream multiplexer (MUX).
(33) That is, the encoding apparatus 101 may encode the input signal through any one of the first encoding unit 204 and the second encoding unit 205, by switching depending on a control parameter of the state analysis unit 202. Also, the first encoding unit 204 may encode the speech characteristic signal of the input signal according to the coding scheme different from the MDCT-based coding scheme. Also, the second encoding unit 205 may encode the audio characteristic signal of the input signal according to the MDCT-based coding scheme.
(37) The window processing unit 301 may apply an analysis window to a current frame of the input signal. Specifically, the window processing unit 301 may apply the analysis window to a current block X(b) and a delayed block X(b−2). The current block X(b) may be delayed back to the previous block X(b−2) through the block delay unit 201.
(38) For example, the window processing unit 301 may apply an analysis window, which does not exceed a folding point, to the current frame, when a folding point where switching occurs between a speech characteristic signal and an audio characteristic signal exists in the current frame. In this instance, the window processing unit 301 may apply the analysis window which is configured as a window which has a value of 0 and corresponds to a first sub-block, a window corresponding to an additional information area of a second sub-block, and a window which has a value of 1 and corresponds to a remaining area of the second sub-block based on the folding point. Here, the first sub-block may indicate the speech characteristic signal, and the second sub-block may indicate the audio characteristic signal.
(39) A degree of block delay, performed by the block delay unit 201, may vary depending on a block unit of the input signal. When the input signal passes through the window processing unit 301, the analysis window may be applied, and thus {X(b−2), X(b)} ⊗ W_analysis may be extracted. Accordingly, the MDCT unit 302 may perform an MDCT with respect to the current frame where the analysis window is applied. Also, the bitstream generation unit 303 may encode the current frame and generate a bitstream of the input signal.
(42) When the current block X(b) is inputted, the window processing unit 301 may apply the analysis window to the current block X(b) and the previous block X(b−2). Here, the previous block X(b−2) may be delayed back by the block delay unit 201. For example, the block X(b) may be set as a basic unit of the input signal according to Equation 1 given below. In this instance, two blocks may be set as a single frame and encoded.
X(b) = [s(b−1), s(b)]^T  [Equation 1]
(43) In this instance, s(b) may denote a sub-block configuring a single block, and may be defined by,
s(b) = [s((b−1)·N/4), s((b−1)·N/4+1), …, s((b−1)·N/4+N/4−1)]^T  [Equation 2]
(44) s(n): a sample of an input signal
(45) Here, N may denote a size of a block of the input signal. That is, a plurality of blocks may be included in the input signal, and each of the blocks may include two sub-blocks. A number of sub-blocks included in a single block may vary depending on a system configuration and the input signal.
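The block and sub-block indexing of Equations 1 and 2 can be sketched in Python. This is a minimal illustration only: the function names are ours rather than the patent's, and plain lists stand in for sample vectors.

```python
def sub_block(samples, b, N):
    """s(b): the b-th sub-block of N/4 consecutive input samples (Equation 2)."""
    start = (b - 1) * (N // 4)
    return samples[start:start + N // 4]

def block(samples, b, N):
    """X(b) = [s(b-1), s(b)]^T: two consecutive sub-blocks (Equation 1)."""
    return sub_block(samples, b - 1, N) + sub_block(samples, b, N)

# With N = 8, each sub-block holds 2 samples and each block holds 4.
samples = list(range(16))
assert block(samples, 2, 8) == [0, 1, 2, 3]
```

Consecutive blocks X(b) and X(b+1) share the sub-block s(b), which is what makes the 50% overlap between frames possible.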
(46) For example, the analysis window may be defined according to Equation 3 given as below. Also, according to Equation 2 and Equation 3, a result of applying the analysis window to a current block of the input signal may be represented as Equation 4.
W_analysis = [w_1, w_2, w_3, w_4]^T
w_i = [w_i(0), …, w_i(N/4−1)]^T  [Equation 3]
[X(b−2), X(b)]^T ⊗ W_analysis = [s((b−2)·N/4)·w_1(0), …, s((b−1)·N/4+N/4−1)·w_4(N/4−1)]^T  [Equation 4]
(47) W_analysis may denote the analysis window, which has a symmetric characteristic.
(48) The MDCT unit 302 may perform an MDCT with respect to the input signal where the analysis window is processed.
(50) An input signal configured as a block unit and an analysis window applied to the input signal are illustrated in the drawings.
(51) The encoding apparatus 101 may apply an analysis window W_analysis to the input signal. The input signal may be divided into four sub-blocks X_1(Z), X_2(Z), X_3(Z), X_4(Z) included in a current frame, and the analysis window may be divided into W_1(Z), W_2(Z), W_2^H(Z), W_1^H(Z). Also, when an MDCT/quantization/Inverse MDCT (IMDCT) is applied to the input signal based on the folding point dividing the sub-blocks, an original area and an aliasing area may occur.
(52) The decoding apparatus 102 may apply a synthesis window to the encoded input signal, remove aliasing generated during the MDCT operation through an overlap-add operation, and thereby may extract an output signal.
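The aliasing cancellation described above can be illustrated with a small pure-Python sketch. It uses the common textbook MDCT/IMDCT definitions and a sine window satisfying the Princen-Bradley condition; the function names and the 2/N scaling are illustrative assumptions, not taken from the patent. Overlap-adding two windowed, transformed, and inverse-transformed frames recovers the overlapped samples exactly:

```python
import math

def mdct(x):
    # 2N time samples in, N coefficients out (textbook MDCT definition)
    N = len(x) // 2
    return [sum(x[n] * math.cos(math.pi / N * (n + 0.5 + N / 2) * (k + 0.5))
                for n in range(2 * N))
            for k in range(N)]

def imdct(X):
    # N coefficients in, 2N aliased time samples out; the 2/N scaling
    # matches a sine window applied at both analysis and synthesis
    N = len(X)
    return [2.0 / N * sum(X[k] * math.cos(math.pi / N * (n + 0.5 + N / 2) * (k + 0.5))
                          for k in range(N))
            for n in range(2 * N)]

def sine_window(L):
    # satisfies the Princen-Bradley condition: w[n]^2 + w[n + L//2]^2 == 1
    return [math.sin(math.pi / L * (n + 0.5)) for n in range(L)]

# Two 50%-overlapped frames of a toy signal: the time-domain aliasing
# left by each IMDCT cancels in the overlap-add of the windowed outputs.
N = 8
x = [math.sin(0.3 * n) for n in range(3 * N)]
w = sine_window(2 * N)
f1 = [x[n] * w[n] for n in range(2 * N)]        # frame 1 covers x[0:2N]
f2 = [x[N + n] * w[n] for n in range(2 * N)]    # frame 2 covers x[N:3N]
y1 = [v * w[n] for n, v in enumerate(imdct(mdct(f1)))]
y2 = [v * w[n] for n, v in enumerate(imdct(mdct(f2)))]
mid = [y1[N + n] + y2[n] for n in range(N)]     # overlap-add region
assert all(abs(mid[n] - x[N + n]) < 1e-9 for n in range(N))
```

This is exactly why a frame whose neighbor was CELP-coded cannot be restored on its own: the overlap-add needs a second windowed IMDCT output, which the additional MDCT information supplies.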
(56) When encoding is performed according to an MDCT-based coding scheme, the decoding apparatus 102 may remove the blocking artifact through an overlap-add operation using both a previous block and a current block. However, when switching occurs between the speech characteristic signal and the audio characteristic signal, as in the C1 and the C2, an MDCT-based overlap-add operation may not be performed, and additional information for MDCT-based decoding may be required. For example, additional information s_oL(b−1) may be required in the C1, and additional information s_hL(b+m) may be required in the C2. According to an embodiment of the present invention, an increase in a bit rate may be prevented, and a coding efficiency may be improved, by minimizing the additional information s_oL(b−1) and the additional information s_hL(b+m).
(57) When switching occurs between the speech characteristic signal and the audio characteristic signal, the encoding apparatus 101 may encode the additional information to restore the audio characteristic signal. In this instance, the additional information may be encoded by the first encoding unit 204 encoding the speech characteristic signal. Specifically, in the C1, an area corresponding to the additional information s_oL(b−1) in the speech characteristic signal s(b−2) may be encoded as the additional information. Also, in the C2, an area corresponding to the additional information s_hL(b+m) in the speech characteristic signal s(b+m+1) may be encoded as the additional information.
(58) An encoding method when the C1 and the C2 occur is described in detail below.
(60) When a block X(b) of an input signal is inputted, the state analysis unit 202 may analyze a state of the corresponding block. In this instance, when the block X(b) is an audio characteristic signal and a block X(b−2) is a speech characteristic signal, the state analysis unit 202 may recognize that the C1 occurs at a folding point existing between the block X(b) and the block X(b−2). Accordingly, control information about the occurrence of the C1 may be transmitted to the block delay unit 201, the window processing unit 301, and the first encoding unit 204.
(61) When the block X(b) of the input signal is inputted, the block X(b) and a block X(b+2) may be inputted to the window processing unit 301. The block X(b+2) may be delayed ahead (+2) through the block delay unit 201. Accordingly, an analysis window may be applied to the block X(b) and the block X(b+2) when the C1 occurs.
(62) Also, to generate the additional information s_oL(b−1) for an overlap-add operation with respect to the block X(b), the block delay unit 201 may extract a block X(b−1) by delaying back the block X(b). The block X(b−1) may include the sub-blocks s(b−2) and s(b−1). Also, the signal cutting unit 203 may extract the additional information s_oL(b−1) from the block X(b−1) through signal cutting.
(63) For example, the additional information s_oL(b−1) may be determined by,
s_oL(b−1) = [s((b−2)·N/4), …, s((b−2)·N/4+oL−1)]^T, 0 < oL ≤ N/4  [Equation 5]
(64) In this instance, N may denote a size of a block for MDCT.
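In code, the extraction of Equation 5 amounts to slicing the first oL samples of sub-block s(b−2). The helper name below is hypothetical, and samples is assumed to be a flat list of input samples:

```python
def additional_info_oL(samples, b, N, oL):
    # s_oL(b-1): the first oL samples of sub-block s(b-2), per Equation 5
    assert 0 < oL <= N // 4, "additional information area cannot exceed a sub-block"
    start = (b - 2) * (N // 4)
    return samples[start:start + oL]
```

Because only oL of the N/4 samples are carried, the additional information stays small relative to a full sub-block.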
(65) The first encoding unit 204 may encode an area corresponding to the additional information of the speech characteristic signal for overlapping among blocks based on the folding point where switching occurs between the speech characteristic signal and the audio characteristic signal. For example, the first encoding unit 204 may encode the additional information s_oL(b−1) corresponding to an additional information area (oL) in the sub-block s(b−2), which is the speech characteristic signal. That is, the first encoding unit 204 may generate a bitstream of the additional information s_oL(b−1) by encoding the additional information extracted by the signal cutting unit 203. When the C1 occurs, the first encoding unit 204 may generate only the bitstream of the additional information s_oL(b−1), which may be used as additional information to remove a blocking artifact.
(66) As another example, when the additional information s_oL(b−1) can be obtained from the encoding of the block X(b−1), the first encoding unit 204 may not encode the additional information s_oL(b−1).
(69) For example, the window processing unit 301 may apply the analysis window. The analysis window may be configured as a window which has a value of 0 and corresponds to a first sub-block, a window corresponding to an additional information area of a second sub-block, and a window which has a value of 1 and corresponds to a remaining area of the second sub-block based on the folding point. The first sub-block may indicate the speech characteristic signal, and the second sub-block may indicate the audio characteristic signal.
(71) In this instance, the window processing unit 301 may set the analysis window w_z to a value of zero with respect to the sub-block corresponding to the speech characteristic signal. Also, the window processing unit 301 may determine an analysis window ŵ_2 corresponding to the sub-block s(b−1), which is the audio characteristic signal, according to Equation 6.
ŵ_2 = [w_oL(0), …, w_oL(oL−1), 1, …, 1]^T  [Equation 6]
(73) That is, the analysis window ŵ_2 applied to the sub-block s(b−1) may include an additional information area (oL) and a remaining area (N/4−oL). In this instance, the remaining area may be configured as 1.
(74) In this instance, w_oL may denote a first half of a sine window having a size of 2×oL. The additional information area (oL) may denote a size for an overlap-add operation among blocks in the C1, and may determine a size of each of w_oL and s_oL(b−1). Also, a block sample X_c1 = [X_c1^l, X_c1^h]^T may be defined for the following description in a block sample 800.
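The C1 transition window described above, a rising first half of a sine window of size 2×oL over the additional information area followed by ones over the remaining N/4−oL samples, could be built as follows. The function name is illustrative, not from the patent:

```python
import math

def w_hat_2(N, oL):
    # rising first half of a sine window of size 2*oL over the
    # additional-information area, then ones over the rest of the sub-block
    rise = [math.sin(math.pi / (2 * oL) * (n + 0.5)) for n in range(oL)]
    return rise + [1.0] * (N // 4 - oL)
```

The rising part sits next to the folding point, so a windowed overlap-add with the additional information s_oL(b−1) is possible exactly there.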
(75) For example, the first encoding unit 204 may encode a portion corresponding to the additional information area in a sub-block, which is a speech characteristic signal, for overlapping among blocks based on the folding point.
(78) When a block X(b) of an input signal is inputted, the state analysis unit 202 may analyze a state of a corresponding block.
(79) When a block X(b+m−1) of the input signal is inputted, the block X(b+m−1) and a block X(b+m+1), which is delayed ahead (+2) through the block delay unit 201, may be inputted to the window processing unit 301. Accordingly, the analysis window may be applied to the block X(b+m+1) and the block X(b+m−1) when the C2 occurs.
(80) For example, when the C2 occurs at the folding point between the speech characteristic signal and the audio characteristic signal in a current frame of the input signal, the window processing unit 301 may apply the analysis window, which does not exceed the folding point, to the audio characteristic signal.
(81) The MDCT unit 302 may perform an MDCT with respect to the blocks X(b+m+1) and X(b+m−1) to which the analysis window is applied. A block where the MDCT is performed may be encoded through the bitstream generation unit 303, and thus a bitstream of the block X(b+m−1) of the input signal may be generated.
(82) Also, to generate the additional information s_hL(b+m) for an overlap-add operation with respect to the block X(b+m−1), the block delay unit 201 may extract a block X(b+m) by delaying ahead (+1) the block X(b+m−1). The block X(b+m) may include the sub-blocks s(b+m−1) and s(b+m). Also, the signal cutting unit 203 may extract only the additional information s_hL(b+m) through signal cutting with respect to the block X(b+m).
(83) For example, the additional information s_hL(b+m) may be determined by,
s_hL(b+m) = [s((b+m−1)·N/4), …, s((b+m−1)·N/4+hL−1)]^T, 0 < hL ≤ N/4  [Equation 7]
In this instance, N may denote a size of a block for MDCT.
(84) The first encoding unit 204 may encode the additional information s_hL(b+m) and generate a bitstream of the additional information s_hL(b+m). That is, when the C2 occurs, the first encoding unit 204 may generate only the bitstream of the additional information s_hL(b+m). When the C2 occurs, the additional information s_hL(b+m) may be used as additional information to remove a blocking artifact.
(87) For example, when a folding point where switching occurs exists between the audio characteristic signal and the speech characteristic signal in the current frame of the input signal, the window processing unit 301 may apply an analysis window which does not exceed the folding point to the audio characteristic signal. That is, the window processing unit 301 may apply the analysis window to the sub-block s(b+m) of the blocks X(b+m+1) and X(b+m−1).
(88) Also, the window processing unit 301 may apply the analysis window. The analysis window may be configured as a window which has a value of 0 and corresponds to a first sub-block, a window corresponding to an additional information area of a second sub-block, and a window which has a value of 1 and corresponds to a remaining area of the second sub-block based on the folding point. The first sub-block may indicate the speech characteristic signal, and the second sub-block may indicate the audio characteristic signal.
(89) That is, the window processing unit 301 may set the analysis window w_z to a value of zero. Here, the analysis window w_z may correspond to the sub-block s(b+m+1), which is the speech characteristic signal. Also, the window processing unit 301 may determine an analysis window ŵ_3 corresponding to the sub-block s(b+m), which is the audio characteristic signal, according to Equation 8.
ŵ_3 = [1, …, 1, w_hL(0), …, w_hL(hL−1)]^T  [Equation 8]
(91) That is, the analysis window ŵ_3, applied to the sub-block s(b+m) indicating the audio characteristic signal based on the folding point, may include an additional information area (hL) and a remaining area (N/4−hL). In this instance, the remaining area may be configured as 1.
(92) In this instance, w_hL may denote a second half of a sine window having a size of 2×hL. The additional information area (hL) may denote a size for an overlap-add operation among blocks in the C2, and may determine a size of each of w_hL and s_hL(b+m). Also, a block sample X_c2 = [X_c2^l, X_c2^h]^T may be defined for the following description in a block sample 1000.
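Symmetrically to the C1 case, the C2 transition window described above, ones over the first N/4−hL samples and then the falling second half of a sine window of size 2×hL over the additional information area next to the folding point, might be sketched as follows. The function name is again an illustrative assumption:

```python
import math

def w_hat_3(N, hL):
    # ones first, then the falling second half of a sine window of size 2*hL
    # over the additional-information area adjacent to the folding point
    fall = [math.sin(math.pi / (2 * hL) * (n + hL + 0.5)) for n in range(hL)]
    return [1.0] * (N // 4 - hL) + fall
```

The falling part mirrors the rising part used in the C1 case, so the two transition windows together still satisfy the overlap-add condition around their folding points.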
(93) For example, the first encoding unit 204 may encode a portion corresponding to the additional information area in a sub-block, which is a speech characteristic signal, for overlapping among blocks based on the folding point.
(96) Additional information 1101 may correspond to a portion of a sub-block indicating a speech characteristic signal based on a folding point C1, and additional information 1102 may correspond to a portion of a sub-block indicating a speech characteristic signal based on a folding point C2. In this instance, a synthesis window in which a first half (oL) of the additional information 1101 is reflected may be applied to a sub-block corresponding to an audio characteristic signal behind the C1 folding point, and a remaining area (N/4−oL) may be set to 1. Also, a synthesis window in which a second half (hL) of the additional information 1102 is reflected may be applied to a sub-block corresponding to an audio characteristic signal ahead of the C2 folding point, and a remaining area (N/4−hL) may be set to 1.
(99) The block delay unit 1201 may delay back or ahead a block according to a control parameter (C1 and C2) included in an inputted bitstream.
(100) Also, the decoding apparatus 102 may switch a decoding scheme depending on the control parameter of the inputted bitstream to enable any one of the first decoding unit 1202 and the second decoding unit 1203 to decode the bitstream. In this instance, the first decoding unit 1202 may decode an encoded speech characteristic signal, and the second decoding unit 1203 may decode an encoded audio characteristic signal. For example, the first decoding unit 1202 may decode the speech characteristic signal according to a CELP-based coding scheme, and the second decoding unit 1203 may decode the audio characteristic signal according to an MDCT-based coding scheme.
(101) A result of decoding through the first decoding unit 1202 and the second decoding unit 1203 may be extracted as a final output signal through the block compensation unit 1204.
(102) The block compensation unit 1204 may perform block compensation with respect to the result of the first decoding unit 1202 and the result of the second decoding unit 1203 to restore the input signal. For example, when a folding point where switching occurs between the speech characteristic signal and the audio characteristic signal exists in a current frame of the input signal, the block compensation unit 1204 may apply a synthesis window which does not exceed the folding point.
(103) In this instance, the block compensation unit 1204 may apply a first synthesis window to additional information, and apply a second synthesis window to the current frame, to perform an overlap-add operation. Here, the additional information may be extracted by the first decoding unit 1202, and the current frame may be extracted by the second decoding unit 1203. The second synthesis window may be configured as a window which has a value of 0 and corresponds to a first sub-block, a window corresponding to an additional information area of a second sub-block, and a window which has a value of 1 and corresponds to a remaining area of the second sub-block based on the folding point. The first sub-block may indicate the speech characteristic signal, and the second sub-block may indicate the audio characteristic signal. The block compensation unit 1204 is described in detail below.
(106) The bitstream restoration unit 1301 may decode an inputted bitstream. Also, the IMDCT unit 1302 may transform a decoded signal to a sample in a time domain through an IMDCT.
(107) A block Y(b), transformed through the IMDCT unit 1302, may be delayed back through the block delay unit 1201 and inputted to the window synthesis unit 1303, or may be directly inputted to the window synthesis unit 1303 without the delay. In this instance, the block Y(b) may have a value of Y(b) = [X̂̃(b−2), X̂̃(b)]^T, and may be a current block encoded through the second encoding unit 205.
(108) The window synthesis unit 1303 may apply the synthesis window to the inputted block Y(b) and a delayed block Y(b−2). When the C1 and C2 do not occur, the window synthesis unit 1303 may identically apply the synthesis window to the blocks Y(b) and Y(b−2).
(109) For example, the window synthesis unit 1303 may apply the synthesis window to the block Y(b) according to Equation 9.
[X̂̃(b−2), X̂̃(b)]^T ⊗ W_synthesis = [s((b−2)·N/4)·w_1(0), …, s((b−1)·N/4+N/4−1)·w_4(N/4−1)]^T  [Equation 9]
(110) In this instance, the synthesis window W_synthesis may be identical to the analysis window W_analysis.
(111) The overlap-add operation unit 1304 may perform a 50% overlap-add operation with respect to a result of applying the synthesis window to the blocks Y(b) and Y(b−2). A result X̃(b−2) obtained by the overlap-add operation unit 1304 may be given by,
$$\tilde{X}(b-2)=\left(\left[\tilde{\hat{X}}(b-2)\right]^{T}\otimes\left[w_{1},w_{2}\right]^{T}\right)\oplus\left(\left[{}_{p}\tilde{\hat{X}}(b-2)\right]^{T}\otimes\left[w_{3},w_{4}\right]^{T}\right)\qquad\text{[Equation 10]}$$
(112) In this instance, [{tilde over ({circumflex over (X)})}(b−2)].sup.T and .sub.p[{tilde over ({circumflex over (X)})}(b−2)].sup.T may be associated with the block Y(b) and the block Y(b−2), respectively. Referring to Equation 10, {tilde over (X)}(b−2) may be obtained by performing an overlap-add operation with respect to a result of combining [{tilde over ({circumflex over (X)})}(b−2)].sup.T and a first half [w.sub.1, w.sub.2].sup.T of the synthesis window, and a result of combining .sub.p[{tilde over ({circumflex over (X)})}(b−2)].sup.T and a second half [w.sub.3, w.sub.4].sup.T of the synthesis window.
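The windowing and 50% overlap-add of Equations 9 and 10 can be exercised end to end with a textbook MDCT/IMDCT pair. The transform definitions, sine window, and 2/N scaling below are standard textbook choices assumed for illustration; only the windowed overlap-add structure mirrors the patent's Equations 9 and 10.

```python
import numpy as np

def mdct(x):
    """2N windowed time samples -> N MDCT coefficients."""
    N = len(x) // 2
    n, k = np.arange(2 * N), np.arange(N)
    basis = np.cos(np.pi / N * (n[None, :] + 0.5 + N / 2) * (k[:, None] + 0.5))
    return basis @ x

def imdct(X):
    """N coefficients -> 2N time-aliased samples (before windowing/OLA)."""
    N = len(X)
    n, k = np.arange(2 * N), np.arange(N)
    basis = np.cos(np.pi / N * (n[:, None] + 0.5 + N / 2) * (k[None, :] + 0.5))
    return (2.0 / N) * (basis @ X)

N = 64
w = np.sin(np.pi / (2 * N) * (np.arange(2 * N) + 0.5))  # analysis = synthesis
rng = np.random.default_rng(1)
x = rng.standard_normal(4 * N)

out = np.zeros(4 * N)
for start in (0, N, 2 * N):                      # 50%-overlapped blocks
    y = imdct(mdct(x[start:start + 2 * N] * w))  # time-aliased block Y(b)
    out[start:start + 2 * N] += y * w            # synthesis window, then overlap-add

# Time-domain aliasing cancels wherever two windowed blocks overlap.
print(np.allclose(out[N:3 * N], x[N:3 * N]))  # True
```

Each block alone is aliased; only the sum of two overlapped, windowed blocks restores the signal, which is exactly why the patent needs additional information when one overlap partner was coded by CELP instead.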
(113)
(114) Windows 1401, 1402, and 1403 illustrated in
(115) That is, referring to
(116) However, when the block 1404 is the speech characteristic signal and the block 1405 is the audio characteristic signal, that is, when the C1 occurs, an overlap-add operation may not be performed since MDCT information is not included in the block 1404. In this instance, MDCT additional information of the block 1404 may be required for the overlap-add operation. Conversely, when the block 1404 is the audio characteristic signal and the block 1405 is the speech characteristic signal, that is, when the C2 occurs, an overlap-add operation may not be performed since the MDCT information is not included in the block 1405. In this instance, the MDCT additional information of the block 1405 may be required for the overlap-add operation.
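The failure described above can be made concrete. IMDCT(MDCT(·)) has a known closed form as time-domain aliasing (first half: x[n] − x[N−1−n]; second half: x[n] + x[3N−1−n]), so no transform code is needed. The sketch shows that the overlap half with a missing (CELP-coded) partner stays aliased, and that adding exactly the missing windowed half, which is the role of the MDCT additional information, restores the samples. Variable names are illustrative, not the patent's.

```python
import numpy as np

def tda(xw):
    """Time-domain aliasing equivalent to IMDCT(MDCT(xw)) for a 2N block."""
    N = len(xw) // 2
    return np.concatenate([xw[:N] - xw[:N][::-1], xw[N:] + xw[N:][::-1]])

N = 8
w = np.sin(np.pi / (2 * N) * (np.arange(2 * N) + 0.5))  # assumed sine window
rng = np.random.default_rng(2)
x = rng.standard_normal(3 * N)

cur = w * tda(w * x[N:3 * N])   # current MDCT block, covering samples [N, 3N)
only_cur = cur[:N]              # overlap region [N, 2N) with no MDCT partner
assert not np.allclose(only_cur, x[N:2 * N])  # aliasing is NOT cancelled

# The missing partner is the second half of the previous windowed block --
# the information that must be supplied when the previous frame is CELP-coded.
prev = w * tda(w * x[0:2 * N])
restored = only_cur + prev[N:]
assert np.allclose(restored, x[N:2 * N])      # overlap-add now cancels aliasing
```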
(117)
(118) The C1 may denote a folding point where the audio characteristic signal is generated after the speech characteristic signal in the current frame 800. In this instance, the folding point may be located at a point of N/4 in the current frame 800.
(119) The bitstream restoration unit 1301 may decode the inputted bitstream. Subsequently, the IMDCT unit 1302 may perform an IMDCT with respect to a result of the decoding. The window synthesis unit 1303 may apply the synthesis window to a block {tilde over ({circumflex over (X)})}.sub.c1.sup.h in the current frame 800 of the input signal encoded by the second encoding unit 205. That is, the second decoding unit 1203 may decode a block s(b) and a block s(b+1) which are not adjacent to the folding point in the current frame 800 of the input signal.
(120) In this instance, different from
(121) The result of applying the synthesis window to the block {tilde over ({circumflex over (X)})}.sub.c1.sup.h may be given by,
$$\tilde{X}_{c1}^{h}=\tilde{\hat{X}}_{c1}^{h}\otimes\left[w_{3},w_{4}\right]^{T}\qquad\text{[Equation 11]}$$
(122) The block {tilde over (X)}.sub.c1.sup.h may be used as a block signal for overlap with respect to the current frame 800.
(123) Only the input signal corresponding to the block {tilde over ({circumflex over (X)})}.sub.c1.sup.h in the current frame 800 may be restored by the second decoding unit 1203. Accordingly, since only the block {tilde over ({circumflex over (X)})}.sub.c1.sup.l may exist in the current frame 800, the overlap-add operation unit 1304 may restore an input signal corresponding to the block {tilde over ({circumflex over (X)})}.sub.c1.sup.l where the overlap-add operation is not performed. The block {tilde over ({circumflex over (X)})}.sub.c1.sup.l may be a block where the synthesis window is not applied by the second decoding unit 1203 in the current frame 800. Also, the first decoding unit 1202 may decode additional information included in a bitstream, and thereby may output a sub-block {tilde over ({tilde over (s)})}.sub.oL(b−1).
(124) The block {tilde over ({circumflex over (X)})}.sub.c1.sup.l, extracted by the second decoding unit 1203, and the sub-block {tilde over ({tilde over (s)})}.sub.oL(b−1), extracted by the first decoding unit 1202, may be inputted to the block compensation unit 1204. A final output signal may be generated by the block compensation unit 1204.
(125)
(126) The block compensation unit 1204 may perform block compensation with respect to the result of the first decoding unit 1202 and the result of the second decoding unit 1203, and thereby may restore the input signal. For example, when a folding point where switching occurs between a speech characteristic signal and an audio characteristic signal exists in a current frame of the input signal, the block compensation unit 1204 may apply a synthesis window which does not exceed the folding point.
(127) In
$$\tilde{s}'_{oL}(b-1)=\tilde{\tilde{s}}_{oL}(b-1)\otimes w_{oL}^{\gamma}\qquad\text{[Equation 12]}$$
(128) Also, the block {tilde over ({circumflex over (X)})}.sub.c1.sup.l, extracted by the overlap-add operation unit 1304, may be applied to a synthesis window 1601 through the block compensation unit 1204.
(129) For example, the block compensation unit 1204 may apply a synthesis window to the current frame 800. Here, the synthesis window may be configured as a window which has a value of 0 and corresponds to a first sub-block, a window corresponding to an additional information area of a second sub-block, and a window which has a value of 1 and corresponds to a remaining area of the second sub-block based on the folding point. The first sub-block may indicate the speech characteristic signal, and the second sub-block may indicate the audio characteristic signal. The block {tilde over (X)}′.sub.c1.sup.l where the synthesis window 1601 is applied may be represented as,
(130)
(131) That is, the synthesis window may be applied to the block {tilde over (X)}′.sub.c1.sup.l. The synthesis window may include an area W.sub.1 of 0, and have an area corresponding to the sub-block {tilde over (ŝ)}(b−1) which is identical to ŵ.sub.2 in
$$\tilde{\hat{s}}(b-1)=\left[\tilde{s}_{oL}(b-1),\ \tilde{\hat{s}}_{N/4-oL}(b-1)\right]^{T}\qquad\text{[Equation 14]}$$
(132) Here, when the block compensation unit 1204 performs an overlap-add operation with respect to an area W.sub.oL in the synthesis windows 1601 and 1602, the sub-block {tilde over (s)}.sub.oL(b−1) corresponding to an area (oL) may be extracted from the sub-block {tilde over (ŝ)}(b−1). In this instance, the sub-block {tilde over (s)}.sub.oL(b−1) may be determined according to Equation 15. Also, a sub-block {tilde over ({circumflex over (s)})}.sub.N/4-oL(b−1), corresponding to a remaining area excluding the area (oL) from the sub-block {tilde over (ŝ)}(b−1), may be determined according to Equation 16.
$$\tilde{s}_{oL}(b-1)=\tilde{s}'_{oL}(b-1)\oplus\tilde{\hat{s}}'_{oL}(b-1)\qquad\text{[Equation 15]}$$
$$\tilde{\hat{s}}_{N/4-oL}(b-1)=\left[\tilde{\hat{s}}((b-2)\cdot N/4+oL),\ \ldots,\ \tilde{\hat{s}}((b-2)\cdot N/4+N/4-1)\right]^{T}\qquad\text{[Equation 16]}$$
(133) Accordingly, an output signal {tilde over (s)}(b−1) may be extracted by the block compensation unit 1204.
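The oL-area combination of Equations 12 and 15 is, in effect, a crossfade: the sub-block decoded from the additional information is weighted by a fade-out window (the w_oL^γ of Equation 12) while the MDCT-side block is weighted by the complementary fade-in, and the two are overlap-added. The squared-cosine window pair below is an assumed example of complementary windows; the patent's windows 1601 and 1602 need only combine consistently over the oL area.

```python
import numpy as np

oL = 64
n = np.arange(oL)
w_fade_out = np.cos(np.pi / (2 * oL) * (n + 0.5)) ** 2  # analogue of window 1602
w_fade_in = 1.0 - w_fade_out                            # analogue of window 1601

rng = np.random.default_rng(3)
true_signal = rng.standard_normal(oL)
s_celp = true_signal.copy()  # sub-block decoded from additional information
s_mdct = true_signal.copy()  # same samples as decoded on the MDCT side

# Equation 12 / Equation 15 analogue: weight each contribution, overlap-add.
s_out = s_celp * w_fade_out + s_mdct * w_fade_in
print(np.allclose(s_out, true_signal))  # True: the window pair sums to one
```

When both decoders agree on the oL samples, any complementary pair reconstructs them exactly; when they differ slightly, the crossfade suppresses the discontinuity at the switch.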
(134)
(135) The C2 may denote a folding point where the speech characteristic signal is generated after the audio characteristic signal in the current frame 1000. In this instance, the folding point may be located at a point of 3N/4 in the current frame 1000.
(136) The bitstream restoration unit 1301 may decode the inputted bitstream. Subsequently, the IMDCT unit 1302 may perform an IMDCT with respect to a result of the decoding. The window synthesis unit 1303 may apply the synthesis window to a block {tilde over ({circumflex over (X)})}.sub.c2.sup.l in the current frame 1000 of the input signal encoded by the second encoding unit 205. That is, the second decoding unit 1203 may decode a block s(b+m−2) and a block s(b+m−1) which are not adjacent to the folding point in the current frame 1000 of the input signal.
(137) In this instance, different from
(138) The result of applying the synthesis window to the block {tilde over ({circumflex over (X)})}.sub.c2.sup.l may be given by,
$$\tilde{X}_{c2}^{l}=\tilde{\hat{X}}_{c2}^{l}\otimes\left[w_{1},w_{2}\right]^{T}\qquad\text{[Equation 17]}$$
(139) The block {tilde over (X)}.sub.c2.sup.l may be used as a block signal for overlap with respect to the current frame 1000.
(140) Only the input signal corresponding to the block {tilde over ({circumflex over (X)})}.sub.c2.sup.l in the current frame 1000 may be restored by the second decoding unit 1203. Accordingly, since only the block {tilde over ({circumflex over (X)})}.sub.c2.sup.h may exist in the current frame 1000, the overlap-add operation unit 1304 may restore an input signal corresponding to the block {tilde over ({circumflex over (X)})}.sub.c2.sup.h where the overlap-add operation is not performed. The block {tilde over ({circumflex over (X)})}.sub.c2.sup.h may be a block where the synthesis window is not applied by the second decoding unit 1203 in the current frame 1000. Also, the first decoding unit 1202 may decode additional information included in a bitstream, and thereby may output a sub-block {tilde over ({tilde over (s)})}.sub.hL(b+m).
(141) The block {tilde over ({circumflex over (X)})}.sub.c2.sup.h, extracted by the second decoding unit 1203, and the sub-block {tilde over ({tilde over (s)})}.sub.hL(b+m), extracted by the first decoding unit 1202, may be inputted to the block compensation unit 1204. A final output signal may be generated by the block compensation unit 1204.
(142)
(143) The block compensation unit 1204 may perform block compensation with respect to the result of the first decoding unit 1202 and the result of the second decoding unit 1203, and thereby may restore the input signal. For example, when a folding point where switching occurs between a speech characteristic signal and an audio characteristic signal exists in a current frame of the input signal, the block compensation unit 1204 may apply a synthesis window which does not exceed the folding point.
(144) In
$$\tilde{s}'_{hL}(b+m)=\tilde{\tilde{s}}_{hL}(b+m)\otimes w_{hL}^{\gamma}\qquad\text{[Equation 18]}$$
(145) Also, the block {tilde over ({circumflex over (X)})}.sub.c2.sup.h, extracted by the overlap-add operation unit 1304, may be applied to a synthesis window 1801 through the block compensation unit 1204. For example, the block compensation unit 1204 may apply a synthesis window to the current frame 1000. Here, the synthesis window may be configured as a window which has a value of 0 and corresponds to a first sub-block, a window corresponding to an additional information area of a second sub-block, and a window which has a value of 1 and corresponds to a remaining area of the second sub-block based on the folding point. The first sub-block may indicate the speech characteristic signal, and the second sub-block may indicate the audio characteristic signal. The block {tilde over (X)}′.sub.c2.sup.h where the synthesis window 1801 is applied may be represented as,
(146)
(147) That is, the synthesis window 1801 may be applied to the block {tilde over (X)}′.sub.c2.sup.h. The synthesis window 1801 may include an area of 0 corresponding to the sub-block s(b+m), and have an area corresponding to the sub-block s(b+m+1) which is identical to ŵ.sub.3 in
$$\tilde{s}(b+m)=\left[\tilde{\hat{s}}_{N/4-hL}(b+m),\ \tilde{s}_{hL}(b+m)\right]^{T}\qquad\text{[Equation 20]}$$
(148) Here, when the block compensation unit 1204 performs an overlap-add operation with respect to an area W.sub.hL in the synthesis windows 1801 and 1802, the sub-block {tilde over (s)}.sub.hL(b+m) corresponding to an area (hL) may be extracted from the sub-block {tilde over (s)}(b+m). In this instance, the sub-block {tilde over (s)}.sub.hL(b+m) may be determined according to Equation 21. Also, a sub-block {tilde over (ŝ)}.sub.N/4-hL(b+m), corresponding to a remaining area excluding the area (hL) from the sub-block {tilde over (s)}(b+m), may be determined according to Equation 22.
$$\tilde{s}_{hL}(b+m)=\tilde{s}'_{hL}(b+m)\oplus\tilde{\hat{s}}'_{hL}(b+m)\qquad\text{[Equation 21]}$$
$$\tilde{\hat{s}}_{N/4-hL}(b+m)=\left[\tilde{\hat{s}}((b+m-1)\cdot N/4),\ \ldots,\ \tilde{\hat{s}}((b+m-1)\cdot N/4+hL-1)\right]^{T}\qquad\text{[Equation 22]}$$
(149) Accordingly, an output signal {tilde over (s)}(b+m) may be extracted by the block compensation unit 1204.
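The C2 case mirrors C1 in time: the compensation window is zero after the folding point (the speech characteristic signal now follows the audio characteristic signal), and the hL transition precedes it rather than following it. A hypothetical sketch of this symmetry, with the window shape and parameter names assumed rather than taken from the patent, is to time-reverse the C1-style window:

```python
import numpy as np

def c1_window(frame_len, fold, L):
    """C1-style compensation window: 0 before the fold, a rising
    transition of length L, then 1 (shape assumed for illustration)."""
    win = np.zeros(frame_len)
    n = np.arange(L)
    win[fold:fold + L] = np.sin(np.pi / (2 * L) * (n + 0.5))
    win[fold + L:] = 1.0
    return win

frame_len, L = 1024, 128
w_c1 = c1_window(frame_len, fold=256, L=L)  # folding point at N/4
w_c2 = w_c1[::-1]                            # mirrored: folding point at 3N/4
print(np.all(w_c2[frame_len - 256:] == 0.0))  # True: zero after the 3N/4 fold
```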
(150) Although a few embodiments of the present invention have been shown and described, the present invention is not limited to the described embodiments. Instead, it would be appreciated by those skilled in the art that changes may be made to these embodiments without departing from the principles and spirit of the invention, the scope of which is defined by the claims and their equivalents.