Filtering strength determination method, moving picture coding method and moving picture decoding method
09854240 · 2017-12-26
Inventors
- Teck Wee Foo (Singapore, SG)
- Chong Soon Lim (Singapore, SG)
- Sheng Mei Shen (Singapore, SG)
- Shinya KADONO (Fukuoka, JP)
Cpc classification (Section H, Electricity)
- H04N19/91
- H04N19/577
- H04N19/134
- H04N19/86
- H04N19/159
- H04N19/573
- H04N19/90
- H04N19/157
- H04N19/44
- H04N19/139
International classification (Section H, Electricity)
- H04N7/12
- H04N19/86
- H04N19/90
- H04N19/573
- H04N19/91
- H04N19/44
- H04N19/577
- H04N19/139
- H04N19/157
- H04N19/14
- H04N19/134
Abstract
A moving picture coding apparatus includes an inter-pixel filter having filters for filtering decoded image data so as to remove block distortion which is high frequency noise around block boundaries. The inter-pixel filter includes filters having different filtering strengths. The coding apparatus also includes a filter processing control unit for determining a filtering strength of the inter-pixel filter.
Claims
1. A decoding apparatus which decodes a block in a P-picture and a B-picture, the decoding apparatus comprising: a non-transitory memory storing a program; and a hardware processor that executes the program and causes the decoding apparatus to: generate a predictive image for a current block to be decoded by referring to one reference picture in the case that the current block is a block in the P-picture and by referring to one or two reference pictures in the case that the current block is a block in the B-picture; decode coded data of the current block in a bit stream to obtain a decoded difference image between the current block and the predictive image of the current block, the coded data in the bit stream being generated by coding a transform coefficient that indicates a spatial frequency component resulting from an orthogonal transformation and a quantization of the difference image; generate a reconstructed block by adding the decoded difference image and the predictive image; determine a predetermined filtering strength from among a filtering strength corresponding to filtering not being performed, a weakest filtering strength, a second-weakest filtering strength, a third-weakest filtering strength, and a strongest filtering strength; remove a coding distortion between the current block and a neighboring block adjacent to the current block by performing a filtering on the current block and the neighboring block with the predetermined filtering strength; and store the reconstructed block for which a coding distortion is removed, into a memory, wherein, (a) in the case where both of the current block and the neighboring block are blocks in a P-picture and contain coded data of a transform coefficient in the bit stream, the determining of the predetermined filtering strength selects, as the filtering strength, a second-weakest filtering strength among the plurality of the filtering strengths, and (b) in the case where both of the current block and the 
neighboring block are blocks in a B-picture and contain coded data of a transform coefficient in the bit stream, the determining of the predetermined filtering strength selects, as the filtering strength, a second-weakest filtering strength among the plurality of the filtering strengths, and (c) in the case where both of the current block and the neighboring block are blocks in a P-picture and do not contain coded data of a transform coefficient in the bit stream, the determining of the predetermined filtering strength: selects, as the filtering strength, a weakest filtering strength among the plurality of the filtering strengths, excluding the one filtering strength corresponding to no filtering being performed, when the reference picture referred to by the current block and the reference picture referred to by the neighboring block are not the same; and selects, as the filtering strength, one of (i) a weakest filtering strength among the plurality of the filtering strengths, excluding the one filtering strength corresponding to no filtering being performed, and (ii) the filtering strength corresponding to no filtering being performed, when the reference picture referred to by the current block and the reference picture referred to by the neighboring block are the same, and (d) in the case where both of the current block and the neighboring block are blocks in a B-picture and do not contain coded data of a transform coefficient in the bit stream, the determining of the predetermined filtering strength: selects, as the filtering strength, a weakest filtering strength among the plurality of the filtering strengths, excluding the one filtering strength corresponding to no filtering being performed, when the number of reference pictures referred to by the current block and the number of reference pictures referred to by the neighboring block are not the same; and selects, as the filtering strength, one of (i) a weakest filtering strength among the plurality of the 
filtering strengths, excluding the one filtering strength corresponding to no filtering being performed, and (ii) the filtering strength corresponding to no filtering being performed, when the number of reference pictures referred to by the current block and the number of reference pictures referred to by the neighboring block are the same.
2. A decoding method for decoding a block in a P-picture and a B-picture, the decoding method comprising: generating a predictive image for a current block to be decoded by referring to one reference picture in the case that the current block is a block in the P-picture and by referring to one or two reference pictures in the case that the current block is a block in the B-picture; decoding coded data of the current block in a bit stream to obtain a decoded difference image between the current block and the predictive image of the current block, the coded data in the bit stream being generated by coding a transform coefficient that indicates a spatial frequency component resulting from an orthogonal transformation and a quantization of the difference image; generating a reconstructed block by adding the decoded difference image and the predictive image; determining a predetermined filtering strength from among a filtering strength corresponding to filtering not being performed, a weakest filtering strength, a second-weakest filtering strength, a third-weakest filtering strength, and a strongest filtering strength; removing a coding distortion between the current block and a neighboring block adjacent to the current block by performing a filtering on the current block and the neighboring block with the predetermined filtering strength; and storing the reconstructed block for which a coding distortion is removed, into a memory, wherein, (a) in the case where both of the current block and the neighboring block are blocks in a P-picture and contain coded data of a transform coefficient in the bit stream, the determining of the predetermined filtering strength selects, as the filtering strength, a second-weakest filtering strength among the plurality of the filtering strengths, and (b) in the case where both of the current block and the neighboring block are blocks in a B-picture and contain coded data of a transform coefficient in the bit stream, the determining of the 
predetermined filtering strength selects, as the filtering strength, a second-weakest filtering strength among the plurality of the filtering strengths, and (c) in the case where both of the current block and the neighboring block are blocks in a P-picture and do not contain coded data of a transform coefficient in the bit stream, the determining of the predetermined filtering strength: selects, as the filtering strength, a weakest filtering strength among the plurality of the filtering strengths, excluding the one filtering strength corresponding to no filtering being performed, when the reference picture referred to by the current block and the reference picture referred to by the neighboring block are not the same; and selects, as the filtering strength, one of (i) a weakest filtering strength among the plurality of the filtering strengths, excluding the one filtering strength corresponding to no filtering being performed, and (ii) the filtering strength corresponding to no filtering being performed, when the reference picture referred to by the current block and the reference picture referred to by the neighboring block are the same, and (d) in the case where both of the current block and the neighboring block are blocks in a B-picture and do not contain coded data of a transform coefficient in the bit stream, the determining of the predetermined filtering strength: selects, as the filtering strength, a weakest filtering strength among the plurality of the filtering strengths, excluding the one filtering strength corresponding to no filtering being performed, when the number of reference pictures referred to by the current block and the number of reference pictures referred to by the neighboring block are not the same; and selects, as the filtering strength, one of (i) a weakest filtering strength among the plurality of the filtering strengths, excluding the one filtering strength corresponding to no filtering being performed, and (ii) the filtering strength 
corresponding to no filtering being performed, when the number of reference pictures referred to by the current block and the number of reference pictures referred to by the neighboring block are the same.
Description
BRIEF DESCRIPTION OF DRAWINGS
DETAILED DESCRIPTION OF THE INVENTION
(14) The following explains embodiments of the present invention with reference to the figures.
(16) As shown in
(17) The picture memory 101 stores a moving picture which has been inputted in display order, on a picture-by-picture basis. A “picture” here is a unit of coding, i.e. a screen, which includes frames and fields. The motion vector estimation unit 107, using as a reference picture a picture which has already been coded and decoded, estimates, on a block-by-block basis, a motion vector indicating the position deemed most appropriate within a search area in that picture. Furthermore, the motion vector estimation unit 107 notifies the motion compensation coding unit 109 and the motion vector storage unit 108 of the estimated motion vector.
(18) The motion compensation coding unit 109 determines, using the motion vector estimated by the motion vector estimation unit 107, the coding mode used for coding a block, and generates predictive image data on the basis of that coding mode. A coding mode, which indicates the method to be used for coding a macroblock, specifies which of non-intra picture coding (motion compensated coding), intra picture coding, and the like should be performed on the macroblock. For example, when the correlation between pictures is weak and intra picture coding is therefore more suitable than motion prediction, intra picture coding is selected. The selected coding mode is notified to the filter processing control unit 110. The motion vector and the coding mode are notified from the motion compensation coding unit 109 to the bit stream generation unit 104. The motion vector storage unit 108 stores the motion vector estimated by the motion vector estimation unit 107.
(19) The difference calculation unit 102 calculates the difference between a picture read out from the picture memory 101 and the predictive image data inputted by the motion compensation coding unit 109 so as to generate prediction residual image data. The prediction residual coding unit 103 performs coding processing such as orthogonal transform and quantization on the input prediction residual image data, and generates coded data. The bit stream generation unit 104 performs variable length coding and other processing on the coded data generated by the prediction residual coding unit 103, and generates a bit stream after adding, to such coded data, motion vector information and coding mode information and the like inputted by the motion compensation coding unit 109.
(20) The prediction residual decoding unit 105 performs decoding processing such as inverse quantization and inverse orthogonal transform on the coded data so as to generate decoded differential image data. The adder 106 adds the decoded differential image data inputted by the prediction residual decoding unit 105 to the predictive image data inputted by the motion compensation coding unit 109 so as to generate decoded image data. The picture memory 111 stores the decoded image data to which filtering has been applied.
(21) The filter processing control unit 110 selects a filtering strength of the inter-pixel filter 114 according to the input motion vector information and the coding mode information, i.e. selects which one of the following should be used: a filter A114a; a filter B114b; a filter C114c; a filter D114d; and no-filtering (skip), and controls the switch 112 and the switch 113. The switch 112 and the switch 113 are switches which selectively connect to one of their respective terminals “1”˜“5” under the control of the filter processing control unit 110. The switch 113 is placed between the output terminal of the adder 106 and the input terminal of the inter-pixel filter 114. Meanwhile, the switch 112 is placed between the input terminal of the picture memory 111 and the output terminal of the inter-pixel filter 114.
(22) The inter-pixel filter 114, which is a deblocking filter that filters decoded image data so as to remove block distortion which is high frequency noise around block boundaries, has the filter A114a, the filter B114b, the filter C114c, the filter D114d, each having a different filtering strength. Of these filters, the filter A114a is intended for the strongest filtering, the filter B114b for the second strongest, the filter C114c for the third strongest, and the filter D114d for the weakest filtering. Meanwhile, the amount of operation processing required for filtering depends on a filtering strength. Note that the switch 112, the switch 113 and other components illustrated in the diagram may be implemented either as hardware or software.
(25) As values of the first reference index, with respect to a current picture to be coded, “0” is assigned to a forward reference picture which is closest to the current picture in display order, and values starting from “1” are assigned to the other forward reference pictures. After values starting from “0” are assigned to all the forward reference pictures, the subsequent values are assigned to backward reference pictures, starting with a backward reference picture which is closest to the current picture.
(26) As values of the second reference index, with respect to the current picture, “0” is assigned to a backward reference picture which is closest to the current picture in display order, and values starting from “1” are assigned to the other backward reference pictures. After values starting from “0” are assigned to all the backward reference pictures, the subsequent values are assigned to forward reference pictures, starting with a forward reference picture which is closest to the current picture.
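The assignment rules of paragraphs (25) and (26) can be sketched as follows. The function name and the list representation of the reference pictures are illustrative assumptions; only the ordering rule comes from the text above.

```python
def assign_reference_indices(forward, backward):
    """Illustrative sketch of the reference index assignment above.

    `forward` and `backward` are lists of reference pictures, each
    ordered so that position 0 holds the picture closest to the
    current picture in display order.
    """
    # First reference index: forward pictures first (closest gets 0),
    # then the backward pictures, closest first.
    ridx1 = {pic: i for i, pic in enumerate(forward + backward)}
    # Second reference index: backward pictures first, then forward.
    ridx2 = {pic: i for i, pic in enumerate(backward + forward)}
    return ridx1, ridx2
```

With two forward references and one backward reference, for example, the closest forward picture receives Ridx1 = 0 while the backward picture receives Ridx2 = 0.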
(27) For example, when the first reference index Ridx1 is “0” and the second reference index Ridx2 is “1” in
(28) Next, an explanation is given of the operation of the moving picture coding apparatus with the above configuration.
(29) As illustrated in
(30) The pictures reordered in the picture memory 101 are then read out on a macroblock basis. A macroblock is a group of pixels in the size of horizontal 16×vertical 16, for example. Meanwhile, motion compensation and the extraction of a motion vector are performed for each block which is a group of pixels in the size of horizontal 8×vertical 8, for example.
(31) A current macroblock read out from the picture memory 101 is inputted to the motion vector estimation unit 107 and the difference calculation unit 102.
(32) The motion vector estimation unit 107 performs vector estimation for each block in the macroblock, using the decoded image data stored in the picture memory 111 as a reference picture. Then, the motion vector estimation unit 107 outputs, to the motion compensation coding unit 109, the estimated motion vector and the reference index indicating a reference picture.
(33) The motion compensation coding unit 109 determines a coding mode to be used for the macroblock, utilizing the estimated motion vector and the reference index from the motion vector estimation unit 107. Here, in a case of a B picture, for example, one of the following methods shall be selectable as a coding mode: intra picture coding; inter-picture prediction coding using a forward motion vector; inter-picture prediction coding using a backward motion vector; inter-picture prediction coding using two motion vectors; and direct mode.
(34) Referring to
(35) The motion compensation coding unit 109 generates predictive image data according to the above-determined coding mode, and outputs such predictive image data to the difference calculation unit 102 and the adder 106. Note that since a motion vector of a block, which is co-located with a current block, in a backward reference picture is used as a reference motion vector as described above when the motion compensation coding unit 109 selects direct mode, such reference motion vector and its reference index are read out from the motion vector storage unit 108. Also note that when the motion compensation coding unit 109 selects intra picture coding, no predictive image data is outputted. Furthermore, the motion compensation coding unit 109 outputs the determined coding mode, the motion vector and the reference index information to the filter processing control unit 110 and the bit stream generation unit 104, and outputs reference index values indicating reference pictures to the filter processing control unit 110.
(36) The difference calculation unit 102, which has received the predictive image data from the motion compensation coding unit 109, calculates the difference between such predictive image data and image data corresponding to a macroblock of the picture B11 read out from the picture memory 101 so as to generate prediction residual image data, and outputs it to the prediction residual coding unit 103.
(37) The prediction residual coding unit 103, which has received the prediction residual image data, performs coding processing such as orthogonal transform and quantization on such prediction residual image data so as to generate coded data, and outputs it to the bit stream generation unit 104 and the prediction residual decoding unit 105. The bit stream generation unit 104, which has received the coded data, performs variable length coding and the like on such coded data and adds, to such input coded data, the motion vector information, the coding mode information and the like inputted by the motion compensation coding unit 109 so as to generate and output a bit stream. Note that when macroblocks are coded in direct mode, motion vector information is not to be added to a bit stream.
(38) The prediction residual decoding unit 105 performs decoding processing such as inverse quantization and inverse orthogonal transform on the input coded data so as to generate decoded differential image data, and outputs it to the adder 106. The adder 106 adds the decoded differential image data to the predictive image data inputted by the motion compensation coding unit 109 so as to generate decoded image data, and outputs it to the inter-pixel filter 114 via the switch 113.
(39) The inter-pixel filter 114, which has received the decoded image data, applies filtering on such decoded image data using one of the following filters selected by the switch 112 and the switch 113: the filter A114a; the filter B114b; the filter C114c; and the filter D114d. Or, the inter-pixel filter 114 stores the decoded image data in the picture memory 111 via the switch 112 without performing filtering (skip). When this is done, the switching of the terminals “1”˜“5” of each of the switch 112 and the switch 113 is controlled by the filter processing control unit 110 in a manner described below.
(41) The filter processing control unit 110 determines filtering strengths required for block boundaries in both vertical and horizontal directions of the decoded image data. A determination for selecting a filtering strength used for filtering is made at the boundary of the two adjacent blocks p and q, as in the case of the prior art illustrated in
(42) If the result of the check shows that the block boundary falls on the macroblock boundary, that is, if the two blocks are not from the same macroblock, the filter processing control unit 110 selects the filter A114a (Bs=4) with the strongest filtering strength (Yes in Step S203). To put it another way, the filter processing control unit 110 exerts control for switching a terminal of the switch 112 and a terminal of the switch 113 to “1”, respectively. If the result of the check shows that the block boundary does not fall on the macroblock boundary, that is, if these two blocks are from the same macroblock, the filter processing control unit 110 selects the filter B114b (Bs≧3) with the second strongest strength (No in Step S203). To put it another way, the filter processing control unit 110 exerts control for switching a terminal of the switch 112 and a terminal of the switch 113 to “2”, respectively. Note that Bs≧3 here indicates that Bs is 3 or a larger value at least under the conditions illustrated in this flowchart, and whether Bs is Bs=3 or a value larger than 3 shall be determined by other conditions not disclosed here. In the following, an equation that includes this inequality sign shall indicate a value range which can be determined by the conditions not disclosed in the present invention.
(43) If the result of the check (Step S202) shows that neither of the blocks p and q is intra-coded (No in Step S202), the filter processing control unit 110 checks to see if either of the two blocks p and q contains coefficients indicating spatial frequency components resulting from orthogonal transform (Step S204). If one of these blocks contains coefficients (Yes in Step S204), the filter processing control unit 110 selects the filter C114c (Bs≧2) with the third strongest strength. To put it another way, the filter processing control unit 110 exerts control for switching a terminal of the switch 112 and a terminal of the switch 113 to “3”, respectively.
(44) If neither of the two blocks contains coefficients, that is, if coefficients are not coded in either block p or q (No in Step S204), the filter processing control unit 110 checks to see if the picture that includes the blocks p and q is a P picture or a B picture (Step S205). If the picture that includes the blocks p and q is a P picture, the filter processing control unit 110 checks to see if (i) the blocks p and q refer to the same reference picture and (ii) each difference between the vertical components (V(p,y) and V(q,y)) and the horizontal components (V(p,x) and V(q,x)) of the motion vectors of the respective blocks p and q is less than one pixel (Step S208), on the basis of the reference index values inputted by the motion compensation coding unit 109 and the motion vectors inputted by the motion vector storage unit 108. In other words, the filter processing control unit 110 checks if the following equations (A), (B) and (C) are all satisfied:
Ref(p)=Ref(q) (A)
|V(p,x)−V(q,x)|<1 (B)
|V(p,y)−V(q,y)|<1 (C)
(45) Ref(p) and Ref(q) here denote reference pictures referred to by the block p and the block q.
(46) If the result of the check shows that the blocks p and q refer to the same reference picture and that each difference between vertical and horizontal motion vectors of the blocks p and q is less than one pixel (Yes in Step S208), the filter processing control unit 110 selects no-filtering (Bs=0). To put it another way, the filter processing control unit 110 exerts control for switching a terminal of the switch 112 and a terminal of the switch 113 to “5”, respectively. In the other case (No in Step S208), the filter processing control unit 110 selects the filter D114d (Bs≧1) with the weakest filtering strength. To put it another way, the filter processing control unit 110 exerts control for switching a terminal of the switch 112 and a terminal of the switch 113 to “4”, respectively.
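The Step S208 decision for a P picture can be sketched as follows; the function names and the tuple representation of the motion vectors are illustrative assumptions, not part of the described apparatus.

```python
def p_picture_no_filter(ref_p, ref_q, vp, vq):
    """Check equations (A)-(C): the blocks refer to the same
    reference picture and both motion-vector component differences
    are below one pixel.  vp and vq are (x, y) vectors in pixels.
    """
    return (ref_p == ref_q                 # (A)
            and abs(vp[0] - vq[0]) < 1     # (B)
            and abs(vp[1] - vq[1]) < 1)    # (C)

def bs_step_s208(ref_p, ref_q, vp, vq):
    # No filtering (Bs = 0) when all three conditions hold;
    # the weakest filter D (Bs >= 1) otherwise.
    return 0 if p_picture_no_filter(ref_p, ref_q, vp, vq) else 1
```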
(47) If the result of the check (Step S205) shows that the picture that includes the blocks p and q is a B picture, a coding mode used for coding a macroblock shall be one of the following: inter-picture prediction coding using a forward motion vector; inter-picture prediction coding using a backward motion vector; inter-picture prediction coding using two motion vectors; and direct mode. For example, when the block p uses only forward prediction and the block q uses prediction using two reference pictures, the number of reference pictures used by the block p is “1”, whereas the number of reference pictures used by the block q is “2”. Thus, the filter processing control unit 110 checks to see if the number of reference pictures referred to by the block p and the number of reference pictures referred to by the block q are the same (Step S206). If the result of the check shows that the blocks p and q refer to a different number of reference pictures (No in Step S206), the filter processing control unit 110 selects the filter D114d (Bs≧1) with the weakest filtering strength.
(48) On the other hand, when the blocks p and q refer to the same number of reference pictures (Yes in Step S206), the filter processing control unit 110 checks to see if the blocks p and q use exactly the same reference picture(s), on the basis of the reference index values inputted from the motion compensation coding unit 109 (Step S207). If the result of the check shows that any of the reference pictures referred to by the blocks p and q differs (No in Step S207), the filter processing control unit 110 selects the filter D114d (Bs≧1) with the weakest filtering strength.
(49) Meanwhile, if the reference picture(s) referred to by the blocks p and q is/are exactly the same (Yes in Step S207), the filter processing control unit 110 checks to see if the weighting (ABP) coefficients for weighted prediction in the blocks p and q are the same (Step S209). If the result of the check shows that the ABP coefficients of the respective blocks p and q differ (No in Step S209), the filter processing control unit 110 selects the filter D114d (Bs≧1) with the weakest filtering strength. “Weighted prediction” here is a method of inter picture prediction in which the predicted pixel value is obtained by multiplying a pixel value in a reference picture by a first weighting coefficient α and then adding a second weighting coefficient β to the result of the multiplication.
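The weighted prediction described above amounts to a simple affine mapping of the reference pixel value; the function name below is illustrative.

```python
def weighted_prediction(ref_pixel, alpha, beta):
    # Predicted pixel value = alpha * (reference pixel) + beta,
    # where alpha is the first and beta the second weighting
    # coefficient of the ABP prediction described above.
    return alpha * ref_pixel + beta
```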
(50) On the other hand, if the ABP coefficients of the blocks p and q are the same (Yes in Step S209), the filter processing control unit 110 checks to see if each difference between the vertical and horizontal motion vectors of the blocks p and q is less than one pixel (Step S210). In other words, the filter processing control unit 110 checks if the following equations (D)˜(G) are all satisfied:
|Vf(p,x)−Vf(q,x)|<1 (D)
|Vf(p,y)−Vf(q,y)|<1 (E)
|Vb(p,x)−Vb(q,x)|<1 (F)
|Vb(p,y)−Vb(q,y)|<1 (G)
(51) Here, Vf and Vb denote the forward and backward motion vectors of the respective blocks p and q, and only one of Vf and Vb exists when only one reference picture is used.
(52) If the result of the check shows that each difference between all of the vertical and horizontal motion vectors of the blocks p and q is less than one pixel (Yes in Step S210), the filter processing control unit 110 selects no-filtering (Bs=0). In the other case (No in Step S210), the filter processing control unit 110 selects the filter D114d (Bs≧1) with the weakest filtering strength.
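The Step S210 check of equations (D)˜(G) can be sketched as follows; the function name and the use of `None` for an absent backward vector are illustrative assumptions.

```python
def b_picture_vectors_close(vf_p, vf_q, vb_p, vb_q):
    """Check equations (D)-(G): every vertical and horizontal
    difference between the forward (Vf) and backward (Vb) motion
    vectors of blocks p and q is below one pixel.

    Pass None for the backward pair when only one reference picture
    is used, since only one of Vf and Vb then exists.
    """
    pairs = [(vf_p, vf_q)]
    if vb_p is not None and vb_q is not None:
        pairs.append((vb_p, vb_q))
    return all(abs(a[0] - b[0]) < 1 and abs(a[1] - b[1]) < 1
               for a, b in pairs)
```

When all four conditions hold, Bs = 0 (no filtering) is selected; otherwise the weakest filter D (Bs≧1) is used.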
(53) Note that it is possible to make a prediction on the macroblocks of a B picture using direct mode as described above. When direct mode is employed, motion vectors of a current block are derived from the motion vector of a block, in a reference picture whose second reference index Ridx2 is “0”, which is co-located with the current block. In this case, a forward reference picture of the current block is a reference picture to be referred to by the motion vector of the corresponding block, and a backward reference picture of the current block is a reference picture whose second reference index Ridx2 is “0”. Subsequently, the filter processing control unit 110 utilizes such derived motion vectors and the reference picture to determine a filtering strength.
(54) As described above, when the picture that includes the blocks p and q is a B picture, since a check is made to see if the number of reference pictures referred to by the block p and the number of reference pictures referred to by the block q are the same, and if exactly the same reference picture(s) is/are used or not, it is possible to select an optimum filtering strength even when prediction coding in which two pictures are referred to is employed. This makes it possible for moving pictures to be coded in a manner which allows the improvement in the quality of such moving pictures to be decoded.
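The whole decision of Steps S201 through S210 can be summarized in one sketch. The dict keys and function names below are illustrative assumptions about how block information might be represented; the branch ordering and the resulting Bs values follow the flowchart described above.

```python
def boundary_strength(p, q, same_macroblock):
    """Hedged sketch of the filtering-strength decision.

    Blocks p and q are dicts with keys 'intra', 'has_coeffs',
    'picture_type', 'refs' (reference pictures), 'abp' (weighting
    coefficients) and 'mvs' ((x, y) motion vectors); these names
    are assumptions, not part of the described apparatus.

    Returns the minimum Bs: 4, 3 (Bs>=3), 2 (Bs>=2), 1 (Bs>=1),
    or 0 (no filtering).
    """
    def mvs_close(mvs_p, mvs_q):
        # Every vertical and horizontal component difference < 1 pixel.
        return all(abs(a[0] - b[0]) < 1 and abs(a[1] - b[1]) < 1
                   for a, b in zip(mvs_p, mvs_q))

    # Steps S202/S203: either block is intra-coded.
    if p['intra'] or q['intra']:
        return 3 if same_macroblock else 4
    # Step S204: either block contains coded transform coefficients.
    if p['has_coeffs'] or q['has_coeffs']:
        return 2
    # Steps S205/S208: P picture.
    if p['picture_type'] == 'P':
        if p['refs'] == q['refs'] and mvs_close(p['mvs'], q['mvs']):
            return 0
        return 1
    # Steps S206-S210: B picture.
    if (len(p['refs']) == len(q['refs'])           # Step S206
            and set(p['refs']) == set(q['refs'])   # Step S207
            and p['abp'] == q['abp']               # Step S209
            and mvs_close(p['mvs'], q['mvs'])):    # Step S210
        return 0
    return 1
```

The same decision serves both the coding apparatus (filter processing control unit 110) and the decoding apparatus (filter processing control unit 205), which is what keeps the encoder- and decoder-side deblocking results identical.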
(56) As shown in
(57) The bit stream analysis unit 201 extracts, from the input bit stream, various data including the coding mode information and the information indicating the motion vectors used for coding. The prediction residual decoding unit 202 decodes the input prediction residual coded data so as to generate the prediction residual image data. The motion compensation decoding unit 203 obtains image data from reference pictures stored in the picture memory 206 so as to generate motion compensated image data, on the basis of the coding mode information at the time of coding, the motion vector information and the like. The motion vector storage unit 204 stores the motion vectors extracted by the bit stream analysis unit 201. The adder 207 adds the prediction residual image data inputted by the prediction residual decoding unit 202 to the motion compensated image data inputted by the motion compensation decoding unit 203 so as to generate decoded image data. The picture memory 206 stores the decoded image data to which filtering has been applied.
(58) The filter processing control unit 205 selects a filtering strength of the inter-pixel filter 210, i.e. selects one of a filter A210a, a filter B210b, a filter C210c, a filter D210d, and no-filtering (skip), and controls the switch 208 and the switch 209. The switch 208 and the switch 209 are switches which selectively connect to one of their respective terminals “1”˜“5” under the control of the filter processing control unit 205. The switch 209 is placed between the output terminal of the adder 207 and the input terminal of the inter-pixel filter 210. Meanwhile, the switch 208 is placed between the input terminal of the picture memory 206 and the output terminal of the inter-pixel filter 210.
(59) The inter-pixel filter 210, which is a deblocking filter that filters decoded image data so as to remove block distortion which is high frequency noise around block boundaries, has the filter A210a, the filter B210b, the filter C210c, and the filter D210d, each having a different filtering strength. Of these filters, the filter A210a is intended for the strongest filtering, the filter B210b for the second strongest, the filter C210c for the third strongest, and the filter D210d for the weakest filtering. Meanwhile, the amount of operation processing required for filtering depends on the filtering strength.
(60) Next, an explanation is given of the moving picture decoding apparatus with the above configuration. The bit stream analysis unit 201 extracts, from the input bit stream, various data including the coding mode information and the motion vector information. The bit stream analysis unit 201 outputs the extracted coding mode information to the motion compensation decoding unit 203 and the filter processing control unit 205, and outputs the motion vector information and the reference indices to the motion vector storage unit 204. Furthermore, the bit stream analysis unit 201 outputs the extracted prediction residual coded data to the prediction residual decoding unit 202. The prediction residual decoding unit 202, which has received such prediction residual coded data, decodes the prediction residual coded data so as to generate the prediction residual image data, and outputs it to the adder 207.
(61) The motion compensation decoding unit 203 generates the motion compensated image data, referring to the reference pictures stored in the picture memory 206, on the basis of the coding mode information and the reference index values inputted by the bit stream analysis unit 201, and the motion vector information read out from the motion vector storage unit 204. Then, the motion compensation decoding unit 203 outputs the generated motion compensated image data to the adder 207, and outputs the reference index values indicating reference pictures to the filter processing control unit 205. The adder 207 adds the motion compensated image data to the prediction residual image data inputted by the prediction residual decoding unit 202 so as to generate decoded image data, and outputs it to the inter-pixel filter 210 via the switch 209.
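The adder 207's role described above can be sketched sample-by-sample: the decoded prediction residual is added to the motion compensated image, and each result is clipped to the valid sample range. Treating samples as a flat list of 8-bit values is an assumption for illustration; the function name is hypothetical.

```python
def reconstruct(residual, prediction):
    """Add residual samples to motion compensated samples.

    Each sum is clipped to the 8-bit range [0, 255], an assumed
    sample depth, to form the decoded image data.
    """
    return [max(0, min(255, r + p)) for r, p in zip(residual, prediction)]
```

The decoded image data produced this way is what the inter-pixel filter 210 then receives via the switch 209.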
(62) The inter-pixel filter 210, which has received the decoded image data, applies filtering on such decoded image data using one of the following filters selected by the switch 208 and the switch 209: the filter A210a; the filter B210b; the filter C210c; and the filter D210d. Or, the inter-pixel filter 210 stores the decoded image data in the picture memory 206 via the switch 208 without performing filtering (skip). When this is done, the switching of the terminals “1”˜“5” of each of the switch 208 and the switch 209 is controlled by the filter processing control unit 205 in an equivalent manner to that of the aforementioned filter processing control unit 110 of the moving picture coding apparatus.
(63) As described above, when the picture that includes the blocks p and q is a B picture, a check is made to see whether the block p and the block q refer to the same number of reference pictures, and whether the reference picture(s) they refer to are exactly the same. It is therefore possible to select an optimum filtering strength even when prediction coding in which two pictures are referred to is employed. This makes it possible to decode moving pictures in a manner that improves their quality.
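The two-part B-picture check above can be sketched as follows. Only the comparison itself is modeled; representing reference pictures as comparable identifiers, ignoring their order, and the function name are all illustrative assumptions, and the strength chosen from the result follows the embodiment's flowchart rather than this sketch.

```python
def same_references(refs_p, refs_q):
    """Check the two conditions described for B pictures.

    Returns True only if block p and block q refer to (a) the same
    number of reference pictures and (b) exactly the same reference
    pictures (order ignored).
    """
    if len(refs_p) != len(refs_q):
        return False          # different number of reference pictures
    return sorted(refs_p) == sorted(refs_q)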
Second Embodiment
(64) The second embodiment presents a filtering strength determination method which is partly different from the one employed by the filter processing control unit 110 explained in the first embodiment. Note that the configuration required for the method according to the present embodiment is equivalent to that of the first embodiment, and therefore detailed explanations thereof are omitted. An explanation is also omitted where the filter processing control unit 110 determines a filtering strength in the same manner as in the first embodiment. It should be noted that the filtering strength determination method of the filter processing control unit 205 is applicable to the present embodiment as regards a moving picture decoding apparatus.
(66) The filter processing control unit 110 checks (Step S304) whether either of the two blocks p and q contains coefficients indicating spatial frequency components resulting from orthogonal transform. If one of these blocks contains such coefficients (Yes in Step S304), the filter processing control unit 110 performs the processing described below.
(67) The filter processing control unit 110 checks whether the picture that includes the blocks p and q is a P picture or a B picture (Step S311). If the picture that includes the blocks p and q is a P picture, the filter processing control unit 110 selects the filter C114c (Bs(p)≧2) with the third strongest filtering strength. Meanwhile, if the picture that includes the blocks p and q is a B picture, the filter processing control unit 110 selects Bs(b) (Bs(b)>Bs(p)), a filtering strength stronger than the Bs(p) used for a P picture.
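The branch of Steps S304 and S311 described above can be sketched as follows. The concrete strength values are illustrative assumptions chosen only to satisfy Bs(p) ≧ 2 and Bs(b) > Bs(p); the function and constant names are likewise hypothetical.

```python
BS_P = 2  # assumed Bs(p): third strongest strength, used for P pictures
BS_B = 3  # assumed Bs(b): any value with Bs(b) > Bs(p), used for B pictures

def strength_for_coefficients(p_has_coeffs, q_has_coeffs, picture_type):
    """Select a strength when either block has orthogonal-transform
    coefficients (Yes in Step S304); return None otherwise, since the
    other branches of the flowchart then apply (Step S311 decides P vs B).
    """
    if not (p_has_coeffs or q_has_coeffs):
        return None  # No in Step S304
    return BS_P if picture_type == "P" else BS_B
```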
(68) As described above, when either of the blocks p and q contains coefficients indicating spatial frequency components resulting from orthogonal transform, a check is made to see whether the picture that includes these blocks p and q is a P picture or a B picture, so it is possible to select an optimum filtering strength even when prediction coding in which two pictures are referred to is employed. This makes it possible to code moving pictures in a manner that improves the quality of the decoded moving pictures.
(69) Note that when the filter processing control unit 110 selects no-filtering (Bs=0) in the above embodiments, a filter with a strength weaker than that of the filter D114d (Bs≧1), the weakest filter, may be used instead of applying no filtering (skip).
(70) Also note that the filter processing control unit 110 does not have to execute all steps illustrated in the flowchart of
(71) Furthermore, although coding is performed on a picture-by-picture basis in the above embodiments, a field or a frame may also serve as a unit of coding.
Third Embodiment
(72) If a program for realizing the configuration of the moving picture coding method or the moving picture decoding method as shown in each of the aforementioned embodiments is recorded on a recording medium such as a flexible disk, it becomes possible to easily perform the processing presented in each of the aforementioned embodiments in an independent computer system.
(76) The above explanation is given on the assumption that a recording medium is a flexible disk, but the same processing can also be performed using an optical disc. In addition, the recording medium is not limited to a flexible disk and an optical disc and any other medium, such as an IC card and a ROM cassette, capable of recording a program can be used.
(77) Following is the explanation of the applications of the moving picture coding method and the moving picture decoding method as shown in the above embodiments, and the system using them.
(79) In this content supply system ex100, a computer ex111, a PDA (Personal Digital Assistant) ex112, a camera ex113, a cell phone ex114, and a camera-equipped cell phone ex115 are connected to the Internet ex101 via an Internet service provider ex102, a telephone network ex104 and the base stations ex107˜ex110.
(80) However, the content supply system ex100 is not limited to the configuration as shown in
(81) The camera ex113 is a device such as a digital video camera capable of shooting moving pictures. The cell phone may be a cell phone of a PDC (Personal Digital Communication) system, a CDMA (Code Division Multiple Access) system, a W-CDMA (Wideband-Code Division Multiple Access) system or a GSM (Global System for Mobile Communications) system, a PHS (Personal Handyphone system) or the like.
(82) A streaming server ex103 is connected to the camera ex113 via the base station ex109 and the telephone network ex104, which enables live distribution or the like using the camera ex113 based on coded data transmitted from the user using the camera ex113. Either the camera ex113 or the server and the like for carrying out data transmission may code the shot data. Also, moving picture data shot by a camera ex116 may be transmitted to the streaming server ex103 via the computer ex111. The camera ex116 is a device such as a digital camera capable of shooting still pictures and moving pictures. In this case, either the camera ex116 or the computer ex111 may code the moving picture data. An LSI ex117 included in the computer ex111 and the camera ex116 performs coding processing. Note that software for coding and decoding moving pictures may be integrated into a certain type of storage medium (such as a CD-ROM, a flexible disk and a hard disk) that is a recording medium readable by the computer ex111 or the like. Furthermore, the camera-equipped cell phone ex115 may transmit the moving picture data. This moving picture data is data coded by the LSI included in the cell phone ex115.
(83) In the content supply system ex100, content (such as a music live video) shot by the user using the camera ex113, the camera ex116 or the like is coded in the same manner as the above-described embodiments and transmitted to the streaming server ex103, and the streaming server ex103 makes stream distribution of the content data to clients at their request. The clients include the computer ex111, the PDA ex112, the camera ex113, the cell phone ex114 and so on capable of decoding the above-mentioned coded data. The content supply system ex100 with the above structure is a system in which the clients can receive and reproduce the coded data, and can further receive, decode and reproduce the data in real time so as to realize personal broadcasting.
(84) The moving picture coding apparatus and the moving picture decoding apparatus presented in the above embodiments may be employed as an encoder and a decoder in the devices making up such system.
(85) As an example of such a configuration, a cell phone is described below.
(87) The cell phone ex115 has an antenna ex201 for transmitting/receiving radio waves to and from the base station ex110, a camera unit ex203 such as a CCD camera capable of shooting video and still pictures, a display unit ex202 such as a liquid crystal display for displaying data obtained by decoding video and the like shot by the camera unit ex203 and video and the like received by the antenna ex201, a main body including a set of operation keys ex204, a voice output unit ex208 such as a speaker for outputting voices, a voice input unit ex205 such as a microphone for inputting voices, a recording medium ex207 for storing coded or decoded data such as data of moving or still pictures shot by the camera, data of received e-mails, and moving picture data or still picture data, and a slot unit ex206 for enabling the recording medium ex207 to be attached to the cell phone ex115. The recording medium ex207 contains a flash memory element, a kind of EEPROM (Electrically Erasable and Programmable Read Only Memory) that is an electrically erasable and rewritable nonvolatile memory, in a plastic case such as an SD card.
(88) Next, the cell phone ex115 will be explained with reference to
(89) When a call-end key or a power key is turned ON by a user's operation, the power supply circuit unit ex310 supplies each unit with power from a battery pack so as to activate the digital camera-equipped cell phone ex115 and make it ready for operation.
(90) In the cell phone ex115, the voice processing unit ex305 converts voice signals received by the voice input unit ex205 in conversation mode into digital voice data under the control of the main control unit ex311 comprised of a CPU, a ROM, a RAM and others, the modem circuit unit ex306 performs spread spectrum processing on it, and a transmit/receive circuit unit ex301 performs digital-to-analog conversion processing and frequency transform processing on the data, so as to transmit it via the antenna ex201. Also, in the cell phone ex115, the transmit/receive circuit unit ex301 amplifies a received signal received by the antenna ex201 in conversation mode and performs frequency transform processing and analog-to-digital conversion processing on the data, the modem circuit unit ex306 performs inverse spread spectrum processing on the data, and the voice processing unit ex305 converts it into analog voice data, so as to output it via the voice output unit ex208.
(91) Furthermore, when transmitting an e-mail in data communication mode, the text data of the e-mail inputted by operating the operation keys ex204 on the main body is sent out to the main control unit ex311 via the operation input control unit ex304. In the main control unit ex311, after the modem circuit unit ex306 performs spread spectrum processing on the text data and the transmit/receive circuit unit ex301 performs digital-to-analog conversion processing and frequency transform processing on it, the data is transmitted to the base station ex110 via the antenna ex201.
(92) When picture data is transmitted in data communication mode, the picture data shot by the camera unit ex203 is supplied to the picture coding unit ex312 via the camera interface unit ex303. When the picture data is not transmitted, it is also possible to display the picture data shot by the camera unit ex203 directly on the display unit ex202 via the camera interface unit ex303 and the LCD control unit ex302.
(93) The picture coding unit ex312, which incorporates the moving picture coding apparatus according to the present invention, compresses and codes the picture data supplied from the camera unit ex203 by the coding method employed in the moving picture coding apparatus presented in the above embodiments, so as to convert it into coded picture data, and sends it out to the multiplexing/demultiplexing unit ex308. At this time, the cell phone ex115 sends out the voices received by the voice input unit ex205 while the shooting by the camera unit ex203 is taking place, to the multiplexing/demultiplexing unit ex308 as digital voice data via the voice processing unit ex305.
(94) The multiplexing/demultiplexing unit ex308 multiplexes the coded picture data supplied from the picture coding unit ex312 and the voice data supplied from the voice processing unit ex305 using a predetermined method, the modem circuit unit ex306 performs spread spectrum processing on the resulting multiplexed data, and the transmit/receive circuit unit ex301 performs digital-to-analog conversion processing and frequency transform processing so as to transmit the processed data via the antenna ex201.
(95) When receiving data of a moving picture file which is linked to a Web page or the like in data communication mode, the modem circuit unit ex306 performs inverse spread spectrum processing on the data received from the base station ex110 via the antenna ex201, and sends out the resulting multiplexed data to the multiplexing/demultiplexing unit ex308.
(96) In order to decode the multiplexed data received via the antenna ex201, the multiplexing/demultiplexing unit ex308 separates the multiplexed data into a picture data bit stream and a voice data bit stream, and supplies the coded picture data to the picture decoding unit ex309 and the voice data to the voice processing unit ex305 via the synchronous bus ex313.
(97) Next, the picture decoding unit ex309, which incorporates the moving picture decoding apparatus according to the present invention, decodes the picture data bit stream by the decoding method paired with the coding method presented in the above embodiments to generate reproduced moving picture data, and supplies this data to the display unit ex202 via the LCD control unit ex302, and thus moving picture data included in a moving picture file linked to a Web page, for instance, is displayed. At the same time, the voice processing unit ex305 converts the voice data into analog voice data, and supplies this data to the voice output unit ex208, and thus voice data included in a moving picture file linked to a Web page, for instance, is reproduced.
(98) Note that the aforementioned system is not an exclusive example and therefore at least either the moving picture coding apparatus or the moving picture decoding apparatus of the above embodiments can be incorporated into a digital broadcasting system as shown in
(99) Furthermore, it is also possible to code an image signal by the moving picture coding apparatus presented in the above embodiments and record the coded image signal in a recording medium. Some examples are a DVD recorder for recording an image signal on a DVD disc ex421, and a recorder ex420 such as a disc recorder for recording an image signal on a hard disk. Moreover, an image signal can be recorded in an SD card ex422. If the recorder ex420 is equipped with the moving picture decoding apparatus presented in the above embodiments, it is possible to reproduce an image signal recorded on the DVD disc ex421 and in the SD card ex422, and display it on the monitor ex408.
(100) As the configuration of the car navigation system ex413, the configuration without the camera unit ex203 and the camera interface unit ex303, out of the configuration shown in
(101) Concerning the terminals such as the cell phone ex114, a transmitting/receiving terminal having both an encoder and a decoder, as well as a transmitting terminal only with an encoder and a receiving terminal only with a decoder are possible as forms of implementation.
(102) As stated above, it is possible to employ the moving picture coding method and the moving picture decoding method according to the aforementioned embodiments in any one of the apparatuses and the system described above, and thus the effects explained in the above embodiments can be achieved by so doing.
(103) From the invention thus described, it will be obvious that the embodiments of the invention may be varied in many ways. Such variations are not to be regarded as a departure from the spirit and scope of the invention, and all such modifications as would be obvious to one skilled in the art are intended for inclusion within the scope of the following claims.
(104) As is obvious from the above explanation, the filtering strength determination method according to the present invention is capable of determining, in an optimum manner, the strength of a filter for removing block distortion by filtering decoded image data that includes high frequency noise around block boundaries, even when prediction coding in which two pictures are referred to is employed. Accordingly, it is possible to code moving pictures in a manner that improves the quality of the decoded moving pictures. What is more, the filtering strength determination method according to the present invention is applicable to both a moving picture coding apparatus and a moving picture decoding apparatus, offering a significant practical value.
(105) As described above, the filtering strength determination method, the moving picture coding method and the moving picture decoding method according to the present invention are suited as methods for generating a bit stream by coding image data corresponding to each of pictures making up a moving picture and for decoding the generated bit stream on a cell phone, a DVD apparatus, a personal computer and the like.