Mixed domain collaborative post filter for lossy still image coding
10965958 · 2021-03-30
Assignee
Inventors
- Victor Alexeevich Stepin (Moscow, RU)
- Roman Igorevich Chernyak (Moscow, RU)
- Albert Yurievich Silantiev (Moscow, RU)
Cpc classification
H04N19/86
ELECTRICITY
H04N19/196
ELECTRICITY
H04N19/154
ELECTRICITY
International classification
H04N19/635
ELECTRICITY
H04N19/196
ELECTRICITY
H04N19/154
ELECTRICITY
H04N19/86
ELECTRICITY
Abstract
An image coding apparatus, comprising an image reconstruction unit configured to reconstruct an image, a parameter determination unit configured to determine one or more filter parameters, based on one or more first parameters which are based on the reconstructed image and one or more second parameters which are based on codec signaling information, and a mixed-domain filtering unit configured to filter in a frequency domain and a pixel domain the reconstructed image based on the determined filter parameters to obtain a filtered image.
Claims
1. An image coding apparatus, comprising: a processor; and a memory storing instructions that when executed configure the processor to: reconstruct an image; determine one or more filter parameters, based on one or more first parameters that are based on the reconstructed image and one or more second parameters which are based on codec signaling information; filter, in a frequency domain and a pixel domain, the reconstructed image based on the determined filter parameters to obtain a filtered image; estimate the original image from the reconstructed image; and determine the first parameters based on the estimated original image; wherein the processor is further configured to determine the filter parameters by: partitioning the estimated original image into blocks; and for each of the blocks: determine a cluster of patches that are similar to the block, 2D-transform the cluster of patches to obtain transformed patches, and determine the first parameters based on the transformed patches; wherein the processor is further configured to determine, for each of the blocks, the filter parameters based on the transformed patches by: regrouping elements of the transformed patches to obtain a matrix T.sub.i, wherein each row of the matrix T.sub.i comprises frequency components with same spatial frequencies, transforming the matrix T.sub.i to obtain a transformed matrix tf.sub.vw.sup.i, wherein each row of the matrix tf.sub.vw.sup.i is a 1D transform of a corresponding row of matrix T.sub.i, and determining the filter parameters g.sub.vw.sup.i as:
2. The image coding apparatus of claim 1, wherein the processor is further configured to determine a quantization noise value from the codec signaling information and determine the second parameters based on the derived quantization noise value.
3. The image coding apparatus of claim 2, wherein the processor is further configured to analytically derive the quantization noise value from a quantization parameter QP of the codec signaling information.
4. The image coding apparatus of claim 1, wherein the processor is further configured to determine where filtering should be implemented.
5. The image coding apparatus of claim 1, wherein the processor is further configured to, for each of a set of blocks of the reconstructed image: determine a set of patches that are similar to the block; 2D-transform the patches into a frequency domain to obtain frequency-domain patches; perform collaborative filtering of the frequency-domain patches in the frequency domain to obtain filtered transformed frequency-domain patches; inverse 2D transform the filtered transformed frequency-domain patches in the frequency domain to obtain filtered patches; and perform collaborative filtering of the filtered patches in the pixel domain along pixel patches from different sets of patches with the same spatial coordinates.
6. The image coding apparatus of claim 5, wherein the 2D transformation is a Haar wavelet transform.
7. The image coding apparatus of claim 5, wherein the processor is further configured to perform, for each of the blocks, the collaborative filtering based on the transformed patches by: regrouping elements of the transformed patches to obtain a matrix T.sub.i, wherein each row of the matrix T.sub.i comprises frequency components with same spatial frequencies; transforming the matrix T.sub.i to obtain a transformed matrix tf.sub.vw.sup.i, wherein each row of the matrix tf.sub.vw.sup.i is a 1D transform of a corresponding row of matrix T.sub.i; and perform filtering by multiplying each element of matrix tf.sub.vw.sup.i by a filter parameter g().sub.vw.sup.i, wherein is a column number in matrix tf.sub.vw.sup.i and spatial frequencies v,w correspond to a j-th row of matrix tf.sub.vw.sup.i.
8. The image coding apparatus of claim 1, wherein the 1D transformation is a Hadamard transform.
9. The image coding apparatus of claim 1, wherein: the adaptive_filtering_flag is used to indicate that a mixed-domain filter should be used to filter an image, the frame_level_usage_flag is used to indicate that the entire reconstructed image should be filtered, the macroblock_size field is used to indicate a macroblock size which should be used for filtering, and the use_filtered_mb_flag is used to indicate whether a filtered macroblock should be used.
10. A method for still image coding, the method comprising: reconstructing an image; determining one or more filter parameters based on one or more first parameters that are based on the reconstructed image and one or more second parameters that are based on codec signaling information; filtering in a frequency domain and in a pixel domain the reconstructed image based on the determined filter parameters to obtain a filtered image; estimating the original image from the reconstructed image; and determining the first parameters based on the estimated original image; wherein the filter parameters are determined by: partitioning the estimated original image into blocks; and for each of the blocks: determining a cluster of patches that are similar to the block, 2D-transforming the cluster of patches to obtain transformed patches, and determining the first parameters based on the transformed patches; wherein the determining the first parameters based on the transformed patches comprises: regrouping elements of the transformed patches to obtain a matrix T.sub.i, wherein each row of the matrix T.sub.i comprises frequency components with same spatial frequencies, transforming the matrix T.sub.i to obtain a transformed matrix tf.sub.vw.sup.i, wherein each row of the matrix tf.sub.vw.sup.i is a 1D transform of a corresponding row of matrix T.sub.i, and determining the filter parameters g.sub.vw.sup.i as:
11. A non-transitory computer-readable storage medium storing program code, the program code comprising instructions that when executed by a processor configure the processor to perform steps comprising: reconstructing an image; determining one or more filter parameters based on one or more first parameters that are based on the reconstructed image and one or more second parameters that are based on codec signaling information; filtering in a frequency domain and in a pixel domain the reconstructed image based on the determined filter parameters to obtain a filtered image; estimating the original image from the reconstructed image; and determining the first parameters based on the estimated original image; wherein the filter parameters are determined by: partitioning the estimated original image into blocks; and for each of the blocks: determining a cluster of patches that are similar to the block, 2D-transforming the cluster of patches to obtain transformed patches, and determining the first parameters based on the transformed patches; wherein the determining the first parameters based on the transformed patches comprises: regrouping elements of the transformed patches to obtain a matrix T.sub.i, wherein each row of the matrix T.sub.i comprises frequency components with same spatial frequencies, transforming the matrix T.sub.i to obtain a transformed matrix tf.sub.vw.sup.i, wherein each row of the matrix tf.sub.vw.sup.i is a 1D transform of a corresponding row of matrix T.sub.i, and determining the filter parameters g.sub.vw.sup.i as:
Description
BRIEF DESCRIPTION OF THE DRAWINGS
(1) To illustrate the technical features of embodiments of the present invention more clearly, the accompanying drawings provided for describing the embodiments are introduced briefly in the following. The accompanying drawings in the following description show merely some embodiments of the present invention; modifications of these embodiments are possible without departing from the scope of the present invention as defined in the claims.
DETAILED DESCRIPTION OF THE EMBODIMENTS
(16)
(17) The image coding apparatus 100 comprises an image reconstruction unit 110, a parameter determination unit 120 and a mixed-domain filtering unit 130.
(18) The image reconstruction unit 110 is configured to reconstruct an image.
(19) The parameter determination unit 120 is configured to determine one or more filter parameters, based on one or more first parameters which are based on the reconstructed image and one or more second parameters which are based on codec signaling information.
(20) The mixed-domain filtering unit 130 is configured to filter in a frequency domain and a pixel domain the reconstructed image based on the determined filter parameters to obtain a filtered image.
(21) The image coding apparatus 100 can be an encoder and/or a decoder.
(22)
(23)
(24) The method comprises a first step 310 of reconstructing an image.
(25) The method comprises a second step 320 of determining one or more filter parameters based on one or more first parameters which are based on the reconstructed image and one or more second parameters which are based on codec signaling information.
(26) The method comprises a third step 330 of filtering in a frequency domain and in a pixel domain the reconstructed image based on the determined filter parameters to obtain a filtered image.
(27) The reconstructed (decoded) image can be divided into a set of small macroblocks and then each macroblock can be filtered by a filter as described below.
(28)
(29) The filter 400 comprises three blocks: a parameter estimation block 410, a mixed domain collaborative filtering block 420 and an application map block 430.
(30) Similar to the Adaptive Loop Filter (ALF), the parameter estimation block 410 calculates filter parameters. In contrast to ALF, however, the filter parameters are calculated without knowledge of the source image. The filter parameter estimation is based on two groups of input parameters. The first group of parameters is estimated based on the reconstructed image, and the second group of parameters is derived from codec signaling information which is already transferred from the encoder to the decoder. With this procedure the filter parameters can be estimated on the decoder side, so they need not be transferred from the encoder to the decoder side. As in ALF, a parameter estimation block calculates the filter impulse response, but the parameter estimation block 410 of filter 400 estimates a frequency impulse response, because base filtering is performed in the frequency domain. The frequency domain implementation allows building a more efficient non-linear filter. The frequency impulse response is an example of filter parameters.
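The second parameter group can be illustrated with a small sketch of deriving a quantization noise value from a quantization parameter. The text does not give the mapping, so everything here is an assumption: the step size follows the H.264/HEVC-style rule (step doubles every 6 QP units), and the noise is modeled as the standard deviation of a uniform quantization error. The function names are hypothetical.

```python
import math


def quant_step_from_qp(qp: int) -> float:
    """Quantization step for a given QP.

    Assumption: H.264/HEVC-style mapping, where the step doubles
    every 6 QP units and equals 1.0 at QP = 4.
    """
    return 2.0 ** ((qp - 4) / 6.0)


def quant_noise_from_qp(qp: int) -> float:
    """Estimated quantization noise level N for a given QP.

    Assumption: the quantization error is uniform over one step,
    so its standard deviation is step / sqrt(12).
    """
    return quant_step_from_qp(qp) / math.sqrt(12.0)
```

Because the value depends only on QP, which is already in the bitstream, the decoder can compute it without extra signaling. A real codec may use a different mapping.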
(31) In contrast to ALF, which performs local filtering in the pixel domain, non-local collaborative filtering is performed in a mixed domain (spatial frequency and pixel domain). Such an approach allows more efficient use of spatial redundancy. Initial filtering is performed in the frequency domain and the final averaging is performed in the pixel domain.
(32) Similar to ALF, during the rate-distortion optimization (RDO) process the application map block determines areas where filtering should be applied. If the coding gain from removing quantization noise from the decoded image is significantly greater than the degradation of the filtered decoded image, then filtering is applied. Otherwise, the unfiltered reconstructed image is used as output for the end user.
(33) The filter 400 can also be used as a decoder-side post filter. In this case the application map block 430 can be excluded from the post filter design, and no additional bit budget is needed for application map transmission. The coding gain is lower in this case, but it remains significant.
(34) The application map block 430 generates application map data that is provided to an entropy coder 440.
(35)
(36)
(37)
(38)
(39) For the particular case of a Wiener collaborative filter, the frequency impulse response (the filter parameter in the frequency domain) can be determined using the following procedure. In the first step, a StackTransform( ) procedure is performed for each frequency domain cluster F.sub.i.
(40)
(41) The following scanning rule is used: each row of matrix T.sub.i consists of the frequency components, taken from the different patches of the same frequency domain cluster F.sub.i, that have the same spatial frequencies [v,w]:
(42)
(43) In the last step 920 of the StackTransform( ) procedure the output matrix TF.sub.i is created. Each row of this matrix is a 1D transform 920 of the corresponding row of the matrix T.sub.i.
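The two StackTransform( ) steps can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes the cluster is given as an array of k 2D spectra, that k is a power of two, and that the 1D transform is an orthonormal Hadamard transform (the claims name Hadamard as one option). The names hadamard and stack_transform are hypothetical.

```python
import numpy as np


def hadamard(n: int) -> np.ndarray:
    """Sylvester-construction Hadamard matrix; n must be a power of two."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    h = np.array([[1.0]])
    base = np.array([[1.0, 1.0], [1.0, -1.0]])
    while h.shape[0] < n:
        h = np.kron(h, base)
    return h


def stack_transform(freq_cluster: np.ndarray) -> np.ndarray:
    """Regroup a frequency domain cluster and transform it row by row.

    freq_cluster has shape (k, n, m): the 2D spectra of k patches.
    Returns a matrix of shape (n * m, k).
    """
    k, n, m = freq_cluster.shape
    # Step 1 (regrouping): row v * m + w of T_i collects the component
    # at spatial frequency [v, w] from every patch in the cluster.
    t = freq_cluster.reshape(k, n * m).T
    # Step 2 (1D transform): transform each row along the patch axis.
    # Assumption: orthonormal Hadamard scaling by 1 / sqrt(k).
    return t @ hadamard(k).T / np.sqrt(k)
```

For a cluster of identical patches, all energy ends up in the first column of the result, which is what makes shrinkage of the remaining columns an effective way to suppress noise.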
(44) A frequency impulse response for a Wiener collaborative filter is determined by the following equation:
(45)
wherein is a column number in matrix tf.sub.vw.sup.i, spatial frequencies v,w correspond to a j-th row of matrix tf.sub.vw.sup.i and N is a quantization noise value derived from the codec signaling information.
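The equation itself is not reproduced in this text. As a hedged illustration, the sketch below assumes the standard Wiener shrinkage form, gain = tf^2 / (tf^2 + N^2), applied element-wise to the transformed matrix, with N the quantization noise value. The function name wiener_gain is hypothetical, and the exact expression in the original may differ.

```python
import numpy as np


def wiener_gain(tf: np.ndarray, noise: float) -> np.ndarray:
    """Element-wise Wiener-style gain for a transformed matrix.

    Assumption: gain = tf^2 / (tf^2 + noise^2), the textbook Wiener
    shrinkage form. Entries where the denominator is zero get gain 0.
    """
    power = np.asarray(tf, dtype=float) ** 2
    denom = power + float(noise) ** 2
    return np.divide(power, denom, out=np.zeros_like(power), where=denom > 0)


# Usage: strong components are kept, weak ones are attenuated.
tf_example = np.array([0.0, 1.0, 3.0])
filtered_example = wiener_gain(tf_example, noise=1.0) * tf_example
```

The gain lies between 0 and 1, so components well above the noise level pass almost unchanged while components near or below it are suppressed.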
(46)
(47) As in the filter parameter estimator, the image partitioning block 1010 creates a set of macroblocks which cover the reconstructed image. Then, for each reference macroblock from this set, the k closest blocks are found by the block matching block 1020 using an MSE metric. In the next step, the found spatial patches are combined into a pixel domain cluster corresponding to the reference macroblock. The 2D transform block 1030 performs a 2D transform over each patch in a chosen pixel domain cluster and produces a frequency domain cluster which comprises the 2D spectra of the corresponding pixel domain patches. The collaborative frequency domain filter 1040 performs collaborative filtering of the 2D spectra of the pixel domain patches using the filter parameters calculated in the previous step. The inverse 2D transform block 1050 returns the filtered frequency domain patches to the pixel domain. Then, pixel based collaborative filtering 1060 performs final averaging of the pixel domain patches corresponding to the reference macroblock.
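The block matching step can be sketched as an exhaustive MSE search. This is only an illustration under assumptions: the search is limited to a square window around the reference macroblock (the window and its size are not stated in the text), and the function name block_match and its parameters are hypothetical.

```python
import numpy as np


def block_match(image, top, left, size, k, search=8):
    """Find the k patches closest to a reference macroblock by MSE.

    Returns a list of (top, left) coordinates, starting with the
    reference block itself. Assumption: candidates are restricted to
    a window of +/- `search` pixels around the reference position.
    """
    image = np.asarray(image, dtype=float)
    height, width = image.shape
    ref = image[top:top + size, left:left + size]
    candidates = []
    for y in range(max(0, top - search), min(height - size, top + search) + 1):
        for x in range(max(0, left - search), min(width - size, left + search) + 1):
            if (y, x) == (top, left):
                continue  # the reference block is added separately
            patch = image[y:y + size, x:x + size]
            mse = float(np.mean((patch - ref) ** 2))
            candidates.append((mse, y, x))
    candidates.sort()  # smallest MSE first
    return [(top, left)] + [(y, x) for _, y, x in candidates[:k]]
```

The returned coordinates define the pixel domain cluster for the reference macroblock. Keeping coordinates rather than pixel copies makes the later pixel-domain averaging by position straightforward.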
(48)
P.sub.i=BlockMatch(S,b.sub.i)={b.sub.i,p.sub.i.sup.0,p.sub.i.sup.1,p.sub.i.sup.2, . . . ,p.sub.i.sup.k-1},
where S is the reconstructed image and p.sub.i.sup.j is the j-th patch corresponding to the reference macroblock b.sub.i. In the next stage, a 2D transform is performed for each patch from pixel domain cluster P.sub.i. A frequency domain cluster F.sub.i, which includes the 2D spectra of the pixel domain patches from pixel domain cluster P.sub.i, is used for collaborative filtering. In the general case the collaborative filter in the frequency domain performs frequency domain patch averaging and produces filtered frequency domain patches
R.sub.i=FreqCollaborativeFiltering(F.sub.i,G.sub.i)
corresponding to pixel domain patches. An inverse 2D transform returns the frequency domain filtered patches R.sub.i to the pixel domain and produces filtered pixel domain patches {tilde over (P)}.sub.i. In the last processing stage, the frequency domain filtered patches {tilde over (P)}.sub.0, {tilde over (P)}.sub.1, . . . , {tilde over (P)}.sub.M are averaged in the pixel domain based on a procedure SameBlockAvg( ), which will be described below.
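The forward and inverse 2D transforms in this chain can be illustrated with a Haar wavelet, which the claims name as one option for the 2D transform. The sketch below is a simplification under assumptions: it uses a single decomposition level, an orthonormal scaling, and patches with even side lengths. The names haar_matrix, haar2d and inverse_haar2d are hypothetical.

```python
import numpy as np


def haar_matrix(n: int) -> np.ndarray:
    """One-level orthonormal Haar analysis matrix for even n."""
    if n < 2 or n % 2:
        raise ValueError("n must be even")
    h = np.zeros((n, n))
    s = 1.0 / np.sqrt(2.0)
    for i in range(n // 2):
        # Top half: pairwise averages (low-pass).
        h[i, 2 * i] = s
        h[i, 2 * i + 1] = s
        # Bottom half: pairwise differences (high-pass).
        h[n // 2 + i, 2 * i] = s
        h[n // 2 + i, 2 * i + 1] = -s
    return h


def haar2d(patch: np.ndarray) -> np.ndarray:
    """Forward 2D transform: pixel domain patch to 2D spectrum."""
    patch = np.asarray(patch, dtype=float)
    return haar_matrix(patch.shape[0]) @ patch @ haar_matrix(patch.shape[1]).T


def inverse_haar2d(spectrum: np.ndarray) -> np.ndarray:
    """Inverse 2D transform: 2D spectrum back to the pixel domain."""
    spectrum = np.asarray(spectrum, dtype=float)
    return haar_matrix(spectrum.shape[0]).T @ spectrum @ haar_matrix(spectrum.shape[1])
```

Because the matrix is orthonormal, the inverse is simply the transpose, so a patch that passes through haar2d and inverse_haar2d without filtering is recovered up to floating-point rounding.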
(49)
(50)
(51) The collaborative filter in the pixel domain averages the same patch (a patch with fixed spatial coordinates) across all clusters which include this patch. This decreases noise while introducing only low edge distortion.
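The pixel-domain averaging can be sketched as follows. The SameBlockAvg( ) procedure is not specified in detail in this text, so this is an assumed reading: each filtered patch is keyed by its spatial coordinates, and all filtered versions of the same patch coming from different clusters are averaged with equal weights. The function name same_block_avg and the input layout are hypothetical.

```python
import numpy as np


def same_block_avg(clusters):
    """Average patches that share spatial coordinates across clusters.

    clusters: list of clusters, each a list of ((top, left), patch)
    pairs holding frequency domain filtered patches in the pixel domain.
    Returns a dict mapping (top, left) to the averaged patch.
    Assumption: plain arithmetic mean with equal weights.
    """
    sums = {}
    counts = {}
    for cluster in clusters:
        for coord, patch in cluster:
            patch = np.asarray(patch, dtype=float)
            if coord in sums:
                sums[coord] = sums[coord] + patch
                counts[coord] += 1
            else:
                sums[coord] = patch.copy()
                counts[coord] = 1
    return {coord: sums[coord] / counts[coord] for coord in sums}
```

A patch that appears in several clusters has been filtered several times in different contexts, so averaging those versions reduces the residual noise further.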
(52)
(53) If the coding gain from removing quantization noise from the decoded image is significantly greater than the degradation of the filtered decoded image, then filtering is applied. Otherwise, the reconstructed image is used as output for the end user. The application map block decisions are encoded with an entropy encoder 1430 and transferred from the encoder to the decoder side.
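An encoder-side application map decision can be sketched per block. The decision rule is not given numerically in the text, so this sketch assumes a simple criterion: a block uses the filtered pixels when the reduction in squared error against the original exceeds a margin. The margin parameter is a hypothetical stand-in for "significantly greater" (for example, the cost of signaling the flag), and the function name application_map is also hypothetical.

```python
import numpy as np


def application_map(original, reconstructed, filtered, block, margin=0.0):
    """Decide per block whether the filtered pixels should be used.

    Returns (use_filtered, output), where use_filtered is a boolean
    map with one entry per block and output is the resulting image.
    Assumption: a block is filtered when the squared-error reduction
    relative to the original exceeds `margin`.
    """
    original = np.asarray(original, dtype=float)
    reconstructed = np.asarray(reconstructed, dtype=float)
    filtered = np.asarray(filtered, dtype=float)
    height, width = original.shape
    rows = -(-height // block)  # ceiling division
    cols = -(-width // block)
    use_filtered = np.zeros((rows, cols), dtype=bool)
    output = reconstructed.copy()
    for by in range(rows):
        for bx in range(cols):
            sl = (slice(by * block, (by + 1) * block),
                  slice(bx * block, (bx + 1) * block))
            err_rec = np.sum((original[sl] - reconstructed[sl]) ** 2)
            err_fil = np.sum((original[sl] - filtered[sl]) ** 2)
            if err_rec - err_fil > margin:
                use_filtered[by, bx] = True
                output[sl] = filtered[sl]
    return use_filtered, output
```

This decision needs the original image, so it can only run in the encoder. The resulting map is what would be entropy coded and sent to the decoder.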
(54) As outlined above, embodiments of the invention overcome one or more of the following disadvantages of the Adaptive Loop Filter (ALF): In ALF, filter parameters need to be transmitted from the encoder to the decoder, because the filter parameters cannot be estimated in the decoder (the original signal is absent in the decoder and so cannot be used for filter parameter estimation). In ALF, linear pixel domain filtering is used, so the potential benefits of non-linear filtering cannot be achieved. In ALF, only local filtering is used, so spatial redundancy cannot be fully exploited.
(55) Further embodiments of the present invention may include:
1. A method and an apparatus for still image coding (compression), comprising:
- a reconstructed image generator corresponding to a coded source image; and
- an adaptive post filter in a mixed domain (spatial frequency and pixel domain) applied to the reconstructed image for post filtering (decoded signal improvement), where one part of the filter parameters is estimated from the reconstructed image and a second part of the filter parameters is derived from encoder signaling information which is already encoded into the bitstream and is used for encoded signal reconstruction in codecs without post filtering.
2. Same as previous, where both parts of the adaptive post filter parameters can be derived on the decoder side and so need not be encoded into the bitstream.
3. Same as previous, where an application map is implemented on the filter output for an optimal tradeoff between quantization noise suppression and decoded video degradation.
4. Same as previous, where filter parameter estimation is based on original image estimation from the reconstructed signal and on quantization noise estimation.
5. Same as 1 to 3, where original image estimation is based on the reconstructed image only.
6. Same as previous, where noise estimation is a function of an encoder quantization parameter (qp).
7. Same as previous, where the adaptive filter in the mixed domain comprises the following steps:
- generating a set of patches covering the reconstructed image;
- spatial search of patches similar to the reference block selected in the first stage;
- grouping the found patches into clusters;
- 2D transform of each patch from each cluster;
- collaborative filtering in the frequency domain of the 2D pixel patch spectra corresponding to one cluster;
- inverse 2D transform of the filtered frequency domain patches into the pixel domain;
- averaging, in the pixel domain, the frequency domain filtered pixel patches with the same coordinates from different patch clusters.
8. Same as 7, where a Haar wavelet transform is used as the 2D transform.
9. Same as 7 or 8, where Wiener collaborative filtering in the frequency domain is used as the collaborative filtering in the frequency domain.
10. Same as previous, where the adaptive_filtering_flag flag is used to signal when the proposed compression tool should be used.
11. Same as previous, where the frame_level_usage_flag flag is used to signal the case when the whole reconstructed image should be filtered.
12. Same as previous, where macroblock_size determines the macroblock size which should be used for filtering.
13. Same as previous, where the use_filtered_mb_flag flag shows whether a filtered macroblock should be used.
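How the four signaling fields might interact can be sketched as a small data structure. The field names come from the text, but everything else is an assumption made for illustration: the precedence of the flags, storing use_filtered_mb_flag as one value per macroblock, and the names FilterSignaling and macroblock_is_filtered. The text does not define the bitstream syntax or parsing order.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FilterSignaling:
    """Hypothetical container for the signaling fields named in the text."""
    adaptive_filtering_flag: bool   # mixed-domain filter is enabled
    frame_level_usage_flag: bool    # filter the whole reconstructed image
    macroblock_size: int            # macroblock size used for filtering
    # Assumption: one flag per macroblock, used only when the frame-level
    # flag is not set.
    use_filtered_mb_flag: List[bool] = field(default_factory=list)


def macroblock_is_filtered(sig: FilterSignaling, mb_index: int) -> bool:
    """Decide whether a macroblock takes the filtered pixels.

    Assumed precedence: tool enabled, then frame-level usage, then the
    per-macroblock flag.
    """
    if not sig.adaptive_filtering_flag:
        return False
    if sig.frame_level_usage_flag:
        return True
    return bool(sig.use_filtered_mb_flag[mb_index])
```

Under this reading, the per-macroblock flags only need to be transmitted when the filter is enabled but not applied to the whole image, which keeps the signaling overhead small.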
(56) The foregoing descriptions are only implementation manners of the present invention; the scope of the present invention is not limited to them. Variations or replacements can easily be made by a person skilled in the art. Therefore, the protection scope of the present invention should be subject to the protection scope of the attached claims.