SYSTEM AND METHOD TO ESTIMATE BLOCKINESS IN TRANSFORM-BASED VIDEO ENCODING

20220295074 · 2022-09-15

    Abstract

    A method for estimating blockiness in a video frame of transform-based video encoding includes: obtaining a bitstream of a transform coded video signal, the signal being partitioned into video frames and all operations being performed on a per frame basis, wherein coefficients constituting transforms encoded in the bitstream of the video frames are read; averaging the coefficients of the transforms encoded in the bitstream into one averaged transform matrix per transform block size i; generating or making available one weighting matrix per averaged transform of block size i; computing intermediate weighted average transform matrices; processing all members of each weighted and averaged transform matrix into a single value per transform of block size i, to obtain intermediate signals; and computing a single value by weighting values of the intermediate signals according to an area in the respective video frame and adding up the weighted values of the intermediate signals.

    Claims

    1. A method for estimating blockiness in a video frame of transform-based video encoding, the method comprising: obtaining a bitstream of a transform coded video signal, the signal being partitioned into video frames and all operations being performed on a per frame basis, wherein coefficients constituting transforms encoded in the bitstream of the video frames are read; averaging the coefficients of the transforms encoded in the bitstream into one averaged transform matrix per transform block size i; generating or making available one weighting matrix per averaged transform of block size i, comprising weighting factors; computing intermediate weighted average transform matrices by processing each averaged transform matrix with the corresponding weighting matrix; processing all members of each weighted and averaged transform matrix into a single value per transform of block size i, to obtain intermediate signals Bls[i]; and computing a single value Bl.sub.0 by weighting values of the intermediate signals Bls[i] according to an area in the respective video frame and adding up the weighted values of the intermediate signals Bls[i]; wherein the area is the area which the quantity of the transform matrices of block size i cover per video frame, and wherein this area is dependent on the number of transforms of a specific block size i, NTr[i], in the examined frame and the number of pixels each of these transforms covers, and wherein Bl.sub.0 is computed using the formula
    Bl.sub.0=Bls[4×4]*NTr[4×4]+4*Bls[8×8]*NTr[8×8]+16*Bls[16×16]*NTr[16×16]+ . . . ; wherein blockiness is further estimated as Bl.sub.1 using a factor C(QP) dependent on a frame averaged quantization parameter QP and a codec dependent maximum QP, QP_max, where C(QP) is defined as C=exp (5*((QP_max−QP)/QP_max)); and wherein Bl.sub.1=Bl.sub.0*C(QP).

    2. The method according to claim 1, wherein the weighting step is performed using linear, quadratic or logarithmic weighting, or using predefined weighting matrices; wherein for linear weighting, the weighting increases linearly with the frequency a coefficient of the transform matrix to be weighted represents; wherein for quadratic weighting, the weighting increases quadratically with the frequency a coefficient of the transform matrix to be weighted represents; and wherein for logarithmic weighting, the weighting increases logarithmically with the frequency a coefficient of the transform matrix to be weighted represents.

    3. The method according to claim 1, wherein the processing of each averaged transform matrix with the corresponding weighting matrix is performed by multiplying both matrices value by value.

    4. The method according to claim 1, wherein the processing of each averaged and weighted transform matrix of block size i to obtain the intermediate signals Bls[i] is performed by adding up all averaged and weighted coefficients of the matrix into one result.

    5. The method according to claim 1, wherein blockiness Bl is computed as Bl=1/(1+Bl.sub.1), so that Bl becomes 1 in case of maximal blockiness and 0 for minimal blockiness.

    6. A system for estimating blockiness in a video frame of transform-based video encoding, the system comprising: data processing means configured to obtain a bitstream of a transform coded video signal, the signal being partitioned into video frames and all operations being performed on a per frame basis, wherein coefficients constituting transforms encoded in the video signal frames are read; averaging means configured to average the coefficients of the transforms encoded in the bitstream into one averaged transform matrix per transform block size i; weighting means configured to generate or make available one weighting matrix per averaged transform of block size i, comprising weighting factors, wherein the weighting means are further configured to compute intermediate weighted and averaged transform matrices by processing each averaged transform matrix with the corresponding weighting matrix; computing means configured to compute all members of each weighted and averaged transform matrix into a single value per transform of block size i, to obtain intermediate signals Bls[i]; and computing means configured to compute a single value Bl.sub.0 by weighting values of the intermediate signals Bls[i] according to an area in the respective video frame and adding up the weighted values of the intermediate signals Bls[i], wherein the area is the area which the quantity of transform matrices of block size i cover per video frame, wherein this area is dependent on the number of transforms of a specific size i, NTr[i], in the examined frame and the number of pixels each of these transforms covers, and wherein Bl.sub.0 is computed using the formula
    Bl.sub.0=Bls[4×4]*NTr[4×4]+4*Bls[8×8]*NTr[8×8]+16*Bls[16×16]*NTr[16×16]+ . . . ; wherein blockiness is further estimated as Bl.sub.1 using a factor C(QP) dependent on a frame averaged quantization parameter QP and a codec dependent maximum QP, QP_max, where C(QP) is defined as C=exp (5*((QP_max−QP)/QP_max)); and wherein Bl.sub.1=Bl.sub.0*C(QP).

    7. The system according to claim 6, wherein the weighting means is configured to compute the intermediate weighting matrices using linear, quadratic or logarithmic weighting, or using predefined weighting matrices; wherein for linear weighting, the weighting increases linearly with the frequency a coefficient of the transform matrix to be weighted represents; wherein for quadratic weighting, the weighting increases quadratically with the frequency a coefficient of the transform matrix to be weighted represents; and wherein for logarithmic weighting, the weighting increases logarithmically with the frequency a coefficient of the transform matrix to be weighted represents.

    8. The system according to claim 6, wherein the weighting means is further configured to process each averaged transform matrix with the corresponding weighting matrix by multiplying both matrices value by value.

    9. The system according to claim 6, wherein the computing means is configured to process each averaged and weighted transform matrix i to obtain the intermediate signals Bls[i] by adding up all coefficients of that matrix into one result.

    10. The system according to claim 6, wherein blockiness Bl is computed as Bl=1/(1+Bl.sub.1), so that Bl becomes 1 in case of maximal blockiness and 0 for minimal blockiness.

    Description

    BRIEF DESCRIPTION OF THE DRAWINGS

    [0023] Subject matter of the present disclosure will be described in even greater detail below based on the exemplary figures. All features described and/or illustrated herein can be used alone or combined in different combinations. The features and advantages of various embodiments will become apparent by reading the following detailed description with reference to the attached drawings, which illustrate the following:

    [0024] FIG. 1 shows a flowchart of method steps according to an embodiment of the invention;

    [0025] FIGS. 2A and 2B show two exemplary weighting matrices for linear weighting;

    [0026] FIGS. 3A and 3B show two exemplary weighting matrices for quadratic weighting; and

    [0027] FIGS. 4A and 4B show two exemplary weighting matrices for logarithmic weighting.

    DETAILED DESCRIPTION

    [0028] In the present invention, absolute values of DC-coefficients are not considered. Therefore, blockiness can be measured on every frame of a compressed video.

    [0029] Exemplary embodiments of the present invention provide a system and method to estimate blockiness in transform-based video encoding on a frame by frame basis, regardless of the frame type. The resulting indicator is intended as an addition to any bitstream-based method of quality estimation, in order to make its result more precise, or as a standalone metric to diagnose problems with video encoding quality.

    [0030] An exemplary embodiment of the present invention relates to a method to estimate blockiness in a video frame of transform-based video encoding. The method comprises the following steps:

    [0031] obtaining a bitstream of a transform coded video signal, the signal being partitioned into video frames and all operations being performed on a per frame basis, wherein coefficients constituting transforms encoded in the bitstream of the video frames are read;

    [0032] averaging the coefficients of the transforms encoded in the video bitstream into one averaged transform matrix per transform block size i;

    [0033] generating or making available one weighting matrix per averaged transform of block size i, comprising weighting factors;

    [0034] computing intermediate weighted average transform matrices by processing each averaged transform matrix with the corresponding weighting matrix;

    [0035] processing all members of each weighted and averaged transform matrix into a single value per transform of block size i, to obtain an intermediate signal Bls[i]; and

    [0036] computing a single value Bl.sub.0 by weighting the values of the intermediate signal Bls[i] according to an area in the respective video frame and adding up the weighted Bls[i].

    [0037] The area is the area which the quantity of the transform matrices of block size i cover per video frame. This area is dependent on the number of transforms of a specific block size i, NTr[i], in the examined frame and the number of pixels each of these transforms covers.

    [0038] Blockiness is further estimated as Bl.sub.1 using a factor C(QP) dependent on a frame averaged quantization parameter QP and a codec dependent maximum QP, QP_max, where C(QP) is defined as


    C=exp(5*((QP_max−QP)/QP_max)),


    and


    Bl.sub.1=Bl.sub.0*C(QP).

    [0039] The method may further comprise multiplying Bl.sub.0 with a QP-dependent term: Bl.sub.1=Bl.sub.0*C(QP), where QP is the averaged quantization parameter of the video frame. This operation is performed in order to avoid false blockiness detection for low values of QP.

    [0040] Bl.sub.1 may be transformed into a more convenient form, Bl, by applying an inverse operation like Bl=1/(1+Bl.sub.1), which restricts the range of Bl to values from zero, for no blockiness, to one, for high blockiness.

    [0041] Exemplary embodiments of the invention further relate to a data processing apparatus which comprises components for carrying out the method described above.

    [0042] The invention further relates to a system for estimating blockiness in a video frame of transform-based video encoding. The system comprises data processing means configured to obtain a bitstream of a transform coded video signal, the signal being partitioned into video frames. The coefficients constituting transforms encoded in the video frames are read. Averaging means are configured to average the transforms into one averaged transform matrix per transform size i. Weighting means are configured to generate or compute weighting matrices or make weighting matrices available. The weighting means are further configured to compute intermediate averaged and weighted transform matrices by processing the averaged transform matrices with a corresponding weighting matrix of the same block size i.

    [0043] Computing means are configured to process all members of the averaged and weighted transform matrices into a single value per transform block size i, to obtain an intermediate signal Bls[i]. The computing means are further configured to compute a single value Bl.sub.0 by weighting the members of the intermediate signal Bls[i] according to an area in the respective video frame, and adding up the weighted Bls[i]. Hence, a weighted sum Bl.sub.0 of all Bls[i] may be calculated, where the weighting may be according to the area that the transforms of block size i cover in the current video frame, so that the weighting will be proportional to the number of transforms of a specific block size i in the examined frame and the number of pixels that each transform of block size i covers.

    [0044] Processing means may be configured to multiply the intermediate blockiness value Bl.sub.0 with a term dependent on the averaged quantization parameter QP resulting in an intermediate Blockiness value Bl.sub.1, which avoids false blockiness detection at low values of QP, and finally applying an inverse operation like Bl=1/(1+Bl.sub.1) to restrict the range of Bl to values from 0 for “no blockiness” to 1 for “high blockiness”.

    [0045] Thus, the system may be adapted to and comprise components configured to carry out the method steps described above.

    [0046] Exemplary embodiments of the invention provide a method to estimate, for each frame of a compressed video sequence, an indicator value which reflects the assumed blockiness in that frame. This indicator can serve conveniently as an addition to any bitstream-based method of quality estimation, to make its result more precise.

    [0047] As described above, blockiness is the consequence of a lack of high frequency coefficients of the transformed and quantized picture content. The measure therefore relies on a weighted sum of the coefficients of averaged transform-blocks, where the weighting increases with the frequency the transform-coefficient is representing. If the result of such a weighted sum is very small or even zero, this is a sufficient sign that no or very few higher frequency coefficients are present in the transform, and therefore blockiness is likely to be observed.

    [0048] As shown in the embodiment of FIG. 1, the bitstream of a transform coded video signal is obtained (S1) as a first step and the signal is partitioned into video frames. For each video frame, the coded transform coefficients, which are part of the bitstream, are read.

    [0049] In a second step, an averaging of all the read transforms in a frame is performed (S2). Here, only the amplitudes of the coefficients (the absolute values) are taken into account. After this process, for every occurring transform block size i, there is one averaged coefficient matrix available. Intra-blocks in predicted frames are not taken into account, because they tend to have different statistical properties.
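    The averaging of step S2 can be sketched in Python as follows. This is a minimal sketch, not the implementation of the invention: it assumes the coefficients of one frame have already been parsed from the bitstream into plain lists of rows, and that the caller has already excluded intra-blocks of predicted frames. The function name average_abs_transforms and the input layout are hypothetical.

```python
def average_abs_transforms(blocks):
    """Average absolute coefficients per transform block size.

    `blocks` is an iterable of 2-D coefficient matrices (lists of rows),
    assumed to be already decoded from the bitstream of one frame.
    Returns {(rows, cols): (averaged_matrix, number_of_transforms)}.
    """
    sums = {}
    counts = {}
    for block in blocks:
        size = (len(block), len(block[0]))
        if size not in sums:
            sums[size] = [[0.0] * size[1] for _ in range(size[0])]
            counts[size] = 0
        acc = sums[size]
        for r, row in enumerate(block):
            for c, coeff in enumerate(row):
                acc[r][c] += abs(coeff)  # only amplitudes are used
        counts[size] += 1
    return {
        size: ([[v / counts[size] for v in row] for row in acc], counts[size])
        for size, acc in sums.items()
    }
```

    The returned count per block size corresponds to NTr[i], which is needed later for the area weighting.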

    [0050] In a third step, weighting matrices are made available (S3), one for every occurring transform block size. For a convenient description, the elements of the averaged coefficient-matrix and the weighting matrix are indexed by the matrix row r=1 . . . size.sub.r and the matrix column c=1 . . . size.sub.c, where size.sub.r and size.sub.c are the sizes of the transform in the vertical and horizontal direction. At present, both sizes can take the values 4, 8, 16, and 32. Larger sizes are possible, but not yet used in any available video encoding algorithm. Consequently, size.sub.max=32 for now.

    [0051] The weighting matrices may be designed in order to amplify high frequency coefficients and diminish low frequency coefficients. Matrices with logarithmic, linear or quadratic characteristics are recommended solutions, which are exemplified below.

    [0052] In established video compression standards, like MPEG-2, H.264, HEVC or VP9, two-dimensional transforms are defined as square. Only the recently introduced AV1 uses a combination of different vertical and horizontal sizes.

    [0053] Apart from the rule that the higher the frequency of a coefficient, the higher the weighting, different weighting-algorithms are conceivable. Three different algorithms to create a possible weighting-table are described below. Other possibilities include predefined “hand-filled” or automatically optimized tables.

    a) Linear weighting: the weighting increases linearly with the frequency a coefficient represents. FIGS. 2A and 2B show exemplary weighting matrices.
    The weighting-factors W for the two-dimensional weighting-matrix are defined by:

    if((r==1)&&(c==1))
        W.sub.lin(r,c)=0
    else
        W.sub.lin(r,c)=(r*(size.sub.max/size.sub.r)+c*(size.sub.max/size.sub.c))/(2*size.sub.max)

    With this formula, the diagonal lines from top right to bottom left, which represent coefficients with similar frequency dependency, are assigned the same value, rising from 0.0 in the upper left corner to 1.0 in the bottom right corner. In FIGS. 2A and 2B this is exemplified for 4×4 and 8×8 matrices.
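    The linear weighting can be sketched in Python as follows. The sketch assumes that the sum of both scaled indices is divided by 2*size.sub.max, which is the reading of the formula that yields 1.0 in the bottom right corner. The names linear_weights and SIZE_MAX are hypothetical.

```python
SIZE_MAX = 32  # largest transform size named in the description


def linear_weights(size_r, size_c, size_max=SIZE_MAX):
    """Build W_lin using 1-based row index r and column index c."""
    weights = []
    for r in range(1, size_r + 1):
        row = []
        for c in range(1, size_c + 1):
            if r == 1 and c == 1:
                row.append(0.0)  # the DC position gets zero weight
            else:
                # Assumed grouping: (scaled r + scaled c) / (2 * size_max)
                num = r * (size_max / size_r) + c * (size_max / size_c)
                row.append(num / (2 * size_max))
        weights.append(row)
    return weights


w4 = linear_weights(4, 4)
w8 = linear_weights(8, 8)
```

    Scaling both indices by size.sub.max/size means that a 4×4 and an 8×8 matrix both end at 1.0 in the bottom right corner, so weights of different block sizes stay comparable.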

    b) Quadratic weighting:

    [0054]
    W.sub.sqr(r,c)=W.sub.lin(r,c)*W.sub.lin(r,c)

    With this weighting, low frequency coefficients are weighted even less than with linear weighting. FIGS. 3A and 3B exemplify this for 4×4 and 8×8 matrices.
    c) Logarithmic weighting:


    W.sub.log(r,c)=ln(a*W.sub.lin(r,c)+1)/b

    As an example, in FIGS. 4A and 4B, logarithmic weighting is exemplified for a=9 and b=2.3. Using logarithmic weighting, coefficients with medium frequency are weighted more strongly than with linear weighting.
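    The relation between the three weightings can be illustrated with a short Python sketch that maps a single linear weight W.sub.lin to its quadratic and logarithmic counterparts. The function names are hypothetical; the defaults a=9 and b=2.3 are the example values given above.

```python
import math


def quadratic_weight(w_lin):
    """W_sqr = W_lin * W_lin."""
    return w_lin * w_lin


def logarithmic_weight(w_lin, a=9.0, b=2.3):
    """W_log = ln(a * W_lin + 1) / b."""
    return math.log(a * w_lin + 1.0) / b


# Compare the three characteristics for a few linear weights.
for w in (0.0, 0.25, 0.5, 1.0):
    print(w, quadratic_weight(w), logarithmic_weight(w))
```

    With a=9 and b=2.3, the logarithmic weight of W.sub.lin=1.0 is ln(10)/2.3, which is approximately 1.0, so all three characteristics share roughly the same end points and differ mainly in the medium frequency range.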

    [0055] According to FIG. 1, in a fourth step, the weighting process takes place (S4). For this purpose, the averaged absolute coefficient-matrices (one for each transform block size) are multiplied value by value with the corresponding weighting matrix values.

    [0056] In a fifth step the members of the resulting weighted matrices are processed (S5) into one value Bls[i] per transform-size i. Bls may be called an intermediate signal. Processing may be performed by adding up all members of each averaged and weighted transform matrix into one value Bls[i] per transform-size i. For normalizing purposes, these results Bls[i] may then be divided by the number of nonzero coefficients in the corresponding averaged coefficient matrix.
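    Steps S4 and S5 can be sketched together in Python as follows. This is a minimal sketch assuming that both matrices are given as lists of rows of equal shape; the function name intermediate_signal and the normalize flag are hypothetical.

```python
def intermediate_signal(avg_matrix, weight_matrix, normalize=True):
    """Weight the averaged matrix value by value and sum it into Bls[i]."""
    total = 0.0
    nonzero = 0
    for avg_row, w_row in zip(avg_matrix, weight_matrix):
        for avg, w in zip(avg_row, w_row):
            total += avg * w  # S4: value-by-value multiplication
            if avg != 0:
                nonzero += 1
    # S5: optional normalization by the number of nonzero averaged
    # coefficients; skipped for an all-zero matrix to avoid dividing by 0.
    if normalize and nonzero > 0:
        total /= nonzero
    return total
```

    Because the weight at the DC position is zero, the DC coefficient does not contribute to the sum, although in this sketch it still counts as a nonzero coefficient for the normalization.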

    [0057] In a sixth step, the intermediate results Bls[i] are merged into one final result. For this process, the intermediate results Bls[i] are weighted (S6) according to the area the corresponding transforms of block size i cover in the video frame, which is proportional to the number of transforms of that specific size i in the examined frame, NTr[i], and the number of pixels each transform covers.


    Bl.sub.0=Bls[4×4]*NTr[4×4]+4*Bls[8×8]*NTr[8×8]+16*Bls[16×16]*NTr[16×16]+ . . . .

    The resulting Bl.sub.0 is a good indicator for a lack of high frequency transform-coefficients and thus for blockiness. Still, as mentioned above, it is possible that even at relatively low values of QP no high frequency coefficients need to be transmitted, and yet no blockiness is visible.

    [0058] Each video frame may be coded with a different transform size i. The “area” denotes the area which these transforms with size i cover in a video frame. It may be computed from the block size of the transformation (e.g. block size 16, i.e. area=16×16 pixels) and the number of transforms with that size (NTr[i]) in the video frame.
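    The area weighting of step S6 can be sketched in Python as follows. The sketch assumes that the area factor is expressed in units of one 4×4 block (16 pixels), which reproduces the factors 1, 4 and 16 of the formula for Bl.sub.0. The function name area_weighted_sum and the dictionary layout are hypothetical.

```python
def area_weighted_sum(bls, ntr):
    """Combine Bls[i] into Bl0, weighting each term by the covered area.

    `bls` maps a (rows, cols) transform size to Bls[i];
    `ntr` maps the same key to the number of transforms NTr[i].
    """
    bl0 = 0.0
    for size, value in bls.items():
        rows, cols = size
        area_factor = (rows * cols) / 16.0  # 4x4 -> 1, 8x8 -> 4, 16x16 -> 16
        bl0 += area_factor * value * ntr[size]
    return bl0


# Made-up example values, for illustration only.
bls = {(4, 4): 2.0, (8, 8): 1.0, (16, 16): 0.5}
ntr = {(4, 4): 10, (8, 8): 5, (16, 16): 2}
bl0 = area_weighted_sum(bls, ntr)  # 2.0*10 + 4*1.0*5 + 16*0.5*2
```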

    [0059] In a seventh step, the result Bl.sub.0 may therefore be made more reliable. Since the probability of blockiness is reduced with a decreasing value of QP, this effect can be taken into account by multiplying Bl.sub.0 with a term C(QP) (S7), which is 1.0 for the maximum QP value, QP_max (with the maximum probability for blockiness), and which increases exponentially for lower values of QP.


    C=exp(5.0*((QP_max−QP)/QP_max))


    Bl.sub.1=Bl.sub.0*C

    As a consequence, Bl.sub.1 is only marginally influenced by C(QP) in the range of QP where blockiness has a high probability, but is increased significantly for small values of QP.
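    The QP-dependent correction of step S7 can be sketched in Python as follows. The function names are hypothetical, and the value 51 used in the example is the maximum QP of 8-bit H.264 and HEVC, chosen here only for illustration.

```python
import math


def qp_correction(qp, qp_max):
    """C(QP) = exp(5 * (QP_max - QP) / QP_max); equals 1.0 at QP == QP_max."""
    return math.exp(5.0 * ((qp_max - qp) / qp_max))


def corrected_blockiness(bl0, qp, qp_max):
    """Bl1 = Bl0 * C(QP)."""
    return bl0 * qp_correction(qp, qp_max)


# C grows as the frame averaged QP falls below QP_max.
for qp in (51, 40, 30, 20):
    print(qp, qp_correction(qp, 51))
```

    At QP=0 the factor reaches its maximum of exp(5), so a frame that lacks high frequency coefficients only because its content is smooth receives a much larger Bl.sub.1 at low QP than at high QP.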

    [0060] In a final step, Bl.sub.1 is transformed (S8) in order to make it more convenient to use. The fewer high frequency coefficients are present in a video frame, the smaller the measure Bl.sub.1 becomes. For a video frame with strong blockiness, it will generally be zero or very close to zero. For high-quality video it can reach values of several hundred. This property makes the parameter unfavourable for any further use in a MOS-model. Therefore, Bl.sub.1 may be converted to


    Bl=1.0/(1.0+Bl.sub.1)

    This final version of the blockiness measure, Bl, is a real number in the range [0, 1], where 0 is the lowest blockiness and 1 is maximal blockiness. For higher picture quality, it quickly decreases to values close to zero.
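    The final conversion of step S8 can be sketched in Python as follows. The function name blockiness is hypothetical, and the input is assumed to be non-negative, as is the case when Bl.sub.1 is built from absolute coefficient values and non-negative weights.

```python
def blockiness(bl1):
    """Map Bl1 (0 for strong blockiness, large for high quality) to Bl."""
    return 1.0 / (1.0 + bl1)


# Bl1 == 0 (no high frequency coefficients) maps to maximal blockiness.
strong = blockiness(0.0)
# A large Bl1, as seen for high-quality video, maps to a value near zero.
weak = blockiness(300.0)
```

    The mapping is monotonically decreasing, so the ordering of frames by Bl.sub.1 is preserved in reverse, while the result stays bounded for use in a MOS-model.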

    [0061] The present invention also relates to a system for carrying out the method described above. Furthermore, a data processing apparatus comprising components for carrying out the method described above is also encompassed by the invention.

    [0062] Other aspects, features, and advantages will be apparent from the disclosure of the present application.

    [0063] While subject matter of the present disclosure has been illustrated and described in detail in the drawings and foregoing description, such illustration and description are to be considered illustrative or exemplary and not restrictive. Any statement made herein characterizing the invention is also to be considered illustrative or exemplary and not restrictive as the invention is defined by the claims. It will be understood that changes and modifications may be made by those of ordinary skill within the scope of the following claims. In particular, the present invention covers further embodiments with any combination of features from different embodiments described above and below.

    [0064] The terms used in the claims should be construed to have the broadest reasonable interpretation consistent with the foregoing description. For example, the use of the article “a” or “the” in introducing an element should not be interpreted as being exclusive of a plurality of elements. Likewise, the recitation of “or” should be interpreted as being inclusive, such that the recitation of “A or B” is not exclusive of “A and B,” unless it is clear from the context or the foregoing description that only one of A and B is intended. Further, the recitation of “at least one of A, B and C” should be interpreted as one or more of a group of elements consisting of A, B and C, and should not be interpreted as requiring at least one of each of the listed elements A, B and C, regardless of whether A, B and C are related as categories or otherwise. Moreover, the recitation of “A, B and/or C” or “at least one of A, B or C” should be interpreted as including any singular entity from the listed elements, e.g., A, any subset from the listed elements, e.g., A and B, or the entire list of elements A, B and C.

    [0065] Furthermore, in the claims the word “comprising” does not exclude other elements or steps, and the indefinite article “a” or “an” does not exclude a plurality. A single unit may fulfil the functions of several features recited in the claims. The terms “essentially”, “about”, “approximately” and the like in connection with an attribute or a value particularly also define exactly the attribute or exactly the value, respectively. Any reference signs in the claims should not be construed as limiting the scope.