Determining a quality measure for a processed video signal

Abstract

A method of determining a quality measure for a processed video signal generated from an original video signal. A statistical metric for a value for a set of pixels of the original video signal is determined, and the statistical metric for the value for a corresponding set of pixels of the processed video signal is also determined. The quality measure for the processed video signal is then determined by comparing the statistical metrics for the original video signal and the processed video signal.

Claims

1. A method of determining a quality measure for a processed video signal generated from an original video signal, the method comprising: determining a statistical metric for a value for a set of pixels of the original video signal; determining the statistical metric for the value for a corresponding set of pixels of the processed video signal; comparing the statistical metric for the original video signal with the statistical metric for the processed video signal to determine a quality measure for the processed video signal that approximates a peak signal-to-noise ratio of the processed video signal; determining a level of clipping for the set of pixels of the original video signal having a luminance value that falls outside a clipping value range; determining a level of clipping for the corresponding set of pixels of the processed video signal having a luminance value that falls outside the clipping value range; calculating a difference between the determined level of clipping between the original video signal and the determined level of clipping between the processed video signal; and modifying the quality measure for the processed video signal using the calculated difference between the respective levels of clipping for the original video signal and the processed video signal.

2. A method as claimed in claim 1, wherein the statistical metric for the value for the set of pixels of the original video signal and the processed video signal is a standard deviation of the value for the respective set of pixels.

3. A method as claimed in claim 1, wherein the value for a pixel is the luminance of the pixel.

4. A method as claimed in claim 3, further comprising determining the quality measure using a predetermined function that relates the statistical metrics to values for the peak signal-to-noise ratio.

5. A method as claimed in claim 1, further comprising modifying the quality measure for the processed video signal using a measure indicative of a number of distinct values in the set of pixels of the original video signal and/or the processed video signal.

6. A method as claimed in claim 5, wherein the measure is an entropy of the value for the set of pixels of the original video signal and/or the processed video signal.

7. A method as claimed in claim 1, wherein the set of pixels corresponds to a region of an image of the original video signal and processed video signal.

8. A method as claimed in claim 7, further comprising: determining the quality measure for a plurality of regions of the original video signal and the processed video signal, and determining the quality measure by comparing the statistical metrics for the original video signal and the processed video signal for each region.

9. A method as claimed in claim 1, further comprising determining an average quality measure from an average of the quality measures for a predetermined time period.

10. A method as claimed in claim 1, further comprising adding the determined statistical metric to fingerprint data for the original video signal and/or processed video signal.

11. A video signal fingerprint generator for use in a method as claimed in claim 1, configured to: receive an input video signal; determine a statistical metric for a value for a set of pixels of the video signal; and output fingerprint data for the input video signal; wherein the fingerprint data includes the determined statistical metric for the input video signal.

12. A video signal quality measure determiner for use in a method as claimed in claim 1, configured to: receive a statistical metric for a value for a set of pixels of an original video signal; receive the statistical metric for the value for a corresponding set of pixels of a processed video signal; and determine the quality measure for the processed video signal by comparing the statistical metrics for the original video signal and the processed video signal.

13. A method as claimed in claim 1, wherein the determining of the level of clipping for the set of pixels of the original and the processed video signals comprises determining a portion of the respective sets of pixels having the luminance value that falls outside the clipping value range.

14. A system for determining a quality measure for a processed video signal generated from an original video signal, comprising: a first video signal fingerprint generator configured to receive the original video signal as an input video signal and output fingerprint data for the input video signal that includes a statistical metric for a value for a set of pixels of the original video signal and a level of clipping for the set of pixels of the original video signal that have a luminance value that falls outside a clipping value range; a second video signal fingerprint generator configured to receive the processed video signal as an input video signal and output fingerprint data for the input video signal that includes a statistical metric for a value for a set of pixels of the processed video signal and a level of clipping for the set of pixels of the processed video signal that have a luminance value that falls outside a clipping value range; and a video signal quality measure determiner configured to: receive the fingerprint data output by the first and second video signal fingerprint generators; receive the statistical metric for the value for the set of pixels of the original video signal; receive the statistical metric for the value for the corresponding set of pixels of the processed video signal; compare the statistical metric for the original video signal with the statistical metric for the processed video signal to determine a quality measure for the processed video signal that approximates a peak signal-to-noise ratio of the processed video signal; obtain, from the fingerprint data received from the first video signal fingerprint generator, the level of clipping for the set of pixels of the original video signal having the luminance value that falls outside a clipping value range; obtain, from the fingerprint data received from the second video signal fingerprint generator, the level of clipping for the corresponding set of pixels of the processed video signal having the luminance value that falls outside the clipping value range; calculate a difference between the determined level of clipping of the original video signal and the determined level of clipping of the processed video signal, and modify the quality measure for the processed video signal using the calculated difference between the respective levels of clipping for the original video signal and the processed video signal.

15. The system as claimed in claim 14, wherein the statistical metric for the value for the set of pixels of the original video signal and the processed video signal is a standard deviation of the value for the respective set of pixels.

16. The system as claimed in claim 14, wherein the value for a pixel is the luminance of the pixel.

17. The system as claimed in claim 16, wherein the video signal quality measure determiner is further configured to determine the quality measure using a predetermined function that relates the statistical metrics to values for the peak signal-to-noise ratio.

18. The system as claimed in claim 14, wherein the video signal quality measure determiner is further configured to modify the quality measure for the processed video signal using a received measure indicative of a number of distinct values in the set of pixels of the original video signal and/or the processed video signal.

19. The system as claimed in claim 18, wherein the measure is an entropy of the value for the set of pixels of the original video signal and/or the processed video signal.

20. The system as claimed in claim 14, wherein the set of pixels corresponds to a region of an image of the original video signal and processed video signal, and wherein the video signal quality measure determiner is further configured to: determine the quality measure for a plurality of regions of the original video signal and processed video signal, and determine the quality measure by comparing the statistical metrics for the original video signal and the processed video signal for each region.

21. The system as claimed in claim 14, wherein the video signal quality measure determiner is further configured to determine the level of clipping for the set of pixels of the original and the processed video signals by determining a portion of the respective sets of pixels having the luminance value that falls outside the clipping value range.

Description

DESCRIPTION OF THE DRAWINGS

(1) Embodiments of the present invention will now be described by way of example only with reference to the accompanying schematic drawings of which:

(2) FIG. 1 shows a video processing system in accordance with a first embodiment of the invention;

(3) FIG. 2 is a flowchart showing the operation of the video signal fingerprint generators of FIG. 1;

(4) FIG. 3 is an image from a video signal split into a plurality of regions; and

(5) FIG. 4 is a graph plotting approximated values against actual PSNR values for a test set of video signals.

DETAILED DESCRIPTION

(6) An embodiment of the invention is now described, with reference to FIGS. 1 to 4.

(7) FIG. 1 shows a video processing system 1 in which the embodiment of the invention is implemented. The video processing system may be a part of a system for producing the video content for a television station, for example. The video processing system 1 takes an original video signal V.sub.in, which is processed by a video signal processor 2, to produce a processed video signal V.sub.out. It will be appreciated that the video signal processor 2 may in practice comprise multiple processing systems that each perform one or more processing operations on the original video signal as it passes through the video processing system 1.

(8) The video processing system 1 further comprises a first video signal fingerprint generator 3, which receives the original video signal V.sub.in before it is passed to the video signal processor 2, and a second video signal fingerprint generator 4, which receives the processed video signal V.sub.out after it has been processed by the video signal processor 2.

(9) As discussed in more detail below, each of the first video signal fingerprint generator 3 and second video signal fingerprint generator 4 generates a stream of low bandwidth fingerprint data from its respective video signal, and passes that fingerprint data to a correlator 5. Again as discussed in more detail below, the correlator 5 analyses the two streams of fingerprint data, and then passes them to a PSNR estimator 6, which determines an estimation of the PSNR of the video signals.

(10) The operation of the first and second video signal fingerprint generators 3 and 4 of FIG. 1 is now described in more detail with reference to the flowchart of FIG. 2. Each of the video signal fingerprint generators operates in an identical fashion.

(11) First, the video signal fingerprint generator receives an input video signal (step 101). The fingerprint generator determines conventional fingerprint data from the input video signal (step 102), for example as described in WO 2009/104022 A2 published 27 Aug. 2009. The conventional fingerprint data is used by the correlator 5 as described below.

(12) The fingerprint generator then determines the standard deviation of the luminance of the pixels of the input video signal (step 103). In fact, the standard deviation is determined separately for a plurality of regions of an image of the input video signal, as shown in FIG. 3. As can be seen, the a frame of the input video signal (or in the case of an interlaced signal, a field of the input video signal) is divided into four regions 201, 202, 203 and 204. In other embodiments, the frame may be divided into different regions, which may be different in number and/or shape. In still other embodiments, pixels from a plurality of frames are used to determine the standard deviation.

(13) The square of the standard deviation of each region is calculated from the variance of each region, expressed as a first accumulator that accumulates the squares of the pixel luminance values, and the square of a second accumulator that accumulates the pixel values themselves:

(14) $σ^{2} = (\frac{1}{N} {.Math.}_{n = 0}^{N - 1} Y_{n} \times Y_{n}) - {(\frac{1}{N} {.Math.}_{n = 1}^{N - 1} Y_{n})}^{2}$

(15) (The advantage of using this approach is that it allows the result to be obtained using only one pass of the data.)

(16) The standard deviation is then the square root of the calculated square of the standard deviation:
σ=√{square root over (σ.sup.2)}

(17) Next, the fingerprint generator determines the entropy for each region (step 104). The entropy is calculated from a histogram of luminance values for the region, where the histogram for each region has 256 “bins” b.sub.0 to b.sub.255, as:

(18) $e = \log_{2} (C) - \frac{1}{C} {.Math.}_{n = 0}^{255} b_{n} \log_{2} (b_{n})$
where C is the number of pixels in the region and b.sub.n is the bin occupancy for bin b.sub.n. (256 bins are used in the case that the data has 8 bits; it will be appreciated that a different number of bins could be used, particularly in the case that the data has a different number of bits.)

(19) The fingerprint generator then determines the clip values for each region (step 105). The clip values for each region can be determined from the same histogram used to determine the entropy, as:

(20) $\begin{matrix} Y^{lo} = {.Math.}_{n = 0}^{Y^{lo}} b_{n} \approx \frac{C}{64} \\ Y^{hi} = {.Math.}_{n = Y^{hi}}^{255} b_{n} \approx \frac{C}{64} \end{matrix}$
where Y.sup.lo and Y.sup.hi are the lower and upper bounds of the allowed luminance values, and the ≈ sign indicates the first bin encountered in each case for which the bin occupancy is greater than C/64. (It will be appreciated that a number other than 64 could be used.)

(21) Once the various values for the input video signal have been determined, they are combined to generate fingerprint data for the video signal (step 106), which is then output (step 107).

(22) The determined fingerprint data is provided so that it is available when required for determining quality measures using the video signals. It will be appreciated that in other embodiments the fingerprint data may already be available having been generated elsewhere, and so can the existing fingerprint data rather than needing to be calculated from the pixels of the video signal itself.

(23) In either case, the fingerprint data for the original video signal and the processed video signal is passed to the correlator 5. The correlator 5 uses the conventional fingerprint data to identify corresponding frames of the original video signal and processed video signal, using any appropriate technique, so that errors do not occur due to one of the video signals being delayed with respect to the other, with the result that the standard deviations and other values for different frames are compared.

(24) The correlator 5 then passes the other determined fingerprint data for the original video signal and processed video signal, i.e. the standard deviation, entropy and clip values, to the PSNR estimator 6, so that the PSNR estimator 6 receives the fingerprint data for corresponding frames of the video signals.

(25) The PSNR estimator 6 then uses this data to estimate the PSNR for the processed video signal, as follows. The standard deviation difference (plus corrections) Δs for the processed video signal A and original video signal B is:

(26) $Δ s = \frac{1}{R} {.Math.}_{i = 1}^{R - 1} .Math. S_{i}^{A} - S_{i}^{B} + Δ C_{i}^{lo (A, B)} + Δ C_{i}^{hi (A, B)} .Math. \times (α + \frac{(1 - α)}{2} (1 + \tanh ((E_{i}^{A} + E_{i}^{B} - q) \times s)))$
for regions R of the video signals. The standard deviations S.sub.i for the two video signals are used to determine their difference, and differences in the standard deviations due to clipping are compensated for by the terms ΔC.sub.i.sup.lo(A,B) and ΔC.sub.i.sup.hi(A,B), which give a measure of the difference in the clipping of the values in the regions, and are calculated as:
ΔC.sub.i.sup.lo(A,B)=Y.sub.i.sup.lo(A)−Y.sub.i.sup.lo(B)
ΔC.sub.i.sup.hi(A,B)=Y.sub.i.sup.hi(A)−Y.sub.i.sup.hi(B)
for low clip value Y.sub.i.sup.lo and high clip value Y.sub.i.sup.hi. The entropies E.sub.i are used to compensate for overestimating at low entropies, where α, q and s are parameters determined to be appropriate to give a good result. Example values for the case where the luminance values are 8-bit are 0.2, 6 and 0.5 respectively. The hyperbolic tangent function tank then provides a “soft switch” which is 0.0 when the entropies are 0.0 and 1.0 when they are 8.0, and switches between the values 0.0 and 1.0 at the value q.

(27) The standard deviation difference Δs is then used to give a value y:

(28) $y = \begin{matrix} \log_{10} (Δ s) & if Δ s > 0 \\ 0 & otherwise \end{matrix}$
and this is passed as an argument to an exponential function as follows:

(29) $P (t) = \exp (\frac{(A_{m} y + A_{c}) - 4 y}{B_{m} y + B_{c}})$
where the parameter t indicates that the value is for a particular time t. A.sub.m, A.sub.c, B.sub.m and B.sub.c are appropriate parameters determined from a set of test video signals using standard statistical methods, by determining Δs for each of the test video signals and comparing it to the actual PSNR values for each video signal as calculated by a conventional method. A graph plotting approximated values against actual PSNR values for a test set of video signals is shown in FIG. 4. Example values for the parameters are −14, 40.287, −3.9497 and 11.7182 respectively.

(30) The PSNR at a time t can then be approximated by averaging the values for the surrounding 16 frames, as follows:

(31) 0 $PSNR \approx \frac{1}{16} {.Math.}_{k = - 8}^{7} Clip [P (t + k), 0, 48]$
where the function Clip ensures the values being averaged are within an appropriate range of values (i.e. between 0 and 48 inclusive).

(32) While the present invention has been described and illustrated with reference to particular embodiments, it will be appreciated by those of ordinary skill in the art that the invention lends itself to many different variations not specifically illustrated herein.

Determining a quality measure for a processed video signal

Assignee

Inventors

Cpc classification

Classification Explorer

H04N21/44008

ELECTRICITY

Classification Explorer

H04N21/647

ELECTRICITY

Classification Explorer

H04N5/91

ELECTRICITY

Classification Explorer

H04N17/00

ELECTRICITY

Classification Explorer

G06F17/18

PHYSICS

Classification Explorer

H04N2005/91335

ELECTRICITY

International classification

Classification Explorer

H04N7/12

ELECTRICITY

Classification Explorer

H04N21/647

ELECTRICITY

Classification Explorer

H04N5/91

ELECTRICITY

Classification Explorer

H04N21/44

ELECTRICITY

Classification Explorer

H04N17/00

ELECTRICITY

Classification Explorer

G06F17/18

PHYSICS

Abstract

Claims

Description