Dithering for chromatically subsampled image formats
09762876 · 2017-09-12
Assignee
Inventors
Cpc classification
G09G2340/02
PHYSICS
H04N9/646
ELECTRICITY
H04N19/90
ELECTRICITY
G09G3/2048
PHYSICS
H04N1/646
ELECTRICITY
H04N9/68
ELECTRICITY
International classification
G09G3/20
PHYSICS
H04N9/68
ELECTRICITY
H04N19/90
ELECTRICITY
H04N1/64
ELECTRICITY
Abstract
Dithering techniques for images are described herein. An input image of a first bit depth is separated into a luma and one or more chroma components. A model of the optical transfer function (OTF) of the human visual system (HVS) is used to generate dither noise which is added to the chroma components of the input image. The model of the OTF is adapted in response to viewing distances determined based on the spatial resolution of the chroma components. An image based on the original input luma component and the noise-modified chroma components is quantized to a second bit depth, which is lower than the first bit depth, to generate an output dithered image.
Claims
1. A method to dither images, the method comprising: by apparatus comprising one or more data processors configured by software, one or more programmable logic devices, one or more logic circuits or a combination thereof: receiving an input image in a first bit depth, the input image represented in a first color format comprising a luma component and one or more chroma components; generating white noise; filtering the white noise according to the inverse of an optical transfer function (OTF) of the human visual system (HVS) to generate dither noise, wherein the OTF is adapted in response to at least a horizontal viewing distance or a vertical viewing distance, the horizontal and vertical viewing distances being based on spatial resolution of the chroma components; and adding the dither noise to the one or more chroma components to generate noise-modified chroma components.
2. The method of claim 1, further comprising adding the dither noise to the luma component to generate a noise-modified luma component.
3. The method of claim 1, further comprising: converting the luma component and the noise-modified chroma components to a second image in a second color format; and quantizing the second image to generate an output dithered image in a second bit depth, wherein the second bit depth is lower than the first bit depth.
4. The method of claim 2, further comprising: quantizing the noise-modified luma component and the one or more noise modified chroma components to generate an output dithered image in a second bit depth, wherein the second bit depth is lower than the first bit depth.
5. The method of claim 1, further comprising: quantizing the luma component and the noise-modified chroma components to generate an output dithered image in a second bit depth, wherein the second bit depth is lower than the first bit depth.
6. The method of claim 1 wherein the white noise comprises two or more separate white noise outputs, wherein each noise output is generated using a different seed number.
7. The method of claim 1, wherein a model of the OTF comprises the function
8. The method of claim 7, wherein d=3 mm.
9. The method of claim 7, wherein the OTF model is adapted to the digital image domain by converting frequencies between cycles/degrees to cycles/pixel by using a conversion function
10. The method of claim 1, wherein at least one chroma component of the input image comprises a spatial resolution that is lower than the spatial resolution of the luminance component of the input image.
11. The method of claim 10, wherein the horizontal resolution of at least one color component of the input image is half of the horizontal resolution of the luminance component of the input image.
12. The method of claim 1, wherein the first color format comprises a 4:2:2 or 4:2:0 YCbCr color format.
13. The method of claim 6 wherein each of the components of the white noise output is filtered separately to generate one or more dither noise components.
14. The method of claim 13 further comprising: scaling a first dither noise component by a first noise scale factor to generate a first scaled dither noise component; and scaling a second dither noise component by a second noise scale factor to generate a second scaled dither noise component, wherein the first and second scale factors are determined in response to local region analysis of the input image.
15. The method of claim 14, wherein the local region analysis comprises detecting whether at least a pixel of the input image is within one or more predetermined bounds of pixel component values.
16. The method of claim 4 wherein before converting the luma component and the noise-modified chroma components to the output dither image in the second bit depth, the noise-modified chroma components are up-scaled to match the spatial resolution of the luma component.
17. The method of claim 1 wherein the first color format is a linearized color format.
18. The method of claim 17, wherein the linearized color format is linearized YCbCr, LMS, or CIE XYZ.
19. The method of claim 1, wherein the steps up to generating the dithering noise are computed off-line by the processor, while the remaining steps are computed in real-time by the processor or a second processor.
20. A method to dither images, the method comprising: by apparatus comprising one or more data processors configured by software, one or more programmable logic devices, one or more logic circuits or a combination thereof: receiving an input image in a first bit depth, the input image represented in a first color format comprising a luma component in a first spatial resolution and one or more chroma components in a second spatial resolution, the second spatial resolution being lower than the first spatial resolution in at least a vertical or a horizontal direction; generating white noise; filtering the white noise according to the inverse response of a first low-pass filter to generate chroma dither noise, wherein the response of the first low-pass filter is adapted in response to the second spatial resolution; filtering the white noise according to the inverse response of a second low-pass filter to generate luma dither noise, wherein the response of the second low-pass filter is adapted in response to the first spatial resolution; adding the luma dither noise to the luma component to generate a noise-modified luma component; adding the chroma dither noise to the one or more chroma components to generate noise-modified chroma components; and generating an output dithered image in a second bit depth in response to quantizing the noise-modified luma component and the noise-modified chroma components, wherein the second bit depth is lower than the first bit depth.
21. The method of claim 20, wherein the responses of the first low-pass filter and the second low-pass filter match the response of a model of an optical transfer function of the human visual system.
22. An apparatus comprising a processor and configured to perform the method of claim 1.
23. A non-transitory computer-readable storage medium having stored thereon computer-executable instructions for executing a method in accordance with claim 1.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
(1) An embodiment of the present invention is illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings, in which like reference numerals refer to similar elements and in which:
(2)
(3)
(4)
DESCRIPTION OF EXAMPLE EMBODIMENTS
(5) Dithering techniques for images are described herein. A model of the optical transfer function (OTF) of the human visual system (HVS) is used to shape noise which is added to the color components of a chromatically subsampled video signal. In the following description, for the purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the present invention. It will be apparent, however, that the present invention may be practiced without these specific details. In other instances, well-known structures and devices are not described in exhaustive detail in order to avoid unnecessarily obscuring the present invention.
(6) Overview
(7) Example embodiments described herein relate to the dithering of images. An input image of a first bit depth is separated into a luma and one or more chroma components. A model of the optical transfer function (OTF) of the human visual system (HVS) is used to generate dither noise which is added to the chroma components of the input image to generate noise-modified chroma components. The dither noise is generated by filtering pseudo-random white noise using a filter that is based on the inverse of the OTF, where the model of the OTF is adapted in response to viewing distances determined based on the spatial resolution of the subsampled chroma components.
(8) In one embodiment, the dithered output image is generated by quantizing the input luma component and the noise-modified chroma components to a second bit depth which is lower than the first bit depth.
(9) In another embodiment, the input luma signal component and the noise-modified chroma components are converted to a second image in a second color format before being quantized to the second bit depth to generate the dithered output image.
(10) In some embodiments the input image signal is converted to a linearized color space before being dithered.
(11) In some embodiments the OTF model is based on the Deeley OTF model for a fixed pupil size.
(12) In some embodiments the filtered white noise is further adjusted according to local region analysis of the input image before being added to the chroma components.
Example of OTF-Based Dithering Noise Generation
(13) Given a sequence of N-bit input images (e.g., N=8 bits) to be quantized down to P-bit images, where P<N, digital dithering adds noise to these images before the lower bits are dropped or quantized. The basic engineering trade-off is to add as much noise as possible, so that as many effective perceptual bits as possible are preserved through the bit-depth reduction process, while keeping the noise itself invisible. The invisibility depends primarily on display and viewing distance parameters. In an embodiment, the noise characteristics of the noise source used in image dithering are determined based on a model of the optical transfer function (OTF) of the human visual system (HVS).
(14) The OTF of the HVS, from now on to be denoted simply as OTF, is a strictly low-pass function, and is thus a better representative of an averaging process than the contrast sensitivity function (CSF) of the HVS. Further, the OTF is a linear shift-invariant filter in the linear luminance domain, whereas the CSF is a complex nonlinear process, sometimes modeled as a combination of filters and amplitude nonlinearities. Lastly, the majority of the high-frequency attenuation of the CSF is caused by the OTF.
(15) In an example embodiment, the dither noise may be spectrally shaped so that it is the inverse of the OTF, so that the behavior of the visual system's OTF will result in a perceptually uniform noise, equally visible at all frequencies. This will give the maximum noise variance for any level of visibility. In general, the design approach is to keep the dither noise equally invisible for all frequencies. That is, the OTF's effect on the noise is precompensated, so the noise reaching the retina is a white noise.
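As a non-limiting illustration of the precompensation argument, let W(f) denote the flat spectrum of the white noise source, N(f) the spectrum of the shaped dither noise, and R(f) the spectrum of the noise after it has passed through the optics of the eye. These symbols are introduced here for illustration only, and the relation assumes that the OTF model is non-zero over the frequencies of interest.

```latex
N(f) = \frac{W(f)}{\mathrm{OTF}(f)}, \qquad
R(f) = \mathrm{OTF}(f)\, N(f) = W(f)
```

Because W(f) is constant over frequency, R(f) is constant as well, which is the sense in which the noise reaching the retina is white.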
(16) The OTF can be modeled with several functions, arising from different data sets, and the particular version used is not important. For example, Williams et al., in “Double-pass and interferometric measures of the optical quality of the eye,” JOSA A 11.12 (1994): 3123-3135, incorporated herein by reference, describe the OTF using the following equations:
(17)
where s is the visual frequency in cy/deg, and the equation parameters are a=0.1212, w_1=0.3481, and w_2=0.6519.
(18) Another common OTF representation is described in Deeley, Robin J., Neville Drasdo, and W. Neil Charman. “A simple parametric model of the human ocular modulation transfer function.” Ophthalmic and Physiological Optics 11.1 (1991): 91-93, which is incorporated herein by reference in its entirety.
(19) The advantage of the Deeley model is that it is parameterized for pupil size, d. The equation for this OTF is given by:
(20)
where f is the spatial frequency in cycles per degree (cy/deg) and d is the pupil size in millimeters (mm).
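By way of a non-limiting illustration, a pupil-parameterized OTF of this kind may be sketched in Python as follows. The function name `deeley_otf` is hypothetical, and the constants 20.9, 2.1, 1.3 and 0.07 are an assumption taken from the form commonly quoted from the Deeley et al. paper, since the equation is not reproduced in this text.

```python
import numpy as np


def deeley_otf(f_cpd, pupil_mm=3.0):
    """Deeley-style ocular transfer function.

    f_cpd    : spatial frequency in cycles per degree (scalar or array)
    pupil_mm : pupil size d in millimeters

    ASSUMPTION: the numeric constants follow the commonly quoted form of the
    Deeley et al. (1991) model; they are not taken from this document.
    """
    f = np.asarray(f_cpd, dtype=float)
    scale = 20.9 - 2.1 * pupil_mm      # frequency scale in cy/deg
    power = 1.3 - 0.07 * pupil_mm      # exponent controlling the roll-off
    return np.exp(-((f / scale) ** power))
```

Under this sketch the response equals one at zero frequency and decreases monotonically with frequency, consistent with the strictly low-pass behavior described above.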
(21) In some embodiments, instead of using the OTF function, one may apply any low-pass filter defined within the frequency spectrum of the HVS (e.g., 0 to 60 cy/deg). Then the 1/OTF noise filter (220) may be represented by any high-pass filter defined within the frequency spectrum of the HVS.
(22) In order to apply the OTF to the digital image domain, the frequencies need to be converted from cy/deg to cy/pixel. (Note that 0.5 cy/pixel is the Nyquist folding frequency, that is, the maximum possible frequency that can be carried in a digital image.) The following equations are used to convert between the visual spatial frequencies, given in cy/deg, and physical frequencies, such as cy/mm, or digital frequencies, in cy/pixel:
(23)
where D is the viewing distance, measured either in the same units as the physical frequencies in equation (3), (e.g., in mm) or in pixels (see equation (4)). For example, when viewing full high-definition (HD) television (e.g., using a 1920×1080 pixel resolution), at the common three picture heights viewing distance (3H), D=3×1080=3240 pixels.
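As a non-limiting illustration, the conversion between digital and visual frequencies may be sketched in Python as follows. The function names are hypothetical, and the sketch assumes the small-angle approximation, under which one degree of visual angle spans D·π/180 pixels when the viewing distance D is expressed in pixels; the exact conversion equations may differ in form.

```python
import math


def pixels_per_degree(viewing_distance_px):
    """Pixels spanned by one degree of visual angle.

    ASSUMPTION: small-angle approximation, with the viewing distance D
    expressed in pixels.
    """
    return viewing_distance_px * math.pi / 180.0


def cpp_to_cpd(f_cpp, viewing_distance_px):
    """Convert a digital frequency (cy/pixel) to a visual frequency (cy/deg)."""
    return f_cpp * pixels_per_degree(viewing_distance_px)


def cpd_to_cpp(f_cpd, viewing_distance_px):
    """Convert a visual frequency (cy/deg) to a digital frequency (cy/pixel)."""
    return f_cpd / pixels_per_degree(viewing_distance_px)
```

For the 3H example above (D=3240 pixels), the Nyquist frequency of 0.5 cy/pixel maps to roughly 28.3 cy/deg under this approximation.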
Spectral Shaping for Chromatic Subsampling
(24) In an example embodiment, dithering is applied to signals with a luminance/chrominance (also referred to as luma/chroma) representation of a video signal, such as YCbCr, YUV, Lab, and the like. In such representations, the majority of the luminance information is carried in the Y or L signals. The chrominance component signals (e.g., CbCr) carry very little luminance information, and are referred to as being approximately isoluminant, or pseudo-isoluminant.
(25) In an embodiment, dither noise is added solely to the pseudo-isoluminant signals, such as Cr and Cb, and not to the luminance component. This is because the visibility of noise and other spatial patterns is much higher in the luminance channel than in the chromatic channels.
(26) In an embodiment, the spectrum of the noise is shaped based on the spatial resolution of the chroma components, which may be different than the spatial resolution of the luma component. For example, using a 4:2:2 chromatic sampling, chroma components are sub-sampled by a factor of two in the horizontal direction. For example, a 1920×1080 video signal may comprise a luma signal component (e.g., Y) at a 1920×1080 spatial resolution and chroma signal components (e.g., Cb and Cr) at a 960×1080 spatial resolution. In another example, using 4:2:0 chromatic sampling, chroma components are sub-sampled by a factor of two in both the horizontal and vertical directions. Then a 1920×1080 video signal may comprise chroma components at a 960×540 spatial resolution.
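As a non-limiting illustration, the chroma plane dimensions for the sampling formats mentioned above may be computed as sketched below in Python. The function name `chroma_resolution` and the format strings used as keys are hypothetical.

```python
def chroma_resolution(width, height, sampling):
    """Return (chroma_width, chroma_height) for a given luma resolution.

    sampling is one of "4:4:4", "4:2:2" or "4:2:0" (hypothetical keys).
    """
    # (horizontal factor, vertical factor) by which chroma is sub-sampled
    factors = {"4:4:4": (1, 1), "4:2:2": (2, 1), "4:2:0": (2, 2)}
    fx, fy = factors[sampling]
    return width // fx, height // fy
```

For a 1920×1080 luma plane this returns 960×1080 for 4:2:2 and 960×540 for 4:2:0, matching the examples above.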
(27) In an embodiment, the noise is added in the subsampled domain. Before finally being displayed, the chromatically subsampled image is upsampled as needed for display (e.g., to a 4:4:4 representation) and generally is converted from a luma/chroma representation to an RGB representation to drive the display primaries. The dither noise, which is added to the signal, also undergoes the same process, and its spectrum is altered by the geometry of the chromatic upscaling process. Therefore, the noise is designed to compensate for the upscaling process, so that it has the desired spectral shape after upsampling. The desired spectral shape is the inverse of the OTF, as described previously.
(28)
(29) As depicted in
(30) As depicted in
(31) Using separate OTF models for the vertical and horizontal frequencies (e.g., OTF(f_h, d) and OTF(f_w, d), based on equation (2)), one may generate a 2-D OTF model (e.g., OTF(f_h, f_w)). In some embodiments, the 2-D model may be Cartesian separable (e.g., OTF(f_h, f_w)=OTF(f_h, d)*OTF(f_w, d)). In some other embodiments, the joint model may be Polar-separable.
(32) In general, the OTF function is rotationally symmetric, hence it is Polar separable; however, a Cartesian model may work as well due to variations across humans. In an example embodiment, let
r = \sqrt{f_w^2 + f_h^2}, (5)
then OTF(f_h, f_w)=OTF(r, d) of equation (2).
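As a non-limiting illustration, the Polar-separable and Cartesian-separable 2-D models may be sampled on a discrete frequency grid as sketched below in Python. All names are hypothetical. The sketch assumes the commonly quoted Deeley form for the 1-D OTF (its constants are not taken from this text), a small-angle conversion from cy/pixel to cy/deg, and separate vertical and horizontal viewing distances expressed in chroma samples.

```python
import numpy as np


def deeley_otf(f_cpd, pupil_mm=3.0):
    # ASSUMPTION: commonly quoted Deeley et al. constants, not from this text.
    f = np.asarray(f_cpd, dtype=float)
    return np.exp(-((f / (20.9 - 2.1 * pupil_mm)) ** (1.3 - 0.07 * pupil_mm)))


def otf_2d(shape, dist_h_px, dist_w_px, pupil_mm=3.0, polar=True):
    """Sample a 2-D OTF on the FFT frequency grid of an image plane.

    shape     : (rows, cols) of the plane the noise will be added to
    dist_h_px : vertical viewing distance, in samples of that plane
    dist_w_px : horizontal viewing distance, in samples of that plane
    """
    rows, cols = shape
    # Digital frequencies (cy/sample) converted to cy/deg per axis
    # (ASSUMPTION: small-angle approximation, D * pi / 180 samples per degree).
    f_h = np.abs(np.fft.fftfreq(rows)) * dist_h_px * np.pi / 180.0
    f_w = np.abs(np.fft.fftfreq(cols)) * dist_w_px * np.pi / 180.0
    fh, fw = np.meshgrid(f_h, f_w, indexing="ij")
    if polar:
        r = np.sqrt(fw ** 2 + fh ** 2)          # radial frequency, equation (5)
        return deeley_otf(r, pupil_mm)
    return deeley_otf(fh, pupil_mm) * deeley_otf(fw, pupil_mm)
```

The two variants agree along the frequency axes and differ only at oblique frequencies, where the Cartesian product attenuates more strongly than the rotationally symmetric model.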
(33) Step (235) represents a two-dimensional white (e.g. Gaussian) noise generator. In a preferred embodiment the noise generator generates distinct noise outputs (237-A and 237-B) for each of the chroma channels using two separate pseudo-random generator seeds. In an example embodiment, for noise output within the (0,255) range, the noise generator may generate white noise with σ=30.
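As a non-limiting illustration, a two-dimensional Gaussian white noise generator with a distinct seed per chroma channel may be sketched in Python as follows. The function name and the default seed values are hypothetical; σ=30 follows the example above.

```python
import numpy as np


def white_noise_planes(shape, seeds=(1, 2), sigma=30.0):
    """Generate one zero-mean Gaussian white noise plane per seed.

    shape : (rows, cols) of each noise plane
    seeds : one pseudo-random seed per output (hypothetical default values)
    sigma : standard deviation of the noise
    """
    return [np.random.default_rng(seed).normal(0.0, sigma, size=shape)
            for seed in seeds]
```

Using a separate seed per channel keeps the two noise planes distinct from each other while making each plane reproducible from its seed.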
(34) Each of the outputs (237-A, 237-B) of the noise generator (235) is filtered using a filter based on the inverse of the OTF (that is, 1/OTF). Filtering may be performed either in the spatial domain or in the frequency domain. Since the OTF (215) is typically in the frequency domain, filtering in the frequency domain comprises a) transforming the output (237) of the noise generator into the frequency domain, say by using a Fast Fourier Transform (FFT), b) multiplying the output of the transform step by the inverse OTF, and c) transforming the product back to the spatial domain by using an inverse transform, such as an inverse FFT. Hence, the outputs (222 and 224) of the noise filter (220) represent two sets of 2D dither noise patterns to be added to the chroma components (e.g., 202-Cr and 202-Cb) of the input signal to be dithered (e.g., signal 202).
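As a non-limiting illustration, steps a) to c) of the frequency-domain filtering may be sketched in Python as follows. The names are hypothetical. The function `radial_lowpass` is only a stand-in for an OTF sampled on the FFT grid, and the lower bound `floor` applied before the division is an added assumption to avoid dividing by very small values.

```python
import numpy as np


def radial_lowpass(shape, cutoff_cpp=0.25):
    """HYPOTHETICAL stand-in for an OTF sampled on the FFT grid (cy/pixel)."""
    fy = np.fft.fftfreq(shape[0])[:, None]
    fx = np.fft.fftfreq(shape[1])[None, :]
    return np.exp(-np.sqrt(fx ** 2 + fy ** 2) / cutoff_cpp)


def shape_noise(noise, otf, floor=1e-3):
    """Filter white noise by 1/OTF in the frequency domain.

    noise : 2-D white noise plane
    otf   : 2-D OTF samples on the same FFT grid as the noise
    floor : lower bound on the OTF before inversion (ASSUMPTION)
    """
    spectrum = np.fft.fft2(noise)                 # a) forward transform
    shaped = spectrum / np.maximum(otf, floor)    # b) multiply by 1/OTF
    return np.real(np.fft.ifft2(shaped))          # c) inverse transform
```

Because the stand-in response is a function of the frequency magnitude only, the filtered result is real up to rounding error, and taking the real part discards only that residue.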
(35) As depicted in
(36) If the input pixel is “full red” then W_Cr=c1*W_Cr and W_Cb=c2*W_Cb;
where c1 and c2 are predetermined scalars (e.g., c1=0 and c2=2.0).
(37) In some embodiments, the local region analysis (225) may operate in a color domain (e.g., RGB) different than the input color domain (e.g., YCbCr). In some other embodiments, the local region analysis (225) may operate in the same color domain as the input color domain (e.g., YCbCr).
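As a non-limiting illustration, the "full red" rule may be sketched in Python as per-pixel noise scale maps. The function name, the 8-bit thresholds `hi` and `lo` used to detect a "full red" pixel, and the default weight of 1.0 elsewhere are all assumptions; the sketch also assumes the RGB image is sampled at the resolution of the chroma planes.

```python
import numpy as np


def noise_scale_maps(rgb, c1=0.0, c2=2.0, hi=250, lo=5):
    """Return per-pixel scale factors (W_Cr, W_Cb) for the dither noise.

    rgb    : array of shape (rows, cols, 3) with 8-bit R, G, B values
    c1, c2 : predetermined scalars applied to W_Cr and W_Cb for "full red"
    hi, lo : HYPOTHETICAL bounds that define a "full red" pixel
    """
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    full_red = (r >= hi) & (g <= lo) & (b <= lo)
    w_cr = np.where(full_red, c1, 1.0)   # default weight 1.0 is an assumption
    w_cb = np.where(full_red, c2, 1.0)
    return w_cr, w_cb
```

The returned maps can then be multiplied element-wise with the corresponding filtered noise planes before the noise is added to the chroma components.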
(38) After optional weighting (230-A and 230-B), dithering noise is added to the chroma components (202-Cb and 202-Cr) of the input signal (202) to generate noise-modified color components (246-Cb, 246-Cr). Dithering noise (222 and 224) may be represented in a smaller bit-depth than the input signal bit-depth. In an embodiment, noise and signal are added by aligning the least-significant bits of the two signals.
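As a non-limiting illustration, adding integer dither noise to a chroma plane with the least-significant bits aligned may be sketched in Python as follows. The function name is hypothetical, and rounding the noise to integers and clipping the sum to the valid code range are assumptions that the text above does not specify.

```python
import numpy as np


def add_dither(chroma, noise, bit_depth):
    """Add dither noise to a chroma plane with least-significant bits aligned.

    chroma    : unsigned integer plane at the input bit depth
    noise     : dither noise plane of the same shape, in code values
    bit_depth : input bit depth N
    """
    # The noise is added in code-value units without any shift, so its
    # least-significant bit lines up with that of the signal.
    total = chroma.astype(np.int32) + np.rint(noise).astype(np.int32)
    # ASSUMPTION: clip to the valid range instead of wrapping around.
    return np.clip(total, 0, (1 << bit_depth) - 1).astype(np.uint16)
```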
(39) Following the addition of the dither noise, the original luma signal (202-Y) and the noise-modified chroma signals (246-Cb and 246-Cr) may be quantized directly, or they may be converted first to a display-dependent color space (e.g., RGB) in step (245). This step may also comprise up-sampling of the chroma components to match the resolution of the luma component. Following the (optional) color conversion (245), its output (247) may be quantized to the desired bit-depth (e.g., P<N) for each color component, using any quantization scheme known in the art, to generate the output dithered image (249). The addition of the filtered noise to the sub-sampled chroma component signals, the subsequent conversion back to RGB, and the truncation of bit-depth in RGB are shown as the real-time process in the bottom half of
(40) In some embodiments, the input signal to be dithered (e.g., 202) may be in a gamma-corrected domain, which is approximately a power function of luminance (e.g., 1/2.2). In some embodiments, to take advantage of the fact that the OTF filtering process acts like a linear filter in the linear luminance domain, an additional signal-linearization step (not shown) may precede the noise-adding steps (240-A and 240-B). In such embodiments (not shown), signal (202) may be generated by a) converting an original input YCbCr signal to RGB, b) applying an inverse gamma function to the RGB signal, and c) converting back to linear YCbCr (e.g., Y′Cb′Cr′). In some other embodiments, the dither noise addition steps may be performed in an alternative linearized color space (e.g., LMS or CIE XYZ). In such embodiments, the color conversion step (245) may be adapted as needed (e.g., LMS to RGB or XYZ to RGB).
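As a non-limiting illustration, the linearization steps a) to c) may be sketched in Python as follows. The function names are hypothetical. The sketch assumes BT.709 luma coefficients, a pure power-law gamma of 2.2, normalized signals (Y in [0, 1], Cb and Cr in [-0.5, 0.5]), and chroma planes already at the luma resolution; none of these choices is mandated by the text above.

```python
import numpy as np

# ASSUMPTION: BT.709 luma coefficients; other matrices may be used instead.
KR, KB = 0.2126, 0.0722
KG = 1.0 - KR - KB


def ycbcr_to_rgb(ycc):
    """a) Convert normalized YCbCr (last axis of size 3) to RGB."""
    y, cb, cr = ycc[..., 0], ycc[..., 1], ycc[..., 2]
    r = y + 2.0 * (1.0 - KR) * cr
    b = y + 2.0 * (1.0 - KB) * cb
    g = (y - KR * r - KB * b) / KG
    return np.stack([r, g, b], axis=-1)


def rgb_to_ycbcr(rgb):
    """c) Convert RGB (last axis of size 3) to normalized YCbCr."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y = KR * r + KG * g + KB * b
    cb = (b - y) / (2.0 * (1.0 - KB))
    cr = (r - y) / (2.0 * (1.0 - KR))
    return np.stack([y, cb, cr], axis=-1)


def linearize(ycc, gamma=2.2):
    """Gamma-corrected YCbCr to linear YCbCr via RGB."""
    rgb = np.clip(ycbcr_to_rgb(ycc), 0.0, 1.0)
    return rgb_to_ycbcr(rgb ** gamma)     # b) inverse gamma, then c)
```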
(41)
(42) As depicted in
(43) Filter (220) now comprises three separate 1/OTF filters, one for each color component, each filter filtering white noise generated by the 2D spatial white noise generator (235) discussed earlier. Noise generator (235) may now use three distinct seeds, one for each of the color components.
(44) As depicted in
(45) Following the addition of the dither noise, the modified luma signal (246-Y) and the noise-modified chroma signals (246-Cb and 246-Cr) are quantized by quantizer (250) to the desired bit-depth (e.g., P<N), using any quantization scheme known in the art, to generate the output dithered image (252).
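As a non-limiting illustration, one simple quantization scheme, dropping the N−P least-significant bits, may be sketched in Python as follows. The function name is hypothetical, and truncation is only one of the quantization schemes that may be used.

```python
import numpy as np


def quantize(plane, n_bits, p_bits):
    """Reduce a plane from n_bits to p_bits by dropping the lower bits.

    plane  : non-negative integer plane at bit depth n_bits
    n_bits : input bit depth N
    p_bits : output bit depth P, with P < N
    """
    shift = n_bits - p_bits
    return (plane.astype(np.uint32) >> shift).astype(np.uint16)
```

For example, going from 10 bits to 8 bits maps the code values 0, 3, 4 and 1023 to 0, 0, 1 and 255, respectively.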
(46) In some embodiments, quantization (250) may be followed by optional color conversion (245) (e.g., YCbCr to RGB) to a color domain suitable for display or other post-processing to generate dithered signal (254). This step may also comprise up-sampling of the chroma components to match the resolution of the luma component.
(47) In some embodiments, as depicted in
Example Computer System Implementation
(48) Embodiments of the present invention may be implemented with a computer system, systems configured in electronic circuitry and components, an integrated circuit (IC) device such as a microcontroller, a field programmable gate array (FPGA), or another configurable or programmable logic device (PLD), a discrete time or digital signal processor (DSP), an application specific IC (ASIC), and/or apparatus that includes one or more of such systems, devices or components. The computer and/or IC may perform, control or execute instructions relating to image dithering, such as those described herein. The computer and/or IC may compute any of a variety of parameters or values that relate to image dithering as described herein. The image dithering embodiments may be implemented in hardware, software, firmware and various combinations thereof.
(49) Certain implementations of the invention comprise computer processors which execute software instructions which cause the processors to perform a method of the invention. For example, one or more processors in a display, an encoder, a set top box, a transcoder or the like may implement methods for image dithering as described above by executing software instructions in a program memory accessible to the processors. The invention may also be provided in the form of a program product. The program product may comprise any medium which carries a set of computer-readable signals comprising instructions which, when executed by a data processor, cause the data processor to execute a method of the invention. Program products according to the invention may be in any of a wide variety of forms. The program product may comprise, for example, physical media such as magnetic data storage media including floppy diskettes, hard disk drives, optical data storage media including CD ROMs, DVDs, electronic data storage media including ROMs, flash RAM, or the like. The computer-readable signals on the program product may optionally be compressed or encrypted.
(50) Where a component (e.g. a software module, processor, assembly, device, circuit, etc.) is referred to above, unless otherwise indicated, reference to that component (including a reference to a “means”) should be interpreted as including as equivalents of that component any component which performs the function of the described component (e.g., that is functionally equivalent), including components which are not structurally equivalent to the disclosed structure which performs the function in the illustrated example embodiments of the invention.
Equivalents, Extensions, Alternatives and Miscellaneous
(51) Example embodiments that relate to image dithering are thus described. In the foregoing specification, embodiments of the present invention have been described with reference to numerous specific details that may vary from implementation to implementation. Thus, the sole and exclusive indicator of what is the invention, and what is intended by the applicants to be the invention, is the set of claims that issue from this application, in the specific form in which such claims issue, including any subsequent correction. Any definitions expressly set forth herein for terms contained in such claims shall govern the meaning of such terms as used in the claims. Hence, no limitation, element, property, feature, advantage or attribute that is not expressly recited in a claim should limit the scope of such claim in any way. The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense.