Sampled image compression methods and image processing pipeline
11582467 · 2023-02-14
Assignee
Inventors
Cpc classification
G06T1/20
PHYSICS
H04N9/3182
ELECTRICITY
H04N19/1883
ELECTRICITY
H04N19/635
ELECTRICITY
International classification
H04N19/169
ELECTRICITY
G06T1/20
PHYSICS
H04N19/635
ELECTRICITY
H04N9/31
ELECTRICITY
Abstract
A method for processing image or video data is performed in an image processing pipeline. Color filtered mosaiced raw image or video data is received. A one-level wavelet transform is conducted on the color filtered mosaiced raw image or video data to provide LL, HH, LH and HL subbands. The LH and HL subbands are de-correlated by summing and difference operations to provide decorrelated sum and difference subbands. Additional n-level wavelet transformation is conducted on the sum and difference subbands and the LL and HH subbands to provide sparsified subbands for encoding. The LL, HH and sum subbands are recombined into standard color images, e.g., red, green, and blue color components, which are subsequently processed by color correction, white balance, and gamma correction. The sparsified subbands are encoded.
Claims
1. A method for processing image or video data performed in an image processing pipeline, the method comprising: receiving color filtered mosaiced raw image or video data; conducting a one-level wavelet transform of the color filtered mosaiced raw image or video data to provide LL, HH, LH and HL subbands; de-correlating the LH and HL subbands by summing and difference operations to provide decorrelated sum and difference subbands; conducting additional n-level wavelet transformation on the sum and difference subbands and the LL and HH subbands to provide sparsified subbands for encoding; and encoding the sparsified subbands.
2. The method of claim 1, wherein the color filtered mosaiced raw image or video data comprises CFA (color filter array) image data.
3. The method of claim 1, wherein the de-correlating comprises an orthogonal transformation.
4. The method of claim 1, further comprising quantizing the sum and difference subbands and the LL and HH subbands prior to conducting the additional n-level wavelet transformation.
5. The method of claim 1, wherein the de-correlating comprises replacing LH (w.sub.LH.sup.y) and HL (w.sub.HL.sup.y) subband coefficients by decorrelated sum v.sub.s.sup.y and difference v.sub.d.sup.y coefficients, according to the following:
6. The method of claim 1, further comprising an initial black level adjustment of the color filtered mosaiced raw image or video data.
7. The method of claim 1, further comprising, prior to n-level wavelet transformation: creating a low resolution canonical color space image from the LL, HH, and sum subbands; correcting the low-resolution image; and conducting a luma/chroma decomposition of the low-resolution image.
8. The method of claim 7, wherein the creating approximates a quarter resolution color image directly from coefficients of the LL, HH, and sum subbands.
9. The method of claim 1, wherein the correcting comprises color correction, white balance correction and gamma correction.
10. The method of claim 1, further comprising quantizing the sum and difference subbands and the LL and HH subbands prior to conducting the additional n-level wavelet transformation, and wherein the n-level wavelet transformation comprises a Daubechies 9/7 transform.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
DESCRIPTION OF THE PREFERRED EMBODIMENTS
(6) Preferred methods, pipelines and digital cameras of the invention provide an efficient lossless or lossy compression that can maximize the quality of color images deconstructed from the decompressed CFA (color filter array) images. Images, as used herein, include still images and video data, as the invention can be applied to frames of video data that are considered to be images. Preferred embodiments provide a lossless or lossy compression method for raw sensor data and leverage a de-correlated Mallat wavelet transform to generate sparse wavelet coefficients. An experimental system confirmed that methods of the invention improve coding efficiency compared to standard and state-of-the-art lossless CFA sampled image and video compression schemes. The wavelet coefficients of CFA sampled images are highly correlated. The present method makes the correlated wavelet transform sparser. In addition, the invention provides a camera processing pipeline that can maximize the quality of the color images constructed from the decompressed CFA sampled images and video streams.
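As a simplified illustration of the front of this pipeline, the sketch below applies a one-level 2-D wavelet transform to a synthetic mosaiced frame and then forms the sum and difference subbands. The Haar filter and the synthetic data are assumptions chosen for brevity; the invention is not limited to a particular wavelet or sensor.

```python
import numpy as np

def haar_dwt2(img):
    """One-level 2-D Haar wavelet transform (illustrative stand-in for
    the one-level Mallat transform described above)."""
    a = img.astype(np.float64)
    # Row transform: lowpass = pairwise mean, highpass = pairwise half-difference.
    lo = (a[:, 0::2] + a[:, 1::2]) / 2.0
    hi = (a[:, 0::2] - a[:, 1::2]) / 2.0
    # Column transform on each half yields the four subbands.
    LL = (lo[0::2, :] + lo[1::2, :]) / 2.0
    LH = (lo[0::2, :] - lo[1::2, :]) / 2.0
    HL = (hi[0::2, :] + hi[1::2, :]) / 2.0
    HH = (hi[0::2, :] - hi[1::2, :]) / 2.0
    return LL, LH, HL, HH

def decorrelate(LH, HL):
    """Sum/difference decorrelation of the correlated LH and HL subbands."""
    return (LH + HL) / 2.0, LH - HL   # v_s, v_d

rng = np.random.default_rng(0)
cfa = rng.integers(0, 1024, size=(64, 64))   # synthetic 10-bit mosaiced frame
LL, LH, HL, HH = haar_dwt2(cfa)
v_s, v_d = decorrelate(LH, HL)
```

The original LH and HL subbands are recoverable from v_s and v_d (LH = v_s + v_d/2, HL = v_s − v_d/2), so the decorrelation itself loses no information.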
(7) The present inventors determined, and rigorous analysis has confirmed, that each one-level wavelet subband of the CFA sampled image combines low-frequency chrominance and high-frequency luminance components. Lowpass components yield poor compression efficiency because their coefficients are not sparse. The inventors have also determined via analysis that the LH and HL subbands are highly correlated, which is leveraged for the first time in the present invention to enhance compression of raw image data.
(8) Those knowledgeable in the art will appreciate that embodiments of the present invention lend themselves well to practice in the form of computer program products. Accordingly, it will be appreciated that embodiments of the present invention may comprise computer program products comprising computer executable instructions stored on a non-transitory computer readable medium that, when executed, cause a computer to undertake methods according to the present invention, or a computer configured to carry out such methods. The executable instructions may comprise computer program language instructions that have been compiled into a machine-readable format. The non-transitory computer-readable medium may comprise, by way of example, a magnetic, optical, signal-based, and/or circuitry medium useful for storing data. The instructions may be downloaded entirely or in part from a networked computer. Also, it will be appreciated that the term “computer” as used herein is intended to broadly refer to any machine capable of reading and executing recorded instructions. It will also be understood that results of methods of the present invention may be displayed on one or more monitors or displays (e.g., as text, graphics, charts, code, etc.), printed on suitable media, stored in appropriate memory or storage, streamed after encoding, uploaded to the cloud, transmitted via wired or wireless connection, employed in the internet of things, implemented in hardware, integrated circuits, application specific integrated circuits, etc.
(9) Preferred embodiments of the invention will now be discussed with respect to the drawings and experiments used to demonstrate the invention. The drawings may include schematic representations, which will be understood by artisans in view of the general knowledge in the art and the description that follows.
(10) Lossless Compression
(11)
(12) Each one-level wavelet subband of the CFA sampled image combines low-frequency chrominance and high-frequency luminance components. Lowpass components yield poor compression efficiency because their coefficients are not sparse. The inventors note that, even if decomposed by subsequent wavelet transforms, the coefficients w.sub.LL*.sup.α, w.sub.L*L.sup.α, w.sub.LL*.sup.β and w.sub.L*L.sup.β would never achieve the compression rate of v.sub.s.sup.y and v.sub.d.sup.y because the latter have a finer scale wavelet transform. The * denotes the subbands of conjugate wavelet transform coefficients computed using conjugated wavelet filters as described in K. Hirakawa and P. J. Wolfe, “Rewiring filterbanks for local Fourier analysis: Theory and practice,” IEEE Trans. Inf. Theory, vol. 57, pp. 5360-5374, July 2011.
(13) In a preferred embodiment, the decorrelator 14 decorrelates w.sub.LH.sup.g and w.sub.HL.sup.g by orthogonal transformation using the bases [1 1].sup.T and [−1 1].sup.T. Considering (3), w.sub.LL*.sup.α in w.sub.LH.sup.y and w.sub.L*L.sup.α in w.sub.HL.sup.y are the transforms of the same chrominance image α using two different wavelet types. As such, they are highly correlated, as evidenced by w.sub.LL*.sup.α plotted against w.sub.L*L.sup.α, as shown in
(14) v.sub.s.sup.y(n)=[(w.sub.LH.sup.y(n)+w.sub.HL.sup.y(n))/2] and v.sub.d.sup.y(n)=w.sub.LH.sup.y(n)−w.sub.HL.sup.y(n)
(15) The [ . . . ] denotes a floor (rounding down) operation. The coefficients w.sub.LH.sup.y(n) and w.sub.HL.sup.y(n) can be perfectly reconstructed from v.sub.d.sup.y(n) and v.sub.s.sup.y(n) via the following relationships:
(16) w.sub.LH.sup.y(n)=v.sub.s.sup.y(n)+[(v.sub.d.sup.y(n)+1)/2] and w.sub.HL.sup.y(n)=w.sub.LH.sup.y(n)−v.sub.d.sup.y(n)
(17) The difference subband v.sub.d.sup.y(n) decorrelates w.sub.LL*.sup.α and w.sub.L*L.sup.α as well as w.sub.LL*.sup.β and w.sub.L*L.sup.β.
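The floored sum/difference pair and its inverse can be sketched in integer arithmetic. The exact mapping below is an assumption (the standard integer "S-transform"), chosen because it is consistent with the floor operation and the perfect-reconstruction property stated in paragraph (15).

```python
import numpy as np

def decorrelate_lossless(w_lh, w_hl):
    # Integer sum/difference decorrelation; // is floor division,
    # matching the floor operator [ . ] in the text.
    v_s = (w_lh + w_hl) // 2        # floored average of LH and HL
    v_d = w_lh - w_hl               # difference subband
    return v_s, v_d

def recorrelate_lossless(v_s, v_d):
    # Exact inverse: perfect reconstruction from the decorrelated pair,
    # valid for positive and negative integers alike.
    w_lh = v_s + (v_d + 1) // 2
    w_hl = w_lh - v_d
    return w_lh, w_hl

rng = np.random.default_rng(3)
w_lh = rng.integers(-512, 512, size=1000)
w_hl = rng.integers(-512, 512, size=1000)
v_s, v_d = decorrelate_lossless(w_lh, w_hl)
r_lh, r_hl = recorrelate_lossless(v_s, v_d)
```

Because both directions use only integer additions and floor divisions, the pair is exactly invertible, which is what makes it usable in a lossless codec.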
(18) The decorrelated coefficient v.sub.d.sup.y(n) consists of bandpass components w.sub.LL*.sup.α−w.sub.L*L.sup.α and w.sub.LL*.sup.β−w.sub.L*L.sup.β, which are approximately zero, and a highpass component w.sub.LH.sup.g−w.sub.HL.sup.g. For this reason, the N-level transform 16 can include a minimal number of additional levels of wavelet transforms. Conventional additional N′-level wavelet transforms can be conducted. A preferred wavelet transform is LeGall 5/3 for lossless and the 9/7 biorthogonal wavelet for lossy compression, where N>>N′. In a preferred embodiment, a LeGall 5/3 transform is used to sparsify v.sub.d.sup.y. The transformed v.sub.d.sup.y is then encoded by the luminance highpass encoding scheme because v.sub.d.sup.y is dominated by w.sub.HL.sup.g and w.sub.LH.sup.g. The coding efficiency is nonetheless comparable to that of the fine-level wavelet transform coefficients w.sub.HL.sup.g and w.sub.LH.sup.g. The decorrelation of the invention works with any off-the-shelf encoding method, and experiments demonstrated its effectiveness with both a JPEG2000 encoder and an HEVC encoder. After pixels are turned into wavelet transform coefficients (or, in the present invention, decorrelated wavelet transform coefficients), the wavelet coefficient values must be turned into “bits.” This stage is sometimes called the “encoder,” “variable length encoder,” or “entropy coder,” and can be JPEG2000. Advantageously, the decorrelation of the present invention is “encoder-agnostic,” and is independent of the entropy encoder that is used. Entropy encoders are most efficient when coding sparse signals. The decorrelated wavelet transform provided by the invention yields a sparse signal, such that the encoder requires fewer bits. Preferred embodiments provide a transform whose very sparse output benefits any entropy encoder.
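The benefit to any entropy coder can be illustrated numerically: when two subbands share a dominant component, their difference concentrates near zero and its empirical entropy (a lower bound on bits per symbol for a memoryless coder) drops. The data below are synthetic and the entropy values are not those reported in the patent; this only demonstrates the mechanism.

```python
import numpy as np

def empirical_entropy(x):
    """Shannon entropy in bits/symbol of an integer-valued array."""
    _, counts = np.unique(x, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

rng = np.random.default_rng(1)
# Two highly correlated "subbands": a shared component plus small
# independent perturbations, mimicking correlated LH/HL coefficients.
shared = rng.integers(-200, 200, size=100_000)
w_lh = shared + rng.integers(-5, 5, size=shared.size)
w_hl = shared + rng.integers(-5, 5, size=shared.size)

v_d = w_lh - w_hl                  # difference subband: concentrated near zero
h_before = empirical_entropy(w_lh)
h_after = empirical_entropy(v_d)
```

Any entropy coder downstream spends fewer bits on v_d than on either original subband, which is the sense in which the decorrelation is encoder-agnostic.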
(19) The sum subband v.sub.s.sup.y represents a combination of low pass components of α and β by the filter LL*−L*L, which is also a lowpass component. The wavelet coefficient v.sub.s.sup.y is thus dominated by the chrominance w.sub.LL.sup.α and w.sub.LL.sup.β and can be treated as a chrominance image. This can be further decomposed by applying an additional LeGall 5/3 transform when using a lossless compression, or 9/7 biorthogonal wavelet transform when using a lossy compression. The N-level wavelet transform of v.sub.s.sup.y is encoded (N>>N′) by the lossless encoding scheme of the chrominance component.
(20) The components w.sub.LL.sup.y and w.sub.HH.sup.y in (3) play the roles of lowpass luminance and chrominance, respectively. Hence, additional wavelet decompositions are applied. N-level Mallat wavelet packet transforms of w.sub.LL.sup.y and w.sub.HH.sup.y are encoded by luminance and chrominance encoders 18 and 20, respectively.
(21) In wavelet-based compression schemes such as JPEG2000, the coding efficiency increases as more coefficients are concentrated near zero. In order to distribute coefficients around zero, each color component of the CFA sampled image can be shifted by adjusting its offset before taking wavelet transform, as follows:
y′(n)=y(n)−k
(22) where k=[k.sub.r k.sub.g k.sub.b].sup.T. The shift k is stored as sideband information to be used to later decompress the image. In experiments, the black offset was computed from a calibration using a color checker. This has the effect of shifting v.sub.s.sup.y≅(w.sub.LL*.sup.α+w.sub.L*L.sup.α)/2 and w.sub.HH.sup.y(n)≅w.sub.L*L*.sup.β toward zero, which further increases coding efficiency.
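The per-color shift can be sketched as follows. The RGGB Bayer layout and the offset values are assumptions for illustration; in practice k comes from calibration and is carried as sideband information.

```python
import numpy as np

# Illustrative (uncalibrated) per-channel black offsets k = [k_r, k_g, k_b]^T.
k = {"r": 64, "g": 60, "b": 62}

def shift_black_offset(bayer, k):
    """Subtract each color's offset on an assumed RGGB Bayer mosaic; return
    the shifted frame plus k as sideband information for decompression."""
    out = bayer.astype(np.int32).copy()
    out[0::2, 0::2] -= k["r"]   # R sites
    out[0::2, 1::2] -= k["g"]   # G sites (even rows)
    out[1::2, 0::2] -= k["g"]   # G sites (odd rows)
    out[1::2, 1::2] -= k["b"]   # B sites
    return out, k

def unshift_black_offset(shifted, k):
    """Reverse the shift at decompression time using the sideband k."""
    out = shifted.copy()
    out[0::2, 0::2] += k["r"]
    out[0::2, 1::2] += k["g"]
    out[1::2, 0::2] += k["g"]
    out[1::2, 1::2] += k["b"]
    return out

rng = np.random.default_rng(4)
bayer = rng.integers(0, 1024, size=(8, 8))
shifted, side = shift_black_offset(bayer, k)
restored = unshift_black_offset(shifted, side)
```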
(23) Lossy Compression
(24)
(25) In the lossy compression, w.sub.LH.sup.y and w.sub.HL.sup.y are decorrelated by a non-integer transformation. A relationship for this transformation is:
(26) [v.sub.s.sup.y(n) v.sub.d.sup.y(n)].sup.T=M[w.sub.LH.sup.y(n) w.sub.HL.sup.y(n)].sup.T
where M∈ℝ.sup.2×2. Minimizing over M∈ℝ.sup.2×2, the error between the wavelet coefficients w and their quantized coefficients can be reduced while penalizing non-sparsity of the decorrelated coefficients v, as follows:
(28)
(29) Here, the first term is the fidelity term penalizing distortions caused by the quantization. The L1 norm in the second term is designed to promote sparse representation. By the law of large numbers (as denoted by the expectation operator), the above can be approximated by applying it to the quantization errors q.sub.s and q.sub.d:
(30)
(31) where the simplification by the Frobenius norm ∥.Math.∥.sub.F.sup.2 stems from the assumption that q.sub.s and q.sub.d are zero mean and independent. Increasing the value of λ promotes sparsity (and coding efficiency) at the sacrifice of the reconstruction error. In practice:
(32)
(33) where the transformation parameters a and b were stable while k decreased with increasing λ, where v.sub.s.sup.y≈ka(w.sub.LH.sup.α+w.sub.HL.sup.α) and v.sub.d.sup.y≈ka(w.sub.LH.sup.l+w.sub.HL.sup.l). This is reasonable because the horizontal w.sub.LH.sup.l and vertical w.sub.HL.sup.l coefficients behave similarly. The above minimization can also be performed numerically by gradient descent. Data plots verified that the transformation M decorrelates the v.sub.s.sup.y and v.sub.d.sup.y coefficients; the Pearson product-moment correlation coefficient decreased to 0.014, and the entropy of the decorrelated coefficient was reduced from 12.05 to 6.91.
(34) Optimization of Wavelet Transforms
(35) There are two main sources of distortion in lossy compression: round-off error and quantization error. The round-off error stems from finite precision operators used to carry out the forward and reverse wavelet transforms. The quantization error (commonly referred to as the “residual error”) is caused by reducing the number of bits used to represent wavelet coefficients, at the expense of accuracy. Specifically, a larger quantization step yields a higher compression ratio and a higher loss in quality.
(36) The interactions between the two sources of noise depend on the bitrate. Although the quantization errors dominate at the low bitrates, the round-off error limits the image quality at the higher bitrates. The inventors have determined that better quality would be achieved if the round-off error is reduced at the higher bitrates. By experimentation, we heuristically arrived at an alternative decomposition scheme that performs better at high bitrates. With regard to
(37) As the quantization step increases, the round-off error becomes insignificant relative to the quantization error. Hence, at the lower bitrates, we empirically found that Daubechies 9/7 would be more effective for the decorrelated wavelet coefficients.
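The step-size/quality trade-off described above can be checked with a uniform scalar quantizer on a simple coefficient model. The Laplacian model for wavelet coefficients is a common assumption, not something specified by this text.

```python
import numpy as np

def quantize(w, step):
    """Uniform scalar quantization of wavelet coefficients: larger steps
    mean fewer representation levels (fewer bits) but larger error."""
    return np.round(w / step) * step

rng = np.random.default_rng(2)
w = rng.laplace(scale=10.0, size=50_000)   # sparse-ish coefficient model

err_fine   = np.mean((w - quantize(w, 2.0)) ** 2)    # small step, low distortion
err_coarse = np.mean((w - quantize(w, 16.0)) ** 2)   # large step, high distortion
```

The coarse quantizer produces far fewer distinct levels (hence a higher compression ratio for any entropy coder) at the cost of a much larger mean-squared error, matching the trade-off stated in paragraph (35).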
(38) Camera Processing Pipeline-Aware Lossy Compression
(39) Lossy compression yields an approximation of the desired image with fewer bits. The error introduced by lossy compression is not uniform to the eye. This is because the uncompressed raw sensor data is further processed by the camera processing pipeline (which comprises black offset, color correction, white balance, and gamma correction, among others) before the image can be observed by the eye. When the error introduced by a uniform quantization propagates through the camera processing pipeline, the resulting error is no longer uniform.
(40) The color construction 32 operation relies upon the fact that the color components x(n) are reconstructable from l, α, and β by the relation:
(41)
(42) If w.sub.LL.sup.y≈w.sub.LL.sup.l, w.sub.HH.sup.y≈w.sub.LL.sup.β, and v.sub.s.sup.y≈w.sub.LL.sup.α (or 2kaw.sub.LL*.sup.α) are taken as the “quarter resolution” versions of l, α and β, then the following relation permits reconstruction:
(43)
(44) (v.sub.s.sup.y is replaced by v.sub.s.sup.y/2 ka if lossy). In other words, the reconstruction 32 can approximately recover a quarter resolution color image w.sub.LL.sup.x(n) directly from the decorrelated one-level wavelet transform coefficients.
(45) The corrections used then match those used in the camera processing pipeline. First, the black offset is subtracted. A demosaicking step estimates the color image from the CFA sampled image. In color correction 36, the tristimulus values of the recovered image corresponding to the spectral transmittance of the color filters are converted to a canonical color space by multiplying by a color transformation matrix. The “white balance” 38 rescales the color to make it (nearly) invariant to the illumination color. Lastly, a compander known as gamma correction 40 enhances the low-intensity pixels while compressing the high-intensity pixels by a non-linear mapping.
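The color-correction / white-balance / gamma chain can be sketched as below. The matrix, gains, and gamma exponent are illustrative placeholders; a real pipeline uses calibrated, camera-specific values.

```python
import numpy as np

# Illustrative (uncalibrated) pipeline parameters.
CCM = np.array([[ 1.6, -0.4, -0.2],    # color transformation matrix mapping
                [-0.3,  1.5, -0.2],    # sensor tristimulus values to a
                [ 0.0, -0.5,  1.5]])   # canonical color space
WB_GAINS = np.array([2.0, 1.0, 1.6])   # per-channel white-balance gains
GAMMA = 1.0 / 2.2                      # compander exponent

def process(rgb):
    """Color correction -> white balance -> gamma, on an Nx3 array in [0, 1]."""
    out = rgb @ CCM.T               # convert to canonical color space
    out = out * WB_GAINS            # rescale toward illumination invariance
    out = np.clip(out, 0.0, 1.0)    # keep values displayable
    return out ** GAMMA             # boost low intensities, compress high ones
```

Because each stage is monotone per channel, the chain preserves intensity ordering while redistributing quantization error non-uniformly, which is exactly why a pipeline-aware compression scheme matters.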
(46)
(47) We note that the approximation used by the low-resolution reconstruction essentially treats v.sub.s.sup.y (sum of w.sub.LL.sup.α and (w.sub.LH.sup.l+w.sub.HL.sup.l)/2) like a lowpass component w.sub.LL.sup.α—justified in part by the fact that w.sub.LL.sup.α dominates. There is effectively no image quality penalty associated with this approximation because the highpass components are more aggressively quantized in a typical compression scheme than the lowpass components. In other words, the highpass components included in v.sub.s.sup.y are encoded with fewer quantization distortions than a typical compression scheme.
(48) When recovering the quarter resolution color image, a few coefficients can take on negative values (which would not be there if this were the genuine color image). Thresholding them to zero would introduce additional distortion, which is unattractive. Instead, the pipeline takes the absolute value of w.sub.LL.sup.x, encoding the sign bits separately. The binary image of sign bits is encoded by the standard encoder, which added about 0.004 bits per pixel on average in testing.
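The magnitude/sign separation described in paragraph (48) can be sketched as follows; the function names are hypothetical, and the entropy coding of the two streams is left to whatever standard encoder the pipeline uses.

```python
import numpy as np

def split_sign(coeffs):
    """Encode magnitudes and signs separately: the magnitudes go through the
    standard coefficient encoder, the sign bits through a binary encoder."""
    signs = (coeffs < 0).astype(np.uint8)   # one bit per coefficient
    return np.abs(coeffs), signs

def merge_sign(magnitudes, signs):
    """Decoder side: re-apply the sign bits to the decoded magnitudes."""
    return np.where(signs == 1, -magnitudes, magnitudes)

coeffs = np.array([-3, 0, 5, -7, 2])        # toy coefficients with negatives
mags, signs = split_sign(coeffs)
restored = merge_sign(mags, signs)
```

Unlike thresholding negatives to zero, this round-trips exactly, so the only cost is the small sign-bit stream (about 0.004 bits per pixel in the reported tests).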
(49) While specific embodiments of the present invention have been shown and described, it should be understood that other modifications, substitutions and alternatives are apparent to one of ordinary skill in the art. Such modifications, substitutions and alternatives can be made without departing from the spirit and scope of the invention, which should be determined from the appended claims.
(50) Various features of the invention are set forth in the appended claims.