SHAPE-ADAPTIVE MODEL-BASED CODEC FOR LOSSY AND LOSSLESS COMPRESSION OF IMAGES
20170251214 · 2017-08-31
Assignee
Inventors
Cpc classification
H04N19/159
ELECTRICITY
H04N19/13
ELECTRICITY
International classification
H04N19/159
ELECTRICITY
H04N19/14
ELECTRICITY
Abstract
The present invention relate to methods and codecs for image and video compression. Embodiments of the present invention include a novel shape-adaptive model-based codec (SAM) that supports binary shapes as well as matte and soft segmentation image compression by decomposing input shapes into deterministic and stochastic components for flexible lossy and lossless coding. The present invention can provide inter/intra prediction and flexibly adapts between lossy and lossless modes with various parameters for compression quality control. The compression module can also be adapted with numerous other compression techniques.
Claims
1. A method for compression of images or shape information, comprising the steps of: separating binary shape images into layers and objects; subtracting holes from a representation of the objects; identifying deterministic and stochastic components, and block sizes of the object based on shapes of the objects; encoding the deterministic components of each object based on parametric models; encoding the stochastic components of each object; and combining the encoded components to provide a compressed encoded output.
2. The method of claim 1, wherein the representation of the objects is
Object.sub.m=S.sub.m−Σ.sub.n=1.sup.NmH.sub.m,n, where S.sub.m is the shape of the m-th object Object.sub.m, m=1, 2, . . . , M, defined by the outer boundary of the original shape, which is equivalent to the original shape having undergone a morphological filling process, and H.sub.m,n=1, 2, . . . , N.sub.m are holes within the m-th object, which are the regions filled under the morphological filling process previously identified and are arranged in a descending order of their size.
3. The method of claim 1, wherein the step of identifying the deterministic and stochastic component comprises the steps of: obtaining a boundary of the solid shapes; storing vertices on the boundary in a point list; generating line segments from the vertices of the point list; extracting the line segments from the object; fitting a parametric model to the line segments; calculating an approximation error between the parametric model and the line segments of the object; and comparing the approximation error with a threshold, wherein line segments below the threshold are taken as deterministic components and line segments above the threshold are taken as stochastic components.
4. The method of claim 3, wherein the step of obtaining the boundary is achieved by removing all pixels of the object having 4-connected neighbors that are non-zero, and leaving 8-connected boundary pixels.
5. The method of claim 3, wherein the step of extracting the line segments comprises: employing a sliding window for each pixel on the shape boundary to include its neighboring pixels so that a degree of smoothness of the line segment formed by the pixels and its neighbor can be measured; and clustering the degree of smoothness obtained from each pixel into two groups such that adjacent pixels in a same group can be connected together to form a line segment.
6. The method of claim 5, wherein the step of measuring the degree of smoothness is achieved by comparing a least-squares fitting error.
7. The method of claim 3, wherein the parametric model is one or more of a polygon, a piecewise polynomial, and a B-spline.
8. The method of claim 1, wherein the parametric model is one or more of a polygon, a piecewise polynomial, and a B-spline.
9. The method of claim 1, wherein the coding of the stochastic components is achieved with block based methods or chain code.
10. The method of claim 1, wherein the coding of the stochastic components is achieved with a block-based method using a rotated micro-processing unit with variable size, and an orientation of the rotated micro-processing unit is optimized using entropy coding or a content-based arithmetic coding algorithm for achieving a high compression ratio for a block-based method.
11. The method of claim 1, further comprising a step of intra/inter prediction of parameters of adjacent components to further increase the compression ratio, whereby components are coded as a displacement from a previous vertice.
12. The method of claim 1, wherein a lossless and a lossy compression mode can be selected independently for the deterministic components and the stochastic components.
13. The method of claim 1, wherein the compressed encoded output has a data structure that includes: overhead that includes a header file, starting and end points of a micro-processing unit, including component information and mean opacity; and if a deterministic components is chosen, parameters of the parametric model.
14. A shape-adaptive model-base entropy coding or content-based arithmetic coding algorithm codec for lossy and lossless compression of binary images, the codec including a method comprising: separating binary shape images into layers and objects; subtracting holes from representations of the objects; identifying deterministic and stochastic components, and block sizes of each object based on shape of the objects; encoding sequentially the deterministic components of each object based on parametric models; encoding sequentially the stochastic components of each object; and combining the encoded components to provide a compressed encoded output.
15. The shape-adaptive model-based codec of claim 14, wherein the step of identifying the deterministic and stochastic components is performed by: obtaining a boundary of solid shapes; storing in a point list all vertices on the shape boundary; generating line segments from the vertices in a point list; extracting the line segments from the object; fitting a parametric model to the line segments; calculating an approximation error between the parametric model and the line segments of the shape boundary; and comparing the approximation error with a threshold, wherein those line segments below the threshold are deterministic components and those above the threshold are stochastic components.
16. The shape-adaptive model-based codec of claim 15, wherein the extracting the line segments includes: employing a sliding window for each pixel on the shape boundary contour to include its neighboring pixels so that a degree of smoothness of the line segment formed by the pixels and its neighbor can be measured; assigning a fitting error to the pixels as a degree of smoothness; and clustering the degree of smoothness obtained from each pixel into two groups such that adjacent pixels in the same group can be connected together to form a line segment.
17. The shape-adaptive model-based codec of claim 14, wherein the stochastic components are coded with a block-based method using a rotated micro-processing unit with variable size, and an orientation of the rotated micro-processing unit is optimized using entropy coding or content-based arithmetic coding for achieving a high compression ratio for a block-based method.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0016] The foregoing and other objects and advantages of the present invention will become more apparent when considered in connection with the following detailed description and appended drawings in which like designations denote like elements in the various views, and wherein:
[0017]
[0018]
[0019]
[0020]
[0021]
[0022]
[0023]
[0024]
DETAILED DISCLOSURE OF THE INVENTION
[0025] An image is a representation of visual perceptions, which may be two-dimensional (2D), such as a photograph or screen display. It may also be three-dimensional (3D), such as stereo images. In general, 3D images can also be represented as multiple 2D images, or a 2D image with a deformation/depth map. Hence, the shape processing of the present invention can be considered for 2D images for simplicity. However, in general, these processing techniques can also be extended to the processing of 3D or stereo images.
[0026] To understand the techniques of the present invention it is helpful to understand inpainting.
[0027]
[0028]
Object.sub.m=S.sub.m−Σ.sub.n=1.sup.NmH.sub.n,m, (1)
where S.sub.m is the shape of the m-th object Object.sub.m, m=1, 2, . . . , M, defined by the outer boundary of the original shape, which is equivalent to the original shape having undergone a morphological filling process H.sub.m,n, n=1, 2, . . . , N.sub.m are holes within the m-th object, which are the regions filled under the morphological filling process previously mentioned, and are arranged in a descending order according to their size. [0031] 2. Identification of deterministic and stochastic components: After obtaining the shapes and holes, as shown in
[0033] Identification of Deterministic and Stochastic Components
[0034] In a method according to the present invention, the first step can be to obtain the boundary of the solid shapes. For example, morphological filtering can be performed on the shape by removing all pixels where their 4-connected neighbors are non-zero, thus leaving 8-connected boundary pixels. All of the vertices on the boundary can be traversed and stored in a point list. Line segments can be generated from the set of point lists. Afterwards, the following two steps can be performed to obtain the deterministic and stochastic components: [0035] 1. Extraction of the line segments from the object: Line segments can be obtained using a polygon fitting procedure, e.g., iteratively, including the vertices until the line fitting error exceeds a certain tolerance. However, preferably, this procedure should be directly applied to this problem as the stochastic components may have a large fitting error, which may exceed the specified tolerance. To overcome this limitation, a new method according to the present invention can be used to segment the shape contour as follows: [0036] a. For each pixel on the shape contour, a sliding window can be employed to include its neighboring pixels so that the degree of smoothness of the line segment formed by the pixels and its neighbor can be measured, e.g., by comparing their least-squares fitting error. Then, this fitting error can be assigned to the pixel as a degree of smoothness. [0037] b. The degree of smoothness obtained from each pixel can then be clustered into two groups. Adjacent pixels in the same group can be connected together to form a line segment. [0038] 2. Identification of deterministic and stochastic component: After obtaining the line segments, a parametric model, such as a polygon, piecewise polynomial, B-spline, etc., can be used to fit the line segments. Since the deterministic component is generally smooth and can be better represented by the parametric model, the resulting approximation error will generally be much smaller than that of the stochastic component. By comparing the approximation error with a certain threshold, the line segments can be separated into deterministic and stochastic components.
Γ.sub.ADAPTIVE(m),μΓ.sub.ADAPTIVE(m)+(1−μ)k.sub.ξσ.sub.m, and Γ.sub.ADAPTIVE(1)=k.sub.ξσ.sub.1, (3)
σ.sub.m=median(|ELR.sub.m,k−median(ELR.sub.m,1,ELR.sub.m,2, . . . ELR.sub.m,K.sub.
where σ.sub.m is a robust scale estimate of ELR.sub.k,m of the m-th object, μ is a forgetting factor, and k.sub.ξ is a threshold quartile parameter corresponding to the upper (1−P{X>ξ}) percentile of the Gaussian distribution. Hence, the probability that the parameter P{X>ξ}=(2/π)∫.sub.ξ.sup.∞e.sup.−x.sup.
If ELR.sub.m,k−median(ELR.sub.m,1,ELR.sub.m,2, . . . ELR.sub.m,K)>min(Γ.sub.ADAPTIVE(m),Γ.sub.USER),
the k-th line segment is stochastic. (5)
where the threshold min(Γ.sub.ADAPTIVE(m), Γ.sub.USER) provides the flexibility to switch from the adaptive threshold or the user specified threshold Γ.sub.USER.
[0043] Coding of the Deterministic Components
[0044] After identifying the deterministic component, the shape of the deterministic component can be further represented using a parametric model, say a polygon, piecewise polynomial, B-spline, etc. More specifically, the coordinates of the pixels on the k-th line segment,
j=1, 2, . . . , J.sub.k can be modeled as
where ƒ.sub.k(a.sub.k,m, j) is a function that describes the relationship between the vertex number j and the coordinate of the j-th vertex. In general, this concept can be generalized into a vertex in a higher dimension, such as a 3D coordinate. a.sub.k is a vector containing the parameters of the function for the k-th line segment. For example, if a spline model is used, a.sub.k may obtain the control knots of the spline curve and the chosen order. e.sub.j.k is the approximation error and it can be used to identify the deterministic and stochastic components. The k-th deterministic component can be represented by the parameter a.sub.k.
[0045] Intra/inter prediction of the parameters of adjacent components can be performed to further increase the compression ratio. For example, if the parameters are integers, such as positions of vertices, they can be coded sequentially and their position can thus be represented as the position of their previous vertices plus a displacement. Generally, if the magnitude of the displacement is much smaller than that of the vertices' coordinates (ranging from 0 to the size of the image), fewer bits will be required to code the displacement and this will reduce storage in practice. Intra prediction is similar to inter prediction but differs in the way the displacement is retrieved. The displacement, in this case, is defined by the difference between the coordinate of the current vertex in the P frame and the nearest vertex in the I frame. In practice, the selection between inter mode and intra mode will be determined by the value of displacement. Inter vertices prediction is similar to that of intra prediction except that in inter prediction the reference vertices are not in sequential order but are calculated as the nearest corresponding vertices in the reference frame using either iterative closest point (ICP) or free-form deformation (FFD). Given the corresponding reference vertices, the predicted vertices can be represented as the reference vertices plus displacement, similar to that in intra vertices prediction.
[0046] Whereas for intra/inter prediction of real-valued parameters of adjacent components, e.g. coefficients of a parametric model, the real-valued parameters of the current components are regarded as reference parameters to predict the coefficients of the subsequent components. Afterwards, the prediction residual is encoded and stored rather than the original coefficients. A high compression ratio can be achieved if the range of the prediction residual is much smaller than that of the original coefficients. More precisely, one may first scale and quantize the real-valued parameters into fixed-point integers. Afterwards, the prediction residual, i.e., the difference between the parameters of the current component and that of the subsequent component, are computed and stored. For example, a variable scale and differential category coding approach can be employed for intra-prediction of the categories of the floating point parameters, which explores the redundancies among the order of the parameters and is able to further improve the compression ratio of the intra-prediction. Inter prediction of parametric model coefficient can be achieved similarly.
[0047] The parametric representation, also known as the deterministic component, gives more flexibility in lossy compression of a shape. The shape can be easily down-sampled and up-sampled without generating a staircase effect or blocking artifacts. Moreover, the parametric representation of a boundary is continuous. Hence, it not only allows an arbitrary number of interpolated boundary points within the curve, but it is also effective in shape registration for smooth shape transition. As a result, it can be employed in applications such as video streaming and live broadcasting on wireless networks and mobile phone networks, which have much lower bandwidth than that of a wired network and hence a high compression ratio of video content is required to achieve low latency in streaming. Moreover, image and video processing techniques like super-resolution or morphing can also be directly achieved using parametric representation of shapes.
[0048] Coding of the Stochastic Component
[0049] In a SAM codec of the present invention, the stochastic component can be coded using, e.g., block based methods or chain code. Similar to that of the deterministic component, depending of the chosen coding method, e.g., content-based arithmetic coding, the user can specify a block size to achieve whatever coding performance is desired. Thus, the program should further break each stochastic component to fit in a block size B.sub.S before coding starts.
[0050] If the stochastic component is coded using a block-based method (
[0051] The coding of stochastic components allows the user to choose between lossless or lossy compression modes. In lossless compression is required, it can be coded using arithmetic coding or other methods. Alternatively, the shape can be decimated into a lower resolution shape image and coded before up conversion. This process is lossy but generally leads to a higher compression ratio.
[0052] For inter/intra prediction of stochastic components, predictions similar to vertices prediction for the deterministic component can be performed to estimate the position of the start point and end point of the component region. The component itself, either coded by chain code or block based methods, can apply its inter prediction methods. In particular, to compress the temporal and spatial redundancy, the position of the component region can be predicted by the previously coded value as an intra/inter prediction. Codecs according to the present invention can use any conventional coding method to compress the stochastic and deterministic component, such as a chain code or block based method. For example, if CAE is applied to the stochastic component compression, a block matching method for inter prediction can be employed.
[0053] Mode Selection
[0054] In a SAM codec of the present invention, a shape can be separate into two major components: stochastic components and deterministic components. Generally, for each component in the shape boundary, the user is free to choose between lossless and lossy compression modes. For example, a choice can be made to code the deterministic component in lossy mode and code the stochastic component in lossless mode. Thus, a smoothed boundary would be obtained on long curves with fluctuation, while still maintaining a complex boundary, also known as the stochastic component. On the other hand, if no loss of shape information is allowed, the user can choose the lossless mode of the codec. In this case, the codec will regard all line segments as stochastic components for simplification.
[0055] Data Structure
[0056]
[0057] Experimental Results
[0058] Table 1 shows experimental results of lossless compression between a SAM codec of the present invention and the CAE method, which is a state-of-the-art lossless algorithm for compression. To make a fair comparison, only the lossless compression mode of the SAM codec was compared with the CAE method, since it is a lossless algorithm. A distinctive feature that codecs of the present invention can have is flexibility in choosing between lossless and lossy compression modes.
[0059] In the experiment, a linear model was used for deterministic component modeling, thus there will not be any deterministic component after identification, but stochastic components separated by boundary segmentation. The stochastic components were further separated into smaller blocks and CAE was used for stochastic component coding. The block size, BS, was set to 10 pixels in the experiment. The experimental comparison used the CAE method in MPEG4. It can be seen from Table I that the lossless compression of the present invention has performance that is generally better than the CAE method.
[0060] While the invention has been particularly shown and described with reference to preferred embodiments thereof, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention. Additionally, many modifications may be made to adapt a particular situation to the teachings of claimed subject matter without departing from the central concept described herein. Therefore, it is intended that claimed subject matter not be limited to the particular examples disclosed, but that such claimed subject matter may also include all implementations falling within the scope of the appended claims, and equivalents thereof.
REFERENCES
[0061] The following documents are incorporated by reference to the extent that they are not inconsistent with the teachings disclosed herein. [0062] Ostermann, J. “Core experiments on MPEG-4 video shape coding,” International Standards Organization, ISO/IEC/JTCI/SC29/WG11 N 1584 (1997). [0063] Rabbani et al., “Digital Image Compression Techniques,” SPIE, Int. Soc. Opt. Eng., (1991). [0064] Brandy et al., “Context-based arithmetic encode of 2D shape sequences,” Special Session on Shape Coding, ICIP 97 (1997).