Method and apparatus for encoding or decoding blocks of pixel
11259033 · 2022-02-22
Assignee
Inventors
Cpc classification
H04N19/463
ELECTRICITY
International classification
H04N19/463
ELECTRICITY
Abstract
A method and a device for processing a current pixel block of an image using a palette prediction mode according to HEVC Format Range Extension. The mode uses a current palette to build a predictor block of indexes to predict the current pixel block. The current palette comprises entries associating entry indexes with pixel values. The method comprises predicting the current palette from entries of two or more palettes that are palettes previously used to process blocks of pixels. A palette predictor may be used that is built from the two or more palettes, preferably from the last used palette for a coding unit in the current coding entity and from the previously used palette for which a flag bitmap indicates whether or not its elements have been copied into the last used palette. Accordingly, the coding of the palette mode is improved.
Claims
1. A method for processing a current block of pixels of an image using a palette coding mode, the palette coding mode using a current palette that comprises a set of entries associating respective entry indexes with corresponding pixel values, the method comprising a step of predicting the current palette using a palette predictor built from entries of two or more palettes, wherein: a first palette of the two or more palettes is used in processing a block of pixels immediately preceding the current block of the image, and the building of the palette predictor comprises: taking all of the entries from the first palette as entries of the palette predictor; determining if the size of the palette predictor is smaller than the maximum predictor size limit; and determining if one or more entries of a second palette are used in predicting the first palette using corresponding one or more flags, wherein each flag is represented by a bit and indicates whether or not a corresponding entry of the second palette of the two or more palettes is to be used in predicting the first palette; if the size of the palette predictor is smaller than the maximum predictor size limit, and if the corresponding one or more flags indicate the one or more entries are not used in predicting the first palette, taking said one or more entries as entries of the palette predictor; the entries from the second palette coming after the entries from the first palette in the built palette.
2. The method of claim 1, wherein the second palette is a palette predictor built from entries of two or more palettes used in processing the block of pixels immediately preceding the current block of the image.
3. The method of claim 1, wherein the flags form a bitmap which includes at least one element at a predefined position in the bitmap for signalling whether or not the bitmap includes, after the predefined position, at least one additional flag that indicates whether or not a corresponding entry of the second palette is in the first palette, whereby it defines selection of that entry of the second palette to generate the first palette.
4. The method of claim 1, wherein the current palette is predicted from the palette predictor using flags, each flag indicating whether or not a corresponding entry in the palette predictor is in the current palette.
5. A device for processing a current block of pixels of an image using a palette coding mode, the palette coding mode using a current palette that comprises a set of entries associating respective entry indexes with corresponding pixel values, the device comprising: prediction means for predicting the current palette using a palette predictor built from entries of two or more palettes, wherein: a first palette of the two or more palettes is used in processing a block of pixels immediately preceding the current block of the image; and the prediction means is arranged to build the palette predictor by: taking all of the entries from the first palette as entries of the palette predictor; determining if the size of the palette predictor is smaller than the maximum predictor size limit, and determining if one or more entries of a second palette are used in predicting the first palette using corresponding one or more flags, wherein each flag is represented by a bit and indicating whether or not a corresponding entry of the second palette of the two or more palettes is to be used in predicting the first palette, if the size of the palette predictor is smaller than the maximum predictor size limit, and if the corresponding one or more flags indicate the one or more entries are not used in predicting the first palette, taking said one or more entries as entries of the palette predictor; the entries from the second palette coming after the entries from the first palette in the built palette.
6. A non-transitory computer readable carrier medium comprising processor executable code for a programmable apparatus, the computer readable carrier medium comprising a sequence of instructions for implementing a method for processing a current block of pixels of an image using a palette coding mode, the palette coding mode using a current palette that comprises a set of entries associating respective entry indexes with corresponding pixel values, the method comprising a step of predicting the current palette using a palette predictor built from entries of two or more palettes, wherein: a first palette of the two or more palettes is used in processing a block of pixels immediately preceding the current block of the image, and the building of the palette predictor comprises: taking all of the entries from the first palette as entries of the palette predictor; determining if the size of the palette predictor is smaller than the maximum predictor size limit; determining if one or more entries of a second palette are used in predicting the first palette using corresponding one or more flags, wherein each flag is represented by a bit and indicates whether or not a corresponding entry of the second palette of the two or more palettes is to be used in predicting the first palette; if the size of the palette predictor is smaller than the maximum predictor size limit, and if the corresponding one or more flags indicate the one or more entries are not used in predicting the first palette, taking said one or more entries as entries of the palette predictor; the entries from the second palette coming after the entries from the first palette in the built palette.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
(1) Embodiments of the invention will now be described, by way of example only, and with reference to the drawings.
DETAILED DESCRIPTION OF EMBODIMENTS OF THE INVENTION
(30)
(31) An INTRA Coding Unit is generally predicted from the encoded pixels at its causal border by a process called INTRA prediction.
(32) Temporal prediction in an INTER coding mode first consists in finding, in a previous or future frame called the reference frame 116, the reference area that is closest to the Coding Unit, in a motion estimation step 104. This reference area constitutes the predictor block. Next, this Coding Unit is predicted using the predictor block to compute the residue in a motion compensation step 105.
(33) In both cases, spatial and temporal prediction, a residual is computed by subtracting the predictor block from the original Coding Unit.
(34) In the INTRA prediction, a prediction direction is encoded. In the temporal prediction, at least one motion vector is encoded. However, in order to further reduce the bitrate cost related to motion vector encoding, a motion vector is not directly encoded. Indeed, assuming that motion is homogeneous, it is particularly advantageous to encode a motion vector as a difference between this motion vector, and a motion vector in its surroundings. In the H.264/AVC coding standard for instance, motion vectors are encoded with respect to a median vector computed between 3 blocks located above and on the left of the current block. Only a difference, also called residual motion vector, computed between the median vector and the current block motion vector is encoded in the bitstream. This is processed in module “Mv prediction and coding” 117. The value of each encoded vector is stored in the motion vector field 118. The neighboring motion vectors, used for the prediction, are extracted from the motion vector field 118.
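The motion vector prediction described above can be illustrated with a minimal Python sketch. It assumes that the median vector is taken component by component over the three neighbouring vectors; the function names `predict_mv`, `mv_residual` and `reconstruct_mv` are hypothetical and are not taken from any standard or reference software.

```python
from statistics import median


def predict_mv(neighbours):
    """Component-wise median of the neighbouring motion vectors (assumption)."""
    xs = [mv[0] for mv in neighbours]
    ys = [mv[1] for mv in neighbours]
    return (median(xs), median(ys))


def mv_residual(current_mv, neighbours):
    """Encoder side: only the difference to the median vector is coded."""
    px, py = predict_mv(neighbours)
    return (current_mv[0] - px, current_mv[1] - py)


def reconstruct_mv(residual, neighbours):
    """Decoder side: add the residual back to the same median predictor."""
    px, py = predict_mv(neighbours)
    return (residual[0] + px, residual[1] + py)


# Three neighbouring vectors taken from the motion vector field.
neighbours = [(4, -2), (6, 0), (5, 7)]
residual = mv_residual((7, 1), neighbours)
```

When motion is homogeneous, the residual components stay close to zero and are therefore cheaper to code than the vector itself.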
(35) Next, the mode optimizing the rate distortion performance is selected in module 106. In order to further reduce the redundancies, a transform, typically a DCT, is applied to the residual block in module 107, and a quantization is applied to the coefficients in module 108. The quantized block of coefficients is then entropy coded in module 109 and the result is inserted into the bitstream 110.
(36) The encoder then performs a decoding of the encoded frame for the future motion estimation in modules 111 to 116. This is a decoding loop at the encoder. These steps allow the encoder and the decoder to have the same reference frames. To reconstruct the coded frame, the residual is inverse quantized in module 111 and inverse transformed in module 112 in order to provide the “reconstructed” residual in the pixel domain. According to the encoding mode (INTER or INTRA), this residual is added to the INTER predictor 114 or to the INTRA predictor 113.
(37) Next, this first reconstruction is filtered in module 115 by one or several kinds of post filtering. These post filters are integrated into the decoding loop. This means that they need to be applied to the reconstructed frame at the encoder and decoder in order to use the same reference frames at the encoder and decoder. The aim of this post filtering is to remove compression artifacts.
(38) The principle of an HEVC decoder has been represented in
(39)
(40) At a high-level, an image is divided into Coding Units that are encoded in raster scan order. Thus, when coding block 3.1, all the blocks of area 3.3 have already been encoded, and can be considered available to the encoder. Similarly, when decoding block 3.1 at the decoder, all the blocks of area 3.3 have already been decoded and thus reconstructed, and can be considered as available at the decoder. Area 3.3 is called the causal area of the Coding Unit 3.1. Once Coding Unit 3.1 is encoded, it will belong to the causal area for the next Coding Unit. This next Coding Unit, as well as all the next ones, belongs to area 3.4 illustrated as a dotted area, and cannot be used for coding the current Coding Unit 3.1. It is worth noting that the causal area is constituted by reconstructed blocks. The information used to encode a given Coding Unit is not the original blocks of the image for the reason that this information is not available at decoding. The only information available at decoding is the reconstructed version of the blocks of pixels in the causal area, namely the decoded version of these blocks. For this reason, at encoding, previously encoded blocks of the causal area are decoded to provide this reconstructed version of these blocks.
(41) It is possible to use information from a block 3.2 in the causal area when encoding a block 3.1. In the HEVC Range Extension draft specifications, a displacement vector 3.5, which can be transmitted in the bitstream, may indicate this block 3.2.
(42)
(43) Each Coding Tree Block contains one or more square Coding Units (CU). The Coding Tree Block is split based on a quad-tree structure into several Coding Units. The processing (coding or decoding) order of each Coding Unit in the Coding Tree Block follows the quad-tree structure based on a raster scan order.
(44) In HEVC, several methods are used to code the different syntax elements, for example block residuals and information on predictor blocks (motion vectors, INTRA prediction directions, etc.). HEVC uses several types of entropy coding, such as Context-based Adaptive Binary Arithmetic Coding (CABAC), Golomb-Rice codes, and a simple binary representation called Fixed Length Coding. Most of the time, a binary encoding process is performed to represent the different syntax elements. This binary encoding process is very specific and depends on the syntax element concerned.
(45) For example, the syntax element called “coeff_abs_level_remaining” contains the absolute value or a part of an absolute value of the coefficient residual. The idea of this binary encoding process is to use Golomb-Rice code for the first values and Exponential Golomb for the higher values. More specifically, depending on a given parameter called Golomb Order, this means that for representing the first values, for example values from 0 to 3, a Golomb-Rice code is used, then for higher values, for example values from 4 and above, an Exponential Golomb code is used. The Golomb Order is a parameter used by both the Golomb-Rice code and the exponential Golomb code.
(46)
(47) The prefix value is set equal to 1 at step 602, then 1 bit is extracted from the bitstream 601 and the variable flag is set equal to the decoded value at step 603. If this flag is equal to 0 at step 604, the Prefix value is incremented at step 605 and another bit is extracted from the bitstream at step 603. When the flag value is equal to 1, the decision module 606 checks whether the value Prefix is strictly less than 3. If this is true, N=Order bits are extracted at step 608 from the bitstream 601 and assigned to the variable “codeword”. This corresponds to the Golomb-Rice representation. The Symbol value 612 is set equal to ((prefix<<Order)+codeword), as represented in step 609, where ‘<<’ is the left shift operator.
(48) If the Prefix is greater than or equal to 3 at step 606, the next step is 610, where N=(prefix−3+Order) bits are extracted from the bitstream and assigned to the variable “codeword”. The symbol value 611 is set equal to (((1<<(prefix−3))+2)<<Order)+codeword. This corresponds to the exponential Golomb representation.
(49) In the following, this decoding process, and in a symmetric way the corresponding encoding process, is called Golomb_H with an input parameter corresponding to the Golomb Order. It can be noted in a simple way Golomb_H(Order).
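As an illustration only, the decoding steps described above can be transcribed into a short Python sketch. The function name `golomb_h_decode`, the use of an iterator of 0/1 integers as a stand-in for the bitstream, and the most-significant-bit-first reading of the codeword are assumptions made for this sketch. The initial prefix value and the flag polarity follow the description as written.

```python
def golomb_h_decode(bits, order):
    """Decode one symbol by following the described steps literally.

    bits  : iterator yielding 0/1 integers (hypothetical bitstream stand-in)
    order : the Golomb Order
    """
    def read_bits(n):
        # Assumption: codeword bits are read most significant bit first.
        value = 0
        for _ in range(n):
            value = (value << 1) | next(bits)
        return value

    prefix = 1                       # step 602
    while next(bits) == 0:           # steps 603 and 604
        prefix += 1                  # step 605
    if prefix < 3:                   # step 606
        codeword = read_bits(order)  # step 608
        return (prefix << order) + codeword  # step 609, Golomb-Rice part
    codeword = read_bits(prefix - 3 + order)  # step 610
    # Exponential Golomb part.
    return (((1 << (prefix - 3)) + 2) << order) + codeword


short_symbol = golomb_h_decode(iter([1, 0, 1]), order=2)
long_symbol = golomb_h_decode(iter([0, 0, 0, 1, 1, 0, 1]), order=2)
```

In the second call the prefix reaches 4, so three codeword bits are read and the exponential Golomb branch is taken.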
(50) In HEVC, for some syntax elements such as residuals, the Golomb Order is updated in order to adapt the entropy coding to the signal to be encoded. The updating formula tries to reduce the Golomb code size by increasing the Golomb Order when the coefficients have large values. In the HEVC standard, the update is given by the following formula:
Order=Min(cLastRiceOrder+(cLastAbsLevel>(3*(1<<cLastRiceOrder))?1:0),4)
(51) where cLastRiceOrder is the last used Order, cLastAbsLevel is the last decoded coeff_abs_level_remaining, and the expression (C?A:B) has the value A if the condition C is true and B if the condition C is false. Note that for the first parameter to be encoded or decoded, cLastRiceOrder and cLastAbsLevel are set equal to 0. Moreover, the parameter Order cannot exceed the value of 4 in this formula.
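The update formula above can be written as a small Python function, given here as a sketch. The function and argument names are hypothetical renderings of the variables in the formula.

```python
def update_order_hevc(c_last_rice_order, c_last_abs_level):
    """Order = Min(cLastRiceOrder + (cLastAbsLevel > 3*(1<<cLastRiceOrder) ? 1 : 0), 4)."""
    increment = 1 if c_last_abs_level > 3 * (1 << c_last_rice_order) else 0
    return min(c_last_rice_order + increment, 4)


# The Order grows by at most one per update and saturates at 4.
orders = [update_order_hevc(0, 3),
          update_order_hevc(0, 4),
          update_order_hevc(2, 13),
          update_order_hevc(4, 1000)]
```

The threshold 3*(1<<Order) doubles each time the Order increases, so only increasingly large coefficients push the Order further up.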
(52) The HEVC Range Extension, also commonly called HEVC RExt, is an extension that is currently being drafted for the new video coding standard HEVC.
(53) An aim of this extension is to provide additional tools to code video sequences with additional colour formats and bit-depths, and possibly losslessly. In particular, this extension is designed to support the 4:2:2 colour format as well as the 4:4:4 video format in addition to the 4:2:0 video format (see
(54) Regarding the bit-depth, which is the number of bits used to code each colour component of a pixel, the current HEVC standard is able to deal with the 4:2:0 colour format with 8-bit and 10-bit bit-depths (i.e. 256 to 1,024 possible values per colour component), whereas HEVC RExt is being designed to additionally support the 4:2:2 and 4:4:4 video formats with an extended bit-depth ranging from 8 bits up to 16 bits (i.e. up to 65,536 possible values per colour component). This is particularly useful to obtain a larger dynamic range for the colour components.
(55) HEVC RExt is also designed to provide lossless encoding of the input sequences, that is, to have a decoded output 209 strictly identical to the input 101. To achieve this, a number of tools have been modified or added compared to the conventional HEVC lossy codec. A non-exhaustive list of exemplary modifications or additions to implement lossless coding is provided below: removal of the quantization step 108 (203 at the decoder); forced activation of the bypass transform, as normal cosine/sine transforms 107 may introduce errors (204 at the decoder); removal of tools specifically tailored for compensating quantization noise, such as post filtering 115 (207 at the decoder).
(56) For HEVC RExt, the updating formula of the Golomb Order has been further modified in order to deal with higher bit-depths and to take into account the very high quality required by applications dealing with video compression of extended formats (4:2:2 and 4:4:4), including lossless coding. For HEVC RExt, the updating formula has been changed as follows:
Order=Min(cLastRiceOrder+(cLastAbsLevel>>(2+cLastRiceOrder)),7)
(57) With this formula, the maximum value of Order is 7. Moreover, for the first coding of the coeff_abs_level_remaining for a sub-block of Transform block, the Golomb order is set equal to:
Order=Max(0,cRiceOrder−(transform_skip_flag||cu_transquant_bypass_flag?1:2))
where the variable “transform_skip_flag” is set to 1 if the transform (e.g. DCT 107 or 204) is skipped for the current coding unit and 0 if the transform is used, the variable “cu_transquant_bypass_flag” is set to 1 if the coding unit is losslessly encoded and 0 otherwise, and the variable “cRiceOrder” is set equal to the last Order used for another sub-block of the transform block, if any, and is set to 0 otherwise.
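The two HEVC RExt formulas above can be sketched in Python as follows. The function names are hypothetical, and the flags are assumed to be passed as 0/1 integers. The logical OR of the two flags is evaluated before the conditional, as in C operator precedence.

```python
def update_order_rext(c_last_rice_order, c_last_abs_level):
    """Order = Min(cLastRiceOrder + (cLastAbsLevel >> (2 + cLastRiceOrder)), 7)."""
    return min(c_last_rice_order + (c_last_abs_level >> (2 + c_last_rice_order)), 7)


def initial_order_rext(c_rice_order, transform_skip_flag, cu_transquant_bypass_flag):
    """Order used for the first coeff_abs_level_remaining of a sub-block."""
    decrement = 1 if (transform_skip_flag or cu_transquant_bypass_flag) else 2
    return max(0, c_rice_order - decrement)


# Unlike the HEVC formula, one large coefficient can raise the Order by
# several units at once, up to the cap of 7.
jump = update_order_rext(0, 8)
capped = update_order_rext(1, 1000)
```

The initial Order is reduced less (by 1 instead of 2) when the transform is skipped or bypassed, which is consistent with residuals being larger in those cases.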
(58) Additional tools for HEVC RExt are currently being designed to efficiently encode “screen content” video sequences in addition to natural sequences. “Screen content” video sequences are video sequences with a very specific content, such as those captured from a personal computer or any other device, containing for example text, PowerPoint presentations, Graphical User Interfaces, or tables (e.g. screen shots). These video sequences have quite different statistics compared to natural video sequences. The performance of conventional video coding tools, including HEVC, sometimes proves to be underwhelming when processing such “screen content”.
(59) The tools currently under discussion in HEVC RExt to process “screen content” video sequences include the Intra Block Copy mode and the Palette mode. Prototypes for these modes have shown good coding efficiency compared to the conventional methods targeting natural video sequences. The present application focuses on the Palette coding mode.
(60) The Palette mode of HEVC RExt is a prediction mode. It means that the Palette method is used to build a predictor for the coding of a given coding unit similarly to a prediction performed by motion prediction (Inter case) or by an Intra prediction. After the generation of the prediction, a residual coding unit is transformed, quantized and coded. In other words, the same processes as described above with reference to
(61) A palette is generally represented by a table containing a finite set of N-tuples of colours, each colour being defined by its components in a given colour space (see for example 803 in
(62) At the encoder side, the Palette mode, under consideration in RExt, consists in transforming pixel values of a given input coding unit into indexes called levels identifying the entries in an associated palette. After the transformation, the resulting coding unit or block is composed of levels and is then transmitted to the decoder with the associated palette, generally a table having a finite number of triplets of colours used to represent the coding unit. Since the palette defines a finite number of colours, the transformation into a block of indexes usually approximates the original input coding unit.
(63) To apply the Palette mode at the encoder side, an exemplary way to transform a coding unit of pixels is performed as follows: find the P triplets best describing the coding unit of pixels to encode, for example by minimizing overall distortion; then associate with each pixel of the coding unit the closest colour among the P triplets: the value to encode (or level) (which thus forms part of the block of indexes) is then the index corresponding to the entry of the associated closest colour. The predictor block of indexes is thus obtained from the palette by comparing the entries of the palette to each pixel of the coding unit, in order to identify, for each pixel, the entry which defines the closest colour.
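The second part of this transformation, associating each pixel with the closest palette entry, can be sketched in Python as follows. The squared Euclidean distance used as the closeness measure and the function names are assumptions made for this sketch. The search for the P triplets themselves is not shown.

```python
def closest_level(pixel, palette):
    """Return the index (level) of the palette entry closest to the pixel."""
    def sq_dist(a, b):
        # Assumption: closeness is measured by squared Euclidean distance.
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(palette)), key=lambda i: sq_dist(pixel, palette[i]))


def build_block_of_levels(pixels, palette):
    """Map every pixel of the coding unit (in scan order) to a level."""
    return [closest_level(p, palette) for p in pixels]


palette = [(0, 0, 0), (255, 255, 255), (255, 0, 0)]
pixels = [(10, 5, 0), (250, 250, 250), (200, 30, 20)]
levels = build_block_of_levels(pixels, palette)
```

None of the three pixels matches a palette entry exactly, which illustrates why the block of indexes usually only approximates the original coding unit.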
(64) For each coding unit, the palette (i.e. the P triplets found), the block of indexes or levels and the residual representing the difference between the original coding unit and the block of indexes in the colour space (which is the block predictor) are coded in the bitstream 110 and sent to the decoder.
(65) At the decoder, the Palette mode consists in performing the conversion in the reverse way. This means that each decoded index associated with each pixel of the coding unit is replaced by the corresponding colour in the palette decoded from the bitstream, in order to reconstruct the corresponding colour for each pixel of the coding unit. This is the reconstruction of the block of indexes in the colour space (i.e. of the coding unit predictor). Since the Palette mode is a prediction mode, the associated residual is decoded from the bitstream and then added to the reconstructed coding unit predictor to build the final reconstructed coding unit.
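A minimal Python sketch of this decoder-side reconstruction is given below. It assumes that the levels, the palette and the residual have already been decoded from the bitstream and are available as plain lists; the function name is hypothetical.

```python
def reconstruct_coding_unit(levels, palette, residual):
    """Rebuild the pixels: palette lookup gives the predictor, then the residual is added."""
    predictor = [palette[level] for level in levels]
    return [tuple(p + r for p, r in zip(pred, res))
            for pred, res in zip(predictor, residual)]


palette = [(0, 0, 0), (255, 0, 0)]
levels = [0, 1, 1]
residual = [(10, 5, 0), (-5, 3, 2), (0, 0, 0)]
pixels = reconstruct_coding_unit(levels, palette, residual)
```

Here the third pixel has a zero residual, so its reconstructed value is exactly the palette colour.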
(66)
(67) Next, during step 706, two elements are built from the decoded data: the palette 707 and the block of levels 708. From this block of levels and the associated palette, the coding unit predictor in pixel domain 710 is built 709. This means that for each level of the block of levels, a color (RGB or YUV) is associated with each pixel.
(68) Then the coding unit residual is decoded 711 from the bitstream 701. In the current implementation of Palette mode, the residual associated with a Palette mode is coded using the common HEVC Inter residual coding method, i.e. using Golomb coding. To obtain the residual of the coding unit, the conventional inverse quantization and inverse transformation are performed. The block predictor 710 is added 713 to this coding unit residual 712 in order to form the reconstructed coding unit 714.
(69)
(70) As mentioned in relation to
(71) The block of levels 91 is exactly the same as the one illustrated in
(72) The block of levels is encoded by group of successive pixels in scan order. Each group is encoded using a first syntax element giving a prediction direction, a second element giving the repetition, and an optional third element giving the value of the pixel, namely the level. The repetition corresponds to the number of pixels in the group.
(73) These two tables represent the current syntax associated with the Palette mode. These syntax elements correspond to the encoded information associated in the bitstream for the block of levels 91. In these tables, three main syntax elements are used to fully represent the operations of the Palette mode and are used as follows when successively considering the levels of the block of levels 91.
(74) A first syntax element, called “Pred mode”, makes it possible to distinguish between two encoding modes. In a first mode, corresponding to a “Pred mode” flag equal to “0”, a new level is used for the current pixel. The level is signalled immediately after this flag in the bitstream. In a second mode, corresponding to a “Pred mode” flag equal to “1”, a “copy up” mode is used. More specifically, this means that the level of the current pixel is the level of the pixel located at the same position in the line immediately above, for a raster scan order. In that case of a “Pred mode” flag equal to “1”, there is no need to signal a level immediately after the flag because the value of the level is known by reference to the value of the level of the pixel just above in the block of levels 91.
(75) A second syntax element called “Level” indicates the level value of the palette for the current pixel only in the first mode of “Pred mode”.
(76) A third syntax element, called “Run”, is used to encode a repetition value in both modes of “Pred mode”. Considering that the block of levels 91 is scanned from the top left corner to the bottom right corner, row by row from left to right and top to bottom, the Run syntax element gives the number of successive pixels in block 91 having the same encoding.
(77) This “Run” syntax element has a different meaning which depends on the “pred mode” flag. When Pred mode is 0, “Run” element is the number of successive pixels of the predictor block having the same level value. For example, if Run=8 this means that the current “Level” is applied to the current pixel and to the following 8 pixels which corresponds to 9 identical successive samples in raster scan order.
(78) When Pred mode is 1, “Run” element is the number of successive pixels of the predictor block having a level value corresponding to the level value of their above pixel in block 91, i.e. where the “copy up” mode is applied. For example, if Run=31 this means that the level of the current pixel is copied from the pixel of the line above as well as the following 31 pixels which corresponds to 32 pixels in total.
(79) Tables 92 and 93 represent the eight steps to represent the block 91 by using the Palette mode. Each step starts with the coding of the “Pred mode” flag which is followed by the “Level” syntax element when “Pred mode” flag equals “0”, or by the “Run” syntax element when “Pred mode” flag equals “1”. The “Level” syntax element is always followed by a “Run” syntax element.
(80) When the prediction mode decoded for the current block is the palette mode, the decoder first decodes the syntax relating to this block and then applies the reconstruction process for the coding unit.
(81)
(82) Next the process corresponding to the palette values decoding starts. A variable i corresponding to the index of the palette is set equal to 0 at step 1004. Next, a test is performed at step 1005 to check whether or not i is equal to the palette size (Palette_size). If it is different from the palette size at step 1005, one palette element is extracted from the bitstream 1001 and decoded at step 1006, and is then added to the palette with the associated level/index equal to i. Then the variable i is incremented through step 1007. If i is equal to the palette size at step 1005, the palette has been completely decoded.
(83) Next the process corresponding to the decoding of the block of levels 91 is performed. First, the variable j, corresponding to a pixel counter, is set to 0 as well as the variable syntax_i 1008. Then a check is performed to know whether the pixel counter corresponds to the number of pixels contained in the block. If the answer is yes at step 1009 the process ends at step 1017, otherwise the value of the flag “Pred mode” corresponding to one prediction mode is extracted from the bitstream 1001 and decoded 1010.
(84) The value of “Pred mode” is added to a table, at the index syntax_i, containing all the “Pred mode” values decoded. If the value of this “Pred mode” is equal to 0 at step 1011, the syntax element corresponding to “Level” is extracted from the bitstream 1001 and decoded at step 1012. This variable “Level” is added to a table, at the index syntax_i, containing all the levels decoded. The variable j corresponding to the pixel counter is incremented by one at step 1013.
(85) Next the “Run” syntax element is decoded at step 1014. If the syntax element “Pred Mode” is equal to 1, step 1011, the “Run” value is also decoded at step 1014. This syntax element “Run” is added to a table at the index syntax_i containing all the runs decoded.
(86) Next at step 1015, the value j is incremented by the value of the run decoded at step 1014. The variable syntax_i is incremented by one to consider the next set of syntax elements. If the counter j is equal to the number of pixels in the block then the syntax to build the block of levels 91 is finished 1017. At the end of this process related to the Palette, the decoder knows the palette, and the tables containing the list of all the “Pred mode”, “Level” and “Run” syntax elements associated with the Palette mode of this coding unit. The decoder can then proceed with the reconstruction process of the coding unit as described through
(87) In a slight variant of this embodiment of
(88) In an embodiment that can be combined with either the above embodiment of
(89) The process of
(90) In a variant of the Palette mode prediction process as described in
(91) However, there may still be pixels of the Coding Unit that are improperly described by levels of the palette, meaning that no corresponding relevant levels have been found in the palette. These pixels are referred to as “escape” pixels, since no corresponding value is set in the block of levels.
(92) The syntax elements built during the process of
(93) An example of signalling the pixels is to add an “escape” flag before the “Pred mode” element (i.e. before step 1010) indicating whether a pixel is palette-coded (therefore subject to step 1010) or escape-coded (therefore with an explicit pixel value). The “escape” flag is followed by the explicit pixel value (no “Pred mode”, “Level” and “Run” elements are provided for this pixel).
(94) In a variant to the “escape” flag, a specific Level value (dedicated to “escape” pixels and obtained at step 1012) may be used to signal an “escape” pixel. In this case, the “Run” element should be the explicit pixel value. This specific value may only occur when the palette being built reaches its maximum size, thereby saving the cost of signalling escape values for each palette size.
(95) In any embodiment, the explicit pixel values may be coded predictively (e.g. as a difference to a neighbour pixel value) or not, and may be quantized or not, with possible consequences for the entropy coding (contextual and number of bits, etc.).
(96) Referring back to the palette, each palette element, constituted by three values in the above examples, is generally encoded using three binary codes. The length of the binary codes corresponds to the bit-depth of each colour component. The palette size is typically encoded using unary code. The “Pred mode” element is encoded using one bit (as well as the “escape” flag if any). The “Level” element is encoded using binary code with binary code length equal to b, where b is the smallest integer such that 2^b is equal to or above the palette size. And the “Run” element is encoded using Golomb_H(Order=3) as explained above in relation to
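The code length b of the “Level” element can be computed as in the following Python sketch. It reads the length rule as b being the smallest integer for which 2 to the power b is equal to or above the palette size; the function names are hypothetical.

```python
def level_code_length(palette_size):
    """Smallest b such that 2**b >= palette_size."""
    return max(palette_size - 1, 0).bit_length()


def encode_level(level, palette_size):
    """Fixed-length binary code of a level, as a string of '0'/'1' characters."""
    b = level_code_length(palette_size)
    return format(level, "b").zfill(b) if b else ""


lengths = [level_code_length(n) for n in (1, 2, 3, 4, 5, 8, 9)]
code = encode_level(1, palette_size=5)
```

With a single-entry palette the length is 0, so no bits are needed for the “Level” element in this sketch.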
(97)
(98) An additional item of input data to the “Pred mode”, “Level” and “Run” elements is the size of the coding unit 801 (which is the same as the size of the block of levels 802/91) known from the quadtree (
(99) In a first step 1101, a variable i, representing a pixel counter, is set equal to 0 and a variable j, to successively consider each set of syntax elements, is also set equal to 0. At step 1104, the element Pred_mode[j] extracted from the table of “Pred mode” at index j is checked against 0.
(100) If it is equal to 0, a new level is encoded for the current pixel i. As a consequence, the value of the pixel at position i is set equal to the level at the index j from the table of levels; Block[i]=Level[j]. This is step 1105. The variable i is incremented by one at step 1106 to consider the next pixel, and the variable k, dedicated to count the pixels already processed in the current Run, is set equal to 0 at step 1107.
(101) A check is performed at step 1108 to determine whether or not k is equal to the “Run” element of the table of runs at the index j: k=Run[j]?. If not equal, the level of the pixel at position i is set equal to the level value of the pixel at position i−1: Block[i]=Block[i−1]. This is step 1109. The variable i and the variable k are then incremented by one at respectively steps 1110 and 1111. If k=Run[j] at step 1108, the propagation of the left level value is finished and step 1120 is performed (described below).
(102) If Pred_mode[j] is different from 0 at step 1104, the “copy up” mode starts with the variable k set equal to 0 at step 1112. Next, step 1113 checks whether or not (k−1) is equal to the “Run” element of the table of runs at the index j: k=Run[j]+1? If not equal, the level value of the pixel at position i is set equal to the level value of the pixel at position i of the above line: Block[i]=Block[i-width], where “width” is the width of the block of levels (the same as the coding unit) as deduced from the input size of the coding unit. This is step 1114. Next, the variable i and the variable k are incremented by one at respectively steps 1115 and 1116. If k=Run[j]+1 at step 1113, the prediction mode ‘copy up’ is completed and the process goes on at step 1120.
(103) At step 1120, a check is performed to determine whether or not the variable i is equal to the amount of pixels in the block 91/CU 801. If not equal, the variable j is incremented by one at step 1121 to consider the next set of syntax elements and the process loops back to step 1104 described above.
(104) If all the pixels have been processed at step 1120, the final block of levels 91 is obtained at step 1122: this corresponds to table Block[ ]. Then a final step 1123 consists in converting each level into colour values using the palette 803 decoded using the process of
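By way of illustration only, the reconstruction of the block of levels (steps 1101 to 1123) may be sketched as follows. The function names, the use of Python lists for the tables Pred_mode[ ], Level[ ] and Run[ ], and the convention that Level[j] holds a placeholder for “copy up” sets are assumptions made for this sketch, not part of the described syntax.

```python
def decode_block_of_levels(pred_mode, level, run, width, height):
    """Rebuild Block[] from the tables of syntax elements (hypothetical helper)."""
    n = width * height
    block = [0] * n
    i = 0  # pixel counter (step 1101)
    j = 0  # index of the current set of syntax elements
    while i < n:  # step 1120
        if pred_mode[j] == 0:
            # "left value" mode: one new level, then Run[j] copies of it
            block[i] = level[j]          # step 1105
            i += 1                       # step 1106
            for _ in range(run[j]):      # steps 1108 to 1111
                block[i] = block[i - 1]
                i += 1
        else:
            # "copy up" mode: Run[j] + 1 levels copied from the line above
            for _ in range(run[j] + 1):  # steps 1113 to 1116
                if i < width:
                    raise ValueError("copy up is not possible on the first line")
                block[i] = block[i - width]
                i += 1
        j += 1  # step 1121
    return block


def levels_to_colours(block, palette):
    """Step 1123: map each level to its palette entry."""
    return [palette[lvl] for lvl in block]
```

For a 4x2 block, the sets (Pred mode, Level, Run) = (0, 1, 1), (0, 2, 1), (1, unused, 3) yield the levels 1, 1, 2, 2 on the first line and the same levels copied on the second line.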
(105) Other aspects of the palette mode as introduced in HEVC RExt concern the determination by the encoder of the palette to be used to encode the current coding unit (see
(106)
(107) At a first step 1201, a variable i representing a pixel counter is set to 0, a variable “Palette_size” to follow the growth of the palette as it is being built is also set to 0, and a variable “TH” representing a threshold is set to 9. Next, at step 1203, the pixel p.sub.i, i.e. the pixel having the index i according to a scanning order, is read from the original coding unit 1204. Then the variable j is set equal to 0 at step 1205, and at step 1206 a check is performed to determine whether or not the palette size is equal to the variable “j” (meaning that all the palette elements of the palette under construction have been considered).
(108) If the palette size is equal to j, the palette at the index “j” is set equal to the pixel value p.sub.i at step 1209. This means that the current pixel p.sub.i becomes a new element in the palette, with index j associated with it. More precisely the following assignment is performed:
PAL.sub.Y[j]=(Yi)
PAL.sub.U[j]=(Ui)
PAL.sub.V[j]=(Vi)
where PAL.sub.Y,U,V are three tables to store the colour values.
(109) The palette size (Palette_size) is incremented by one at step 1210 and an occurrence table Counter is set equal to 1 for the index ‘Palette size’ at step 1211. Then the variable i is incremented by one at step 1213 to consider the next pixel “i” of the current coding unit. A check is then performed at step 1214 to determine whether or not all the pixels of the current coding unit have been processed. If they have all been processed, the process is completed by an ordering step 1215 explained later on, otherwise the next pixel is considered at step 1203 described above.
(110) Back to step 1206, if j is different from palette_size, step 1207 is performed where the absolute difference for each colour component between p.sub.i and the palette element at the index j is computed. The formulas are shown in the Figure. If all the absolute differences are strictly less than the predefined threshold TH, the occurrence counter regarding the element “j” in the palette is incremented by one at step 1212. Step 1207 creates a class for each element of the palette under construction, such a class encompassing colours neighbouring the colour of the element, given the margin TH. Thus step 1212 counts the occurrences of each class. Step 1212 is followed by step 1213 already described.
(111) If the condition of step 1207 is not met, the variable j is incremented by one at step 1208 to consider the next palette element in the palette. This is to compare the other palette colour elements to the current pixel through new occurrence of step 1207. If no element in the palette meets the criterion of step 1207, a new element is added to the palette as described above with reference to steps 1209, 1210 and 1211.
(112) One may note that the decision module 1207 can compare each colour component for 4:4:4 (YUV or RGB) sequences and can compare only the Luma colour component for 4:2:0 sequences.
(113) At the end of the process of
(114) One may also note that the size of the palette can be limited to a maximum size, for example 24 entries. In such a case, if the size of the palette resulting from step 1215 exceeds 24, the palette is reduced by removing the elements (entries) from the 25.sup.th position onwards in the ordered palette. This results in a palette being built.
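As a non-limiting sketch, the palette construction of steps 1201 to 1215 together with the size limitation may be written as below. The function name and argument names are hypothetical. The ordering of step 1215 is assumed here to place the most frequent classes first, which is an assumption of this sketch.

```python
def build_palette(pixels, th=9, max_size=24):
    """Build a palette from an iterable of (Y, U, V) tuples (hypothetical helper)."""
    palette = []  # PAL_Y, PAL_U, PAL_V kept together as tuples
    counter = []  # occurrence table, one count per palette element
    for p in pixels:                      # steps 1203, 1213, 1214
        for j, entry in enumerate(palette):
            # step 1207: every component differs by strictly less than TH
            if all(abs(a - b) < th for a, b in zip(p, entry)):
                counter[j] += 1           # step 1212
                break
        else:
            # steps 1209 to 1211: no class matched, the pixel becomes a new element
            palette.append(tuple(p))
            counter.append(1)
    # step 1215 (assumed here to be a descending sort by occurrence; ties keep insertion order)
    order = sorted(range(len(palette)), key=lambda j: -counter[j])
    # optional limitation of the palette size
    return [palette[j] for j in order][:max_size]
```

With the threshold 9, the pixels (10,10,10), (100,100,100), (101,99,100), (100,100,100), (12,11,9) give two classes, and the class of (100,100,100), seen three times, is placed first.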
(115) Turning now to the selection of the Pred mode, Level and Run syntax elements at the encoder, input data of the process of
(116) At a first step 1301, the variable “i” representing a pixel counter is set to 0. The process described below seeks to determine the syntax elements for the pixels starting from i. The two modes of prediction are evaluated independently: “Pred mode”=0 on the right hand part of the Figure, and “Pred mode”=1 on the left hand part of the Figure.
(117) For the ‘copy up’ prediction (corresponding to “Pred mode”=1), the variable “i.sub.copy” used to count the number of levels in the current Run is set equal to 0 at step 1303. Then at step 1304 the current level at pixel location i: Block[i+i.sub.copy], is compared to the level of the pixel located just above in the above line: Block[i+i.sub.copy−width], where “width” corresponds to the width of the current coding unit. Note that the level Block[i+i.sub.copy] of each pixel of the coding unit is determined in parallel at step 1308. This step consists in associating with the pixel at the position i, the closest palette element (in practice its index or level) as already explained above. This step uses the position i, the palette 1306 and the original coding unit 1307.
(118) If Block[i+i.sub.copy]=Block[i+i.sub.copy−width] at step 1304, the variable “i.sub.copy” is incremented by one at step 1305 to consider the next pixel value of the block of pixels and to indicate that the current pixel level at position i+i.sub.copy can be included in the current “copy up” Run. If Block[i+i.sub.copy] is different from Block[i+i.sub.copy−width] at step 1304 meaning that the current evaluation of a “copy up” Run has ended, the variable “i.sub.copy” is transmitted to the decision module 1314. At this stage of the process, the variable “i.sub.copy” corresponds to the number of values copied from the line just above.
(119) For the left value prediction (corresponding to “Pred mode”=0), the loop to determine the Run value (i.sub.left) is processed in parallel or sequentially. First the variable “i.sub.Start” used to store the index i of the current pixel is set to “i”, and the variable “j” used to consider successively the pixel levels following index “i” is also set equal to “i” and the variable “i.sub.left” used to count the current Run under construction is set equal to 0. This is step 1309. Next, step 1310 consists in determining whether or not j !=0 and “Pred_mode[j−1]”=0 and Block[j]=Block[j−1]. Pred_mode[ ] is a table used by the encoder to store the prediction mode (either 1 or 0 for respectively the “copy up” prediction and the left value prediction). It is filled up progressively at step 1317 described below as the successive pixels are processed, and has been initialized with zero values for example at step 1301: Pred_mode[k]=0 for any k.
(120) If the condition at step 1310 is met, the variable “i.sub.left” is incremented by one at step 1311 to indicate that the current pixel level at position j can be included in the current “left value” Run, and the variable j is incremented by one at step 1312 to consider the next pixel value of the block of pixels.
(121) If the condition at step 1310 is not met, the variable “j” is compared to “i.sub.start” to determine if it is the first pixel value to be examined for the current “left value” Run. This is step 1313. If “j” is equal to or less than “i.sub.start”, meaning that it is the first pixel value to be examined for the current Run, then it starts the current Run and the next pixel value is considered at step 1312 described above. If “j” is strictly higher than “i.sub.start”, meaning that a first pixel value different from the pixel value of the current “left value” Run has been detected, the variable “i.sub.left” which corresponds to the length of the current “left value” Run is transmitted to the decision module 1314. Note that, as in the loop for “copy up” prediction, the level Block[i] at the index i is determined in the same loop at step 1308.
(122) After having computed the maximum run for the ‘left value prediction’ and the ‘copy up’ mode, the variables “i.sub.left” and “i.sub.copy” are compared at step 1314. This is to determine whether or not “i.sub.copy”!=0 and “i.sub.copy”+2 is higher than “i.sub.left”. This is an exemplary criterion to select either the copy up mode or the left value prediction mode. In particular, the parameter “2” may be slightly changed.
(123) The condition at step 1314 means that if “i.sub.copy” is equal to 0 or is smaller than or equal to i.sub.left−2, the “left value prediction” mode is selected at step 1315. In that case, a “PredMode” variable is set equal to 0 and a Run variable is set equal to “i.sub.left” at same step 1315. On the other hand, if “i.sub.copy” is different from 0 and is strictly higher than “i.sub.left−2”, the “copy-up” mode is selected at step 1316. In that case, the “PredMode” variable is set equal to 1 and the Run variable to i.sub.copy−1 at step 1316.
(124) Next the tables containing the “Pred_mode” and the “Run” at the encoder are updated with the current values of “PredMode” and “Run”, at step 1317. Then, the next position to consider in the block of pixels is computed at step 1318, which corresponds to the current position i incremented by the “run” value+1. Then a check is performed at step 1319 to determine whether the last pixels of the coding unit have been processed. If it is the case, the process ends at step 1320, otherwise the two prediction modes “left prediction” and “copy up” are evaluated again starting at steps 1303 and 1309 for the next pixel position to obtain a new set of syntax elements.
(125) At the end of this process, the encoder knows the levels for each sample of the coding unit, and is able to encode the corresponding syntax of the block of levels based on the content of the three tables Pred_mode[ ], Block[ ] and Run[ ].
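The selection of the syntax elements may be illustrated by the simplified sketch below. It keeps the exemplary criterion of step 1314 and the Run conventions of steps 1315 and 1316, but measures the “left value” run directly as the number of following levels equal to the current one. The test on Pred_mode[j−1] of step 1310 is deliberately not reproduced, so this is an approximation under that stated assumption, and all names are hypothetical.

```python
def select_syntax_elements(block, width, margin=2):
    """Return (pred_mode, level, run) tables for a block of levels (simplified sketch)."""
    n = len(block)
    pred_mode, level, run = [], [], []
    i = 0
    while i < n:                                   # step 1319
        # "copy up" evaluation (steps 1303 to 1305)
        i_copy = 0
        while (i + i_copy < n and i + i_copy >= width
               and block[i + i_copy] == block[i + i_copy - width]):
            i_copy += 1
        # "left value" evaluation, simplified: count levels repeating Block[i]
        i_left = 0
        while i + 1 + i_left < n and block[i + 1 + i_left] == block[i]:
            i_left += 1
        # exemplary criterion of step 1314
        if i_copy != 0 and i_copy + margin > i_left:
            pred_mode.append(1)                    # step 1316
            level.append(None)                     # no level is needed for copy up
            run.append(i_copy - 1)
        else:
            pred_mode.append(0)                    # step 1315
            level.append(block[i])
            run.append(i_left)
        i += run[-1] + 1                           # step 1318
    return pred_mode, level, run
```

For the 4x2 block of levels 1, 1, 2, 2 / 1, 1, 2, 2, this sketch produces two “left value” sets with Run equal to 1 followed by one “copy up” set with Run equal to 3.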
(126) As described above, the Palette mode as currently designed in HEVC RExt requires a palette to be transmitted for each coding unit. This represents a large amount of data in the bitstream, and thus a coding cost. The inventors have envisaged an improvement of the Palette mode to increase its coding efficiency.
(127) According to one aspect of the invention, the current palette for a current coding unit is predicted using a palette predictor. In a variant regarding the invention, the current palette is predicted from entries of two or more palettes, the two or more palettes being palettes previously used to process blocks of pixels, for example using a palette predictor built from the two or more palettes. This makes it possible to reduce the amount of information to be transmitted for each palette associated with a coding unit. Various embodiments are described below.
(128) As the approach of the invention is prediction-based, the obtaining of the predictor for the current coding unit is first described with reference to
(129) General steps of a decoding process implementing the invention are shown in
(130) According to first embodiments to obtain the palette predictor, blocks of pixels of the image are processed according to a predefined scanning order as described above with reference to
(131) According to the scan order used, it is generally observed that the last used palette, and thus the palette predictor, is one of the palettes used for previously processed coding units that are contiguous with the current coding unit. This ensures high redundancy between the coding units is taken into account to obtain an efficient coding. For example, based on
(132) To avoid poor palette prediction due to a palette predictor which has content far from the actual content of the current coding unit, specific embodiments provide a reset of the last used palette (or of the set of previously used palettes). Such reset may occur at each new CTB, i.e. when the current coding unit starts a new coding entity made of blocks of pixels, or at each new line or row of CTBs as introduced above, or even at each new frame.
(133) Such reset also makes it possible for the encoder or the decoder to estimate the palette mode for each CTB (or row of CTBs or frame) in parallel. However, since some correlations often exist between the coding units of 2 CTBs, the reset is preferably performed for each new row of CTBs. This is illustrated using
(134) The reset at the first CTB of a line is a more efficient approach than the reset at Frame level. This is because the CTBs are encoded in a horizontal raster scan order, and thus for the first CTB of a line of CTBs, the last used palette is potentially predicted from a spatially far CTB (the last one of the CTB line just above). Given the spatial distance between the CTBs, the correlation between them is very low and thus a dependency (prediction) between their respective palettes is not worthwhile.
(135) The reset of the last used palette may mean that no palette is available as a palette predictor to possibly predict a current palette. In that case, the current palette, e.g. the palette of the very first coding unit processed in the first CTB of the line, cannot be predicted. This reset may be performed by setting the Previous_Palette_size variable (storing the size of the last used palette) to 0.
(136) In addition, the reset of the last used palette also increases the global coding cost of the palette mode in a substantial way since no value can be predicted, thereby biasing towards smaller palette sizes and therefore smaller and less efficient palette predictors. This is because new elements in the block can be reused by other blocks, therefore actually mutualizing the bits spent for the first block. A typical solution would be using dynamic programming algorithms such as Viterbi, where the coding decision for a block is selected only after a few blocks have been encoded. This is however prohibitively complex and simpler solutions are preferred.
(137) Such an increase of the coding cost may thus cause the coding mode selector (see step 106) to select the palette mode more rarely, and not immediately after the reset (thus with a delay). However, it is worth having the palette mode selected for coding the coding units. For this purpose, embodiments boost the probability that the palette mode is selected by cheating on the bit coding cost of the first coding of a coding unit after the reset has occurred (i.e. with an empty or default palette predictor). The bit coding cost taken into account is thus lower than it should be, thereby increasing the probability that the palette mode is selected.
(138) False information on the bits used for coding may be provided to artificially obtain this lowering of the bit coding costs. This false or cheating information or “bonus” for a color component may depend on the operating point of the codec (i.e. how much distortion improvement a bit should bring), on the chroma format, on the at least one color component concerned and/or the number of elements in the palette for the current coding unit.
(139) For instance, the case may be considered of a palette of RGB or YUV elements, each component R/G/B/Y/U/V having e.g. 8 bits components. Therefore, the bit cost resulting from the coding of one entry of the palette used for the first coding unit after palette reset is normally 3×8=24 bits. However, in the embodiments discussed here, a bonus may be applied to correct this bit cost, for example by artificially decreasing it to only 8 bits. The palette mode is thus more likely to be selected at step 106.
(140) In particular embodiments where two or more levels are needed to represent a pixel (e.g. one level for the component Y and one level for the pair of components U and V), two or more palettes are actually used to code a block of pixels. A bonus may be applied to each of the two or more palettes. For example, if two palettes, one for component Y and the other for component U+V, are provided, a bonus may consist in modifying the bit cost of one Y element from 8 bits to 4 bits and the bit cost of one U+V element from 16 bits to 6 bits.
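The effect of the bonus on the bit cost supplied to the mode selector can be illustrated with the figures given above. The helper below is purely hypothetical: its name, its arguments and the way the bonus is passed in are assumptions made for the example.

```python
def palette_entry_cost(component_bits, after_reset=False, bonus_bits=None):
    """Bit cost reported to the mode selector for one palette entry (hypothetical helper)."""
    true_cost = sum(component_bits)  # e.g. 3 x 8 = 24 bits for an RGB or YUV entry
    if after_reset and bonus_bits is not None:
        # after a reset, the artificially lowered cost is reported instead
        return min(true_cost, bonus_bits)
    return true_cost
```

For example, palette_entry_cost([8, 8, 8]) gives 24 bits, whereas the same entry after a reset with a bonus of 8 is reported as 8 bits. With separate Y and U+V palettes, the calls palette_entry_cost([8], True, 4) and palette_entry_cost([8, 8], True, 6) give 4 and 6 bits respectively.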
(141) As it may be worth predicting the palette for the coding unit first processed, a variant includes replacing the last used palette by a by-default palette, so that the by-default palette is used as a palette predictor for the first coding unit of the line of CTBs. Various ways to generate a by-default palette (both the encoder and the decoder operate in the same way) may be considered. As an example, the by-default palette may include a set of predetermined entries corresponding to colour values equally distributed over a colour space. The equidistribution may concern the three colour components. However, in a preferred embodiment, the colour values may be equally distributed over the Y component. And the U and V component values are fixed, for example at the median value of the U or V components in the colour space. The U and V values may be directly computed from the bit-depth of the components, by assigning the value bit-depth/2 or bit-depth>>1, where ‘>>’ is the right shift operator. An example of the distribution along the Y component is the following formula:
YLevel=(Level*bit-depth)/Previous_Palette_size,
(142) where Level corresponds to the entry index of the by-default table as explained above (and thus is incremented by one at each new value Y), and bit-depth is the bit-depth of the Y component. Note that the Previous_Palette_size may be equal to the size of the last used palette, or to an average Palette_size computed from the last decoded CTB or line of CTBs or frame, or to a predefined number such as 4.
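One possible reading of the by-default palette is sketched below. It assumes that “bit-depth” in the above expressions stands for the number of representable values of a component (1 << bit_depth, i.e. 256 for 8-bit components), so that the Y values span the component range and the U and V values sit at its middle. This interpretation, like the function and parameter names, is an assumption of the sketch.

```python
def default_palette(previous_palette_size, bit_depth=8):
    """Generate a by-default palette of (Y, U, V) entries (hypothetical helper)."""
    num_values = 1 << bit_depth      # assumed meaning of "bit-depth" in the formula
    mid = num_values >> 1            # fixed U and V value, middle of the range
    return [((lvl * num_values) // previous_palette_size, mid, mid)
            for lvl in range(previous_palette_size)]
```

With a Previous_Palette_size of 4 and 8-bit components, the Y values are 0, 64, 128 and 192, each paired with U = V = 128.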
(143) According to other embodiments to obtain the palette predictor, reference palette predictors are associated with respective coding entities of coding units that form the image, such as CTBs, lines of CTBs, slices, slice segment, tile, frame or sequence. And the palette predictor for the current block of pixels is the reference palette predictor associated with the coding entity that includes the current block of pixels. These embodiments may require the reference palette predictor to be transmitted in the bitstream for each coding entity.
(144)
(145) At the beginning of the CTB decoding (or CTB line, Slice, Frame, etc. decoding), the reference palette predictor 1516 is extracted from the bitstream at step 1515. Note that the use of the reference palette predictor may lead to having no palette provided in the bitstream for any coding unit of the current coding entity CTB. In this case one flag may be provided in the bitstream and thus be extracted therefrom in order to signal not to decode any palette for the coding units. In a variant, this flag may be replaced by using the value 0 for the Palette_size to indicate to the decoder that no palette is to be decoded for the current coding unit. This variant requires that the Palette_size be equal to decodedSize instead of decodedSize+1 in step 1003 above. To save any bit used for signaling the use of the reference palette predictor, the reference palette predictor can be transmitted at the end of the CTB if at least one CU of the current CTB is coded using the Palette coding mode.
(146) In any case, the reference palette predictor is extracted and decoded if needed to decode one of the coding units of the current CTB. Module 1502 extracts the prediction mode. If it is not the Palette mode (test 1503), the decoder decodes the CU with the corresponding mode 1517. Otherwise (the palette mode), the palette for the current coding unit is decoded as in
(147) When the coding unit is decoded (1517, 1520) as described above for
(148) As described, the reference palette predictor transmitted is used as the predictor of the palette used for each coding unit CU in the current CTB. As described below, the reference palette predictor can be used to predict elements of the palette or, in a variant, the reference palette predictor can be used as the palette for the current coding unit. In that case, the reference palette predictor 1516 is directly transmitted to module 1509 thereby causing module 1507 to be no longer required.
(149) The selection of the reference palette predictor at the encoder may contribute to coding efficiency. Several algorithms can be used to determine the “best” reference palette predictor. For example, the palette used to predict the largest coding unit in the current CTB can be selected as the reference palette predictor for the CTB. In another example, the palette that minimizes a rate-distortion criterion from the palettes used to predict all the coding units composing the current CTB may be determined and used as the reference palette predictor for the CTB. Of course, other selection criteria may be used.
(150) According to yet other embodiments to obtain the palette predictor, the palette predictor for the current coding unit includes entries corresponding to values of pixels neighbouring the current coding unit. In these embodiments, the palette predictor for the current CU is extracted from the neighboring pixels as shown by way of example in
(151) In this example, the selected pixels are pixels contiguous with the upper and left sides of the current block of pixels, because they belong to the causal area as defined above with reference to
(152) In one embodiment, a restricted set of neighbouring pixels is considered. For example, this set of pixels is selected so that the selected pixels are as far apart from one another as possible. This creates diversity and avoids duplicate pixels.
(153)
(154) In one embodiment, the neighbouring pixels 1701 are classified 1702. Note that the neighbouring pixels may include pixels that are not directly contiguous with the current coding unit. Each neighboring pixel of the considered set of neighbouring pixels is associated with a class (so with an entry index) depending on its colour distance from already existing entries in the palette predictor, for example using the criteria of step 1207 in
(155) Based on the non-ordered palette predictor 1703 and the occurrences 1704, the palette predictor 1706 with ordered entries is built at step 1705, for example having the most frequent entries first. Note that the entries having insignificant occurrences (for example below a threshold) may be discarded from the palette predictor.
(156) In an embodiment, two pixels of the same class have exactly the same pixel value (the criterion used for classification thus does not involve absolute value and requires a threshold TH set to zero). Note that in the images targeted by HEVC RExt (and thus by the Palette mode) which include text or screenshots, there are few different values in contiguous coding units. Classifying the pixels based on an identity of pixel values is thus relevant.
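The construction of a palette predictor from neighbouring pixels, in the embodiment where two pixels of the same class have exactly the same value, may be sketched as follows. The function name, the min_occurrences parameter standing for the occurrence threshold, and the optional max_size limit are assumptions introduced for the example.

```python
from collections import Counter


def predictor_from_neighbours(neighbours, min_occurrences=1, max_size=None):
    """Build an ordered palette predictor from neighbouring pixel values (sketch)."""
    # classification 1702 with TH = 0: one class per distinct pixel value
    counts = Counter(tuple(p) for p in neighbours)
    # discard entries having insignificant occurrences
    kept = [(value, n) for value, n in counts.items() if n >= min_occurrences]
    # step 1705: most frequent entries first (the sort is stable for ties)
    kept.sort(key=lambda item: -item[1])
    pred = [value for value, _ in kept]
    return pred if max_size is None else pred[:max_size]
```

For instance, with neighbours containing white three times, black twice and one isolated value, and min_occurrences set to 2, the predictor contains white followed by black.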
(157)
(158) According to yet other embodiments to obtain the palette predictor, the current palette has ordered entries, and predicting the current palette using the palette predictor comprises predicting an entry of the current palette from a preceding entry of the same current palette. In other words, the palette predictor when processing a given entry of the palette is made of (includes) the entries that are prior to the given entry in the colour palette currently being built. The current palette is thus intra predicted.
(159) This is illustrated by
(160) As shown, the Palette_size is decoded and computed at steps 2201 and 2202. Next, the first palette element is decoded. As the palette is intra predicted, the first palette element is not predicted and thus is directly decoded from the bitstream. Then the variable i provided to successively consider each palette entry is set equal to 1 at step 2204. The other palette elements are decoded through the next steps. In particular, for each palette element, a flag, namely Use_Pred, is decoded at step 2206 to determine whether or not (test 2207) the palette element at index i uses intra prediction. If it does not use intra prediction, the palette element is decoded directly from the bitstream at step 2208. Otherwise, index j corresponding to the index of the palette element predictor in the current palette is decoded from the bitstream at step 2210. Note that the encoder may have coded index j relative to index i in order to save bits, in which case the decoder operates in the reverse way. Then the residual is decoded at step 2211 and the palette element Pal[i] is set equal to Res[i]+Pal[j] and added to the palette at step 2212. Then the index i is incremented by one at step 2209 to consider the next palette element. Once all the palette elements have been decoded (test 2205), the process continues at step 1008 of
(161) In one embodiment, the element predictor of the palette element i is the palette element i−1, i.e. the palette entry directly preceding the current palette element in the current palette. In such case, module 2210 can be omitted, and the palette element Pal[i] is set equal to Res[i]+Pal[i−1] when it is predicted. In one embodiment, all the palette entries, except the first one, are predicted from the palette element directly preceding them in the current palette. In such case, the Use_pred flag may be omitted since the decoder knows how to obtain/decode the palette elements using intra prediction. This means that modules 2206 and 2208 can be omitted.
(162) To improve the coding efficiency of the intra prediction of the palette elements, the palette elements may be ordered at the encoder according to their values rather than their occurrences.
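A minimal sketch of the intra-predicted palette reconstruction is given below. The entropy decoding is not modelled: the function is assumed to receive values already parsed from the bitstream, as a first entry followed by records (Use_Pred, j, values), where the values are a residual when Use_Pred is set and a direct entry otherwise. This record layout and the names used are assumptions of the sketch.

```python
def decode_intra_predicted_palette(first_entry, records):
    """Rebuild a palette whose entries may be predicted from earlier entries (sketch)."""
    pal = [tuple(first_entry)]               # the first element is never predicted
    for use_pred, j, values in records:      # loop of steps 2205 to 2209
        if use_pred:                         # test 2207
            if not 0 <= j < len(pal):
                raise ValueError("the predictor index must refer to a preceding entry")
            # step 2212: Pal[i] = Res[i] + Pal[j], component by component
            entry = tuple(r + p for r, p in zip(values, pal[j]))
        else:
            entry = tuple(values)            # step 2208: direct decoding
        pal.append(entry)
    return pal
```

In the embodiment where each entry is predicted from the directly preceding one, every record would carry j equal to i−1 and the Use_Pred flag could be dropped.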
(163) According to yet other embodiments, the current palette is predicted from entries of two or more palettes. This means that a palette predictor could be built from two or more palettes. In particular, the two or more palettes may be partly or entirely merged to form a new palette predictor for the current palette.
(164) This is because the prediction mechanisms as introduced above may rely on a single palette selected, as palette predictor, from e.g. a set of palettes used to predict previously processed blocks of pixels. This may impact the quality of the palette predictor. For example, if among successive blocks of pixels B1, B2 and B3, blocks B1 and B3 are each made of a lot of different pixels but B2 is made of few different pixels, the fact of using the directly previous palette as a palette predictor for the next palette leads to using the palette of B2 (which has few palette elements) as a palette predictor for B3. But this would drastically reduce the number of elements in the palette predictor for B3 and thus the ability to efficiently predict the palette for B3.
(165) The inventors have found it is worth combining two or more palettes to build a new palette to process a new block of pixels.
(166)
(167) In the process of
(168) The palettes pal.sub.0, . . . , pal.sub.P−1 may be ordered for example to first process the most recent palettes, e.g. with low indexes. This is to add more recent elements as close as possible to the beginning in the palette predictor.
(169) The process starts at step 2400 by initializing the first palette to consider (“i”=0) and the current predictor element in the palette predictor “pred” to be built (“n”=0). The process then enters the loops to successively consider each palette pal.sub.i.
(170) At step 2401, the palette element counter “j” is initialized at 0 to consider the first palette element of the current palette pal.sub.i.
(171) At step 2402, it is checked whether or not the current palette element pal.sub.i[j] of the current palette pal.sub.i satisfies a particular criterion to trigger or not addition of this palette element to the palette predictor “pred”.
(172) The triggering criterion may simply rely on comparing pal.sub.i[j] to the elements already added in the palette predictor “pred” (i.e. pred[0] . . . pred[n−1]) to decide the addition of the current palette element to “pred” if pal.sub.i[j] is different from pred[0] . . . pred[n−1], and to decide not to add the current palette element to “pred” if pal.sub.i[j] is the same as one element of “pred”. Note that the comparison between two elements pal.sub.i[j] and pred[k] may be a strict comparison or strict similarity (strict equality between their components) or a loose comparison/similarity (the differences between corresponding components of the elements are below respective thresholds). In a variant, only a specific amount of the n elements pred[k] may be involved in the comparison, the exact amount depending on the value of “n”. This is because the number of comparisons may quickly increase. Thus, using for example n/2 or at most 4 elements involved in the comparison may be a good trade-off between coding efficiency and complexity.
(173) However other triggering criteria may be involved such as a bitmap of Use_pred flags as described below with reference to
(174) The outcome of step 2402 is a decision as to whether or not to add the current palette element pal.sub.i[j] to the palette predictor “pred”.
(175) If the decision is taken not to add it, the process goes to step 2405.
(176) If the decision is taken to add it, the process goes to step 2403 where the current predictor element pred[n] is set to the current palette element pal.sub.i[j]. The next predictor element in “pred” is next selected by incrementing “n”.
(177) Next, step 2404 consists in checking whether or not a maximum number of predictor elements of “pred” have been determined. If they have not been, the process goes to step 2405. Otherwise, the palette predictor “pred” is fully determined and the process ends at step 2409.
(178) At step 2405, the next element in the current palette pal.sub.i is selected by incrementing the palette element counter “j”.
(179) At step 2406, it is checked whether or not palette elements of the current palette pal.sub.i remain to be considered and processed: j<J.sub.i. If so, the process loops back to step 2402 to process the next palette element pal.sub.i[j]. If the whole current palette pal.sub.i has been processed, the process goes to step 2407, where the palette counter “i” is incremented to consider the next palette, provided that not all the palettes pal.sub.0, . . . pal.sub.P−1 have yet been processed (check at step 2408).
(180) If all the palettes pal.sub.0, . . . , pal.sub.P−1 have been processed, the process ends at step 2409.
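The loops of steps 2400 to 2409 may be sketched as follows, using as triggering criterion the comparison with the elements already added to “pred”. The function name and the single threshold parameter are assumptions of the sketch. A threshold of 0 gives the strict comparison, while a larger value accepts as duplicates elements whose components differ by at most that value.

```python
def build_palette_predictor(palettes, max_size, threshold=0):
    """Merge palettes pal_0 .. pal_{P-1} into one palette predictor (sketch)."""
    pred = []
    if max_size <= 0:
        return pred
    for pal in palettes:                     # loop on i (steps 2407, 2408)
        for element in pal:                  # loop on j (steps 2405, 2406)
            # step 2402: is the element already represented in pred?
            duplicate = any(
                all(abs(a - b) <= threshold for a, b in zip(element, existing))
                for existing in pred
            )
            if not duplicate:
                pred.append(tuple(element))  # step 2403
                if len(pred) >= max_size:    # step 2404
                    return pred              # step 2409
    return pred
```

Passing the most recent palette first places its elements at the beginning of the predictor, in line with the ordering of the palettes discussed above.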
(181) Note that the above various embodiments to obtain the palette predictor may be partly or all combined to provide several bases for predicting all or some of the palette elements.
(182) Turning now to the syntax elements to actually describe the palette prediction to the decoder, reference is now made to
(183) According to embodiments concerning the syntax elements, predicting the current palette using the palette predictor comprises obtaining a bitmap of flags, each of which defines whether or not a corresponding entry in the palette predictor is selected as an entry of the current palette. As a result, in addition to information making it possible for the decoder to retrieve the appropriate palette predictor, only the bitmap needs to be sent to the decoder. This bitmap may be sent in replacement of the palette as defined in HEVC RExt for the current coding unit.
(184) The syntax of the bitmap contains M flags, M being equal to the number of elements in the palette predictor. The ith decoded flag defines whether or not the element i from the palette predictor is used to fill (predict) the current palette for the current coding unit. In a variant, the bitmap may be restricted to a lower number of flags from a flag corresponding to the first element in the palette predictor to a flag corresponding to the last element that has to be used as an element predictor. The size of the bitmap is specified in the bitstream in a similar fashion to that in which the palette size is specified in HEVC RExt bitstream.
(185) For example, the elements of the palette predictor that are associated with a flag (bit) equal to 1 are copied in the current palette at the first available position, keeping their order.
(186) In another embodiment, additional entries are added at the end of the current palette having the selected entries from the palette predictor. For example, first the bitmap is decoded from the bitstream and the corresponding entries of the palette predictor are copied into the current palette, then additional pixels may be added at the end of the current palette in the same way as the conventional palette transmission.
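The two embodiments above (copying the flagged predictor entries in order, then optionally appending explicitly transmitted entries) can be sketched as follows. The function and parameter names are assumptions made for the illustration; a bitmap shorter than the predictor models the variant in which the bitmap is restricted to fewer flags.

```python
def predict_palette(predictor, use_pred, extra_entries=()):
    """Fill the current palette from the predictor, then append extra entries."""
    # Entries whose flag is 1 are copied at the first available position,
    # keeping their order; entries beyond the end of the bitmap are unused.
    palette = [entry for entry, flag in zip(predictor, use_pred) if flag == 1]
    # Additional entries, decoded as in conventional palette transmission.
    palette.extend(extra_entries)
    return palette
```

For instance, with a five-entry predictor `["A", "B", "C", "D", "E"]`, the bitmap `[1, 0, 1, 1, 1]` and one extra entry `"F"`, the sketch yields `["A", "C", "D", "E", "F"]`.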
(187) In one embodiment seeking to provide the predicted palette elements as additional palette elements, the determination of the Palette_size is adapted so that it is increased by the number of predicted palette elements: to do so, the Palette_size is set equal to the decoded size+the number of flags set equal to 1 in the bitmap (Palette_pred_enabled_size). If Palette_pred_enabled_size is equal to 0, the Palette_size is set equal to the decoded size+1 as described in step 1003.
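The size rule of this embodiment can be written as a short sketch. The function name is hypothetical; `decoded_size` stands for the size value read from the bitstream and `use_pred` for the decoded bitmap.

```python
def compute_palette_size(decoded_size, use_pred):
    """Apply the Palette_size rule of the embodiment above."""
    palette_pred_enabled_size = sum(1 for flag in use_pred if flag == 1)
    if palette_pred_enabled_size == 0:
        # No predicted element: conventional rule, decoded size + 1.
        return decoded_size + 1
    # Otherwise the predicted elements are counted on top of the decoded size.
    return decoded_size + palette_pred_enabled_size
```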
(188)
(189) First, the palette predictor 1902 is obtained at step 1901 according to any of the embodiments described above. In addition, the Predictor_of_palette_size 1903 is also obtained. Module 1905 decodes N flags from the bitstream 1904, where N=Predictor_of_palette_size.
(190) For each flag equal to 1, the corresponding element from the palette predictor is added to the current palette 1907 at the first available index, during step 1906. Palette_pred_enabled_size 1908 representing the number of flags equal to 1 in the bitmap is transmitted to decision module 1910. The size of the remainder of the palette is also decoded from the bitstream 1909. Decision module 1910 determines whether or not Palette_pred_enabled_size is equal to 0. If it is equal to 0 meaning that there is no predicted palette element in the current palette, the Palette_size is set equal to the decoded Size+1 at step 1911, and the variable i used to successively consider each entry of the current palette is set equal to 0 at step 1912. If Palette_pred_enabled_size is different from 0 meaning that there is at least one predicted palette element in the current palette, the Palette_size is set equal to the decoded Size+Palette_pred_enabled_size at step 1913, and the variable i is set equal to Palette_pred_enabled_size. Next, the decoding loop of palette elements is performed through steps 1915, 1916 and 1917 corresponding to steps 1005, 1006 and 1007 of
(191) Note that the values “0” and “1” for the flags may be inverted, meaning that flag=1 is used when the corresponding element in the palette predictor is not used to predict an element of the palette under construction (the reverse for flag=0).
(192) This inversion of the meaning of the flag values is useful to prevent a phenomenon called “start code emulation”: if a series of bytes in the bitstream matches what is called a start code, the series must be transformed so that it no longer matches the start code, thereby keeping start codes unique. This transformation is an expansion process that increases the size of the bitstream. By using 1 instead of 0, this size increase is avoided.
(193)
(194) One may note that when the palette predictor is transmitted, only the flags (bitmap) corresponding to the palette predictor are needed. To reduce signaling, the same bitmap may be used for all the coding units belonging to the same CTB, slice, tile, slice segment, frame or sequence for which a single reference palette predictor is transmitted.
(195) The bitmap of Use_pred flags has been defined in the above description referring to
(196) In some embodiments, the palette predictor under construction includes entries from a first palette which has been predicted based on a second palette (used as predictor) using a bitmap of flags as described above, each flag of which defines whether or not a corresponding entry in the second palette is selected as an entry to predict an entry in the first palette. Particular to this embodiment is that the palette predictor is built by also including the entries of the second palette corresponding to a flag of the bitmap that defines no selection of the entry to predict the first palette.
(197)
(198) Three Coding Units, CU1 to CU3, are shown that may be consecutive coding units being processed in a current image.
(199) Reference 2500 represents the palette used to process (encode or decode) CU1. This palette may have been encoded in the bitstream (and thus retrieved by the decoder) or predicted using any mechanism described in the present application.
(200) Using the predictor generation mechanism based on the last used palette as described above, this palette 2500 is used as a palette predictor for building the palette 2501 to process CU2. The prediction of palette 2501 is based on bitmap 2506 of Use_pred flags as described above. It is to be recalled that the flags take the value 1 or 0 depending on whether or not, respectively, the corresponding element is used for predicting the palette of a next CU. In a variant, flag=1 may mean not selecting the corresponding element, while flag=0 may mean selecting the element for predicting the palette of the next CU.
(201) As a result, in the present example, the first, third, fourth and fifth elements of palette predictor 2500 are copied into palette 2501 as defined in the bitmap 2506. The second element 2502 is not reused (flag=0 in bitmap 2506). Note that an additional palette element 2503 may have been added to the end of palette 2501 being built, based on the mechanisms described above (e.g. explicitly transmitted in the bitstream).
(202) Also, palette 2501 is used as a palette predictor to build the palette to process CU3. In the example of the Figure, all the elements of palette 2501 are copied (step 2504) into the palette for CU3. In a variant to this example, a bitmap (not shown) may be provided to identify which elements of palette 2501 should be copied into the palette for CU3, similarly to the bitmap 2506 defining the elements to be copied into palette 2501.
(203) Specific to this embodiment of the invention, palette predictor 2500 is also used as a predictor for building the palette to process CU3.
(204) To achieve such building, a palette predictor 2505 is built from palettes 2500 and 2501. As mentioned above, all the elements of palette 2501 are copied (step 2504) into palette predictor 2505. In this example, the entries of palette predictor 2500 corresponding to a flag of the bitmap that defines no selection of the entry to predict palette 2501 (i.e. usually with flag=0, for example element 2502), are added (step 2508) to palette predictor 2505. This is because the other entries of palette predictor 2500 are already in palette predictor 2505 thanks to the copying step 2504. This selection of element 2502 can be performed very quickly thanks to the flags in bitmap 2506.
(205) A bitmap may be provided to predict, based on palette predictor 2505, the palette to process CU3.
(206) Of course, palette predictor 2505 may also be directly the palette to process CU3. However, palette predictor 2505 continuously grows as it includes all the elements defined in previous palettes. It grows up to a limit beyond which the elements that no longer fit in the palette predictor are not added to it, regardless of the value of their Use_pred flag.
(207) The addition of element 2502 is preferably performed at the end of palette predictor 2505. One may directly observe that the resulting palette predictor is enriched compared to situations described above.
(208) One particular advantage of adding the unused elements at the end of the palette predictor is that the elements are approximately ordered by their age and their level of use. As a result, the last elements in the palette predictor are the least useful ones and the most likely to be removed. A decision can thus be taken to remove some elements from the palette predictor under construction, for example based on the number of uses of an element when processing the last M (M integer to be defined) blocks of pixels using respective palettes that include this element.
(209) Of course, this process can be adapted so as to put unused elements first in the palette predictor, or even interleaved with some of the elements from palette 2501.
(210) Note that the selection of unused elements from a previous palette guarantees that the elements are unique, and therefore the Use_pred flags are not redundant. The palette predictor efficiency is thus maximized.
(211) The above approach of the invention involving the building of a palette predictor from two or more palettes only impacts the palette predictor building step 1901 of
(212)
(213) To sum up this process with reference to the example of
(214) In addition, the possible stuffing elements (such as 2502) are added to “pred”, further to the already copied elements.
(215) Structure “pred” contains a maximum of N.sub.MAX elements. N.sub.MAX ideally can be bigger than the maximum number of elements in a palette. A good compromise between coding efficiency and memory has been found with N.sub.MAX=64, i.e. twice the maximum size of a palette in the example.
(216) “pal” is a structure containing N.sub.cur elements, dedicated to storing the last palette used, i.e. palette 2501 in the example of
(217) “last” is a structure containing N.sub.last elements, dedicated to storing a previous palette or predictor, for example the palette predictor of the last used palette, i.e. palette predictor 2500 in the example of
(218) Note that “pal” is the last palette used while “last” is the palette predictor of this last palette used.
(219) Step 2600 initializes the copy of “pal” into “pred”: the first element of each structure is selected by setting the loop counter “i” to 0. Then the copy loop starts at step 2601: the current element of “pred” is set equal to the current element of “pal”. Then step 2602 makes it possible to select the next element of “pal” by incrementing the loop counter “i”.
(220) Step 2603 then checks whether or not the last element of either the palette predictor under construction “pred” or the palette “pal” has been reached (given N.sub.MAX for “pred” and N.sub.cur for “pal”).
(221) If the last element has not been reached, the process loops back to step 2601 to copy the next element. Otherwise, the process goes to step 2604.
(222) The contribution of the other palette, here previous palette 2500, to the building of the palette predictor 2505 according to this embodiment of the invention results from the following steps 2604 to 2608.
(223) These steps allow the stuffing operation referenced 2508 in
(224) Step 2604 selects the first element of the previous palette “last” by initializing the loop counter “j” to 0.
(225) Step 2605 then occurs where it is checked whether or not the last element of either “pred” or “last” has been reached.
(226) If not, the process continues in step 2606. Otherwise, the process ends at step 2609.
(227) Step 2606 consists in checking whether or not the current element “j” in “last” has already been reused. Like at step 2402 above, this may consist in checking whether the Use_pred flag associated with this element in the “Use_pred” array is set to 0 (not reused) or 1 (reused). In a variant, it may consist in verifying whether or not the current element already exists in “pred” being built.
(228) If not reused, step 2607 occurs where the current element “j” is added from “last” to “pred” (at the end of “pred”). So the next element of “pred” is selected by incrementing “i”.
(229) In any case, the next element in “last” is selected by incrementing “j” at step 2608.
(230) When all the elements of “last” or “pred” have been processed, the process ends at step 2609.
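Steps 2600 to 2609 can be summarized by the sketch below, which assumes the Use_pred convention in which 0 means “not reused”. The function name and the list-based representation are assumptions. Unlike the flowchart, which copies an element before testing the loop condition, the sketch tests first so that an empty “pal” is handled safely.

```python
def build_predictor_from_last(pal, last, use_pred, n_max=64):
    """Build "pred" from the last palette "pal" and its own predictor "last"."""
    pred = []
    # Steps 2600 to 2603: copy "pal" into "pred", bounded by N_MAX and N_cur.
    i = 0
    while i < n_max and i < len(pal):
        pred.append(pal[i])                # step 2601
        i += 1                             # step 2602
    # Steps 2604 to 2608: stuff "pred" with the elements of "last" that were
    # not reused to predict "pal", appended at the end of "pred".
    j = 0                                  # step 2604
    while i < n_max and j < len(last):     # step 2605
        if use_pred[j] == 0:               # step 2606: element j not reused
            pred.append(last[j])           # step 2607
            i += 1
        j += 1                             # step 2608
    return pred                            # step 2609
```

With `last = ["A", "B", "C", "D", "E"]`, `use_pred = [1, 0, 1, 1, 1]` and `pal = ["A", "C", "D", "E", "F"]`, which mirrors the CU1 to CU3 example, the result is `["A", "C", "D", "E", "F", "B"]`: the unused element comes after all the elements of the last palette.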
(231) Note that for a next iteration of the process of
(232) As mentioned above, this embodiment provides a way to build each new palette predictor in such a way that the size of the palette predictor tends to continuously increase. Of course, the number N.sub.MAX provides a limit to the maximum size of the palette predictor. However, N.sub.MAX is usually selected quite large, often larger than the maximum size of the palettes.
(233) It is to be recalled that a bitmap of Use_pred flags has to be provided to the client device to perform prediction of the new palette from the newly constructed palette predictor. The larger the value of N.sub.MAX, the greater the number of extraneous Use_pred flags in the bitmap. This has a cost in transmission because each flag costs at least one bit.
(234) A way to mitigate these extraneous costs, without additional techniques, is to compute N.sub.MAX as a combination of N.sub.cur and N.sub.last. For example, N.sub.MAX may be defined as fact*N.sub.last, where fact may depend on the color format and the component(s) affected. For instance, fact=6 for 4:4:4 color format, while fact=1 for luma and fact=2 for chroma in other formats.
(235) However, this approach has been found not to be the optimal solution, in particular in the case of the embodiment of
(236) An embodiment as proposed in
(237)
(238) A bitmap as described above (i.e. without the modified syntax) is shown at the top of
(239) part referenced 2700 which intermingles flags set to “0” and flags set to “1”. Part 2700 ends with the last flag set to “1”. This part defines all the elements of the palette predictor that are used in the predicted palette; and
(240) part referenced 2701 which is the remaining part of the bitmap, exclusively made of flags set to “0” and thus corresponding to elements not reused. Note that it is usual that the last part of the bitmap is made exclusively of “0”s since the corresponding elements are usually the old and less used ones.
(241) Note that this split into two parts is provided here only for illustrative purposes. The bitmap 2700+2701 is provided if the modified syntax according to the invention is not implemented.
(242) Below this bitmap, the bitmap with the modified syntax is shown. One may note that its size is drastically reduced.
(243) The modified syntax exploits the existence of consecutive “0” flags 2702 by inserting additional elements or bits at specific locations in the series of Use_pred flags.
(244) As an illustration, bits 2703 and 2704 have been added to indicate whether or not there are other Use_pred flags set to “1” afterwards in the bitmap.
(245) These additional bits are designated “end-of-prediction” flags and they do not provide indication as to whether or not a corresponding entry in the palette predictor is selected as an entry to predict an entry in the palette currently under construction. On the contrary, these additional flags shift the Use_pred flags to the right.
(246) As an example, the “end-of-prediction” flags may take the value “0” to indicate that there are other flags equal to 1 in the remaining part of the bitmap, while they may take the value “1” when there are no other flags equal to 1 in the remainder of the bitmap.
(247) Regarding the example of 2703 and 2704, the value “0” of flag 2703 indicates that there are remaining elements to predict using the bitmap, as evidenced by the flags being set to 1 in the subpart 2705, while flag 2704 is set to 1, because there is no longer any element predicted (all the other Use_pred flags are set to 0).
(248) As a consequence, while flags 2703 and 2704 were added, subpart 2701 could be entirely skipped, although only subpart 2702 is skipped in the example, thus reducing costs of transmission.
(249) The locations for the additional “end-of-prediction” flags are preferably predefined depending on the properties of the palette mode. However, they should be selected taking into account what follows:
(250) it is worth having an end-of-prediction flag early to avoid sending too many Use_pred flags for small palettes; and
(251) it is worth having end-of-prediction flags at periodic intervals, ideally powers of 2, to parse the bitmap easily.
(252) Taking this into account, an embodiment provides predefined positions for the end-of-prediction flags after the 4.sup.th Use_pred flag location, and then every eight Use_pred flags starting after the 16.sup.th Use_pred flag, then after the 24.sup.th Use_pred flag and so on.
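The predefined positions of this embodiment can be expressed as a small predicate. The sketch assumes that positions are counted as the number of Use_pred flags already read (1-based); the function name is hypothetical.

```python
def has_end_of_prediction_flag(flags_read):
    """True if an end-of-prediction flag follows the flags_read-th Use_pred flag."""
    # After the 4th flag, then after the 16th, 24th, 32nd, ... (every eight).
    return flags_read == 4 or (flags_read >= 16 and flags_read % 8 == 0)
```

Under this assumption, the first positions carrying an end-of-prediction flag are 4, 16, 24, 32 and 40.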
(253) The bottom of
(254) The steps 2715 to 2717 of the process are specific to the modified syntax.
(255) Step 2710 initializes the decoding loop by setting the loop counter “i” to 0 and the number “j” of predicted elements to 0.
(256) Step 2711 then checks whether or not there is any element left in the palette predictor (i<N), as it may happen that the predictor is empty (N=0), and that there is no flag left to decode (a palette having a maximum number N.sub.MAX of elements).
(257) If there is no further element to process, the process ends at step 2718.
(258) Otherwise, the Use_pred flag is decoded for element i of the palette predictor at step 2712.
(259) At step 2713, it is determined whether or not the element i is used for predicting the palette under construction. For example, a flag set to 1 means that the corresponding element i of the palette predictor is used.
(260) In the affirmative, the number “j” of used elements is incremented at step 2714.
(261) In any case, the process continues with step 2715, rather than going directly to step 2719 as it would without the modified syntax.
(262) Step 2715 checks whether or not an “end-of-prediction” flag (e.g. 2703 or 2704) is present next to the current i.sup.th Use_pred flag. For example, the check can be based on the value of “i” (e.g. whether it is 4, 16, 24, . . . as suggested above).
(263) If the next flag is not an “end-of-prediction” flag, a normal process resumes by going to step 2719.
(264) Otherwise, the “end-of-prediction” flag is decoded at step 2716. Once it has been decoded, step 2717 determines whether or not the “end-of-prediction” flag indicates the end of the prediction, i.e. if it is set to 1.
(265) If the “end-of-prediction” flag does not indicate the end of the prediction, the decoding of the Use_pred flags proceeds at step 2719 by selecting the next Use_pred flag through incrementing of the loop counter “i”.
(266) If the “end-of-prediction” flag indicates the end of the prediction, the process ends at 2718.
(267) The outcome of this process is that all relevant Use_pred flags have been obtained to determine which element of the palette predictor should be used for predicting the palette under construction. Note that the elements for which no Use_pred flag has been obtained must be considered as unused.
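The decoding loop of steps 2710 to 2719 may be sketched as follows. Several points are assumptions made for the illustration: the bitstream is modelled as an iterable of already-parsed bits, the end-of-prediction positions follow the embodiment with flags after the 4th, 16th, 24th, ... Use_pred flag, and an end-of-prediction flag is read at such a position even when it coincides with the last predictor element.

```python
def has_end_of_prediction_flag(flags_read):
    # Assumed positions: after the 4th, then 16th, 24th, ... Use_pred flag.
    return flags_read == 4 or (flags_read >= 16 and flags_read % 8 == 0)


def decode_use_pred(bits, n):
    """Decode the Use_pred bitmap for a predictor of n elements.

    Returns (use_pred, j) where j is the number of predicted elements.
    """
    it = iter(bits)
    use_pred = [0] * n        # elements with no decoded flag count as unused
    i = 0                     # step 2710
    j = 0
    while i < n:              # step 2711
        flag = next(it)       # step 2712: decode Use_pred for element i
        use_pred[i] = flag
        if flag == 1:         # step 2713
            j += 1            # step 2714
        if has_end_of_prediction_flag(i + 1):   # step 2715
            if next(it) == 1:                   # steps 2716 and 2717
                break                           # step 2718: end of prediction
        i += 1                # step 2719
    return use_pred, j
```

For a 20-element predictor whose used elements are the 1st, 3rd, 4th and 6th, the sketch reads 18 bits (16 Use_pred flags and 2 end-of-prediction flags) instead of 20 plain Use_pred flags; the saving grows with the length of the trailing run of zeros.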
(268) According to other embodiments regarding the syntax elements, predicting the current palette using the palette predictor comprises obtaining at least one (possibly two or more) entry residual corresponding to the difference between at least one corresponding entry of the current palette and an entry of the palette predictor. In these embodiments, a residual between the current palette element and the palette predictor is transmitted in the bitstream. The residual Res[i] is equal to Pal[i]-Pal_Pred[j],
(269) where Res[i] is the residual for level i, Pal[i] is the current palette element for level i, and Pal_Pred[j] is the element predictor identified by level j. Note that the palette predictor j usually needs to be transmitted unless it is known by the decoder (for example because j is fixed relatively to i, for instance j=i).
(270) The decoding of the residual for the three colour components is different from the decoding of the palette element. Indeed, as mentioned in the prior art, a palette element is coded with a fixed length of N bits, with N=3*bit-depth. For the residual and in order to save bits, each colour component residual may be coded with an adaptive code, such as a Golomb code (in the same way as the coefficients of the block residual).
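To illustrate why a Golomb-type code saves bits on small residuals, the sketch below uses a Golomb-Rice code with parameter k and a sign-to-unsigned mapping. Both choices, as well as the function names, are assumptions made for the illustration; the description does not specify which Golomb variant or binarization is used.

```python
def to_unsigned(v):
    """Map a signed residual to a non-negative integer: 0, -1, 1, -2, 2, ..."""
    return 2 * v if v >= 0 else -2 * v - 1


def to_signed(u):
    """Inverse of to_unsigned."""
    return u >> 1 if u % 2 == 0 else -((u + 1) >> 1)


def rice_encode(v, k):
    """Golomb-Rice code: unary quotient, then a k-bit remainder."""
    u = to_unsigned(v)
    q, r = u >> k, u & ((1 << k) - 1)
    bits = [1] * q + [0]                                   # unary part
    bits += [(r >> s) & 1 for s in range(k - 1, -1, -1)]   # remainder, MSB first
    return bits


def rice_decode(bits, k):
    """Decode one value produced by rice_encode with the same k."""
    it = iter(bits)
    q = 0
    while next(it) == 1:
        q += 1
    r = 0
    for _ in range(k):
        r = (r << 1) | next(it)
    return to_signed((q << k) | r)
```

With k=2, the residual (1, −2, 0) costs 3 bits per component, i.e. 9 bits in total, against 24 bits for a fixed-length element at a bit-depth of 8.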
(271)
(272) The decoded size of the palette is extracted from the bitstream at step 2101 and the Palette_size is set equal to the decoded Size+1 at step 2102. The variable i used to successively consider each palette entry is set to 0 at step 2103. Then the loop to decode the palette starts with test 2104 to determine whether or not all the palette entries have been processed. For the palette element i, a flag, Use_pred, is decoded from the bitstream at step 2105 to determine (test 2106) whether or not the palette element i uses prediction. If the palette element i does not use prediction, it is decoded at step 2107 using conventional mechanisms and added to the palette at the level equal to i. Then the variable i is incremented by one at step 2108 to consider the next palette element. If the palette element i uses prediction, the predictor index j is decoded from the bitstream at step 2112. Note that for coding efficiency purposes, the length of the code used to encode the predictor index j depends on Predictor_of_Palette_size 2110. Thus, in parallel, the palette predictor 2111 is obtained as described above and Predictor_of_Palette_size 2110 is also obtained.
(273) Once the predictor index j is known, the residual Res[i] of the palette element is also decoded from the bitstream at step 2113. Then, the palette element Pal[i] is computed from the formula Res[i] +Pal_Pred[j] at step 2114 using the palette predictor Pal_Pred 2111. The palette element Pal[i] is then added to the current palette. Next, the variable i is incremented by one at step 2108 to consider the next palette element. At the end of this process, the current palette has been decoded.
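The residual-based decoding loop can be sketched as below. To keep the example self-contained, the bitstream is replaced by a list of already-parsed entries; the dictionary keys (`use_pred`, `j`, `res`, `value`) and the function name are hypothetical and only stand for the syntax elements decoded at steps 2105, 2112, 2113 and 2107.

```python
def decode_palette_with_residuals(entries, pal_pred):
    """Rebuild the current palette from parsed entries and the predictor."""
    palette = []
    for entry in entries:                 # loop of test 2104 and step 2108
        if entry["use_pred"]:             # steps 2105 and 2106
            j = entry["j"]                # step 2112: predictor index
            res = entry["res"]            # step 2113: residual Res[i]
            # Step 2114: Pal[i] = Res[i] + Pal_Pred[j], component by component.
            elem = tuple(r + p for r, p in zip(res, pal_pred[j]))
        else:
            elem = tuple(entry["value"])  # step 2107: conventional decoding
        palette.append(elem)
    return palette
```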
(274) In one embodiment, the index j is set equal to i, in which case the predictor index j is no longer required to be transmitted to the decoder. Consequently, module 2112 can be omitted. In addition, a residual may be obtained for every element of the current palette that has a corresponding entry with the same entry index/level in the palette predictor. In this case, if i is greater than or equal to the Predictor_of_Palette_size, no residual is decoded. Furthermore, the flag Use_pred is no longer required since the decoder knows which palette elements to predict based on Predictor_of_Palette_size. Consequently, modules 2105 and 2106 can be omitted. These approaches reduce the number of signaling bits required for the palette prediction, by removing the signaling of the predictors. This is useful when the palette elements are ordered according to their occurrences.
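When j is fixed to i and neither the predictor index nor Use_pred is signalled, the decoder only needs the palette size and Predictor_of_Palette_size to know how to interpret each decoded triplet. The sketch below assumes the triplets are already parsed into a list `symbols`; this name and the function name are hypothetical.

```python
def decode_palette_same_index(symbols, palette_size, pal_pred):
    """Variant with j = i: residual below the predictor size, raw value above."""
    palette = []
    for i in range(palette_size):
        sym = symbols[i]
        if i < len(pal_pred):
            # A predictor entry exists at the same level: sym is a residual.
            palette.append(tuple(r + p for r, p in zip(sym, pal_pred[i])))
        else:
            # i >= Predictor_of_Palette_size: sym is a conventionally coded element.
            palette.append(tuple(sym))
    return palette
```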
(275) In embodiments, only one or two colour components out of the three (or more) are predicted.
(276) Above have been described several ways to obtain a palette predictor (
(277)
(278) The executable code may be stored either in read only memory 2303, on the hard disk 2306 or on a removable digital medium such as for example a disk. According to a variant, the executable code of the programs can be received by means of a communication network, via the network interface 2304, in order to be stored in one of the storage means of the communication device 2300, such as the hard disk 2306, before being executed.
(279) The central processing unit 2301 is adapted to control and direct the execution of the instructions or portions of software code of the program or programs according to embodiments of the invention, which instructions are stored in one of the aforementioned storage means. After powering on, the CPU 2301 is capable of executing instructions from main RAM memory 2302 relating to a software application after those instructions have been loaded from the program ROM 2303 or the hard-disk (HD) 2306 for example. Such a software application, when executed by the CPU 2301, causes the steps of the flowcharts shown in
(280) Any step of the algorithms shown in
(281) Although the present invention has been described hereinabove with reference to specific embodiments, the present invention is not limited to the specific embodiments, and modifications will be apparent to a skilled person in the art which lie within the scope of the present invention.
(282) Many further modifications and variations will suggest themselves to those versed in the art upon making reference to the foregoing illustrative embodiments, which are given by way of example only and which are not intended to limit the scope of the invention, that being determined solely by the appended claims. In particular the different features from different embodiments may be interchanged, where appropriate.
(283) In the claims, the word “comprising” does not exclude other elements or steps, and the indefinite article “a” or “an” does not exclude a plurality. The mere fact that different features are recited in mutually different dependent claims does not indicate that a combination of these features cannot be advantageously used.