Video Encoding or Decoding Methods and Apparatuses with Scaling Ratio Constraint
20230007281 · 2023-01-05
Inventors
- Tzu-Der Chuang (Hsinchu City, TW)
- Chih-Wei Hsu (Hsinchu City, TW)
- Ching-Yeh Chen (Hsinchu City, TW)
- Chia-Ming TSAI (Hsinchu City, TW)
- Chun-Chia Chen (Hsinchu City, TW)
- Olena Chubach (San Jose, CA, US)
- Lulin CHEN (San Jose, CA, US)
- Yu-Wen Huang (Hsinchu City, TW)
CPC classification
H04N19/196
ELECTRICITY
H04N19/132
ELECTRICITY
H04N19/105
ELECTRICITY
H04N19/59
ELECTRICITY
H04N19/70
ELECTRICITY
H04N19/139
ELECTRICITY
International classification
H04N19/196
ELECTRICITY
H04N19/105
ELECTRICITY
H04N19/132
ELECTRICITY
H04N19/139
ELECTRICITY
Abstract
Video processing methods and apparatuses for processing a current block in a current picture by reference picture resampling include receiving input data of the current block and determining a scaling window of the current picture and a scaling window of a reference picture. The current picture and the reference picture may have different scaling window sizes. A ratio between a scaling window width, height, or size of the current picture and a scaling window width, height, or size of the reference picture is constrained to be within a ratio constraint. A reference block is generated from the reference picture according to the ratio and is used to encode or decode the current block.
Claims
1. A video processing method in a video encoding or decoding system, comprising: receiving input video data of a current block in a current picture; determining a scaling window width, height, or size of the current picture; determining a scaling window width, height, or size of a reference picture, wherein a ratio between the scaling window width, height, or size of the current picture and the scaling window width, height, or size of the reference picture is within a ratio constraint; generating a reference block from the reference picture according to the ratio between the scaling window width, height, or size of the current picture and the scaling window width, height, or size of the reference picture; performing motion compensation for the current block using the reference block; and encoding or decoding the current block in the current picture.
2. The method of claim 1, wherein the ratio constraint is between ⅛ and 2.
3. The method of claim 2, wherein the scaling window size comprises both the scaling window width and the scaling window height, and the ratio between the scaling window size of the current picture and the scaling window size of the reference picture is within the ratio constraint when 2 times the scaling window width of the current picture is greater than or equal to the scaling window width of the reference picture, 2 times the scaling window height of the current picture is greater than or equal to the scaling window height of the reference picture, the scaling window width of the current picture is less than or equal to 8 times the scaling window width of the reference picture, and the scaling window height of the current picture is less than or equal to 8 times the scaling window height of the reference picture.
4. The method of claim 1, wherein the scaling window width of the current picture is derived by a picture width, a left scaling window offset, and a right scaling window offset of the current picture, and the scaling window height of the current picture is derived by a picture height, a top scaling window offset, and a bottom scaling window offset of the current picture.
5. The method of claim 4, wherein the scaling window width of the current picture is derived by subtracting the left scaling window offset and the right scaling window offset from the picture width of the current picture, and the scaling window height of the current picture is derived by subtracting the top scaling window offset and the bottom scaling window offset from the picture height of the current picture.
6. The method of claim 4, wherein the picture width, left scaling window offset, right scaling window offset, picture height, top scaling window offset, and bottom scaling window offset of the current picture are signaled in a Picture Parameter Set (PPS) associated with the current picture.
7. The method of claim 4, wherein the left scaling window offset, right scaling window offset, top scaling window offset, and bottom scaling window offset are measured in chroma samples.
8. The method of claim 7, wherein the scaling window width of the current picture is further derived by a variable SubWidthC and the scaling window height of the current picture is further derived by a variable SubHeightC, wherein the variables SubWidthC and SubHeightC indicate down-sampling ratios associated with chroma bitplanes in horizontal and vertical dimensions.
9. The method of claim 8, wherein the scaling window width of the current picture is derived by multiplying the variable SubWidthC with a sum of the left scaling window offset and the right scaling window offset and then subtracting from the picture width of the current picture, and the scaling window height of the current picture is derived by multiplying the variable SubHeightC with a sum of the top scaling window offset and the bottom scaling window offset and then subtracting from the picture height of the current picture.
10. The method of claim 1, wherein the ratio constraint is between 1/M and N, wherein M and N are positive integers.
11. The method of claim 1, wherein the scaling window size comprises both the scaling window width and the scaling window height, and the ratio between the scaling window size of the current picture and the scaling window size of the reference picture is within the ratio constraint when N times the scaling window width of the current picture is greater than or equal to the scaling window width of the reference picture, N times the scaling window height of the current picture is greater than or equal to the scaling window height of the reference picture, the scaling window width of the current picture is less than or equal to M times the scaling window width of the reference picture, and the scaling window height of the current picture is less than or equal to M times the scaling window height of the reference picture.
12. The method of claim 1, wherein a reference picture scaling ratio is derived for motion compensation from the scaling window width, height, or size of the current picture and the scaling window width, height, or size of the reference picture, and the reference picture scaling ratio is constrained to be within a range of [2048, 32768].
13. The method of claim 1, further comprising generating, at an encoder side, or receiving, at a decoder side, a bitstream corresponding to encoded data of a video sequence, wherein the bitstream complies with a bitstream conformance requirement that 2 times the scaling window width of the current picture is greater than or equal to the scaling window width of the reference picture, 2 times the scaling window height of the current picture is greater than or equal to the scaling window height of the reference picture, the scaling window width of the current picture is less than or equal to 8 times the scaling window width of the reference picture, and the scaling window height of the current picture is less than or equal to 8 times the scaling window height of the reference picture.
14. An apparatus of processing video data in a video encoding or decoding system, the apparatus comprising one or more electronic circuits configured for: receiving input video data of a current block in a current picture; determining a scaling window width, height, or size of the current picture; determining a scaling window width, height, or size of a reference picture, wherein a ratio between the scaling window width, height, or size of the current picture and the scaling window width, height, or size of the reference picture is within a ratio constraint; generating a reference block from the reference picture according to the ratio between the scaling window width, height, or size of the current picture and the scaling window width, height, or size of the reference picture; performing motion compensation for the current block using the reference block; and encoding or decoding the current block in the current picture.
15. A non-transitory computer readable medium storing program instruction causing a processing circuit of an apparatus to perform a video processing method for video data, and the method comprising: receiving input video data of a current block in a current picture; determining a scaling window width, height, or size of the current picture; determining a scaling window width, height, or size of a reference picture, wherein a ratio between the scaling window width, height, or size of the current picture and the scaling window width, height, or size of the reference picture is within a ratio constraint; generating a reference block from the reference picture according to the ratio between the scaling window width, height, or size of the current picture and the scaling window width, height, or size of the reference picture; performing motion compensation for the current block using the reference block; and encoding or decoding the current block in the current picture.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0021] Various embodiments of this disclosure that are proposed as examples will be described in detail with reference to the following figures, and wherein:
[0022]
[0023]
[0024]
[0025]
[0026]
[0027]
DETAILED DESCRIPTION OF THE INVENTION
[0028] It will be readily understood that the components of the present invention, as generally described and illustrated in the figures herein, may be arranged and designed in a wide variety of different configurations. Thus, the following more detailed description of the embodiments of the systems and methods of the present invention, as represented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention.
Constrain Reference Picture Scaling Ratio
[0029] In VVC draft 6, one bitstream conformance requirement is applied to constrain a picture size ratio of a reference picture to a current picture to be within [⅛, 2]. The picture size ratio is derived from a reference picture width/height/size and a current picture width/height/size. The picture size ratio constraint is specified to be within [⅛, 2] because the interpolation filters only support scaling ratios between ⅛ and 2. Some embodiments of the present invention apply the [⅛, 2] ratio constraint to a scaling ratio between a current scaling window width, height, or size and a reference scaling window width, height, or size. The scaling ratio is calculated from scaling window widths, heights, or sizes instead of picture widths, heights, or sizes.
[0030] In one embodiment, a scaling window width of a current picture, PicOutputWidthL, is derived from a picture width pic_width_in_luma_samples, a left scaling window offset scaling_win_left_offset, and a right scaling window offset scaling_win_right_offset signaled in the PPS associated with the current picture, i.e. PicOutputWidthL=pic_width_in_luma_samples−(scaling_win_right_offset+scaling_win_left_offset). A scaling window height of the current picture, PicOutputHeightL, is derived from a picture height pic_height_in_luma_samples, a top scaling window offset scaling_win_top_offset, and a bottom scaling window offset scaling_win_bottom_offset, i.e. PicOutputHeightL=pic_height_in_luma_samples−(scaling_win_bottom_offset+scaling_win_top_offset). When scaling_window_flag is equal to 1, let refPicOutputWidthL and refPicOutputHeightL be a scaling window width of a reference picture and a scaling window height of the reference picture respectively. A reference block in the reference picture is determined to be referenced by a current block of the current picture. For example, a video encoding system determines the reference block by motion estimation, and a video decoding system determines the reference block by parsing motion information of the current block signaled in the video bitstream. It is a requirement of the bitstream conformance that all of the following four conditions are satisfied, so that the ratio between the scaling window size of the current picture and the scaling window size of the reference picture is within the ratio constraint [⅛, 2]: two times the scaling window width of the current picture is greater than or equal to the scaling window width of the reference picture, two times the scaling window height of the current picture is greater than or equal to the scaling window height of the reference picture, the scaling window width of the current picture is less than or equal to eight times the scaling window width of the reference picture, and the scaling window height of the current picture is less than or equal to eight times the scaling window height of the reference picture. That is, PicOutputWidthL*2≥refPicOutputWidthL, PicOutputHeightL*2≥refPicOutputHeightL, PicOutputWidthL≤refPicOutputWidthL*8, and PicOutputHeightL≤refPicOutputHeightL*8.
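The derivation and the four conformance conditions of this embodiment can be sketched as follows. This is a minimal illustration, not text from any specification: the container PpsFields and the function names are hypothetical, and the offsets are taken to be in luma samples, as in this embodiment.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class PpsFields:
    """Hypothetical container for the PPS values named in the text."""
    pic_width_in_luma_samples: int
    pic_height_in_luma_samples: int
    scaling_win_left_offset: int = 0
    scaling_win_right_offset: int = 0
    scaling_win_top_offset: int = 0
    scaling_win_bottom_offset: int = 0


def scaling_window(pps: PpsFields) -> Tuple[int, int]:
    """Return (PicOutputWidthL, PicOutputHeightL) for one picture."""
    width = pps.pic_width_in_luma_samples - (
        pps.scaling_win_right_offset + pps.scaling_win_left_offset)
    height = pps.pic_height_in_luma_samples - (
        pps.scaling_win_bottom_offset + pps.scaling_win_top_offset)
    return width, height


def within_ratio_constraint(cur: PpsFields, ref: PpsFields) -> bool:
    """Check the four conditions that correspond to the [1/8, 2] constraint."""
    cur_w, cur_h = scaling_window(cur)
    ref_w, ref_h = scaling_window(ref)
    return (cur_w * 2 >= ref_w and cur_h * 2 >= ref_h
            and cur_w <= ref_w * 8 and cur_h <= ref_h * 8)
```

Because the check uses scaling windows rather than picture sizes, a 3856-sample-wide reference picture with a 16-sample right offset has a 3840-sample window and can be referenced by a picture with a 1920-sample window, whereas the same reference picture without the offset cannot.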
[0031] To generalize the above embodiment of constraining the scaling window width and scaling window height of the current picture based on the scaling window width and scaling window height of the reference picture, it is a requirement of the bitstream conformance that all of the following conditions are satisfied: N times the scaling window width of the current picture is greater than or equal to the scaling window width of the reference picture, N times the scaling window height of the current picture is greater than or equal to the scaling window height of the reference picture, the scaling window width of the current picture is less than or equal to M times the scaling window width of the reference picture, and the scaling window height of the current picture is less than or equal to M times the scaling window height of the reference picture. That is, PicOutputWidthL*N≥refPicOutputWidthL, PicOutputHeightL*N≥refPicOutputHeightL, PicOutputWidthL≤refPicOutputWidthL*M, and PicOutputHeightL≤refPicOutputHeightL*M. The ratio between the scaling window size of the current picture and the scaling window size of the reference picture is then within a ratio constraint [1/M, N], where N and M are positive integers; for example, N is 2 and M is 8 in the previous embodiment.
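The generalized conditions can be illustrated with a short sketch. The function name and the default values are illustrative only; the comparisons are written with multiplications, as in the conditions above, so no division or rounding is involved.

```python
def satisfies_ratio_constraint(cur_w: int, cur_h: int,
                               ref_w: int, ref_h: int,
                               m: int = 8, n: int = 2) -> bool:
    """Check the four generalized conditions for a constraint [1/M, N].

    cur_w, cur_h: scaling window width and height of the current picture.
    ref_w, ref_h: scaling window width and height of the reference picture.
    m, n: positive integers; the defaults reproduce the [1/8, 2] embodiment.
    """
    if m < 1 or n < 1:
        raise ValueError("M and N must be positive integers")
    return (cur_w * n >= ref_w and cur_h * n >= ref_h
            and cur_w <= ref_w * m and cur_h <= ref_h * m)
```

For instance, with M=4 and N=4 a reference window of 7680x4320 becomes acceptable for a 1920x1080 current window, while it is rejected under the default [⅛, 2] constraint.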
[0032] In one embodiment, a ratio constraint [1/M, N] is determined. To encode or decode a current picture, an encoder or decoder checks whether one or more reference pictures satisfy the ratio constraint by determining a scaling window width, height, or size of the current picture and a scaling window width, height, or size of each reference picture. Only a reference picture with a scaling window width, height, or size satisfying the ratio constraint can be referenced by the current picture.
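A minimal sketch of this check, assuming the scaling windows have already been derived and are available as (width, height) pairs. The function name and the list-of-indices return value are hypothetical choices made for illustration.

```python
from typing import List, Sequence, Tuple


def allowed_reference_pictures(cur_window: Tuple[int, int],
                               ref_windows: Sequence[Tuple[int, int]],
                               m: int = 8, n: int = 2) -> List[int]:
    """Return indices of reference pictures whose scaling window
    satisfies the [1/M, N] constraint relative to the current picture."""
    cur_w, cur_h = cur_window
    allowed = []
    for index, (ref_w, ref_h) in enumerate(ref_windows):
        if (cur_w * n >= ref_w and cur_h * n >= ref_h
                and cur_w <= ref_w * m and cur_h <= ref_h * m):
            allowed.append(index)
    return allowed
```

With a 1920x1080 current window and candidate windows of 1920x1080, 3840x2160, 7680x4320, 240x135, and 120x68, only the first, second, and fourth candidates remain available for reference.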
[0033] In some other embodiments, a ratio constraint [1/M, N] is determined, and an encoder or decoder determines a scaling window width, height, or size of a current picture according to a scaling window width, height, or size of a reference picture in order to satisfy the ratio constraint. In one embodiment, the same ratio constraint may constrain both the scaling window ratio and the picture size ratio, and the encoder or decoder also determines a picture size of the current picture according to a picture size of the reference picture to follow the ratio constraint.
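One hypothetical way to choose a current scaling window dimension that follows the constraint is to clamp a desired value to the range permitted by the reference picture. The text does not prescribe clamping; it is used here only to illustrate the permitted range, and both function names are assumptions.

```python
from typing import Tuple


def allowed_current_range(ref_len: int, m: int = 8, n: int = 2) -> Tuple[int, int]:
    """Smallest and largest current scaling window dimension (width or
    height) that satisfies cur*N >= ref and cur <= ref*M."""
    lowest = -(-ref_len // n)   # ceiling of ref_len / N using integers only
    highest = ref_len * m
    return lowest, highest


def clamp_to_constraint(desired_len: int, ref_len: int,
                        m: int = 8, n: int = 2) -> int:
    """Move a desired dimension into the permitted range (illustrative)."""
    lowest, highest = allowed_current_range(ref_len, m, n)
    return max(lowest, min(desired_len, highest))
```

For a reference scaling window width of 1920 and the [⅛, 2] constraint, the current scaling window width may range from 960 to 15360.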
[0034] In another embodiment, scaling window offsets signaled in the PPS are measured in chroma samples. A scaling window width of a current picture, PicOutputWidthL, is derived from a picture width pic_width_in_luma_samples, a left scaling window offset scaling_win_left_offset, and a right scaling window offset scaling_win_right_offset signaled in the PPS, as well as a variable SubWidthC. The value of the variable SubWidthC is defined according to the color sampling format of the video data; for example, SubWidthC is equal to 2 when the color sampling format is 4:2:0. PicOutputWidthL=pic_width_in_luma_samples−SubWidthC*(scaling_win_right_offset+scaling_win_left_offset). Similarly, a scaling window height of the current picture, PicOutputHeightL, is derived from a picture height pic_height_in_luma_samples, a top scaling window offset scaling_win_top_offset, and a bottom scaling window offset scaling_win_bottom_offset, as well as a variable SubHeightC. The value of the variable SubHeightC is also defined according to the color sampling format of the video data; for example, SubHeightC is equal to 2 when the color sampling format is 4:2:0. PicOutputHeightL=pic_height_in_luma_samples−SubHeightC*(scaling_win_bottom_offset+scaling_win_top_offset). The variables SubWidthC and SubHeightC indicate down-sampling ratios associated with the chroma bitplanes in horizontal and vertical dimensions respectively.
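The chroma-sample variant of the derivation can be sketched as below. The text only states the 4:2:0 values; the table entries for the other color sampling formats follow the conventional definitions of SubWidthC and SubHeightC and are included here as an assumption, as are the function and table names.

```python
from typing import Tuple

# (SubWidthC, SubHeightC) per color sampling format. Only the 4:2:0 entry
# is stated in the text; the others are the conventional values.
CHROMA_SUBSAMPLING = {
    "4:0:0": (1, 1),
    "4:2:0": (2, 2),
    "4:2:2": (2, 1),
    "4:4:4": (1, 1),
}


def scaling_window_from_chroma_offsets(pic_width_in_luma_samples: int,
                                       pic_height_in_luma_samples: int,
                                       left: int, right: int,
                                       top: int, bottom: int,
                                       chroma_format: str = "4:2:0") -> Tuple[int, int]:
    """Return (PicOutputWidthL, PicOutputHeightL) when the four scaling
    window offsets are measured in chroma samples."""
    sub_width_c, sub_height_c = CHROMA_SUBSAMPLING[chroma_format]
    width = pic_width_in_luma_samples - sub_width_c * (right + left)
    height = pic_height_in_luma_samples - sub_height_c * (bottom + top)
    return width, height
```

For a 1920x1080 picture in 4:2:0 format with left and right offsets of 8 and top and bottom offsets of 4, the scaling window is 1888x1064 luma samples.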
[0035] Let refPicOutputWidthL and refPicOutputHeightL be a scaling window width and a scaling window height of a reference picture referenced by a current block of the current picture, where refPicOutputWidthL and refPicOutputHeightL are derived from the picture width and height, the scaling window offsets, and the variables SubWidthC and SubHeightC. It is a requirement of the bitstream conformance that all of the following four conditions are satisfied: two times the scaling window width of the current picture is greater than or equal to the scaling window width of the reference picture, two times the scaling window height of the current picture is greater than or equal to the scaling window height of the reference picture, the scaling window width of the current picture is less than or equal to eight times the scaling window width of the reference picture, and the scaling window height of the current picture is less than or equal to eight times the scaling window height of the reference picture. That is, PicOutputWidthL*2≥refPicOutputWidthL, PicOutputHeightL*2≥refPicOutputHeightL, PicOutputWidthL≤refPicOutputWidthL*8, and PicOutputHeightL≤refPicOutputHeightL*8.
[0036] A reference picture scaling ratio, given by RefPicScale[i][j][0] and RefPicScale[i][j][1], is derived for motion compensation from the scaling window size, width, or height specified in the PPS. The reference picture scaling ratio affects which filters are used in the motion compensation stage, and it also affects the memory bandwidth used for the motion compensation stage. In addition to constraining the picture size ratio, embodiments of the present invention constrain the reference picture scaling ratio as well. For example, the reference picture scaling ratios RefPicScale[i][j][0] and RefPicScale[i][j][1] shall be constrained to be within the range of [2048, 32768], which is equivalent to a scaling ratio of [⅛, 2]. It is a requirement of the bitstream conformance that all of the following conditions are satisfied: RefPicScale[i][j][0] shall be greater than or equal to 2048 and smaller than or equal to 32768, and RefPicScale[i][j][1] shall be greater than or equal to 2048 and smaller than or equal to 32768.
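The range [2048, 32768] corresponding to [⅛, 2] implies a fixed-point representation in which 16384 (2 to the power 14) stands for a ratio of 1. The sketch below assumes that representation, and further assumes a rounded integer division of the reference scaling window dimension by the current one, modeled on the VVC draft; the exact rounding and the function names are assumptions, not taken from the text.

```python
SCALE_SHIFT = 14          # assumed: 1 << 14 == 16384 represents a ratio of 1.0
MIN_REF_PIC_SCALE = 2048  # 1/8 in this fixed-point representation
MAX_REF_PIC_SCALE = 32768  # 2 in this fixed-point representation


def ref_pic_scale(ref_len: int, cur_len: int) -> int:
    """Fixed-point ratio of a reference scaling window dimension to the
    corresponding current scaling window dimension (assumed rounding)."""
    if ref_len <= 0 or cur_len <= 0:
        raise ValueError("scaling window dimensions must be positive")
    return ((ref_len << SCALE_SHIFT) + (cur_len >> 1)) // cur_len


def ref_pic_scale_in_range(ref_w: int, ref_h: int,
                           cur_w: int, cur_h: int) -> bool:
    """Check that both scaling ratios lie within [2048, 32768]."""
    scale_0 = ref_pic_scale(ref_w, cur_w)  # plays the role of RefPicScale[i][j][0]
    scale_1 = ref_pic_scale(ref_h, cur_h)  # plays the role of RefPicScale[i][j][1]
    return (MIN_REF_PIC_SCALE <= scale_0 <= MAX_REF_PIC_SCALE
            and MIN_REF_PIC_SCALE <= scale_1 <= MAX_REF_PIC_SCALE)
```

Under these assumptions, a reference window twice as wide as the current window yields 32768, an equal-sized window yields 16384, and a window one eighth as wide yields 2048.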
[0037] For example, three different interpolation filter sets can be selected in motion compensation depending on the scaling ratio. A first interpolation filter set (set 0) includes an 8-tap DCT-IF filter, an affine 6-tap DCT-IF filter, and a 6-tap half-pixel IF filter; a second interpolation filter set (set 1) includes 8-tap RPR filters and corresponding 6-tap affine filters for a 1.5× ratio; and a third interpolation filter set (set 2) includes 8-tap RPR filters and corresponding 6-tap affine filters for a 2.0× ratio. For processing a current block associated with a scaling ratio between ⅛ and 1.25, filters in set 0 are selected; for processing a current block associated with a scaling ratio between 1.25 and 1.75, filters in set 1 are selected; and for processing a current block associated with a scaling ratio between 1.75 and 2, filters in set 2 are selected.
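The selection among the three filter sets can be sketched as a threshold comparison on the fixed-point scaling ratio, where 16384 represents a ratio of 1 (consistent with the [2048, 32768] range corresponding to [⅛, 2]). The text does not say which set applies exactly at 1.25 and 1.75; assigning the boundary values to the lower set is an assumption of this sketch, as are the names used.

```python
UNITY = 1 << 14                   # 16384 represents a scaling ratio of 1.0
THRESHOLD_SET_1 = UNITY * 5 // 4  # 1.25 -> 20480
THRESHOLD_SET_2 = UNITY * 7 // 4  # 1.75 -> 28672


def select_filter_set(scaling_ratio_fp: int) -> int:
    """Return 0, 1, or 2 for a fixed-point scaling ratio in [2048, 32768]."""
    if not 2048 <= scaling_ratio_fp <= 32768:
        raise ValueError("scaling ratio outside the constrained range")
    if scaling_ratio_fp > THRESHOLD_SET_2:
        return 2  # filters for the 2.0x ratio
    if scaling_ratio_fp > THRESHOLD_SET_1:
        return 1  # filters for the 1.5x ratio
    return 0      # regular interpolation filters
```

Because the scaling ratio is constrained to the stated range, the function can reject any other value instead of needing a fourth filter set.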
[0038] Exemplary Flowchart for
[0039]
Video Encoder and Decoder Implementations
[0040] The foregoing proposed video processing methods for reference picture resampling can be implemented in video encoders or decoders. For example, a proposed video processing method is implemented in an inter prediction module of an encoder and/or an inter prediction module of a decoder. Alternatively, any of the proposed methods is implemented as a circuit coupled to the inter prediction module of the encoder and/or the inter prediction module of the decoder, so as to provide the information needed by the inter prediction module.
[0041] A corresponding Video Decoder 600 for decoding the video bitstream generated from the Video Encoder 500 of
[0042] Various components of Video Encoder 500 and Video Decoder 600 in
[0043] Embodiments of the video processing method for encoding or decoding may be implemented in a circuit integrated into a video compression chip or in program code integrated into video compression software to perform the processing described above. For example, determining a reference block in a reference picture may be realized in program code to be executed on a computer processor, a Digital Signal Processor (DSP), a microprocessor, or a field programmable gate array (FPGA). These processors can be configured to perform particular tasks according to the invention by executing machine-readable software code or firmware code that defines the particular methods embodied by the invention.
[0044] Reference throughout this specification to “an embodiment”, “some embodiments”, or similar language means that a particular feature, structure, or characteristic described in connection with the embodiments may be included in at least one embodiment of the present invention. Thus, appearances of the phrases “in an embodiment” or “in some embodiments” in various places throughout this specification are not necessarily all referring to the same embodiment; these embodiments can be implemented individually or in conjunction with one or more other embodiments. Furthermore, the described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. One skilled in the relevant art will recognize, however, that the invention can be practiced without one or more of the specific details, or with other methods, components, etc. In other instances, well-known structures or operations are not shown or described in detail to avoid obscuring aspects of the invention.
[0045] The invention may be embodied in other specific forms without departing from its spirit or essential characteristics. The described examples are to be considered in all respects only as illustrative and not restrictive. The scope of the invention is, therefore, indicated by the appended claims rather than by the foregoing description. All changes which come within the meaning and range of equivalency of the claims are to be embraced within their scope.