DETERMINATION OF A PARAMETER SET FOR A TONE MAPPING CURVE
20230054477 · 2023-02-23
Inventors
- Hu Chen (Munich, DE)
- Yichuan Wang (Beijing, CN)
- Weiwei Xu (Hangzhou, CN)
- Quanhe Yu (Beijing, CN)
- Elena Alexandrovna Alshina (Munich, DE)
Abstract
The present disclosure relates generally to the field of video processing, and more particularly, to high dynamic range (HDR) image processing. In particular, the present disclosure relates to a method for determining a parameter set for a tone mapping curve. The method comprises obtaining a plurality of parameter sets, wherein each parameter set defines the tone mapping curve, and wherein each parameter set is derived based on one of a plurality of HDR video frames. Further, the method comprises temporally filtering the plurality of parameter sets to obtain a temporally filtered parameter set.
Claims
1. A method for determining a parameter set for a tone mapping curve by a decoder, the method comprising: obtaining, by the decoder, a plurality of parameter sets, wherein each parameter set defines a tone mapping curve, and wherein each parameter set is derived based on one of a plurality of High Dynamic Range (HDR) video frames; and temporally filtering, by the decoder, the plurality of parameter sets to obtain a temporally filtered parameter set.
2. The method according to claim 1, wherein the temporally filtering the plurality of parameter sets comprises calculating a weighted average or an average of at least a part of parameters of the plurality of parameter sets.
3. The method according to claim 1, further comprising: generating the tone mapping curve based on the temporally filtered parameter set.
4. The method according to claim 1, wherein each parameter set directly or indirectly defines the tone mapping curve.
5. The method according to claim 1, wherein: each parameter set comprises metadata of the HDR video frame or one or more curve parameters of the tone mapping curve.
6. The method according to claim 1, wherein each parameter set comprises metadata extracted from the respective HDR video frame, and the temporally filtered parameter set comprises temporally filtered metadata.
7. The method according to claim 6, further comprising: computing one or more curve parameters of the tone mapping curve based on the temporally filtered metadata.
8. The method according to claim 7, further comprising: generating the tone mapping curve based on the one or more curve parameters.
9. The method according to claim 1, wherein each parameter set comprises one or more curve parameters of the tone mapping curve computed based on metadata extracted from the respective HDR video frame, and the temporally filtered parameter set comprises one or more temporally filtered curve parameters.
10. The method according to claim 9, further comprising: generating the tone mapping curve based on the one or more temporally filtered curve parameters.
11. The method according to claim 1, wherein the tone mapping curve is obtained by: L′=m_a×(m_p×L^m_n/((m_p−1)×L^m_n+1))^m_m+m_b.
12. The method according to claim 1, further comprising: transmitting or storing the temporally filtered parameter set together with the plurality of HDR video frames.
13. The method according to claim 1, comprising: obtaining a first parameter set of a first HDR video frame and pushing the first parameter set into a queue; obtaining a second parameter set of a second HDR video frame; detecting whether a scene change has occurred between the first HDR video frame and the second HDR video frame, and pushing the second parameter set into the queue when no scene change has occurred; and computing an average of the parameter sets in the queue to obtain the temporally filtered parameter set.
14. The method according to claim 13, comprising: clearing the queue when a scene change has occurred.
15. A decoder comprising: one or more processors; and a non-transitory computer-readable storage medium coupled to the one or more processors and storing programming instructions for execution by the one or more processors, wherein the programming instructions, when executed by the one or more processors, configure the decoder to perform operations comprising: obtaining a plurality of parameter sets, wherein each parameter set defines a tone mapping curve, and wherein each parameter set is derived based on one of a plurality of High Dynamic Range (HDR) video frames; and temporally filtering the plurality of parameter sets to obtain a temporally filtered parameter set.
16. The decoder according to claim 15, wherein the temporally filtering of the plurality of parameter sets comprises calculating a weighted average or an average of at least a part of parameters of the plurality of parameter sets.
17. The decoder according to claim 15, further comprising: generating the tone mapping curve based on the temporally filtered parameter set.
18. The decoder according to claim 15, wherein each parameter set directly or indirectly defines the tone mapping curve.
19. The decoder according to claim 15, wherein each parameter set comprises metadata of the HDR video frame or one or more curve parameters of the tone mapping curve.
20. An encoder comprising: one or more processors; and a non-transitory computer-readable storage medium coupled to the one or more processors and storing programming instructions for execution by the one or more processors, wherein the programming instructions, when executed by the one or more processors, configure the encoder to perform operations comprising: obtaining a plurality of parameter sets, wherein each parameter set defines a tone mapping curve, and wherein each parameter set is derived based on one of a plurality of High Dynamic Range (HDR) video frames; and temporally filtering the plurality of parameter sets to obtain a temporally filtered parameter set.
Description
BRIEF DESCRIPTION OF DRAWINGS
[0051] The above-described aspects and implementation forms (embodiments of the disclosure) will be explained in the following description of specific embodiments in relation to the enclosed drawings.
DETAILED DESCRIPTION OF EMBODIMENTS
[0068] Notably, embodiments of the disclosure may be implemented in blocks 1301/1302 or blocks 1303/1304 of a pipeline 1300 as shown in
[0069] The device 100 may be configured (with reference to
[0070] Thereby, as will be explained later in more detail, each parameter set 102 may comprise metadata 402 (see e.g.,
[0071] Alternatively, each parameter set 102 may comprise one or more curve parameters 502 of the tone mapping curve 300. The one or more curve parameters 502 of the tone mapping curve 300 may be computed (i.e., as obtaining step 201) based on metadata 402 extracted from the respective HDR video frame(s) 101. The curve parameters 502 may be temporally filtered 202, to obtain temporally filtered curve parameters 503 (see e.g.,
[0072] The temporal filtering 202 of the parameter set 102 (i.e., either the metadata 402 or the curve parameters 502) may be done at the encoder or decoder (thus acting as the device 100; various embodiments are explained later with respect to
[0073] The device 100 (encoder or decoder) may comprise a processing circuitry (not shown in
[0074] In particular, the device 100 may comprise a processor for executing a computer program comprising a program code for performing the method 200, i.e., for controlling the device 100 to perform the above-mentioned steps of obtaining 201 the parameter sets 102 and temporally filtering 202 the parameter sets 102 to obtain the filtered parameter set 103.
[0075] According to embodiments of the disclosure, an exemplary tone mapping curve 300, which is also referred to as “phoenix curve” in this disclosure, may be given by:
L′=m_a×(m_p×L^m_n/((m_p−1)×L^m_n+1))^m_m+m_b
[0076] wherein L is a brightness of an input pixel of HDR video frame(s) 101, m_n is a first value, particularly m_n=1, m_m is a second value, particularly m_m=2.4, and m_b is a predetermined perception quantization (PQ) value, wherein m_p is a brightness control factor and m_a is a scaling factor defining a maximum brightness of an output pixel.
[0077] Other embodiments using the “phoenix curve” may use other parameters, e.g. m_m may be in a range from 1 to 5, and m_n may be in a range from 0.5 to 2.
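Collecting the definitions above, the “phoenix curve” can be evaluated as in the following sketch. The functional form is reconstructed here from the parameter descriptions (m_n, m_m, m_b, m_p, m_a) and should be treated as an assumption rather than the normative formula:

```python
def phoenix_curve(L, m_p, m_a, m_b=0.0, m_n=1.0, m_m=2.4):
    """Assumed form of the phoenix tone mapping curve (PQ domain).

    L' = m_a * ((m_p * L^m_n) / ((m_p - 1) * L^m_n + 1))^m_m + m_b
    L is the PQ-domain brightness of an input pixel, in [0, 1].
    """
    x = L ** m_n
    return m_a * ((m_p * x) / ((m_p - 1.0) * x + 1.0)) ** m_m + m_b
```

Note that with m_a=1 and m_b=0 this form maps L=1 to 1 for any m_p, while a larger m_p lifts mid-tones, which is consistent with the description of m_p as a brightness control factor.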
[0078] Embodiments may use other non-linear tone mapping curves (other than the “phoenix curve”), and may conduct temporal filtering of the metadata used to define these other curves, or temporal filtering of the curve parameters of these other curves directly.
[0079] The curve parameters 502 mentioned above may, in particular, comprise the parameters m_p and m_a.
[0080] The temporally filtered parameter set 103 may be used by the decoder to generate such a phoenix tone mapping curve 300. Thereby, the filtering of the parameters set 102 may be done by the encoder or the decoder. For instance, the temporally filtered parameter set 103 may comprise temporally filtered curve parameters 503 including the parameters m_p and m_a.
[0081] Thereby, the parameter m_p is a brightness control factor: the larger m_p, the higher the brightness. Moreover, m_a is a scaling factor that controls the maximum output brightness of the tone mapping performed with the tone mapping curve 300. The tone mapping curve 300 can be designed in the PQ domain; in other words, the input L and the output of the tone mapping curve 300 can both refer to PQ values. The input L ranges from 0 to 1, where a PQ value of 0 corresponds to 0 nit and a PQ value of 1 corresponds to 10000 nit in the linear domain. The output value ranges from 0 to a PQ value that is equal to or below the maximum display brightness in the PQ domain.
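The PQ domain used above is defined by SMPTE ST 2084. As a minimal self-contained sketch, the conversion between linear luminance in nits and normalized PQ code values is:

```python
# SMPTE ST 2084 (PQ) constants
M1 = 2610 / 16384          # 0.1593017578125
M2 = 2523 / 4096 * 128     # 78.84375
C1 = 3424 / 4096           # 0.8359375
C2 = 2413 / 4096 * 32      # 18.8515625
C3 = 2392 / 4096 * 32      # 18.6875

def nits_to_pq(nits):
    """Linear luminance (0..10000 nit) -> normalized PQ value (0..1)."""
    y = max(nits, 0.0) / 10000.0
    ym = y ** M1
    return ((C1 + C2 * ym) / (1.0 + C3 * ym)) ** M2

def pq_to_nits(e):
    """Normalized PQ value (0..1) -> linear luminance in nit."""
    ep = e ** (1.0 / M2)
    num = max(ep - C1, 0.0)
    den = C2 - C3 * ep
    return 10000.0 * (num / den) ** (1.0 / M1)
```

This matches the statement above that a PQ value of 0 corresponds to 0 nit and a PQ value of 1 corresponds to 10000 nit.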
[0082] A few examples 300A, 300B, and 300C of the tone mapping curve 300 with different maximum input brightness and maximum display brightness are plotted in
[0083] There may be different modes, in which a system of encoder and decoder operates, and wherein embodiments of the disclosure may be applied differently. Depending on the mode of the system, the encoder and decoder may be operated differently, and the device 100 may be either the encoder or the decoder. That is, the method 200 according to embodiments of the disclosure may be performed in the encoder or the decoder depending on the mode. The encoder and decoder may generally have different roles in the modes
[0084] As an example, with reference to the China Ultra-HD Video Industrial Alliance (CUVA) HDR standard (CUVA HDR standard), a first mode may hereinafter be referred to as “automatic mode”, and a second mode may hereinafter be referred to as “artistic mode”. In both of these modes, the parameter set 102 defining the tone mapping curve 300 may be temporally filtered 202, as described above. These two modes of the CUVA HDR standard are briefly explained in the following:
[0085] Automatic mode (mode flag tone_mapping_mode=0). In this first mode, the curve parameters 502 for generating the tone mapping curve 300 are computed based on (basic) metadata 402 at the decoder. The metadata 402 (or temporally filtered metadata 403 in some embodiments) is provided to the decoder by the encoder. Exemplary metadata 402 is shown in
[0090] Therein, the maxrgb value of a pixel is the maximum of the R, G and B values of that pixel. The values are in the PQ domain, and all four parameters given above are PQ-domain values (thus each value name ends with _pq).
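As an illustration of how such per-frame statistics can be derived, the sketch below computes maxRGB-based statistics over a frame. The returned parameter names and the use of a plain statistical variance are assumptions for illustration only; the exact metadata syntax is defined by the CUVA HDR standard:

```python
def frame_maxrgb_stats(pixels):
    """Per-frame maxRGB statistics; pixel components assumed already in the PQ domain.

    pixels: iterable of (R, G, B) tuples.
    Returns (min_maxrgb_pq, avg_maxrgb_pq, var_maxrgb_pq, max_maxrgb_pq);
    the names are hypothetical placeholders.
    """
    maxrgb = [max(p) for p in pixels]               # maxrgb = max of R, G, B per pixel
    n = len(maxrgb)
    avg = sum(maxrgb) / n
    var = sum((v - avg) ** 2 for v in maxrgb) / n   # plain variance (assumed semantics)
    return (min(maxrgb), avg, var, max(maxrgb))
```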
[0091] Artistic mode (mode flag tone_mapping_mode=1). In this second mode, the one or more curve parameters 502 may be determined, for example, computed using an algorithm at the stage of content production or at the stage of transcoding. In the artistic mode, additional cubic spline parameters such as TH1, TH2, TH3, TH-strength and computed “phoenix” curve parameters 502 such as m_a, m_p, m_b may be used. As shown exemplarily in
[0092] It is worth noting that the “artistic mode” does not necessarily mean that a human artist or colorist is involved. Artificial intelligence (AI) may replace a human artist or colorist in color grading and help in determining the curve parameters 502. Therefore, the fundamental difference between automatic and artistic mode, according to the above definition, is that only the (basic) metadata 402 is transmitted in the automatic mode, whereas the curve parameters 502 or 503 may be computed at the encoder and embedded in the extended metadata in the artistic mode.
[0093] In the following, a way of computing the curve parameters 502, particularly the parameters m_p and m_a, based on the (basic) metadata 402 is described with respect to
[0094] Firstly, intermediate values of MAX1 and max_lum may be computed. Herein, two parameters of the metadata 402, i.e. average_maxrgb and variance_maxrgb, may be used.
[0095] A and B are preset weighting factors, and MIN is a preset lower-threshold value for max_lum. MaxRefDisplay is the peak brightness of a reference display, i.e., the display on which a colorist views the HDR video during color grading. A standard reference display peak brightness may be 1000 nit or 4000 nit, although other values may occur in practice.
[0096] Secondly, the parameter m_p may be computed as follows:
[0097] Therein, avgL is the average of the maxRGB values of all pixels in a frame, TPL0 and TPH0 are preset thresholds for avgL, PvalueH0 and PvalueH1 are preset threshold values for m_p, and g0(w0) is the weight.
[0098] Thirdly, the parameter m_p may be updated using max_lum:
[0099] Therein, TPL1 and TPH1 are preset threshold values for max_lum, PdeltaH1 and PdeltaL1 are preset threshold values for m_p offset, and g1(w1) is the weight.
[0100] Finally, the parameter m_a may be computed using m_p. In this step, the other intermediate value H(L) is calculated.
[0101] Therein, MaxDISPLAY and MinDISPLAY denote the maximum and minimum brightness of the display, and MaxSource and MinSource denote the maximum and minimum brightness (maxRGB value) of the source HDR video.
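The equations for these steps did not survive reproduction here. Purely as an illustrative sketch, the named threshold pairs (TPL0/TPH0, TPL1/TPH1) and limit values (PvalueH0/PvalueH1, PdeltaL1/PdeltaH1) are consistent with a clamped linear interpolation; every numeric value and the direction of each mapping below are hypothetical, not taken from the standard:

```python
def lerp_clamped(x, x0, x1, y0, y1):
    """Clamped linear interpolation; the assumed shape of the weights g0/g1."""
    if x <= x0:
        return y0
    if x >= x1:
        return y1
    w = (x - x0) / (x1 - x0)        # interpolation weight in [0, 1]
    return y0 + w * (y1 - y0)

def compute_m_p(avgL, max_lum,
                TPL0=0.3, TPH0=0.6, PvalueH0=3.5, PvalueH1=5.0,
                TPL1=0.5, TPH1=0.9, PdeltaL1=0.0, PdeltaH1=0.6):
    # All numeric thresholds above are hypothetical placeholders.
    # Step 2: darker frames (low avgL) get a larger brightness control
    # factor m_p (assumed direction).
    m_p = lerp_clamped(avgL, TPL0, TPH0, PvalueH1, PvalueH0)
    # Step 3: update m_p with an offset driven by max_lum (assumed direction).
    m_p += lerp_clamped(max_lum, TPL1, TPH1, PdeltaH1, PdeltaL1)
    return m_p
```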
[0102] Advantageously, the tone mapping curve 300, e.g. generated according to embodiments of the disclosure, is more stable than conventional tone mapping curves, as is shown in
[0103]
[0104] Initially, at least a first parameter set 102 of a first HDR video frame 101 may be obtained 201, and the obtained first parameter set 102 may be pushed into a queue (not shown). Then, a second parameter set 102 of a second HDR video frame 101 may be obtained 201 (as shown). Then it may be detected (at block 601) whether a scene change occurred between the first HDR video frame 101 and the second HDR video frame 101. The second parameter set 102 is pushed into the queue (at block 602) if no scene change occurred (N at block 601). If a scene change occurred (Y at block 601), the queue is cleared (at block 603). In the first case (N at block 601), an average of the parameter sets 102 in the queue may be computed (at block 604), as the temporal filtering step 202, to obtain the temporally filtered parameter set 103.
[0105] In an embodiment, the temporal filtering 202 of the plurality of parameter sets 102 comprises calculating a weighted average (as, e.g., done at block 604), or an average of at least a part of parameters of the plurality of parameter sets 102.
[0106] In particular, the weighted average of the queue can be calculated (at block 604), in order to get the filtered parameter set 103 of the HDR video frames 101 in the time domain. The temporal filtering 202 may be performed based on the following equation:
R = Σ_{k=1}^{n} Q(k)*w_k,  Σ_{k=1}^{n} w_k = 1
[0107] wherein Q(k) is the kth value in the queue and w_k is the weight. The sum of all weights equals 1. By default, all weights may be equal, i.e., w_k=1/n. In an embodiment, larger weights can be assigned to parameter sets of HDR video frames 101 closer to the current HDR video frame 101 (in
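The queue-based procedure (blocks 601–604) combined with the weighted average above can be sketched as follows; the queue length of 8 and the use of a Python dict for a parameter set are assumed choices, not taken from the disclosure:

```python
from collections import deque

class TemporalParamFilter:
    """Queue-based temporal filtering of per-frame parameter sets.

    A parameter set is modeled as a dict of named values (metadata such as
    average_maxrgb, or curve parameters such as m_p and m_a).
    """

    def __init__(self, maxlen=8):
        self.queue = deque(maxlen=maxlen)   # oldest entries drop out automatically

    def filter(self, params, scene_change=False):
        if scene_change:
            self.queue.clear()              # block 603: clear the queue on a scene change
        self.queue.append(params)           # block 602: push the current parameter set
        n = len(self.queue)
        weights = [1.0 / n] * n             # block 604: equal weights w_k = 1/n;
        # the weights must sum to 1; a recency-weighted scheme would also be valid
        return {key: sum(w * p[key] for w, p in zip(weights, self.queue))
                for key in params}
```

For example, filtering {'m_p': 3.0} and then {'m_p': 5.0} yields a filtered m_p of 4.0, while a scene change resets the average to the new frame's values.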
[0110] Although embodiments of the disclosure comprise that the temporal filtering 202 can be applied to both metadata 402 and curve parameters 502, it may be more beneficial to filter metadata 402 for the following two reasons:
[0111] 1) Curve parameters 502, like the parameters m_a and m_p, can be a non-linear function of metadata 402. Filtering such curve parameters 502 in the non-linear domain may thus be more difficult to control. The metadata 402 may be more suitable for such linear filtering.
[0112] 2) In automatic mode, filtering curve parameters 502 like the parameters m_a and m_p can only be used in the decoder. It may not be possible to filter the curve parameters 502 at the encoder, because they are not transmitted by the encoder to the decoder in the automatic mode. But filtering the metadata 402 can be conducted at the encoder, as well as at the decoder, in the automatic mode.
[0113] The above-mentioned embodiments provide the advantage that the computational complexity and memory requirements are reduced in comparison to temporally filtering the entire tone mapping curve 300. Moreover, the potential flickering phenomenon is avoided or reduced, and the stability of displayed content can be guaranteed even when the scene changes rapidly between consecutive HDR video frames 101.
[0114] In the following, some specific embodiments for the first mode (e.g., for the automatic mode) are described with respect to
[0115]
[0116] In particular, the encoder first extracts 701 (as the obtaining step 201) the metadata 402 from the plurality of HDR video frames 101. The metadata 402 is then temporally filtered 202 at the encoder. As for the temporal filtering 202, the same processing procedure as shown in
[0117]
[0118] In particular, the encoder first extracts 701 the metadata 402 from the plurality of HDR video frames 101, and sends the metadata 402 to the decoder. The decoder receives (as the obtaining step 201) the metadata 402, and temporally filters 202 the metadata 402 to obtain temporally filtered metadata 403 (being the filtered parameter set 103). As for the temporal filtering 202 at the decoder, the same processing procedure as shown in
[0119] After obtaining the filtered metadata 403, the tone mapping curve parameters 502 are calculated 702 at the decoder, in order to generate 703 the tone mapping curve 300 at the decoder.
[0120] In particular, the encoder first extracts 701 the metadata 402 from the plurality of HDR video frames 101, and sends it to the decoder. Then the decoder computes 702 (as the obtaining step 201) the curve parameters 502, and then temporally filters 202 the curve parameters 502 to obtain temporally filtered curve parameters 503 (being the filtered parameter set 103). For the temporal filtering 202, the same processing procedure as shown in
[0121] In the following, some specific embodiments for the second mode (e.g., for the artistic mode) are described with respect to
[0122]
[0123] In particular, the encoder first extracts 701 (as the obtaining step 201) the metadata 402 from the plurality of HDR video frames 101. The metadata is then temporally filtered 202 at the encoder to obtain filtered metadata 403 (being the temporally filtered parameter set 103). As for the temporal filtering 202, the same processing procedure as shown in
[0124]
[0125] In particular, the encoder first extracts 701 the metadata 402 from the plurality of HDR video frames 101. The encoder then computes 702 (as the obtaining step 201) the curve parameters 502. The curve parameters 502 are then temporally filtered 202 at the encoder to obtain filtered curve parameters 503 (being the temporally filtered parameter set 103). As for the temporal filtering 202, the same processing procedure as shown in
[0126]
[0127]
[0128] Similarly, in SMPTE 2094-40, anchor points may be used to determine the curve parameters 502, and thus particularly temporal filtering 202 can also be applied to such anchor points. In SMPTE 2094-20 and 2094-30, three parameters 502 in the metadata 402 determine the tone mapping curve 300, including “ShadowGainControl”, “MidtoneWidthAdjustmentFactor” and “HighlightGainControl”, and thus temporal filtering 202 can particularly be applied to these three parameters.
[0129] In all of the above embodiments, the metadata 402 may be dynamic metadata, i.e., the metadata 402 may change from frame 101 to frame 101. Further, in all of the above embodiments, the curve parameters 502 may be the parameters m_a and m_p, which may be used to define the phoenix tone mapping curve 300.
[0130] In the above embodiments shown in
[0131]
[0132] In the HDR preprocessing block 1301, the HDR video remains the same as the input; however, metadata is computed. Further, in the HDR video coding block 1302, the HDR video is compressed by a video codec, e.g., a codec according to H.265 or any other video standard (national, international or proprietary). Moreover, the metadata is embedded in the headers of the video stream, which is sent from the encoder to the decoder (or stored on a storage medium for later retrieval by a decoder). In the HDR video decoding block 1303, the decoder receives the HDR video bitstream, decodes the compressed (or coded) video, and extracts the metadata from the headers.
[0133] Furthermore, in the HDR dynamic tone mapping block 1304, a tone mapping is conducted, to adapt the HDR video to the display capacity.
[0134] For example, the HDR pre-processing block 1301 and/or the HDR dynamic tone mapping block 1304 may implement embodiments of the disclosure.
[0135] The present disclosure has been described in conjunction with various embodiments as examples as well as implementations. However, other variations can be understood and effected by those persons skilled in the art and practicing the claimed disclosure, from a study of the drawings, this disclosure and the independent claims. In the claims as well as in the description, the word “comprising” does not exclude other elements or steps, and the indefinite article “a” or “an” does not exclude a plurality. A single element or other unit may fulfill the functions of several entities or items recited in the claims. The mere fact that certain measures are recited in mutually different dependent claims does not indicate that a combination of these measures cannot be used in an advantageous implementation.