LOCAL DYNAMIC RANGE ADJUSTMENT COLOR PROCESSING

20170347113 · 2017-11-30

    Abstract

    For obtaining robust luminance dynamic range conversion, in particular in coding technologies for defining a second image look from a first one, we describe an image color processing apparatus (205) arranged to transform an input color (R,G,B) of a pixel of an input image (Im_in) having a first luminance dynamic range into an output color (Rs, Gs, Bs) of a pixel of an output image (Im_res) having a second luminance dynamic range, which first and second dynamic ranges differ in extent by at least a multiplicative factor 2, comprising: -a color transformer (100) arranged to transform the input color into the output color, the color transformer having a capability to locally process colors depending on a spatial location (x,y) of the pixel in the input image (Im_in); -wherein the color processing apparatus (205) comprises a geometric situation metadata reading unit (203) arranged to analyze received data (220) indicating that a geometric transformation has taken place between an original image (Im_orig), on which geometric location data (S) was determined for enabling a receiver of that geometric location data to determine at least one region of the original image, and the input image.

    Claims

    1. An image color processing apparatus arranged to transform an input color of a pixel of an input image having a first luminance dynamic range into an output color of a pixel of an output image having a second luminance dynamic range, which first and second dynamic ranges differ in extent by at least a multiplicative factor 2, comprising: a color transformer arranged to transform the input color into the output color, the color transformer having a capability to locally process colors depending on a spatial location of the pixel in the input image; wherein the color processing apparatus comprises a geometric situation metadata reading unit arranged to analyze received data indicating that a geometric transformation has taken place between an original image, on which geometric location data was determined for enabling a receiver of that geometric location data to determine at least one region of the original image, and the input image; and the color transformer is arranged to transform the input color into the output color in dependence on the geometric location data.

    2. An image color processing apparatus as claimed in claim 1, in which the data comprises an indicator codifying that any geometric transformation has taken place, such as e.g. a scaling of the size of the region comprising the image pixels of the original image.

    3. An image color processing apparatus as claimed in claim 1, in which the data comprises a new recalculated value of at least one parameter of the geometric location data codifying at which geometric position of the input image a pixel is to be processed locally.

    4. An image color processing apparatus as claimed in claim 3 further comprising a second indicator indicating that at least one parameter of the geometric location data has been recalculated from its original value.

    5. An image color processing apparatus as claimed in claim 1, in which the data comprises transformation data specifying the geometric transformation which has taken place.

    6. An image transmission apparatus arranged to transmit at least one image comprising pixels with input colors, and arranged to transmit transformation data specifying functions or algorithms for color transforming the input colors in a first luminance dynamic range into output colors (Rs, Gs, Bs) in a second luminance dynamic range, which first and second dynamic ranges differ in extent by at least a multiplicative factor 2, in which the transformation data comprises data for performing local color transformation, that data comprising geometric location data enabling a receiver to calculate which pixel positions of the at least one image are to be processed with the local color transformation, wherein the apparatus comprises geometric situation specification means arranged to encode data indicating that a geometric transformation has taken place between an original image on which the geometric location data was determined and the input image.

    7. An image transmission apparatus as claimed in claim 6, in which the geometric situation specification means is arranged to encode in the data an indicator codifying that any geometric transformation has taken place.

    8. An image transmission apparatus as claimed in claim 6, in which the geometric situation specification means is arranged to change at least one parameter of the geometric location data compared to a value it received for that parameter.

    9. An image transmission apparatus as claimed in claim 6, in which the geometric situation specification means is arranged to encode in the data data specifying the geometric transformation which has taken place.

    10. A method of image color processing comprising the steps of: analyzing received data indicating that a geometric transformation has taken place between an original image, on which geometric location data was determined for enabling a receiver of that geometric location data to determine at least one region of the original image, and the input image; and transforming an input color of a pixel of an input image having a first luminance dynamic range into an output color of a pixel of an output image having a second luminance dynamic range, which first and second dynamic ranges differ in extent by at least a multiplicative factor 2, wherein the applied color transformation depends on the value of the received data.

    11. A method of image color processing as claimed in claim 10, which performs only global color transformation if an indicator in the received data indicates that a geometric transformation has occurred.

    12. A method of image color processing as claimed in claim 10, which performs a re-determination of the geometric location data if the analyzing concludes that a geometric transformation has occurred.

    13. A method of image transmission comprising: obtaining an image; obtaining transformation data for color transforming the image from input colors in a first luminance dynamic range into output colors in a second luminance dynamic range, which first and second dynamic ranges differ in extent by at least a multiplicative factor 2; determining whether the image is geometrically deformed compared to an original image which was used when determining the transformation data; and transmitting the image, the transformation data, and data indicating that a geometric transformation has taken place between the original image and the input image.

    14. (canceled)

    15. A computer program product comprising code codifying each of the steps of the method as claimed in claim 10, thereby when run enabling a processor to perform that respective method.

    Description

    BRIEF DESCRIPTION OF THE DRAWINGS

    [0031] These and other aspects of any variant of the method and apparatus according to the invention will be apparent from and elucidated with reference to the implementations and embodiments described hereinafter, and with reference to the accompanying drawings, which drawings serve merely as non-limiting specific illustrations exemplifying the more general concept, and in which dashes are used to indicate that a component is optional, non-dashed components not necessarily being essential. Dashes can also be used for indicating that elements, which are explained to be essential, are hidden in the interior of an object, or for intangible things such as e.g. selections of objects/regions, indications of value levels in charts, etc.

    [0032] In the drawings:

    [0033] FIG. 1 schematically illustrates a possible color processing apparatus for doing dynamic range transformation including local color processing, which color processing will typically include at least changing the luminances of objects in an input image;

    [0034] FIG. 2 schematically illustrates an example of a system which is arranged to coordinate the dynamic range color transformations needed, when any source apparatus may perform various geometrical transformations on an image to be dynamic range color transformed;

    [0035] FIG. 3 elucidates with one possible example the problems that can occur in practical HDR image or video handling systems which make use of image look encoding on a local basis; and

    [0036] FIG. 4 schematically illustrates basic functionalities of a possible typical HDR image or video handling apparatus, which will supply HDR image(s) to a further apparatus via some image communication technology.

    DETAILED DESCRIPTION OF THE DRAWINGS

    [0037] FIG. 2 shows an easy-to-understand practical example of how one can embody our invention. The skilled reader will understand that one can use the same component configurations in other HDR video handling systems, so this is by no means meant to limit our invention's basic framework principles. Suppose a grader 251 has made a master grading, which is an HDR grading, on an image creation apparatus 250. This image may be seen as a normalized image (with R,G,B having values in [0,1]), i.e. still irrespective of its optimal rendering. In other words, the statistics of the color values will determine on which display, with which peak brightness, this image is best shown (typically, though, the coded image signal S_src may also include a peak brightness of an associated reference display, stating that this image is correctly graded for display on e.g. a 2000 nit display). Because of the normalization, this image may be stored in a legacy video encoding, e.g. HEVC with 10 bits per channel. However, as such an HDR-only image encoding would only render correctly on HDR displays, the grader needs to include some color transformation function data (F) to be able to calculate at least a 100 nit legacy LDR grading from the coded HDR image (Im_orig). The grader will have specified this function (or functions), and more importantly the data S specifying for which pixel locations at least one local color transformation should be done, based on the geometry of the original image (Im_orig) he was working on. Although many image communication technologies are possible, in this example we assume the data (HDR image + functions for re-grading at least to an LDR grading) is stored on a blu-ray disk with HDR capabilities, which can be purchased by a consumer. A BD player, as example of the image transmission apparatus (201), can read at least the image data on the disk, and play it out at normal position, size, etc. It therefore sends this image data, and the functions F, in an image signal S-out to the image color processing apparatus (205), which in this example is incorporated in a television with LED backlights as example of a receiving apparatus (202); but this receiving apparatus could also be e.g. a data storage server with calculation capabilities for re-grading images before storage, etc. In the scenario where the BD player just passes through the geometrically unmodified video, there is no issue. The television will do the required color transformation to get a re-grading optimal for its physical characteristics, and send that image to a display driver 204, e.g. for driving a backlight and LCD pixel valves. However, if the user e.g. starts interacting with the menu of the BD player, it may show images with text and a small rescaled version of the content video, and send that to the television over an image connection (210), e.g. an HDMI cable, or a wireless image communication channel, etc.

    [0038] To communicate the geometrical transformation situation, the BD player may add in the signal S-out one or more types of additional data (220) characterizing the geometrical transformation situation, so that the receiving side can understand it. E.g., there may be a simple indicator (221) merely codifying that any geometric transformation has taken place. But the BD player may also recalculate the data needed for obtaining the spatial positions of pixels to be locally processed, in accordance with the geometric transformation it applied. It may indicate this in the data by e.g. a second indicator (222) indicating that at least one parameter of the geometric location data (S) has been recalculated from its original value, and by geometric selection parameters (223), which now contain not the original geometric location data (S) but e.g. a new starting point (xs2, ys2) as the left-uppermost point of a rectangle, etc.
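The additional data (220) with its indicators (221, 222) and geometric selection parameters (223) can be sketched as a simple data structure. This is a minimal illustration only; the field names are our assumptions and do not correspond to any standardized bitstream syntax.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class GeometricSituationData:
    """Illustrative container for the geometric situation data (220)."""
    transform_occurred: bool                       # simple indicator (221)
    params_recalculated: bool = False              # second indicator (222)
    # Recalculated geometric selection parameters (223): e.g. the new
    # left-uppermost point (xs2, ys2) of the rectangular region.
    region_start: Optional[Tuple[int, int]] = None

# A BD player that has scaled and repositioned the video into a PIP
# could signal, for example:
data_220 = GeometricSituationData(
    transform_occurred=True,
    params_recalculated=True,
    region_start=(480, 270),  # hypothetical new (xs2, ys2)
)
print(data_220.transform_occurred)
```

A receiver that reads `params_recalculated == True` then knows it can use `region_start` directly, instead of the original geometric location data (S).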

    [0039] In case the receiving apparatus is to re-determine how the pixels to be locally processed should be determined geometrically, the transmitting apparatus may add in the data (220) transformation data (224) specifying the geometric transformation which has taken place. This may e.g. be a scale factor (s=e.g. ¼) and an offset (xws, yws) as a number of pixels, or more complex information codifying more complex transformations, which need not necessarily comprise all original pixels, but may also e.g. select a subset of Im_orig in some pass-through window, etc. Finally, the BD player will transmit the data required for doing the correct color transformations (F) and the primary image (Im_in), which can be used directly if a display with the associated PB is connected, or re-graded otherwise. This will be the encoded image gradings data 225.
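For an axis-aligned rectangular region, re-determining the geometric location data from the transformation data (224), i.e. a scale factor s and a pixel offset (xws, yws), amounts to a simple remapping. The function below is a sketch under that assumption; the names are illustrative.

```python
def remap_region(x, y, w, h, s, xws, yws):
    """Recompute a rectangular region (x, y, w, h) of the geometric
    location data (S) after the image has been scaled by factor s and
    offset by (xws, yws) pixels, as codified by transformation data (224).
    Assumes an axis-aligned rectangle; rounding to whole pixels."""
    return (round(x * s + xws),
            round(y * s + yws),
            round(w * s),
            round(h * s))

# Example: a window region on Im_orig, scaled by s = 1/4 and moved
# into a PIP placed at offset (1200, 60):
print(remap_region(800, 200, 400, 300, 0.25, 1200, 60))
# -> (1400, 110, 100, 75)
```

The receiving apparatus can then apply its local color transformation to exactly this remapped region, rather than to the now-incorrect original coordinates.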

    [0040] FIG. 3 shows a typical HDR image or in particular video handling scenario for which the present invention and its embodiments were designed. The reader should realize that in HDR there may not necessarily be only one image (corresponding to only one look, having the relative luminances of its scene objects in a particular configuration of proportions of a first object luminance to a second one). That was the situation for legacy LDR video encoding, because there was only one 0-100 nit luminance range, which existed by definition. Now, however, one must cater for having all possible HDR scenes and their images rendered optimally on various possible final displays, with peak brightnesses of e.g. 100 nit, 400 nit, 1000 nit, 2000 nit, 5000 nit, and 10,000 nit. One can imagine that if one renders an image which has been color graded to look optimally bright on a 100 nit PB display as-is on a 10,000 nit monitor or TV, it may look painfully bright. So a solution is that the better coding systems typically don't just encode a single set of HDR images (e.g. defined on a 5000 nit PB reference luminance range), but rather encode a dual set of images, being e.g. a 5000 nit grading and a legacy 100 nit grading (i.e. images having a look which is correct for direct rendering, i.e. without needing further colorimetric transformation, on a legacy 100 nit LDR display). Furthermore, to save on bandwidth, one may typically want to encode the second image(s) with as little data overhead as possible, i.e. as a functional or algorithmic transformation of the first set of images, which does get sent as actual images, i.e. DCT-ed pixel blocks, e.g. according to the HEVC standard. I.e., one sends e.g. a set of LDR images (which can be used for direct rendering on LDR displays, but which surprisingly at the same time double as images for an HDR look, to supply say a 4000 nit display with an optimally or reasonably looking image), and one sends metadata allowing a receiver to transform the LDR images into HDR images being a close reconstruction of the HDR look images that were created by the content creator at the transmitting side; or, the other way around, the metadata comprises functions to downgrade transmitted HDR images (HEVC encoded) to LDR images.
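The dual-grading principle above — transmitted LDR images plus metadata functions that reconstruct the HDR look — can be illustrated with a toy up-grading function. The power law below is purely an assumption for illustration; real systems transmit the actual function shape the grader specified.

```python
def reconstruct_hdr(ldr_norm, gamma=2.4):
    """Apply an (illustrative) luminance up-grading function, received as
    metadata, to a normalized LDR luminance in [0, 1], yielding a
    normalized relative luminance on the HDR reference range.
    The power-law shape is an assumption, not taken from the source."""
    return ldr_norm ** gamma

# A mid-grey LDR pixel maps to a much darker relative HDR luminance,
# leaving headroom on the HDR range for the genuinely bright regions:
print(reconstruct_hdr(0.5))  # ≈ 0.19
```

The same mechanism works the other way around: the metadata may instead carry a down-grading function applied to transmitted HDR images to obtain the LDR grading.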

    [0041] There would be no problem if we only used global transformations, but it has come to light that it may be advantageous or necessary in some scenarios to define some of the color transformations locally (i.e. although e.g. the colors seen through a window 303 are to a certain extent similar to colors in the remainder of the image (of PIP 302), they are nonetheless transformed differently, because they need to become e.g. very bright, or vice versa subnominally dim). One should realize that this is not just any mere transformation with which one can play at will, but an actual encoding of new images, which ideally need to look precisely as their content creator defined them; i.e. significant special technical care is needed to keep handling them correctly anywhere in any HDR handling apparatus or chain. So we don't just have a situation of handling image resolutions, but actually a handling of image re-definitions, namely a correct adjustment of the color transformation functions.

    [0042] In FIG. 3 we see an example of a PIP, although other similar scenarios are conceivable (e.g. POP, display on a second side display like a mobile phone, coarsening a part of an image to form a low resolution ambilight projected light pattern, etc.). For elucidation, we will assume without wanting to limit ourselves that this would be a scenario of say a blu-ray disk reader doing a PIP of say a second video stream containing some director comments.

    [0043] In the main area 301, there will be a movie. It may be defined according to some version of the possible HDR codecs, and it needs to be ultimately converted to output luminances Luminance_out to be rendered on the display. Now there are many different aspects of HDR coding which need not complicate the discussion of the present invention, e.g. the video may be encoded according to various code allocation functions or EOTFs relating luma codes to luminances, and it may be defined relative to a peak brightness, e.g. 5000 nit, which may be different from that of the rendering display, e.g. 2500 nit. Furthermore, a display may want to do its own image processing, etc. In any case, we can summarize the situation as a global mapping, which we represent with custom transformation curve 311, between input luminances Luminance_in (which would correspond on a 5000 nit luminance axis to the lumas and in general pixel colors received in the HEVC images, in particular Im_1 as in FIG. 4) and ultimate output luminances. Furthermore, one may define this transformation on normalized luminance axes, but the 1.0 on the x-axis then corresponds to an actual 5000 nit, and the 1.0 on the y-axis to e.g. 2500 nit, the PB of the connected or ultimately to be supplied TV. In this example the curve dictates that one needs to brighten to some degree the darker regions (relative luminance sub-range 312), which may e.g. be a black motorcycle in a night scene, and we want to increase the contrast of some brighter regions (relative luminance sub-range 313), e.g. to see everything nicely crisp in the incandescently lit rooms of the houses as seen through the windows. This constitutes the colorimetric transformation graph 310 for the main video.
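The normalized-axes convention described above (1.0 on the input axis corresponding to the source PB of 5000 nit, 1.0 on the output axis to the display PB of 2500 nit) can be sketched as follows. The specific curve is a stand-in for curve 311; all names and the curve shape are illustrative assumptions.

```python
def to_display_luminance(luma_norm, curve, dst_pb=2500.0):
    """Map a normalized input luminance (1.0 = source peak brightness,
    e.g. 5000 nit) through a transformation curve defined on normalized
    axes, then scale to absolute nit on the display whose PB is dst_pb
    (so 1.0 on the normalized output axis = dst_pb nit).
    `curve` is any function [0, 1] -> [0, 1]."""
    return curve(luma_norm) * dst_pb

# Illustrative stand-in for curve 311: mildly brightens the darks.
curve_311 = lambda v: v ** 0.8

print(to_display_luminance(0.5, curve_311))  # ≈ 1436 nit on a 2500 nit display
```

Note that the curve itself is display-pair specific: the same normalized input would be scaled by a different dst_pb (and typically mapped through a different curve) for, say, a 1000 nit TV.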

    [0044] Now secondly, there is the PIP 302, which gets its own video/image(s), and has its own specific, and different, color transformation (graph 315). If the system knew nothing, it would just apply the global transformation 311 also on those pixel colors. Here we assumed that we may have a global color/luminance transformation for the majority of the pixels, and a local transformation 316, e.g. to brighten the outside pixels as seen through window 303 (without losing sight of the generic concepts, the reader can take the example that this secondary video was quickly shot with a cheaper LDR camera, not specifically HDR graded with much care, and is basically converted into rough pseudo-HDR by keeping all the pixels LDR and only boosting the bright outside region 303). So actually we are interested in the functional luminance transformation shape 316 for locally processing the brighter outside pixels only, and in this elucidation we need not bother with what happens to other pixels, like pixels having similar luminances elsewhere in the PIP video (getting transformation 317), or the transformation for the darker pixel colors.
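The decision logic per pixel — local curve 316 inside the designated region, global curve elsewhere — can be sketched as below. The region representation, curve shapes, and names are illustrative assumptions, not the source's actual signal syntax.

```python
def transform_pixel(x, y, luma, region, local_curve, global_curve):
    """Apply the local transformation (e.g. 316) to pixels inside the
    designated window region, and the global curve to all other pixels.
    `region` is an (x, y, w, h) rectangle; all names are illustrative."""
    rx, ry, rw, rh = region
    if rx <= x < rx + rw and ry <= y < ry + rh:
        return local_curve(luma)   # e.g. boost the sunny outside pixels
    return global_curve(luma)

boost = lambda v: min(1.0, v * 2.0)   # crude pseudo-HDR boost, stand-in for 316
identity = lambda v: v                # keep other LDR pixels as-is

window_303 = (100, 100, 200, 150)     # hypothetical region coordinates
print(transform_pixel(150, 120, 0.3, window_303, boost, identity))  # inside: 0.6
print(transform_pixel(10, 10, 0.3, window_303, boost, identity))    # outside: 0.3
```

This makes the hazard concrete: if the region coordinates are stale after a geometric transformation, the boost lands on the wrong pixels, producing exactly the artefacts discussed next.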

    [0045] But it is important that the local transformation of the outside region 303 is performed correctly, otherwise unpleasant and unnatural-looking colors may appear, or worse, geometric artefacts may occur in the ultimate image, not necessarily only in the PIP region but potentially also in the main movie.

    [0046] FIG. 4 shows a little more of a possible apparatus which creates the geometric situation information. We will again, for simple elucidation, describe a BD player, although the skilled person understands that in a similar manner such a system may occur in many apparatuses, e.g. a video compositor in a TV truck mixing feeds from various cameras, a video inserter in a local cable distribution centre, a video server on the internet compositing two streams, etc.

    [0047] A first image (or set of images) Im_1 comes from a first image source 401, and a second image Im_2 (which we assume gets e.g. PIP-ed, but of course several other geometric transformations are possible, even with dynamically moving regions, etc.) comes from a second image source 402. For a simple elucidation of our principles one may assume both come from a blu-ray disk, but even with blu-ray applications the second image may come from a server over an internet connection, or, in case of a live production apparatus embodiment, from a camera, etc.

    [0048] A geometrical transformation unit 403 does a geometrical transformation on the video (Im_2), e.g. in accordance with rules of user interface software: e.g. it scales and repositions the video in a PIP. Now the assumption is that a receiving device later in the chain, like a television, still has to do some of the dynamic range processing, be it only the conversion to its own dynamic range (e.g. 5000 nit PB video to the 2500 nit display dynamic range). If the apparatus 201, say a BD player, would do all the optimization color transformations and directly supply the display drivers of a (dumb) display with the correct values, there would in most scenarios also not be a problem. A geometric situation specification means 212 can get the information of what was done geometrically from the geometrical transformation unit 403, and then define the situation parameters which need to be communicated to the receiving side, according to whatever embodiment is desired for a certain application. As said, some embodiments need no detailed codification of which geometric transformation(s) were actually done, but only a bit indicating that something was done, and that this is no longer the pure original movie video Im_1 to which the transformation function(s) F1 correspond (which, incidentally, as we have shown in research, can, apart from defining a 100 nit look from say the 5000 nit image(s) Im_1 or vice versa, also be used to calculate the optimal looking images for rendering on a display of peak brightness unequal to those two values, say 2500 or 1400 nit). So in some scenarios the geometric situation specification means 212 will generate a sole bit to be output in the video signal (or multiple correlated video signal parts, potentially communicated via different mechanisms) going to some output system 401 (e.g. directly to a display via HDMI if apparatus 201 resides at a final consumer premise, or to a network video storage memory for transcoders or apparatuses for networked video delivery, etc.). This may be good for application scenarios where an incorrect decoding is not necessarily too critical, and the receiving side apparatus can then switch to a safe mode (e.g. no local processing, in the main and/or secondary region). That will in principle lead to the wrong decoding, i.e. reconstruction of the wrong e.g. HDR image look for the PIP, getting incorrect colors in some areas, namely at least those which needed to be locally reconstructed. E.g., in the example of FIG. 3, by using the global luminance transformation curve (i.e., on those brighter pixels, its part 317) we would get sunny outside colors which are too dark. But the apparatus 201 could determine what the severity of the situation would be; e.g. a small window in only a PIP maybe needn't be perfect. This will depend on various factors, such as the precise geometrical situation, but also the details of the image content, and the characteristics of the ultimate rendering (e.g. on a 1000 nit TV an error in the window may be less severe than on a 5000 nit TV; and if the error is that the region becomes too bright with the global mapping, especially if close to the PB, then it may be very inappropriate for TVs above 3000 nit, and less problematic on TVs of PB below 1000 nit). As to the influence of content, note that the local transformation may have been done primarily for getting better contrasts, or fewer artefacts like banding, and the apparatus 201 can take that into account in its decision of how to encode the necessary geometric transformation information. Especially if a human is present and interacting with the apparatus 201, e.g. in a video production system, he can check what the severity of the impact of incorrectly doing the decoding, by e.g. dropping the local transformations, would be, especially if he has a fixed final display, or range of final displays, in mind. Automatic apparatuses may calculate an error measure which takes into account the number of pixels (the size of the local region), the differences of the colors of the reconstruction versus the ideal, and even further image information, of course only in case they do some HDR calculations (we designed the simpler variants also for cheap systems, which do (almost) nothing, and just pass through all the colorimetric coding parameters to another apparatus for it to do all calculations). I.e., if immediately rendered on some (especially lower quality) display, the single bit solution may be appropriate, but if all data is archived for later use, the higher quality versions with all information encoded as precisely as possible may be in order.
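Such an automatic error measure, weighting the color error by the relative size of the affected region, can be sketched as below. The weighting scheme and the threshold are our assumptions for illustration; the source only states that region size and color differences should be taken into account.

```python
def decoding_error_severity(n_region_pixels, n_total_pixels, mean_color_delta):
    """Crude automatic error measure for dropping local transformations:
    weight the mean color difference (between the globally-mapped fallback
    and the ideal locally-reconstructed colors) by the relative area of
    the affected region. Purely illustrative; real apparatuses could also
    factor in e.g. the PB of the target display or banding risk."""
    area_weight = n_region_pixels / n_total_pixels
    return area_weight * mean_color_delta

# A small window (about 2% of a 1920x1080 frame) with a modest mean color
# error may fall below a severity threshold, so the single-bit "safe mode"
# signaling could be deemed acceptable:
severity = decoding_error_severity(41472, 1920 * 1080, 12.0)
print(severity < 1.0)  # True
```

An apparatus 201 could use such a measure to choose between the cheap single-bit indicator and the fuller encodings (recalculated parameters, or explicit transformation data).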

    [0049] In this elucidating example we assume that apparatus 201 just calculates new rules S2* to find the pixels of Im_2 on which the local color transformation (316) should be applied, and that the local function shape F2_L is passed through directly, from being read as say metadata from video source 402 to the output, similarly to how Im_1 and F1 may typically be passed through in this embodiment, for color processing by some receiving side apparatus.

    [0050] The algorithmic components disclosed in this text may (entirely or in part) be realized in practice as hardware (e.g. parts of an application specific IC) or as software running on a special digital signal processor, or a generic processor, etc. They may be semi-automatic in a sense that at least some user input may be/have been (e.g. in factory, or consumer input, or other human input) present.

    [0051] It should be understandable to the skilled person from our presentation which components may be optional improvements and can be realized in combination with other components, and how (optional) steps of methods correspond to respective means of apparatuses, and vice versa. The fact that some components are disclosed in the invention in a certain relationship (e.g. in a single figure in a certain configuration) doesn't mean that other configurations are not possible as embodiments under the same inventive thinking as disclosed for patenting herein. Also, the fact that for pragmatic reasons only a limited spectrum of examples has been described, doesn't mean that other variants cannot fall under the scope of the claims. In fact, the components of the invention can be embodied in different variants along any use chain, e.g. all variants of a creation side like an encoder may be similar as or correspond to corresponding apparatuses at a consumption side of a decomposed system, e.g. a decoder and vice versa. Several components of the embodiments may be encoded as specific signal data in a signal for transmission, or further use such as coordination, in any transmission technology between encoder and decoder, etc. The word “apparatus” in this application is used in its broadest sense, namely a group of means allowing the realization of a particular objective, and can hence e.g. be (a small part of) an IC, or a dedicated appliance (such as an appliance with a display), or part of a networked system, etc. “Arrangement” or “system” is also intended to be used in the broadest sense, so it may comprise inter alia a single physical, purchasable apparatus, a part of an apparatus, a collection of (parts of) cooperating apparatuses, etc.

    [0052] The computer program product denotation should be understood to encompass any physical realization of a collection of commands enabling a generic or special purpose processor, after a series of loading steps (which may include intermediate conversion steps, such as translation to an intermediate language, and a final processor language) to enter the commands into the processor, to execute any of the characteristic functions of an invention. In particular, the computer program product may be realized as data on a carrier such as e.g. a disk or tape, data present in a memory, data traveling via a network connection (wired or wireless), or program code on paper. Apart from program code, characteristic data required for the program may also be embodied as a computer program product. Such data may be (partially) supplied in any way.

    [0053] The invention, or any data usable according to any philosophy of the present embodiments, like video data, may also be embodied as signals on data carriers, which may be removable memories like optical disks, flash memories, removable hard disks, portable devices writeable via wireless means, etc.

    [0054] Some of the steps required for the operation of any presented method may be already present in the functionality of the processor or any apparatus embodiments of the invention instead of described in the computer program product or any unit, apparatus or method described herein (with specifics of the invention embodiments), such as data input and output steps, well-known typically incorporated processing steps such as standard display driving, etc. We also desire protection for resultant products and similar resultants, like e.g. the specific novel signals involved at any step of the methods or in any subpart of the apparatuses, as well as any new uses of such signals, or any related methods.

    [0055] It should be noted that the above-mentioned embodiments illustrate rather than limit the invention. Where the skilled person can easily realize a mapping of the presented examples to other regions of the claims, we have for conciseness not mentioned all these options in-depth. Apart from combinations of elements of the invention as combined in the claims, other combinations of the elements are possible. Any combination of elements can be realized in a single dedicated element.

    [0056] Any reference sign between parentheses in the claim is not intended for limiting the claim, nor is any particular symbol in the drawings. The word “comprising” does not exclude the presence of elements or aspects not listed in a claim. The word “a” or “an” preceding an element does not exclude the presence of a plurality of such elements.