Parallelising per-pixel compositing
09727982 · 2017-08-08
Assignee
Inventors
- Zhi-Min Pan (Carlingford, AU)
- David Ian Johnston (Leichhardt, AU)
- Gregory Joseph Johnson (North Epping, AU)
- Kok Tjoan Lie (Como, AU)
Cpc classification
G06F3/1208
PHYSICS
G06K15/1857
PHYSICS
G06K15/181
PHYSICS
G06F3/1212
PHYSICS
G06F3/1243
PHYSICS
G06K15/1859
PHYSICS
International classification
G06F3/12
PHYSICS
Abstract
A method of compositing layers by grouping the layers into a foreground group and a background group; identifying independent instructions of compositing model for execution independently from the background group and dependent instructions requiring a compositing output of a background layer in order to composite foreground layers; executing the independent instructions on the foreground layers in parallel with compositing the background layers, a first independent instruction storing a corresponding result in a first buffer and a second independent instruction storing a corresponding result in a second buffer; executing a dependent instruction by updating the second buffer using the background compositing output; and determining a compositing output for the foreground group dependent upon contents of the first buffer and the updated second buffer.
Claims
1. A method of compositing a plurality of layers, the method comprising the steps of: grouping the plurality of layers comprising image data into at least a foreground group and a background group to be composited by separate computing threads, wherein a foreground compositing model for compositing layers in the foreground group is dependent on a compositing output for the background group; identifying, in relation to the foreground group, independent instructions of the foreground compositing model to be executed independently from the background group and dependent instructions of the foreground compositing model requiring a compositing output of at least one layer of the background group in order to composite layers in the foreground group; executing the identified independent instructions using image data associated with the layers in the foreground group in parallel with compositing the layers in the background group, a first independent instruction storing a corresponding result of execution in a first buffer and a second independent instruction storing a corresponding result in a second buffer; upon receipt of the compositing output for the background group, executing the dependent compositing instruction by updating the second buffer using the background compositing output; and determining a compositing output for the foreground group dependent upon contents of the first buffer and the updated second buffer.
2. The method according to claim 1, wherein: the independent instructions are sub-expressions of the foreground compositing model that contain only terms independent of the layers of the background group; and the dependent instructions are sub-expressions that have at least one term dependent on the layers of the background group.
3. The method according to claim 1, further comprising, prior to the executing step, scheduling at least some of the independent instructions or the dependent instructions for execution by specified computing threads dependent upon cached locality of data in memory of processors executing the specified computing threads.
4. The method according to claim 1, further comprising, prior to the executing step, allocating (i) independent instructions using image data associated with the layers in the background group and (ii) the independent instructions using image data associated with the layers in the foreground group for execution by available threads or processors in any order provided that the independent instructions using image data associated with the layers in the background group are executed no later than the independent instructions using image data associated with the layers in the foreground group.
5. The method according to claim 1, wherein the executing step gives priority to instructions using image data associated with the layers in the background group over instructions using image data associated with the layers in the foreground group.
6. The method according to claim 1, wherein the grouping step further comprises the steps of: identifying a number of available processing threads; and specifying the number of layers in each group depending upon the number of identified threads.
7. The method according to claim 1, wherein an independent instruction for processing image data of the foreground group is scheduled to execute on a first computing thread, a dependent instruction for processing image data of the foreground group is scheduled to execute on a second computing thread, and the compositing output of the at least one background layer of the background group is communicated to the second computing thread for processing the image data of the foreground group.
8. The method according to claim 1, further comprising the steps of: processing, in a first pass, the groups in order down a z-stack in a parallel manner, top-most group first, bottom-most group last to determine a top-most opaque level in the z-stack; and processing, in a second pass, the groups in order up the z-stack in a sequential manner, reusing the independent terms for the layer groups that have been pre-calculated in the first pass.
9. The method according to claim 8, wherein the first pass processing is performed in order to determine a level in the z-stack having an opacity greater than a pre-determined threshold.
10. The method according to claim 1, further comprising the steps of: processing, in a first pass, the groups in order down a z-stack in a parallel manner, top-most group first, bottom-most group last to determine a top-most group in the z-stack which does not contribute to the compositing output, the determining is based on at least one of opacity and a compositing operator associated with at least one layer in a group up in the z-stack; and processing, in a second pass, the groups in order up the z-stack in a sequential manner, reusing the independent terms for the layer groups that have been pre-calculated in the first pass.
11. The method according to claim 1, wherein: the grouping step is performed in regard to a plurality of foreground groups of layers and a plurality of background groups of layers; and the executing step executes independent instructions for the plurality of foreground groups of layers in parallel.
12. An apparatus for compositing a plurality of layers, the apparatus comprising: a memory storing a computer executable program; and a processor for executing the program in order to perform a method comprising the steps of: grouping the plurality of layers comprising image data into at least a foreground group and a background group to be composited by separate computing threads, wherein a foreground compositing model for compositing layers in the foreground group is dependent on a compositing output for the background group; identifying, in relation to the foreground group, independent instructions of the foreground compositing model to be executed independently from the background group and dependent instructions of the foreground compositing model requiring a compositing output of at least one layer of the background group in order to composite layers in the foreground group; executing the identified independent instructions using image data associated with the layers in the foreground group in parallel with compositing the layers in the background group, a first independent instruction storing a corresponding result of execution in a first buffer and a second independent instruction storing a corresponding result in a second buffer; upon receipt of the compositing output for the background group, executing the dependent compositing instruction by updating the second buffer using the background compositing output; and determining a compositing output for the foreground group dependent upon contents of the first buffer and the updated second buffer.
13. A computer readable non-transitory storage medium storing a computer executable program for directing a processor to perform a method for compositing a plurality of layers comprising the steps of: grouping the plurality of layers comprising image data into at least a foreground group and a background group to be composited by separate computing threads, wherein a foreground compositing model for compositing layers in the foreground group is dependent on a compositing output for the background group; identifying, in relation to the foreground group, independent instructions of the foreground compositing model to be executed independently from the background group and dependent instructions of the foreground compositing model requiring a compositing output of at least one layer of the background group in order to composite layers in the foreground group; executing the identified independent instructions using image data associated with the layers in the foreground group in parallel with compositing the layers in the background group, a first independent instruction storing a corresponding result of execution in a first buffer and a second independent instruction storing a corresponding result in a second buffer; upon receipt of the compositing output for the background group, executing the dependent compositing instruction by updating the second buffer using the background compositing output; and determining a compositing output for the foreground group dependent upon contents of the first buffer and the updated second buffer.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
(1) One or more embodiments of the invention will now be described with reference to the following drawings, in which:
(2)
(3)
(4)
(5)
(6)
(7)
(8)
(9)
(10)
(11)
(12)
(13)
(14)
(15)
(16)
(17)
(18)
(19)
(20)
(21)
(22)
(23)
(24)
(25)
(26)
DETAILED DESCRIPTION INCLUDING BEST MODE
(27) Where reference is made in any one or more of the accompanying drawings to steps and/or features, which have the same reference numerals, those steps and/or features have for the purposes of this description the same function(s) or operation(s), unless the contrary intention appears.
(28) It is to be noted that the discussions contained in the “Background” section and that above relating to prior art arrangements relate to discussions of documents or devices. Such discussions should not be interpreted as a representation by the inventors or the patent applicant that such documents or devices in any way form part of the common general knowledge in the art.
Overview of the CILP Arrangement
General Overview
(29) The disclosed CILP arrangements are based on a multi-layer compositing method derived from standard Porter and Duff two-layer compositing operations. The Porter and Duff two-layer operations are well understood by anyone skilled in the art and an understanding of Porter and Duff compositing is expected in order to understand the disclosed CILP arrangements.
(30) A brief description of traditional Porter and Duff compositing is now provided with reference to
(31)
(32) The process of compositing the layers shown in
(33) This bottom-up approach of Porter and Duff compositing illustrates the inherently serial nature of this compositing method and as such shows that traditional Porter and Duff compositing is not well suited for parallel processing in a multi-core CPU system. Although there are prior-art methods adapting the traditional Porter and Duff compositing method for parallel processing of associative compositing operations, those methods are not suitable for general, including non-associative, Porter and Duff compositing operations.
(34) To address the shortcomings of traditional Porter and Duff compositing in the CILP arrangements, the two-layer compositing operations are expanded to N-layer compositing operations where N depends on the number of layers in the PDL and the desired level of parallelism. The number of layers M in a page is then divided into groups of at most N layers such that a page of M-layers, where M is greater than N, will have a number (being
(35)
for M, N greater than 2) of layer groups.
(36) For example:
(37) The grouping of layers and the multi-layer compositing operations allow for identifying “independent instructions” within the multi-layer compositing operation which can be computed independently of each other prior to final computation of the remaining “dependent instructions” to produce a result. Examples below are provided for 3-layer and 4-layer expansions of the compositing operations.
(38) The ability to compute terms independently means that these independent terms for each of the groups can be computed concurrently with computing independent terms corresponding to at least one other group on a multi-processor system. Alternatively or additionally, independent terms within each group can be processed concurrently by distributing the calculations to different threads or processes of the multi-processor system. This will typically increase the speed at which multiple layers can be composited.
(39) Further, a method of determining whether or not a layer group is substantially opaque, or otherwise does not contribute to the final result, can be used to determine the lowest layer at which to begin compositing, thereby avoiding processing layers which do not contribute to the results. This is done, for example, by evaluating independent terms in the compositing equations starting from the top most group (in a “top down” manner) prior to determining a final compositing output for at least some of the groups.
(40) The following descriptions of compositing operations derived from traditional Porter and Duff compositing use a number of common, basic terms which will now be described to avoid unnecessary repetition.
(41) The colour and opacity of each layer, for example 901-904, are represented in descriptions below by the following terms:
(42) TABLE-US-00001 TABLE 1 Description of terms used for colour and opacity values. Term Description S.sub.0 The colour value of the background layer 0 (904). S.sub.1 The colour value of layer 1 (903). S.sub.2 The colour value of layer 2 (902). S.sub.3 The colour value of layer 3 (901). S.sub.01 The colour value result of compositing layer 0 and layer 1 (904, 903). S.sub.012 The colour value result of compositing layer 0, layer 1 and layer 2 (904, 903, 902). S.sub.0123 The colour value result of compositing layers 0, 1, 2 and 3 (904, 903, 902, 901). α.sub.0 The opacity of the background layer 0 (904). α.sub.1 The opacity of layer 1 (903). α.sub.2 The opacity of layer 2 (902). α.sub.3 The opacity of layer 3 (901). α.sub.01 The opacity value result of compositing layer 0 and layer 1 (904, 903). α.sub.012 The opacity value result of compositing layer 0, layer 1 and layer 2 (904, 903, 902). α.sub.0123 The opacity value result of compositing layers 0, 1, 2 and 3 (904, 903, 902, 901).
(43) The control coefficients X, Y and Z and the blend mode control coefficients β.sub.ƒ0 and β.sub.ƒ1 affect how the terms of the compositing equation combine to produce a result. Each coefficient value can be either 0 (zero), meaning the term it applies to does not contribute to the result or 1, meaning the term it applies to does contribute to the result.
(44) TABLE-US-00002 TABLE 2 Description of control coefficients. Coefficient Meaning X Controls whether the source pixel opacity and destination pixel opacity contribute to the result opacity. (X only affects the contribution to opacity.) Applies to standard two-layer Porter and Duff equations (1, 2) X.sub.f Controls whether the source 904 (layer 0) pixel opacity and destination 903 (layer 1) pixel opacity contribute to the result opacity. (X.sub.f only affects the contribution to opacity.) Applies to three and four layer Porter and Duff equations (3, 4, 12A, 13). X.sub.g Controls whether the source 904, 903 (layer 0/1 result) pixel opacity and destination 902 (layer 2) pixel opacity contribute to the result opacity. (X.sub.g only affects the contribution to opacity.) Applies to three and four layer Porter and Duff equations (3, 4, 12A, 13). X.sub.h Controls whether the source 904, 903, 902 (layer 0/1/2 result) pixel opacity and destination 901 (layer 3) pixel opacity contribute to the result opacity. (X.sub.h only affects the contribution to opacity.) Applies to four layer Porter and Duff equations (12A, 13). Y Controls whether the source pixel value and the inverse destination pixel value contribute to the result. Y.sub.f Controls whether the source 904 (layer 0) pixel value and the inverse destination 903 (layer 1) pixel value contribute to the result. Applies to three and four layer Porter and Duff equations (3, 4, 12A, 13). Y.sub.g Controls whether the source 904, 903 (layer 0/1) pixel value and the inverse destination 902 (layer 2) pixel value contribute to the result. Applies to three and four layer Porter and Duff equations (3, 4, 12A, 13). Y.sub.h Controls whether the source 904, 903, 902 (layer 0/1/2) pixel value and the inverse destination 901 (layer 3) pixel value contribute to the result. Applies to four layer Porter and Duff equations (12A, 13). Z Control whether the destination pixel value and the inverse source pixel value contribute to the result. Z.sub.f Control whether the destination 903 (layer 1) pixel value and the inverse source 904 (layer 0) pixel value contribute to the result. Applies to three and four layer Porter and Duff equations (3, 4, 12A, 13). Z.sub.g Control whether the destination 902 (layer 2) pixel value and the inverse source 904, 903 (layer 0/1) pixel value contribute to the result. Applies to three and four layer Porter and Duff equations (3, 4, 12A, 13). Z.sub.h Control whether the destination 901 (layer 3) pixel value and the inverse source 904, 903, 902 (layer 0/1/2) pixel value contribute to the result. Applies to four layer Porter and Duff equations (3, 4, 12A, 13). β.sub.f0 and β.sub.f1 Control how the source pixel colour value and destination pixel colour value contribute to the result as defined by the compositing operator. β.sub.f0 is the control coefficient (either 0 or 1) which controls whether or not S.sub.0 (source/background colour) contributes to the result; and β.sub.f1 is the control coefficient (either 0 or 1) which controls whether or not S.sub.1 (destination colour) contributes to the result. β.sub.g0 and β.sub.g1 β.sub.g0 and β.sub.g1 are the Porter and Duff blend mode function control coefficients which describe how the result from the bottom two layers (layer 0/1) and the top layer (layer 2), of the group, contribute to the group colour result. β.sub.g0 is the control coefficient (either 0 or 1) which controls whether or not the layer 0/1 result contributes to the layer group result; and β.sub.g1 is the control coefficient (either 0 or 1) which controls whether or not S.sub.2 (layer 2/destination colour) contributes to the result. β.sub.h0 and β.sub.h1 β.sub.h0 and β.sub.h1 are the Porter and Duff blend mode function control coefficients which describe how the result from the bottom two layers (layer 0/1) and the top layer (layer 2), of the group, contribute to the group colour result. β.sub.h0 is the control coefficient (either 0 or 1) which controls whether or not the layer 0/1 result contributes to the layer group result; and β.sub.h1 is the control coefficient (either 0 or 1) which controls whether or not S.sub.2 (layer 2/destination colour) contributes to the result.
(45) The values that the control coefficients take are determined by the compositing operation and are shown in Table 3. The X column also applies to the multi-layer coefficients X.sub.ƒ, X.sub.g, X.sub.h. The Y column also applies to the multi-layer coefficients Y.sub.ƒ, Y.sub.g, Y.sub.h. The Z column also applies to the corresponding multi-layer coefficients Z.sub.ƒ, Z.sub.g, Z.sub.h. The β.sub.ƒ0 column also applies to the multi-layer compositing coefficient β.sub.g0 and β.sub.h0. The β.sub.ƒ1 column also applies to the multi-layer compositing coefficient β.sub.g1 and β.sub.h1.
(46) TABLE-US-00003 TABLE 3 Examples of Porter and Duff Compositing Operations P&D Operator X Y Z B.sub.f0 β.sub.f1 Description clear 0 0 0 0 0 None of the terms are used. src 1 0 1 0 1 Only the terms that contribute layer 1 colour are used. dst 1 1 0 1 0 Only the terms that contribute layer 0 colour are used. src-over 1 1 1 0 1 The layer 1 colour is placed over the layer 0 colour. dst-over 1 1 1 1 0 The layer 0 colour is placed over the layer 1 colour. src-in 1 0 0 0 1 The layer 1 that overlaps layer 0, replaces layer 0. dst-in 1 0 0 1 0 The layer 0 that overlaps layer 1, replaces layer 1. src-out 0 0 1 0 1 The layer 1 that does not overlap layer 0 replaces layer 0. dst-out 0 1 0 1 0 The layer 0 that does not overlap layer 1 replaces layer 1. src-atop 1 1 0 0 1 The layer 1 that overlaps layer 0 is composited with layer 0. dst-atop 1 0 1 1 0 The layer 0 that overlaps layer 1 is composited with layer 1 and replaces layer 0. xor 0 1 1 0 0 The non-overlapping regions of layer 1 and layer 0 are combined.
Traditional Two-Layer Porter and Duff Compositing
(47) The two-layer Porter and Duff compositing operations consist of two operations, namely an operation which calculates the accumulated opacity of the two layers (1), and an operation which calculates the pre-multiplied result colour value of the two layers (2). Equations (1) and (2) are the two basic Porter and Duff compositing operations which are well understood by those skilled in the art. The descriptions that follow will build upon these two basic equations to demonstrate how these equations can be expanded to work on more than two layers.
(48) These operations take the following forms:
α.sub.01=Xα.sub.0α.sub.1+Yα.sub.0(1−α.sub.1)+Zα.sub.1(1−α.sub.0) (1)
α.sub.01S.sub.01=Yα.sub.0S.sub.0(1−α.sub.1)+Zα.sub.1S.sub.1(1−α.sub.0)+α.sub.0α.sub.1ƒ(S.sub.0,S.sub.1) (2)
Where: α.sub.01S.sub.01 is the pre-multiplied colour result for the two layer Porter and Duff compositing equation (2). ƒ(S.sub.0, S.sub.1) is the blend mode function which defines how the layer 0 (source) and layer 1 (destination) colours are combined. The blend mode function ƒ(S.sub.0, S.sub.1) is controlled by parameters β.sub.ƒ0 and β.sub.ƒ1, which can be either 0 or 1, as defined below:
ƒ(S.sub.0,S.sub.1)=β.sub.ƒ0S.sub.0+β.sub.ƒ1S.sub.1 (2A)
(49) The remaining parameters are described in Table 1 and Table 2.
(50) The blend mode function ƒ(S.sub.0, S.sub.1) is shown in equation (2A) as a linear function for simplicity. The blend mode function defines how two colours are to be combined. The blend mode function ƒ(S.sub.0, S.sub.1) may be linear or non-linear depending upon the compositing operation being applied. The following explanations will use this linear blend mode function (2A) for simplicity, however, anyone skilled in the art can follow this procedure and substitute a non-linear blend mode function, for example corresponding to “multiply” blend mode.
(51) The values of the blend mode function control coefficients X, Y, Z, β.sub.ƒ0 and β.sub.ƒ1 are determined by the Porter and Duff Porter and Duff operator specified in the PDL script. Table 3 above lists the values of the control coefficients for each compositing operator supported by traditional two-layer Porter and Duff compositing.
Three Layer Porter and Duff Compositing
(52) The traditional two-layer Porter and Duff compositing formula can be generalised to multi-level compositing.
(53) For example, the case of three level compositing is shown in
(54) Expanding equations (1) and (2) to three levels gives: equation (3) which calculates the accumulated opacity of the three levels; and equation (4) which calculates the resultant pre-multiplied colour value for three levels. Equation (4) is in terms of the results from equations (1) and (2). Equations (3) and (4) are:
α.sub.012=α.sub.0(Y.sub.f(X.sub.fY.sub.fZ.sub.f)α.sub.1)(Y.sub.g+(X.sub.g−Y.sub.g−Z.sub.g)α.sub.2)α.sub.2Z.sub.f(Y.sub.g+(X.sub.g−Y.sub.g−Z.sub.g)α.sub.2)+α.sub.2Z.sub.g (3)
α.sub.012S.sub.012=Y.sub.g(1−α.sub.2)α.sub.01S.sub.01+Z.sub.ƒ(1−α.sub.01)α.sub.2S.sub.2+α.sub.01α.sub.2g(S.sub.01,S.sub.2) (4)
Where: g (S.sub.0, S.sub.1) is the blend mode function which defines how the layer 0, 1 (source) and layer 2 (destination) colours are combined.
(55) Substituting equation (2) for α.sub.01S.sub.01 in equation (4) expands the equation to give the three level pre-multiplied result in terms of the three source level colour and opacity values giving equation (5):
α.sub.012S.sub.012=Y.sub.ƒY.sub.g(1−α.sub.1)(1−α.sub.2)α.sub.0S.sub.0+Z.sub.ƒY.sub.g(1−α.sub.0)(1−α.sub.2)α.sub.1S.sub.1+Z.sub.g(1−α.sub.01)α.sub.2S.sub.2+Y.sub.g(1−α.sub.2)α.sub.0α.sub.1ƒ(S.sub.0,S.sub.1)+α.sub.01α.sub.2g(S.sub.01,S.sub.2) (5)
Where: ƒ(S.sub.0, S.sub.1) a blend mode function defined in PDL for compositing the layer 0 colour value S.sub.0 and layer 1 colour value S.sub.1. g(S.sub.01, S.sub.2) is a blend mode function defined in PDL the layers 0/1 and layer 2 applied to a colour value S.sub.2 of layer 2 and a colour value S.sub.01 resulting from compositing layers 0 and 1.
(56) The remaining terms of equation 5 are explained in Table 1, Table 2 and Table 3.
(57) For Porter and Duff compositing, the blend mode functions ƒ( ) and g( ) can be replaced by the following substitutions:
ƒ(S.sub.0,S.sub.1)=β.sub.ƒ0S.sub.0+β.sub.ƒ1S.sub.1 (6)
g(S.sub.01,S.sub.2)=β.sub.g0S.sub.01+β.sub.g1S.sub.2 (7)
(58) Then equation 5 can be expanded by substituting (7) for g (S.sub.01, S.sub.2) and then substituting (2) for α.sub.01S.sub.01, (1) for α.sub.01 and (6) for ƒ(S.sub.0, S.sub.1) in the resulting equation, giving a foreground compositing model for 3 layer compositing as follows:
(59)
Where: α.sub.0S.sub.0, α.sub.1S.sub.1 and α.sub.2S.sub.2 are the pre-multiplied colour values for the layer 0, layer 1 and layer 2 image pixels respectively. α.sub.01S.sub.01 is the pre-multiplied colour result for the two layer Porter and Duff compositing equation. α.sub.012S.sub.012 is the pre-multiplied colour result for the three layer Porter and Duff compositing equation.
(60) In a similar way as was done for the two level compositing equations, the Porter and Duff compositing operators specified in the input PDL script are used to set the values for the coefficients defined in Table 2.
(61) Compared to the traditional 2-level Porter and Duff equation, the 3-level compositing equation has many sub-expressions, which are independent of the bottom layer's colour value and transparency.
(62) To identify the independent sub-expressions of equation (8), first identify sub-expressions of equation (8) based on parenthesis, i.e. in accordance with rules for performing arithmetic operations “+” and “*”. Then the sub-expressions are classified as either containing only terms independent of the background or comprising at least one term dependent on the background.
(63) Given that “+” and “*” are both associative and commutative operations for all integer numbers, the order in which the sub-expressions within each group are evaluated does not affect the result. In the follow description of mathematics, a “term” is defined as either a single number or variable, or the product of several numbers or variables as normally defined in mathematics. “Operators” are mathematical operators such as “+”, “−”, etc. An “expression” is a combination of multiple terms and operators. A “sub-expression” is an expression which is a sub-part of a previously defined expression. Next, identify the terms in each sub-expression which are dependent on the background colour and opacity. These dependent background terms are represented by α.sub.0 and .sub.0S.sub.0 in equation (8). Sub-expressions which have no background terms can be immediately declared to be independent sub-expressions. For example, equation (8aa) is a sub-expression of (8) which is dependent on the background due to the term α.sub.0S.sub.0.
α.sub.0S.sub.0(Y.sub.ƒ−Y.sub.ƒ+α.sub.1β.sub.ƒ0)(Y.sub.g−Y.sub.gα.sub.2+α.sub.2β.sub.g0) (8aa)
(64) The sub-expression (8aa) is composed of both a dependent term (8ab) and an independent sub-expression (8a):
α.sub.0S.sub.0 (8ab)
(Y.sub.ƒ−Y.sub.ƒα.sub.1+α.sub.1β.sub.ƒ0)(Y.sub.g−Y.sub.gα.sub.2+α.sub.2β.sub.g0) (8a)
(65) The following shows the independent sub-expressions that can be derived from (8) as a result of applying the method described. The derived independent sub-expressions are indep1 (8a), indep2 (8b), indep3 (8c), indep4 (8d) and indep5 (8e). These sub-expressions are independent of the background and independent of each other sub-expression result. The independent sub-expressions can be computed while waiting for the background result and, therefore, sub-expression (8a) can be scheduled ahead of (8ab):
indep1=(Y.sub.ƒ−Y.sub.ƒα.sub.1+α.sub.1β.sub.ƒ0)(Y.sub.g−Y.sub.gα.sub.2+α.sub.2β.sub.g0) (8a)
indep2=α.sub.1S.sub.1(β.sub.ƒ1+Z.sub.ƒ)(Y.sub.g−α.sub.2Y.sub.g+α.sub.2β.sub.g0 (8b)
indep3=α.sub.2S.sub.2(β.sub.g1+Z.sub.g)(Y.sub.ƒ+(X.sub.ƒ−Y.sub.ƒ−Z.sub.ƒ)α.sub.1) (8c)
indep4=α.sub.1S.sub.1Z.sub.ƒ(Y.sub.g−α.sub.2Y.sub.g+α.sub.2β.sub.g0) (8d)
indep5=α.sub.2S.sub.2(α.sub.1Z.sub.ƒZ.sub.g+α.sub.1Z.sub.ƒβ.sub.g1−Z.sub.g) (8e)
Where: indep1-indep5 are the results of the independent sub-expressions.
(66) All other terms are explained in Table 1, Table 2 and Table 3.
(67) The result of the three-layer compositing equation (8) can now be expressed as an equation (8f) in terms of the results from the independent sub-expressions (8a), (8b), (8c), (8d), (8e), the background pre-multiplied colour value (8ab) and background opacity value (8ac):
α.sub.0 (8ac)
α.sub.012S.sub.012=α.sub.0S.sub.0(indep1)+α.sub.0(indep2+indep3)−indep4+indep5 (8f)
Where: indep1-indep5 are the results of the independent sub-expressions.
(68) All other terms are explained in Table 1, Table 2 and Table 3.
(69) The independent sub-expressions (8a), (8b), (8c), (8d) and (8e) are examples of independent instructions (1441, 1442, 1443, 1444, 1445) and the dependent expression (8f) is an example of the dependent instruction (1412) which combines the results from the independent instructions (1441, 1442, 1443, 1444, 1445) of the current layer group (1433) and the result from the dependent instruction (1402) from the previous layer group (1432) to calculate the result for the dependent instruction (1412) from the current layer group (1433). This process can be repeated up through the layer groups of the z-stack, using the result from the lower layer group to finalise computation of the dependent instruction for the upper layer group.
(70) Given that all identified sub-expressions (8a), (8b), (8c), (8d) and (8e) are very simple and are easy to implement on a GPU or any other graphics circuit, i.e. multiple cores can be adapted (programmed) to perform particular calculations. For example, one GPU kernel can be adapted for executing sub-expression (8a) for a plurality of pixels in the image and a particular group of layers, while another GPU kernel is configured for sub-expressions (8b). Thus, sub-expressions (8a), (8b), (8c), (8d) and (8e) within each layer group can be evaluated in parallel with each other.
(71) It should be noted, that different level of granularity can be chosen for identifying sub-expressions. If finer granularity is chosen, then sub-expressions can be prioritised depending on a number of times it is used in the compositing model (8). For example, given that sub-expressions, eg. (Y.sub.g−Y.sub.gα.sub.2+α.sub.2β.sub.g0), α.sub.2S.sub.2 and α.sub.1S.sub.1, are used multiple times, they can be prioritised over other sub-expressions, such as (Y.sub.ƒ−Y.sub.ƒα.sub.1+α.sub.1β.sub.ƒ0), (β.sub.ƒ1+Z.sub.ƒ), (β.sub.g1+Z.sub.g).
(72) Where there is no lower layer group, suitable values are supplied for the background pre-multiple colour (8ab) and opacity (8ac) by the page description language. These values are used directly in the dependent instruction (1402) of the bottom layer group (1432). The set of equations (8a)-(8f) can be regarded as one example of a “foreground compositing model” (also referred to merely as a “compositing model”).
(73) Sub-expressions may be adapted for specific processors, for example, scheduling specific instructions to operate on a particular core of a CPU in order to take advantage of cached locality of data in memory. Another method is to adapt the sub-expressions for execution on a GPGPU where a GPGPU compute kernel operates on a large amount of similar terms.
(74)
(75) As seen in
(76) The computer module 1501 typically includes a plurality processors 1505.sup.(1), 1505.sup.(2), . . . , 1505.sup.(n) (ie cores) for executing an application software program 1533 (such as the CILP compositing arrangement of
(77) The I/O interfaces 1508 and 1513 may afford either or both of serial and parallel connectivity, the former typically being implemented according to the Universal Serial Bus (USB) standards and having corresponding USB connectors (not illustrated). Storage devices 1509 are provided and typically include a hard disk drive (HDD) 1510. Other storage devices such as a floppy disk drive and a magnetic tape drive (not illustrated) may also be used. An optical disk drive 1512 is typically provided to act as a non-volatile source of data. Portable memory devices, such optical disks (e.g., CD-ROM, DVD, Blu ray Disc™), USB-RAM, portable, external hard drives, and floppy disks, for example, may be used as appropriate sources of data to the system 1500.
(78) The components 1505 to 1513 of the computer module 1501 typically communicate via an interconnected bus 1504 and in a manner that results in a conventional mode of operation of the computer system 1500 known to those in the relevant art. For example, the processors 1505.sup.(1), . . . , 1505.sup.(n) are coupled to the system bus 1504 using respective connections 1518(1) . . . 1518(n). Likewise, the memory 1506 and optical disk drive 1512 are coupled to the system bus 1504 by connections 1519. Examples of computers on which the described arrangements can be practised include IBM-PC's and compatibles, Sun Sparcstations, Apple Mac™ or a like computer systems.
(79) The CILP method may be implemented using the computer system 1500 wherein the processes of
(80) In particular, in one CILP arrangement referred to as an “embedded CILP arrangement”, the steps of the CILP method are effected by instructions 1531 in the code 1570 that are incorporated (embedded) into the software 1533. These steps are carried out within the computer system 1500. In this implementation, the CILP arrangement software code is incorporated directly into the application software program 1533 and is present in the software 1533 when it is acquired by a user. Thus for example according to this arrangement, a user can either purchase a “standard” version of a compositing program 1533 or a CILP version of the compositing program 1533, only the latter being configured to perform the CILP methods. In another CILP arrangement, the code 1571 can reside in a library 1533″, and an application software program, written specifically to support a CILP library, makes use of a library API to perform the CILP methods. Upon installing or running the multi-threaded compositing program that is written specifically to support the CILP library, the CILP arrangement software code in the library can be linked to the multi-threaded print compositing program, thus converting the multi-threaded compositing program that is written specifically to support the CILP library, which cannot without the library perform the CILP methods, to a “library based” CILP version of the print compositing program, which is configured to perform the CILP methods.
(81) The software instructions 1531 may be formed as one or more code modules, each for performing one or more particular tasks. The software may also be divided into two separate parts, in which a first part and the corresponding code modules performs the CILP methods and a second part and the corresponding code modules manage a user interface between the first part and the user.
(82) The software may be stored in a computer readable medium, including the storage devices described below, for example. The software is loaded into the computer system 1500 from the computer readable medium, and then executed by the computer system 1500. A computer readable medium having such software or computer program recorded on the computer readable medium is a computer program product. The use of the computer program product in the computer system 1500 preferably effects an advantageous CILP apparatus.
(83) The software 1533, 1533′ and 1533″ is typically stored in the HDD 1510 or the memory 1506. The software is loaded into the computer system 1500 from a computer readable medium, and executed by the computer system 1500. Thus, for example, the software 1533, 1533′ and 1533″ may be stored on an optically readable disk storage medium (e.g., CD-ROM) 1525 that is read by the optical disk drive 1512. A computer readable medium having such software or computer program recorded on it is a computer program product. The use of the computer program product in the computer system 1500 preferably effects a CILP apparatus.
(84) In some instances, the application programs 1533, 1533′ and 1533″ may be supplied to the user encoded on one or more CD-ROMs 1525 and read via the corresponding drive 1512, or alternatively may be read by the user from the networks 1520 or 1522. Still further, the software can also be loaded into the computer system 1500 from other computer readable media. Computer readable storage media refers to any non-transitory tangible storage medium that provides recorded instructions and/or data to the computer system 1500 for execution and/or processing. Examples of such storage media include floppy disks, magnetic tape, CD-ROM, DVD, Blu-Ray™ Disc, a hard disk drive, a ROM or integrated circuit, USB memory, a magneto-optical disk, or a computer readable card such as a PCMCIA card and the like, whether or not such devices are internal or external of the computer module 1501. Examples of transitory or non-tangible computer readable transmission media that may also participate in the provision of software, application programs, instructions and/or data to the computer module 1501 include radio or infra-red transmission channels as well as a network connection to another computer or networked device, and the Internet or Intranets including e-mail transmissions and information recorded on Websites and the like.
(85) The second part of the application programs 1533, 1533′ and 1533″ and the corresponding code modules mentioned above may be executed to implement one or more graphical user interfaces (GUIs) to be rendered or otherwise represented upon the display 1514. Through manipulation of typically the keyboard 1502 and the mouse 1503, a user of the computer system 1500 and the application may manipulate the interface in a functionally adaptable manner to provide controlling commands and/or input to the applications associated with the GUI(s). Other forms of functionally adaptable user interfaces may also be implemented, such as an audio interface utilizing speech prompts output via the loudspeakers 1517 and user voice commands input via the microphone 1580.
(86)
(87) When the computer module 1501 is initially powered up, a power-on self-test (POST) program 1550 executes. The POST program 1550 is typically stored in a ROM 1549 of the semiconductor memory 1506 of
(88) The operating system 1553 manages the memory 1534 (1509, 1506) to ensure that each process or application running on the computer module 1501 has sufficient memory in which to execute without colliding with memory allocated to another process. Furthermore, the different types of memory available in the system 1500 of
(89) As shown in
(90) The application program 1533 includes a sequence of instructions 1531 that may include conditional branch and loop instructions. The program 1533 may also include data 1532 which is used in execution of the program 1533. The instructions 1531 and the data 1532 are stored in memory locations 1528, 1529, 1530 and 1535, 1536, 1537, respectively. Depending upon the relative size of the instructions 1531 and the memory locations 1528-1530, a particular instruction may be stored in a single memory location as depicted by the instruction shown in the memory location 1530. Alternately, an instruction may be segmented into a number of parts each of which is stored in a separate memory location, as depicted by the instruction segments shown in the memory locations 1528 and 1529.
(91) In general, the processor 1505 is given a set of instructions which are executed therein. The processor 1105 waits for a subsequent input, to which the processor 1505 reacts to by executing another set of instructions. Each input may be provided from one or more of a number of sources, including data generated by one or more of the input devices 1502, 1503, data received from an external source across one of the networks 1520, 1502, data retrieved from one of the storage devices 1506, 1509 or data retrieved from a storage medium 1525 inserted into the corresponding reader 1512, all depicted in
(92) The disclosed CILP arrangements use input variables 1554, which are stored in the memory 1534 in corresponding memory locations 1555, 1556, 1557. The CILP arrangements produce output variables 1561, which are stored in the memory 1534 in corresponding memory locations 1562, 1563, 1564. Intermediate variables 1558 may be stored in memory locations 1559, 1560, 1566 and 1567.
(93) Referring to the processor 1505 of
(94) Thereafter, a further fetch, decode, and execute cycle for the next instruction may be executed. Similarly, a store cycle may be performed by which the control unit 1539 stores or writes a value to a memory location 1532.
(95) Each step or sub-process in the processes of
(96) The CILP method may alternatively be implemented in dedicated hardware such as one or more integrated circuits performing the CILP functions or sub functions. Such dedicated hardware may include graphic processors, digital signal processors, or one or more microprocessors and associated memories.
(97) As previously noted, once the result for the independent instructions (8a), (8b), (8c), (8d) and (8e) have been computed, the dependent instruction result can be calculated using (8f). This process can be repeated up through the layer groups of the z-stack, using the result from the lower layer group to finalise computation of the dependent instruction for the upper layer group.
(98) As an example, consider a z-stack of 7 layers each comprising associated image data as shown in
(99)
(100) A set 1432 of five independent instructions (1441-1445) for the layer group 1313 are depicted in a dashed rectangle 1401. Since there are only 4 CPUs available, and five independent instructions, four of the instructions are scheduled by the scheduler 1533′ in
(101) Once the set 1432 of independent instructions 1441-1445 for the layer group 1313 have completed execution, a dependent instruction 1402 uses results such as 1447 from the independent instructions 1441-1445 which produces a result 1448 can be executed. Once the set 1433 of independent instructions 1441-1445 for the layer group 1312 complete execution and the dependent instruction 1402 completes execution, a dependent instruction 1412 for the layer group 1312 can be executed dependent upon results from execution of the independent instructions 1441-1445 and the result 1448 from execution of the dependent instruction 1402.
(102) Similarly a final set 1434 of independent instructions 1441-1445 for the layer group 1311 has to wait for the independent instructions 1441-1445 to complete execution and for a result 1449 to be available from the dependent instruction 1412 before a dependent instruction 1422 can execute and produce the final result.
(103) The order in which a set of independent instructions 1441-1445 within a layer group 1432 are scheduled does not matter, that is instructions 1445, 1444, 1443, 1442 can be schedule in the first time slot 1431, and then instruction 1441 can be scheduled in the next time slot 1446. It is important however, that the layer group instructions are scheduled such that instructions in lower or background layer groups are given priority over instructions in foreground layer groups.
(104) Independent instructions from foreground layer groups such as 1433 should only be scheduled in a timeslot such as 1446, if there are available processing resources after the final instructions 1445 from the background layer group 1432 have been scheduled.
Procedure for using the Three Layer Porter and Duff Compositing Equations
(105) The three level Porter and Duff equations presented above are used to composite a z-stack of layers. This procedure will be described with reference to
(106) The z-stack of layers 1300 illustrated in
(107) The z-stack is then divided into multiple groups of three layers. The bottom group 1313 is composed of the three bottom layers (1305, 1306 and 1307). For every group above that, 1311 and 1312, the top two layers of the group come from the z-stack 1300 and the bottom layer of the group is a virtual layer, a place holder, which will contain the computed result of the group immediately below the current group.
(108) For example, in
(109)
(110) The second group 1312 is then created. The lowest layer of this new group is the virtual layer 1322 which is a place-holder and can act as a frame buffer. Since the CLIP arrangements involve a pixel-wise compositing routine, a frame buffer is typically not used. Instead, since the instructions are pixel-wise, the intermediate results are typically be stored in registers or cache memory as compositing proceeds up the z-stack. In order to contain, as depicted by a dashed arrow 1352, a computed result 1332 of the lower group 1313, once the lower group 1313 has finished compositing. The second layer of the group 1312 is the layer 1304, and the top layer of the group is the layer 1303. Since the compositing is performed in parallel, the elements of the compositing equation which are independent of the virtual layer 1322 can be computed before the result 1332 of the lower group 1313 is ready.
(111) Creation of groups continues in a similar manner to that of the group 1312 until all layers which contribute to the result have been grouped. Note that the top-most group 1311 may only have two layers. In the example in
(112) Processing of the groups is done in two passes. The two-pass approach is essential in determining the upper-most opaque layer group, that is the layer group that will be the background layer group for the purpose of compositing. In situations where the background layer group is somehow indicated in the PDL, then the first pass could be skipped and processing starting at the indicated background layer group. Another option is to not determine an opaque layer and just composite everything, this will be less efficient however. In a first pass (see a dashed arrow 1353 in
(113) For each group, the following operations are performed: 1. Based on the composition operator (see column 1 from the left of Table 1) specified by the PDL interpreter for the composition of the group layers 0/1 and layers 1/2, determine the correct control coefficients (see columns 2-4 from the left of Table 1) to use for the 3-layer Porter and Duff compositing operation (recall that for the example of three level compositing shown in
(114) In the second pass, the bottom group/opaque group has been finalised (the compositing for the group is complete) and has the single opaque layer result 1332. The groups are now processed starting at the bottom-most group with a virtual group layer (the group above the previously finalised group). In this bottom-up composition pass through the z-stack, the independent terms for each group would have been calculated concurrently, however, the order executing dependent instructions is sequential. Given that some of independent terms for the layer groups would have been pre-calculated, the pre-calculated independent terms can be reused to finalise compositing calculations as described below.
(115) For each of the remaining groups, the following operations are performed with reference to
(116) When the second pass completes, the final computed result 1351 of the top-most group will be the computed pixel value.
Detailed Description of Scheduling Instructions
(117)
(118) The dependent instructions (8f) are shown as cross-hatched boxes 1402, 1412 and 1422. The independent instructions (8a), (8b), (8c), (8d) and (8e) are shown as the boxes numbered 1441, 1442, 1443, 1444 and 1445 respectively. These instructions are repeated as depicted by the dashed boxes 1401, 1411 and 1421 for each layer group.
(119) The small box 1447 depicts the register, cache or memory buffer which holds a result 1452 of the independent instructions. The small boxes 1448, 1449, 1450 depict the register, cache or memory buffer which hold respective results 1453, 1454 and 1458 of the dependent instructions 1402, 1412, 1422.
(120) The vertical regions 1431 and 1446 are indicative of some unit of time (also referred to as “time slots” or “slots”) in which instructions are scheduled by the scheduler 1533′ in
(121) The dependency lines such as 1451 show the dependency between instructions, e.g. the dependent instruction 1412 depends on outputs from the independent instructions 1411 for the layer group 1312 associated with the set 1433 of instructions, as well as the output (or result) 1448 of the dependent instruction 1402 which in turn depends upon outputs from the set 1432 of independent instructions associated with the layer group 1313.
(122) In the first time slot 1431, the set 1401 of independent instructions (8a) (ie 1441), (8b) (ie 1442), (8c) (ie 1443) and (8d) (ie 1444) for the first layer group 1313 are scheduled. Once one of those instructions completes, (8e) (ie 1445) can be scheduled. Once other instructions complete, independent instructions (i.e. 1411) from the set 1433 for the next layer group can begin being scheduled. The instructions 1441, 1442, 1443, 1444 and 1445 of the first layer group 1432 (ie 1313 in this example) can be scheduled in any order. The only constraint is that all the independent instructions for the first layer group are scheduled before the dependent instruction 1402 or any instruction for any subsequent layer group.
(123) Once (8e) (ie 1445) from the first layer group 1313 (ie the background group in this example) completes and all other instructions (8a), (8b), (8c) and (8d) from the first layer group 1313 have completed, the dependent instruction (8f) (ie 1402) that produces the result 1448 for the layer group 1313 associated with the set 1432 of independent instructions 1401 can be scheduled.
(124) As the dependent instruction (8f) (ie 1402) is processing and the set 1411 of independent instructions for the layer group 1312 complete, the remainder (ie 1444-1445) of the independent instructions in the set 1433 for the layer group 1312 are scheduled followed by the scheduling of instructions for the set 1434 of independent instructions for the layer group 1311.
(125) Once the set 1401 of independent instructions for the layer group 1313 has finished, the dependent instruction 1402 and the set 1411 of independent instructions for the layer group 1312 have completed, the dependent instruction 1412 for the layer group 1312, which depends on the independent instructions 1411 for the layer group 1312 as well as the output 1448 of the dependent instruction 1402 for the layer group 1313 can be scheduled.
(126) Similarly, once the independent instructions 1421 in the set 1434 of instructions for the layer group 1311 have completed and the dependent instruction 1412 for the layer group 1312 has completed, then the dependent instruction 1422 for the layer group 1311 can be scheduled.
(127)
(128) While it is shown in the example in
Determination of Number of Layers in a Group
(129) Descriptions of the division of layers in the z-stack into layer groups in this specification so far has focused on using 3-layers. Multi-processor examples have used 4-processors to illustrate scheduling. The choice of 3-layers and 4-processors is illustrative only, and has been selected for description for simplicity due to the complex mathematics involved in illustrating the validity of 3-layer compositing according to the CILP arrangements, and the complexity of illustrating multi-processor scheduling.
(130) In practice there is no need to limit the layer groups to 3-layers or the multi-processor scheduling to 4-processors. Currently general purposes CPUs have up to 16 processor cores and multiprocessor systems with multiple CPUs are common. This CILP arrangement describes a per-pixel processing process and PDL pages consist of many millions of pixels, as such the CILP methods described here are ideally suited to GPU implementations where GPU have hundreds of cores for massive data-parallel processing. For example, GPU kernels may be adapted for executing dependent and independent instructions on a pixel basis under control of the scheduler 1533′.
(131) Just as the traditional two-layer Porter and Duff compositing operations have been expanded to 3-layer compositing operations in the described CILP arrangements, so too can the equations be expanded to 4-layer (shown below) or any arbitrary number of layers.
(132) For multi-processor systems the number of layers chosen for each layer group may depend on the number of processors available or some ratio of the number of processors available. For example, for a two CPU system with 32 core a 32 layer expansion of the compositing equation will not be practical, however, 8 groups of 4 layers each being processed concurrently is practical.
(133) Alternatively, the number of layers in a layer group may be chosen based on the number of independent terms that the N-layer equation produces. For example, three layers compositing can be decomposed into at least 5 independent instructions (8a), (8b), (8c), (8d) and (8e) and at least one dependent instruction (8f).
(134) To achieve good performance and minimise data starvation of CPU cores, the number of independent instructions for a layer group should be greater than the number of available threads by 10-20% to compensate for the different execution time of the scheduled instructions. That is having slightly more independent instructions than the number of threads makes better use of the multiprocessor system by ensuring greater CPU utilisation. Having a precisely 1:1 relationship between the number of independent instructions and the number of threads is a bare minimum and will result is some slight starvation of the CPU as the dependent instruction waits for all independent instructions to complete.
(135) The number of independent instructions available for scheduling is also dependent on the complexity of the compositing operation or, in this case, the number of layers of the compositing operation. Should the number of threads available be large, such a 16 or 32 core system, then the number of layers in the compositing operation needs to increase from 3 to 4 or 5 layers to supply enough independent instructions. This is done under the assumption that more layers in a group generally provide more independent instructions to be processed in parallel.
(136) Similar analysis is also applicable to GPUs, in consideration of GPU kernels, which are able to work in parallel, instead of parallel threads.
Detection of Opaque Group Result
(137) While it is possible, using the disclosed CILP arrangements, to process every layer group to produce a final result, it is not always necessary. There are situations where the layers below a certain layer do not contribute to the result due to the effects of opacity, compositing operator or blend mode. Processing time can be saved if layers, and therefore layer groups, can be determined to be opaque and layer groups below an opaque layer group are not processed.
(138) The result of a layer group is determined to be fully opaque if the background layer makes no contribution to the result of the group. This is significant since this means that a layer group can have its opaqueness determined BEFORE the background layer for that group has been composited and made available, thereby removing the need to process the layer groups below.
(139) By examination of equation (8), it can be seen that the background layer makes no contribution to the result if the following background layer contribution determination equations 9 & 10, derived from sub-equations (8a), (8d) and (8e), are true:
(Y.sub.ƒ−Y.sub.ƒα.sub.1+α.sub.1β.sub.ƒ0)(Y.sub.g−Y.sub.gα.sub.2+α.sub.2β.sub.g0)=0 (9)
α.sub.1S.sub.1(β.sub.ƒ1+Z.sub.ƒ)(Y.sub.g−α.sub.2Y.sub.g+α.sub.2β.sub.g0)+α.sub.2S.sub.2(β.sub.g1+Z.sub.g)(Y.sub.ƒ+(X.sub.ƒ−Y.sub.ƒ−Z.sub.ƒ)α.sub.1)=0 (10)
Where:
(140) The terms of equations 9 & 10 have been described in Table 1, Table 2 and Table 3.
(141) Therefore, an opaque group result can be detected by testing the validity of equations (9), (indep1 (8a) equal to 0) and (10), (indep2 (8b) plus indep3 (8c) equal to 0).
(142) Note that these two equations do not use background opacity (α.sub.0) or background colour value (S.sub.0), so can be calculated before the results of the group below have been determined. If the independent instructions (8a), (8d) and (8e), which comprise (9) and (10), are scheduled to execute first when processing a layer group, then once the opaqueness is evaluated, the results from the independent instructions can be used again as input into the dependent instruction (8f) and the remaining independent instructions (8b) and (8c) to produce the final result for the layer group without re-evaluating all the independent instructions.
(143) The opaque group determination instructions are also performed per-pixel and as such well suited to implementation on a GPU.
(144) A further advantage of this method is that if equations (9) and (10) are valid, and therefore the background layer does not contribute to the result, then the effect of the background layer is removed from equation (8) and the layer group result calculation can be simplified to the three-layer compositing equation (12) for fully opaque backgrounds:
α.sub.012S.sub.012=−α.sub.1S.sub.1Z.sub.ƒ(Y.sub.g−α.sub.2Y.sub.gα.sub.2β.sub.g0)+α.sub.2S.sub.2(α.sub.1Z.sub.ƒZ.sub.g+α.sub.1Z.sub.ƒβ.sub.g1−Z.sub.g) (12)
(145) Where:
(146) The terms of equation 12 have been described in Table 1, Table 2 and Table 3.
(147) Equation (12) has only two sub-terms namely indep4 (see 8d) and indep5 (see 8e) which have already been calculated in determining if the layer group is opaque or not, so the evaluation of the layer group result in equation (12) is trivial.
(148) As previously noted, when the equations (9) and (10) are equal to zero this means that the group layer opacity is 100% and the background layer of the group has no contribution at all to the result. An alternate evaluation of group layer opacity is that instead of equations (9) and (10) being equivalent to 0, equations (9) and (10) need only be less than some pre-determined value such that the group layer opacity is some percentage substantially equivalent to being opaque, ie being greater than a pre-determined threshold, such that the background layer of the group has no visual contribution to the result.
Four Layer Porter and Duff Compositing
(149) The method used to generalise the Porter and Duff compositing equations to three levels can also be used to generalise the equations to four levels.
(150) The case of four level compositing is shown in
(151) The four level Porter and Duff compositing equations consist of equation (12A) which calculates the alpha result for the four layers and equation (13) which calculates the pre-multiplied colour value for the four layers:
(152)
Where:
(153) The majority of variables have already been described in previous equations and in Table 1, Table 2 and Table 3. The remaining variables are defined below: α.sub.3S.sub.3 is the pre-multiplied colour of the layer 3 image pixel. α.sub.0123S.sub.0123 is the pre-multiplied colour result for the 4-layer Porter and Duff compositing equation. β.sub.h0 and β.sub.h1 are the Porter and Duff blend mode function control coefficients which describe how the result from the bottom three layers (layer 0/1/2) and the top layer (layer 3), of the group, contribute to the group colour result. β.sub.g0 is the control coefficient (either 0 or 1) which controls whether or not the layer 0/1/2 result contributes to the layer group result; and β.sub.g1 is the control coefficient (either 0 or 1) which controls whether or not S.sub.2 (layer 3/destination colour) contributes to the result.
(154) In a similar way as was done for the two level compositing equations, the Porter and Duff compositing operators specified in the input PDL script are used to set the values for the coefficients defined in Table 2.
(155) In a similar manner to the three-layer Porter and Duff compositing equation, the four-layer equation (13) above can be deconstructed into instructions independent of the layer group background layer and instructions dependent of the layer group background layer. Further there are a number of sub-expressions in the equation (13) which are common to more than one sub-expression and need only be computed once.
(156) Scheduling of the instructions for the four-layer case is similar to that for the three-layer case as illustrated
Blend Equations
(157) The three-layer Porter and Duff compositing equations (3, 4) do not support the application of the multiply and screen blend modes of Porter and Duff. The three-layer equations (3, 4) can be extended to use the multiply and screen blend modes of Porter and Duff by adding a blend term γ to the compositing functions (6, 7) to give the extended blend mode functions (14, 15) as follows:
ƒ(S.sub.0,S.sub.1)=β.sub.ƒ0S.sub.0+β.sub.ƒ1S.sub.1+γ.sub.ƒS.sub.0S.sub.1 (14)
g(S.sub.1,S.sub.2)=β.sub.g0S.sub.1+β.sub.g1S.sub.2+γ.sub.gS.sub.2S.sub.1 (15)
Where: γ.sub.ƒ is the blend mode coefficient used for layer 0 and layer 1; γ.sub.g is the blend mode coefficient used for the layer 0/1 result and layer 2; and
(158) The remaining terms are described in Table 1, Table 2 and Table 3.
(159) By substitution of equations (14) and (15) into equation (8), the three-layer extended blend mode compositing equation (16) is:
(160)
Where: γ.sub.ƒ is the blend mode coefficient used for layer 0 and layer 1; γ.sub.g is the blend mode coefficient used for the layer 0/1 result and layer 2; and
(161) The remaining terms are described in Table 1, Table 2 and Table 3.
(162) In a similar way to the compositing formulae, the values of the X, Y, Z, β.sub.f0, β.sub.f1 and γ.sub.f1 terms are all determined by the blend operator set in the input PDL script. Table 2 describes the relationship between the blend operator and the values of these terms for the case of Porter and Duff blending.
(163) The values for the blend mode control coefficients for the extended blend modes multiply and screen are shown in Table 4. The X column also applies to the multi-layer coefficients X.sub.ƒ, X.sub.g, X.sub.h. The Y column also applies to the multi-layer coefficients Y.sub.ƒ, Y.sub.g, Y.sub.h. The Z column also applies to the corresponding multi-layer coefficients Z.sub.ƒ, Z.sub.g, Z.sub.h. The β.sub.ƒ0 column also applies to the multi-layer compositing coefficient β.sub.g0 and β.sub.h0. The β.sub.ƒ1 column also applies to the multi-layer compositing coefficient β.sub.g1 and β.sub.h1. The γ.sub.ƒ column also applies to the multi-layer compositing coefficient γ.sub.g.
(164) TABLE-US-00004 TABLE 4 Porter and Duff compositing operators for blend modes. P&D Operator X Y Z B.sub.f0 β.sub.f1 γ.sub.f Description multiply 1 0 1 0 0 1 Multiply blend mode. screen 1 0 1 1 1 0 Screen blend mode.
First Arrangement
(165) This first arrangement assumes that the compositing algorithm is provided data in a format that describes the objects that contribute to every point, such as a fillmap.
(166) Print rendering systems are normally provided with a source document in a page description language, such as Portable Document Format (PDF) developed by Adobe Systems Inc. Such print rendering systems generate pixel data output that is suitable for sending the pixel data to printer hardware for drawing on an output medium.
(167) A print rendering system may for example take drawing requests or instructions from the PDL and render the drawing instructions directly into a full-page frame-buffer. The drawing instructions can be rendered by converting the drawing instructions to one or more intermediate formats. The present arrangement relates to printing systems that use an intermediate format which provides a z-stack of objects contributing to each pixel, for example a fillmap data structure, or simply a fillmap.
(168) A fillmap describes the content of a single page in a print document. The data structure of a fillmap 1699 is shown in
(169) In the example of
(170) The first CILP arrangement that determines the colour values of a composited pixel is now described. The input fillmap 1699 comprises a plurality of compositing stacks such as 1605, in which all objects contributing to a particular pixel are z-ordered. Each entry in the compositing stack 1605 references object information, such as fills 1609 stored in the fill store 1608.
(171)
(172) The process then moves to a step 302 where the number of layers, m, in the list of layers obtained in step 303 is determined.
(173) The process then moves to a step 303 where the list of layers obtained in the step 301 is divided into groups of at most three layers. The first group is composed of the lowest layers in the layer list (the page background, and the first two objects in the z-ordered object list). In all other groups, the first layer (ie the lowest layer) in the group is the computed result of the group below the group in question, and the second and third layers are the next two layers from the layer list. Note that the top-most group may have only two layers.
(174) The process then moves to a step 304, described hereinafter in more detail with reference to
(175) The process then moves to step a 305, described hereinafter in more detail with reference to
(176) Once the final (top-most) group is processed, then the pixel value has been determined, and the process is complete.
(177)
(178) The process starts at a step 401 (all method steps in this process are performed by one or more of the processors 1505.sup.(1)-1505.sup.(n) as directed by the software 1533 and/or the scheduler 1533′). In this step, all groups are put in a waiting list in reverse z-order, where the lowest group (which contains the page background) at the tail of the list, and the highest group at the head of the list. All groups are marked as relevant.
(179) The process then moves to a step 402. In this step a number of group compositing threads are started, using one or more of the processors 1505.sup.(1)-1505.sup.(n) as directed by the software 1533 and the scheduler 1533′. Typically, the number of group compositing threads started is equal to the number of processor cores available. See the description of step 403 below for a description of one of the group compositing threads.
(180) The process then moves to a step 404 where the process waits for all of the group compositing threads started in step 402 to finish. The process then completes.
(181) The steps 403, described hereinafter in more detail with reference to
(182)
(183) The process starts at a decision point 501 (all method steps in this process are performed by one or more of the processors 1505.sup.(1)-1505.sup.(n) as directed by the software 1533 and/or the scheduler 1533′). In this decision point, the process checks to see if there is at least one group in the waiting list. If the waiting list is empty, then the process moves to a step 508 which terminates the thread. If there is at least one group in the waiting list, then the process moves to a step 502.
(184) In the step 502, the current thread removes the group at the head of the waiting list. This becomes the current group that this thread will work on. Note that since there are multiple threads, concurrent programming techniques must be used to ensure that the waiting list is accessed correctly in both the decision point 501 and the step 502.
(185) In the decision point 503, the current group is examined to see if it is marked as relevant. If the group is marked as relevant, then the process moves to a step 504. If the group is not marked as relevant, then the process moves to the step 508.
(186) In the step 504, described hereinafter in more detail with reference to
(187) The process then moves to a step 505, described hereinafter in more detail with reference to
(188) The process then moves to a decision point 506. In this decision point, the calculated results of the step 505 are examined to see if the result of the layer is opaque, using the equations (9) and (10). If the result of the layer is opaque, then the process moves to a step 507. If the layer is not opaque, then the processing loops back to the decision point 501 to get the next relevant group.
(189) In the step 507, all layers below the current layer are marked as not relevant.
(190) The process then moves to the step 508. In this step, the group compositing thread is terminated. This completes the processing of the thread.
(191)
(192) The compositing formula parameters are obtained for a group, using the Porter and Duff operator for each pair of layers as specified by the original PDL input.
(193) The process starts at a step 601. In this step, the Porter and Duff operator between layer 1 (ie 202 in
(194) The process then moves to a step 602. In this step, the parameters X.sub.f, Y.sub.f, Z.sub.f, β.sub.f0 and β.sub.f1 for the compositing between layers 1 and 0 are obtained. These parameters are obtained by using the Porter and Duff operator obtained in step the 601 to find the matching row in Table 3 and reading out the parameters from that row.
(195) The process then moves to a step 603. In this step, the Porter and Duff operator between layer 2 (ie 201) and layer 1 (ie 202) is identified.
(196) The process then moves to step a 604. In this step, the parameters X.sub.g, Y.sub.g, Z.sub.g, β.sub.g0 and β.sub.g1 for the compositing between layers 2 and 1 are obtained. Again, these parameters are obtained by using the Porter and Duff operator obtained in the step 603 to find the matching row in table 1 and reading out the parameters from that row.
(197) This completes the processing of
(198)
(199) The process starts at a step 701. In this step, the waiting list is cleared. Then all groups that are currently marked as relevant are put into the waiting list. The waiting list is ordered so that the bottom-most (in z-order) group (which contains the page background) is at the head of the waiting list, while the top most-group is at the tail of the list.
(200) The process then moves to a step 702. In this step, the group at the head of the waiting list is selected and removed from the list.
(201) The process then moves to a step 703, described hereinafter in more detail with reference to
(202) The process then moves to a decision point 704. If there are groups remaining in the waiting list, then the process loops back to step 702. If there are no groups left in the waiting list, then the process described by
(203)
(204) The process starts at a step 801. In this step, the currently available input parameters are obtained. These input parameters are: α.sub.1S.sub.1, α.sub.1, α.sub.2S.sub.2, α.sub.2, X.sub.f, X.sub.g, Y.sub.f, Y.sub.g, Z.sub.f, Z.sub.g, β.sub.f0, β.sub.f1, β.sub.g0 and β.sub.g1. Note that α.sub.0S.sub.0, α.sub.0 are not available at this point.
(205) The input parameters are described in Table 1 and Table 2.
(206) The process then moves to a step 802. In this step, the stage one calculations (equations 9 & 10) are performed for the group. The calculations are defined by the following:
I20=Y.sub.ƒ−Y.sub.ƒα.sub.1+α.sub.1β.sub.ƒ0 (20)
I21=Y.sub.g−Y.sub.gα.sub.2α.sub.2β.sub.g0 (21)
I22=α.sub.1Z.sub.ƒZ.sub.g+α.sub.1Z.sub.ƒβ.sub.g1−Z.sub.g (22)
I23=(β.sub.g1+Z.sub.g)(Y.sub.ƒ+(X.sub.ƒ−Y.sub.f−Z.sub.ƒ)α.sub.1) (23)
(207) Equations 20-23 are partial sub-expressions of equation 8 which can be calculated independently of each other. The results of the sub-expressions 20-23 are represented by the terms I20-I23 respectively. The terms in the sub-expressions 20-23 have been described previously in Table 1, Table 2 and Table 3.
(208) The process then moves to a step 803. In this step, the stage two calculations are performed for the group. The calculations use the results of the numbered calculations from the step 802. The calculations are defined by the following:
I25=I20×I21 (25)
I26=α.sub.2S.sub.2×I23 (26)
I27=α.sub.2S.sub.2×I22 (27)
I28=α.sub.1S1×I21 (28)
(209) Equations 25-28 are partial sub-expressions of equation 8 which have further been simplified by representing the sub-expressions of equation 8 in terms of the results I20-I23 of the partial sub-expressions 20-23. The partial sub-expressions 25-28 can be calculated independently of each other. The results of the sub-expressions 25-28 are represented by the terms I25-I28 respectively. The terms in the sub-expressions 25-28 have been described previously in Table 1, Table 2 and Table 3.
(210) Once the step 803 is completed, the process for
(211)
(212) When the process 703 in
(213) The process starts at a step 811. In this step, the stage three calculations are performed for the group. The calculations are defined as follows:
α.sub.0S.sub.0×I25+α.sub.0(I28(β.sub.ƒ1+Z.sub.ƒ)+I26)+I27−Z.sub.ƒ×I28 (29)
(214) Equation 29 is an alternate form of the compositing equation 8 illustrating how the compositing equation 8 can be resolved from the results of the independent sub-expressions 20-23 and 25-28. The terms in equation 29 have been described previously in Table 1, Table 2 and Table 3.
(215) The process then moves to a step 812. In this step, the computed group compositing result is outputted. The process described by
Second Arrangement
(216) This second CILP arrangement is a variation of the first arrangement in which the 4-layer Porter and Duff equations are used instead of the 3-layer equations. This process is very similar to the process of arrangement 1. Only the differences will be described.
(217) The second arrangement of the process that determines the colour values of a composited pixel using the four layer Porter and Duff compositing equations is now described.
(218)
(219) The process in
(220) Steps 1001 and 1002 are identical to the equivalent steps 301 and 302 in
(221) In a subsequent step 1003, the list of layers obtained in the step 1001 is divided into groups of at most four layers. Note that the top-most group may have fewer than four layers.
(222) Steps 1004 (described in more detail with reference to
(223) The process described in
(224) The process described in
(225)
(226) The compositing formula parameters are obtained for a group, using the Porter and Duff operator for each layer as specified by the original PDL input.
(227) The process starts at a step 1101. In this step, the Porter and Duff operator between layer 1 (ie 903 in
(228) The process then moves to a step 1102. In this step, the parameters X.sub.f, Y.sub.f, Z.sub.f, β.sub.f0 and β.sub.f1 for the compositing between layers 1 and 0 are obtained. These parameters are obtained by using the Porter and Duff operator obtained in the step 1101 to find the matching row in table 1 and reading out the parameters from that row.
(229) The process then moves to a step 1103. In this step, the Porter and Duff operator between layer 2 (ie 902) and layer 1 (ie 903) is identified.
(230) The process then moves to a step 1104. In this step, the parameters X.sub.g, Y.sub.g, Z.sub.g, β.sub.g0 and β.sub.g1 for the compositing between layers 2 and 1 are obtained. Again, these parameters are obtained by using the Porter and Duff operator obtained in the step 1103 to find the matching row in table 1 and reading out the parameters from that row.
(231) The process then moves to a step 1105. In this step, the Porter and Duff operator between layer 3 (ie 901) and layer 2 (ie 902) is identified.
(232) The process then moves to a step 1106. In this step, the parameters X.sub.g, Y.sub.g, Z.sub.g, β.sub.g0 and β.sub.g1 for the compositing between layers 3 and 2 are obtained. Again, these parameters are obtained by using the Porter and Duff operator obtained in the step 1103 to find the matching row in table 1 and reading out the parameters from that row.
(233) This completes the process depicted by
(234)
(235) The process starts at a step 1201. In this step, the currently available input parameters are obtained. These input parameters are: α.sub.1S.sub.1, α.sub.1, α.sub.2S.sub.2, α.sub.2, α.sub.3S.sub.3, α.sub.3, X.sub.f, X.sub.g, X.sub.h, Y.sub.f, Y.sub.g, Y.sub.h, Z.sub.f, Z.sub.g, Z.sub.h, β.sub.f0, β.sub.f1, β.sub.g0, β.sub.g1, β.sub.h0 and β.sub.h1. Note that α.sub.0S.sub.0, α.sub.0 are not available at this point.
(236) The process then moves to a step 1202. In this step, the stage one calculations are performed for the group. The calculations are defined as follows:
I30=Y.sub.ƒ−α.sub.1Y.sub.ƒ+α.sub.1β.sub.ƒ0 (30)
I31=Y.sub.g−α2Y.sub.g+α.sub.2β.sub.g0 (31)
I32=Y.sub.h−α.sub.3Y.sub.h+α.sub.3β.sub.h0 (32)
I33=α.sub.1S.sub.1(β.sub.ƒ1−Z.sub.ƒ) (33)
I34=α.sub.2S.sub.2(β.sub.g1−Z.sub.g) (34)
I35=α.sub.3S.sub.3(β.sub.h1−Z.sub.h) (35)
I36=(Y.sub.ƒ−α.sub.1(X.sub.ƒ−Y.sub.ƒ−Z.sub.ƒ)) (36)
I37=(Y.sub.g−α.sub.2(X.sub.g−Y.sub.g−Z.sub.g)) (37)
I38=α.sub.1Z.sub.ƒZ.sub.g+α.sub.1Z.sub.ƒβ.sub.g1−Z.sub.g (38)
I39=α.sub.2Z.sub.hZ.sub.g+α.sub.1Z.sub.gβ.sub.h1−Z.sub.h (39)
I40=−α.sub.1S.sub.1Z.sub.ƒ (40)
(237) Equations 30-40 are partial sub-expressions of equation 13 which can be calculated independently of each other. The results of the sub-expressions 30-40 are represented by the terms 130-140 respectively. The terms in the sub-expressions 30-40 have been described previously in Table 1, Table 2 and Table 3.
(238) The process then moves to a step 1203. In this step, the stage two calculations are performed for the group. The calculations use the results of the numbered calculations from the step 1202. The calculations are as follows:
I41=I30×I31×I32 (41)
I42=I31×I32×I33 (42)
I43=I32×I34×I36 (43)
I44=I35×I36×I37 (44)
145=I31×I32×I40 (45)
I46=α.sub.2S.sub.2×132×138 (46)
I47=α.sub.1Z.sub.ƒ×I35×I37 (47)
(239) Equations 41-47 are partial sub-expressions of equation 13 which have further been simplified by representing the sub-expressions of equation 13 in terms of the results 130-140 of the partial sub-expressions 30-40. The partial sub-expressions 41-47 can be calculated independently of each other. The results of the sub-expressions 41-47 are represented by the terms 141-147 respectively. The terms in the sub-expressions 41-47 have been described previously in Table 1, Table 2 and Table 3.
(240) Note that there are some common terms in the above equations. Intermediate calculations can be used to optimise these calculations.
(241) Once the step 1203 is done, the process described by
(242)
(243) The process starts at a step 1211. In this step, the stage three calculations are performed for the group. The calculations are as follows:
α.sub.0S.sub.0×I41+α.sub.0(I42+I43+I44)+(I45+I46+I47+I39) (50)
(244) Equation 50 is an alternate form of the compositing equation 13 illustrating how the compositing equation 13 can be resolved from the results of the independent sub-expressions 30-47. The terms in equation 50 have been described previously in Table 1, Table 2 and Table 3.
(245) The process then moves to a step 1212. In this step, the computed group compositing result is outputted. The processing of
Third Arrangement
(246) This 3.sup.rd CILP arrangement is a variation of the first arrangement where the 3-layer Porter and Duff blend extension equations are used instead of the 3-layer equations. This process is very similar to the process of the first CILP arrangement. Only the differences will be described.
(247) The third CILP arrangement of the process that determines the colour values of a composited pixel using the three layer Porter and Duff blend equations is now described.
(248) The process described in
(249) The process described in
(250) The process described in
(251) The process described in
(252) In the 3.sup.rd CILP arrangement, the step 602 is performed as follows: In this step, the parameters X.sub.f, Y.sub.f, Z.sub.f, β.sub.f0, β.sub.f1 and γ.sub.f for the compositing between layers 1 and 0 are obtained. These parameters are obtained by using the Porter and Duff operator obtained in the step 601 to find the matching row in Table 4 and reading out the parameters from that row.
(253) In this CILP arrangement, the step 604 is performed as follows: In this step, the parameters X.sub.g, Y.sub.g, Z.sub.g, β.sub.g0, β.sub.g1 and γ.sub.g for the compositing between layers 2 and 1 are obtained. Again, these parameters are obtained by using the Porter and Duff operator obtained in the step 603 to find the matching row in table 2 and reading out the parameters from that row.
(254) The process described in
(255) The process described in
(256) In the step 801, the currently available input parameters are obtained. These input parameters are: α.sub.1S.sub.1, α.sub.1, α.sub.2S.sub.2, α.sub.2, X.sub.f, X.sub.g, Y.sub.fƒ, Y.sub.g, Z.sub.f, Z.sub.g, β.sub.f0, β.sub.f1, β.sub.g0, β.sub.g1, γ.sub.f and γ.sub.f. Note that α.sub.0S.sub.0, α.sub.0 are not available at this point.
(257) In the 802 for the 3.sup.rd CILP arrangement, the stage one calculations are performed for the group. The calculations are as follows:
I50=Y.sub.ƒ−α.sub.1Y.sub.ƒ+α.sub.1β.sub.ƒ0+γ.sub.ƒα.sub.1S.sub.1 (51)
I51=Y.sub.g−α.sub.2Y.sub.gα.sub.2β.sub.g0+γ.sub.ƒgα.sub.2S.sub.2 (52)
I52=α.sub.1Z.sub.ƒZ.sub.g+α.sub.1Z.sub.ƒβ.sub.ƒ1−Z.sub.g (53)
I53=(β.sub.g1+Z.sub.g)(Y.sub.ƒ−α.sub.1(X.sub.ƒ−Y.sub.ƒZ.sub.ƒ)) (54)
I54=α.sub.1S.sub.1(β.sub.ƒ1+Z.sub.ƒ) (55)
(258) Equations 51-55 are partial sub-expressions of equation 16 which can be calculated independently of each other. The results of the sub-expressions 51-55 are represented by the terms 150-154 respectively. The terms in the sub-expressions 51-55 have been described previously in Table 1, Table 2 and Table 3 and Table 4.
(259) The process then moves to the step 803. In this step, the stage two calculations are performed for the group. The calculations use the results of the numbered calculations from the step 802. The calculations are as follows:
I55=I50×I51 (56)
I56=α.sub.2S.sub.2×I53 (57)
I57=α.sub.2S.sub.2×I52 (58)
I58=I51×I54 (59)
I59=Z.sub.ƒα.sub.1S.sub.1×I51 (60)
(260) Equations 56-60 are partial sub-expressions of equation 16 which have further been simplified by representing the sub-expressions of equation 16 in terms of the results 150-154 of the partial sub-expressions 51-55. The partial sub-expressions 56-60 can be calculated independently of each other. The results of the sub-expressions 56-60 are represented by the terms 155-159 respectively. The terms in the sub-expressions 56-60 have been described previously in Table 1, Table 2 and Table 3 and Table 4.
(261) Once the step 803 is completed, the process for
(262) The process depicted in
(263) In the step 811 for the 3.sup.rd CILP arrangement, the stage three calculations are performed for the group. The calculations are as follows:
α.sub.0S.sub.0×I55+α.sub.0×(I56+I58)+I57+I59 (61)
(264) Equation 61 is an alternate form of the compositing equation 16 illustrating how the compositing equation 16 can be resolved from the results of the independent sub-expressions 51-60. The terms in equation 61 have been described previously in Table 1, Table 2 and Table 3 and Table 4.
(265) The process then moves to the step 812. In this step, the computed group compositing result is outputted. The process described by
INDUSTRIAL APPLICABILITY
(266) The arrangements described are applicable to the computer and data processing industries and particularly for the image rendering industry.
(267) The foregoing describes only some embodiments of the present invention, and modifications and/or changes can be made thereto without departing from the scope and spirit of the invention, the embodiments being illustrative and not restrictive.