Computer-implemented method and system for image correction for a biomarker test
20220414848 · 2022-12-29
Abstract
Computer-implemented image correction for a biomarker test includes a biomarker test which has a calibration array and a biomarker site, wherein the calibration array comprises plural colored patches and the biomarker site is color-responsive to indicate a measurement of biomarkers present. The method comprises: storing a reference color value for each of the plural colored patches; receiving an image of the biomarker test; defining shading of pixels of the image as D, a combination of a plurality of basis functions; defining a color correction matrix, M having parameters that when solved correct color of the calibration array in the image to the corresponding stored values; solving D and M for pixels of the image excluding pixels of the biomarker site; using D and M to interpolate values for pixels of the biomarker site of the image to generate a color and shading corrected image of the biomarker site.
Claims
1. A computer-implemented method of image correction for a biomarker test, the biomarker test having a calibration array and a biomarker site, wherein the calibration array comprises a plurality of colored patches and the biomarker site is color-responsive to indicate a measurement of biomarkers present, the method comprising: storing, in a data repository, a reference color value for each of the plurality of colored patches; receiving an image of the biomarker test; defining shading of pixels of the image as D, a combination of a plurality of basis functions; defining a color correction matrix, M having parameters that when solved correct color of the calibration array in the image to the corresponding values stored in the data repository; solving D and M for pixels of the image excluding pixels of the biomarker site; and using D and M to interpolate values for pixels of the biomarker site of the image to generate a color and shading corrected image of the biomarker site.
2. The method of claim 1, wherein the plurality of basis functions each comprise a slowly varying function.
3. The method of claim 1 wherein the plurality of basis functions collectively comprise a slowly varying function.
4. The method of claim 1, further comprising extracting pixels of at least a subset of the calibration array from the image, the step of solving D and M being performed on the extracted pixels.
5. The method of claim 1, wherein the plurality of basis shading functions are 2D Legendre polynomials or 2D discrete cosine transform (DCT) functions.
6. The method of claim 1, wherein the plurality of basis shading functions are chosen to model real world shading conditions.
7. The method of claim 6 wherein the basis functions are found using principal component analysis of the real world shading found in images of the biomarker test.
8. The method of claim 6, wherein the basis functions are found using characteristic vector analysis of the real world shading found in images of the biomarker test.
9. The method of claim 1, wherein D and M are iteratively solved using an alternating least-squares, ALS, procedure.
10. The method of claim 1, wherein D and M are iteratively solved using a random sample consensus, RANSAC, procedure.
11. The method of claim 1, wherein D and M are iteratively solved using a robust solving method.
12. The method of claim 1, wherein D and M are solved using a simple search procedure.
13. The method of claim 12, wherein the simple search procedure comprises the Nelder-Mead Simplex Method.
14. A biomarker test system comprising: a biomarker test having a calibration array and a biomarker site, wherein the calibration array comprises a plurality of colored patches and the biomarker site is color-responsive to indicate a measurement of biomarkers present; a data repository storing a reference color value for each of the plurality of colored patches; a computer system configured to execute computer program code to: receive an image of the biomarker test; define shading of pixels of the image as D, a combination of a plurality of basis functions; define a color correction matrix, M having parameters that when solved correct color of the calibration array in the image to the corresponding values stored in the data repository; solve D and M for pixels of the image excluding pixels of the biomarker site; and use D and M to interpolate values for pixels of the biomarker site of the image to generate a color and shading corrected image of the biomarker site.
15. The biomarker test system of claim 14, wherein the computer system is configured to identify the calibration array of the biomarker test in the received image and solve D and M for pixels of the calibration array.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0041] The patent or application file contains at least one drawing executed in color. Copies of this patent or patent application publication with color drawing(s) will be provided by the Office upon request and payment of the necessary fee.
[0042] Embodiments of the present invention will now be described by way of example only with reference to the accompanying drawings in which:
[0043]
[0044]
[0045]
[0046]
[0047]
[0048]
[0049]
[0050]
[0051]
[0052]
[0053]
DETAILED DESCRIPTION
[0054]
[0055] In one embodiment, a computer-implemented method of image correction for a biomarker test is executed.
[0056] The biomarker test has a calibration array and a biomarker site (test site), the calibration array comprising a plurality of colored patches. The biomarker site is color-responsive to indicate a measurement of biomarkers present. A per-image transform is produced, mapping biomarker site colors to a reference color space such that any dependence on the device and illumination is minimized.
[0057] The method includes:
[0058] storing in step 10, in a data repository, a reference color value for each of the plurality of colored patches;
[0059] receiving in step 20 an image of the biomarker test;
[0060] defining in step 30 shading of pixels of the image as D, a combination of a plurality of basis functions;
[0061] defining in step 40 a color correction matrix, M having parameters that when solved correct color of the calibration array in the image to the corresponding values stored in the data repository;
[0062] solving in step 50 D and M for pixels of the image excluding pixels of the biomarker site; and,
[0063] using in step 60 D and M to interpolate values for biomarker site pixels of the image to generate a color and shading corrected image of the biomarker site.
[0064] It will be appreciated that there are different ways of excluding pixels of the biomarker site. In a preferred embodiment, pixels of the calibration array are detected or identified from the image (for example the test may have a known geometry that can be determined from the image or one that is identified from the data repository or other source). Where necessary, the image is rotated, scaled, de-skewed etc. D and M can then be solved for pixels of the calibration array, thereby avoiding the biomarker site. Similarly, where other sites are used, these can be likewise identified from the image itself or by reference to a known layout.
[0065] In a preferred embodiment, the test may have two calibration arrays that are positioned in a known orientation to each other on the test (in parallel, perpendicularly, etc.). Use of multiple arrays with known orientation advantageously aids in identifying the arrays and any image manipulation needed.
[0066] In preferred embodiments, a biomarker test design is utilized in which there is a significant gap between a pair of reference calibration arrays (barcodes). This gap is preferably chosen to enable a shading correction to be determined of a completely unknown region, using only the reference patches.
[0067] As an alternative to a pair of arrays, a test could, for example, have test sites enclosed on all 4 sides or on two non-parallel sides by calibration arrays.
[0068] Preferably, each calibration array has at least 6 color patches of colors that preferably extend uniformly across the visible color spectrum. The surface of the test not occupied by calibration arrays could be as much as 90% of the area of the test.
[0069] Solving D and M may be performed on the basis of an optimization/convergence criterion. For example, when alternating least squares, ALS, is used to solve for D and M, the procedure converges and no criterion is strictly needed to decide when to stop. In practice, the method may be implemented so as to stop the computation of D and M when the change in the fit to the data falls below a criterion amount. Other solving methods would have different convergence/best-estimate criteria.
[0070] It will be appreciated that the storing in step 10 need only be done once and could be done ahead of time. The data repository may be local or remote.
[0071] It will also be appreciated that the basis functions could also be defined ahead of time and encoded in the computer implemented method—if this approach is taken, the solution to D may vary from image to image but the basis functions themselves are the same for all images.
[0072] In the biomarker test of
[0073] It will also be appreciated that other layouts are also possible and that a single barcode, multiple barcodes or one or more other representations of the color reference could be used. When estimating shading using only a calibration array, there would preferably be a calibration array of some kind on at least two sides of the test sites (to enable shading to be inferred between them). However, it will be appreciated that there are embodiments described that can use additional information such as a white background—in such situations, a single calibration array could be used.
[0074] Consider the situation where we have an image of a biomarker test i(x,y), with dimensions m×n×3, i.e. i(x,y) returns the RGB triplet at pixel location (x,y). We can also represent this image as an N×3 matrix, I, where N is the total number of pixels (i.e. N=m×n). We represent the desired color correction matrix M as a 3×3 matrix, determined using knowledge of the expected colors of the barcode patches in a reference color space. Unless stated otherwise in the following discussion we will model an image as an N×3 matrix where the reader understands it is a simple matter to reinterpret this data as an m×n×3 image. Given a real image I—which is confounded by the unknown capture conditions—we would like to transform it to an output H for well controlled reference viewing conditions (e.g. a known colour of light and where the intensity of light is uniform across the target). Embodiments seek to correct the image I so that it is similar to the putative H.
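By way of illustration only, the reinterpretation between the m×n×3 image and the N×3 matrix I can be sketched with NumPy as follows. The function names `image_to_matrix` and `matrix_to_image` and the row-major pixel ordering are assumptions made for this sketch and are not prescribed by the method.

```python
import numpy as np

def image_to_matrix(img):
    """Flatten an m x n x 3 image into an N x 3 matrix, N = m * n (row-major order assumed)."""
    m, n, channels = img.shape
    if channels != 3:
        raise ValueError("expected an RGB image with 3 channels")
    return img.reshape(m * n, 3)

def matrix_to_image(mat, m, n):
    """Reinterpret an N x 3 matrix as an m x n x 3 image."""
    return mat.reshape(m, n, 3)

# Tiny example: a 2 x 3 image gives a 6 x 3 matrix.
img = np.arange(2 * 3 * 3, dtype=float).reshape(2, 3, 3)
I = image_to_matrix(img)
```

With this ordering, pixel (row r, column c) of the image is row r·n + c of I.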
[0075] Often, the color corrected image is calculated as H=IM, however in real life it is important to consider the impact of shading across the image. If the shading is ignored, both the mapping itself will be less accurate (since the relative intensities of the patches will be incorrect in the image) and the final color values for the sites will be inaccurate (since their intensities will also be modified, compounded by the affected mapping). Analogously to the image itself, the required 2D shading correction can be represented as a diagonal N×N matrix D (since each element in the diagonal matrix corresponds to a per-pixel multiplicative factor that modulates the magnitude of an RGB vector i.e. it models the effect of shading). In this case, H is related to I as:
H=DIM (1)
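As a minimal sketch of Equation 1, the following shows that, because D is diagonal, the product DIM can be evaluated as a per-pixel scaling of the rows of IM without ever forming the N×N matrix. The numerical values of `I`, `M` and `d` are arbitrary placeholders chosen for this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 5                                        # number of pixels (tiny, for illustration)
I = rng.uniform(0.1, 1.0, size=(N, 3))       # N x 3 image matrix (placeholder data)
M = np.array([[0.90, 0.10, 0.00],
              [0.05, 0.90, 0.05],
              [0.00, 0.10, 0.90]])           # 3 x 3 color correction (placeholder values)
d = rng.uniform(0.5, 1.5, size=N)            # diagonal of D: one shading factor per pixel

H_dense = np.diag(d) @ I @ M                 # literal form of H = D I M
H_fast = d[:, None] * (I @ M)                # same result, storing only the diagonal of D
```

Storing only the diagonal keeps memory proportional to N rather than N², which matters for real image sizes.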
[0076] For the reasons noted above, it is important to consider the shading for both the barcodes and the biomarker sites. This means that the above equation is even more challenging to solve, since although we wish to correct the colour of the biomarker sites (which account for x pixels in total), we cannot use any measurements therein to calculate either D or M. That is, we wish to estimate D and M using only the pixels in $I_\omega$, where ω denotes the subset of the rows of I not in the biomarker sites. In the discussion that follows, the number of pixels in ω is equal to N′=N−x.
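One possible way of selecting the subset ω is with a boolean mask, sketched below. The mask location, the image size and the variable names (`site_mask`, `omega`, `I_omega`) are hypothetical and chosen only for this illustration.

```python
import numpy as np

m, n = 4, 6
img = np.random.default_rng(1).uniform(size=(m, n, 3))   # placeholder image

# Hypothetical biomarker-site region: True where a pixel belongs to a site.
site_mask = np.zeros((m, n), dtype=bool)
site_mask[1:3, 2:4] = True

I = img.reshape(-1, 3)                 # N x 3 image matrix
omega = ~site_mask.reshape(-1)         # rows of I that are NOT in the biomarker sites
I_omega = I[omega]                     # N' x 3 matrix used for estimating D and M

x = int(site_mask.sum())               # number of excluded (site) pixels
N_prime = I_omega.shape[0]             # N' = N - x
```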
[0077] The existence of missing data means that we cannot estimate the shading (or color) directly, and so must make assumptions about the structure of D. In particular, we cannot directly solve for the shading for pixels outside of ω. Thus, in embodiments, D is represented as the sum of k basis functions. Because we are representing shading as the sum of a small number of basis shading fields, we are able to estimate the shading of the biomarker site pixels by fitting the basis to the non-biomarker site regions.
[0078] The following considers in more detail one way in which we might represent the shading basis diagonal matrices. There are a number of possible options for these basis functions, for example 2D Legendre polynomials or 2D discrete cosine transform functions. A third option is to directly model the shading fields across reference white biomarker tests (i.e. the biomarker tests without the barcodes or the biomarker sites). To do this, we take many pictures of the white biomarker tests under many typical viewing conditions. Principal component analysis can then be used to select a set of basis functions based on the shading fields found in the real world. Henceforth, we represent shading as a linear combination of a basis of shading adjustments as:
$$D = \sum_{i=1}^{k} c_i B_i \qquad (2)$$
[0079] where $B_i$ are the N×N basis functions (represented as diagonal matrices), and $c_i$ are weighting coefficients. As a point of practical detail, we note that, while D and $B_i$ are diagonal matrices, we can ignore the off-diagonal terms when fitting the data. By using this formulation, the number of parameters to determine for the shading is hugely reduced from N (including x missing pixels) to k (the number of shading basis functions). Typically k is a very small fraction of the number of pixels and may in fact be fixed to 3, 4 or 5.
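A minimal sketch of Equation 2 using low-order 2D DCT basis functions is given below. Only the diagonal of each basis matrix is stored, as a column of an N×k array. The helper names `dct_basis` and `shading_diagonal`, the DCT-II cosine form and the particular choice of frequency pairs (0,0), (0,1) and (1,0) are assumptions made for this sketch; the coefficient values are arbitrary.

```python
import numpy as np

def dct_basis(m, n, orders):
    """Return an (N, k) array whose column i holds the diagonal of basis matrix B_i.

    `orders` is a list of (u, v) frequency pairs; each basis field is the
    separable DCT-II cosine cos(pi*(2y+1)*u/(2m)) * cos(pi*(2x+1)*v/(2n)).
    """
    y = np.arange(m)[:, None]
    x = np.arange(n)[None, :]
    columns = []
    for u, v in orders:
        field = (np.cos(np.pi * (2 * y + 1) * u / (2 * m)) *
                 np.cos(np.pi * (2 * x + 1) * v / (2 * n)))   # m x n, slowly varying
        columns.append(field.reshape(-1))                      # flatten row-major to length N
    return np.stack(columns, axis=1)

def shading_diagonal(B, c):
    """Diagonal of D = sum_i c_i B_i, i.e. one shading factor per pixel."""
    return B @ c

m, n = 8, 10
B = dct_basis(m, n, [(0, 0), (0, 1), (1, 0)])   # k = 3 basis fields (assumed choice)
c = np.array([1.0, 0.2, -0.1])                  # arbitrary weighting coefficients
d = shading_diagonal(B, c)                      # length N = 80
```

The whole shading field over N pixels is thus described by only k = 3 numbers.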
[0080] Having found a representation for D, the problem still remains of how to solve Equation 1. Let us start by assuming that we already know M. In this case, we can apply it to obtain a color corrected image, IM, and use a least squares approach to find D. That is, we want to find the shading that minimizes $\lVert D I_\omega M - H_\omega \rVert^2$, where D has the form described in Equation 2 and we solve for D using only a subset of the available pixels. Once we have estimated D we can hold it fixed, and solving for M is a simple least-squares regression. We then hold M fixed to solve for the shading again and iterate in this way until the process converges.
[0081] In detail, in one embodiment we obtain D given M:
[0082] 1. Prepare a 3p-row, k-column matrix, $R^p$, based on the basis functions multiplied by the color corrected image, where p is the number of colored barcode patches:
[0083] Find the product of the color corrected image with each of the basis functions: $B_{1,\omega} I_\omega M$, $B_{2,\omega} I_\omega M$, …, $B_{k,\omega} I_\omega M$, and then find the average over each known colored patch to move from an N′×3 matrix to a p×3 matrix.
[0084] Reshape each of these p×3 matrices into a 3p×1 column by stacking the RGB channels: $[[B_i(IM)_{red}]^p \ldots [B_i(IM)_{green}]^p \ldots [B_i(IM)_{blue}]^p]'$. Place the k columns next to each other to form the 3p×k matrix, $R^p$.
[0085] 2. Prepare the known XYZ values [1-2] for the barcode patches into a p×3 matrix $H^p$ with a single XYZ value for each barcode patch, then stack the XYZ values as above to form a 3p×1 column vector $h^p$.
[0086] 3. Find the weighting coefficients by carrying out a standard least squares regression:
$$c = [R^p]^{+} h^p$$
[0087] where the + represents the Moore-Penrose pseudo-inverse, $[R^p]^{+} = [[R^p]^T [R^p]]^{-1} [R^p]^T$.
[0088] 4. Finally, construct D using the coefficients and basis functions according to Equation 2.
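Steps 1 to 4 above can be sketched as follows. This is an illustrative sketch under stated assumptions rather than a definitive implementation: the helper names `patch_means` and `solve_shading_coeffs`, the integer `labels` array assigning each pixel in ω to a patch, and the synthetic data (a constant-plus-linear shading basis and two rows of three patches) are all hypothetical.

```python
import numpy as np

def patch_means(values, labels, p):
    """Average the rows of an (N', 3) array over each of p patches (labels in 0..p-1)."""
    sums = np.zeros((p, values.shape[1]))
    np.add.at(sums, labels, values)               # accumulate pixel rows per patch
    counts = np.bincount(labels, minlength=p)     # number of pixels in each patch
    return sums / counts[:, None]

def solve_shading_coeffs(I_w, B_w, M, H_p, labels):
    """Steps 1-3: return the coefficients c that minimise ||R_p c - h_p||^2 for a fixed M."""
    p = H_p.shape[0]
    corrected = I_w @ M                           # color corrected pixels, N' x 3
    columns = []
    for i in range(B_w.shape[1]):
        shaded = B_w[:, i:i + 1] * corrected      # B_i applied to the corrected image
        # Average per patch (p x 3), then stack the R, G, B channels into a 3p column.
        columns.append(patch_means(shaded, labels, p).reshape(-1, order="F"))
    R_p = np.stack(columns, axis=1)               # 3p x k
    h_p = H_p.reshape(-1, order="F")              # reference values, stacked the same way
    return np.linalg.pinv(R_p) @ h_p              # c = [R_p]^+ h_p

# Synthetic, noise-free check (all values hypothetical).
rng = np.random.default_rng(0)
p, per_patch = 6, 10
labels = np.repeat(np.arange(p), per_patch)
cx = np.array([0.1, 0.5, 0.9, 0.1, 0.5, 0.9])    # patch centres: two rows of three patches
cy = np.array([0.1, 0.1, 0.1, 0.9, 0.9, 0.9])
xs = cx[labels] + rng.uniform(-0.05, 0.05, labels.size)
ys = cy[labels] + rng.uniform(-0.05, 0.05, labels.size)
B_w = np.stack([np.ones_like(xs), xs - 0.5, ys - 0.5], axis=1)   # k = 3 basis diagonals
c_true = np.array([1.0, 0.3, -0.2])
M = np.array([[0.90, 0.10, 0.00],
              [0.05, 0.85, 0.10],
              [0.00, 0.10, 0.90]])
H_p = rng.uniform(0.2, 1.0, size=(p, 3))         # reference patch values
d_true = B_w @ c_true
I_w = (H_p[labels] / d_true[:, None]) @ np.linalg.inv(M)   # so that D I_w M = H exactly

c_est = solve_shading_coeffs(I_w, B_w, M, H_p, labels)
d_est = B_w @ c_est                               # step 4: diagonal of D from Equation 2
```

Because the synthetic pixels are constructed so that Equation 1 holds exactly, the regression recovers the coefficients used to generate them.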
[0089] If we instead assume that we already know the shading field D, we can find the shading corrected image DI. Again, taking an average over the known barcode patches, a p×3 matrix of RGB values, $Q^p$, is formed. Using a least squares regression, M is determined:
$$M = (Q^p)^{+} H^p$$
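The regression for M given a known shading field can be sketched as below. The helper names `patch_means` and `solve_color_matrix`, the `labels` array and the synthetic data are assumptions made for this sketch only.

```python
import numpy as np

def patch_means(values, labels, p):
    """Average the rows of an (N', 3) array over each of p patches (labels in 0..p-1)."""
    sums = np.zeros((p, values.shape[1]))
    np.add.at(sums, labels, values)
    counts = np.bincount(labels, minlength=p)
    return sums / counts[:, None]

def solve_color_matrix(I_w, d_w, H_p, labels):
    """Return M = (Q_p)^+ H_p, where Q_p holds the patch means of the shading corrected image."""
    p = H_p.shape[0]
    Q_p = patch_means(d_w[:, None] * I_w, labels, p)   # p x 3 patch means of D I
    return np.linalg.pinv(Q_p) @ H_p                   # 3 x 3 least squares solution

# Synthetic, noise-free check (all values hypothetical).
rng = np.random.default_rng(1)
p = 6
labels = np.repeat(np.arange(p), 8)
d_w = rng.uniform(0.7, 1.3, labels.size)               # known per-pixel shading factors
M_true = np.array([[0.90, 0.10, 0.00],
                   [0.05, 0.85, 0.10],
                   [0.00, 0.10, 0.90]])
H_p = rng.uniform(0.2, 1.0, size=(p, 3))               # reference patch values
I_w = (H_p[labels] @ np.linalg.inv(M_true)) / d_w[:, None]

M_est = solve_color_matrix(I_w, d_w, H_p, labels)
```

With at least three patches of linearly independent color, $Q^p$ has full column rank and the 3×3 matrix is uniquely determined.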
[0090] In reality, neither D nor M is known independently of the other, so we have to iterate (solve for D, then M, then D again, and so on) until convergence. This estimation procedure is called ‘Alternating Least-squares (ALS)’ (see, for example, G. D. Finlayson, M. Mohammadzadeh Darrodi and M. Mackiewicz “The alternating least squares technique for nonuniform intensity color correction” Color Research and Application 40 3 (2015) which is incorporated herein by reference). There are various other bilinear optimisations (of which the ALS formulation is one), including RANSAC (see G. D. Finlayson, H. Gong and R. B. Fisher, “Color Homography: Theory and Applications” IEEE Transactions on Pattern Analysis and Machine Intelligence 41 1 (2019) which is incorporated herein by reference). Importantly, even though D and M are iteratively computed only for the p patch colours, these matrices are then applied across the entire image. That is, the vector c calculated in Step 3 above is used to compute the per pixel shading correction (substituted into Equation 2 above).
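The alternation between the two regressions might be sketched as follows. This is not the implementation of the cited references; it is a minimal illustration in which the function name `als_color_shading`, the identity starting value for M, the stopping tolerance `tol` and the synthetic data are all assumptions. The sketch also uses the fact that patch averaging is linear, so the patch means of each basis-weighted image are computed once and reused.

```python
import numpy as np

def patch_means(values, labels, p):
    """Average the rows of an (N', 3) array over each of p patches (labels in 0..p-1)."""
    sums = np.zeros((p, values.shape[1]))
    np.add.at(sums, labels, values)
    return sums / np.bincount(labels, minlength=p)[:, None]

def als_color_shading(I_w, B_w, H_p, labels, max_iter=200, tol=1e-12):
    """Alternate least squares solves for the shading coefficients c and the matrix M."""
    p, k = H_p.shape[0], B_w.shape[1]
    # A[i] = patch means of B_i I_w (p x 3); patch means of B_i I_w M are then A[i] @ M.
    A = np.stack([patch_means(B_w[:, i:i + 1] * I_w, labels, p) for i in range(k)])
    h_p = H_p.reshape(-1, order="F")

    def solve_c(M):
        R_p = np.stack([(A[i] @ M).reshape(-1, order="F") for i in range(k)], axis=1)
        return np.linalg.pinv(R_p) @ h_p                 # c = [R_p]^+ h_p

    def solve_M(c):
        Q_p = np.tensordot(c, A, axes=1)                 # patch means of D I_w
        return np.linalg.pinv(Q_p) @ H_p                 # M = (Q_p)^+ H_p

    def fit_error(c, M):
        return float(np.sum((np.tensordot(c, A, axes=1) @ M - H_p) ** 2))

    M = np.eye(3)                                        # assumed starting point
    c = solve_c(M)
    errors = [fit_error(c, M)]
    for _ in range(max_iter):
        M = solve_M(c)                                   # hold D fixed, solve for M
        c = solve_c(M)                                   # hold M fixed, solve for D
        errors.append(fit_error(c, M))
        if errors[-2] - errors[-1] < tol:                # change in fit below criterion
            break
    return c, M, errors

# Synthetic, noise-free data (all values hypothetical).
rng = np.random.default_rng(0)
p = 6
labels = np.repeat(np.arange(p), 10)
cx = np.array([0.1, 0.5, 0.9, 0.1, 0.5, 0.9])
cy = np.array([0.1, 0.1, 0.1, 0.9, 0.9, 0.9])
xs = cx[labels] + rng.uniform(-0.05, 0.05, labels.size)
ys = cy[labels] + rng.uniform(-0.05, 0.05, labels.size)
B_w = np.stack([np.ones_like(xs), xs - 0.5, ys - 0.5], axis=1)
M_true = np.array([[0.9, 0.1, 0.0], [0.05, 0.85, 0.1], [0.0, 0.1, 0.9]])
H_p = rng.uniform(0.2, 1.0, size=(p, 3))
d_true = B_w @ np.array([1.0, 0.3, -0.2])
I_w = (H_p[labels] / d_true[:, None]) @ np.linalg.inv(M_true)

c_est, M_est, errors = als_color_shading(I_w, B_w, H_p, labels)
```

Each half-step exactly minimises the same patch-level fit error with the other variable held fixed, so the recorded error cannot increase from one iteration to the next. Note that D and M are only determined up to a reciprocal scale factor, since scaling D by a and M by 1/a leaves the product DIM unchanged.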
[0091] A key advantage of the approach set forth above is that it is entirely agnostic about the size of the test samples. One might envisage a manufacturer changing the size of the biomarker test patches but this change would not change the colour calibration already deployed. This said, for a fixed test patch design the above method can also treat the pixels in the background (not in the biomarker sites and not the colour patches) as additional information useful in determining the colour calibration.
[0092] Worked example 1
[0093] 1. Start with an input image (
[0094] 2. As an example, use the first three discrete cosine transformation (DCT) basis functions to model the shading (
[0095] 3. Select only the barcode reference patches—those parts of the image for which there are known XYZ values available, the ω subset above—and utilize, for example, an Alternating Least Squares (ALS) approach, to determine both a mapping and shading correction based on these regions alone (
[0096] 4. Obtain a final mapping M, and a shading correction field D (result pictured here for the example input image shown in step 1). Note that the shading correction field, despite being determined from the isolated barcode patches shown in step 3, covers the whole image including the biomarker sites (
[0097] 5. Apply mapping and shading correction to the sites, obtaining results which are more shading and color corrected and thus more accurate. As a final step the corrected site values are converted to biomarker concentration levels and presented to the user (
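Applying the solved mapping and shading correction to the whole image, including the biomarker sites, might be sketched as follows. The function names `correct_image` and `site_mean`, the two-term shading basis, the mask location and all numerical values are hypothetical. The final conversion of corrected site values to biomarker concentration levels is not shown, as its form is not specified here.

```python
import numpy as np

def correct_image(img, B_full, c, M):
    """Apply H = D I M to every pixel, where the diagonal of D is B_full @ c."""
    m, n, _ = img.shape
    d = B_full @ c                                   # per-pixel shading correction
    H = d[:, None] * (img.reshape(-1, 3) @ M)        # N x 3 corrected pixels
    return H.reshape(m, n, 3)

def site_mean(corrected, site_mask):
    """Mean corrected value over the pixels of one biomarker site."""
    return corrected[site_mask].mean(axis=0)

# Synthetic example: a uniformly colored card seen under a left-to-right shading ramp.
m, n = 6, 8
x = np.tile(np.linspace(-0.5, 0.5, n), m)            # column coordinate of each pixel
B_full = np.stack([np.ones(m * n), x], axis=1)       # basis diagonals over the WHOLE image
c = np.array([1.0, 0.4])                             # coefficients, as if solved from patches
M = np.array([[0.9, 0.1, 0.0], [0.05, 0.85, 0.1], [0.0, 0.1, 0.9]])
true_color = np.array([0.3, 0.5, 0.2])
d = B_full @ c
img = ((true_color @ np.linalg.inv(M))[None, :] / d[:, None]).reshape(m, n, 3)

site_mask = np.zeros((m, n), dtype=bool)
site_mask[2:4, 3:5] = True                           # hypothetical biomarker site

corrected = correct_image(img, B_full, c, M)
site_color = site_mean(corrected, site_mask)
```

In this synthetic case the site pixels differ from one another before correction because of the shading ramp, and agree with the reference color afterwards.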
[0098] Worked example 2:
[0099] In the above example, biomarker sites take up a relatively small proportion of the space between barcode pairs: barcodes take up ~40% of the image and the biomarker sites just ~7%. One could consider using the blank space between sites to aid shading correction. In this example the biomarker sites are significantly enlarged to take up ~27% of the image. As above, the barcodes alone are used to determine the mapping and shading correction, and so the method still works despite the roughly fourfold increase in biomarker site size. This highlights the flexibility of the approach to work with different test designs and using very little real estate for calibration.
[0100] 1. A modified version of the biomarker test design is used, with enlarged biomarker sites. Again, there is shading present across the image meaning that biomarker sites which should in this example have the same color do not (
[0101] 2. For consistency with the previous example, the first 3 DCT basis functions are used to model the shading.
[0102] 3. Barcodes are extracted from the image, affected by the shading (
[0103] 4. The mapping and shading correction, shown in
[0104] 5. After shading correction, the sites have more similar colors resulting in more accurate biomarker predictions (