METHOD FOR CONFIDENT REGISTRATION-BASED NON-UNIFORMITY CORRECTION USING SPATIO-TEMPORAL UPDATE MASK

Abstract

A scene-based non-uniformity correction method that reduces fixed-pattern noise and eliminates ghosting artifacts through robust parameter updates, based on confident inter-frame registration and spatio-temporally consistent correction coefficients. The method includes the steps of: assessing whether an input image frame has sufficient scene detail for registration, to prevent false registrations originating from low-detail scenes; calculating horizontal and vertical translations between frames to find a shift; introducing a scene-adaptive registration quality metric to eliminate erroneous parameter updates resulting from unreliable registrations; and applying a Gaussian mixture model (GMM)-based temporal consistency restriction to mask out unstable updates in non-uniformity correction parameters.

Claims

1. A method for a scene-based non-uniformity correction to achieve a fixed-pattern noise reduction and eliminate ghosting artifacts in an infrared imagery, comprising the steps of: assessing an input image frame whether the input image frame has a sufficient scene detail for registrations to prevent false registrations originated from low-detail scenes, calculating horizontal and vertical translations between frames to find a shift, introducing a scene-adaptive registration quality metric to eliminate erroneous parameter updates resulting from unreliable registrations, applying a Gaussian mixture model (GMM)-based temporal consistency restriction to mask out unstable updates in non-uniformity correction parameters.

2. The method according to claim 1, wherein an observation model for a fixed pattern noise on an infrared image is:
$y_{n}(i,j) = x_{n}(i,j) \cdot a_{n}(i,j) + b_{n}(i,j)$ wherein the fixed pattern noise on the infrared image consists of multiplicative gain terms and additive offset terms to be found for estimating and correcting the fixed pattern noise to obtain an uncorrupted clean image, where $x_{n}$ is a true response for an $n^{\text{th}}$ frame, $y_{n}$ is an associated detector output signal, $(i, j)$ is pixel coordinates and $a$ and $b$ are gain and offset coefficients.

3. The method according to claim 2, comprising the steps of performing an inter-frame registration with an adaptive frame delay and using a least mean squares (LMS) minimization for estimating the gain and offset coefficients.

4. The method according to claim 3, comprising the step of updating the observation model as a minimization of the LMS problem:
$a_{n+1}(p,q) = a_{n}(p,q) - \eta_{1} \cdot e_{n}(p,q) \cdot y_{n}(p,q) \cdot \xi_{st_{n}}(p,q)$
$b_{n+1}(p,q) = b_{n}(p,q) - \eta_{2} \cdot e_{n}(p,q) \cdot \xi_{st_{n}}(p,q)$ where subscripts $n$ and $n+1$ represent current and next frame indices, $(p, q)$ represents an index range within a registered overlap region between a delayed image and a current image, $\eta_{1}$ and $\eta_{2}$ are learn rates for gain and offset updates, $e_{n}$ is an error image between a corrected current frame and a delayed frame, $y_{n}$ is a current input frame and $\xi_{st_{n}}$ is a spatio-temporal update mask, wherein the spatio-temporal update mask determines pixel locations, wherein the observation model is configured to be updated at the pixel locations.

5. The method according to claim 1, wherein a sufficiency of scene detail is checked by applying a Sobel kernel as an edge filter to extract an average magnitude of edge information in images and comparing the edge information with a threshold.

6. The method according to claim 1, wherein a scene detail magnitude, $\Phi_{n}$, is calculated as: $\Phi_{n} = \frac{1}{WH} \left( \| x_{n} * h \|_{2}^{2} + \| x_{n} * h^{T} \|_{2}^{2} \right)$ where $W$ is a frame width, $H$ is a frame height, $h$ is a horizontal edge filter kernel and $*$ is a discrete convolution operation.

7. The method according to claim 1, wherein the horizontal and vertical translations between the frames are calculated by using 1-D horizontal and vertical projections of edge maps generated from original frames using an edge extraction filter and matching 1-D projection vectors using a cross-correlation.

8. The method according to claim 7, wherein the 1-D horizontal and vertical projections are calculated with the equations: $P_{n}^{x}(j) = \sum_{i=1}^{W} E_{n}(i,j), \quad P_{n}^{y}(i) = \sum_{j=1}^{H} E_{n}(i,j)$ where $E_{n}$ represents an edge image calculated as: $E_{n}(i,j) = \left( \sum_{k=-r}^{r} \sum_{l=-r}^{r} x_{n}(i-k, j-l) \cdot h(k,l) \right)^{2} + \left( \sum_{k=-r}^{r} \sum_{l=-r}^{r} x_{n}(i-k, j-l) \cdot h^{T}(k,l) \right)^{2}$ where $h$ is an edge filter of size r×r used in a scene detail calculation.

9. The method according to claim 8, wherein the shifts between current and delayed images in x and y directions are calculated with equations: $\delta_{x} = \underset{i}{\arg\max}\ \gamma_{x}(i)$ $\delta_{y} = \underset{j}{\arg\max}\ \gamma_{y}(j)$ where the normalized cross correlations of the 1-D projection vectors are calculated as: $\gamma_{x}(i) \triangleq \sum_{x} \frac{\left( P_{n-m}^{y}(i+x) - \overline{P_{n-m}^{y}} \right) \cdot \left( P_{n}^{y}(x) - \overline{P_{n}^{y}} \right)}{\sigma_{P_{n}^{y}} \cdot \sigma_{P_{n-m}^{y}}}$ $\gamma_{y}(j) \triangleq \sum_{y} \frac{\left( P_{n-m}^{x}(j+y) - \overline{P_{n-m}^{x}} \right) \cdot \left( P_{n}^{x}(y) - \overline{P_{n}^{x}} \right)}{\sigma_{P_{n}^{x}} \cdot \sigma_{P_{n-m}^{x}}}$ where $\overline{P_{n}^{x}}$ and $\overline{P_{n}^{y}}$ are the means and $\sigma_{P_{n}^{x}}$ and $\sigma_{P_{n}^{y}}$ are standard deviations of the 1-D projection vectors $P_{n}^{x}(j)$ and $P_{n}^{y}(i)$, respectively.

10. The method according to claim 6, wherein an adaptive metric showing a relative registration quality with respect to scene detail magnitudes is defined as: $\psi_{n} = \frac{\Phi_{n}}{\frac{1}{WH} \sum_{i,j} \left| E_{n}(i,j) - E_{n-1}(i+\delta_{x}, j+\delta_{y}) \right|}$ and the registrations with a lower quality measure than a certain threshold are discarded at a parameter update step.

11. The method according to claim 1, wherein an error calculation is performed on corrected signals as:
$e_{n}(i,j) = x_{n-m}(i+\delta_{x}, j+\delta_{y}) - x_{n}(i,j)$

12. The method according to claim 11, comprising the step of masking out outlier error values on a calculated error map to avoid erroneous updates in addition to an elimination of global registration errors by using the equation: $\xi_{s_{n}}(p,q) = \begin{cases} 1 & \text{if } \left| e_{n}(p,q) - \mu_{e_{n}} \right| < c \cdot \sigma_{e_{n}} \\ 0 & \text{otherwise} \end{cases}$ where $c$ is a constant for distance, wherein the constant for the distance determines an outlier threshold, and $\mu_{e_{n}}$ and $\sigma_{e_{n}}$ are a mean and a standard deviation of an error map, respectively.

13. The method according to claim 4, wherein a GMM-based update is applied by using equations: $\begin{matrix} \Delta(i,j) = e_{n}(i,j) - \mu_{GMM_{n}}(i,j) \\ \xi_{st_{n}}(i,j) = \begin{cases} \xi_{s_{n}}(i,j) & \text{if } \Delta(i,j)^{2} < \lambda \cdot \sigma_{GMM_{n}}^{2}(i,j) \\ 0 & \text{otherwise} \end{cases} \\ \mu_{GMM_{n+1}}(i,j) = \mu_{GMM_{n}}(i,j) + \alpha_{1} \cdot \Delta(i,j) \cdot \xi_{st_{n}}(i,j) \\ \sigma_{GMM_{n+1}}^{2}(i,j) = \sigma_{GMM_{n}}^{2}(i,j) + \alpha_{2} \left[ \Delta(i,j)^{2} - \sigma_{GMM_{n}}^{2}(i,j) \right] \cdot \xi_{st_{n}}(i,j) \end{matrix}$ where $\xi_{st_{n}}$ represents the spatio-temporal update mask computed using a GMM model, $\mu_{GMM}$ and $\sigma_{GMM}^{2}$ are a mean image and a variance image kept by the GMM, $\xi_{s_{n}}$ states spatially masked regions of a first input image frame, $\lambda$ is a temporal variance threshold coefficient, $\alpha_{1}$ is an update learn rate for the mean image kept by the GMM and $\alpha_{2}$ is the update learn rate for the variance image kept by the GMM.

Description

BRIEF DESCRIPTION OF THE DRAWINGS

[0009] FIG. 1 shows a block diagram of the present invention.

[0010] FIG. 2 shows the PSNR performance of the evaluated methods. The titles of the plots represent the image sequence names as given in the original ETH-ASL Thermal Image Dataset [17].

[0011] FIG. 3 shows ghosting artifacts comparison among the evaluated methods executed on a dataset.

[0012] FIG. 4 shows ghosting artifacts visible in the error images that belong to the frames shown in FIG. 3.

[0013] FIG. 5 shows comparison of ghosting artifacts among the evaluated methods executed on the ETH-ASL dataset [17].

DETAILED DESCRIPTION OF THE EMBODIMENTS

[0014] The present invention relates to a confident image registration based non-uniformity correction algorithm that utilizes a spatio-temporal update mask. A block diagram demonstrating the flow of the algorithm is given in FIG. 1.

1. Observation Model

[0015] In order to model the response of infrared detector pixels, the common linear model approach is utilized [1]. According to this model, the observed output signal is represented by a linear function of true response as given in (1):

$\begin{matrix} y_{n}(i,j) = x_{n}(i,j) \cdot a_{n}(i,j) + b_{n}(i,j) & (1) \end{matrix}$

where $x_{n}$ is the true response for the $n^{\text{th}}$ frame, $y_{n}$ is the associated detector output signal, $(i, j)$ is the pixel coordinates and $a$ and $b$ are the gain and offset coefficients.

[0016] Based on this model, true response can also be written as a linear function of the detector output as shown in (2):

$\begin{matrix} x_{n}(i,j) = y_{n}(i,j) \cdot g_{n}(i,j) + o_{n}(i,j) & (2) \end{matrix}$

where the backward gain $g_{n}(i,j)$ is equal to $1/a_{n}(i,j)$ and the backward offset $o_{n}(i,j)$ is equal to $-b_{n}(i,j)/a_{n}(i,j)$, as obtained by solving (1) for $x_{n}$.
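The relation between the forward model (1) and the backward model (2) can be checked numerically. The following is a minimal sketch, assuming NumPy arrays for the frames; the function names `observe` and `correct` and the random test data are illustrative and do not come from the original text.

```python
import numpy as np

def observe(x, a, b):
    """Forward model (1): detector output y = x * a + b."""
    return x * a + b

def correct(y, a, b):
    """Backward model (2): x = y * g + o with g = 1/a and o = -b/a."""
    g = 1.0 / a
    o = -b / a
    return y * g + o

rng = np.random.default_rng(0)
x = rng.uniform(100.0, 200.0, size=(4, 5))   # hypothetical true response
a = rng.normal(1.0, 0.05, size=x.shape)      # hypothetical per-pixel gain
b = rng.normal(0.0, 2.0, size=x.shape)       # hypothetical per-pixel offset

y = observe(x, a, b)
x_rec = correct(y, a, b)                     # recovers x when a and b are known
```

In practice the gain and offset are unknown, which is why the method estimates them iteratively.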

[0017] According to this model, the observed signal is corrected by using the backward gain and offset coefficients. These coefficients can be estimated using different techniques. In the proposed method, inter-frame registration is performed with adaptive frame delay and least mean squares (LMS) minimization is used for estimating the gain and offset coefficients.

2. Registration

[0018] In frame registration based non-uniformity correction approaches, the main assumption is that the different pixels must produce the same response to the same scene data. Since the fixed pattern noise is a slowly varying signal, this assumption can be used for estimating the correction coefficients iteratively using LMS minimization [8]. For this purpose, the first step is to find the correct registration between the frames.

[0019] Linear model parameters are initialized by setting gain values to 1 and offset values to 0. After obtaining the corrected signals $x_{n}$ and $x_{n-m}$, the horizontal shift $\delta_{x}$ and the vertical shift $\delta_{y}$ between these signals are calculated. In the registration block, the image detail magnitude $\Phi_{n}$ and the registration quality value $\psi_{n}$ are calculated for ignoring the problematic registrations caused by various reasons such as lack of scene details, motion blur and camera rotation. In other words, the correction parameters are updated only when a confident registration is achieved. This selective update approach is an important process for eliminating ghosting artifacts. Details of each sub-block are given in the following subsections.

2.1. Scene Detail Calculation

[0020] In order to achieve a reliable registration, sufficient scene detail must be present. Otherwise, matching can be erroneous due to low signal-to-noise ratio. Therefore, if the newly arrived frame does not contain sufficient scene detail ($\Phi_{n} < T_{\Phi}$), the registration and the other parameter update steps are skipped to ensure a robust registration quality.

[0021] For a given frame, scene detail magnitude $\Phi_{n}$ is calculated as given in (3):

[00001] $\begin{matrix} \Phi_{n} = \frac{1}{WH} \left( \| x_{n} * h \|_{2}^{2} + \| x_{n} * h^{T} \|_{2}^{2} \right) & (3) \end{matrix}$

where $W$ is the frame width, $H$ is the frame height, $h$ is the horizontal edge filter kernel and $*$ is the discrete convolution operation. In our implementation, a 9×9 extension of the horizontal Sobel kernel is used as the edge filter.
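The scene detail check can be sketched as follows. This is only an illustration under stated assumptions: it uses the standard 3×3 Sobel kernel rather than the 9×9 extension mentioned above, edge-replicating padding at the borders, and a hypothetical threshold value `T_PHI`; none of these choices are specified by the original text.

```python
import numpy as np

# Standard 3x3 horizontal Sobel kernel (stand-in for the 9x9 extension).
SOBEL_3X3 = np.array([[-1.0, 0.0, 1.0],
                      [-2.0, 0.0, 2.0],
                      [-1.0, 0.0, 1.0]])

def conv2_same(img, kernel):
    """Discrete 2-D convolution, output the same size as img."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    # Edge replication is an assumption; it keeps flat frames at zero detail.
    padded = np.pad(img, ((ph, ph), (pw, pw)), mode="edge")
    flipped = kernel[::-1, ::-1]              # convolution flips the kernel
    out = np.zeros(img.shape, dtype=float)
    for u in range(kh):
        for v in range(kw):
            out += flipped[u, v] * padded[u:u + img.shape[0], v:v + img.shape[1]]
    return out

def scene_detail(frame, h=SOBEL_3X3):
    """Mean squared response of the edge filter and its transpose, as in (3)."""
    H, W = frame.shape
    gx = conv2_same(frame.astype(float), h)
    gy = conv2_same(frame.astype(float), h.T)
    return (np.sum(gx ** 2) + np.sum(gy ** 2)) / (W * H)

flat = np.full((8, 8), 5.0)                   # no detail
step = np.zeros((8, 8))
step[:, 4:] = 100.0                           # one strong vertical edge

T_PHI = 1.0                                   # hypothetical threshold
has_detail = scene_detail(step) >= T_PHI
```

A frame such as `flat` would be skipped, while `step` would proceed to registration.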

2.2. Inter-Frame Shift Calculation

[0022] Instead of using 1-D projections of intensity values for shift calculation as in [13], we employ 1-D projections of edge image which provides more robust registration performance in scenes with rich edge content. Since the edge image is already calculated in (3) for detail calculation, we efficiently obtain the 1-D projections in vertical and horizontal directions as shown in (4):

[00002] $\begin{matrix} P_{n}^{x}(j) = \sum_{i=1}^{W} E_{n}(i,j), \quad P_{n}^{y}(i) = \sum_{j=1}^{H} E_{n}(i,j) & (4) \end{matrix}$

where $E_{n}$ represents the edge image calculated as shown in (5):

[00003] $\begin{matrix} E_{n}(i,j) = \left( \sum_{k=-r}^{r} \sum_{l=-r}^{r} x_{n}(i-k, j-l) \cdot h(k,l) \right)^{2} + \left( \sum_{k=-r}^{r} \sum_{l=-r}^{r} x_{n}(i-k, j-l) \cdot h^{T}(k,l) \right)^{2} & (5) \end{matrix}$

where $h$ is the edge filter of size r×r used in Scene Detail Calculation.
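The projections in (4) reduce a 2-D edge image to two 1-D vectors. The sketch below assumes frames stored as NumPy arrays indexed `[row, column]`, so that `i` runs over columns and `j` over rows; for brevity it swaps in central differences (`np.gradient`) for the Sobel-type kernel, which is a simplification and not the filter described above.

```python
import numpy as np

def edge_image(frame):
    """Squared gradient magnitude, standing in for E_n in (5)."""
    gy, gx = np.gradient(frame.astype(float))   # derivatives along rows, columns
    return gx ** 2 + gy ** 2

def projections(edge):
    """Return (P^x, P^y) as in (4)."""
    p_x = edge.sum(axis=1)   # P^x(j): sum over columns i, one value per row j
    p_y = edge.sum(axis=0)   # P^y(i): sum over rows j, one value per column i
    return p_x, p_y

frame = np.zeros((8, 12))
frame[:, 6] = 1.0                               # thin vertical bar at column 6
p_x, p_y = projections(edge_image(frame))
```

Here `p_y` peaks on the columns next to the bar, which is the kind of structure the cross-correlation step matches between frames.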

[0023] In order to calculate the shift between the current and delayed images, the normalized cross correlation of the projection vectors is calculated as shown in (6):

[00004] $\begin{matrix} \gamma_{x}(i) \triangleq \sum_{x} \frac{\left( P_{n-m}^{y}(i+x) - \overline{P_{n-m}^{y}} \right) \cdot \left( P_{n}^{y}(x) - \overline{P_{n}^{y}} \right)}{\sigma_{P_{n}^{y}} \cdot \sigma_{P_{n-m}^{y}}}, \quad \gamma_{y}(j) \triangleq \sum_{y} \frac{\left( P_{n-m}^{x}(j+y) - \overline{P_{n-m}^{x}} \right) \cdot \left( P_{n}^{x}(y) - \overline{P_{n}^{x}} \right)}{\sigma_{P_{n}^{x}} \cdot \sigma_{P_{n-m}^{x}}} & (6) \end{matrix}$

where $\overline{P_{n}^{x}}$ and $\overline{P_{n}^{y}}$ are the means and $\sigma_{P_{n}^{x}}$ and $\sigma_{P_{n}^{y}}$ are the standard deviations of the projection vectors $P_{n}^{x}(j)$ and $P_{n}^{y}(i)$, respectively. Similar definitions apply for the projection vectors for the delayed frame as well.

[0024] The shifts in the x and y directions are then retrieved as shown in (7):

[00005] $\begin{matrix} \delta_{x} = \underset{i}{\arg\max}\ \gamma_{x}(i), \quad \delta_{y} = \underset{j}{\arg\max}\ \gamma_{y}(j) & (7) \end{matrix}$
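The shift search in (6) and (7) can be sketched for a single pair of projection vectors. The function name `ncc_shift`, the bounded search range `max_shift`, the small constant guarding against zero variance, and the restriction of the sum to the valid overlap are all assumptions made for this illustration.

```python
import numpy as np

def ncc_shift(p_cur, p_prev, max_shift):
    """Lag s maximising sum_x (p_prev[x+s] - mean)(p_cur[x] - mean) / (std * std)."""
    a = (p_prev - p_prev.mean()) / (p_prev.std() + 1e-12)
    b = (p_cur - p_cur.mean()) / (p_cur.std() + 1e-12)
    n = len(b)
    best_s, best_val = 0, -np.inf
    for s in range(-max_shift, max_shift + 1):
        lo, hi = max(0, -s), min(n, n - s)      # x range where x and x+s are valid
        val = np.sum(a[lo + s:hi + s] * b[lo:hi])
        if val > best_val:
            best_s, best_val = s, val
    return best_s

x = np.arange(64)
p_prev = np.exp(-((x - 20.0) ** 2) / 8.0)       # projection of the delayed frame
p_cur = np.roll(p_prev, -3)                     # same content, moved by 3 samples
shift = ncc_shift(p_cur, p_prev, max_shift=8)
```

Running the same routine on the other pair of projection vectors gives the shift in the second direction.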

2.3. Registration Quality Assessment

[0025] One of the major sources of the erroneous model updates is the registration errors that stem from the violation of pure translation assumption. To overcome such problems, we utilize a registration quality measure that discards the frame pairs which have relatively large global registration error compared to scene detail magnitude. In other words, we introduce an adaptive metric that shows relative registration quality with respect to scene detail magnitude. The registration quality measure is defined as shown in (8):

[00006] $\begin{matrix} \psi_{n} = \frac{\Phi_{n}}{\frac{1}{WH} \sum_{i,j} \left| E_{n}(i,j) - E_{n-1}(i+\delta_{x}, j+\delta_{y}) \right|} & (8) \end{matrix}$

[0026] This metric favors the registrations where the error magnitude is a small fraction of scene detail magnitude. Registrations with lower quality measure than a certain threshold $T_{v}$ are discarded at the parameter update step.
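The quality measure in (8) can be illustrated with a short sketch. It assumes the edge images are NumPy arrays indexed `[row, column]`, evaluates the difference only where the shifted frames overlap, and adds a small constant to avoid division by zero; the helper names are hypothetical.

```python
import numpy as np

def overlap_slices(n, s):
    """Slices (cur, ref) of length n - |s| such that ref index = cur index + s."""
    if s >= 0:
        return slice(0, n - s), slice(s, n)
    return slice(-s, n), slice(0, n + s)

def registration_quality(edge_cur, edge_prev, dx, dy, detail):
    """Scene detail divided by the mean absolute edge-map difference, as in (8)."""
    H, W = edge_cur.shape
    rows_c, rows_p = overlap_slices(H, dy)
    cols_c, cols_p = overlap_slices(W, dx)
    diff = np.abs(edge_cur[rows_c, cols_c] - edge_prev[rows_p, cols_p])
    err = diff.sum() / (W * H)
    return detail / (err + 1e-12)

rng = np.random.default_rng(1)
edge_prev = rng.random((20, 24))
# Current edge map satisfies edge_cur[r, c] = edge_prev[r + 2, c + 3] in the overlap.
edge_cur = np.roll(edge_prev, shift=(-2, -3), axis=(0, 1))

q_good = registration_quality(edge_cur, edge_prev, dx=3, dy=2, detail=1.0)
q_bad = registration_quality(edge_cur, edge_prev, dx=0, dy=0, detail=1.0)
```

With the correct shift the error term vanishes and the quality is very large; with a wrong shift it drops, so a threshold can separate the two cases.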

3. Non-Uniformity Correction Model Update

[0027] In general, the observation model discussed above is updated as the minimization of the LMS problem stated in [5] as given in (9):

$a_{n+1}(p,q) = a_{n}(p,q) - \eta_{1} \cdot e_{n}(p,q) \cdot y_{n}(p,q) \cdot \xi_{st_{n}}(p,q)$

$\begin{matrix} b_{n+1}(p,q) = b_{n}(p,q) - \eta_{2} \cdot e_{n}(p,q) \cdot \xi_{st_{n}}(p,q) & (9) \end{matrix}$

where subscripts $n$ and $n+1$ represent the current and next frame indices, $(p, q)$ represents the index range within the registered overlap region between the delayed and current image, $\eta_{1}$ and $\eta_{2}$ are the learn rates for the gain and offset updates, $e_{n}$ is the error image between the corrected current frame and the delayed frame, $y_{n}$ is the current input frame and $\xi_{st_{n}}$ is the spatio-temporal update mask which determines pixel locations at which the observation model will be updated.
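A masked LMS step of the form in (9) can be sketched as below. The subtraction sign and the symbol names `eta1` and `eta2` for the learn rates are assumptions of this sketch, since the operators and Greek symbols are not legible in the source text; the arrays are assumed to be already cropped to the registered overlap region.

```python
import numpy as np

def lms_update(a, b, e, y, mask, eta1, eta2):
    """One masked LMS step for gain a and offset b (sign convention assumed)."""
    a_next = a - eta1 * e * y * mask     # gain step scales with the input frame
    b_next = b - eta2 * e * mask         # offset step depends on the error only
    return a_next, b_next

a = np.ones((2, 2))
b = np.zeros((2, 2))
e = np.array([[1.0, -2.0], [0.5, 0.0]])      # hypothetical error map
y = np.full((2, 2), 10.0)                    # hypothetical input frame
mask = np.array([[1.0, 0.0], [1.0, 1.0]])    # 0 marks pixels excluded from update

a_next, b_next = lms_update(a, b, e, y, mask, eta1=1e-3, eta2=1e-1)
```

Pixels where the mask is zero keep their previous coefficients, which is how the spatio-temporal mask suppresses unreliable updates.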

[0028] It is noted that there is a 2-pass update for the model which exploits the fact that the registered overlapping regions of the delayed and current frames coincide with different pixel locations of the FPA. Thus, it is possible to update the gain and offset coefficients for two pixel locations instead of one by repeating the update step in both directions.

[0029] The calculation of the error map $e_{n}$ and the spatio-temporal update mask $\xi_{st_{n}}$ will be detailed in further sections.

3.1. Error Map Calculation

[0030] The error map yields the error between the intensities read from different FPA pixel locations for the same scene content assuming the registration error is sufficiently small. This information provides the difference between the FPA element responses at different locations. The error calculation is performed on the corrected signals as given in (10):

$\begin{matrix} e_{n}(i,j) = x_{n-m}(i+\delta_{x}, j+\delta_{y}) - x_{n}(i,j) & (10) \end{matrix}$
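The error map in (10) is only defined where the two shifted frames overlap. The following sketch, assuming NumPy arrays indexed `[row, column]` and integer shifts, returns the error map together with the overlap region in current-frame coordinates; the function name and the toy scene are illustrative.

```python
import numpy as np

def error_map(x_cur, x_prev, dx, dy):
    """e(i, j) = x_prev(i + dx, j + dy) - x_cur(i, j) over the overlap region."""
    H, W = x_cur.shape
    r0, r1 = max(0, -dy), min(H, H - dy)     # valid rows in the current frame
    c0, c1 = max(0, -dx), min(W, W - dx)     # valid columns in the current frame
    e = x_prev[r0 + dy:r1 + dy, c0 + dx:c1 + dx] - x_cur[r0:r1, c0:c1]
    return e, (slice(r0, r1), slice(c0, c1))

scene = np.arange(80.0).reshape(8, 10)       # hypothetical noise-free scene
x_prev = scene[0:6, 0:8]                     # delayed frame
x_cur = scene[1:7, 2:10]                     # camera moved by dx = 2, dy = 1

e, region = error_map(x_cur, x_prev, dx=2, dy=1)
```

With no fixed pattern noise and a correct shift the error map is zero; any residual is attributed to differing pixel responses or to registration errors.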

3.2. Spatial Mask for Model Update

[0031] Although the error maps generated in Section 3.1 are used to find non-uniform responses, not all error values are valid, due to residual registration errors, moving objects in the scene and so forth. As a priori information, the non-uniformity characteristics of the pixels in an FPA do not vary radically unless the pixels are defective [16]. Based on this, erroneous updates can be further avoided in addition to the elimination of global registration errors explained in Section 2.3. This is achieved by masking out the outlier error values on the previously calculated error map as given in (11):

[00007] $\begin{matrix} \xi_{s_{n}}(p,q) = \begin{cases} 1 & \text{if } \left| e_{n}(p,q) - \mu_{e_{n}} \right| < c \cdot \sigma_{e_{n}} \\ 0 & \text{otherwise} \end{cases} & (11) \end{matrix}$

where $c$ is the constant for distance which determines the outlier threshold, and $\mu_{e_{n}}$ and $\sigma_{e_{n}}$ are the mean and standard deviation of the error map, respectively.
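The outlier test in (11) amounts to a threshold on the distance from the error map mean. A minimal sketch, assuming the absolute deviation is compared against `c` standard deviations and using an arbitrary value `c = 2.0`:

```python
import numpy as np

def spatial_mask(e, c=2.0):
    """1 where the error is within c standard deviations of its mean, else 0."""
    mu, sigma = e.mean(), e.std()
    return (np.abs(e - mu) < c * sigma).astype(float)

e = np.zeros((10, 10))
e[::2, ::2] = 0.1          # small, plausible non-uniformity errors
e[3, 4] = 50.0             # outlier, e.g. from a moving object

mask = spatial_mask(e)
```

Only the outlier pixel is excluded here; all other pixels remain eligible for the parameter update.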

3.3. GMM-Based Temporal Mask for Model Update

[0032] Up to this point, we have generated an update mask that avoids global registration errors and spatially inconsistent non-uniformity errors. As a final precaution for erroneous model parameter updates, we exploit the fact that FPN has an approximately constant behavior over time [7]. Thus, any update that would yield radically different model parameters is considered invalid and masked out from calculations. Analogous to the background subtraction domain, we could model residual FPN in the registration error map as fixed background and model other temporally inconsistent errors as the foreground. As a commonly applied solution to background subtraction problem, GMM could be successfully used to detect and eliminate the error components that do not belong to FPN. The GMM update steps are given in the following equations:

[00008] $\begin{matrix} \Delta(i,j) = e_{n}(i,j) - \mu_{GMM_{n}}(i,j) & (12) \\ \xi_{st_{n}}(i,j) = \begin{cases} \xi_{s_{n}}(i,j) & \text{if } \Delta(i,j)^{2} < \lambda \cdot \sigma_{GMM_{n}}^{2}(i,j) \\ 0 & \text{otherwise} \end{cases} & (13) \\ \mu_{GMM_{n+1}}(i,j) = \mu_{GMM_{n}}(i,j) + \alpha_{1} \cdot \Delta(i,j) \cdot \xi_{st_{n}}(i,j) & (14) \\ \sigma_{GMM_{n+1}}^{2}(i,j) = \sigma_{GMM_{n}}^{2}(i,j) + \alpha_{2} \left[ \Delta(i,j)^{2} - \sigma_{GMM_{n}}^{2}(i,j) \right] \cdot \xi_{st_{n}}(i,j) & (15) \end{matrix}$

where $\xi_{st_{n}}$ represents the spatio-temporal update mask computed using the GMM model, and $\mu_{GMM}$ and $\sigma_{GMM}^{2}$ are the mean image and variance image kept by the GMM. They are initialized by using the spatially masked regions ($\xi_{s_{n}}$) of the first input image frame. Also, $\lambda$ is the temporal variance threshold coefficient, $\alpha_{1}$ is the update learn rate for the GMM mean image and $\alpha_{2}$ is the learn rate for the GMM variance image.
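The temporal gating described by (12) to (15) can be sketched as a per-pixel running mean and variance. The symbol names (`lam` for the variance threshold coefficient, `alpha1` and `alpha2` for the learn rates) and the choice to multiply the mean update by the mask are assumptions of this sketch, made so that masked pixels leave the statistics unchanged.

```python
import numpy as np

def gmm_mask_update(e, s_mask, mu, var, lam, alpha1, alpha2):
    """Return the spatio-temporal mask and the updated mean and variance images."""
    delta = e - mu                                              # deviation, (12)
    st_mask = np.where(delta ** 2 < lam * var, s_mask, 0.0)     # gating, (13)
    mu_next = mu + alpha1 * delta * st_mask                     # mean update, (14)
    var_next = var + alpha2 * (delta ** 2 - var) * st_mask      # variance update, (15)
    return st_mask, mu_next, var_next

mu = np.zeros((2, 2))
var = np.ones((2, 2))
e = np.array([[0.5, 5.0], [-0.5, 0.2]])        # hypothetical error map
s_mask = np.array([[1.0, 1.0], [0.0, 1.0]])    # spatial mask from the previous step

st_mask, mu_next, var_next = gmm_mask_update(
    e, s_mask, mu, var, lam=4.0, alpha1=0.1, alpha2=0.05)
```

The pixel with error 5.0 is rejected as temporally inconsistent, and the pixel already rejected by the spatial mask stays rejected.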

4. Adaptive Frame Delay

[0033] In order to obtain a robust registration between the frames, a minimum magnitude of inter-frame shift and scene detail is required. If these conditions do not hold for the incoming frames, we drop them and wait for a better candidate for the parameter update. In other words, the delay between two registration frames is not necessarily one, but can take values between one and $m_{max}$. When the maximum frame delay $m_{max}$ is reached before finding a suitable image frame, the process restarts with a new initial frame.
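The frame selection logic can be sketched as a small loop. The predicates `has_detail` and `is_confident` are hypothetical placeholders for the scene detail and registration quality checks, and the choice to reuse an accepted frame as the next reference is an assumption of this sketch rather than something stated above.

```python
def pair_frames(frames, has_detail, is_confident, m_max):
    """Return (reference_index, current_index) pairs accepted for an update."""
    pairs = []
    ref = None
    for k, frame in enumerate(frames):
        if ref is None:
            if has_detail(frame):
                ref = k                          # first usable frame is the reference
            continue
        if has_detail(frame) and is_confident(frames[ref], frame):
            pairs.append((ref, k))
            ref = k                              # assumed: accepted frame is reused
        elif k - ref >= m_max:
            # Maximum delay reached: restart with a new initial frame.
            ref = k if has_detail(frame) else None
    return pairs

# Toy stream: numbers stand for camera position, None for a detail-free frame.
frames = [0, 0, 1, 2, None, 2, 2, 2, 2, 9]
pairs = pair_frames(
    frames,
    has_detail=lambda f: f is not None,
    is_confident=lambda ref, cur: abs(cur - ref) >= 2,   # minimum shift of 2
    m_max=3,
)
```

In this toy stream the first pair is found with a delay of three frames, and the reference is replaced once after the maximum delay passes without a suitable partner.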

5. Process Steps of the Invention

[0034] The observation model for the fixed pattern noise on an infrared (IR) image is considered to consist of multiplicative gain terms and additive offset terms. If these terms are found, then the fixed pattern noise is estimated and can be removed from the noisy image to obtain an uncorrupted clean image. In the following steps, our aim is to find these correction terms.

[0035] After the IR image is captured by a sensing device, the algorithm first starts a registration process between image frames. The first step of this process is to check whether the frames to be registered contain sufficient detail information. This step is necessary to avoid erroneous registrations caused by insufficient detail in the images. It is checked by applying a Sobel filter to extract the average magnitude of edge information in the images. Then this information is compared with a suitable threshold. If the scene detail is lower than the threshold, the algorithm does not proceed to the registration steps. This way, registration performance is improved by avoiding the potentially erroneous registrations.

[0036] The images with sufficient scene detail are then used to find a shift between the frames. For the registration, the 1-D horizontal and vertical projections of the edge maps, generated from the original frames using an edge extraction filter, are used. Then the 1-D projection vectors are matched using cross-correlation. This way, horizontal and vertical translations between the frames are calculated.

[0037] Even after the scene detail assessment, the registration performance may still not be as good as desired. Hence, a registration quality metric is defined to quantify the registration performance. This metric is found by calculating the ratio of the average scene detail magnitude to the average registration error. The average registration error is simply the frame difference between the edge maps of the frames, and the scene detail magnitude is readily calculated. This ratio is expected to be large for a good registration; hence, a threshold is used for this metric and the registration process is discontinued if the metric is smaller than the designated threshold. This step helps to reduce errors caused by poor registration performance.

[0038] After these steps, two sufficiently well registered image frames are obtained. These frames provide the same image region measured from different sensor pixels within the registered overlap region. Using the assumption that the same regions should produce the same responses in the imaging pixels, non-uniformity errors can be found. Using these errors, gain and offset correction terms can be updated. Because fixed pattern noise does not change at a fast rate, the updates are made with a slow learning rate to avoid premature convergence. Also, not all the pixels are updated at once. There are spatial and temporal consistency masks to determine which pixel locations are to be updated. The use of both the spatial and temporal constraints provides robustness to the updates of the non-uniformity correction terms. This is a novel approach to reduce the ghosting artifacts produced by erroneous estimations of the correction terms.

[0039] In order to update non-uniformity correction terms, the error map between sensor element responses in the overlap regions of two shifted frames is found. The error map is simply the difference of the overlap regions of the two frames.

[0040] The typical behavior of fixed pattern noise does not change radically between pixels in an IR image sensor. Hence, it can be deduced that error values that are too different from those of neighboring pixels are probably caused by erroneous calculations. These errors are mostly caused by scene changes that violate the translation assumption, such as motion blur, non-rigid scene motion and so forth. Therefore, the pixels whose error map values deviate from their local neighborhood by a certain amount are marked. The marked pixels are considered spatially inconsistent and hence the correction terms for the corresponding pixels are not updated.

[0041] Another characteristic of the fixed pattern noise is that it has an approximately constant behavior over limited time durations. Using this information, the temporal variation of the error map values is constrained. If the temporal variation is higher than a factor of the variance threshold, the error map values are considered invalid and the correction terms for such pixel locations are not updated. This is achieved by the use of Gaussian Mixture Models. Analogous to the background subtraction case, consistent error map values are considered as background and temporally changing error map values are considered as moving foreground. Then, the deviations of error map values from the temporal mean value estimations are found. These deviations are compared against thresholds which are a factor of the temporal variance estimations. The pixels with higher temporal error map deviations are masked out and the correction terms at these locations are not updated. The temporal mean and variance estimations are also updated at the non-masked pixel locations for iteratively improving the GMM mean and variance estimations, similar to the update mechanism of the non-uniformity correction terms.

[0042] When any of the conditions detailed in the abovementioned steps is not met, the current pair of frames is not valid for finding correction term updates. In such cases, one could drop both frames and wait for another pair of consecutive frames which would satisfy the constraints. However, this may not always be necessary. If the first frame has sufficient scene detail but registration performance is not satisfactory, then the first frame can be kept while waiting for another second frame until a maximum number of trials is reached. This way, not only consecutive frames are registered, but also frames with more than one frame delay. This adaptive frame delay strategy avoids dropping frames unnecessarily and provides faster convergence for the estimation of non-uniformity correction terms.

6. Experiments

[0043] The proposed method is compared against the state-of-the-art scene-based NUC methods: GE-NUC [13], IRLMS [15] and GLMS [8]. The evaluation is carried out with an extensive dataset containing synthetically corrupted clean IR images from a public dataset (ETH-ASL Thermal Image Dataset [17]) and uncorrected IR images with FPN captured with various IR detectors. The performance comparison is conducted using the objective metrics described in the following section.

6.1. Performance Metrics

[0044] The experiments are performed on both images with synthetically added noise and real noisy thermal images. For synthetically corrupted images, Peak Signal-to-Noise Ratio (PSNR) metric is used which measures the error between the expected (clean) reference image and the image corrected by the scene based NUC methods. PSNR calculation is given in (16):

$\begin{matrix} PSNR = 10 \log_{10} \left( peakval^{2} / MSE \right) & (16) \end{matrix}$

where MSE is the mean-squared-error between the output and target images and peakval is the maximum value that the input signal can take.
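A direct transcription of (16) is short enough to show in full. The function name and the handling of a zero MSE (returning infinity) are choices made for this sketch.

```python
import numpy as np

def psnr(output, target, peakval):
    """PSNR in dB between a corrected image and its clean reference, as in (16)."""
    diff = output.astype(float) - target.astype(float)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")              # identical images
    return 10.0 * np.log10(peakval ** 2 / mse)

target = np.zeros((4, 4))
output = np.ones((4, 4))                 # every pixel off by 1, so MSE = 1
value = psnr(output, target, peakval=10.0)
```

For 16-bit imagery such as the datasets used here, `peakval` would typically be the largest representable value of the signal.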

[0045] For the real fixed pattern noise case, however, PSNR metric cannot be calculated, since the clean reference images are not available. In the literature, roughness metric which measures the high frequency component of the image is commonly used [13]. Roughness metric is calculated as given in (17):

[00009] $\begin{matrix} \rho = \frac{\| \hat{X} * h \|_{2} + \| \hat{X} * h^{T} \|_{2}}{\| \hat{X} \|_{2}} & (17) \end{matrix}$

[0046] where $\hat{X}$ is the corrected image frame, $h = [0.5, -0.5]$ is the horizontal high-pass filter kernel and $h^{T}$ is the vertical version of the kernel.
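The roughness metric in (17) can be sketched with simple pixel differences. This sketch assumes a two-tap difference kernel with weights of magnitude 0.5 (the sign does not affect the norm) and evaluates the filter only where both taps fall inside the image; both are assumptions of the illustration.

```python
import numpy as np

def roughness(img):
    """Ratio of high-pass energy (horizontal plus vertical) to image energy."""
    img = img.astype(float)
    dh = 0.5 * (img[:, 1:] - img[:, :-1])    # horizontal two-tap difference
    dv = 0.5 * (img[1:, :] - img[:-1, :])    # vertical two-tap difference
    # np.linalg.norm on a 2-D array defaults to the Frobenius norm.
    return (np.linalg.norm(dh) + np.linalg.norm(dv)) / np.linalg.norm(img)

flat = np.full((4, 4), 3.0)
checker = np.indices((4, 4)).sum(axis=0) % 2 * 1.0 + 3.0   # pixel-level pattern

r_flat = roughness(flat)
r_checker = roughness(checker)
```

A lower value indicates less high-frequency content, which is why residual fixed pattern noise tends to raise the metric.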

6.2. Dataset

[0047] In a scene-based NUC study, the dataset variety is vital in terms of both the scene content and different detector responses. In the experiments, our objective is to prevent over-fitting to some limited data and observe the behavior of the methods in possible failure scenarios. Thus, we would like to observe the effects of scene contents, camera motion, detector characteristics and noise characteristics.

[0048] We used two different datasets in our experiments. The first dataset is the ETH-ASL Thermal Image Dataset [17], which is publicly available [18]. This dataset has rich scene content involving people, animals, buildings, cars, terrain, sky, etc. The images are recorded using a FLIR Tau 320 camera with a resolution of 324×256 and a bit depth of 16 bits. The clean images in this dataset are corrupted by synthetic Gaussian noise with $\mu = 0$ and $\sigma = 10$. In order for the synthetic FPN to be more realistic, the amount of noise is gradually increased from 60% to 100% of the full scale. There are 7 sequences having a total of 3409 frames in the selected dataset.

[0049] The second dataset is a collection of images recorded using ASELSAN thermal imagers with a resolution of 640×512 and a bit depth of 16 bits. There are both hand-held and pan-tilt camera images in the dataset containing a variety of scenes such as terrain, sky, buildings, trees, people and so forth. In total, there are 5 sequences having 3785 frames. The image sequences have real uncorrected FPN so that the methods are evaluated in physical world conditions.

6.3. Results

[0050] The experimentation results are assessed both objectively by comparing the PSNR and roughness measurements and subjectively by observing the correction performances and ghosting artifacts.

[0051] In FIG. 2, PSNR plots of each method are given for all frames of each data sequence in the ETH-ASL [17] dataset. The main reason for a detailed plot is that the performance loss caused by the ghosting artifacts or false FPN estimations in the middle sections of the image sequences would not be observable by inspecting only the average PSNR values for the whole sequence. Another benefit of these plots is that the convergence of the methods is easily visible. It is noted that the proposed method produces the best PSNR results in all sequences for all frames after its late convergence. The strategy of the proposed method is clearly imprinted on the PSNR characteristics. Our method assesses all the conditions such as registration quality, frame shift amount, scene detail magnitude and temporal consistency. Only if all the requirements are met is a noise estimation update performed.

[0052] This strategy provides more accurate estimations at the expense of fewer estimation updates. The stability of the estimations of our algorithm is also reflected on the PSNR curve showing no oscillation unlike the compared methods. For a brief summary, average PSNR values for each sequence and overall average for the whole dataset are given in Table 1.

TABLE 1. PSNR Performance Comparison (dB)

Sequence     GLMS    IRLMS   GE-NUC  CORESUM
Sempach6     16.86   13.32   13.14   12.33
Sempach7     16.93   12.82   12.91   11.17
Sempach8     16.87   14.48   11.81   10.51
Sempach9     16.84   12.32   12.22    9.72
Sempach10    16.64   12.47   12.41   10.38
Sempach11    17.02   16.59   12.87   10.67
Sempach12    18.41   15.77   11.96   11.84
Average      17.22   13.99   12.37   11.08

[0053] Because the roughness metric is highly scene-dependent, the values are usually close and the characteristics of the methods are somewhat similar. The proposed method again produces the best (lowest) roughness values for each data sequence, as shown by the roughness values given in Table 2.

TABLE 2 Roughness Performance Comparison

Sequence   GLMS         IRLMS        GE-NUC       CORESUM
           (10.sup.3)   (10.sup.3)   (10.sup.3)   (10.sup.3)
Seq01       4.809        4.801        4.798        4.651
Seq02       4.292        4.115        3.984        3.883
Seq03       3.171        2.887        2.927        2.883
Seq04      14.318       13.159       12.977       11.994
Seq05       1.453        1.276        1.205        1.130
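A roughness value of the kind compared in Table 2 can be computed along the following lines. This sketch assumes the definition commonly used in the non-uniformity correction literature, namely the sum of the absolute horizontal and vertical first differences divided by the sum of the absolute pixel values. The exact definition and boundary handling used for the table are given outside this section and may differ; here differences are taken over valid neighbors only, with no wrap-around.

```python
def roughness(image):
    """Roughness of a 2-D list of pixel values (lower means smoother).

    Assumed definition: (sum |I[i][j] - I[i][j-1]| + sum |I[i][j] - I[i-1][j]|)
    divided by sum |I[i][j]|.
    """
    rows, cols = len(image), len(image[0])
    horizontal = sum(abs(image[i][j] - image[i][j - 1])
                     for i in range(rows) for j in range(1, cols))
    vertical = sum(abs(image[i][j] - image[i - 1][j])
                   for i in range(1, rows) for j in range(cols))
    total = sum(abs(value) for row in image for value in row)
    if total == 0:
        raise ValueError("roughness is undefined for an all-zero image")
    return (horizontal + vertical) / total
```

Because the measure is normalized by the image content itself, scenes with strong natural texture yield high values regardless of the correction method, which matches the observation that the metric is highly scene-dependent.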

[0054] In the scene-based NUC domain, ghosting artifacts are the arch enemies of the algorithms. One of the main contributions of our method is the elimination of such artifacts; thus, it is useful to view some exemplary cases for the evaluated methods. FIG. 3 depicts the ghosting artifacts produced by the compared methods. Although they are quite visible in the output images of the algorithms, the ghosting artifacts are easier to note in the error correction images (the difference of the output image and the original image) in FIG. 4. For the other methods, the artifacts are burnt into the corrections, while no such artifacts are observed for our method.

[0055] Similar ghosting artifacts can be seen in the results for the public ETH-ASL dataset [17] given in FIG. 5. Again, it is noted that our method does not yield any noticeable ghosting artifact, whereas the others do, to varying degrees.

REFERENCES

[0056] [1] David L Perry and Eustace L Dereniak, Linear theory of nonuniformity correction in infrared staring sensors, Optical Engineering, vol. 32, no. 8, pp. 1854-1860, 1993.
[0057] [2] Esteban Vera, Pablo Meza, and Sergio Torres, Total variation approach for adaptive nonuniformity correction in focal-plane arrays, Optics Letters, vol. 36, no. 2, pp. 172-174, 2011.
[0058] [3] Abraham Friedenberg and Isaac Goldblatt, Nonuniformity two-point linear correction errors in infrared focal plane arrays, Optical Engineering, vol. 37, no. 4, pp. 1251-1254, 1998.
[0059] [4] Sungho Kim, Two-point correction and minimum filter-based nonuniformity correction for scan-based aerial infrared cameras, Optical Engineering, vol. 51, no. 10, pp. 106401-106401, 2012.
[0060] [5] Dean A Scribner, Kenneth A Sarkady, Melvin R Kruer, John T Caulfield, J D Hunt, and Charles Herman, Adaptive nonuniformity correction for IR focal plane arrays using neural networks, in Proc. SPIE, 1991, vol. 1541, pp. 100-109.
[0061] [6] Lai Rui, Yang Yin-Tang, Zhou Duan, and Li Yue-Jin, Improved neural network based scene-adaptive nonuniformity correction method for infrared focal plane arrays, Applied Optics, vol. 47, no. 24, pp. 4331-4335, 2008.
[0062] [7] John G Harris and Yu-Ming Chiang, Minimizing the ghosting artifact in scene-based nonuniformity correction, in Proc. SPIE, 1998, vol. 3377, pp. 106-113.
[0063] [8] Russell C Hardie, Frank Baxley, Brandon Brys, and Patrick Hytla, Scene-based nonuniformity correction with reduced ghosting using a gated LMS algorithm, Optics Express, vol. 17, no. 17, pp. 14918-14933, 2009.
[0064] [9] Bradley M Ratliff, Majeed M Hayat, and J Scott Tyo, Generalized algebraic scene-based nonuniformity correction algorithm, JOSA A, vol. 22, no. 2, pp. 239-249, 2005.
[0065] [10] Sergio N Torres and Majeed M Hayat, Kalman filtering for adaptive nonuniformity correction in infrared focal plane arrays, JOSA A, vol. 20, no. 3, pp. 470-480, 2003.
[0066] [11] Chao Zuo, Qian Chen, Guohua Gu, and Weixian Qian, New temporal high-pass filter nonuniformity correction based on bilateral filter, Optical Review, vol. 18, no. 2, pp. 197-202, 2011.
[0067] [12] Alessandro Rossi, Marco Diani, and Giovanni Corsini, Bilateral filter-based adaptive nonuniformity correction for infrared focal-plane array systems, Optical Engineering, vol. 49, no. 5, pp. 057003-057003, 2010.
[0068] [13] Junjie Zeng, Xiubao Sui, and Hang Gao, Adaptive image-registration-based nonuniformity correction algorithm with ghost artifacts eliminating for infrared focal plane arrays, IEEE Photonics Journal, vol. 7, no. 5, pp. 1-16, 2015.
[0069] [14] Russell C Hardie, Majeed M Hayat, Earnest Armstrong, and Brian Yasuda, Scene-based nonuniformity correction with video sequences and registration, Applied Optics, vol. 39, no. 8, pp. 1241-1250, 2000.
[0070] [15] Chao Zuo, Qian Chen, Guohua Gu, and Xiubao Sui, Scene-based nonuniformity correction algorithm based on interframe registration, JOSA A, vol. 28, no. 6, pp. 1164-1176, 2011.
[0071] [16] A. E. Mudau, C. J. Willers, D. Griffith, and F. P. J. le Roux, Non-uniformity correction and bad pixel replacement on LWIR and MWIR images, in 2011 Saudi International Electronics, Communications and Photonics Conference (SIECPC), April 2011, pp. 1-5.
[0072] [17] J. Portmann, S. Lynen, M. Chli, and R. Siegwart, People detection and tracking from aerial thermal views, in 2014 IEEE International Conference on Robotics and Automation (ICRA), May 2014, pp. 1794-1800.
[0073] [18] http://projects.asl.ethz.ch/datasets/doku.php?d=ir:iricra2014.

CPC classification (all PHYSICS): G06T2207/30212; G06T5/50; G06T2207/10016; G06T2207/20076; G06T5/20; G06T7/0002; G06T7/38; G06T7/32; G06T2207/20068; G06T2207/10048; G06T2207/30232; G06T5/002; G06T2207/20182; G06T2207/30168

International classification (all PHYSICS): G06T7/38; G06T5/00; G06T5/20; G06T5/50; G06T7/00; G06T7/32
