SYSTEMS AND METHODS TO GENERATE HIGH RESOLUTION FLOOD MAPS IN NEAR REAL TIME
20210149929 · 2021-05-20
Inventors
- Xinyi Shen (Mansfield, CT, US)
- Emmanouil Anagnostou (Mansfield Center, CT, US)
- Qing Yang (Storrs, CT, US)
Cpc classification
G06F16/535
PHYSICS
International classification
G06F16/535
PHYSICS
Abstract
A system and method to generate flood inundation maps in near real time. The system includes a plurality of computer processing modules: a flood trigger system, a SAR data query system, and a RAPID kernel algorithm system, running in real time, to identify the potential flood zones, query SAR data, and finally compute the inundation maps, respectively. As disclosed herein, the RAPID kernel algorithm is extended to a fully automated flood mapping system that requires no human interference from the initial flood events discovery to the final flood map production.
Claims
1. A system to generate a flood inundation map, the system comprising: a flood trigger system configured to identify a flood occurring zone having one or more bodies of water; a SAR data query system to identify relevant satellite images for the flood occurring zones; and a kernel algorithm system including an electronic processor configured to receive the data from the flood trigger system, receive the satellite images from the SAR data query system, generate a binary classification of water and non-water at pixel level of the satellite images, morphologically process the satellite images to reduce over-detection of the bodies of water and to reduce under-detection of the bodies of water, apply a multi-threshold compensation to reduce speckle noise in the bodies of water, apply machine learning-based correction for speckle, and generate a flood inundation map.
2. The system of claim 1 wherein the flood inundation map is generated in near real time.
3. The system of claim 1 wherein the electronic processor is further configured to apply a probability density threshold to identify the pixels in the satellite images that correspond to water or non-water.
4. The system of claim 1 wherein the electronic processor is further configured to generate a plurality of water masks from a single satellite image, and wherein each mask uses a different probability density threshold to reduce the over-detection and the under-detection of the bodies of water.
5. The system of claim 1 wherein morphologically processing the satellite images includes water source tracing and improved change detection.
6. The system of claim 5 wherein water source tracing includes applying a region-growing algorithm to identify water bodies from known water sources.
7. The system of claim 5 wherein improved change detection includes applying a region-growing algorithm over the non-water pixels to identify water bodies.
8. The system of claim 1 wherein the electronic processor is further configured to apply a correction algorithm to the satellite images to identify whether a pixel is within a standing water body or a water body in movement.
9. The system of claim 1 wherein the flood trigger system is configured to detect fluvial flooding and pluvial flooding.
10. The system of claim 1 wherein when the flood trigger system identifies a flood occurring zone, the SAR data query system retrieves satellite images of the flood zone on the day of flooding and a plurality of satellite images prior to the flooding.
11. A method of generating a flood inundation map in near real time, the method comprising: identifying, with an electronic processor, a flood event; retrieving, with an electronic processor, a plurality of satellite images of an area defined by the flood event; and receiving, by a kernel algorithm system, the satellite images, the kernel algorithm system configured to apply a water identifier or a non-water identifier for each of the pixels in the satellite images, morphologically process the satellite images to reduce over-detection of the bodies of water and to reduce under-detection of the bodies of water, apply a multi-threshold compensation to reduce speckle noise in the bodies of water, apply machine learning-based correction for speckle, and generate a flood inundation map.
12. The system of claim 11 further comprising applying a probability density threshold to identify the pixels in the satellite images that correspond to water or non-water.
13. The system of claim 11 further comprising generating a plurality of water masks from a single satellite image, and wherein each mask uses a different probability density threshold to reduce the over-detection and the under-detection of the bodies of water.
14. The system of claim 11 wherein morphologically processing the satellite images includes water source tracing and improved change detection.
15. The system of claim 14 wherein water source tracing includes applying a region-growing algorithm to identify water bodies from known water sources.
16. The system of claim 14 wherein improved change detection includes applying a region-growing algorithm over the non-water pixels to identify water bodies.
17. The system of claim 11 further comprising applying a correction algorithm to the satellite images to identify whether a pixel is within a standing water body or a water body in movement.
18. The system of claim 11 further comprising detecting fluvial flooding and pluvial flooding.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0020] The patent or application file contains at least one drawing executed in color. Copies of this patent or patent application publication with color drawing(s) will be provided by the Office upon request and payment of the necessary fee.
DETAILED DESCRIPTION
[0038] Before any embodiments of the invention are explained in detail, it is to be understood that the invention is not limited in its application to the details of construction and the arrangement of components set forth in the following description or illustrated in the following drawings. The invention is capable of other embodiments and of being practiced or of being carried out in various ways.
[0039] Also, it is to be understood that the phraseology and terminology used herein is for the purpose of description and should not be regarded as limiting. The use of “including,” “comprising” or “having” and variations thereof herein is meant to encompass the items listed thereafter and equivalents thereof as well as additional items. The terms “mounted,” “connected” and “coupled” are used broadly and encompass both direct and indirect mounting, connecting and coupling. Further, “connected” and “coupled” are not restricted to physical or mechanical connections or couplings, and may include electrical connections or couplings, whether direct or indirect. Also, electronic communications and notifications may be performed using any known means including direct connections, wireless connections, etc.
[0040] A plurality of hardware- and software-based devices, as well as a plurality of different structural components may be utilized to implement the invention. In addition, embodiments of the invention may include hardware, software, and electronic components or modules that, for purposes of discussion, may be illustrated and described as if the majority of the components were implemented solely in hardware. However, one of ordinary skill in the art, and based on a reading of this detailed description, would recognize that, in at least one embodiment, the electronic-based aspects of the invention may be implemented in software (for example, stored on non-transitory computer-readable medium) executable by one or more processors. As such, it should be noted that a plurality of hardware- and software-based devices, as well as a plurality of different structural components, may be utilized to implement the invention. For example, “mobile device,” “computing device,” and “server” as described in the specification may include one or more electronic processors, one or more memory modules including non-transitory computer-readable medium, one or more input/output interfaces, and various connections (for example, a system bus) connecting the components.
[0041] Disclosed herein is a fully automated information processing chain to delineate flood maps at high resolution (~10 m) without requiring any human intervention. The flood maps are produced in near real time (NRT).
[0042] SAR data is considered most suitable for flood inundation mapping, yet no automated processing chain is currently available because the data processing is complicated and manual post-processing has been the standard practice for ensuring product quality.
[0043] The disclosure is a standalone SAR data processing framework (tool) to generate flood inundation maps. The output from this SAR data processing framework can be provided to client/customer entities as a service operation through which flood information is provided in near-real-time.
[0045] The SAR data query system 104 provides access to high resolution images of the Earth. Mapping techniques were developed that rely on synthetic aperture radar (SAR) on-board earth-orbiting platforms. SAR provides valid ground surface measurements through cloud cover with high resolution and sampling frequency that has recently increased through multiple missions. Despite numerous efforts, automatic processing of SAR data to derive accurate inundation maps still poses challenges.
[0046] SAR Imagery Classification to Water and Land
[0047] Different from the estimation of surface parameters (such as soil moisture), inundation mapping is simply the identification of a highly accurate binary mask of water and non-water. A review of previous work on inundation mapping, most of which has involved methods for which an automated approach was difficult or impossible, is discussed below.
[0048] The specular reflective properties of open still water in SAR sensing motivated several efforts (Giustarini et al. 2013; Hirose et al. 2001; Matgen et al. 2011; Yamada 2001) to determine a threshold below which pixels are identified as water. It was understood that a single threshold might not hold well with large-area water bodies (Tan et al. 2004) or the entire swath of SAR images due to the variability of the environment with regard to, for example, wind roughening and satellite system parameters (Martinis et al. 2009). Martinis and Rieke (2015) reported spatial and temporal backscattering heterogeneity, even for permanent water bodies.
[0049] To address spatial variability, Martinis et al. (2009) applied a split-based approach (SBA), together with object-oriented (OO) segmentation (Baatz 1999). Martinis et al. (2015) further combined SBA with fuzzy logic-based refinement to construct an automated processing chain (Twele et al. 2016). Matgen et al. (2011) developed a histogram segmentation method, and Giustarini et al. (2013) automated the calibration process for segmentation and region-growing thresholds. Essentially, threshold-based approaches need either a bimodal histogram of the pixels or some sample data to initialize the water distribution. For more general situations, when the histogram of the pixels is not bimodal, a straightforward option is to draw training regions of interest (ROIs) manually; but, again, this impedes automation. The SBA method, on the other hand, ensures that only the splits that show a bimodal histogram (water versus non-water pixels) are used to derive the global threshold; and Lu et al. (2014) loosened the restriction to bimodal histograms by initializing the water distribution using a “core flooding area,” automatically derived from change detection using multi-temporal SAR images. Change detection (Bazi et al. 2005; Giustarini et al. 2013; Hirose et al. 2001; Lu et al. 2014; Matgen et al. 2011; Santoro and Wegmüller 2012; Yamada 2001) is also used to select only significantly changed pixels as inundation candidates to reduce false classification of water (hereafter referred to as “false positives”).
[0050] In contrast to pixel-based threshold determination, image segmentation-based techniques identify water bodies on continuous and non-overlapping objects. The active contour method (ACM) (Horritt 1999; Horritt et al. 2001) allows a certain amount of backscattering heterogeneity within a water body and incorporates morphological metrics, such as curvature and tension. Martinis et al. (2009) applied OO with SBA to reduce false positives and speckle. In a comparison of the ACM and OO, Heremans et al. (2003) concluded that the latter delineated more accurately while the former tended to identify large water areas better. Pulvirenti et al. (2011a) provided an image segmentation method that consisted of dilation and erosion operators and employed a microwave scattering model (Bracaglia et al. 1995), which coupled matrix doubling (Fung 1994; Ulaby et al. 1986) and the integral equation model (IEM) (Fung 1994; Fung et al. 1994) to interpret the backscattering signature at object level. Giustarini et al. (2013), Lu et al. (2014), and Matgen et al. (2011) employed a region-growing algorithm to extend the inundation area from detected water pixels.
[0051] Inundation detection also encounters vegetated areas, partially submerged wetlands, and urban areas. Theoretically, dihedral scattering is enhanced during a flood if a vegetal stalk structure exists. Ormsby et al. (1985) evaluated the backscattering difference caused by flooding under vegetation. Martinis and Rieke (2015) analyzed the sensitivity of multi-temporal/frequency SAR data to flooding conditions under different land cover conditions and concluded that the X-band radar is only suitable to detect inundation beneath sparse vegetation or forest during leaf-off period, whereas L-band, though with better penetration, has a wider range of backscattering enhancement, which reduces the reliability of the classification. Kasischke et al. (2003) analyzed the backscattering change of ERS-2 SAR from a dry to an inundated situation by comparing with a scattering model and concluded the decrease was not as great as predicted. Townsend (2001) utilized ground truth to train a decision tree to identify flooding beneath forest using Radarsat-1 SAR. Horritt et al. (2003) used two radar signatures as input for the ACM, the enhanced backscattering at C-band and the HH-VV phase difference, to generate two water contours from selected known open-water (ocean) and dry-land (coastal) pixels. Then the area between the two contours was labeled “flooded vegetation.” Pulvirenti et al. (2010) trained a set of rules using visually interpreted regions of interest (ROIs) to extract flooded forest and urban areas from COSMO-SkyMed SAR data. Also using COSMO-SkyMed, Pulvirenti et al. (2013) combined their fuzzy logic classifier (Pulvirenti et al. 2011b) and segmentation method (Pulvirenti et al. 2011a) to monitor flood evolution in vegetated areas.
[0052] Given the potential for flood detection under vegetation using SAR data, most of these efforts were based on supervised classification, which is almost impossible to automate. One explanation for the preference for supervised classification over an automated threshold determination method is the vegetation heterogeneity: the enhanced dihedral scattering of vegetation cannot be considered as a single class because of the presence of different vegetation species and structure and leaf-off and leaf-on conditions. Such heterogeneity makes it difficult to find a threshold of backscattering enhancement automatically. In other words, detecting flooding beneath vegetation requires identification of multiple classes from an image, but current automatic methods based on threshold determination are only able to discern one.
[0053] Segmentation methods present other difficulties. The initial seeds (water lines) needed by the ACM may not be identified for inundated areas that are not connected to a known water source; image dilation and erosion-based methods can smooth out details while reducing speckle; and the OO algorithm, besides the subjective process of determining the scaling factor, was not designed for SAR and is therefore not resistant to speckle. Comparison to microwave scattering models can be affected by the models' poor accuracy, caused by the lack of ground truth (soil and vegetation parameters) (Pulvirenti et al. 2013).
[0054] Only a few studies are available on flood mapping in urban areas (Giustarini et al. 2013; Martinis et al. 2009; Mason et al. 2012; Mason et al. 2010), and only one (Mason et al. 2014) investigated the use of dihedral scattering to extract flooding in areas enhanced by buildings. The vertical structure of buildings can resemble vegetation in SAR images, but it is rotationally asymmetric in comparison with a canopy trunk, which prohibits enhancement from occurring from all directions of sight. As a result, scattering enhancement only occurs at some orientations. In addition, smooth impervious surfaces and shadow areas in cities may cause over-detection. More accurate detection of water, therefore, requires knowledge of geometry, the orientation and materials of buildings, and the direction of radar illumination (Ferro et al. 2011)—information that is challenging to acquire for many cities. Another consideration is expense. Ultra-high-resolution SAR data (˜1 m) such as TerraSAR-X and COSMO-SkyMed, which are suitable for inundation mapping in urban areas, are commercial and, therefore, costly.
[0055] Issues for an Automated Flood Mapping System
[0056] Existing algorithms to detect flooding unobstructed by structures or vegetation have, as yet, only partially addressed the operational demands of NRT inundation mapping in terms of automation and accuracy. The issues are summarized as follows:
[0057] Manual labor is needed to reduce over-detection caused by smooth surface and shadow areas (referred to hereafter as water-like surfaces) and under-detection resulting from strong scatter disturbances and speckle-caused noise. Skilled operators are needed to accomplish such manual editing.
[0058] Assembled segmentation using a region-growing algorithm (RGA) cannot capture the large isolated and scattered flooded areas that may, at times, become disconnected from the pre-flooded water bodies due to variability in surface elevation and barriers after the flood peak. Water paths too narrow to detect or covered by vegetation may appear isolated from known water sources—a limit of sensor spatial resolution. Bottom to top segmentation is affected by speckle. Neither method works where actual water areas are connected to water-like areas.
[0059] Change detection, designed to eliminate over-detection, may contain significant errors caused by noise-like speckle, geometric dislocation, or shadow areas that change with the direction of radar sight. Expected location error can be a few (1-3) pixels after geo-referencing of SAR data. Exact-repeat images (from the same orbits and, thus, radar sight direction) reduce these errors, however.
[0060] Comparison with a scattering model might be inaccurate because ground parameters of vegetation are required by these models but are not available. Scattering models are also complicated to use by those with less applicable technical training.
[0061] Spatial filtering, which was used in most of the aforementioned studies, will coarsen the resolution of the result and reduce valuable details along water boundaries.
[0062] To address these issues, a fully automated, radar-produced inundation diary (RAPID) system to detect open flood extent was developed. Operating in NRT, RAPID fully integrates radar polarimetry, SAR statistics, morphology, and machine-learning methods to address the identified issues in detecting open flood water. No individual operator attention is needed, although RAPID does not detect flooding under vegetation due to difficulties outlined above. The four automated processing steps described below show the advantage of synergies among multisource ancillary data, including high-resolution topography, high-resolution water occurrence, land cover classification (LCC), and river width, hydrography, and water type databases.
[0063] As noted above, the RAPID kernel algorithm system 106 provides a system to generate flood inundation maps in NRT. For example,
[0064] The memory 16 may include read-only memory (ROM), random access memory (RAM) (for example, dynamic RAM (DRAM), synchronous DRAM (SDRAM), and the like), electrically erasable programmable read-only memory (EEPROM), flash memory, a hard disk, a secure digital (SD) card, other suitable memory devices, or a combination thereof. The electronic processor 14 executes computer-readable instructions (“software”) stored in the memory 16. The software may include firmware, one or more applications, program data, filters, rules, one or more program modules, and other executable instructions. For example, the software may include instructions and associated data for performing the methods described herein. For example, as illustrated in
[0065] The communications interface 18 allows the server 12 to communicate with devices external to the server 12. For example, as illustrated in
[0066] The SAR database 24 stores georeferenced satellite images of the polarized radar backscattering. The geographical and hydrography database 26 stores land use, water occurrence, river width, flow direction and topography data, and the like.
[0067] With reference to
[0068] Rather than simply replicating and speeding up existing human processes, computers may simultaneously process multiple tasks and draw upon multiple simultaneous information sources based on interactive rules. Therefore, unlike the human brain, which is largely a serial processor, a multi-tasking computer system may simultaneously weigh many factors, and therefore complement or exceed human performance with regard to generating flood inundation maps in NRT.
[0070] Based on a fundamental understanding of developed SAR speckle, it is known that noise-like speckle is not real noise (Lee and Pottier 2009; Ulaby et al. 1982); it is, rather, a strong overlap between water and non-water classes. Consequently, conventional single-threshold methods inevitably cause noisy classification results (Matgen et al. 2006; Matgen et al. 2004), and the common-practice strategy to filter speckle as noise at the price of reducing effective resolution is not recommended as a solution. Therefore, a multi-threshold scheme to reduce the speckle effects is implemented.
[0071] Since water-like surfaces share identical scattering properties with water bodies, they cannot be eliminated by only using radar statistics. Using the water masks generated by three automatically optimized thresholds, morphological and compensation procedures were developed that significantly suppress over- and under-detection. In principle, water sources for large water bodies can be found on a high-resolution LCC map, but they may not be found for small water bodies. To prevent over-detection, morphological processing was applied to trace floodplain inundation from known water sources. To prevent the under-detection of isolated water bodies, improved change detection (ICD) was applied.
[0072] Finally, a machine learning-based approach was used to refine the detected water areas. Strong scatterers within water bodies, such as infrastructure and vehicles, can cause significant identification errors in surrounding areas due to long synthetic aperture and wide-band range compression. These errors cannot be addressed by the previous processing steps. To reduce the error caused by strong scatterers and remaining speckle, the machine-learning step integrates information on topography, river network, and water probability and type.
[0073] Although a machine-learning procedure usually requires manual collection of training samples, this is not the case in the RAPID system. Since correctly identified pixels dominate the water mask generated by previous steps, the pixels collected for training within a buffered area of water bodies (to include both water and non-water pixels) can be used directly as the training set.
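By way of illustration only, the automatic collection of training pixels from a buffered area around detected water bodies might be sketched as follows. The function name `buffered_training_set`, the square (Chebyshev) buffer, and the buffer width `buffer_px` are hypothetical choices made for this sketch and are not specified in the disclosure; the ancillary features (topography, river network, water probability and type) that would accompany each sample are omitted.

```python
def buffered_training_set(water_mask, buffer_px=2):
    """Collect (row, col, label) samples inside a buffer around detected water.

    label is 1 for pixels flagged as water in the mask and 0 otherwise.
    buffer_px is a hypothetical buffer width in pixels (Chebyshev distance).
    """
    rows, cols = len(water_mask), len(water_mask[0])
    samples = []
    for r in range(rows):
        for c in range(cols):
            # A pixel is inside the buffer if any water pixel lies in its window.
            near_water = any(
                water_mask[rr][cc]
                for rr in range(max(0, r - buffer_px), min(rows, r + buffer_px + 1))
                for cc in range(max(0, c - buffer_px), min(cols, c + buffer_px + 1))
            )
            if near_water:
                samples.append((r, c, 1 if water_mask[r][c] else 0))
    return samples


# Tiny example: a single water pixel in the middle of a 7 x 7 tile.
mask = [[False] * 7 for _ in range(7)]
mask[3][3] = True
samples = buffered_training_set(mask, buffer_px=2)
```

Because the buffer includes the water pixels themselves and a ring of surrounding land, both classes are represented in the returned samples without any manual labeling.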
[0074] Step A: Binary Classification of Water and Non-Water at Pixel Level
[0075] The first step in binary classification is to cluster water pixels from the whole swath of polarimetric SAR images. All water bodies in one swath are hypothesized as homogeneous areas with fully developed speckle. Assuming the measuring surface is reciprocal, the PDF of multi-look backscattering amplitude matrix, A, for a given category can be characterized by the Wishart distribution (Lee and Pottier 2009),
where n is the equivalent number of looks (ENL), Tr(·) and |·| are the matrix trace and determinant, and the multi-look covariance matrix, Z, is computed by averaging multiple 1-look covariance matrices,
where H represents the conjugate transpose operator, and ū is the 1-look complex scattering vector (Kostinski and Boerner 1986),
where q is the dimension of vector ū and takes the value of 3 in reciprocal condition. C is the expectation of the covariance matrix,
$$C = E\left[\bar{u}\,\bar{u}^{H}\right], \qquad (5)$$
where E[·] stands for the expectation of a stochastic variable.
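For orientation, the standard textbook expressions for the quantities named in this paragraph are sketched below. These are the common complex Wishart forms as given in Lee and Pottier (2009); it is an assumption that they correspond to the numbered equations of this disclosure. The relation A = nZ, the normalization factor K(n, q), and the scattering amplitudes S.sub.hh, S.sub.hv, S.sub.vv are taken from that standard form and do not appear in the text above.

```latex
% Multi-look covariance matrix from n one-look scattering vectors
Z = \frac{1}{n}\sum_{k=1}^{n} \bar{u}(k)\,\bar{u}(k)^{H}, \qquad A = nZ

% Complex Wishart density of A
p(A) = \frac{|A|^{\,n-q}\,\exp\!\left[-\mathrm{Tr}\!\left(C^{-1}A\right)\right]}{K(n,q)\,|C|^{\,n}},
\qquad
K(n,q) = \pi^{q(q-1)/2}\,\Gamma(n)\cdots\Gamma(n-q+1)

% One-look scattering vector in the reciprocal case (q = 3)
\bar{u} = \left[\,S_{hh},\ \sqrt{2}\,S_{hv},\ S_{vv}\,\right]^{T}
```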
[0076] In practice, the most available data formats are dual-polarized intensities:
[0077] Carrying out the integral with respect to other variables in (1), we have the PDF of dual-polarized intensities (Hagedorn et al. 2006; Lee and Pottier 2009),
where Γ(·) and I.sub.n(·) stand for the Gamma function and modified Bessel function, respectively. Although (7) is used as the starting point of water extraction in this study, the RAPID framework is not restricted to the dual-polarization case, as one can simply replace (7) with other PDFs according to the polarization availability. The distribution parameters of (7) are
[0078] In this step, we try to find the optimal value of (C.sub.11, C.sub.22, |ρ.sub.c|) for a given SAR image and then find the best probability density threshold. Unfortunately, due to the nature of SAR calibration, we cannot assume (C.sub.11, C.sub.22, |ρ.sub.c|) to be constant over either time or space across different scenes of imagery. We developed an iterative optimization procedure for dual-polarized SAR data that shares similar principles with the single-band optimization method proposed by Giustarini et al. (2013).
[0079] A single threshold is not applicable in a dual-polarized intensity space. Instead, we use a probability density threshold, th.sub.PD. A pixel is classified as water if
p(I.sub.1,I.sub.2)>th.sub.PD. (10)
[0080] In this way, the intensity domain is segmented into two regions: the central part and the marginal part, which correspond to water and non-water, respectively. If a cumulative probability, th.sub.P, of water pixels needs to be retained, then
$$th_{P} = \iint_{p(I_1,\, I_2) \,>\, th_{PD}} p(I_1, I_2)\, dI_1\, dI_2. \qquad (11)$$
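The relation between the retained probability th.sub.P and the density threshold th.sub.PD can be illustrated numerically. The following is a minimal sketch, not the disclosed implementation: as a stand-in for the dual-polarized PDF (7), it assumes two independent channels (the |ρ.sub.c| = 0 special case), each following a gamma distribution with shape n and mean C.sub.ii. The function names, the grid resolution `bins`, the grid extent `span`, and the parameter values are hypothetical.

```python
import math


def gamma_pdf(i, c, n):
    """PDF of an n-look intensity with mean c (gamma distribution, shape n)."""
    if i <= 0.0:
        return 0.0
    return math.exp(n * math.log(n / c) + (n - 1) * math.log(i) - n * i / c - math.lgamma(n))


def joint_pdf(i1, i2, c11, c22, n):
    """Stand-in for the dual-pol PDF: independent channels, an assumption of this sketch."""
    return gamma_pdf(i1, c11, n) * gamma_pdf(i2, c22, n)


def solve_density_threshold(th_p, c11, c22, n, bins=200, span=5.0):
    """Find th_PD so that the region {p > th_PD} holds about th_p of the probability."""
    d1, d2 = span * c11 / bins, span * c22 / bins
    cell = d1 * d2
    # Evaluate the PDF at cell centres and accumulate mass from the densest cell down.
    densities = sorted(
        (joint_pdf((a + 0.5) * d1, (b + 0.5) * d2, c11, c22, n)
         for a in range(bins) for b in range(bins)),
        reverse=True,
    )
    mass = 0.0
    for d in densities:
        mass += d * cell
        if mass >= th_p:
            return d
    return densities[-1]


def is_water(i1, i2, th_pd, c11, c22, n):
    """Classification rule: a pixel is water when its density exceeds th_PD."""
    return joint_pdf(i1, i2, c11, c22, n) > th_pd


# Hypothetical water-class parameters and number of looks.
C11, C22, ENL = 0.01, 0.002, 4.4
th_pd_82 = solve_density_threshold(0.82, C11, C22, ENL)
th_pd_99 = solve_density_threshold(0.99, C11, C22, ENL)
```

Retaining more probability mass requires a lower density threshold, so `th_pd_99` is smaller than `th_pd_82` and admits more of the marginal part of the intensity domain.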
[0081] Using (11), th.sub.PD can be uniquely determined by th.sub.P and PDF. We can then derive our iterative optimization procedure:
[0082] 1. Compute the initial value of distribution parameters, (C.sub.11, C.sub.22, |ρ.sub.c|), from sampled water pixels.
[0083] 2. Set th.sub.P=0.82 as the minimum retaining probability.
[0084] 3. Solve for th.sub.PD using (11).
[0085] 4. Classify the entire image by substituting th.sub.PD into (10). Note that the seeding pixels are unconditionally classified as water to prevent the parameters from deviating due to trimming of the tailing region of the probability domain.
[0086] 5. Update (C.sub.11, C.sub.22, |ρ.sub.c|) using all pixels classified as water.
[0087] 6. If the change of (C.sub.11, C.sub.22, |ρ.sub.c|) is within 0.1%, the iteration under the current th.sub.P converges; if the change is too large—say, twice the original value—the iteration under the current th.sub.P fails; go to step 9. Otherwise, go to step 3.
[0088] 7. Save the current (C.sub.11, C.sub.22, |ρ.sub.c|) and th.sub.PD as the converged parameter set and classification threshold for the current th.sub.P. Compute the Nash-Sutcliffe efficiency coefficient,
[0089] where p and p.sub.obs stand for the probability density computed by (7) and the probability density aggregated from all water pixels in the image, respectively. p.sub.obs is derived by a grid area-normalized 2D histogram.
[0090] 8. Increment th.sub.P by 0.01. If th.sub.P is smaller than the upper limit, 0.99, set (C.sub.11, C.sub.22, |ρ.sub.c|) to the original value. Then go to step 3.
[0091] 9. The th.sub.P and (C.sub.11, C.sub.22, |ρ.sub.c|) corresponding to the maximal NSE are selected as the optimal values.
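The nine steps above can be sketched in code as follows. This is an illustrative sketch under stated assumptions rather than the disclosed implementation: the PDF is an independent-channel gamma stand-in for (7), so only (C.sub.11, C.sub.22) are updated and |ρ.sub.c| is fixed at zero; the NSE is written in its standard form, 1 minus the ratio of squared model error to the variance of the observed histogram; the failure test reads "twice the original value" as a parameter moving away from its seed-derived value by more than twice that value; and a th.sub.P that does not converge within `max_iter` iterations is simply skipped. All function names, grid sizes, and the synthetic data are hypothetical.

```python
import math
import random


def gamma_pdf(i, c, n):
    if i <= 0.0:
        return 0.0
    return math.exp(n * math.log(n / c) + (n - 1) * math.log(i) - n * i / c - math.lgamma(n))


def pdf(p, c, n):
    """Stand-in PDF: independent channels (|rho_c| = 0), an assumption of this sketch."""
    return gamma_pdf(p[0], c[0], n) * gamma_pdf(p[1], c[1], n)


def cells(c, bins, span=6.0):
    """Cell centres and cell sizes of a grid over [0, span * C] in each channel."""
    d1, d2 = span * c[0] / bins, span * c[1] / bins
    centres = [((a + 0.5) * d1, (b + 0.5) * d2) for a in range(bins) for b in range(bins)]
    return centres, d1, d2


def density_threshold(c, n, th_p, bins=40):
    """Step 3: th_PD such that {p > th_PD} holds about th_p of the probability."""
    centres, d1, d2 = cells(c, bins)
    dens = sorted((pdf(x, c, n) for x in centres), reverse=True)
    mass = 0.0
    for d in dens:
        mass += d * d1 * d2
        if mass >= th_p:
            return d
    return dens[-1]


def nse(c, n, water, bins=8):
    """Step 7: standard NSE of the model PDF against an area-normalised 2D histogram."""
    centres, d1, d2 = cells(c, bins)
    counts = [0] * (bins * bins)
    for i1, i2 in water:
        a, b = int(i1 / d1), int(i2 / d2)
        if a < bins and b < bins:
            counts[a * bins + b] += 1
    obs = [k / (len(water) * d1 * d2) for k in counts]
    mod = [pdf(x, c, n) for x in centres]
    mean_obs = sum(obs) / len(obs)
    return 1.0 - sum((m - o) ** 2 for m, o in zip(mod, obs)) / sum((o - mean_obs) ** 2 for o in obs)


def optimise(pixels, seeds, n, max_iter=30):
    c0 = tuple(sum(pixels[k][j] for k in seeds) / len(seeds) for j in (0, 1))  # step 1
    best = None
    for percent in range(82, 99):              # steps 2 and 8: th_P = 0.82, 0.83, ..., 0.98
        th_p, c, converged = percent / 100.0, c0, False
        for _ in range(max_iter):
            th_pd = density_threshold(c, n, th_p)                      # step 3
            water = [p for k, p in enumerate(pixels)
                     if k in seeds or pdf(p, c, n) > th_pd]            # step 4: seeds always kept
            new = tuple(sum(p[j] for p in water) / len(water) for j in (0, 1))  # step 5
            if any(abs(new[j] - c0[j]) > 2.0 * c0[j] for j in (0, 1)):
                return best                                            # step 6: failure, go to step 9
            change = max(abs(new[j] - c[j]) / c[j] for j in (0, 1))
            c = new
            if change < 1e-3:                                          # step 6: within 0.1%
                converged = True
                break
        if converged:                                                  # step 7
            score = nse(c, n, water)
            if best is None or score > best["nse"]:
                best = {"th_p": th_p, "c": c, "th_pd": th_pd, "nse": score}
    return best                                                        # step 9


# Synthetic scene: 400 water pixels, 400 brighter land pixels, first 100 pixels used as seeds.
random.seed(0)
ENL = 4.4
scene = [(random.gammavariate(ENL, 0.01 / ENL), random.gammavariate(ENL, 0.002 / ENL)) for _ in range(400)]
scene += [(random.gammavariate(ENL, 0.1 / ENL), random.gammavariate(ENL, 0.03 / ENL)) for _ in range(400)]
result = optimise(scene, set(range(100)), ENL)
```

In this sketch the parameters are reset to the seed-derived values for every th.sub.P, and only the converged parameter set with the highest NSE is returned.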
[0092] Automated Sampling and the Determination of ENL
[0093] Similar to Lu et al. (2014)'s strategy of detecting “core” flooded areas, we removed the requirement of bimodal histogram by initializing the PDF of water class from seeds needed in step 1. But, as discussed in the introduction, change detection is sensitive to speckle and geolocation error, and the threshold is difficult to globalize, so we chose to use a different approach. For sampling to be completely automated, the generation of seeds for step 1 needs to be automated. We proposed to obtain seeds automatically by collecting pixels of high water probability value (>95%) from the TM-derived global water probability map (Pekel et al. 2016). In this step, the high probability requirement ensures that most sampled pixels are water in SAR images. One potential complication is that we may still sample a very small portion of non-water pixels with strong backscattering. The magnitude of a single pixel of a strong scatterer can be many orders greater than water pixels in a radar image and is thus able to deviate the PDF significantly. Non-water pixels other than strong scatterers can broaden the scattering range, preventing us from deriving reasonable intervals to compute the histogram of the water class. To remove these non-water samples before initializing the PDF, we need to determine a pair of upper and lower thresholds, I.sub.u and I.sub.d, for each polarization. The PDF of water pixels of a single polarization follows the χ.sup.2 distribution (Lee and Pottier 2009),
[0094] We require (I.sub.u, I.sub.d) to represent a confidence interval of no less than 99% and let I.sub.p stand for the intensity at the peak density. Then
can be estimated from (13). Because I.sub.p can be estimated from the histogram of the sampled pixels and n is provided by the user guide of the SAR data, C.sub.ii, I.sub.d, and I.sub.u can be derived even in the presence of strong scatterers and other non-water pixels. The following steps outline the method to estimate I.sub.u and I.sub.d and to refine n:
[0095] 1. Find the intensity of the peak density, I.sub.p, from the initial samples.
[0096] 2. Find x.sub.d and x.sub.u, whose cumulative probabilities under the χ.sub.2n.sup.2 distribution are 0.5% and 99.5%, respectively, and x.sub.p, which yields the peak χ.sub.2n.sup.2 density,
[0097] where n is initialized using values from the Sentinel-1 user guide (https://sentinel.esa.int/web/sentinel/user-guides/sentinel-1-sar/resolutions/level-1-ground-range-detected)—i.e., 4.4 and 29.7 for the interferometric wide swath (IW) and strip map (SM) modes, respectively.
[0098] 3. Initialize I.sub.u and I.sub.d using (14),
[0099] 4. Iteratively refine I.sub.u by increasing I.sub.u by half a time until the sample number of the excluded tailing region, [I.sub.u, 5I.sub.u], is smaller than 0.5% of the included region, [I.sub.d, I.sub.u]. Refine I.sub.d similarly.
[0100] 5. Using remaining samples, refine ENL by (15):
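A minimal numerical sketch of steps 1 through 3 and step 5 is given below, with the iterative refinement of step 4 omitted. It rests on assumptions that are not stated in the text: that 2nI/C.sub.ii follows a χ.sup.2 distribution with 2n degrees of freedom, so that the bounds scale as I.sub.d = I.sub.p·x.sub.d/x.sub.p and I.sub.u = I.sub.p·x.sub.u/x.sub.p with x.sub.p = 2n − 2, and that the ENL is refined with the common moment estimator (squared mean over variance). The function names and the numerical inversion of the χ.sup.2 distribution are hypothetical choices for this sketch.

```python
import math
import random


def chi2_pdf(x, k):
    """Chi-square density with k degrees of freedom (k may be non-integer)."""
    if x <= 0.0:
        return 0.0
    return math.exp((k / 2 - 1) * math.log(x) - x / 2 - (k / 2) * math.log(2) - math.lgamma(k / 2))


def chi2_quantile(prob, k, steps=20000):
    """Invert the chi-square CDF by midpoint integration (numerical sketch)."""
    upper = k + 12 * math.sqrt(2 * k) + 20      # far enough into the upper tail
    h = upper / steps
    cdf = 0.0
    for s in range(steps):
        cdf += chi2_pdf((s + 0.5) * h, k) * h
        if cdf >= prob:
            return (s + 1) * h
    return upper


def trimming_bounds(i_peak, n):
    """Return (I_d, I_u) enclosing 99% of the water class, assuming 2nI/C ~ chi-square(2n)."""
    k = 2 * n
    x_d, x_u = chi2_quantile(0.005, k), chi2_quantile(0.995, k)
    x_p = k - 2                                 # mode of the chi-square density for k > 2
    return i_peak * x_d / x_p, i_peak * x_u / x_p


def refine_enl(samples):
    """Moment estimator of the ENL: mean squared over variance (assumed form)."""
    m = sum(samples) / len(samples)
    v = sum((s - m) ** 2 for s in samples) / (len(samples) - 1)
    return m * m / v


# Synthetic single-polarization water samples with a hypothetical mean of 0.01.
random.seed(1)
N_LOOKS, C_II = 4.4, 0.01
I_PEAK = C_II * (N_LOOKS - 1) / N_LOOKS         # peak location of the assumed gamma density
raw = [random.gammavariate(N_LOOKS, C_II / N_LOOKS) for _ in range(20000)]
i_d, i_u = trimming_bounds(I_PEAK, N_LOOKS)
kept = [s for s in raw if i_d <= s <= i_u]
enl = refine_enl(kept)
```

Because the samples are trimmed before the moment estimate, the refined ENL in this sketch tends to come out slightly above the nominal value.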
[0101] Water Mask Generation by Multiple Thresholds
[0102] We generated three water masks, WM.sub.h, WM.sub.m, and WM.sub.l, from a single SAR image using multi-level probability density thresholds (high, moderate, and low) and later combined them through morphological and compensation procedures to suppress the severe over- and under-detection of current automated algorithms. The idea was to let WM.sub.h have the optimal PDF, WM.sub.m have a balanced over- and under-detection, and WM.sub.l have a low level of under-detection but a high level of over-detection. The high threshold was the optimal th.sub.PD. Then we divided th.sub.PD by 30 and 300 to get the moderate and low thresholds, respectively.
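As an illustration of the three-level thresholding, the following sketch derives the three masks from a per-pixel probability density map. The function name `water_masks`, the dictionary keys, and the sample values are hypothetical; only the divisors 30 and 300 come from the description above.

```python
def water_masks(density, th_pd):
    """Build WM_h, WM_m and WM_l from a 2D list of per-pixel densities p(I1, I2)."""
    thresholds = {"WM_h": th_pd, "WM_m": th_pd / 30.0, "WM_l": th_pd / 300.0}
    return {
        name: [[d > t for d in row] for row in density]
        for name, t in thresholds.items()
    }


# Hypothetical density values for a 2 x 3 tile and a hypothetical optimal threshold.
density = [[120.0, 5.0, 0.5],
           [0.1, 0.01, 90.0]]
masks = water_masks(density, th_pd=60.0)
```

Since each lower threshold admits every pixel admitted by the higher one, the masks are nested: every water pixel of WM.sub.h is also water in WM.sub.m, and every water pixel of WM.sub.m is also water in WM.sub.l.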
[0103] Step B: Morphological Processing
[0104] The objective of the morphological processing is to use body-level rather than pixel-level features to reduce over-detection and prepare for the next compensation step to reduce under-detection. We begin by acknowledging the following facts:
[0105] 1. Disconnected inundation areas may exist. Therefore, not all water sources are identifiable from “dry date” SAR images and the LCC map.
[0106] 2. Water-like radar responses from non-water surfaces can exist in any SAR image (pre-flood or in-flood).
[0107] 3. Geometric error and noise-like speckle may “confuse” a change detector over targets with thin shapes, such as streets and small creeks.
[0108] We then design for the RAPID system a robust morphological module consisting of two steps: water source tracing (WST) and improved change detection (ICD).
[0109] WST utilizes the RGA to form water bodies from known water sources, that is, pixels that are classified as water on both the LCC map and the radar-derived water mask WM.sub.h (under processing). We then impose a size limit (th.sub.size>50 pixels) on all water bodies, and a limit on the fraction of highly developed classes, the developed ratio (r.sub.dev<30%), on water body pixels, excluding the permanent water pixels that overlap with the LCC data. The argument is that falsely detected water bodies consist of speckle and unchanged non-water smooth surfaces.
[0110] WST has little chance of introducing over-detection caused by non-water smooth surfaces and blocked areas, but it has a high chance of neglecting water areas charged by narrow water paths invisible to the images' resolution. To identify these overlooked water areas further, we use ICD, but only for in-flood water masks.
[0111] We implemented ICD by running RGA again over the remaining water pixels (after muting all water pixels identified by the WST) in WM.sub.m(in), using the positive pixels in the difference water mask, ΔWM=WM.sub.h(in)−WM.sub.m(pre), as seeds. For derived water bodies, we loosened the developed ratio to r.sub.dev<80% and added two over-detection criteria to th.sub.size and r.sub.dev used in WST: the inundation ratio (r.sub.inund>30%) and high probability ratio (r.sub.p>50%). For each water body, we defined the inundation ratio as the difference area—the number of pixels that are classified as water in WM.sub.m(in) but as non-water in WM.sub.m(pre)—over the total area and the high probability ratio as the number of water pixels in WM.sub.h(in) over that in WM.sub.m(in). The reason for running the morphological processing over WM.sub.m rather than WM.sub.h for in-flood images is to reduce under-detection caused by speckle and to facilitate accurate estimation of r.sub.inund and r.sub.p. Note that speckle and changing shadow areas may severely affect the accuracy of change detection. To overcome them in ICD, we are forced to use, respectively, at least four dry references and satellite data of the same track (mode and orbit number). With identified water pixels (actual water or water-like) on multiple pre-flood dates forming the maximal pre-flood water mask, the probability of misidentifying seeding pixels and overestimating inundation ratio is reduced significantly. Since pre- and in-flood SAR data obtained in the same track share the same illumination geometry at any given pixel, they share similar water-like surfaces as well.
[0112] The ICD is different from traditional change detection (Giustarini et al. 2013; Lu et al. 2014; Matgen et al. 2011) in three ways: (1) ICD runs over all remaining non-water pixels after WST. It does not require inundated pixels to be connected to a known water source and, therefore, is capable of detecting inundation of disconnected lowland. (2) Complete water bodies rather than just changed pixels are formed by running RGA in ICD, while changed pixels, ΔWM, only serve as seeding pixels. Therefore, r.sub.inund and r.sub.p can be calculated at object level. Consequently, whereas traditional change detection algorithms measure whether the backscattering of a pixel is significantly changed, ICD measures whether a water body's area is changed significantly to evaluate its inundation severity. And (3) ICD detects changed pixels from a binary water mask instead of from an image of SAR backscattering. In practice, r.sub.inund and r.sub.p were effective to avoid introducing blocked (shadow) areas. The joint use of all four criteria at object level—that is, a water body must satisfy all the criteria to be accepted—made ICD resistant to classification error, noise-like speckle, and geometric error of SAR data. Although the threshold values of the four criteria are empirical, they all have clear physical interpretations, and users do not need to adjust them to different events and regions.
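The object-level acceptance test used in ICD can be sketched as follows. This is an illustrative sketch under stated assumptions: each water body and each mask is represented as a set of (row, column) pixels, the "total area" in the inundation ratio is taken to be the pixel count of the water body, and the function name `icd_accept` and the helper `row` are hypothetical.

```python
def icd_accept(body, wm_h_in, wm_m_in, wm_m_pre, developed,
               th_size=50, max_r_dev=0.80, min_r_inund=0.30, min_r_p=0.50):
    """Return True if one water body satisfies all four ICD criteria.

    body:      pixels of the water body grown by the RGA on WM_m(in)
    wm_h_in:   water pixels of the in-flood high-threshold mask
    wm_m_in:   water pixels of the in-flood moderate-threshold mask
    wm_m_pre:  water pixels of the (maximal) pre-flood moderate-threshold mask
    developed: pixels of highly developed land cover classes
    """
    area = len(body)
    water_m = body & wm_m_in
    if area == 0 or not water_m:
        return False
    r_dev = len(body & developed) / area            # developed ratio
    r_inund = len(water_m - wm_m_pre) / area        # inundation ratio
    r_p = len(body & wm_h_in) / len(water_m)        # high probability ratio
    # A water body must satisfy all criteria jointly to be accepted.
    return (area > th_size and r_dev < max_r_dev
            and r_inund > min_r_inund and r_p > min_r_p)


def row(start, stop):
    """Hypothetical helper: a one-row strip of pixels."""
    return {(0, c) for c in range(start, stop)}


body = row(0, 60)
# Half of the body is newly wet: accepted.
flooded = icd_accept(body, wm_h_in=row(0, 40), wm_m_in=body,
                     wm_m_pre=row(0, 30), developed=row(0, 6))
# Only one sixth of the body is newly wet: rejected by r_inund.
stable = icd_accept(body, wm_h_in=row(0, 40), wm_m_in=body,
                    wm_m_pre=row(0, 50), developed=row(0, 6))
```

The default values are the thresholds stated above (th.sub.size>50, r.sub.dev<80%, r.sub.inund>30%, r.sub.p>50%); exposing them as keyword arguments is only a convenience for the sketch.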
[0113] WST and ICD each overcome the drawbacks of the other: the under-detection of inundation areas with unidentifiable water sources by WST and the exclusion of river-extended flood plains (usually of low r.sub.inund values) by ICD. Overall, the sophisticated morphological processing makes RAPID robust to common errors of ancillary and SAR data.
[0114] Step C: Compensation
[0115] Through morphological processing, most over-detection is removed and the location of all water bodies is determined. The under-detection within and surrounding water bodies is dealt with through compensation, as detailed in the following:
[0116] 1. Generate a buffer region (extending 15 pixels) by swelling from the morphologically processed water mask.
[0117] 2. Label a buffered pixel as water if it is identified as water in the WM.sub.l to generate WM.sub.comp.
[0118] 3. Using all water pixels identified before step 2 as seeds, apply the RGA to WM.sub.comp. The grown water pixels form the final water mask.
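The three compensation steps above can be sketched as follows. This is a minimal sketch, not the RAPID code: masks are represented as sets of (row, column) pixels, the swelling is assumed to use a square neighborhood, the RGA is stood in for by a plain 4-connected flood fill, and WM.sub.comp is assumed to retain the morphologically processed pixels. The names `dilate`, `region_grow`, and `compensate` are hypothetical.

```python
from collections import deque


def dilate(mask, radius, shape):
    """Swell a pixel set by `radius` pixels (square neighborhood, assumed)."""
    rows, cols = shape
    out = set()
    for r, c in mask:
        for dr in range(-radius, radius + 1):
            for dc in range(-radius, radius + 1):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    out.add((rr, cc))
    return out


def region_grow(seeds, candidates):
    """4-connected flood fill from the seeds, restricted to candidate pixels."""
    grown = {s for s in seeds if s in candidates}
    queue = deque(grown)
    while queue:
        r, c = queue.popleft()
        for nb in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if nb in candidates and nb not in grown:
                grown.add(nb)
                queue.append(nb)
    return grown


def compensate(wm_morph, wm_l, shape, buffer_px=15):
    # Step 1: buffer region around the morphologically processed mask.
    buffer_region = dilate(wm_morph, buffer_px, shape)
    # Step 2: buffered pixels that are water in WM_l form WM_comp.
    wm_comp = wm_morph | (buffer_region & wm_l)
    # Step 3: grow from the water pixels known before step 2.
    return region_grow(wm_morph, wm_comp)


# Hypothetical 1 x 10 strip with a 4-pixel buffer for readability.
wm_morph = {(0, 0), (0, 1)}
wm_l = {(0, 0), (0, 1), (0, 2), (0, 3), (0, 5), (0, 9)}
final_mask = compensate(wm_morph, wm_l, shape=(1, 10), buffer_px=4)
```

In this example pixel (0, 5) lies inside the buffer and is water in WM.sub.l, but it is not connected to the seeds, so the region growing leaves it out; pixel (0, 9) lies outside the buffer and is never considered.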
[0119] The buffered area contains outside pixels up to a certain distance and most inside pixels. Misclassified non-water pixels inside each water body are a result of speckle; equivalently, they are pixels distributed in the marginal area of the water PDF. Lowering the threshold of probability density therefore reduces the error inside each water body while not significantly altering the true boundary, as shown in
[0120] Step D: Machine Learning-Based Correction
[0121] Errors caused by noise-like speckle and strong scatterers can occur inside of water bodies in WM.sub.comp. Although filtering approaches dominate SAR processing, they sacrifice the effective resolution and change the statistics of the signal without completely eliminating the error. For this reason, we did not employ a local filter in RAPID. Instead, we constructed an automated correction step based on machine learning. This step assumes that (1) given the noise, the majority of pixels are correctly classified; and (2) high-resolution terrain, river bathymetric, and network data also can provide the possibility ranks of water pixels.
[0122] In this correction step, a logistic binary classifier (LBC) is trained to predict the water probability of pixels in all water bodies and their buffered areas. Water coverage-related features are extracted as input variables, and the water result from the compensation step (described above) is used as a “prediction result” to train the LBC. Finally, user-defined thresholds are applied to the predicted water probability to correct the water mask. Unlike in usual machine-learning procedures, the pixels for training and correction in RAPID are in the same set, and neither cross-validation nor optimization is needed in the training.
[0123] The correction algorithm is depicted in
[0124] Consequently, factors that contribute to the expansion of river bodies are more complex than those affecting SWs, so in constructing the correction step we needed to separate the two categories and construct different feature spaces for them.
[0125] Unfortunately, no existing algorithm can accurately separate standing and flowing water bodies because manmade standing water bodies, such as reservoirs and canals, can be made in a wide variety of shapes or within any part of the fluvial system. Instead of developing an automatic algorithm, we relied mostly on existing datasets based on survey or visual interpretation. For the identification of SWs, we jointly used the HydroLakes dataset (Messager et al. 2016), US detailed water bodies (USDWB, optional, provided by ESRI), and water probability. HydroLakes is mostly accurate for SWs larger than 10 ha, while USDWB assigns a lake/pond or stream label to each segment of the water centerline. The two datasets may help identify more than 90% of water bodies, with those remaining unidentified being small SWs. We used a simple rule to identify the remaining SWs: a small water body is an SW if its P.sub.50 (the water probability that ranks at the 50th percentile) and compactness (the square root of area over the perimeter) are greater than 45% and 25, respectively. Improving the classification of SW and MW is beyond the scope of this study, but deep learning methods may be applied for this purpose in the future.
[0126] Training Samples
[0127] To train the classifier, buffering regions were generated from existing water bodies so both true (water) and false (non-water) pixels would be included. For an LB, we simply swelled the water area by 15 pixels. For river cross sections, we connected a given number (3 to 5) of adjacent central channel pixels, then generated a buffered polygon using twice their maximal river width. Therefore, a river width dataset was needed. Since we lacked this information, water pixels not contained in the training set of SWs or river bodies would not be processed (trained, predicted, or corrected).
[0128] Feature Selection
[0129] A water unit is a group of water pixels that theoretically share the same limit of a given feature. For an SW, the entire water body is a water unit, whereas for an MW, each cross section is an individual water unit. Table 1 provides the feature spaces of SWs and river cross sections (RCs). Each pixel has two types of features: uniform, which are constant for all pixels belonging to a water unit, and distributed, which are different for each pixel. Within a water unit, for example, the elevation of all pixels (a distributed feature) should be smaller than the maximal elevation (a uniform feature) of the water unit.
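TABLE-US-00001 TABLE 1 Feature space of water bodies
Feature Description | Reason to Select | Water Type | Feature Type
Central channel pixel (CCP) FAC | River width is related to drainage area. | RC | Uniform
Maximum distances from both sides to CCP | Greater distance indicates smaller chance of being inundated. | RC | Uniform
Distance from CCP | Greater distance indicates smaller chance of being inundated. | RC | Distributed
Maximal elevation difference to the lowest pixel | Elevation difference should be below the upper limit. | Both | Uniform
Elevation difference ranked at 99%, 97%, 95%, and 90% | Elevation difference should be below the upper limit. | SW | Uniform
Elevation difference to the lowest pixel | Elevation difference should be below the upper limit. | Both | Distributed
Elevation ratio to the highest pixel | Elevation difference should be below the upper limit. | RC | Distributed
Minimal probability | Probability is higher for river and SW centers than for edges (works better for drier situations). | Both | Uniform
Probability ranking at 1%, 2%, 5%, 10%, and 20% | Probability is higher for river and SW centers than for edges (works better for drier situations). | SW | Uniform
Probability | Probability is higher for river and SW centers than for edges (works better for drier situations). | Both | Distributed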
[0130] Ideally, the minimal probability and maximal elevation of an SW set the limits for all pixels within the SW. Due to the relatively coarse resolution and low frequency (15 days) of Landsat images, however, using minimal (0% rank) probability as the lower limit may result in a non-informative zero value of this feature for many SWs. We added, therefore, 1-20% rank probability values. Similarly, since the elevation of an SW can be controlled by a gate, the maximal elevation difference may not be representative of the floodplain boundary. We included, therefore, 90-99% elevation differences, as well. For a river water unit, we simply used the minimal probability or maximal elevation difference, due to the limited number of pixels within each cross section. As coastal areas have less pronounced topography per stream cross section than most inland areas, we included an elevation ratio as a supplement to elevation difference.
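The ranked (uniform) features of an SW water unit can be illustrated as follows. This sketch assumes a nearest-rank percentile definition, which the text does not specify, and the names `rank_value`, `sw_uniform_features`, and the feature keys are hypothetical.

```python
def rank_value(values, percent):
    """Nearest-rank percentile (assumed definition): the smallest sample
    such that at least `percent` percent of the samples are at or below it."""
    ordered = sorted(values)
    if percent <= 0:
        return ordered[0]
    # Integer ceiling of percent * n / 100, avoiding floating-point rounding.
    k = (percent * len(ordered) + 99) // 100
    return ordered[k - 1]


def sw_uniform_features(probabilities, elevation_diffs):
    """Uniform features shared by all pixels of one standing water body."""
    features = {
        "prob_min": min(probabilities),
        "elev_diff_max": max(elevation_diffs),
    }
    for p in (1, 2, 5, 10, 20):
        features["prob_p%d" % p] = rank_value(probabilities, p)
    for p in (99, 97, 95, 90):
        features["elev_diff_p%d" % p] = rank_value(elevation_diffs, p)
    return features


# Hypothetical water unit with 100 pixels.
probabilities = [i / 100 for i in range(1, 101)]   # 0.01 ... 1.00
elevation_diffs = list(range(1, 101))              # 1 ... 100 (arbitrary units)
features = sw_uniform_features(probabilities, elevation_diffs)
```

The ranked values give the classifier a lower probability limit and an upper elevation limit that remain informative when the strict minimum or maximum is dominated by a few outlying pixels.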
[0131] Correction Thresholds
[0132] Typically, a single threshold is applied to the probability result to generate the binary classes. To prevent over-correction and not rely purely on the trained results, we used double thresholds, 0.1 and 0.8. A water probability lower than 0.1 or higher than 0.8 indicated that a given pixel should be labeled as non-water or water class, respectively. Otherwise, if the probability falls in between, the class of the pixel will not change.
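The double-threshold correction can be sketched as follows, assuming the water mask and the predicted water probability are given as nested lists of equal shape. The function name `correct_mask` and the sample values are hypothetical.

```python
def correct_mask(mask, prob, low=0.1, high=0.8):
    """Relabel pixels only where the predicted probability is decisive."""
    corrected = []
    for mask_row, prob_row in zip(mask, prob):
        new_row = []
        for is_water, p in zip(mask_row, prob_row):
            if p < low:
                new_row.append(False)      # confidently non-water
            elif p > high:
                new_row.append(True)       # confidently water
            else:
                new_row.append(is_water)   # in between: keep the class
        corrected.append(new_row)
    return corrected


mask = [[True, True, False, False]]
prob = [[0.05, 0.50, 0.50, 0.95]]
corrected = correct_mask(mask, prob)
```

Only the first and last pixels change class; the two pixels with a probability of 0.50 keep the labels assigned by the compensation step, which is how the double thresholds limit over-correction.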
EXAMPLES
[0133] Two flood events were selected to test the efficiency and robustness of RAPID. Typhoon Nepartak caused flooding of the Yangtze River in 2016, and Hurricane Harvey caused flooding in Texas in 2017. The two events were large enough to be observed by satellite multiple times. Moreover, they occurred outside of and within the United States, respectively, thus allowing us to test the robustness of RAPID using different input data, in different locations and climatic conditions. Table 2 describes the events and data.
TABLE-US-00002 TABLE 2 Data and test events
Event | Location | Data availability of pre-flood dates | Data availability of in-flood dates
Nepartak | Hubei, China | May 6, 18, 23, and 30, 2016 | Jul. 5, 17, and 22, 2016
Harvey | Texas, USA | Jun. 25, Jul. 18, 24, 30, and 31, and Aug. 05, 11, 12, 18, and 23, 2017 | Aug. 29 and 30, and Sep. 4, 5, and 10, 2017
[0134] We acquired Sentinel-1 level-1 dual-polarized (VH+VV or HV+HH) SAR data for Nepartak in IW mode and for Harvey in IW and SM modes in Ground Range Detected (GRD) format. After pre-processing using Sentinel Application Platform (SNAP), the ESA-released toolbox, the resulting pixel spacing was 10×10 m. The pre-processing included four steps:
[0135] 1) Orbit correction
[0136] 2) Radiometric calibration
[0137] 3) Range-Doppler geometric terrain correction
[0138] 4) Incidence angle normalization
Steps 2 and 3 are sometimes referred to as radiometric terrain correction (RTC). For simplicity, we used the algorithm provided by Mladenova et al. (2013) to run step 4. The total processing times for IW mode (˜33,000×21,000 pixels) images are around 6 h and 1 h for in- and pre-flood images, respectively. For the SM mode (˜12,000×17,000 pixels), processing takes around 2 h and 30 min. We processed images of the events on the University of Connecticut's high-performance computer (HPC) in parallel, making the total processing time about 6 hours. We used Matlab and Microsoft R Enterprise (RRE) to implement, respectively, the Steps A-C and the machine learning step.
[0139] Table 3 provides ancillary data options. Categorized by type, they comprise LCC, water occurrence, hydrographic, water type, and river width products. Of the Landsat-based LCC products, the National Land Cover Database (NLCD) (Fry et al. 2011; Homer et al. 2007; Homer et al. 2015; Vogelmann et al. 2001) is available in the United States at five-year intervals, and the Finer Resolution Observation and Monitoring of Global Land Cover database (FROM-GLC) (Gong et al. 2013) (http://data.ess.tsinghua.edu.cn/) has been available all over the globe since 2010. In NLCD taxonomy, water types are coded 90, “water bodies,” and 95, “wetland,” while highly developed types are coded 23, “built-up area with medium density,” and 24, “built-up area with high density.” In FROM-GLC taxonomy, water types are 50, “wetland,” and 60, “water bodies,” while the highly developed type is 80, “artificial surfaces.” For water occurrence, the only available dataset is produced by Pekel et al. (2016). For river width, two products, the Global River Width (GRWidth) (Allen and Pavelsky 2018) and the Global Width Database for Large Rivers (GWD-LR) (Yamazaki et al. 2014), are available. The latter will be available in the future for global applications. For hydrography, the National Hydrography Dataset (NHD) Plus v2 (Simley and Carswell Jr 2009) (www.horizon-systems.com/NHDPlus/NHDP1usV2_home.php) is available at 30 m resolution in the United States, and GWD-LR is available at 90 m resolution globally. We used GRWidth as river width for both the Nepartak and Harvey events; as LCC we used FROM-GLC for Nepartak and NLCD for Harvey; and as hydrography we used GWD-LR for Nepartak and NHD for Harvey.
TABLE-US-00003 TABLE 3 Input data to the RAPID kernel algorithm
Name | Source/Type | Producer | Time Span | Spatial Coverage | Resolution | Revisiting Interval | Needed by Step
Sentinel-1 | SAR | ESA | Since 2014 | Global | 3.5/10 m | ~2 days | A
NLCD | TM/LCC | USGS | 1992-2011 | US | 30 m | 5 years | B
FROM-GLC | TM/LCC | Tsinghua Univ. | 2010 only | Global | 30 m | One time | B
Water Occurrence | TM/water probability | ESA | 1984-2015 | Global | 30 m | Static | A & D
Hydrography | NHD | Horizon Systems Co. | N/A | US | 30 m | Static | D
DEM | STRM | USGS | N/A | Global | 30 m | Static | D
GRWidth | TM/River Width | George Allen | N/A | Global | 30 m | Static | D
GWD-LR | STRM/River Width and Hydrography | Dai Yamazaki | N/A | Global | 90 m | Static | D
HydroLakes | STRM | WWF | N/A | Global | 90 m | Static | D
USDWB | multiple | Esri, USGS, and USEPA | 2018 | US | 4 m | Static | D
[0141] Synchronized optical and SAR data of comparable resolution are rarely available for the same area, especially during a given flood period. To carry out quantitative validation, we compared the RAPID-generated inundation result with an expert, hand-derived inundation delineation (referred to as EE hereafter), drawn intuitively from Sentinel-1 and World-View data.
TABLE-US-00004 TABLE 4 Confusion matrix of inundation mapping
Retrieval | EE Wet | EE Dry
Wet | 12,992,348 (11.09%) | 3,853,426 (3.29%)
Dry | 4,367,647 (3.73%) | 95,979,615 (81.90%)
[0142] Table 4 shows overall agreement: M.sub.11+M.sub.22 accounts for 93% of pixels, the producer accuracy, M.sub.11/(M.sub.11+M.sub.21), is 77%, and the user accuracy, M.sub.11/(M.sub.11+M.sub.12), is 75%. Although the EE map is the best obtainable reference, RAPID did not necessarily produce false positives or negatives among pixels that disagreed. The major portion of “under-detection” by RAPID, for example, is given by
[0144] For Typhoon Nepartak, RAPID results generated from SAR data on Jul. 17, 2016 are given by
[0145] We have developed an NRT inundation mapping system, named RAPID, driven by SAR data of dual polarization, which requires no human interference. By combining statistical classification, morphological processing, multi-threshold compensation, and machine learning-based correction, RAPID extracts at high spatial resolution (10 m) inundated areas that have been flooded from existing water bodies and isolated lowlands. It reduces over- and under-detection and speckle noise, which cause severe problems in existing algorithms, without applying any filtering techniques. By combining the strength of state-of-the-art technologies, such as radar polarimetry and machine learning, with information from multi-source remote-sensing datasets and products at high resolution (>30 m), including LCC, water probability, terrain data, and river bathymetry, RAPID achieved full automation and accuracy, as validated by selected flood events in Hubei, China, and Texas, United States, caused by Typhoon Nepartak (2016) and Hurricane Harvey (2017), respectively.
[0146] The datasets we used are all freely available globally. RAPID is configured to be resistant to low-level source data error, such as misclassification and low updating frequency of LCC data, less-representative water probability (of flood extremity), and limited resolution of terrain data. In addition, RAPID is open to integrating newly emerging datasets and products to produce more accurate inundation results. Overall, the RAPID system processing time is similar to that of regular SAR processing techniques to detect water and is of low cost and high quality in both effective resolution and accuracy.
[0147] Recently, the abundance of freely available SAR data has boosted the ability of the flood-monitoring community to detect inundated areas accurately, often during events. High-resolution inundation maps can be produced without any budgetary concerns regarding, for example, airborne photography missions, as data are freely available from satellites. The RAPID system liberates flood observers from tedious processing work requiring expertise that might not be available during an event. The system can be operationally applied to derive global inundation mapping at intervals of two days (in midlatitude regions) to four days (near the equator) using satellites (both existing and to be launched) equipped with high-resolution SAR sensors, such as Sentinel, the Advanced Land Observing Satellite (ALOS), the Surface Water and Ocean Topography (SWOT) satellite, and the NASA-ISRO SAR Satellite Mission (NISAR).
[0148] Besides the advantages of NRT monitoring, the low manpower cost associated with RAPID facilitates miscellaneous applications, including retrospectively investigating historical flood events stored in inventories such as Shen et al. (2017b) and the DFO using archived SAR data, and evaluating the accuracy of Federal Emergency Management Agency (FEMA) flood-zone maps. With global or regional flood inundation databases populated in the future, the use of RAPID will also benefit the calibration and validation of hydrological and hydrodynamic modeling (Bates et al. 1997; Havnø et al. 1995; Schumann et al. 2005; Shen and Anagnostou 2017; Yamazaki et al. 2011) and studies of inundation risk caused by geomorphological factors (Shen et al. 2017a; Shen et al. 2016). Besides inundation extent, floodwater depth can be inferred with available high-resolution DEM (Cohen et al. 2017).
[0149] Synthetic Aperture Radar (SAR) Imagery
[0150] The recently emergent, freely available satellite-based SAR imagery provides a reasonable spatiotemporal resolution (10 m, ˜6 days) and is not disturbed by cloud cover (Prigent et al. 2016; Aires et al. 2017). Consequently, SAR imagery has gained popularity in delineating flood events. However, due to the algorithm complexity and the requirement of expert manual editing, existing flood archives only respond to emergencies (EC JRC 2015; JPL 2017) or a few major cases (Zeng et al. 2019; Diego et al. 2020). No method has yet facilitated a national-scale inundation extent dataset. This is primarily because fully automated retrieval algorithms with acceptable accuracy have only been recently developed (Shen et al. 2019a), which has limited the use of these data in flood events.
[0151] An unprecedented 10 m resolution flood inundation archive over the contiguous United States (CONUS) was generated from the entire Sentinel-1 SAR archive for the period from January 2016 to the present, based on the Radar Produced Inundation Diary (RAPID) algorithm (Shen et al. 2019b). By combining radar statistics and machine-learning methods with the integration of multisource remote sensing data and products, RAPID achieves full automation and high-level accuracy with zero manual post-processing or expert knowledge. The RAPID system is driven by Sentinel-1 SAR imagery provided by the European Space Agency (ESA), which are the only freely available satellite SAR data with global coverage. By applying an automatic processing chain, the method could be further applied to more sources of SAR data, such as the soon to be launched Surface Water and Ocean Topography (SWOT) and NASA-ISRO SAR (NISAR) missions, which are expected to deliver the next generation of global high quality surface water data (Frasson et al. 2019a; NASA 2019). Ancillary data include water surface occurrence, land cover classification, hydrography, and river width, as detailed in the RAPID kernel algorithm (Shen et al. 2019b). The accuracy of the dataset is assessed by visual and quantitative comparison with National Oceanic and Atmospheric Administration (NOAA) event reports, the Federal Emergency Management Agency (FEMA) derived floodplain maps, and the water extent from the USGS Dynamic Surface Water Extent (DSWE) product. The final product includes flood extent in raster format and the associated event table. The proposed dataset can, therefore, facilitate various applications, including flood monitoring, inundation model calibration and verification (Afshari et al. 2018; Zeng et al. 2020), flood damage and risk assessment (Wing et al. 2017), and mitigation management (Wing et al. 2020).
[0152] To enable big data processing at the national scale, the flood trigger system 102 relies on both in-situ stream stage observations and satellite precipitation estimation to initially identify a potential flooded zone (PFZ) (the maximal extent that may contain flood inundation), within which we acquire and process overpassing SAR images. The flood trigger system 102 detects two types of flooding, fluvial and pluvial, as depicted by
[0153] Based on the spatial proximity and temporal continuity of the daily PFZ, a flood event is defined as follows:
[0154] 1) Merge two spatially disconnected PFZs into one if there exists a pair of points, one in each PFZ, whose distance is equal to or less than 50 km.
[0155] 2) For two PFZs on a day and the next, we associate them with the same event if the intersected area is no less than 70% of the PFZ on either day.
[0156] 3) Update the maximal flood extent as the union of all PFZs within the latest five days.
[0157] 4) Terminate the event if the flood zone is less than 10% of the previous five-day maximal flood extent.
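The four event-formation rules can be sketched as follows. This is a minimal sketch under stated assumptions: each daily PFZ is represented as a set of grid-cell centers in planar kilometer coordinates, "either day" in rule 2 is read as "at least one of the two days," and the function names are hypothetical.

```python
import math


def should_merge(pfz_a, pfz_b, max_km=50.0):
    """Rule 1: merge if any pair of points is no more than 50 km apart."""
    return any(math.hypot(ax - bx, ay - by) <= max_km
               for ax, ay in pfz_a for bx, by in pfz_b)


def same_event(pfz_day, pfz_next, min_fraction=0.70):
    """Rule 2: the intersection covers at least 70% of the PFZ on either day."""
    if not pfz_day or not pfz_next:
        return False
    shared = len(pfz_day & pfz_next)
    return (shared >= min_fraction * len(pfz_day)
            or shared >= min_fraction * len(pfz_next))


def max_extent(daily_pfzs, window=5):
    """Rule 3: union of all PFZs within the latest five days."""
    return set().union(*daily_pfzs[-window:])


def should_terminate(current_pfz, previous_max, min_fraction=0.10):
    """Rule 4: the flood zone is below 10% of the previous maximal extent."""
    return len(current_pfz) < min_fraction * len(previous_max)


# Hypothetical PFZs on a 1 km grid along one row.
day1 = {(x, 0) for x in range(0, 10)}
day2 = {(x, 0) for x in range(2, 10)}
day3 = {(x, 0) for x in range(8, 20)}
extent = max_extent([day1, day2])
```

Here `day1` and `day2` belong to the same event because their intersection covers all of `day2`, whereas `day3` shares only two cells with `day1` and would start a separate event under rule 2.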
[0158] Within a given flood zone, we acquire for retrieval processing the SAR images sensed on the day of flooding and, as dry references, multiple images obtained from the same Sentinel-1 ground track with a certain overlap, sensed on previous dry days. Approximately five dry references are required by the RAPID kernel algorithm for each SAR image acquired on the flood day to reduce the error caused by noise-like speckle. Level-1 dual-polarized (VH+VV or HV+HH) Sentinel-1 SAR images in IW and SM modes and Ground Range Detected (GRD) format are pre-processed via orbit correction, radiometric correction, and terrain correction using the Sentinel Application Platform (SNAP) and then normalized by the incidence angle using the cosine law (Mladenova et al. 2013).
[0159] The pre-processed grid resolution is regularized to 10 m×10 m when inputting to the RAPID kernel algorithm for flood map delineation. The resulting inundation extent raster images are binary water masks, with pixels labeled as water or non-water. Persistent water bodies are delineated as the maximal water extent of the water masks on dry days. A user can, therefore, choose either to highlight only the inundating area or use the total obtained water area.
[0160] The final product contains two sub-datasets. The first sub-dataset is a flood event collection stored as multiple time series in an ESRI shapefile. Each series represents one event and contains several days of multi-polygon features, each representing the PFZ of a day. Each multi-polygon feature contains a unique event ID and the date as fields. The second sub-dataset contains binary flood extent raster files with each pixel labeled as 1 (flooded) or 0 (non-flooded). A separate list is generated to associate the raster file name of each flood extent with the event ID to facilitate event-wise queries. The archive is linked to the Global Active Archive of Large Flood Events database produced by the Dartmouth Flood Observatory (DFO) (Brakenridge et al. 2010; Adhikari et al. 2010) to add the estimates of deaths and displaced persons caused by related events.
[0161] By way of example, the RAPID system has detected 21,589 flood events from January 2016 to June 2019, with
[0162] Four well-known and representative flood events—the 2019 Midwestern flood, Hurricane Florence (2018), Hurricane Harvey (2017), and Hurricane Matthew (2016)—were selected as examples to validate event formation and detection of inundation extent (
[0163] The visual comparison of the RAPID open water extent with the DSWE product (water with high and moderate confidence) shows strong overall agreement, with some differences in the regions where vegetation is concentrated (
[0164] In addition to DSWE, the 100-yr floodplain delineated by FEMA using high-quality local hydraulic/hydrodynamic models (FEMA, 2016) is selected to verify the proposed dataset. As shown in
[0165] To quantitatively evaluate the overall accuracy of the inundation archive, we compare the overlapping areas pixel by pixel using DSWE as reference. Here, the “overlapping area” refers to the common pixels covered by both DSWE and the proposed dataset on the same day. We exclude any pixels identified as cloud, cloud shadow, shaded relief, missing pixels by the scanline corrector, and other types of error recorded by the DSWE mask band in the “overlapping area”. We resample the DSWE pixels to the resolution of Sentinel-1, 10 m×10 m. We also exclude from the comparison pixels labeled in DSWE as potential wetland or water (wetland) with low confidence (Zanter 2019). We use five error metrics in the assessment: overall agreement (OA), user agreement (UA), producer agreement (PA), critical success index (CSI), and detection bias (DB):
[0166] Where TP, TN, FP and FN stand for the true-positive, true-negative, false-positive, and false-negative, respectively, and positive (negative) represent the wet (dry) pixels. Analyzing over 73 billion pixels, the two datasets agree well across all 559 overlapping images, with the OA, UA, PA, CSI, and DB at 99.06%, 87.63%, 91.76%, 81.23%, and 1.27, respectively (
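Four of the five metrics can be illustrated with the sketch below. Because the defining equations are not reproduced in the text, the sketch assumes the conventional definitions of these metrics from the four pixel counts; detection bias is left out because its definition cannot be recovered from the text. The function name `agreement_metrics` and the counts are hypothetical.

```python
def agreement_metrics(tp, tn, fp, fn):
    """Conventional agreement metrics (assumed definitions) from pixel counts.

    tp, tn, fp, fn: true-positive, true-negative, false-positive, and
    false-negative counts, with positive meaning wet and negative meaning dry.
    """
    return {
        "OA": (tp + tn) / (tp + tn + fp + fn),   # overall agreement
        "UA": tp / (tp + fp),                    # user agreement
        "PA": tp / (tp + fn),                    # producer agreement
        "CSI": tp / (tp + fp + fn),              # critical success index
    }


# Hypothetical counts for 1,000 compared pixels.
metrics = agreement_metrics(tp=80, tn=890, fp=10, fn=20)
```

Under these definitions CSI follows from UA and PA as 1/(1/UA+1/PA−1), which is consistent with the reported values of 87.63%, 91.76%, and 81.23%.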