
Mapping Floods: Exploring Thresholding Methods

As part of our collaboration with Mongabay-India, we have utilised spatial analysis and visualisation to accompany their reporting. In June 2023, they published an explainer by Vinaya Kurtkoti on floodplains and their management in India. Their article discusses the ongoing process of concretisation and development in floodplains, which reduces the carrying capacity of rivers, leading to urban flooding. 

Presence of water bodies in Mahad pre and post-flood event.

We received information from the Mongabay-India team on urban floods in different parts of India. The data spanned events more than a decade old as well as recent occurrences. The task was to create maps of the areas before, during and after each flood event. Availability of suitable satellite imagery was key to creating these maps. This was a challenge, as cloud cover during the monsoon season, when the floods occurred, was often 90% or more. Thus the initial, critical step was to check whether clear imagery existed for the required flood dates. For events older than a decade, the additional issue of low-resolution imagery arose. Initially we planned to show the floods visually using the raw satellite images, but since we found no clear imagery for the flood dates, we had to look for other ways to depict the flood-prone areas.

Given the lack of clear imagery for the flood dates, I explored alternative approaches to represent flood-prone zones, experimenting with three distinct thresholding methods on three different platforms.

The first method used Digital Elevation Model (DEM) data in QGIS, chosen for its simple interface. I loaded the area of interest through the QuickMapServices plugin and sourced 30 m DEM data directly from NASA Earthdata with the SRTM Downloader plugin. This method predicts flood-prone areas for a given rise in water level rather than observing an actual flood. I looked for sources such as news articles that reported the water level when the areas were flooded, and set that level as the threshold in the raster calculator. With the threshold set, the tool returned the areas that, based on their elevation, would be inundated if the water rose to that level (a code sketch of the same idea follows the figure below).

Thresholding flood level from SRTM DEM data.
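
For consistency with the GEE code later in this post, here is a minimal sketch of the same elevation-threshold idea in GEE JavaScript; the actual workflow used the QGIS raster calculator. The geometry 'aoi' and the 7 m water level are illustrative placeholders, not values from the published maps.

// Minimal sketch: predicting flood extent from an elevation threshold.
// 'aoi' is a placeholder geometry; 7 is an assumed reported water level (metres).
var dem = ee.Image('USGS/SRTMGL1_003').clip(aoi);  // SRTM 30 m DEM
var floodLevel = 7;
// Pixels at or below the threshold elevation are predicted to be inundated.
var inundated = dem.lte(floodLevel);
Map.addLayer(inundated.selfMask(), {palette: ['0000FF']}, 'Predicted inundation');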

The second method used Sentinel-1 Synthetic Aperture Radar (SAR) data in Google Earth Engine (GEE); SAR imagery was available for the exact date on which the flooding occurred in this area.

I analysed the pixel values of water by comparing images from before and during the flooding; applying a pixel value threshold allowed sudden changes indicative of flooding to be identified. I began by filtering the collection for Mahad city to the pre-flood and flood dates, which gave me two SAR images: one before the flood and one during it. I checked the pixel values of the water bodies in the pre-flood image at various spots and set that value as the threshold. Once I input the threshold and ran the code, GEE highlighted areas in the flood-date image whose pixel values had suddenly changed to water-like values, indicating flooding, while areas with no change were the existing water bodies (a sketch follows the figure below).

Pixel value change of the area marked in red from pre-flood to flood date in the Inspector display box.
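
A minimal sketch of this workflow in GEE JavaScript is given below. The geometry 'mahad', the date windows and the -16 dB threshold are illustrative assumptions, not the exact values behind the published maps.

// Sketch: Sentinel-1 backscatter thresholding for flood detection.
// 'mahad' is a placeholder geometry; dates and the threshold are assumed values.
var s1 = ee.ImageCollection('COPERNICUS/S1_GRD')
    .filter(ee.Filter.eq('instrumentMode', 'IW'))
    .filter(ee.Filter.listContains('transmitterReceiverPolarisation', 'VV'))
    .filterBounds(mahad)
    .select('VV');

var before = s1.filterDate('2021-07-01', '2021-07-15').mosaic().clip(mahad);
var during = s1.filterDate('2021-07-22', '2021-07-24').mosaic().clip(mahad);

// Smooth water reflects the radar signal away from the sensor, so it appears dark.
var threshold = -16;  // dB value read off known water bodies with the Inspector
var waterBefore = before.lt(threshold);
var waterDuring = during.lt(threshold);

// Flooded area = water on the flood date that was not water beforehand.
var flooded = waterDuring.and(waterBefore.not());
Map.addLayer(flooded.selfMask(), {palette: ['0000FF']}, 'Flooded area');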

The third thresholding method was applied to the Commonwealth Games Village area of Delhi. Initially, we hoped to depict actual visuals of the flooding there using satellite imagery. However, clear imagery was not available for the flood dates, so manually demarcating the flood for viewers to differentiate between pre- and post-flood imagery was not possible. Older satellite images dating back to 2010 also posed problems of low spatial resolution, since satellites with enhanced spatial resolution were launched only later, from 2013 onwards. To show a similar situation, we searched for comparable events in recent years and found images from the 2022 flood in Delhi. Even then, satellite images captured during the flood were not usable because of high cloud cover, so I looked for images from just after the event, when cloud cover was low but the imagery was still indicative of flooding, as the water takes time to drain away.

Initially I tried the same method as before, using SAR data. However, it seemed to detect built-up areas such as roads instead of water, so I switched to Sentinel-2 L2A data for this region. According to Bhangale et al., 2020 [1] and Lekhak et al., 2023 [2], band 8 (NIR) together with band 3 (Green) of Sentinel-2 can be used to identify water bodies. I therefore used band 8 from both the pre- and post-flood images to detect inundation. I checked pixel values at various spots and noted approximate minimum and maximum values for the water bodies in the pre-flood image. I then classified this range of values as water and the rest as not-water, and applied the same classification to the post-flood image. The difference between the two classifications gave the change in the water bodies, i.e. the areas that were inundated (see the sketch after the figures below).

Checking pixel values of water bodies from pre-flood images.

Setting threshold values to classify water and not-water.

Results after classification.
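
The steps above can be sketched in GEE JavaScript roughly as follows. The geometry 'delhi', the date windows and the band-8 value range are assumptions for illustration, not the values used for the actual maps.

// Sketch: range-based water classification on Sentinel-2 band 8 (NIR).
// 'delhi' is a placeholder geometry; dates and the value range are assumed.
var s2 = ee.ImageCollection('COPERNICUS/S2_SR_HARMONIZED')
    .filterBounds(delhi)
    .filter(ee.Filter.lt('CLOUDY_PIXEL_PERCENTAGE', 10));

var preNir = s2.filterDate('2022-09-01', '2022-09-10').mosaic().select('B8').clip(delhi);
var postNir = s2.filterDate('2022-09-25', '2022-09-30').mosaic().select('B8').clip(delhi);

// Classify as water the pixels whose NIR values fall in the range observed over water.
function classifyWater(nir) {
  return nir.gte(200).and(nir.lte(800));  // value range noted with the Inspector
}
var preWater = classifyWater(preNir);
var postWater = classifyWater(postNir);

// Inundated area = water after the flood that was not water before it.
var inundated = postWater.and(preWater.not());
Map.addLayer(inundated.selfMask(), {palette: ['FF0000']}, 'Inundated area');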

After applying all three thresholding methods, the question that arises is that of their accuracy. While the first method was a prediction based on elevation and water level, the other two were based entirely on satellite data.

In the case of Mahad, the elevation-based first method matched the level of inundation in the SAR output to some extent, as it predicted the major areas that were inundated. SAR data, as per existing studies, is widely used and considered appropriate for detecting floods, since it is unaffected by cloud cover and, unlike optical satellite imagery, distinguishes the land-water contrast easily. However, SAR data [3] can sometimes misclassify radar shadows and smooth tarmac surfaces around buildings and roads as water. This issue became evident when I experimented with SAR data in the Delhi case.

Presence of water bodies in Yamuna floodplain, pre- and post-flood.

On the other hand, Sentinel-2 data gave results similar to the SAR output in that built-up areas were misclassified as water. Unlike SAR, Sentinel-2 data is affected by atmospheric conditions. The process of setting pixel values is also more manual and subject to individual judgement, potentially leading to under- or overestimation [2].

Sentinel-1 SAR data has been found to be more accurate at detecting floods than Sentinel-2. A study by Nhangumbe et al., 2023 [4] suggests combining the two datasets to attain higher overall accuracy.

Overall, all three methods provided estimates of the major areas that were inundated or likely to be inundated, serving the purpose of the story Mongabay-India wished to convey. Meanwhile, the scope for exploration and improvement remains open!

References

  1. Bhangale, U., More, S., Shaikh, T., Patil, S., & More, N. (2020). Analysis of Surface Water Resources Using Sentinel-2 Imagery. Procedia Computer Science, 171, 2645–2654. (https://doi.org/10.1016/j.procs.2020.04.287)

  2. Lekhak, K., Rai, P., & Budha, P. B. (2023). Extraction of Water Bodies from Sentinel-2 Images in the Foothills of Nepal Himalaya. International Journal of Environment and Geoinformatics, 10(2), 70–81. (https://doi.org/10.30897/ijegeo.1240074)

  3. Rahman, Md. R., & Thakur, P. K. (2018). Detecting, mapping and analysing of flood water propagation using synthetic aperture radar (SAR) satellite data and GIS: A case study from the Kendrapara District of Orissa State of India. The Egyptian Journal of Remote Sensing and Space Science, 21, S37–S41. (https://doi.org/10.1016/j.ejrs.2017.10.002)

  4. Nhangumbe, M., Nascetti, A., & Ban, Y. (2023). Multi-Temporal Sentinel-1 SAR and Sentinel-2 MSI Data for Flood Mapping and Damage Assessment in Mozambique. ISPRS International Journal of Geo-Information, 12(2), 53. (https://doi.org/10.3390/ijgi12020053)

Identifying Forest Fire Scars with GEE

In March 2023, the serene forests of Goa experienced forest fires at an alarming scale. While there are many theories about the cause of the fires, mapping helps establish their scale and the extent of the damage. The sites identified at this stage are also useful for monitoring recovery and the possible spread of invasive species in the aftermath of fires, which can aid efforts to restore and maintain forest health.

In this blogpost, we look at the use of Google Earth Engine (GEE) to locate burn scars and hence identify areas affected by fires. I’ll be outlining the steps taken and sharing the relevant code for each stage of the analysis.

Importing Satellite Imagery

function maskS2clouds(image) {
  var qa = image.select('QA60');

  // Bits 10 and 11 are clouds and cirrus, respectively.
  var cloudBitMask = 1 << 10;
  var cirrusBitMask = 1 << 11;

  // Both flags should be set to zero, indicating clear conditions.
  var mask = qa.bitwiseAnd(cloudBitMask).eq(0)
      .and(qa.bitwiseAnd(cirrusBitMask).eq(0));

  return image.updateMask(mask).divide(10000);
}

// 'goa' is a geometry of the area of interest, imported separately
// (see the note at the end of this post).
var dataset = ee.ImageCollection('COPERNICUS/S2_SR_HARMONIZED')
                  .filterDate('2023-02-01', '2023-03-31')
                  .filterBounds(goa)
                  // Pre-filter to get less cloudy granules.
                  .filter(ee.Filter.lt('CLOUDY_PIXEL_PERCENTAGE', 1))
                  .map(maskS2clouds);
print(dataset);

To begin, I implemented the ‘maskS2clouds’ function to filter out cloudy and cirrus pixels. This function uses the QA60 bitmask band of Sentinel-2 imagery to create a mask identifying cloudless pixels suitable for further processing, and it is mapped across the entire image collection.

Creating a burn scar map necessitates pre-fire and post-fire images for an estimation of the scarred area. For this purpose, I selected appropriate pre- and post-fire images from the collection. I wrote the code below to display all the images in the collection date-wise in the map view.

Display All Images

// Add a date attribute to every image, so images can be displayed and selected by date.
dataset = dataset.map(function(image) {
  // 'system:index' begins with the acquisition date as YYYYMMDD.
  var str = ee.String(image.get('system:index'));
  var sdate = str.slice(0, 8);
  return ee.Image(image).set('date', sdate);
});

print(dataset, 'with date');
var listOfImages = dataset.toList(dataset.size());
print(listOfImages);

// 'band_viz' is an RGB visualization assumed to be defined once at the top, e.g.:
// var band_viz = {bands: ['B4', 'B3', 'B2'], min: 0, max: 0.3};

// Add every image in the dataset to the Map view, labelled by its date.
// (28 is the number of images in this particular collection.)
for (var i = 0; i < 28; i++) {
  var image = ee.Image(listOfImages.get(i));
  var date = image.get('date');
  Map.addLayer(image, band_viz, '' + date.getInfo());
}

The first part of the code adds the date as an attribute to every image in the collection; the second displays each image using its date attribute. The output is shown below. Displaying the images by date let me visually gauge the quality of every image in the collection, which guided my selection of suitable images for the pre- and post-fire scenarios.

Example of all images displayed in the map view.

With all the images on display, I selected 13 February 2023 as the pre-fire image and 10 March 2023 as the post-fire image. Displaying the images this way also highlighted whether the selected images needed to be mosaicked. Mosaicking joins two or more contiguous satellite images so that the analysis covers the entire area of interest. Accordingly, I converted the collection to a list, selected the required images and mosaicked them.

Selecting Pre-fire image

dataset = dataset.toList(dataset.size());
print(dataset, 'imageCollection');

// Mosaic of 13/02/2023 - the pre-fire image tiles.
var preFire = ee.ImageCollection([
  ee.Image(dataset.get(3)),
  ee.Image(dataset.get(6)),
  ee.Image(dataset.get(8)),
  ee.Image(dataset.get(11))
]);
preFire = preFire.mosaic().clip(goa);
print(preFire, 'preFire');
Map.addLayer(preFire, band_viz, 'preFire');

Selecting Post-fire image

// Mosaic of 10/03/2023 - the post-fire image tiles.
var postFire = ee.ImageCollection([
  ee.Image(dataset.get(26)),
  ee.Image(dataset.get(25)),
  ee.Image(dataset.get(24)),
  ee.Image(dataset.get(23))
]);
postFire = postFire.mosaic().clip(goa);
print(postFire, 'postFire');
Map.addLayer(postFire, band_viz, 'postFire');

The next step involved calculating the Normalized Burn Ratio (NBR) for both the pre- and post-fire images. NBR relies on Near Infrared (NIR) and Short Wave Infrared (SWIR2) reflectance: burnt areas exhibit high reflectance in the SWIR2 region and very low reflectance in the NIR region of the electromagnetic spectrum, and NBR is designed to exploit this difference. It is formulated as follows:

NBR = (NIR - SWIR2) / (NIR + SWIR2)

The NBR was calculated for both pre and post-fire images, using the below code.

Normalized Burn Ratio (NBR) Calculation

// Pre- and post-fire NBR = (NIR - SWIR2) / (NIR + SWIR2),
// using Sentinel-2 bands B8 (NIR) and B12 (SWIR2).
var PreNumer = preFire.select('B8').subtract(preFire.select('B12'));
var PreDenomin = preFire.select('B8').add(preFire.select('B12'));
var PreNBR = PreNumer.divide(PreDenomin);
//Map.addLayer(PreNBR, {}, 'Pre-NBR')

var PostNumer = postFire.select('B8').subtract(postFire.select('B12'));
var PostDenomin = postFire.select('B8').add(postFire.select('B12'));
var PostNBR = PostNumer.divide(PostDenomin);

The ensuing step was to take the difference between the pre- and post-fire NBR images. The resulting output shows the extent of the burnt area in the region.

Difference between Pre and Post Fire NBR

// Subtract postNBR from PreNBR to obtain the scar map.
var scar = PreNBR.subtract(PostNBR);
var palette = ['1a9850', '91cf60', 'd9ef8b', 'ffffbf', 'fee08b', 'fc8d59', 'd73027'];
// The min/max stretch here is indicative; adjust it to the scene.
Map.addLayer(scar, {min: -0.2, max: 0.6, palette: palette}, 'ScarMap');

Next, I applied a threshold to isolate the scarred regions, which were then vectorised using a mask.

Define thresholds

// Define thresholds: values below 0.2 fall in zone 1, values of 0.2 or more in zone 2.
var zones = scar.lt(0.09).add(scar.gt(0.09)).add(scar.gte(0.2));
// Mask out the zero-valued pixels.
zones = zones.updateMask(zones.neq(0));

I chose the thresholds by closely examining pixel values in the scar map using the Inspector tool in the GEE Code Editor interface. The Inspector displays the pixel value of every displayed layer at the point clicked on the map.

Inspector tool highlighted in the screengrab.

The output of the ‘define thresholds’ step was a raster with two values, 1 and 2: all values less than 0.2 were classified as 1, while those of 0.2 and above were assigned the value 2. Using the code below, I converted the classified raster to a vector, yielding a downloadable KML file from which to create a comprehensive map of the affected area.

Reduce to Vector

// Convert the zones of the thresholded scars to vectors.
var vectors = zones.addBands(scar).reduceToVectors({
  geometry: goa,
  scale: 10,
  geometryType: 'polygon',
  eightConnected: false,
  labelProperty: 'zone',
  reducer: ee.Reducer.mean(),
  bestEffort: true,
  maxPixels: 3784216672400
});

// Display the thresholded zones.
Map.setCenter(74.13215, 15.46816, 10);
Map.addLayer(zones, {min: 1, max: 3, palette: ['0000FF', '00FF00', 'FF0000']}, 'raster');

// Make a display image for the vectors and add it to the map.
var display = ee.Image(0).updateMask(0).paint(vectors, '000000', 3);
Map.addLayer(display, {palette: '000000'}, 'vectors');

In summary, this method was effective in identifying forest fire scars in Goa. Thresholding plays a very important role in delineating scarred areas, so ground points of scarred areas were valuable for ground truthing. If you’re involved in fire scar mapping, this demonstration could be a useful reference. Kindly get in touch via our contact form if you have any further questions.

Link to the code: https://code.earthengine.google.com/c1010c05dd2762d18626b76a90112223

(Note: the required shapefiles are not included, so the code will not run as-is; it is for reference only.)

Google Earth Engine vs. the Microsoft Planetary Computer

I have been using Google Earth Engine (GEE) for two years now. It was initially difficult to understand the interface, language, analysis and visualisation, but this became simpler over time. Time-series analyses, plots and animated gifs/videos are easy to implement in GEE. Recently, I also began exploring Microsoft Planetary Computer (MPC) for satellite imagery processing. During this process, I made some initial comparisons between GEE and MPC, which I’ve detailed below.

  1. I viewed a spatially mosaicked dataset in MPC. However, one can download only individual images, not a mosaic of the area of interest. Ready-to-use code snippets for individual images are generated in the MPC Explorer and can be used to view the images in the MPC Hub. I also tried mosaicking the Sentinel dataset manually in the MPC Hub using this tutorial, which relies on GDAL’s ‘buildvrt’ function. The tutorial used the National Agriculture Imagery Program (NAIP) dataset; however, the same code failed on the Sentinel dataset, as the Sentinel and NAIP datasets are stored in different hierarchical formats.

  2. Customised functions make basic raster and vector processing easy, thereby enhancing user experience.

  3. The MPC datasets can be visualised and analysed using QGIS tools within MPC without the need to download them.

  4. Users can create Dask Gateways or switch to different computing environments based on their task, to scale the computation and speed up processing. This requires coding experience and is not automated; users decide which environment is best for their task.

  5. STAC (SpatioTemporal Asset Catalog) is a common place for users to look for spatial data. It was created so that a common piece of code can access all spatial datasets.

  6. “gdalcubes” is a library for representing collections of Earth Observation (EO) data as data cubes. Data cubes may simply be exported as NetCDF files or streamed directly into external software such as R or Python. gdalcubes is not a database; it simply links to existing files/GDAL datasets (source).

I believe that when computing over large datasets, MPC has an advantage over GEE, as it provides access to different environments with a range of computational power. The familiarity of known Python and R libraries can also be comforting for experienced users. On the other hand, I realised that not all datasets in MPC are stored homogeneously, so code to extract and manipulate data may not apply uniformly to every dataset. While MPC is very useful for specific use cases, it will become even more useful once the planned introduction of customised raster and vector data manipulation functions is completed.

With regard to GEE, existing tutorials, extensive documentation and greater customisability make tasks such as visualisation, storage, conversion between spatial data formats, and raster and vector operations (such as mosaic and clip) easier for me than in the newer MPC. The most important utility is the flexibility to download analyses from GEE, as this makes the data publishable and replicable. Bulky tasks like supervised classification are also less time-consuming in GEE than in desktop GIS because of customised functions. While GEE requires minimal coding experience, most MPC tasks currently seem accessible only to skilled professionals.

In conclusion, I’d like to caveat this article with the fact that, at the time of writing, I have more experience with GEE than with MPC, and I’m sure I haven’t explored the full capabilities of either platform. Suggestions or comments are welcome; please get in touch via our contact form.