Coral Sea Sentinel 2 Marine Satellite Composite Draft Imagery version 0 (AIMS)

This dataset contains composite satellite images for the Coral Sea region based on 10 m resolution Sentinel 2 imagery from 2015 – 2021. This image collection is intended to allow mapping of the reef and island features of the Coral Sea. This is a draft version of the dataset prepared from approximately 60% of the available Sentinel 2 imagery. A future version of this dataset will be released based on a review of 100% of the available imagery.

This collection contains composite imagery for 31 Sentinel 2 tiles in the Coral Sea. For each tile there are 5 different colour and contrast enhancement styles intended to highlight different features. These include:

- `DeepFalse` - Bands: B1 (coastal aerosol, referred to here as ultraviolet), B2 (blue), B3 (green): False colour image that shows deep marine features to 50 - 60 m depth. This imagery exploits the clear waters of the Coral Sea to allow the ultraviolet band to provide a much deeper view of coral reefs than is typically achievable with true colour imagery. This technique doesn't work where the water is less clear, as the ultraviolet light is scattered easily.
- `DeepMarine` - Bands: B2 (blue), B3 (green), B4 (red): This is a contrast enhanced version of the true colour imagery, focusing on making the deeper features easier to see. Shallow features are over exposed due to the increased contrast.
- `ReefTop` - Bands: B4 (red): This imagery is contrast enhanced to create a mask (black and white) of reef tops, delineating areas that are shallower or deeper than approximately 4 - 5 m. This mask is intended to assist in the creation of a GIS layer equivalent to the 'GBR Dry Reefs' dataset. The depth mapping exploits the limited water penetration of the red channel. In clear water the red channel can only see features to approximately 6 m regardless of the substrate type.
- `Shallow` - Bands: B5 (red edge), B8 (near infrared), B11 (short wave infrared): This false colour imagery focuses on identifying very shallow and dry regions in the imagery.
It exploits the property that the longer wavelength bands progressively penetrate the water less. B5 penetrates the water approximately 3 - 5 m, B8 approximately 0.5 m and B11 < 0.1 m. Features less than a couple of metres deep appear dark blue, while dry areas are white.
- `TrueColour` - Bands: B2 (blue), B3 (green), B4 (red): True colour imagery. This is useful for interpreting what shallow features are, for mapping the vegetation on cays and for identifying beach rock.

For most Sentinel tiles there are two versions of the `DeepFalse` and `DeepMarine` imagery based on different collections (dates). The R1 imagery are composites made up from the best available imagery while the R2 imagery uses the next best set of imagery. The imagery was split this way so that two composites could be created from the pool of available imagery, allowing mapped features to be checked against two images. Typically the R2 imagery will have more artefacts from clouds.

The satellite imagery was processed in tiles (approximately 100 x 100 km) to keep each final image small enough to manage. The dataset only covers the portion of the Coral Sea where there are shallow coral reefs.

# Methods:

The satellite image composites were created by combining multiple Sentinel 2 images using the Google Earth Engine. The core algorithm was:

1. For each Sentinel 2 tile, the set of Sentinel images from 2015 – 2021 was reviewed manually. In some tiles the cloud cover threshold was raised to gather more images, particularly if there were fewer than 20 images available. The Google Earth Engine image IDs of the best images were recorded. These were the images with the clearest water, lowest waves, lowest cloud, and lowest sun glint.
2. A composite image was created from the best images by taking the statistical median of the stack of images selected in the previous stage, after masking out clouds and their shadows (described in detail later).
3. The contrast of the images was enhanced to create a series of products for different uses. The true colour image retained the full range of tones visible, so that bright sand cays still retained some detail. The marine enhanced version stretched the blue, green and red channels so that they focused on the deeper, darker marine features. This stretching was done to ensure that all the dark detail in the deeper areas remained visible after conversion to 8-bit colour imagery. This contrast enhancement resulted in bright areas of the imagery clipping, leading to loss of detail in shallow reef areas and colours of land areas looking off. A reef top estimate was produced from the red channel (B4), where the contrast was stretched so that the imagery contains almost a binary mask. The threshold was chosen to approximate the 5 m depth contour for the clear waters of the Coral Sea. Lastly a false colour image was produced to allow mapping of shallow water features such as cays and islands. This image was produced from B5 (red edge), B8 (near infrared) and B11 (short wave infrared), where blue represents depths from approximately 0.5 – 5 m, green represents areas with 0 – 0.5 m depth, and brown and white correspond to dry land.
4. The various contrast enhanced composite images were exported from Google Earth Engine (default of 32 bit GeoTiff) and reprocessed to smaller LZW compressed 8 bit GeoTiff images using GDAL.

## Cloud Masking

Prior to combining the best images each image was processed to mask out clouds and their shadows. The cloud masking uses the COPERNICUS/S2_CLOUD_PROBABILITY dataset developed by SentinelHub (Google, n.d.; Zupanc, 2017). The mask includes the cloud areas, plus a mask to remove cloud shadows. The cloud shadows were estimated by projecting the cloud mask in the direction opposite the angle to the sun. The shadow distance was estimated in two parts.

A low cloud mask was created based on the assumption that small clouds have a small shadow distance.
These clouds were detected using a 40% cloud probability threshold. They were projected over 400 m, followed by a 150 m buffer to expand the final mask.

A high cloud mask was created to cover longer shadows created by taller, larger clouds. These clouds were detected based on an 80% cloud probability threshold, followed by an erosion and dilation of 300 m to remove small clouds. These were then projected over a 1.5 km distance followed by a 300 m buffer.

The parameters for the cloud masking (probability threshold, projection distance and buffer radius) were determined through trial and error on a small number of scenes. As such, significant improvements to this algorithm are probably possible.

Erosion, dilation and buffer operations were performed at a lower image resolution than the native satellite image resolution to improve the computational speed. The resolution of these operations was adjusted so that they were performed at approximately a 4 pixel filter resolution. This made the cloud mask significantly more spatially coarse than the 10 m Sentinel imagery. This resolution was chosen as a trade-off between the coarseness of the mask versus the processing time for these operations. With 4-pixel filter resolutions these operations still used over 90% of the total processing time, resulting in each image taking approximately 10 min to compute on the Google Earth Engine.

## Sun glint removal and atmospheric correction

Sun glint was removed from the images using the infrared B8 band to estimate the reflection off the water from the sun glint. B8 penetrates water less than 0.5 m and so in water areas it only detects reflections off the surface of the water. The sun glint detected by B8 correlates very highly with the sun glint experienced by the ultraviolet and visible channels (B1, B2, B3 and B4) and so the sun glint in these channels can be removed by subtracting B8 from these channels.
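The subtraction described above can be illustrated with a minimal sketch. This is not the Google Earth Engine code used for the dataset: the function name `remove_sun_glint`, the sample reflectance values and the clamping of negative results to zero are all assumptions made for illustration.

```python
def remove_sun_glint(visible, b8):
    """Subtract the B8 glint estimate from a visible band, pixel by pixel.

    Both arguments are 2D lists of reflectance values with the same shape.
    Clamping at zero is an assumption of this sketch, not a documented
    step of the original processing.
    """
    return [
        [max(v - g, 0.0) for v, g in zip(vis_row, glint_row)]
        for vis_row, glint_row in zip(visible, b8)
    ]


# Hypothetical 2 x 2 patches over deep water: B2 (blue) and B8 (near infrared).
b2 = [[0.12, 0.15], [0.11, 0.20]]
b8 = [[0.02, 0.05], [0.01, 0.10]]

corrected = remove_sun_glint(b2, b8)
```

In this toy patch the pixel with the strongest glint (B8 = 0.10) loses the most brightness, while a pixel where B8 is brighter than the visible band is clamped to zero.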
This simple sun glint correction fails in very shallow and land areas. On land areas B8 is very bright and thus subtracting it from the other channels results in black land. In shallow areas (< 0.5 m) the B8 channel detects the substrate, resulting in too much sun glint correction. To resolve these issues the sun glint correction was adjusted by transitioning to B11 for shallow areas, as it penetrates the water even less than B8. We don't use B11 everywhere because it is half the resolution of B8.

Land areas need their tonal levels to be adjusted to match the water areas after sun glint correction. Ideally this would be achieved using an atmospheric correction that compensates for the contrast loss due to haze in the atmosphere. Complex models for atmospheric correction involve considering the elevation of the surface (higher areas have less atmosphere to pass through) and the weather conditions. Since this dataset is focused on coral reef areas, elevation compensation is unnecessary due to the very low and flat land features being imaged. Additionally the focus of the dataset is on marine features and so only a basic atmospheric correction is needed. Land areas (as determined by very bright B8 areas) were assigned a fixed smaller correction factor to approximate atmospheric correction. This fixed atmospheric correction was determined iteratively so that land areas matched the tonal value of shallow and water areas.

## Image selection

Available Sentinel 2 images with a cloud cover of less than 0.5% were manually reviewed using a Google Earth Engine App [01-select-sentinel2-images.js]( Where there were few images available (fewer than 30 images) the cloud cover threshold was raised to increase the set of images that were reviewed.

Images were excluded from the composites primarily due to two main factors: sun glint and fine scattered clouds. Images were excluded if there was any significant uncorrected sun glint in the image, i.e. the brightness of the sun glint exceeded the sun glint correction. Fine scattered clouds over reef areas were also a strong factor in downgrading the quality rating of the image.

As each satellite image was reviewed it was assigned to one of four classes:

- `Excellent` – Almost perfectly cloud free.
- `Good` – Large sections of the imagery are cloud free, particularly areas of reefs, and there is no remaining sun glint. Clouds in the image are low and not very small.
- `OK` – Moderate areas of the image are cloud free (>30 %), particularly where there are reefs. No remaining sun glint (after correction).
- `Maybe` – Some useful areas of imagery are visible and with enough images the clouds in the image might be able to be removed. Images that have lots of very small clouds are still generally excluded. No significant sun glint (<5%) after correction in the image.

The images were then grouped to create two reference composite images. The first reference composite image (`R1`) was based on the best set of images available (typically the `Excellent`, `Good` and sometimes `OK` images) and the second reference image (`R2`) was made up of the remaining images (typically from the `OK` and `Maybe` categories). If there were enough `Excellent`, `Good` or `OK` images then the `Maybe` category images were unused. The categories and the final images used to create each of the composite images are recorded in the Google Earth Engine script [03-create-composite-Coral-Sea.js](

To speed up the implementation of this draft version of the dataset only approximately 50 – 70 % of the available imagery was reviewed, typically stopping once 30 – 40 images were reviewed, or sufficient good images were collected to create reasonable composite images. Since the images were reviewed from oldest to newest imagery this resulted in a bias towards the composite images containing older imagery.
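To illustrate why several images are needed for each composite, the median compositing step from the Methods can be sketched as below. This is a simplified stand-in for the Google Earth Engine processing, not the original code: the function name `median_composite`, the use of `None` for cloud-masked pixels and the sample values are assumptions.

```python
from statistics import median


def median_composite(stack):
    """Per-pixel median over a stack of images, ignoring masked pixels.

    `stack` is a list of 2D lists of equal shape. A value of None marks a
    pixel removed by the cloud or shadow mask (an assumption of this sketch).
    Pixels masked in every image stay None in the composite.
    """
    n_rows, n_cols = len(stack[0]), len(stack[0][0])
    composite = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            valid = [img[r][c] for img in stack if img[r][c] is not None]
            row.append(median(valid) if valid else None)
        composite.append(row)
    return composite


# Four hypothetical 1 x 3 images of the same location.
stack = [
    [[10, 50, None]],
    [[12, None, None]],
    [[11, 52, None]],
    [[90, 51, None]],  # 90 stands in for a cloud fragment missed by the mask
]

composite = median_composite(stack)
```

With four images the missed cloud value (90) in the first pixel is outvoted by the three clear observations. The third pixel is masked in every image and so remains a gap, which is the kind of masked-cloud artefact that appears when few images are available.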
Where a tile scene was split over two satellite passes more images were previewed and collated to ensure that there were enough images in both the left and right sections of the image tile. A minimum of 4 images were combined for the `OK` and `Good` classifications, and typically 6 – 8 were used for images of the `Maybe` category.

# Format:

GeoTiff - LZW compressed, 8 bit channels, 0 as NoData, imagery as values 1 - 255. Internal tiling and overviews. Average size: 11500 x 11500 pixels and 300 MB per image.

The images in this dataset are all named using a naming convention. An example file name is `CS_AIMS_Sentinel2-marine_V0_R1_DeepFalse_54LZP_201808-202106-n3.tif`. The name is made up from:

- Dataset name (`CS_AIMS_Sentinel2-marine`)
- A dataset version number (`V0`)
- Best imagery (`R1`) or second reference imagery (`R2`)
- Colour and contrast enhancement applied (`DeepFalse`, `DeepMarine`, `ReefTop`, `Shallow`, `TrueColour`)
- Sentinel 2 tile (example: `54LZP`)
- Start and end year and month of the dates of the images in the image composite (example: `201808-202106`)
- Number of images that were combined to make the image (example: `n3`)

# Limitations:

To save development time only 50 - 70 % of all the Sentinel 2 imagery was reviewed to create the final imagery in this version of this dataset.

Heavy contrast enhancements applied to the `DeepMarine` and `DeepFalse` composites result in some scenes being darker than ideal. The thresholds used in the contrast enhancement are fixed and very sensitive to very small variations in uncorrected brightness in each scene. The `DeepFalse` scenes near PNG (55LBK, 54LZP, 55LCJ) are too dark due to a decrease in the water clarity. Ideally the contrast thresholds should be adjusted to make these images more usable.

In some scenes (56KQB and 56KPC) very few Sentinel 2 images were available, leading to poor image composites.
Due to the high contrast enhancement applied in the `DeepFalse` and `DeepMarine` composites and the relatively few images used to create these composites, masked out clouds create significant visual artefacts that need to be considered when mapping from this imagery. Additionally, slight differences in the sensitivity and angle of the Sentinel 2 MSI sensors result in uncorrected diagonal tonal bands in the imagery.

Only simple atmospheric correction was applied to land areas. The sun glint correction algorithm transitions between different correction levels from deep water (B8) to shallow water (B11) and a fixed atmospheric correction for land (bright B8 areas). Slight errors in the tuning of these transitions can result in unnatural tonal steps in the transitions between these areas.

# References:

Google (n.d.) Sentinel-2: Cloud Probability. Earth Engine Data Catalog. Accessed 10 April 2021 from

Zupanc, A., (2017) Improving Cloud Detection with Machine Learning. Medium. Accessed 10 April 2021 from

# Data Location:

This dataset is filed in the eAtlas enduring data repository at: data\Other\CS_AIMS_Sentinel-2-marine_V0

The source code is available on [GitHub](
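When working with the downloaded files, the file naming convention described under Format can be unpacked with a regular expression. The following is a minimal sketch rather than a tool shipped with the dataset: the names `FILENAME_PATTERN` and `parse_name` are hypothetical, and the tile pattern (two digits followed by three capital letters) is an assumption based on the example tile IDs in this record.

```python
import re

# Pattern mirrors the naming convention listed under Format. The tile
# pattern (two digits + three capital letters) is an assumption.
FILENAME_PATTERN = re.compile(
    r"^(?P<dataset>CS_AIMS_Sentinel2-marine)"
    r"_(?P<version>V\d+)"
    r"_(?P<reference>R[12])"
    r"_(?P<style>DeepFalse|DeepMarine|ReefTop|Shallow|TrueColour)"
    r"_(?P<tile>\d{2}[A-Z]{3})"
    r"_(?P<start>\d{6})-(?P<end>\d{6})"
    r"-n(?P<count>\d+)\.tif$"
)


def parse_name(filename):
    """Split a composite image file name into its named components."""
    match = FILENAME_PATTERN.match(filename)
    if match is None:
        raise ValueError(f"Unrecognised file name: {filename}")
    parts = match.groupdict()
    parts["count"] = int(parts["count"])  # number of images in the composite
    return parts


info = parse_name(
    "CS_AIMS_Sentinel2-marine_V0_R1_DeepFalse_54LZP_201808-202106-n3.tif"
)
```

Here `info["tile"]` is `54LZP`, `info["style"]` is `DeepFalse` and `info["count"]` is `3`, which makes it straightforward to pair the `R1` and `R2` composites of the same tile and style.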

This image collection is intended to allow mapping of the reef and island features of the Coral Sea. This version of the imagery was developed to facilitate exploration of what could be seen and mapped from the imagery.

Principal Investigator
Lawrey, Eric, Dr Australian Institute of Marine Science (AIMS)
Point Of Contact
eAtlas Data Manager Australian Institute of Marine Science (AIMS)

Data collected from 01 Oct 2016 until 20 Sep 2021

Data Usage Constraints
  • Attribution 3.0 Australia