This R Markdown
demonstrates how to work with satellite imagery collections using the
rstac and gdalcube libraries in R. The
workflow will cover how to access collections from Microsoft’s Planetary
Computer and how to filter relevant datasets (e.g., Sentinel, Landsat,
MODIS). We will also work with spatial data to extract bounding boxes
for our regions of interest.
library(sf) # For working with spatial data and shapefiles
library(gdalcubes) # For working with satellite data cubes
library(rstac) # For accessing STAC API from Microsoft Planetary Computer
library(dplyr) # For data manipulation
library(terra) # For raster data operations
library(leaflet) # For interactive maps
library(leafem) # For additional leaflet functionalities
library(htmlwidgets) # For additional leaflet functionalities
###Accessing STAC API Collections
Here, we’ll connect to the Microsoft Planetary Computer’s STAC API and retrieve available collections. Then, we filter out some collections and identify their IDs based on Sentinel, Landsat, or MODIS data.
# Set planetary computer STAC API
planetary_computer <- rstac::stac("https://planetarycomputer.microsoft.com/api/stac/v1")
collections_ <- rstac::stac("https://planetarycomputer.microsoft.com/api/stac/v1",force_version=T) %>%
collections() %>%
get_request()
# Extract and print only the IDs of the collections
collection_ids <- sapply(collections_$collections, function(x) x$id)
print(head(collection_ids))
## [1] "daymet-annual-pr" "daymet-daily-hi" "3dep-seamless" "3dep-lidar-dsm"
## [5] "fia" "sentinel-1-rtc"
The above code queries the STAC API and lists all the available collections, such as Sentinel, Landsat, and MODIS datasets. These datasets are valuable for monitoring various aspects of the Earth’s surface. Filtering Relevant Collections
# Filter collections for Sentinel, Landsat, or MODIS
filtered_collections <- collections_$collections[sapply(collections_$collections, function(x) {
grepl("sentinel|landsat|modis", x$id, ignore.case = TRUE)
})]
# Extract and print the IDs of the filtered collections
filtered_collection_ids <- sapply(filtered_collections, function(x) x$id)
print(head(filtered_collection_ids))
## [1] "sentinel-1-rtc" "modis-64A1-061" "landsat-c2-l2"
## [4] "modis-17A2H-061" "modis-11A2-061" "modis-17A2HGF-061"
# Define the selected collection IDs
coleccion_names <- c('sentinel-2-l2a','landsat-c2-l2','modis-17A2HGF-061',
'modis-09A1-061','modis-09Q1-061','modis-11A2-061',
'modis-15A2H-061','modis-15A3H-061')
For the selected collections, here’s a brief description of what each dataset contains:
Landsat Collection 2 Level-2: Atmospherically corrected surface reflectance data.
Sentinel-2 Level-2A: Global imagery in 13 spectral bands at 10m-60m resolution, with bottom-of-atmosphere correction using Sen2Cor.
MODIS Gross Primary Productivity 8-Day Gap-Filled (modis-17A2HGF-061): Cumulative 8-day composites of Gross Primary Productivity at 500-meter resolution.
MODIS Surface Reflectance 8-Day (500m) (modis-09A1-061): Surface spectral reflectance of MODIS bands 1-7, corrected for atmospheric conditions.
MODIS Surface Reflectance 8-Day (250m) (modis-09Q1-061): Surface spectral reflectance of MODIS bands 1 and 2 at 250-meter resolution, corrected for atmospheric conditions.
MODIS Land Surface Temperature/Emissivity 8-Day (modis-11A2-061): Average 8-day per-pixel land surface temperature and emissivity at 1km resolution.
MODIS Leaf Area Index/FPAR 8-Day (modis-15A2H-061): Combined Fraction of Photosynthetically Active Radiation (FPAR) and Leaf Area Index (LAI) data, at 500-meter resolution.
MODIS Leaf Area Index/FPAR 4-Day (modis-15A3H-061): Similar to modis-15A2H-061 but with a 4-day composite.
To interact with our selected datasets, we’ll first load a spatial scenario from a shapefile. This scenario defines the region of interest for which we will extract satellite data.
# Example scenario data (replace this with your real scenario data)
scenario <- sf::st_read("spatialdata/Area_SR2_PRISMA_v2.shp")
## Reading layer `Area_SR2_PRISMA_v2' from data source
## `/Users/carloscamino/Library/Mobile Documents/com~apple~CloudDocs/Workspace/0-toolsrtm-simulator/Vignettes/Stac_server/spatialdata/Area_SR2_PRISMA_v2.shp'
## using driver `ESRI Shapefile'
## Simple feature collection with 1 feature and 1 field
## Geometry type: POLYGON
## Dimension: XY
## Bounding box: xmin: 602317.8 ymin: 4841295 xmax: 606062.1 ymax: 4843639
## Projected CRS: WGS 84 / UTM zone 32N
# Reproject the bounding box to EPSG:4326 (WGS 84) for STAC API compatibility
scenario_4326 <- sf::st_as_sfc(scenario) |>
sf::st_transform(4326) |>
sf::st_bbox()
# Set GDAL cubes options
gdalcubes::gdalcubes_options(parallel = 8)
# Reproject the scenario to EPSG:4326 (WGS 84) for STAC API compatibility
scenario_4326 <- sf::st_transform(scenario, 4326)
# Calculate the centroid of the scenario to use it for centering the map
centroid_coords <- sf::st_coordinates(sf::st_centroid(sf::st_geometry(scenario_4326)))
# Plot the scenario using leaflet
leaflet() %>%
addTiles() %>% # Add default OpenStreetMap tiles
addPolygons(data = scenario_4326, # Use the reprojected scenario for polygons
color = "blue",
weight = 2,
fillColor = "lightblue",
fillOpacity = 0.5) %>%
addScaleBar(position = "bottomleft") %>%
addMiniMap(tiles = providers$Esri.WorldStreetMap, toggleDisplay = TRUE) %>%
setView(lng = centroid_coords[1],
lat = centroid_coords[2],
zoom = 14)
In this section, we will define a function to create a bounding box around a given scenario’s centroid, applying a buffer to ensure we encompass a specified area around that centroid. This bounding box will help us narrow down our satellite data search.
We begin by defining a function called get_bounding_box that takes two parameters: scenario (our spatial feature) and buffer_size (the size of the buffer in meters).
# Function to define the bounding box around a scenario centroid with buffer
get_bounding_box <- function(scenario, buffer_size) {
# Calculate the centroid of the scenario geometry
centroid <- sf::st_centroid(scenario)
# Apply a buffer around the centroid based on the specified buffer size
buffer_area <- sf::st_buffer(centroid, buffer_size)
# Get the bounding box of the buffered area
bb <- sf::st_bbox(buffer_area)
# Reproject the bounding box to EPSG:4326 (WGS 84) for STAC API compatibility
bb_4326 <- sf::st_as_sfc(bb) |>
sf::st_transform(4326) |>
sf::st_bbox()
return(bb_4326) # Return the reprojected bounding box
}
Next, we apply a buffer around the centroid based on the provided buffer_size.
# Input parameters (replace with actual values as needed)
buffer_size <- 300 # Buffer size around the centroid in meters
# Calculate the bounding box using the buffer size around the scenario centroid
bb <- get_bounding_box(scenario, buffer_size)
# Print bounding box to verify the result
print(bb)
## xmin ymin xmax ymax
## 10.28986 43.72517 10.29743 43.73066
The get.satellite_collection function retrieves
satellite imagery from specified collections using the Planetary
Computer STAC API, based on a given scenario, date range, cloud cover
threshold, and an optional buffer size. Initially, it calculates a
bounding box around the scenario’s centroid, applying the specified
buffer size to ensure a broader area is included. The function then
searches for available satellite items within the defined bounding box
and date range, returning a maximum of 500 results. If no images are
found, it raises an error; otherwise, it displays the number of images
retrieved. The function extracts asset names from the first item in the
results and defines the relevant assets for the selected collection,
such as Landsat or MODIS datasets. It then checks if cloud filtering is
applicable for the selected collection, creating an image collection
either with or without cloud cover filtering based on the specified
threshold. Finally, the function returns the filtered image collection,
enabling further analysis or visualization of the satellite data.
# Function to retrieve satellite collections
get.satellite_collection <- function(scenario, collection, date_range, cloud_threshold, buffer_size = 300) {
# Get bounding box based on the scenario centroid with a buffer of buffer_size meters
bbox <- get_bounding_box(scenario, buffer_size)
# Search for items using the planetary computer STAC API
items <- planetary_computer |>
stac_search(collections = collection,
bbox = bbox[1:4], # Use the reduced bounding box
datetime = paste0(date_range[[1]], "/", date_range[[2]]),
limit = 500) |>
post_request()
# Sign the STAC items for authentication
items <- items_sign(
items,
sign_planetary_computer()
)
if (length(items$features) == 0) {
stop("No images found in the specified collection for the given bounding box and date range.")
} else {
cat("Images found:", length(items$features), "\n")
}
# Get the assets for the first item
first_item_assets <- items$features[[1]]$assets
asset_names <- names(first_item_assets)
print("Asset names:") # Debug: Check the available asset names
print(asset_names)
# Define asset names based on the collection
assets <- switch(collection,
"landsat-c2-l2" = asset_names[c(3:4,12:16)], # for landsat level-2
"sentinel-2-l2a" = asset_names[c(3:9,13,11:12,14)], # Example for Sentinel-2 Level-2A
"modis-17A2HGF-061" = c("Gpp_500m",'PsnNet_500m'), # Example for MODIS Gross Primary Productivity
"modis-11A2-061" = c("LST_Day_1km"), # or MODIS Land Surface Temperature
"modis-09A1-061" = asset_names[c(3:9)], #for MODIS Surface Reflectance 8-Day (500m)
"modis-09Q1-061" = c("sur_refl_b01","sur_refl_b02","sur_refl_qc_250m","sur_refl_state_250m"), # Example for MODIS Surface Reflectance 8-Day (250m)
"modis-15A2H-061" = c("Lai_500m","Fpar_500m","LaiStdDev_500m",'FparStdDev_500m','FparLai_QC'), # Example for MODIS Leaf Area Index/FPAR 8-Day
"modis-15A3H-061" = c("Lai_500m","Fpar_500m","LaiStdDev_500m",'FparStdDev_500m','FparLai_QC') # Example for MODIS Leaf Area Index/FPAR 4-Day
)
# Determine if cloud filtering is applicable
if (collection == "modis-09A1-061" || collection == "modis-17A2HGF-061" || collection == "modis-11A2-061"
|| collection == 'modis-09Q1-061'|| collection == 'modis-09A1-061'
|| collection == 'modis-15A2H-061'|| collection == 'modis-15A3H-061') {
# Create image collection without cloud cover filtering
clt <- items$features |>
stac_image_collection(asset_names = assets)
} else {
# Create image collection with cloud cover filtering for other collections
clt <- items$features |>
stac_image_collection(asset_names = assets,
property_filter = function(x) {
x[["eo:cloud_cover"]] < cloud_threshold
})
}
return(clt) # Return the filtered image collection
}
In this section, we define the parameters for retrieving satellite data. We specify a date range from May 1, 2022, to July 31, 2022, during which we want to collect satellite imagery. We also set a cloud cover threshold of 5%, ensuring that the retrieved images have minimal cloud interference.
Next, we create a vector, coleccion_names, which contains the names of the satellite collections we are interested in, including Sentinel-2 Level-2A, Landsat Collection 2 Level-2, and various MODIS datasets. This vector is then printed to the console to confirm the available collections.
Finally, we call the get.satellite_collection function,
passing in the scenario, the first collection from our coleccion_names
vector, the specified date range, the cloud threshold, and a buffer size
of 10 meters. The results, stored in the satellite_collection variable,
are displayed using head() to show the first few entries of the
retrieved satellite data.
## Example usage
date_range <- as.Date(c("2022-05-01", "2022-07-31"))
cloud_threshold <- 5
coleccion_names <- c('sentinel-2-l2a','landsat-c2-l2','modis-17A2HGF-061','modis-09A1-061','modis-09Q1-061','modis-11A2-061','modis-15A2H-061','modis-15A3H-061')
print(coleccion_names)
## [1] "sentinel-2-l2a" "landsat-c2-l2" "modis-17A2HGF-061"
## [4] "modis-09A1-061" "modis-09Q1-061" "modis-11A2-061"
## [7] "modis-15A2H-061" "modis-15A3H-061"
satellite_collection <- get.satellite_collection(scenario, coleccion_names[1],
date_range, cloud_threshold, buffer_size = 10)
## Images found: 76
## [1] "Asset names:"
## [1] "AOT" "B01" "B02"
## [4] "B03" "B04" "B05"
## [7] "B06" "B07" "B08"
## [10] "B09" "B11" "B12"
## [13] "B8A" "SCL" "WVP"
## [16] "visual" "preview" "safe-manifest"
## [19] "granule-metadata" "inspire-metadata" "product-metadata"
## [22] "datastrip-metadata" "tilejson" "rendered_preview"
The get.sentinel2_cube function is designed to create
and process a data cube specifically for Sentinel-2 satellite
collections. It takes a satellite collection object, a spatial shape
(bounding box), a date range for analysis, an aggregation method (either
“mean” or “median”), and an optional parameter to return a dataset along
with the raster cube. The function begins by checking the get.dataset
parameter to determine whether to process a raster or just reflectance
values. It then defines a projected coordinate reference system (CRS)
and transforms the provided shape into this CRS. A cube view is created,
defining the spatial extent and resolution of the raster cube, with
cloud masking applied using the Scene Classification Layer (SCL) to
focus on vegetation pixels. The function checks available bands and
selects the relevant bands (e.g., B02, B03, B04, B08) based on the
specified aggregation method, calculating either the mean or median
across the selected bands over the specified time range. Finally, it
converts the result into a raster format suitable for visualization and
optionally returns a dataset along with the raster cube.
# Function to create and process the data cube for Sentinel-2 collections
get.sentinel2_cube <- function(s_collection, shape, date_range, aggregation_method = "mean", get.dataset=T) {
if (missing(get.dataset) || is.null(get.dataset)) {
get.dataset = F # Set default to F if empty or missing
print('A raster is processing ---')
} else {
print('The reflectance are processing ---')
}
crs_cube <- "EPSG:3035" # Use a projected CRS for raster operations
# Convert bounding box to sf object and transform to the target CRS
shape_cube <- shape |>
sf::st_as_sfc() |>
sf::st_transform(crs_cube) |>
sf::st_bbox(crs = crs_cube)
# Define the cube view (spatial extent and resolution)
view <- cube_view(
srs = crs_cube,
extent = list(
t0 = as.character(date_range[[1]]),
t1 = as.character(date_range[[2]]),
left = shape_cube["xmin"],
right = shape_cube["xmax"],
top = shape_cube["ymax"],
bottom = shape_cube["ymin"]
),
dx = 10, dy = 10, dt = "P1D", # 10m resolution, time unit = 1 day
aggregation = aggregation_method,#"mean", # Set default aggregation to "mean"
resampling = "near" # Resampling method
)
# Mask to remove clouds and shadows using the SCL band
mask <- image_mask("SCL", values = c(0,1,2,3, 6,7,8, 9,10,11,12)) # Clouds,snow,water bad pixels and shadows
# Create a raster cube for the given collection and view
cube <- raster_cube(s_collection, view, mask = mask)
# Check available bands in the cube
available_bands <- names(cube) # Adjust based on how you access bands
print(available_bands)
# Select bands based on the aggregation method
#selected_bands <- available_bands # Example RGB bands
# Filter selected bands based on available bands
selected_bands <- intersect(c("B02", "B03", "B04", "B08"), available_bands)
print(selected_bands)
# Calculate average or median across time for selected bands
if (aggregation_method == "mean") {
result_cube <- cube |>
select_bands(selected_bands) |>
reduce_time(paste0("mean(", selected_bands, ")"))
} else if (aggregation_method == "median") {
result_cube <- cube |>
select_bands(selected_bands) |>
reduce_time(paste0("median(", selected_bands, ")"))
} else {
stop("Invalid aggregation method specified.")
}
# Convert to stars or terra format for visualization
result_raster <- st_as_stars.cube(result_cube) |> rast()
# Generate names based on bands and time range
band_names <- paste0(selected_bands, "_",
format(as.Date(date_range[[1]]), "%Y%m%d"), "_",
format(as.Date(date_range[[2]]), "%Y%m%d"))
# Assign names to the raster layers
names(result_raster) <- selected_bands
if (get.dataset == T){
data.collection <- cube |> as.data.frame(complete_only = TRUE)
data.list <-list(data.collection=data.collection, raster=result_raster)
return(data.list)
} else {
return(result_raster)
}
}
In this example, we first define a date range for analysis. We then
call the get.sentinel2_cube function twice: first to
retrieve the average raster cube and second to retrieve the median
raster cube for the specified date range and bounding box (bb). Each
raster cube is plotted for visual inspection, and summaries are printed
to provide insights into the contents of the cubes. Finally, the
resulting raster cubes are converted to brick format, which is useful
for further analysis or manipulation within R. If
theget.dataset parameter is set toTRUE, the
function returns both the raster result and the underlying dataset as a
dataframe. The function is structured to print relevant messages during
execution, indicating the processing status.
# Get average raster cube
avg_raster_cube <- get.sentinel2_cube(satellite_collection, shape=bb,
date_range, aggregation_method = "mean",get.dataset = F)
## [1] "The reflectance are processing ---"
## [1] "B02" "B03" "B04" "B05" "B06" "B07" "B08" "B11" "B12" "B8A" "SCL"
## [1] "B02" "B03" "B04" "B08"
plot(avg_raster_cube)
# Get median raster cube
median_raster_cube <- get.sentinel2_cube(satellite_collection, shape=bb,
date_range, aggregation_method = "median",get.dataset = F)
## [1] "The reflectance are processing ---"
## [1] "B02" "B03" "B04" "B05" "B06" "B07" "B08" "B11" "B12" "B8A" "SCL"
## [1] "B02" "B03" "B04" "B08"
# Plot median raster cube
plot(median_raster_cube)
#### Leaflet Map Creation
In this section, we compute the Normalized Difference Vegetation Index (NDVI) using the average and median raster cubes derived from Sentinel-2 satellite imagery. NDVI serves as a vital indicator of vegetation health and density, derived from the near-infrared (NIR) and red bands of the satellite data.
Below, we provide R code to generate an interactive Leaflet map that displays both the average and median RGB layers alongside the NDVI layers. Users can easily toggle between different layers to visualize variations in vegetation health and land cover. The map features controls for layer visibility, a scale bar for measuring distances, and dynamic coordinates for mouse positioning, enhancing the overall user experience and interactivity.
# Load the raster data
avg_raster <- raster::brick(avg_raster_cube)
median_raster <- raster::brick(median_raster_cube)
# Calculate NDVI for average and median raster cubes
ndvi_avg <- (avg_raster[[4]] - avg_raster[[3]]) / (avg_raster[[4]] + avg_raster[[3]])
ndvi_median <- (median_raster[[4]] - median_raster[[3]]) / (median_raster[[4]] + median_raster[[3]])
# Assign names to NDVI rasters
names(ndvi_avg) <- "NDVI_Average"
names(ndvi_median) <- "NDVI_Median"
# Create a Leaflet map
m <- leaflet() |>
addTiles(group = "OSM") |>
addTiles(urlTemplate = "https://server.arcgisonline.com/ArcGIS/rest/services/World_Imagery/MapServer/tile/{z}/{y}/{x}",
group = "Esri Tile Layer") |>
# Add average RGB raster layer
leafem::addRasterRGB(avg_raster, r = 1, g = 2, b = 3,
quantile = c(0.2, 0.98),
group = "RGB (Average)") |>
# Add median RGB raster layer
leafem::addRasterRGB(median_raster, r = 4, g = 3, b = 2,
quantile = c(0.2, 0.98),
group = "RGB (Median)") |>
# Add NDVI layers
leaflet::addRasterImage(ndvi_avg, group = "NDVI (Average)", colors = terrain.colors(100)) |>
leaflet::addRasterImage(ndvi_median, group = "NDVI (Median)", colors = terrain.colors(100)) |>
# Set view to the centroid of the bounding box
setView(lng = st_coordinates(st_centroid(st_as_sf(st_as_sfc(bb))))[1],
lat = st_coordinates(st_centroid(st_as_sf(st_as_sfc(bb))))[2],
zoom = 16) |>
# Add layer controls to switch between raster layers and base maps
addLayersControl(
baseGroups = c("OSM", "Esri Tile Layer"),
overlayGroups = c("RGB (Average)", "RGB (Median)", "NDVI (Average)", "NDVI (Median)"),
options = layersControlOptions(collapsed = FALSE)
) |>
addScaleBar(position = "bottomright") |>
addMeasure(primaryLengthUnit = "meters", primaryAreaUnit = "sqmeters") |>
# Add mouse coordinates
leafem::addMouseCoordinates()
# Add click event to extract pixel values
m <- m |>
htmlwidgets::onRender("
function(el, x) {
this.on('click', function(e) {
var lat = e.latlng.lat;
var lng = e.latlng.lng;
// Display coordinates in a popup
L.popup()
.setLatLng(e.latlng)
.setContent('Clicked coordinates: <br>Latitude: ' + lat + '<br>Longitude: ' + lng)
.openOn(this);
});
}"
)
# Print the map
m