Keywords data used; sentinel 2 data used; crop mask

Background

The Water Productivity through Open access of Remotely sensed derived data (WaPOR) monitors and reports on agricultural water productivity through biophysical measures with a focus on Africa and the Near East. This information assists partner countries improve land and water productivity in both rainfed and irrigated agriculture (Peiser et al. 2017).

Evapotranspiration is of particular interest in irrigated agricultural lands. It can be informative to visualise how plant growth (photosynthesis/biomass production) progresses through a season alongside evapotranspiration.

Description

This notebook provides and introduction to WaPOR data and nomenclature, and demonstrates loading and plotting.

First, we load the cropland mask to define the area of interest for the visualisation.
Then, the seasonal pattern of local cropping is inspected using an Enhanced Vegetation Index (EVI) time series to identify a period for the visualistion.
Once the season is defined, evapotranspiration and net primary productivity data are loaded.
Finally, EVI, net primary productivity, and evapotranspiration are plotted in a single visualisation. ***

Getting started

To run this analysis, run all the cells in the notebook, starting with the “Load packages” cell.

Load packages

Import Python packages that are used for the analysis.

Use standard import commands; some are shown below. Begin with any iPython magic commands, followed by standard Python packages, then any additional functionality you need from the Tools package.

[1]:

import datacube
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import rioxarray
import xarray as xr
import geopandas as gpd

from datacube.utils.masking import mask_invalid_data
from deafrica_tools.areaofinterest import define_area
from deafrica_tools.bandindices import calculate_indices
from deafrica_tools.datahandling import load_ard
from deafrica_tools.load_wapor import get_all_WaPORv3_mapsets, get_WaPORv3_info, load_wapor_ds
from deafrica_tools.plotting import display_map
from deafrica_tools.temporal import temporal_statistics, xr_phenology
from odc.geo.geom import Geometry
from odc.algo import xr_reproject
from wapordl import wapor_map

INFO: WaPORDL (`1.0.2`)
WARNING: Latest version is '1.0.3'.
WARNING: Please update pywapor.

Connect to the datacube

Connect to the datacube so we can access DE Africa data. The app parameter is a unique name for the analysis which is based on the notebook file name.

[2]:

dc = datacube.Datacube(app="WaPOR")

Analysis parameters

The cell below specifies the folder where the downloaded data will be stored. If you are using this script repeatedly, it is recommended you empty this folder from time to time to reduce storage on the Sandbox volume.

[3]:

folder = "../Supplementary_data/WaPOR" # folder that the data will be sent to

Next, the area of interest is defined. This can also be a .geojson file which the loading function accepts. Otherwise, there are two methods available:

By specifying the latitude, longitude, and buffer. This method requires you to input the central latitude, central longitude, and the buffer value in square degrees around the center point you want to analyze. For example, lat = 30.75, lon = 31.35, and buffer = 0.1 will select an area with a radius of 0.1 square degrees around the point with coordinates (30.75, 31.35).

Alternatively, you can provide separate buffer values for latitude and longitude for a rectangular area. For example, lat = 30.75, lon = 31.35, and lat_buffer = 0.1 andlon_buffer = 0.08 will select a rectangular area extending 0.1 degrees north and south, and 0.08 degrees east and west from the point (30.75, 31.35).

For reasonable loading times, set the buffer as 0.1 or lower.
By uploading a polygon as an Esri Shapefile. If you choose this option, you will need to upload the geojson or ESRI shapefile into the Sandbox using Upload Files button in the top left corner of the Jupyter Notebook interface. ESRI shapefiles must be uploaded with all the related files (.cpg, .dbf, .shp, .shx). Once uploaded, you can use the shapefile or geojson to define the area of interest. Remember to update the code to call the file you have uploaded.

To use one of these methods, you can uncomment the relevant line of code and comment out the other one. To comment out a line, add the "#" symbol before the code you want to comment out. By default, the first option which defines the location using latitude, longitude, and buffer is being used.

If running the notebook for the first time, keep the default settings below. This will demonstrate how the analysis works and provide meaningful results.

As for the loading WaPOR data notebook, this demonstration notebook loads an area of cropland in the Nile Delta, Egypt. The Nile Delta supports irrigated agriculture in a very arid climate. This means it has very low cloud cover and easily distinguishable cropping patterns from satellite imagery, making it a useful testing area for Earth Observation based analyses.

[4]:

# Method 1: Specify the latitude, longitude, and buffer
aoi = define_area(lat=30.75, lon=31.35, buffer=0.03)

# Method 2: Use a polygon as a GeoJSON or Esri Shapefile.
# aoi = define_area(vector_path='aoi.shp')

#Create a geopolygon and geodataframe of the area of interest
geopolygon = Geometry(aoi["features"][0]["geometry"], crs="epsg:4326")
geopolygon_gdf = gpd.GeoDataFrame(geometry=[geopolygon], crs=geopolygon.crs)

# Get the latitude and longitude range of the geopolygon
lat_range = (geopolygon_gdf.total_bounds[1], geopolygon_gdf.total_bounds[3])
lon_range = (geopolygon_gdf.total_bounds[0], geopolygon_gdf.total_bounds[2])

region = [geopolygon_gdf.total_bounds[0], geopolygon_gdf.total_bounds[1], geopolygon_gdf.total_bounds[2], geopolygon_gdf.total_bounds[3]]

display_map(x=lon_range, y=lat_range)

[4]:

Make this Notebook Trusted to load map: File -> Trust Notebook

Load the cropland mask

In this notebook, we are interested in the phenology of crops. Therefore, we limit our area to cropland only to eliminate the effect of other land cover classes. We use the Digital Earth Africa cropland mask to do this below.

Firstly, we load the cropland mask for the area of interest.

[6]:

cm = dc.load(
    product="crop_mask",
    time=("2019"),
    measurements="filtered",
    resampling="nearest",
    like=ds.geobox,
).filtered.squeeze()

cm.where(cm < 255).plot.imshow(
    add_colorbar=False, figsize=(6, 6)
)  # we filter to <255 to omit missing data
plt.title("Cropland Extent");

../../../_images/sandbox_notebooks_Real_world_examples_Inspecting_WaPOR_through_time_16_0.png

Next, we mask the Sentinel-2 data, ds, to cropland areas.

[7]:

ds = ds.where(cm == 1)

We will use the Enhanced Vegetation Index (EVI) to get an intial picture of phenology. This is a preferred index because it is more sensitive at high levels of greenness, which are observed in densely planted irrigated crops, which we are looking at here. We can therefore better identify progression of the crop season with EVI than with other indices.

[8]:

ds = calculate_indices(ds, index='EVI', satellite_mission="s2")

The cell below resamples the EVI timeseries to regular 10 day intervals. This means that the EVI time series will match the regularity of the WaPOR dekadal data.

[9]:

resample_period = "10D"
window = 4

veg_smooth = (
    ds['EVI']
    .resample(time=resample_period)
    .median()
    .rolling(time=window, min_periods=1)
    .mean()
)

Inspect phenology

Plotting the EVI timeseries over two years (2022 & and 2023) shows that the area we’ve selected follows a ‘double cropping’ pattern where two crops are grown within a 12-month period. Based on the EVI profile, the primary season appears to be from approximately November to June, with a secondary season from approximately June to November. This corresponds with characterisations of cropping systems in the Nile Delta.

[10]:

veg_smooth_1D = veg_smooth.mean(["x", "y"])
veg_smooth_1D.plot.line("b-^", figsize=(15, 5), color='green')
_max = veg_smooth_1D.max()
_min = veg_smooth_1D.min()

plt.title("EVI time-series")
plt.ylabel('EVI');

../../../_images/sandbox_notebooks_Real_world_examples_Inspecting_WaPOR_through_time_24_0.png

For the remaining visualisations, we will focus on the primary (winter) season between approximately October 2022 and June 2023. We will conduct a more definitive phenology analysis to identify the start and end of season, but first we need to limit our timeseries to a discrete season, as below.

[11]:

veg_smooth_1D = veg_smooth.mean(["x", "y"]).sel(time=slice('2022-10', '2023-06'))
veg_smooth_1D.plot.line("b-^", figsize=(15, 5))
_max = veg_smooth_1D.max()
_min = veg_smooth_1D.min()

plt.title("EVI time-series")
plt.ylabel('EVI');

../../../_images/sandbox_notebooks_Real_world_examples_Inspecting_WaPOR_through_time_26_0.png

Calculate phenology statistics

The cell below calculates phenological statistics for the selected timeseries, which will enable us to limit our analysis to a defined crop season.

The statistics are:

SOS - start of season date and its value (vSOS)
POS - peak of season date and its value (vPOS)
EOS - end of season date and its value (EOS)
Trough - minimum value
LOS - length of season (EOS - SOS)
AOS - amplitude (vPOS - Trough)
ROG - rate of greening = (vPOS - vSOS) / (POS - SOS)
ROS - rate of senescing = (vEOS - vPOS) / (EOS - POS)

We also define the methods for detection of SOS and EOS, as below.

method_sos : If ‘first’ then vSOS is estimated as the first positive slope on the greening side of the curve. If ‘median’, then vSOS is estimated as the median value of the postive slopes on the greening side of the curve.

method_eos : If ‘last’ then vEOS is estimated as the last negative slope on the senescing side of the curve. If ‘median’, then vEOS is estimated as the ‘median’ value of the negative slopes on the senescing side of the curve.

More information on phenological methods is available in the detailed documentation of temporal statistics.

[12]:

basic_pheno_stats = [
    "SOS",
    "vSOS",
    "POS",
    "vPOS",
    "EOS",
    "vEOS",
    "Trough",
    "LOS",
    "AOS",
    "ROG",
    "ROS",
]
method_sos = "first"
method_eos = "last"

years = veg_smooth_1D.groupby("time.year")

# store results in dict
pheno_results = {}

# loop through years and calculate phenology
for y, year in years:
    # calculate stats
    stats = xr_phenology(
        year,
        method_sos=method_sos,
        method_eos=method_eos,
        stats=basic_pheno_stats,
        verbose=False,
    )
    # add results to dict
    pheno_results[str(y)] = stats

df_dict = {}
for key, value in pheno_results.items():
    df_dict_1 = {}
    for b in value.data_vars:
        if value[b].dtype == np.dtype("<M8[ns]") or value[b].dtype == np.dtype("int16"):
            result = pd.to_datetime(value[b].values)
        else:
            result = round(float(value[b].values), 3)
        df_dict_1[b] = result
    df_dict[key] = df_dict_1

df = (pd.DataFrame(df_dict)).T
df

[12]:

	SOS	vSOS	POS	vPOS	EOS	vEOS	Trough	LOS	AOS	ROG	ROS
2022	2022-11-09 00:00:00	0.223	2022-12-29 00:00:00	0.53	NaT	NaN	0.214	0.0	0.316	0.006	NaN
2023	2023-01-08 00:00:00	0.59	2023-03-19 00:00:00	0.703	2023-06-17 00:00:00	0.225	0.225	160.0	0.478	0.002	-0.005

Season plot

Below, the EVI timeseries is plotted with start of season, peak of season, and end of season labelled.

[13]:

fig, ax = plt.subplots(figsize=(8,6))

y = veg_smooth_1D
x = veg_smooth_1D.time

line, = ax.plot(x,y, color='green')
ax.annotate('SOS', xy=(df.SOS[0], df.Trough[0]), xytext=(df.SOS[0], df.Trough[0]+0.09),
            arrowprops=dict(facecolor='black', shrink=0.05),)
ax.annotate('POS', xy=(df.POS[1], df.vPOS[1]), xytext=(df.POS[1], df.vPOS[1]-0.09),
            arrowprops=dict(facecolor='black', shrink=0.05),)
ax.annotate('EOS', xy=(df.EOS[1], df.vEOS[1]), xytext=(df.EOS[1], df.vEOS[1]+0.09),
            arrowprops=dict(facecolor='black', shrink=0.05),)
plt.show()

../../../_images/sandbox_notebooks_Real_world_examples_Inspecting_WaPOR_through_time_30_0.png

Load and plot dekadal Net Primary Productivity

Dekadal biomass production (net primary productivity) is loaded below, then reprojected and crpped to the cropland area as for evapotranspiration.

[18]:

variable = 'L3-NPP-D'

nppd = wapor_map(region, variable, period, folder, extension = '.nc')

nppd_xr = load_wapor_ds(filename=nppd, variable=variable)

# Reproject data
nppd_xr_reprojected = xr_reproject(src=nppd_xr,
                              geobox=ds.geobox,
                              resampling="average")

#Set nodata to `NaN`
nppd_xr_reprojected = mask_invalid_data(nppd_xr_reprojected)

nppd_xr_crop = nppd_xr_reprojected.where(cm == 1)

nppd_xr_crop['L3-NPP-D'].mean(["x", "y"]).plot.line("g-^", figsize=(11, 4))
plt.title("Zonal mean of NPP");

WARNING: `region` intersects with multiple L3 regions (['ENO', 'ZAN']), continuing with ENO only.
INFO: Found 23 files for L3-NPP-D.
INFO: Converting from `.tif` to `.nc`.

../../../_images/sandbox_notebooks_Real_world_examples_Inspecting_WaPOR_through_time_40_1.png

Combine variables into one plot

Finally, EVI, ET, and NPP are plotted together to inspect the progression of the crop season. This can be used to draw inferences about the crop season. For example, we can see a dip in NPP & ET following February 2023, associated with a slowing in the rate of EVI growth. This suggests crop growth activity slowed. It’s not clear what caused this. It could be a lack of irrigation water, a nutrient issue, or thermal stress. This analysis is therefore exploratory and informs further analyses that could be undertaken.

[19]:

fig, axs = plt.subplots(3, sharex=True, figsize=(14,8), gridspec_kw={'hspace': 0.5})

veg_smooth_1D_slice = veg_smooth_1D.sel(time=slice(*period))

axs[0].set_title('EVI')
axs[0].set_ylabel('EVI')
axs[0].plot(veg_smooth_1D_slice.time, veg_smooth_1D_slice, color='green')

axs[1].set_title('NPP')
axs[1].set_ylabel('gC/m²/day')
axs[1].plot(nppd_xr_crop['L3-NPP-D'].mean(["x", "y"]).time, nppd_xr_crop['L3-NPP-D'].mean(["x", "y"]), color='black')

axs[2].set_title('Actual Evapotranspiration and Interception')
axs[2].set_ylabel('mm')
axs[2].plot(aetid_xr_crop['L3-AETI-D'].mean(["x", "y"]).time, aetid_xr_crop['L3-AETI-D'].mean(["x", "y"]), color='blue')
axs[2].xaxis_date()
axs[2].xaxis.set_major_formatter(mdates.DateFormatter("%b-%Y"))

../../../_images/sandbox_notebooks_Real_world_examples_Inspecting_WaPOR_through_time_42_0.png

Conclusion

This notebook demonstrated using WaPOR through a crop season and integrating it with phenology tools to understand seasonal crop patterns. The seasonal analysis techniques and visualisations shown in this notebook inform further analyses that could be undertaken.

Additional information

License: The code in this notebook is licensed under the Apache License, Version 2.0. Digital Earth Africa data is licensed under the Creative Commons by Attribution 4.0 license.

Contact: If you need assistance, please post a question on the Open Data Cube Slack channel or on the GIS Stack Exchange using the open-data-cube tag (you can view previously asked questions here). If you would like to report an issue with this notebook, you can file one on Github.

Compatible datacube version:

[20]:

print(datacube.__version__)

1.8.20

Last Tested:

[21]:

from datetime import datetime
datetime.today().strftime('%Y-%m-%d')

[21]:

'2025-02-17'