Keywords data used; sentinel 2 data used; crop mask
Background
The Water Productivity through Open access of Remotely sensed derived data (WaPOR) monitors and reports on agricultural water productivity through biophysical measures with a focus on Africa and the Near East. This information assists partner countries improve land and water productivity in both rainfed and irrigated agriculture (Peiser et al. 2017).
Evapotranspiration is of particular interest in irrigated agricultural lands. It can be informative to visualise how plant growth (photosynthesis/biomass production) progresses through a season alongside evapotranspiration.
Description
This notebook provides and introduction to WaPOR data and nomenclature, and demonstrates loading and plotting.
First, we load the cropland mask to define the area of interest for the visualisation.
Then, the seasonal pattern of local cropping is inspected using an Enhanced Vegetation Index (EVI) time series to identify a period for the visualistion.
Once the season is defined, evapotranspiration and net primary productivity data are loaded.
Finally, EVI, net primary productivity, and evapotranspiration are plotted in a single visualisation. ***
Getting started
To run this analysis, run all the cells in the notebook, starting with the “Load packages” cell.
Load packages
Import Python packages that are used for the analysis.
Use standard import commands; some are shown below. Begin with any iPython magic commands, followed by standard Python packages, then any additional functionality you need from the Tools package.
[1]:
import datacube
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import rioxarray
import xarray as xr
import geopandas as gpd
from datacube.utils.masking import mask_invalid_data
from deafrica_tools.areaofinterest import define_area
from deafrica_tools.bandindices import calculate_indices
from deafrica_tools.datahandling import load_ard
from deafrica_tools.load_wapor import get_all_WaPORv3_mapsets, get_WaPORv3_info, load_wapor_ds
from deafrica_tools.plotting import display_map
from deafrica_tools.temporal import temporal_statistics, xr_phenology
from odc.geo.geom import Geometry
from odc.algo import xr_reproject
from wapordl import wapor_map
INFO: WaPORDL (`1.0.2`)
WARNING: Latest version is '1.0.3'.
WARNING: Please update pywapor.
Connect to the datacube
Connect to the datacube so we can access DE Africa data. The app parameter is a unique name for the analysis which is based on the notebook file name.
[2]:
dc = datacube.Datacube(app="WaPOR")
Analysis parameters
The cell below specifies the folder where the downloaded data will be stored. If you are using this script repeatedly, it is recommended you empty this folder from time to time to reduce storage on the Sandbox volume.
[3]:
folder = "../Supplementary_data/WaPOR" # folder that the data will be sent to
Next, the area of interest is defined. This can also be a .geojson file which the loading function accepts. Otherwise, there are two methods available:
By specifying the latitude, longitude, and buffer. This method requires you to input the central latitude, central longitude, and the buffer value in square degrees around the center point you want to analyze. For example,
lat = 30.75,lon = 31.35, andbuffer = 0.1will select an area with a radius of 0.1 square degrees around the point with coordinates (30.75, 31.35).Alternatively, you can provide separate buffer values for latitude and longitude for a rectangular area. For example,
lat = 30.75,lon = 31.35, andlat_buffer = 0.1andlon_buffer = 0.08will select a rectangular area extending 0.1 degrees north and south, and 0.08 degrees east and west from the point(30.75, 31.35).For reasonable loading times, set the buffer as
0.1or lower.By uploading a polygon as an
Esri Shapefile. If you choose this option, you will need to upload the geojson or ESRI shapefile into the Sandbox using Upload Files button
in the top left corner of the Jupyter Notebook interface. ESRI shapefiles must be uploaded with all the related files (.cpg, .dbf, .shp, .shx). Once uploaded, you can use the shapefile or geojson to define the area of interest. Remember to update the code to call the file you have uploaded.
To use one of these methods, you can uncomment the relevant line of code and comment out the other one. To comment out a line, add the "#" symbol before the code you want to comment out. By default, the first option which defines the location using latitude, longitude, and buffer is being used.
If running the notebook for the first time, keep the default settings below. This will demonstrate how the analysis works and provide meaningful results.
As for the loading WaPOR data notebook, this demonstration notebook loads an area of cropland in the Nile Delta, Egypt. The Nile Delta supports irrigated agriculture in a very arid climate. This means it has very low cloud cover and easily distinguishable cropping patterns from satellite imagery, making it a useful testing area for Earth Observation based analyses.
[4]:
# Method 1: Specify the latitude, longitude, and buffer
aoi = define_area(lat=30.75, lon=31.35, buffer=0.03)
# Method 2: Use a polygon as a GeoJSON or Esri Shapefile.
# aoi = define_area(vector_path='aoi.shp')
#Create a geopolygon and geodataframe of the area of interest
geopolygon = Geometry(aoi["features"][0]["geometry"], crs="epsg:4326")
geopolygon_gdf = gpd.GeoDataFrame(geometry=[geopolygon], crs=geopolygon.crs)
# Get the latitude and longitude range of the geopolygon
lat_range = (geopolygon_gdf.total_bounds[1], geopolygon_gdf.total_bounds[3])
lon_range = (geopolygon_gdf.total_bounds[0], geopolygon_gdf.total_bounds[2])
region = [geopolygon_gdf.total_bounds[0], geopolygon_gdf.total_bounds[1], geopolygon_gdf.total_bounds[2], geopolygon_gdf.total_bounds[3]]
display_map(x=lon_range, y=lat_range)
[4]:
Load Sentinel-2 imagery
We will use surface reflectance data to calculate indices for an initial assessment of crop phenology, so we load Sentinel-2 data for this.
[5]:
period = ["2022-01-01", "2023-12-31"]
query = {
"y": (region[1],region[3]),
"x": (region[0],region[2]),
"time": period,
"measurements": ["red", "green", "blue", "nir"],
"resolution": (-20, 20),
"output_crs": "epsg:6933", #match wapor
"group_by": "solar_day",
}
# Load available data from Sentinel-2
ds = load_ard(
dc=dc,
products=["s2_l2a"],
**query,
)
ds
Using pixel quality parameters for Sentinel 2
Finding datasets
s2_l2a
Applying pixel quality/cloud mask
Loading 146 time steps
[5]:
<xarray.Dataset> Size: 224MB
Dimensions: (time: 146, y: 331, x: 290)
Coordinates:
* time (time) datetime64[ns] 1kB 2022-01-03T08:41:33 ... 2023-12-29...
* y (y) float64 3kB 3.745e+06 3.745e+06 ... 3.738e+06 3.738e+06
* x (x) float64 2kB 3.022e+06 3.022e+06 ... 3.028e+06 3.028e+06
spatial_ref int32 4B 6933
Data variables:
red (time, y, x) float32 56MB 494.0 513.0 406.0 ... 532.0 614.0
green (time, y, x) float32 56MB 991.0 968.0 835.0 ... 615.0 722.0
blue (time, y, x) float32 56MB 653.0 625.0 576.0 ... 358.0 409.0
nir (time, y, x) float32 56MB 4.67e+03 5.447e+03 ... 2.059e+03
Attributes:
crs: epsg:6933
grid_mapping: spatial_refLoad the cropland mask
In this notebook, we are interested in the phenology of crops. Therefore, we limit our area to cropland only to eliminate the effect of other land cover classes. We use the Digital Earth Africa cropland mask to do this below.
Firstly, we load the cropland mask for the area of interest.
[6]:
cm = dc.load(
product="crop_mask",
time=("2019"),
measurements="filtered",
resampling="nearest",
like=ds.geobox,
).filtered.squeeze()
cm.where(cm < 255).plot.imshow(
add_colorbar=False, figsize=(6, 6)
) # we filter to <255 to omit missing data
plt.title("Cropland Extent");
Next, we mask the Sentinel-2 data, ds, to cropland areas.
[7]:
ds = ds.where(cm == 1)
We will use the Enhanced Vegetation Index (EVI) to get an intial picture of phenology. This is a preferred index because it is more sensitive at high levels of greenness, which are observed in densely planted irrigated crops, which we are looking at here. We can therefore better identify progression of the crop season with EVI than with other indices.
[8]:
ds = calculate_indices(ds, index='EVI', satellite_mission="s2")
The cell below resamples the EVI timeseries to regular 10 day intervals. This means that the EVI time series will match the regularity of the WaPOR dekadal data.
[9]:
resample_period = "10D"
window = 4
veg_smooth = (
ds['EVI']
.resample(time=resample_period)
.median()
.rolling(time=window, min_periods=1)
.mean()
)
Inspect phenology
Plotting the EVI timeseries over two years (2022 & and 2023) shows that the area we’ve selected follows a ‘double cropping’ pattern where two crops are grown within a 12-month period. Based on the EVI profile, the primary season appears to be from approximately November to June, with a secondary season from approximately June to November. This corresponds with characterisations of cropping systems in the Nile Delta.
[10]:
veg_smooth_1D = veg_smooth.mean(["x", "y"])
veg_smooth_1D.plot.line("b-^", figsize=(15, 5), color='green')
_max = veg_smooth_1D.max()
_min = veg_smooth_1D.min()
plt.title("EVI time-series")
plt.ylabel('EVI');
For the remaining visualisations, we will focus on the primary (winter) season between approximately October 2022 and June 2023. We will conduct a more definitive phenology analysis to identify the start and end of season, but first we need to limit our timeseries to a discrete season, as below.
[11]:
veg_smooth_1D = veg_smooth.mean(["x", "y"]).sel(time=slice('2022-10', '2023-06'))
veg_smooth_1D.plot.line("b-^", figsize=(15, 5))
_max = veg_smooth_1D.max()
_min = veg_smooth_1D.min()
plt.title("EVI time-series")
plt.ylabel('EVI');
Calculate phenology statistics
The cell below calculates phenological statistics for the selected timeseries, which will enable us to limit our analysis to a defined crop season.
The statistics are:
SOS - start of season date and its value (vSOS)
POS - peak of season date and its value (vPOS)
EOS - end of season date and its value (EOS)
Trough - minimum value
LOS - length of season (EOS - SOS)
AOS - amplitude (vPOS - Trough)
ROG - rate of greening = (vPOS - vSOS) / (POS - SOS)
ROS - rate of senescing = (vEOS - vPOS) / (EOS - POS)
We also define the methods for detection of SOS and EOS, as below.
method_sos : If ‘first’ then vSOS is estimated as the first positive slope on the greening side of the curve. If ‘median’, then vSOS is estimated as the median value of the postive slopes on the greening side of the curve.
method_eos : If ‘last’ then vEOS is estimated as the last negative slope on the senescing side of the curve. If ‘median’, then vEOS is estimated as the ‘median’ value of the negative slopes on the senescing side of the curve.
More information on phenological methods is available in the detailed documentation of temporal statistics.
[12]:
basic_pheno_stats = [
"SOS",
"vSOS",
"POS",
"vPOS",
"EOS",
"vEOS",
"Trough",
"LOS",
"AOS",
"ROG",
"ROS",
]
method_sos = "first"
method_eos = "last"
years = veg_smooth_1D.groupby("time.year")
# store results in dict
pheno_results = {}
# loop through years and calculate phenology
for y, year in years:
# calculate stats
stats = xr_phenology(
year,
method_sos=method_sos,
method_eos=method_eos,
stats=basic_pheno_stats,
verbose=False,
)
# add results to dict
pheno_results[str(y)] = stats
df_dict = {}
for key, value in pheno_results.items():
df_dict_1 = {}
for b in value.data_vars:
if value[b].dtype == np.dtype("<M8[ns]") or value[b].dtype == np.dtype("int16"):
result = pd.to_datetime(value[b].values)
else:
result = round(float(value[b].values), 3)
df_dict_1[b] = result
df_dict[key] = df_dict_1
df = (pd.DataFrame(df_dict)).T
df
[12]:
| SOS | vSOS | POS | vPOS | EOS | vEOS | Trough | LOS | AOS | ROG | ROS | |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 2022 | 2022-11-09 00:00:00 | 0.223 | 2022-12-29 00:00:00 | 0.53 | NaT | NaN | 0.214 | 0.0 | 0.316 | 0.006 | NaN |
| 2023 | 2023-01-08 00:00:00 | 0.59 | 2023-03-19 00:00:00 | 0.703 | 2023-06-17 00:00:00 | 0.225 | 0.225 | 160.0 | 0.478 | 0.002 | -0.005 |
Season plot
Below, the EVI timeseries is plotted with start of season, peak of season, and end of season labelled.
[13]:
fig, ax = plt.subplots(figsize=(8,6))
y = veg_smooth_1D
x = veg_smooth_1D.time
line, = ax.plot(x,y, color='green')
ax.annotate('SOS', xy=(df.SOS[0], df.Trough[0]), xytext=(df.SOS[0], df.Trough[0]+0.09),
arrowprops=dict(facecolor='black', shrink=0.05),)
ax.annotate('POS', xy=(df.POS[1], df.vPOS[1]), xytext=(df.POS[1], df.vPOS[1]-0.09),
arrowprops=dict(facecolor='black', shrink=0.05),)
ax.annotate('EOS', xy=(df.EOS[1], df.vEOS[1]), xytext=(df.EOS[1], df.vEOS[1]+0.09),
arrowprops=dict(facecolor='black', shrink=0.05),)
plt.show()
Load and plot dekadal evapotranspiration
Now that we have identified the start and end of season, we can load evapotranspiration data. We load this at 10 day (dekadal) intervals below.
Clicking on the attributes element when the dataset is shown reveals units, scale factors, and other useful information.
Note that scale and offset values have been incorporated into the load_wapor_ds() function so the values returned are in the units shown below, in this case mm/day.
[14]:
period = [pd.to_datetime(df.SOS)[0], pd.to_datetime(df.EOS)[1]]
variable = 'L3-AETI-D'
aetid = wapor_map(region, variable, period, folder, extension = '.nc')
aetid_xr = load_wapor_ds(filename=aetid, variable=variable)
aetid_xr
WARNING: `region` intersects with multiple L3 regions (['ENO', 'ZAN']), continuing with ENO only.
INFO: Found 23 files for L3-AETI-D.
INFO: Converting from `.tif` to `.nc`.
[14]:
<xarray.Dataset> Size: 18MB
Dimensions: (x: 294, y: 330, time: 23)
Coordinates:
* x (x) float64 2kB 3.391e+05 3.392e+05 ... 3.45e+05 3.45e+05
* y (y) float64 3kB 3.406e+06 3.406e+06 ... 3.4e+06 3.4e+06
spatial_ref int32 4B 32636
* time (time) datetime64[ns] 184B 2022-11-01 2022-11-11 ... 2023-06-11
Data variables:
L3-AETI-D (time, y, x) float64 18MB 1.7 1.5 1.8 2.0 ... 2.6 2.7 3.9 3.5
Attributes:
long_name: Actual EvapoTranspiration and Interception
overview: NONE
temporal_resolution: Dekad
units: mm/day
scale_factor: 0.1
_FillValue: -9999
add_offset: 0.0To limit the evapotranspiration data to cropland areas, we must reproject it.
[15]:
# Reproject data
aetid_xr_reprojected = aetid_xr.odc.reproject(how=ds.odc.geobox, resampling="average")
#Set nodata to `NaN`
aetid_xr_reprojected = mask_invalid_data(aetid_xr_reprojected)
aetid_xr_crop = aetid_xr_reprojected.where(cm == 1).where(aetid_xr_reprojected['L3-AETI-D'] != -9999)
The plot below shows evapotranspiration in mm/day for the first timestep in the dekadal timeseries.
[16]:
aetid_xr_crop['L3-AETI-D'].isel(time=0).plot()
[16]:
<matplotlib.collections.QuadMesh at 0x7f5fe9d8d550>
Finally, the timeseries of evapotranspiration for cropland areas is plotted.
[17]:
aetid_xr_crop['L3-AETI-D'].mean(["x", "y"]).plot.line("b-^", figsize=(11, 4))
plt.title("Zonal mean of ET");
Load and plot dekadal Net Primary Productivity
Dekadal biomass production (net primary productivity) is loaded below, then reprojected and crpped to the cropland area as for evapotranspiration.
[18]:
variable = 'L3-NPP-D'
nppd = wapor_map(region, variable, period, folder, extension = '.nc')
nppd_xr = load_wapor_ds(filename=nppd, variable=variable)
# Reproject data
nppd_xr_reprojected = xr_reproject(src=nppd_xr,
geobox=ds.geobox,
resampling="average")
#Set nodata to `NaN`
nppd_xr_reprojected = mask_invalid_data(nppd_xr_reprojected)
nppd_xr_crop = nppd_xr_reprojected.where(cm == 1)
nppd_xr_crop['L3-NPP-D'].mean(["x", "y"]).plot.line("g-^", figsize=(11, 4))
plt.title("Zonal mean of NPP");
WARNING: `region` intersects with multiple L3 regions (['ENO', 'ZAN']), continuing with ENO only.
INFO: Found 23 files for L3-NPP-D.
INFO: Converting from `.tif` to `.nc`.
Combine variables into one plot
Finally, EVI, ET, and NPP are plotted together to inspect the progression of the crop season. This can be used to draw inferences about the crop season. For example, we can see a dip in NPP & ET following February 2023, associated with a slowing in the rate of EVI growth. This suggests crop growth activity slowed. It’s not clear what caused this. It could be a lack of irrigation water, a nutrient issue, or thermal stress. This analysis is therefore exploratory and informs further analyses that could be undertaken.
[19]:
fig, axs = plt.subplots(3, sharex=True, figsize=(14,8), gridspec_kw={'hspace': 0.5})
veg_smooth_1D_slice = veg_smooth_1D.sel(time=slice(*period))
axs[0].set_title('EVI')
axs[0].set_ylabel('EVI')
axs[0].plot(veg_smooth_1D_slice.time, veg_smooth_1D_slice, color='green')
axs[1].set_title('NPP')
axs[1].set_ylabel('gC/m²/day')
axs[1].plot(nppd_xr_crop['L3-NPP-D'].mean(["x", "y"]).time, nppd_xr_crop['L3-NPP-D'].mean(["x", "y"]), color='black')
axs[2].set_title('Actual Evapotranspiration and Interception')
axs[2].set_ylabel('mm')
axs[2].plot(aetid_xr_crop['L3-AETI-D'].mean(["x", "y"]).time, aetid_xr_crop['L3-AETI-D'].mean(["x", "y"]), color='blue')
axs[2].xaxis_date()
axs[2].xaxis.set_major_formatter(mdates.DateFormatter("%b-%Y"))
Conclusion
This notebook demonstrated using WaPOR through a crop season and integrating it with phenology tools to understand seasonal crop patterns. The seasonal analysis techniques and visualisations shown in this notebook inform further analyses that could be undertaken.
Additional information
License: The code in this notebook is licensed under the Apache License, Version 2.0. Digital Earth Africa data is licensed under the Creative Commons by Attribution 4.0 license.
Contact: If you need assistance, please post a question on the Open Data Cube Slack channel or on the GIS Stack Exchange using the open-data-cube tag (you can view previously asked questions here). If you would like to report an issue with this notebook, you can file one on
Github.
Compatible datacube version:
[20]:
print(datacube.__version__)
1.8.20
Last Tested:
[21]:
from datetime import datetime
datetime.today().strftime('%Y-%m-%d')
[21]:
'2025-02-17'