
worldfoodecosystems2023

Welcome to the last practical.

Congratulations, you have come a very long way…

In the previous two sessions, we focused on command-line-based data retrieval and analysis, and tried to move away from a point-and-click approach.

You will probably have noticed that the level has gone up, with more and more pieces of code left open for you to fill in.

Make sure you have worked your way through the first practical as well as the fifth practical, as this last practical builds on both.

Step 1: The case-study

The case we are investigating today is the salinization of freshwater lakes. Recently, a global dataset of surface water salinity, with measurements between 1980 and 2019, was published here. The paper reports on the dataset and how it was established. In this practical we will analyze the dataset to answer the following questions:

Has salinity, as measured by electrical conductivity (EC), increased or decreased in global freshwater lakes?
Is salinity of the water linked to rainfall deficits?
Are increases/decreases in salinity different across different biomes?
Which local drivers can influence salinity trends?

Problem simplification

As with any problem, we first need to simplify and define this one:

Building block	Decision
Geographic scale	Points: measurement points of salinity in the database. Each separate point is considered to be a location (regardless of whether two points are taken in the same water body)
Temporal scale	We will compare averages over 1980-1990 with averages over 2005-2015: only stations with more than 5 years of measurements in both epochs are considered
Assumption	We assume that water deficits (low precipitation combined with high evaporation; for example here) are linked to higher salinity
Dimensions	We focus on (i) a quantified rainfall deficit and (ii) the biome map
Dimension description	The TerraClimate dataset (Climate water deficit band) and the OpenLand Biome map

Data description

Now that we have described the problem, we can describe the data we’ll use.

Dataset	Type	Source	Access point
EC sampling points	Vector: points	Thorslund et al. 2020	here
TerraClimate, water deficit	Raster	Derived from the TerraClimate Collection	Google Earth Engine Catalogue
Biome map	Raster	OpenLand potential Biomes	Google Earth Engine Catalogue

The datasets by Thorslund et al. are very large and need to be pre-processed, because we want to compare data from 1980-1990 with data from 2005-2015, and only for those stations for which an average can be reliably calculated in both periods.

To make this exercise feasible, this pre-processing has already been done: the aggregated CSV file, its conversion into a shapefile, and the original file can be found here.
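To give an idea of what such a pre-processing step involves, here is a minimal sketch in plain Python. It is not the script that produced the provided files. The record layout (station, year, EC value), the function names, and the reading of ">5y" as "more than 5 distinct calendar years with at least one measurement" are all assumptions made for illustration, as is the choice to average all measurements in an epoch rather than annual means.

```python
from collections import defaultdict
from statistics import mean

# Epochs from the problem definition (both bounds inclusive).
EPOCHS = {"early": (1980, 1990), "late": (2005, 2015)}
# Assumption: a station needs MORE than this many distinct years per epoch.
MIN_YEARS = 5


def epoch_of(year):
    """Return the epoch name a year falls in, or None if it is in neither."""
    for name, (start, end) in EPOCHS.items():
        if start <= year <= end:
            return name
    return None


def epoch_means(records):
    """Average EC per station and epoch.

    records: iterable of (station_id, year, ec) tuples (hypothetical layout).
    Only stations with enough distinct years in BOTH epochs are returned.
    """
    values = defaultdict(lambda: defaultdict(list))  # station -> epoch -> EC values
    years = defaultdict(lambda: defaultdict(set))    # station -> epoch -> years seen
    for station, year, ec in records:
        epoch = epoch_of(year)
        if epoch is None:
            continue  # measurement outside both epochs
        values[station][epoch].append(ec)
        years[station][epoch].add(year)

    result = {}
    for station in values:
        if all(len(years[station][e]) > MIN_YEARS for e in EPOCHS):
            early = mean(values[station]["early"])
            late = mean(values[station]["late"])
            result[station] = {"early": early, "late": late, "change": late - early}
    return result


if __name__ == "__main__":
    # Made-up demo data: station A has 6 years in each epoch, station B too few.
    demo = [("A", y, 100) for y in range(1980, 1986)]
    demo += [("A", y, 150) for y in range(2005, 2011)]
    demo += [("B", y, 300) for y in range(1980, 1983)]
    demo += [("B", y, 320) for y in range(2005, 2011)]
    print(epoch_means(demo))
```

In the demo, station B is dropped because it has only three years of data in the first epoch, so only station A ends up with an early mean, a late mean, and the change between them.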

Now we are ready for the next step: we’ll first explore the original and the pre-processed data in R.