California has been actively monitoring groundwater levels for many years. In this example, we will use groundwater level measurements last fall (Fall 2021) in Central Valley, and do spatial interpolation using spatial statistics/geostatistics.
Data source: California natural resources agency
Let's load the point-wise groundwater elevation measurements:
# load the necessary module
%matplotlib inline
%config InlineBackend.figure_format = 'retina'
import numpy as np
import matplotlib.pyplot as plt
import matplotlib
import pandas as pd
import warnings
import seaborn as sns
import plotly.express as px
from shapely.geometry import Point
from shapely.geometry.polygon import Polygon
warnings.filterwarnings("ignore") # Please uncomment this line and see your warnings if you use this code further for your research
# Clone data from our GitHub repository
! git clone https://github.com/lijingwang/IntroSpatialData_SDSI.git
# Load data
## California groundwater elevation during 2021 fall (Sep - Nov)
groundwater = pd.read_csv('/content/IntroSpatialData_SDSI/Data/california_groundwater_level_2021_fall.csv')
## Central Valley polygon
central_valley_polygon = pd.read_csv('/content/IntroSpatialData_SDSI/Data/CV_bounds.csv')
Groundwater level in California, Sep - Nov 2021
fig, ax = plt.subplots(figsize=[7,10])
matplotlib.rcParams.update({'font.size': 15})
plt.scatter(groundwater['LATITUDE'],groundwater['LONGITUDE'],c = groundwater['GWE'],s = 5,cmap = 'plasma',zorder = 1000)
plt.colorbar()
plt.xlabel('Latitude')
plt.ylabel('Longitude')
plt.title('Groundwater level (feet)')
Let's zoom into the groundwater level data within the central valley:
# Within central valley polygon
central_valley = Polygon(central_valley_polygon.values)
point_sets = [Point((groundwater['LONGITUDE'][i],groundwater['LATITUDE'][i])) for i in range(len(groundwater))]
in_valley_or_not = [central_valley.contains(point) for point in point_sets]
groundwater = groundwater[in_valley_or_not]
fig, ax = plt.subplots(figsize=[7,10])
matplotlib.rcParams.update({'font.size': 15})
plt.scatter(groundwater['LATITUDE'],groundwater['LONGITUDE'],c = groundwater['GWE'],s = 5,cmap = 'plasma',zorder = 1000)
plt.colorbar()
plt.xlabel('Latitude')
plt.ylabel('Longitude')
plt.title('Groundwater level (feet)')
We do see the spatial correlation of those point measurements, where the nearby points share similar values. In the next section, we will understand this spatial correlation quantitatively.
! pip install geostatspy
! pip install pyproj
import geostatspy.geostats as geostats
from pyproj import Proj
# Project data from lat, lon to utm coordinates
p = Proj(proj='utm',zone=10,ellps='WGS84', preserve_units=False)
# UTM polygon
cv_x,cv_y = p(central_valley_polygon.values[:,0],central_valley_polygon.values[:,1])
cv_x = cv_x/1000
cv_y = cv_y/1000
central_valley_xy = Polygon(np.array([cv_x,cv_y]).T)
# UTM groundwater elevation
## UTM (meters)
groundwater['x'],groundwater['y'] = p(groundwater['LONGITUDE'].values, groundwater['LATITUDE'].values)
## UTM (km)
groundwater['x'] = groundwater['x']/1000.
groundwater['y'] = groundwater['y']/1000.
Start from any point data and make it as a center of a circle, groundwater levels are quite different if we move outside circle with radius ~80km. We call this 80km as the correlation length, or in geostatistics, the range.
import matplotlib.patches as patches
fig, ax = plt.subplots(figsize=[10,10])
matplotlib.rcParams.update({'font.size': 15})
patch = patches.Polygon(np.array([cv_x,cv_y]).T,fc = 'gray')
ax.add_patch(patch)
plt.scatter(groundwater['x'],groundwater['y'],c = groundwater['GWE'],s = 5,cmap = 'plasma',zorder = 1000)
plt.colorbar()
plt.scatter(610, 4300,s = 100,zorder = 10000,label = 'center')
circle = plt.Circle((610, 4300), 80, color='b', fill=False,label = 'circle')
ax.add_patch(circle)
plt.gca().set(aspect='equal')
plt.xlabel('UTM x (km)')
plt.ylabel('UTM y (km)')
plt.legend()
plt.title('Groundwater level (feet)')