Visualizing spatial point data

California has been actively monitoring groundwater levels for many years. In this example, we will use groundwater level measurements last fall (Fall 2021) in Central Valley, and do spatial interpolation using spatial statistics/geostatistics.

Data source: California natural resources agency

Let's load the point-wise groundwater elevation measurements:

In [ ]:
# load the necessary module
%matplotlib inline
%config InlineBackend.figure_format = 'retina'
import numpy as np
import matplotlib.pyplot as plt
import matplotlib
import pandas as pd
import warnings
import seaborn as sns
import plotly.express as px
from shapely.geometry import Point
from shapely.geometry.polygon import Polygon

warnings.filterwarnings("ignore") # Please uncomment this line and see your warnings if you use this code further for your research
In [ ]:
# Clone data from our GitHub repository
! git clone https://github.com/lijingwang/IntroSpatialData_SDSI.git
fatal: destination path 'IntroSpatialData_SDSI' already exists and is not an empty directory.
In [ ]:
# Load data
## California groundwater elevation during 2021 fall (Sep - Nov)
groundwater = pd.read_csv('/content/IntroSpatialData_SDSI/Data/california_groundwater_level_2021_fall.csv')
## Central Valley polygon
central_valley_polygon = pd.read_csv('/content/IntroSpatialData_SDSI/Data/CV_bounds.csv')

Groundwater level in California, Sep - Nov 2021

In [ ]:
fig, ax = plt.subplots(figsize=[7,10])
matplotlib.rcParams.update({'font.size': 15})

plt.scatter(groundwater['LATITUDE'],groundwater['LONGITUDE'],c = groundwater['GWE'],s = 5,cmap = 'plasma',zorder = 1000)
plt.colorbar()

plt.xlabel('Latitude')
plt.ylabel('Longitude')

plt.title('Groundwater level (feet)')
Out[ ]:
Text(0.5, 1.0, 'Groundwater level (feet)')

Let's zoom into the groundwater level data within the central valley:

In [ ]:
# Within central valley polygon
central_valley = Polygon(central_valley_polygon.values)
point_sets = [Point((groundwater['LONGITUDE'][i],groundwater['LATITUDE'][i])) for i in range(len(groundwater))]
in_valley_or_not = [central_valley.contains(point) for point in point_sets]
groundwater = groundwater[in_valley_or_not]
In [ ]:
fig, ax = plt.subplots(figsize=[7,10])
matplotlib.rcParams.update({'font.size': 15})
plt.scatter(groundwater['LATITUDE'],groundwater['LONGITUDE'],c = groundwater['GWE'],s = 5,cmap = 'plasma',zorder = 1000)
plt.colorbar()
plt.xlabel('Latitude')
plt.ylabel('Longitude')
plt.title('Groundwater level (feet)')
Out[ ]:
Text(0.5, 1.0, 'Groundwater level (feet)')

We do see the spatial correlation of those point measurements, where the nearby points share similar values. In the next section, we will understand this spatial correlation quantitatively.

Spatial correlation: variogram analysis

Empirical variogram

In [ ]:
! pip install geostatspy
! pip install pyproj
Collecting geostatspy
  Downloading geostatspy-0.0.22-py3-none-any.whl (57 kB)
     |████████████████████████████████| 57 kB 2.1 MB/s 
Installing collected packages: geostatspy
Successfully installed geostatspy-0.0.22
Requirement already satisfied: pyproj in /usr/local/lib/python3.7/dist-packages (3.2.1)
Requirement already satisfied: certifi in /usr/local/lib/python3.7/dist-packages (from pyproj) (2021.10.8)
In [ ]:
import geostatspy.geostats as geostats
from pyproj import Proj

# Project data from lat, lon to utm coordinates
p = Proj(proj='utm',zone=10,ellps='WGS84', preserve_units=False)

# UTM polygon
cv_x,cv_y = p(central_valley_polygon.values[:,0],central_valley_polygon.values[:,1])
cv_x = cv_x/1000
cv_y = cv_y/1000
central_valley_xy = Polygon(np.array([cv_x,cv_y]).T)

# UTM groundwater elevation
## UTM (meters)
groundwater['x'],groundwater['y'] = p(groundwater['LONGITUDE'].values, groundwater['LATITUDE'].values)

## UTM (km)
groundwater['x'] = groundwater['x']/1000.
groundwater['y'] = groundwater['y']/1000.

Start from any point data and make it as a center of a circle, groundwater levels are quite different if we move outside circle with radius ~80km. We call this 80km as the correlation length, or in geostatistics, the range.

In [ ]:
import matplotlib.patches as patches
fig, ax = plt.subplots(figsize=[10,10])
matplotlib.rcParams.update({'font.size': 15})

patch = patches.Polygon(np.array([cv_x,cv_y]).T,fc = 'gray')
ax.add_patch(patch)
plt.scatter(groundwater['x'],groundwater['y'],c = groundwater['GWE'],s = 5,cmap = 'plasma',zorder = 1000)
plt.colorbar()
plt.scatter(610, 4300,s = 100,zorder = 10000,label = 'center')
circle = plt.Circle((610, 4300), 80, color='b', fill=False,label = 'circle')
ax.add_patch(circle)
plt.gca().set(aspect='equal')
plt.xlabel('UTM x (km)')
plt.ylabel('UTM y (km)')
plt.legend()
plt.title('Groundwater level (feet)')
Out[ ]:
Text(0.5, 1.0, 'Groundwater level (feet)')