Extreme event detection in gridded data

Overview

Because of human-induced climate change, we anticipate that extreme events would occur more frequently and that they would become greater in intensity. Here I address this question by using some gridded SST data, which is the only way that we can assess if this trend is unfolding across large ocean regions. Using the gridded 0.25 degree Reynolds OISST, I will find extreme SST events around South Africa data by applying the detect() function pixel-by-pixel. After detecting the events, I will fit a generalised linear model (GLM) to calculate rates of change in some marine heat wave metrics, and then plot the estimated trend.

Reading in the netCDF

First, I read in the 0.25 degree OISST netCDF file. I save the only variable, sst, as a multidimensional array (longitude, latitude and date). To do this, I use the ncdf4 package. I then construct a list with names and values for each dimension of the SST array. The example data used here cover 93 longitude and 43 latitude steps (i.e. 3999 pixels), and cover 12797 days (1981 to 2016). The names for the three dimensions are simply a list of longitudes, latitudes and dates, which I get directly from the netCDF file. Note that not all netCDF files are created equally, and sometimes vectors of longitudes, latitudes and dates need to be created based on the start values for lat/lon/date in the netCDF, and the ‘steps’ between sequential lats/lons/dates. To determine out how to approach reading in the netCDF file, use the print() function to scrutinise the content of the netCDF.

nc <- nc_open("/Users/ajsmit/spatial/OISSTv2/daily/OISST-V2-AVHRR_agg.nc")
# print(nc)
sst <- ncvar_get(nc, varid = "sst") # reads in the only var in netCDF file
dimnames(sst) <- list(lon = nc$dim$lon$vals,
                      lat = nc$dim$lat$vals, 
                      t = nc$dim$time$vals)

Now I take this array of SST data and convert it into a a data frame using reshape2’s melt() function. This creates columns named lon, lat and t (the latter is the time vector, with the mandatory name t, which is required by RmarineHeatWave’s make_whole() and detect() functions). Now each SST value is indexed by a unique combination of lon, lat and t. Notice too how I use the tibble package’s as_tibble() function together with melt(), because I like the way that it neatly creates a very readbale data frame.

sst <- as_tibble(melt(sst, value.name = "temp"))
sst$t <- as.Date(sst$t, origin = '1978-01-01')

Since the data frame of SSTs contains some NAs where seawater temperatures are not available due to the presence of land, I remove all cases with NAs.

sst1 <- sst %>%
  na.omit()
rm(sst); rm(nc)

Event detection

Now I make a wrapper function to combine make_whole() and detect(). This increases the ease of use of this function with plyr’s ddply() function, but most importantly, it pulls out only the event portion of the list of results (i.e. ignoring clim).

OISST_detect <- function(dat) {
  dat <- dat[,3:4]
  start = "1983-01-01"
  end = "2012-12-31"
  whole <- make_whole(dat)
  mhw <- detect(whole, climatology_start = start, climatology_end = end,
                min_duration = 5, max_gap = 2, cold_spells = FALSE)
  events <- mhw$event
  return(events)
  }

And now I apply the OISST_detect() function to the SST data and save the results as an intermediate .Rdata file.

# it takes a long time...
system.time(OISST_events <- ddply(sst1, .(lon, lat), OISST_detect, .parallel = TRUE))
 #   user  system elapsed 
 # 1862.3   205.9   704.8 
save(OISST_events, file = "/Users/ajsmit/spatial/OISSTv2/MHWs/OISST_events.Rdata")

Because I saved the results to a file I can load these rather than rerunning the OISST_detect() function, which takes about 11 minutes to execute on my 4 GHz iMac.

# in case it was calculated and saved earlier, we can load it here now:
load("/Users/ajsmit/spatial/OISSTv2/MHWs/OISST_events.Rdata")

Trend detection

# summarise the number of unique longitude, latitude and year combination:
event_freq <- OISST_events %>% 
  mutate(year = year(date_start)) %>% 
  group_by(lon, lat, year) %>% 
  summarise(n = n())
head(event_freq)

# create complete grid for merging with:
sst.grid <- sst1 %>% 
  select(lon, lat, t) %>% 
  mutate(t = lubridate::year(t)) %>% 
  unique() # slow...
colnames(sst.grid)[colnames(sst.grid) == 't'] <- 'year'

# and merge:
OISST_n <- left_join(sst.grid, event_freq)
OISST_n[is.na(OISST_n)] <- 0
lin_fun <- function(ev) {
  mod1 <- glm(n ~ year, family = poisson(link = "log"), data = ev)
  # mod1 <- lm(n ~ year, data = ev)
  tr <- summary(mod1)$coefficients[2,c(1,4)] # extract slope coefficient and its p-value
  return(tr)
}
OISST_nTrend <- ddply(OISST_n, .(lon, lat), lin_fun)
OISST_nTrend$pval <- cut(OISST_nTrend[,4], breaks = c(0, 0.001, 0.01, 0.05, 1))
head(OISST_nTrend)
## Coastline of African continent and some borders:
load("/Users/ajsmit/Dropbox/repos/Trend_Analysis/graph/africa_coast.RData")
load("/Users/ajsmit/Dropbox/repos/Trend_Analysis/graph/africa_borders.Rdata")
load("/Users/ajsmit/Dropbox/repos/Trend_Analysis/graph/south_africa_coast.RData")
load("/Users/ajsmit/Dropbox/repos/Trend_Analysis/graph/sa_provinces_new.RData")
ggplot(OISST_nTrend, aes(x = lon, y = lat)) +
  geom_rect(size = 0.2, fill = NA,
       aes(xmin = lon - 0.1, xmax = lon + 0.1, ymin = lat - 0.1, ymax = lat + 0.1,
           colour = OISST_nTrend[,5])) +
  geom_raster(aes(fill = Estimate), interpolate = FALSE, alpha = 0.9) +
  scale_fill_gradient2(name = "Count\n per\n year", high = "red", mid = "white",
                       low = "darkblue", midpoint = 0) +
  scale_colour_manual(breaks = c("(0,0.001]", "(0.001,0.01]", "(0.01,0.05]", "(0.05,1]"),
                      values = c("firebrick1", "firebrick2", "firebrick3", "white"),
                      name = "p-value", guide = FALSE) +
  geom_polygon(data = south_africa_coast, aes(x = lon, y = lat, group = group), 
               colour = NA, fill = "grey80") +
  geom_path(data = africa_borders, aes(x = lon, y = lat, group = group), 
            size = 0.3, colour = "black") +
  geom_polygon(data = south_africa_coast, aes(x = lon, y = lat, group = group),
               size = 0.5, colour = "black", fill = NA) +
  coord_fixed(ratio = 1, xlim = c(12.12, 35.12), ylim = c(-26.88, -37.38), expand = TRUE) +
  theme_bw()
ggsave(file = "README-grid-example1.png", width = 6, height = 3.2)

ggplot(OISST_nTrend, aes(lon, lat)) +
  geom_raster(aes(fill = pval), interpolate = FALSE) +
  scale_fill_manual(breaks = c("(0,0.001]", "(0.001,0.01]", "(0.01,0.05]",
                               "(0.05,0.1]", "(0.1,0.5]", "(0.5,1]"),
                    values = c("black", "grey20", "grey40",
                               "grey80", "grey90", "white"),
                    name = "p-value") +
  geom_polygon(data = south_africa_coast, aes(x = lon, y = lat, group = group), 
               colour = NA, fill = "grey80") +
  geom_path(data = africa_borders, aes(x = lon, y = lat, group = group), 
            size = 0.3, colour = "black") +
  geom_polygon(data = south_africa_coast, aes(x = lon, y = lat, group = group), 
               size = 0.5, colour = "black", fill = NA) +
  coord_fixed(ratio = 1, xlim = c(12.12, 35.12), ylim = c(-26.88, -37.38), expand = TRUE) +
  theme_bw()
ggsave(file = "README-grid-example2.png", width = 6, height = 3.2)