library("INLA")
library("tidyverse")
library("spatstat")
library("raster")
library("sf")
library("rgeos")
library("MASS")
library("spdep")
library("patchwork")
INLA Tutorial
In this tutorial, we will use the following R
packages.
To install INLA, you can follow the instructions here. Alternatively, run

install.packages("INLA", repos = c(getOption("repos"), INLA = "https://inla.r-inla-download.org/R/stable"), dep = TRUE)
Throughout the tutorial, we will use the following common palette and theme for the plots with ggplot2.

pal <- c("#00008FFF", "#0000F2FF", "#0063FFFF", "#00D4FFFF", "#46FFB8FF", "#B8FF46FF", "#FFD400FF", "#FF6300FF", "#F00000FF", "#800000FF")

custom_theme <- theme_bw() +
  theme(legend.position = "right",
        text = element_text(size = 14),
        plot.title = element_text(size = 16),
        legend.title = element_text(size = 12))
INLA
In a nutshell, the integrated nested Laplace approximation (INLA) (Rue et al., 2009) method is used to approximate Bayesian inference in latent Gaussian models. In particular, it can be used to fit models of the form \begin{align*} y_i|S(x_i), \theta &\sim \pi(y_i|S(x_i), \theta), \text{ for } i \in \{1, \cdots, n\} \\ S(x)|\theta &\sim \text{Normal}(\mu(\theta), Q(\theta)^{-1}) \\ \theta &\sim \pi(\theta), \end{align*} where y = (y_1, \ldots, y_n) is the vector of observed values, S(x) is a latent Gaussian random field evaluated at the locations x = (x_1, \ldots, x_n), and \theta = (\theta_1, \ldots, \theta_k), for some k \in \mathbb{N}, is a vector of hyperparameters. \mu(\theta) and Q(\theta) represent the mean vector and the precision matrix, respectively.
The observed values y_i, \forall i, are assumed to have mean \mu_i = g^{-1}(\eta_i), such that the linear predictor \eta_i can be expressed as follows \begin{align*} \eta_i = \alpha + \sum^{n_{\beta}}_{k = 1} \beta_k z_{ki} + \sum_{j = 1}^{n_f}f^{(j)}(u_{ij}) + \varepsilon_i, \forall i, \end{align*} where \alpha is the intercept, \{\beta_k\}_k are the coefficients of the covariates \{z_{ki}\}_{ki}, \{f^{(j)}\}_j define some random effects in terms of the covariates \{u_{ij}\}_{ij}, and \varepsilon_i is an error term.
In this formulation, \{f^{(j)}\}_j will commonly be defined as spatial (or spatio-temporal) structures, at least for the purposes of this tutorial.
Multiple linear regression
Let us start with a multiple linear regression problem, so that we can get to know the INLA notation and how to extract the desired quantities. This example was extracted from Bayesian Inference with INLA (Gómez-Rubio, 2021).

We will analyze the cement data set from the MASS package. x1, x2, x3, and x4 represent the percentages of the four main chemical ingredients in the setting of 13 cement samples, and y measures the heat evolved.
summary(cement)
x1 x2 x3 x4 y
Min. : 1.000 Min. :26.00 Min. : 4.00 Min. : 6 Min. : 72.50
1st Qu.: 2.000 1st Qu.:31.00 1st Qu.: 8.00 1st Qu.:20 1st Qu.: 83.80
Median : 7.000 Median :52.00 Median : 9.00 Median :26 Median : 95.90
Mean : 7.462 Mean :48.15 Mean :11.77 Mean :30 Mean : 95.42
3rd Qu.:11.000 3rd Qu.:56.00 3rd Qu.:17.00 3rd Qu.:44 3rd Qu.:109.20
Max. :21.000 Max. :71.00 Max. :23.00 Max. :60 Max. :115.90
Let us add another data point for prediction.
cement <- rbind(cement, data.frame(x1 = 24, x2 = 24, x3 = 24, x4 = 24, y = NA))
And let us fit the following model \begin{align*} y_i = \alpha + \sum_{j = 1}^{4} \beta_j x_{ji} + \varepsilon_i, \forall i, \end{align*} where \varepsilon_i\overset{\text{i.i.d.}}{\sim}\text{Normal}(0, 1/\tau).
formula_1 <- y ~ 0 + 1 + x1 + x2 + x3 + x4

model_1_1 <- inla(formula = formula_1,
                  family = "gaussian",
                  data = cement,
                  control.predictor = list(compute = TRUE)) # Should the marginals for the linear predictor be computed?
summary(model_1_1)
Call:
c("inla.core(formula = formula, family = family, contrasts = contrasts,
", " data = data, quantiles = quantiles, E = E, offset = offset, ", "
scale = scale, weights = weights, Ntrials = Ntrials, strata = strata,
", " lp.scale = lp.scale, link.covariates = link.covariates, verbose =
verbose, ", " lincomb = lincomb, selection = selection, control.compute
= control.compute, ", " control.predictor = control.predictor,
control.family = control.family, ", " control.inla = control.inla,
control.fixed = control.fixed, ", " control.mode = control.mode,
control.expert = control.expert, ", " control.hazard = control.hazard,
control.lincomb = control.lincomb, ", " control.update =
control.update, control.lp.scale = control.lp.scale, ", "
control.pardiso = control.pardiso, only.hyperparam = only.hyperparam,
", " inla.call = inla.call, inla.arg = inla.arg, num.threads =
num.threads, ", " keep = keep, working.directory = working.directory,
silent = silent, ", " inla.mode = inla.mode, safe = FALSE, debug =
debug, .parent.frame = .parent.frame)" )
Time used:
Pre = 1.98, Running = 0.553, Post = 0.0266, Total = 2.56
Fixed effects:
mean sd 0.025quant 0.5quant 0.975quant mode kld
(Intercept) 62.503 68.255 -73.838 62.492 198.907 62.494 0
x1 1.550 0.725 0.100 1.550 2.999 1.550 0
x2 0.509 0.705 -0.900 0.509 1.918 0.509 0
x3 0.101 0.735 -1.368 0.101 1.569 0.101 0
x4 -0.145 0.691 -1.525 -0.145 1.235 -0.145 0
Model hyperparameters:
mean sd 0.025quant 0.5quant
Precision for the Gaussian observations 0.209 0.093 0.069 0.195
0.975quant mode
Precision for the Gaussian observations 0.428 0.168
Marginal log-Likelihood: -59.58
is computed
Posterior summaries for the linear predictor and the fitted values are computed
(Posterior marginals needs also 'control.compute=list(return.marginals.predictor=TRUE)')
# model_1_1$summary.fixed            # Estimated fixed effects
# model_1_1$summary.hyperpar         # Estimated hyperparameters
# model_1_1$summary.linear.predictor # Estimated values for the linear predictor
model_1_1$summary.fitted.values[, 1:5] # Fitted values (+ prediction)
mean sd 0.025quant 0.5quant 0.975quant
fitted.Predictor.01 78.49487 1.7599239 74.97938 78.49473 82.01162
fitted.Predictor.02 72.78996 1.3694565 70.05488 72.78971 75.52689
fitted.Predictor.03 105.97348 1.8005797 102.37782 105.97307 109.57220
fitted.Predictor.04 89.32802 1.2889907 86.75361 89.32781 91.90409
fitted.Predictor.05 95.64936 1.4188557 92.81527 95.64921 98.48472
fitted.Predictor.06 105.27493 0.8360121 103.60514 105.27482 106.94566
fitted.Predictor.07 104.14886 1.4375734 101.27740 104.14870 107.02165
fitted.Predictor.08 75.67461 1.5164313 72.64548 75.67450 78.70480
fitted.Predictor.09 91.72304 1.2866716 89.15342 91.72278 94.29458
fitted.Predictor.10 115.61786 1.9856102 111.65149 115.61772 119.58554
fitted.Predictor.11 81.80842 1.5474777 78.71723 81.80833 84.90056
fitted.Predictor.12 112.32632 1.2164048 109.89641 112.32627 114.75683
fitted.Predictor.13 111.69378 1.3073368 109.08228 111.69371 114.30606
fitted.Predictor.14 110.86481 5.9359151 99.00816 110.86418 122.72683
INLA has numerous functions to process the posterior marginals. For instance, inla.smarginal() is used to obtain a spline smoothing.

alpha <- model_1_1$marginals.fixed[[1]]
ggplot(data.frame(inla.smarginal(alpha)), aes(x, y)) +
geom_line() +
labs(x = "", y = "", title = "Posterior of alpha") +
custom_theme
inla.qmarginal() and inla.pmarginal() compute the quantile and distribution functions, respectively.

qq <- inla.qmarginal(0.05, alpha)
qq

[1] -48.42291
inla.pmarginal(qq, alpha)
[1] 0.05
inla.dmarginal() computes the density at particular values.
inla.dmarginal(0, alpha)
[1] 0.003704535
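Similarly, inla.emarginal() computes expectations of functions of a marginal; as a minimal sketch (using the alpha marginal from above), we can recover the posterior mean and variance.

# Posterior mean and variance of alpha from its marginal
e1 <- inla.emarginal(function(x) x, alpha)
e2 <- inla.emarginal(function(x) x^2, alpha)
c(mean = e1, var = e2 - e1^2)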
inla.tmarginal() transforms the marginal. This is useful, e.g., to obtain the posterior distribution of the variance 1 / \tau.

sigma_2 <- inla.tmarginal(fun = function(x) { 1 / x }, marginal = model_1_1$marginals.hyperpar$`Precision for the Gaussian observations`)
ggplot(data.frame(inla.smarginal(sigma_2))) +
geom_line(aes(x, y)) +
labs(x = "", y = "", title = "Posterior of the variance") +
custom_theme
As a complement, we can also plot the posterior distributions of the fitted values.

post_fitted <- model_1_1$marginals.fitted.values

post_margin <- data.frame(do.call(rbind, post_fitted))
post_margin$cement <- rep(names(post_fitted), times = sapply(post_fitted, nrow))
ggplot(post_margin) +
geom_line(aes(x, y)) +
facet_wrap(~ cement, ncol = 5) +
labs(x = "", y = "Density") +
custom_theme
Generalized linear models (for disease mapping)
Next, let us fit a generalized linear model (GLM) for a problem in disease mapping. Initially, we will ignore the underlying spatial structure, but we will consider it in the following part. This example was extracted from Geospatial Health Data: Modeling and Visualization with R-INLA and Shiny (Moraga, 2019).
We will model the number of cases of lung cancer in Ohio, USA, from 1968 to 1988. The data can be downloaded from here.
After downloading the data, we can do the following
# Load data
d_ohio <- read_csv(file = "data/example_2/data_ohio.csv", show_col_types = FALSE)
m_ohio <- read_sf(dsn = "data/example_2/ohio_shapefile/", layer = "fe_2007_39_county")
head(d_ohio)
# A tibble: 6 × 7
county gender race year y n NAME
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr>
1 1 1 1 1968 6 8912 Adams
2 1 1 1 1969 5 9139 Adams
3 1 1 1 1970 8 9455 Adams
4 1 1 1 1971 5 9876 Adams
5 1 1 1 1972 8 10281 Adams
6 1 1 1 1973 5 10876 Adams
plot(m_ohio$geometry)
For this problem, we would like to model the relative risk of cancer in each region. To do so, we first compute the "standardized incidence ratio" (SIR).
Let Y_i denote the number of cases in the location i, then \begin{align*} \text{SIR}_i = \frac{Y_i}{E_i}, \forall i, \end{align*} where \begin{align*} E_i = \sum_{j = 1}^{m} r_j^{(s)} \cdot n_j^{(i)}, \end{align*} such that r_j^{(s)} represents the rate in stratum j in the standard population, and n_j^{(i)} is the population in stratum j of location i.
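As a quick numeric illustration of these quantities (toy numbers, not from the Ohio data):

# Toy example with m = 2 strata for a single location i
r_s <- c(0.002, 0.010) # Rates per stratum in the standard population
n_i <- c(5000, 1000)   # Population of location i per stratum
E_i <- sum(r_s * n_i)  # Expected count: 10 + 10 = 20
Y_i <- 25              # Observed count
Y_i / E_i              # SIR = 1.25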
We can compute the SIR as follows (expected() was extracted from the SpatialEpi package).
# Extracted from `SpatialEpi`
expected <- function (population, cases, n.strata, ...) {
  n <- length(population) / n.strata
  E <- rep(0, n)
  qNum <- rep(0, n.strata)
  qDenom <- rep(0, n.strata)
  q <- rep(0, n.strata)

  # Compute strata-specific rates
  for (i in 1:n.strata) {
    indices <- rep(i, n) + seq(0, (n - 1)) * n.strata
    qNum[i] <- qNum[i] + sum(cases[indices])
    qDenom[i] <- qDenom[i] + sum(population[indices])
  }
  q <- qNum / qDenom

  # Compute expected counts
  for (i in 1:n) {
    indices <- (1:n.strata) + (i - 1) * n.strata
    E[i] <- sum(population[indices] * q)
  }

  E
}
################
# Process data #
################
# Observed number of cases
d <- d_ohio %>% group_by(NAME, year) %>% summarise(Y = sum(y)) %>% ungroup() %>% rename(county = NAME) %>% arrange(county, year)

# Expected cases
d_ohio <- d_ohio %>% arrange(county, year, gender, race)
n.strata <- 4 # 2 genders x 2 races
E <- expected(population = d_ohio$n, cases = d_ohio$y, n.strata = n.strata)

# Compute SIR
n_years <- length(unique(d_ohio$year))
n_counties <- length(unique(d_ohio$NAME))

counties_E <- rep(unique(d_ohio$NAME), each = n_years)
years_E <- rep(unique(d_ohio$year), times = n_counties)

d_E <- data.frame(county = counties_E, year = years_E, E = E)

d <- d %>% left_join(y = d_E, by = c("county", "year"))
d <- d %>% mutate(SIR = Y / E)

# Link map information
d_wide <- d %>% pivot_wider(names_from = year, values_from = c(Y, E, SIR), names_glue = "{.value}.{year}") # `wide` format
m_ohio <- m_ohio %>% rename(county = NAME) %>% left_join(y = d_wide, by = c("county")) %>% dplyr::select(c(colnames(d_wide), "geometry"))

s_cols <- setdiff(colnames(m_ohio), c("county", "geometry"))

# To `long` format
m_ohio <- m_ohio %>%
  pivot_longer(cols = all_of(s_cols), names_to = c(".value", "year"), names_sep = "\\.") %>%
  dplyr::select(county, year, Y, E, SIR, geometry) %>%
  mutate(year = as.numeric(year)) %>%
  arrange(county, year)
head(m_ohio)
Simple feature collection with 6 features and 5 fields
Geometry type: POLYGON
Dimension: XY
Bounding box: xmin: -83.70532 ymin: 38.59659 xmax: -83.26755 ymax: 39.0552
Geodetic CRS: NAD83
# A tibble: 6 × 6
county year Y E SIR geometry
<chr> <dbl> <dbl> <dbl> <dbl> <POLYGON [°]>
1 Adams 1968 6 8.28 0.725 ((-83.67511 38.99929, -83.67486 38.99965, -83.…
2 Adams 1969 5 8.50 0.588 ((-83.67511 38.99929, -83.67486 38.99965, -83.…
3 Adams 1970 9 8.78 1.03 ((-83.67511 38.99929, -83.67486 38.99965, -83.…
4 Adams 1971 6 9.18 0.654 ((-83.67511 38.99929, -83.67486 38.99965, -83.…
5 Adams 1972 10 9.55 1.05 ((-83.67511 38.99929, -83.67486 38.99965, -83.…
6 Adams 1973 7 10.1 0.693 ((-83.67511 38.99929, -83.67486 38.99965, -83.…
Now, we can plot the computed SIR for all locations and years.
ggplot(m_ohio) +
geom_sf(aes(fill = SIR)) +
facet_wrap(~ year, ncol = 7) +
scale_fill_gradient2(name = "SIR", midpoint = 1, low = "blue", mid = "white", high = "red") +
labs(x = "", y = "") +
  custom_theme +
  theme(axis.text.x = element_blank(),
        axis.text.y = element_blank(),
        axis.ticks = element_blank())
However, if there are locations with small populations or expected counts, the SIR may be unreliable. To overcome this issue, we will model the relative risk as follows
\begin{align*} Y_i &\sim \text{Poisson}(E_i \theta_i), \forall i,\\ \log(\theta_i) &= \alpha + \sum_{j = 1}^{n_f}f^{(j)}(u_{ij}) \end{align*} where the relative risk \theta_i quantifies whether the location i has higher (>1) or lower (<1) risk than the average risk in the standard population. We will experiment with different structures (if any) for the random effects.
To implement the random effects, let us include area and time indices in our data set.
# Create indices for random effects
m_ohio <- m_ohio %>% mutate(id_area = as.numeric(as.factor(county)),
                            id_time = (1 + year - min(year)))
However, we will work with data from 1988 only (we will consider a spatio-temporal structure later).

data <- m_ohio %>% filter(year == 1988)
No random effects
formula_1 <- Y ~ 0 + 1

model_2_1 <- inla(formula = formula_1,
                  family = "poisson",
                  data = data,
                  E = E, # Known component in the mean for the Poisson likelihoods
                  control.predictor = list(compute = TRUE),
                  control.compute = list(cpo = TRUE,
                                         dic = TRUE,
                                         waic = TRUE)) # For model comparison
Notice that we are computing cpo, dic, and waic, so that we can compare models.
summary(model_2_1)
Time used:
Pre = 2.09, Running = 0.524, Post = 0.0224, Total = 2.64
Fixed effects:
mean sd 0.025quant 0.5quant 0.975quant mode kld
(Intercept) 0.279 0.012 0.254 0.279 0.303 0.279 0
Deviance Information Criterion (DIC) ...............: 778.76
Deviance Information Criterion (DIC, saturated) ....: 303.57
Effective number of parameters .....................: 1.00
Watanabe-Akaike information criterion (WAIC) ...: 783.13
Effective number of parameters .................: 5.02
Marginal log-Likelihood: -391.85
CPO, PIT is computed
Posterior summaries for the linear predictor and the fitted values are computed
(Posterior marginals needs also 'control.compute=list(return.marginals.predictor=TRUE)')
For later use, let us create the function table_model_comparison to print a table for model comparison.
table_model_comparison <- function (models, ...) {
  n_models <- length(models)
  df <- as.data.frame(matrix(data = 0, nrow = n_models, ncol = 3))
  rownames(df) <- names(models)
  colnames(df) <- c("CPO", "DIC", "WAIC")
  for (i in 1:n_models) {
    tmp_CPO <- sum(log(models[[i]]$cpo$cpo)) * -1
    tmp_DIC <- models[[i]]$dic$dic
    tmp_WAIC <- models[[i]]$waic$waic

    df[i, ] <- c(tmp_CPO, tmp_DIC, tmp_WAIC)
  }
  df
}
IID model
formula_2 <- Y ~ 0 + 1 + f(id_area, model = "iid")

model_2_2 <- inla(formula = formula_2,
                  family = "poisson",
                  data = data,
                  E = E,
                  control.predictor = list(compute = TRUE),
                  control.compute = list(cpo = TRUE,
                                         dic = TRUE,
                                         waic = TRUE))
summary(model_2_2)
Time used:
Pre = 1.8, Running = 0.521, Post = 0.0273, Total = 2.35
Fixed effects:
mean sd 0.025quant 0.5quant 0.975quant mode kld
(Intercept) 0.198 0.032 0.134 0.198 0.259 0.198 0
Random effects:
Name Model
id_area IID model
Model hyperparameters:
mean sd 0.025quant 0.5quant 0.975quant mode
Precision for id_area 19.94 5.20 11.86 19.23 32.05 17.94
Deviance Information Criterion (DIC) ...............: 626.58
Deviance Information Criterion (DIC, saturated) ....: 151.39
Effective number of parameters .....................: 55.70
Watanabe-Akaike information criterion (WAIC) ...: 624.54
Effective number of parameters .................: 40.09
Marginal log-Likelihood: -346.57
CPO, PIT is computed
Posterior summaries for the linear predictor and the fitted values are computed
(Posterior marginals needs also 'control.compute=list(return.marginals.predictor=TRUE)')
# Model comparison
table_model_comparison(models = list(model_2_1 = model_2_1, model_2_2 = model_2_2))
CPO DIC WAIC
model_2_1 391.6097 778.7631 783.1293
model_2_2 354.8703 626.5815 624.5403
Intrinsic Conditional Auto-Regressive (ICAR) model
Alternatively, we can also account for the spatial dependence that may underlie our observed counts.

To do so, we first have to determine the neighborhood structure of the analyzed region. This can be done with the poly2nb() function from the spdep package.

nb <- poly2nb(data$geometry)
head(nb)
[[1]]
[1] 8 36 66 73
[[2]]
[1] 6 32 33 69 81
[[3]]
[1] 38 39 42 47 52 70 85
[[4]]
[1] 28 43 78
[[5]]
[1] 37 53 58 64 82 84
[[6]]
[1] 2 33 46 54 75 81
# Save matrix for later use within INLA
nb2INLA("data/example_2/map.adj", nb)
g <- inla.read.graph(filename = "data/example_2/map.adj")
Then we can fit an iid + besag model.

Remark: in INLA, we can always use inla.doc() to view the documentation (e.g., parameterization) of models.
# Example: `besag`
inla.doc("besag")
In this case, we are fitting a model of the form
\begin{align*} \log(\theta_i) = \alpha + u_i + v_i, \end{align*} where u_i is a random effect specific to area i that models spatial dependence between the relative risks, and v_i is an unstructured exchangeable component that models uncorrelated noise.
data$id_area_cp <- data$id_area # Copy of the area index (assumed here, since each f() term needs its own index variable)

formula_3 <- Y ~ 0 + 1 + f(id_area, model = "iid") + f(id_area_cp, model = "besag", graph = g, scale.model = TRUE)

scale.model = TRUE scales the model to have an average variance of 1 (Sørbye and Rue, 2014).
model_2_3 <- inla(formula = formula_3,
                  family = "poisson",
                  data = data,
                  E = E,
                  control.predictor = list(compute = TRUE),
                  control.compute = list(cpo = TRUE,
                                         dic = TRUE,
                                         waic = TRUE))
summary(model_2_3)
Time used:
Pre = 1.98, Running = 0.639, Post = 0.0361, Total = 2.66
Fixed effects:
mean sd 0.025quant 0.5quant 0.975quant mode kld
(Intercept) 0.199 0.031 0.136 0.199 0.258 0.199 0
Random effects:
Name Model
id_area IID model
id_area_cp Besags ICAR model
Model hyperparameters:
mean sd 0.025quant 0.5quant 0.975quant mode
Precision for id_area 22.96 8.33e+00 9.90 21.84 42.20 19.75
Precision for id_area_cp 11242.10 1.45e+05 20.52 501.77 64005.04 39.29
Deviance Information Criterion (DIC) ...............: 625.86
Deviance Information Criterion (DIC, saturated) ....: 150.66
Effective number of parameters .....................: 55.11
Watanabe-Akaike information criterion (WAIC) ...: 624.05
Effective number of parameters .................: 39.85
Marginal log-Likelihood: -370.33
CPO, PIT is computed
Posterior summaries for the linear predictor and the fitted values are computed
(Posterior marginals needs also 'control.compute=list(return.marginals.predictor=TRUE)')
# Model comparison
table_model_comparison(models = list(model_2_1 = model_2_1, model_2_2 = model_2_2, model_2_3 = model_2_3))
CPO DIC WAIC
model_2_1 391.6097 778.7631 783.1293
model_2_2 354.8703 626.5815 624.5403
model_2_3 352.9770 625.8572 624.0514
BYM2 model
As an alternative to iid + besag, we can fit a "Besag, York, and Mollié" (BYM2) model. In this case, we reparameterize the structured u = (u_1, \cdots, u_n) and unstructured v = (v_1, \cdots, v_n) random effects as follows \begin{align*} b = u + v = \frac{1}{\sqrt{\tau_b}}(\sqrt{1 - \phi}v^{\star} + \sqrt{\phi}u^{\star}), \end{align*} where \tau_b > 0 is the precision, 0 \leq \phi \leq 1 is a mixing parameter, v^{\star} \sim \text{Normal}(0, I_n), and u^{\star} is a scaled ICAR model.
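To make the reparameterization concrete, here is a small simulation sketch (not part of the original analysis; u_star below is white noise standing in for a properly scaled ICAR sample):

# Sketch of the BYM2 construction of b for fixed tau_b and phi
n      <- 88       # Number of areas (Ohio has 88 counties)
tau_b  <- 20       # Marginal precision
phi    <- 0.3      # Mixing parameter
v_star <- rnorm(n) # Unstructured component, Normal(0, I_n)
u_star <- rnorm(n) # Placeholder for the scaled ICAR component
b <- (sqrt(1 - phi) * v_star + sqrt(phi) * u_star) / sqrt(tau_b)
var(b) # Should be close to 1 / tau_b = 0.05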
formula_4 <- Y ~ 0 + 1 + f(id_area, model = "bym2", graph = g)

model_2_4 <- inla(formula = formula_4,
                  family = "poisson",
                  data = data,
                  E = E,
                  control.predictor = list(compute = TRUE),
                  control.compute = list(cpo = TRUE,
                                         dic = TRUE,
                                         waic = TRUE))
summary(model_2_4)
Time used:
Pre = 18.1, Running = 0.608, Post = 0.0354, Total = 18.8
Fixed effects:
mean sd 0.025quant 0.5quant 0.975quant mode kld
(Intercept) 0.2 0.029 0.141 0.2 0.255 0.2 0
Random effects:
Name Model
id_area BYM2 model
Model hyperparameters:
mean sd 0.025quant 0.5quant 0.975quant mode
Precision for id_area 19.029 4.779 11.389 18.440 30.08 17.302
Phi for id_area 0.301 0.159 0.064 0.276 0.66 0.201
Deviance Information Criterion (DIC) ...............: 623.35
Deviance Information Criterion (DIC, saturated) ....: 148.16
Effective number of parameters .....................: 55.15
Watanabe-Akaike information criterion (WAIC) ...: 619.32
Effective number of parameters .................: 38.26
Marginal log-Likelihood: -282.48
CPO, PIT is computed
Posterior summaries for the linear predictor and the fitted values are computed
(Posterior marginals needs also 'control.compute=list(return.marginals.predictor=TRUE)')
# Model comparison
table_model_comparison(models = list(model_2_1 = model_2_1, model_2_2 = model_2_2, model_2_3 = model_2_3, model_2_4 = model_2_4))
CPO DIC WAIC
model_2_1 391.6097 778.7631 783.1293
model_2_2 354.8703 626.5815 624.5403
model_2_3 352.9770 625.8572 624.0514
model_2_4 348.5249 623.3496 619.3207
Penalized Complexity (PC) priors
As before, we can also set different priors before fitting our model. In particular, we can set Penalized Complexity (PC) priors for our bym2 model.
As in Gómez-Rubio (2021),
PC priors are designed following some principles for inference. First of all, the prior favors the base model unless evidence is provided against it, following the principle of parsimony. Distance from the base model is measured using the Kullback-Leibler distance, and penalization from the base model is done at a constant rate on the distance. Finally, the PC prior is defined using probability statements on the model parameters in the appropriate scale.
To define the prior for the marginal precision \tau_b we use the probability statement \mathbb{P}((1/\sqrt{\tau_b}) > U) = \alpha. A prior for \phi is defined using \mathbb{P}(\phi < U) = \alpha.
pc_prior <- list(
  prec = list(
    prior = "pc.prec",
    param = c(1, 0.01)), # P(1 / sqrt(prec) > 1) = 0.01
  phi = list(
    prior = "pc",
    param = c(0.5, 0.75)) # P(phi < 0.5) = 0.75
)
From above, notice that \mathbb{P}(\phi < 0.5) = 0.75 implies that we believe that the unstructured effect accounts for more of the variability than the spatially structured effect.
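As a sanity check on the precision statement, INLA provides samplers for its PC priors (e.g., inla.pc.rprec()), so we can verify the probability empirically; a quick sketch:

# Draw precisions from the PC prior with P(1 / sqrt(prec) > 1) = 0.01
tau_s <- inla.pc.rprec(n = 1e5, u = 1, alpha = 0.01)
mean(1 / sqrt(tau_s) > 1) # Should be approximately 0.01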
formula_5 <- Y ~ 0 + 1 + f(id_area, model = "bym2", graph = g, hyper = pc_prior)

model_2_5 <- inla(formula = formula_5,
                  family = "poisson",
                  data = data,
                  E = E,
                  control.predictor = list(compute = TRUE),
                  control.compute = list(cpo = TRUE,
                                         dic = TRUE,
                                         waic = TRUE))
summary(model_2_5)
Time used:
Pre = 18.8, Running = 0.582, Post = 0.0354, Total = 19.4
Fixed effects:
mean sd 0.025quant 0.5quant 0.975quant mode kld
(Intercept) 0.199 0.029 0.14 0.2 0.256 0.2 0
Random effects:
Name Model
id_area BYM2 model
Model hyperparameters:
mean sd 0.025quant 0.5quant 0.975quant mode
Precision for id_area 19.129 4.784 11.489 18.537 30.202 17.39
Phi for id_area 0.264 0.149 0.052 0.238 0.614 0.16
Deviance Information Criterion (DIC) ...............: 623.44
Deviance Information Criterion (DIC, saturated) ....: 148.24
Effective number of parameters .....................: 55.45
Watanabe-Akaike information criterion (WAIC) ...: 619.13
Effective number of parameters .................: 38.26
Marginal log-Likelihood: -282.44
CPO, PIT is computed
Posterior summaries for the linear predictor and the fitted values are computed
(Posterior marginals needs also 'control.compute=list(return.marginals.predictor=TRUE)')
# Model comparison
table_model_comparison(models = list(model_2_1 = model_2_1, model_2_2 = model_2_2, model_2_3 = model_2_3, model_2_4 = model_2_4, model_2_5 = model_2_5))
CPO DIC WAIC
model_2_1 391.6097 778.7631 783.1293
model_2_2 354.8703 626.5815 624.5403
model_2_3 352.9770 625.8572 624.0514
model_2_4 348.5249 623.3496 619.3207
model_2_5 349.0522 623.4366 619.1260
Fitted values
Although model_2_4 and model_2_5 are slightly better, for the purposes of this tutorial we will plot the fitted values based on model_2_3 (iid + besag), so that we can see the estimated unstructured and structured effects separately.

We can retrieve the fitted values by analyzing model_2_3$summary.fitted.values.
fitted_values <- model_2_3$summary.fitted.values[, c("mean", "0.025quant", "0.975quant")] %>%
  as_tibble() %>%
  mutate(id_area = 1:nrow(data)) %>%
  rename(Mean = mean, `0.025` = `0.025quant`, `0.975` = `0.975quant`) %>%
  left_join(y = data[, c("county", "geometry", "id_area")], by = "id_area") %>%
  dplyr::select(county, Mean, `0.025`, `0.975`, geometry) %>%
  pivot_longer(cols = c(Mean, `0.025`, `0.975`)) %>%
  mutate(name = factor(name, levels = c("0.025", "Mean", "0.975"))) %>%
  st_as_sf()
ggplot(fitted_values) +
geom_sf(aes(fill = value)) +
facet_wrap(~ name, dir = "h", ncol = 3) +
scale_fill_gradient2(name = "Relative risk", midpoint = 1, low = "blue", mid = "white", high = "red") +
labs(x = "", y = "", title = "Estimated \"Relative Risk\" in 1988") +
  custom_theme +
  theme(axis.text.x = element_blank(),
        axis.text.y = element_blank(),
        axis.ticks = element_blank())
And we can retrieve the summary of the posterior distribution of the random effects by analyzing model_2_3$summary.random.

random_effects <- data[, c("county", "geometry")] %>%
  mutate("IID" = model_2_3$summary.random$id_area$mean,
         "BESAG" = model_2_3$summary.random$id_area_cp$mean) %>%
  pivot_longer(cols = c(IID, BESAG)) %>%
  mutate(name = factor(name, levels = c("IID", "BESAG"))) %>%
  st_as_sf()
ggplot(random_effects) +
geom_sf(aes(fill = value)) +
facet_wrap(~ name, dir = "h", ncol = 2) +
scale_fill_gradient2(name = "RE", midpoint = 0, low = "blue", mid = "white", high = "red") +
labs(x = "", y = "", title = "Posterior mean of the random effects") +
  custom_theme +
  theme(axis.text.x = element_blank(),
        axis.text.y = element_blank(),
        axis.ticks = element_blank())
Spatio-temporal models
We will continue analyzing the data set we introduced in the previous section (i.e., number of cases of lung cancer in Ohio, USA, from 1968 to 1988), but now we will also account for the possible temporal dependence that may exist over the years.
data <- m_ohio
head(data)
Simple feature collection with 6 features and 7 fields
Geometry type: POLYGON
Dimension: XY
Bounding box: xmin: -83.70532 ymin: 38.59659 xmax: -83.26755 ymax: 39.0552
Geodetic CRS: NAD83
# A tibble: 6 × 8
county year Y E SIR geometry id_area id_time
<chr> <dbl> <dbl> <dbl> <dbl> <POLYGON [°]> <dbl> <dbl>
1 Adams 1968 6 8.28 0.725 ((-83.67511 38.99929, -83.6748… 1 1
2 Adams 1969 5 8.50 0.588 ((-83.67511 38.99929, -83.6748… 1 2
3 Adams 1970 9 8.78 1.03 ((-83.67511 38.99929, -83.6748… 1 3
4 Adams 1971 6 9.18 0.654 ((-83.67511 38.99929, -83.6748… 1 4
5 Adams 1972 10 9.55 1.05 ((-83.67511 38.99929, -83.6748… 1 5
6 Adams 1973 7 10.1 0.693 ((-83.67511 38.99929, -83.6748… 1 6
Similar to before, we will model the relative risk as follows \begin{align*} Y_{it} &\sim \text{Poisson}(E_{it} \theta_{it}), \forall i, t, \\ \log(\theta_{it}) &= \alpha + \sum_{j = 1}^{n_f}f^{(j)}(u_{it,j}). \end{align*} However, notice that now all quantities depend on the year (or time, t). The random effects may depend on both space and time with possible interaction.
Replicates
An alternative for modeling space-time observations is to treat them as replicates. In INLA, using replicate implies that replicated effects share only the hyperparameters. This means that the values of the random effects in the different replicates can be different (Gómez-Rubio, 2021).

formula_1 <- Y ~ 0 + 1 + f(id_area, model = "besag", graph = g, replicate = id_time)
model_3_1 <- inla(formula = formula_1,
                  family = "poisson",
                  data = data,
                  E = E,
                  control.predictor = list(compute = TRUE))
summary(model_3_1)
Time used:
Pre = 2.26, Running = 3.81, Post = 0.2, Total = 6.27
Fixed effects:
mean sd 0.025quant 0.5quant 0.975quant mode kld
(Intercept) -0.089 0.005 -0.099 -0.089 -0.079 -0.089 0
Random effects:
Name Model
id_area Besags ICAR model
Model hyperparameters:
mean sd 0.025quant 0.5quant 0.975quant mode
Precision for id_area 4.30 0.26 3.81 4.29 4.83 4.27
Marginal log-Likelihood: -8822.75
is computed
Posterior summaries for the linear predictor and the fitted values are computed
(Posterior marginals needs also 'control.compute=list(return.marginals.predictor=TRUE)')
To plot the fitted values for the different spatio-temporal approaches, we will use the function plot_rr_years().

plot_rr_years <- function (d, col_name = "fitted.values", temporal_name = "year", tt = "", ...) {
  ggplot(d) +
    geom_sf(aes(fill = .data[[col_name]])) +
    facet_wrap(~ .data[[temporal_name]], dir = "h", ncol = 7) +
    scale_fill_gradient2(name = "Relative risk", midpoint = 1, low = "blue", mid = "white", high = "red") +
    labs(x = "", y = "", title = tt) +
    custom_theme +
    theme(axis.text.x = element_blank(),
          axis.text.y = element_blank(),
          axis.ticks = element_blank())
}

data_3_1 <- bind_cols(data[, c("county", "year", "geometry")], fitted.values = model_3_1$summary.fitted.values$mean)

plot_rr_years(d = data_3_1, tt = "Replicates")
Additive model
Alternatively, we can fit a model with an additive structure for random effects in space and time. For instance,
\begin{align*} \log(\theta_{it}) = \alpha + u_i + \varrho_t, \end{align*} where u_i \sim \text{ICAR} and \varrho_t follows, e.g., a first-order random walk (RW1).

formula_2 <- Y ~ 0 + 1 + f(id_time, model = "rw1") + f(id_area, model = "besag", graph = g)
model_3_2 <- inla(formula = formula_2,
                  family = "poisson",
                  data = data,
                  E = E,
                  control.predictor = list(compute = TRUE))
summary(model_3_2)
Time used:
Pre = 1.82, Running = 0.609, Post = 0.0369, Total = 2.46
Fixed effects:
mean sd 0.025quant 0.5quant 0.975quant mode kld
(Intercept) -0.122 0.006 -0.132 -0.122 -0.111 -0.122 0
Random effects:
Name Model
id_time RW1 model
id_area Besags ICAR model
Model hyperparameters:
mean sd 0.025quant 0.5quant 0.975quant mode
Precision for id_time 534.89 181.071 262.98 507.25 967.07 456.29
Precision for id_area 5.85 0.964 4.16 5.78 7.95 5.66
Marginal log-Likelihood: -6008.24
is computed
Posterior summaries for the linear predictor and the fitted values are computed
(Posterior marginals needs also 'control.compute=list(return.marginals.predictor=TRUE)')
data_3_2 <- bind_cols(data[, c("county", "year", "geometry")], fitted.values = model_3_2$summary.fitted.values$mean)

plot_rr_years(d = data_3_2, tt = "Additive model")
Kronecker product model
A more elaborate solution is to consider a spatio-temporal random effect. In particular, we will fit a grouped (or Kronecker product) separable model. A more rigorous motivation for this approach is available here.

In a nutshell, in the grouped random effect,
- There is a within-group correlation structure, and
- There is a between-group correlation structure.

In particular, if y_{g, i} is the i-th element in group g, then \text{Cov}(y_{g_1, i_1}, y_{g_2, i_2}) = (\text{cov. between groups } g_1 \text{ and } g_2) \times (\text{cov. between elements } i_1 \text{ and } i_2) (Simpson, 2016).
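As a minimal numeric sketch of this separability (with generic 2 x 2 covariance matrices, not the ICAR/RW1 structures used below):

# Separable space-time covariance as a Kronecker product
cov_time  <- matrix(c(1.0, 0.5, 0.5, 1.0), nrow = 2) # Between-group (time)
cov_space <- matrix(c(1.0, 0.3, 0.3, 1.0), nrow = 2) # Within-group (space)
cov_st <- kronecker(cov_time, cov_space) # 4 x 4 joint covariance
# The covariance between (g1 = 1, i1 = 1) and (g2 = 2, i2 = 2) factorizes
cov_st[1, 4] == cov_time[1, 2] * cov_space[1, 2]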
In INLA, the within-group effect is the one defined in the main f() function, while the between-group effect is the one defined using the control.group argument.

formula_3 <- Y ~ 0 + 1 + f(id_area, model = "besag", graph = g,
                           group = id_time, control.group = list(model = "rw1"))
model_3_3 <- inla(formula = formula_3,
                  family = "poisson",
                  data = data,
                  E = E,
                  control.predictor = list(compute = TRUE))
summary(model_3_3)
Time used:
Pre = 1.76, Running = 4.32, Post = 0.117, Total = 6.19
Fixed effects:
mean sd 0.025quant 0.5quant 0.975quant mode kld
(Intercept) -0.106 0.006 -0.117 -0.106 -0.095 -0.106 0
Random effects:
Name Model
id_area Besags ICAR model
Model hyperparameters:
mean sd 0.025quant 0.5quant 0.975quant mode
Precision for id_area 17.37 1.93 13.92 17.25 21.46 17.03
Marginal log-Likelihood: -9154.40
is computed
Posterior summaries for the linear predictor and the fitted values are computed
(Posterior marginals needs also 'control.compute=list(return.marginals.predictor=TRUE)')
data_3_3 <- bind_cols(data[, c("county", "year", "geometry")], fitted.values = model_3_3$summary.fitted.values$mean)

plot_rr_years(d = data_3_3, tt = "Kronecker product model (grouped by \"time\")")
Bernardinelli model
Lastly, we can also implement the Bernardinelli model (Bernardinelli et al., 1995) using INLA. In this case, the linear predictor is defined as follows
\begin{align*}
\log(\theta_{it}) = \alpha + u_i + v_i + (\beta + \delta_i) \times t,
\end{align*} where u_i + v_i is an iid + besag model, \beta is the global linear trend, and \delta_i denotes a space-time interaction; in particular, this model allows each area to have its own time trend, with spatial intercept given by (\alpha + u_i + v_i) and slope given by (\beta + \delta_i) (Moraga, 2019).

data$id_area_cp <- data$id_area # Copy of the area index (assumed here, since each f() term needs its own index variable)

formula_4 <- Y ~ 0 + 1 + f(id_area, model = "bym", graph = g) + f(id_area_cp, id_time, model = "iid") + id_time
model_3_4 <- inla(formula = formula_4,
                  family = "poisson",
                  data = data,
                  E = E,
                  control.predictor = list(compute = TRUE))
summary(model_3_4)
Time used:
Pre = 1.88, Running = 0.631, Post = 0.0372, Total = 2.55
Fixed effects:
mean sd 0.025quant 0.5quant 0.975quant mode kld
(Intercept) -0.528 0.021 -0.570 -0.528 -0.487 -0.528 0
id_time 0.037 0.001 0.035 0.037 0.039 0.037 0
Random effects:
Name Model
id_area BYM model
id_area_cp IID model
Model hyperparameters:
mean sd 0.025quant 0.5quant
Precision for id_area (iid component) 34.24 8.45 20.12 33.40
Precision for id_area (spatial component) 91.53 102.38 13.76 61.33
Precision for id_area_cp 49251.00 17455.34 24120.92 46259.24
0.975quant mode
Precision for id_area (iid component) 53.16 31.96
Precision for id_area (spatial component) 358.12 31.78
Precision for id_area_cp 91906.40 40773.85
Marginal log-Likelihood: -5938.96
is computed
Posterior summaries for the linear predictor and the fitted values are computed
(Posterior marginals needs also 'control.compute=list(return.marginals.predictor=TRUE)')
data_3_4 <- bind_cols(data[, c("county", "year", "geometry")], fitted.values = model_3_4$summary.fitted.values$mean)

plot_rr_years(d = data_3_4, tt = "Bernardinelli model")
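Given this parameterization, the area-specific time trends can be recovered from the fitted object; a minimal sketch (assuming the component names printed in the summary above):

# Posterior means of the area-specific slopes, beta + delta_i
trends <- model_3_4$summary.fixed["id_time", "mean"] +
  model_3_4$summary.random$id_area_cp$mean
summary(trends)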
Geostatistical data
Throughout this section, we will assume that \begin{align*} y_i = \mu + \zeta(x_i) + \varepsilon_i, ~\forall i, \end{align*} where y = (y_1, \cdots, y_n) are the observations at x = (x_1, \cdots, x_n), \mu is the common mean, \zeta(x) is a zero-mean stationary and isotropic Gaussian random field, and \varepsilon \overset{\text{i.i.d.}}{\sim}\text{Normal}(0, \sigma^2_{\varepsilon}). In particular, \zeta(x) has a Matérn structure.
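For reference, the Matérn covariance that \zeta(x) is assumed to have can be written directly in base R (a sketch; \sigma^2 is the marginal variance, \kappa the scale parameter, and \nu the smoothness):

# Matern covariance for a distance h > 0
matern_cov <- function(h, sigma2 = 1, kappa = 1, nu = 1) {
  sigma2 * (2^(1 - nu) / gamma(nu)) * (kappa * h)^nu * besselK(kappa * h, nu)
}
matern_cov(h = 0.5, sigma2 = 1, kappa = 2, nu = 1)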
Other more sophisticated models are described in Advanced Spatial Modeling with Stochastic Partial Differential Equations Using R and INLA (Krainski et al., 2019).
For fitting such a model, we can rely on a stochastic partial differential equation (SPDE) approach. As shown in Whittle (1963), a Gaussian random field with a Matérn covariance structure can be expressed as the solution of the following SPDE \begin{align*} (\kappa^2 - \Delta)^{\alpha/2}(\tau S(x)) = \mathcal{W}(x), \end{align*} where \Delta is the Laplacian, \mathcal{W}(x) is a Gaussian white-noise random process, \alpha controls the smoothness of the random field, \tau controls the variance, and \kappa is a scale parameter.
Based on this result, Lindgren et al. (2011) proposed a new procedure for representing a Gaussian random field with Matérn covariance as a Gaussian Markov Random Field (GMRF). They did this by expressing a solution of the SPDE using the Finite Element Method (FEM).
As in Moraga (2019),
An approximate solution of the SPDE can be found by using the Finite Element method. This method divides the spatial domain into a set of non-intersecting triangles leading to a triangulated mesh with n nodes and n basis functions. Basis functions \{\phi_k(x)\}_{k = 1}^{n} are defined as piecewise linear functions on each triangle that is equal to 1 at vertex k, and equal to 0 at the other vertices. Then, the continuously indexed Gaussian field \zeta is represented as a discretely indexed Gaussian Markov random field (GMRF) by means of the finite basis functions defined on the triangulated mesh \zeta(x) = \sum_{k = 1}^{n} \beta_k \phi_k(x), where n is the number of vertices of the triangulation, \{\phi_k(x)\}_{k = 1}^{n} denotes the piecewise linear basis functions, and \{\beta_k\}_{k = 1}^{n} are zero-mean Gaussian distributed weights.
The smoothness parameter \nu of the Matérn covariance structure is related to \alpha in the SPDE formulation through \alpha = \nu + (d / 2), where d (typically d = 2) is the dimension. In INLA, 0 \leq \alpha < 2.
PM2.5
We will analyze particulate matter 2.5 (PM2.5) levels, measured in \mu \text{g} / \text{m}^3, in the United States (USA) in 2022. The data are the same as those analyzed in Amaral et al. (2023) and can be downloaded from here.
data_USA <- readRDS(file = "data/example_4/data_USA.rds")
USA <- readRDS(file = "data/example_4/USA_filtered.rds")
head(data_USA)
Simple feature collection with 6 features and 2 fields
Geometry type: POINT
Dimension: XY
Bounding box: xmin: 415.5262 ymin: 3374.241 xmax: 611.4046 ymax: 3794.633
Projected CRS: +init=epsg:6345 +units=km +no_defs
# A tibble: 6 × 3
mean sd geometry
<dbl> <dbl> <POINT [km]>
1 7.95 4.52 (415.5262 3374.241)
2 7.3 3.96 (611.4046 3683.513)
3 7.24 3.70 (594.8104 3794.633)
4 8.66 4.20 (593.0379 3761.67)
5 9.75 5.28 (517.1733 3712.617)
6 8.44 3.39 (499.6639 3687.995)
ggplot() +
geom_sf(data = USA, fill = "white") +
geom_sf(data = data_USA, aes(fill = mean), color = "black", size = 3, shape = 21) +
scale_fill_gradientn(name = "PM2.5 level", colors = pal) +
labs(x = "", y = "", title = "") +
  custom_theme +
  theme(axis.text.x = element_blank(),
        axis.text.y = element_blank(),
        axis.ticks = element_blank())
Now, let us pre-process the data.
# Boundary coordinates
USA_coor <- sf::st_coordinates(USA)
USA_coor <- matrix(c(USA_coor[, 1], USA_coor[, 2]), ncol = 2)
colnames(USA_coor) <- c("lon", "lat")
head(USA_coor)
lon lat
[1,] -91.01918 5501.724
[2,] -90.93677 5501.492
[3,] -90.68302 5501.466
[4,] -90.57107 5501.223
[5,] -90.35477 5501.060
[6,] -90.34765 5501.055
# Points coordinates
data_coor <- sf::st_coordinates(data_USA)
data_USA <- bind_cols(data_USA, as_tibble(data_coor))
data_USA <- data_USA %>% rename(lon = X, lat = Y) %>% dplyr::select(mean, sd, lon, lat, geometry)

head(data_USA)
Simple feature collection with 6 features and 4 fields
Geometry type: POINT
Dimension: XY
Bounding box: xmin: 415.5262 ymin: 3374.241 xmax: 611.4046 ymax: 3794.633
Projected CRS: +init=epsg:6345 +units=km +no_defs
# A tibble: 6 × 5
mean sd lon lat geometry
<dbl> <dbl> <dbl> <dbl> <POINT [km]>
1 7.95 4.52 416. 3374. (415.5262 3374.241)
2 7.3 3.96 611. 3684. (611.4046 3683.513)
3 7.24 3.70 595. 3795. (594.8104 3794.633)
4 8.66 4.20 593. 3762. (593.0379 3761.67)
5 9.75 5.28 517. 3713. (517.1733 3712.617)
6 8.44 3.39 500. 3688. (499.6639 3687.995)
And create the mesh with the inla.mesh.2d() function. A tool for mesh assessment is described here.
# `max.edge`: the largest allowed triangle edge length. One or two values.
# `offset`: the automatic extension distance. One or two values, for an inner and an optional outer extension.
mesh <- inla.mesh.2d(loc.domain = USA_coor, max.edge = c(300, 3000), offset = c(300, 1500))

mesh$n # Number of nodes
[1] 680
# Plot `mesh`
{
  plot(mesh)
  plot(USA$geometry, lwd = 2, border = "red", add = TRUE)
  points(data_USA$lon, data_USA$lat, pch = 1, col = "green")
}
We can build the SPDE model using one of the following functions: inla.spde2.matern() or inla.spde2.pcmatern(). Notice that the parameterizations of these two models differ; the latter accepts PC priors.
# Building the SPDE model
# alpha = nu + d / 2 = 1 + 1 = 2, for nu = 1
# Flexible parameterization (with a default one)
# spde <- inla.spde2.matern(mesh = mesh, alpha = 2)
# Parameterization for PC priors
spde <- inla.spde2.pcmatern(mesh = mesh,
                            alpha = 2,
                            prior.range = c(1e3, 0.90), # P(range < 1e3) = 0.90
                            prior.sigma = c(1.0, 0.01)) # P(sigma > 1.0) = 0.01
Now, we can create an indxs object for the SPDE model, as well as the projection matrix. The projection matrix depends on the mesh and data_coor, and it is used to project the Gaussian random field (GRF) from the triangulation vertices to the observation locations.
# Indices
indxs <- inla.spde.make.index("s", spde$n.spde)

# Projection matrix
A <- inla.spde.make.A(mesh = mesh, loc = data_coor)
dim(A)
[1] 928 680
To make predictions, we have to define the set of coordinates where we want to estimate the process. Given the location boundaries and the desired resolution, the function create_prediction_grid() does the job.
create_prediction_grid <- function (loc, resolution = 25, ...) {
  pts_bdy <- loc$geometry[[1]][[1]]
  pts_bdy_x <- range(pts_bdy[, 1])
  pts_bdy_y <- range(pts_bdy[, 2])
  coord_pred <- expand.grid(x = seq(pts_bdy_x[1], pts_bdy_x[2], by = resolution), y = seq(pts_bdy_y[1], pts_bdy_y[2], by = resolution))
  coordinates(coord_pred) <- ~ x + y
  xx <- as(st_as_sf(coord_pred), "sf"); st_crs(xx) <- st_crs(loc$geometry)
  yy <- as(st_as_sf(loc$geometry), "sf"); st_crs(yy) <- st_crs(loc$geometry)

  # Compute intersection between grid and `loc` borders
  pppts <- st_intersection(x = xx, y = yy)

  coord_pred <- matrix(data = NA, nrow = length(pppts$geometry), ncol = 2)
  colnames(coord_pred) <- c("x", "y")
  for (p in 1:length(pppts$geometry)) { coord_pred[p, ] <- sf::st_coordinates(pppts$geometry[[p]]) }
  coord_pred <- data.frame(x = coord_pred[, 1], y = coord_pred[, 2])

  as.matrix(coord_pred)
}

coord_pred <- create_prediction_grid(USA)
{
  coord_pred_cp <- as.data.frame(coord_pred)
  coordinates(coord_pred_cp) <- ~ x + y
  plot(coord_pred_cp)
}
Similar to before, we must create a projection matrix for the locations where we want to make prediction.
# Projection matrix for prediction
Ap <- inla.spde.make.A(mesh = mesh, loc = coord_pred)
Lastly, we can organize the data in stacks.
# Create stacks
# Stack for estimation
<- inla.stack(tag = "est",
stk_e data = list(y = data_USA$mean),
A = list(1, A),
effects = list(data.frame(b0 = rep(1, nrow(data_USA))), s = indxs))
# Stack for prediction
<- inla.stack(tag = "pred",
stk_p data = list(y = NA),
A = list(1, Ap),
effects = list(data.frame(b0 = rep(1, nrow(coord_pred))), s = indxs))
# Full stack
<- inla.stack(stk_e, stk_p) stk_full
After all these steps, we can finally fit the model.
formula_1 <- y ~ 0 + b0 + f(s, model = spde)

model_4_1 <- inla(formula = formula_1,
                  family = "gaussian",
                  data = inla.stack.data(stk_full),
                  control.predictor = list(compute = TRUE,
                                           A = inla.stack.A(stk_full))) # Matrix of predictors
summary(model_4_1)
Time used:
Pre = 2.36, Running = 1.89, Post = 0.118, Total = 4.37
Fixed effects:
mean sd 0.025quant 0.5quant 0.975quant mode kld
b0 6.818 0.304 6.208 6.821 7.414 6.821 0
Random effects:
Name Model
s SPDE2 model
Model hyperparameters:
mean sd 0.025quant 0.5quant
Precision for the Gaussian observations 0.592 0.032 0.531 0.591
Range for s 492.618 86.960 345.990 484.328
Stdev for s 1.680 0.136 1.428 1.675
0.975quant mode
Precision for the Gaussian observations 0.656 0.59
Range for s 687.299 466.93
Stdev for s 1.964 1.67
Marginal log-Likelihood: -1721.00
is computed
Posterior summaries for the linear predictor and the fitted values are computed
(Posterior marginals needs also 'control.compute=list(return.marginals.predictor=TRUE)')
As we did in the “Multiple linear regression” example, we can work with the posterior marginals.
ss <- inla.tmarginal(fun = function(x) { 1 / sqrt(x) }, marginal = model_4_1$marginals.hyperpar$`Precision for the Gaussian observations`)
ggplot(data.frame(inla.smarginal(ss))) +
geom_line(aes(x, y)) +
labs(x = "", y = "", title = "Posterior of the standard deviation for the observations") +
custom_theme
ggplot(data.frame(inla.smarginal(model_4_1$marginals.hyperpar$`Range for s`))) +
geom_line(aes(x, y)) +
labs(x = "", y = "", title = "Posterior of the range in the Matérn model") +
custom_theme
ggplot(data.frame(inla.smarginal(model_4_1$marginals.hyperpar$`Stdev for s`))) +
geom_line(aes(x, y)) +
labs(x = "", y = "", title = "Posterior of the standard deviation in the Matérn model") +
custom_theme
Remark: According to this parameterization, the scale parameter \kappa > 0 can be expressed as \frac{\sqrt{8\nu}}{\text{range}}, where \nu is the smoothness parameter.
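For instance, a plug-in estimate of \kappa based on the posterior mean of the range (here \nu = 1, since we set \alpha = 2 with d = 2):

# kappa = sqrt(8 * nu) / range, evaluated at the posterior mean of the range
nu <- 1
sqrt(8 * nu) / model_4_1$summary.hyperpar["Range for s", "mean"]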
Prediction
Now, let us retrieve the predicted values.
# Fitted values and prediction
# Indices for the predicted observations
idxs_pred <- inla.stack.index(stk_full, tag = "pred")$data

pred_mm <- as.data.frame(cbind(coord_pred, model_4_1$summary.fitted.values[idxs_pred, "mean"]))
pred_ll <- as.data.frame(cbind(coord_pred, model_4_1$summary.fitted.values[idxs_pred, "0.025quant"]))
pred_uu <- as.data.frame(cbind(coord_pred, model_4_1$summary.fitted.values[idxs_pred, "0.975quant"]))

# E.g.,
pred_mm %>% as_tibble() %>% rename(posterior_mean = V3) %>% head()
# A tibble: 6 × 3
x y posterior_mean
<dbl> <dbl> <dbl>
1 1026. 2733. 7.02
2 1051. 2733. 7.00
3 1076. 2733. 6.96
4 1076. 2758. 7.01
5 1126. 2758. 6.97
6 1101. 2783. 7.00
To plot the posterior mean and percentiles, we will use the function plot_pred_USA().

plot_pred_USA <- function (fitted_values, USA, r, tt = "", should_round = TRUE, ...) {
  coordinates(fitted_values) <- ~ x + y
  gridded(fitted_values) <- TRUE
  fitted_values <- raster(fitted_values)
  crs(fitted_values) <- "+init=epsg:6345 +units=km +no_defs"
  fitted_values <- as(fitted_values, "SpatialPixelsDataFrame")
  fitted_values_df <- as.data.frame(fitted_values)
  colnames(fitted_values_df) <- c("pred", "x", "y")
  if (should_round) {
    breaks <- seq(floor(r[1]), ceiling(r[2]), length.out = 5)
  } else {
    breaks <- seq(r[1], r[2], length.out = 5)
  }

  pal <- c("#00008FFF", "#0000F2FF", "#0063FFFF", "#00D4FFFF", "#46FFB8FF", "#B8FF46FF", "#FFD400FF", "#FF6300FF", "#F00000FF", "#800000FF")

  pp <- ggplot() +
    geom_tile(data = fitted_values_df, mapping = aes(x = x, y = y, fill = pred)) +
    geom_sf(data = USA, color = "black", fill = NA, lwd = 0.5) +
    scale_fill_gradientn(name = "PM2.5", colors = pal, breaks = breaks, limits = c(breaks[1], tail(breaks, 1))) +
    labs(x = "", y = "", title = tt) +
    custom_theme +
    theme(axis.text.x = element_blank(),
          axis.text.y = element_blank(),
          axis.ticks = element_blank())
  pp
}
The patchwork package allows us to combine plots.
r_mm <- pred_mm$V3 %>% range()
r_ll <- pred_ll$V3 %>% range()
r_uu <- pred_uu$V3 %>% range()
r <- c(min(r_mm[1], r_ll[1], r_uu[1]), max(r_mm[2], r_ll[2], r_uu[2]))

pp_mm <- plot_pred_USA(fitted_values = pred_mm, USA = USA, r = r, tt = "Mean")
pp_ll <- plot_pred_USA(fitted_values = pred_ll, USA = USA, r = r, tt = "2.5th")
pp_uu <- plot_pred_USA(fitted_values = pred_uu, USA = USA, r = r, tt = "97.5th")

(pp_ll + pp_mm + pp_uu) + plot_layout(guides = "collect") & theme(legend.position = "right")
Point processes
In this section, we will see how to model a log-Gaussian Cox process (LGCP) using INLA. For a more detailed explanation of how to make inference for point processes (also using INLA), one can refer to this tutorial.
Recall that a Cox process can be seen as a doubly stochastic process: \xi is a Cox process driven by \Lambda(x) if
1. \{\Lambda(x); x \in \mathcal{D}\} is a non-negative-valued stochastic process, and
2. conditional on \{\Lambda(x) = \lambda(x); x \in \mathcal{D}\}, \xi is a Poisson process with intensity function \lambda(x).
A particular case of a Cox process, named the log-Gaussian Cox process, can be constructed by setting \log(\Lambda(x)) = \mu^{\star}(x) + \zeta(x), such that \mu(x) = \exp(\mu^{\star}(x)) can be interpreted as the mean structure of \Lambda(x), and \zeta(x) is a stationary Gaussian process with \mathbb{E}(\zeta(x)) = -\sigma^2/2, \forall x, and \text{Cov}(\zeta(x_1), \zeta(x_2)) = \phi(h) = \sigma^2 \rho(h), where h is the distance between x_1 and x_2, and \sigma^2 is the variance of \zeta(x).
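In particular, the choice \mathbb{E}(\zeta(x)) = -\sigma^2/2 makes \mu(x) exactly the mean of the intensity: since \zeta(x) is Gaussian with variance \sigma^2, \begin{align*} \mathbb{E}(\exp(\zeta(x))) = \exp\left(-\frac{\sigma^2}{2} + \frac{\sigma^2}{2}\right) = 1, \end{align*} and therefore \mathbb{E}(\Lambda(x)) = \mu(x).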
As we can see here, one way to fit an LGCP with INLA is to place a regular grid, given by cells \{c_{ij}\}_{ij}, on top of the spatial domain, such that \int_{c_{ij}}\Lambda(x)dx \approx |c_{ij}|\Lambda(x_{ij}), where |\cdot| denotes the area.
Thus, conditional on \zeta(x), the numbers of locations in the grid cells c_{ij}, \forall i, j, are independent and Poisson distributed, such that \begin{align*} \mathcal{N}(c_{ij})|\zeta(x) &\sim \text{Poisson}(|c_{ij}| \Lambda(x_{ij})), \forall i, j, \\ \log(\Lambda(x_{ij})) &= \alpha + \cdots + \zeta(x_{ij}), \end{align*} where \zeta(x) is a Gaussian field (e.g., matern2d).
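To make the discretization concrete, here is a minimal sketch (assuming only the data_coor matrix of station coordinates from the previous section; it is not used in the remainder of the tutorial) that bins the observed locations into a coarse regular grid, producing the cell counts \mathcal{N}(c_{ij}):
# Count stations per cell of a 25 x 25 grid over the bounding box of the data
grd_x <- seq(min(data_coor[, 1]), max(data_coor[, 1]), length.out = 26)
grd_y <- seq(min(data_coor[, 2]), max(data_coor[, 2]), length.out = 26)
counts <- table(cut(data_coor[, 1], grd_x, include.lowest = TRUE),
                cut(data_coor[, 2], grd_y, include.lowest = TRUE))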
However, Simpson et al. (2016) proposed a method to make inference on LGCPs while still considering the exact observation locations, instead of aggregating them over a regular grid and fitting a count model to each cell.
The main idea consists of approximating the likelihood by fitting a Poisson model to an augmented data set made of the observation locations and integration points from the corresponding SPDE mesh (a response of 1 for the observed points and 0 for the dummy points, all with associated “expected values” e). The expected number of events at each integration point is defined to be proportional to the area around the corresponding mesh node (the area of its polygon in the dual mesh).
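As a toy illustration of this augmented data set, consider 3 mesh nodes (with dual-mesh areas of, say, 10, 12, and 8) and 2 observed points:
# Response: `0` for the integration (mesh) points, `1` for the observed points
rep(0:1, c(3, 2))        # 0 0 0 1 1
# "Expected values" e: dual-mesh areas for the integration points, 0 for the observations
c(10, 12, 8, rep(0, 2))  # 10 12 8 0 0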
Remark: The main reference for this section (and the next one) is Chapter 4: Point processes and preferential sampling.
PM2.5 stations
Although the problem of estimating PM2.5 obviously depends on the observed pollution levels, we will ignore them for now and focus on the sampling process, i.e., the locations where the monitoring stations were placed, treating these locations as a realization of a random process; in particular, an LGCP.
head(data_USA)
Simple feature collection with 6 features and 4 fields
Geometry type: POINT
Dimension: XY
Bounding box: xmin: 415.5262 ymin: 3374.241 xmax: 611.4046 ymax: 3794.633
Projected CRS: +init=epsg:6345 +units=km +no_defs
# A tibble: 6 × 5
mean sd lon lat geometry
<dbl> <dbl> <dbl> <dbl> <POINT [km]>
1 7.95 4.52 416. 3374. (415.5262 3374.241)
2 7.3 3.96 611. 3684. (611.4046 3683.513)
3 7.24 3.70 595. 3795. (594.8104 3794.633)
4 8.66 4.20 593. 3762. (593.0379 3761.67)
5 9.75 5.28 517. 3713. (517.1733 3712.617)
6 8.44 3.39 500. 3688. (499.6639 3687.995)
ggplot() +
  geom_sf(data = USA, fill = "white") +
  geom_sf(data = data_USA, color = "black", size = 3, shape = 3) +
  labs(x = "", y = "", title = "") +
  custom_theme +
  theme(axis.text.x = element_blank(),
        axis.text.y = element_blank(),
        axis.ticks = element_blank())
We will also use the data pre-processed in the previous section.
head(data_coor)
X Y
[1,] 415.5262 3374.241
[2,] 611.4046 3683.513
[3,] 594.8104 3794.633
[4,] 593.0379 3761.670
[5,] 517.1733 3712.617
[6,] 499.6639 3687.995
head(USA_coor)
lon lat
[1,] -91.01918 5501.724
[2,] -90.93677 5501.492
[3,] -90.68302 5501.466
[4,] -90.57107 5501.223
[5,] -90.35477 5501.060
[6,] -90.34765 5501.055
Let us start with the mesh and the SPDE model. For the former, we will use the same object (mesh) we created in the previous section; for the latter, we may want to use different priors.
plot(mesh)
spde <- inla.spde2.pcmatern(mesh = mesh,
                            alpha = 2,
                            prior.range = c(1e3, 0.90), # P(range < 1e3) = 0.90
                            prior.sigma = c(1.0, 0.01)) # P(sigma > 1.0) = 0.01
We can then build the dual mesh using the function inla.dual.mesh() (extracted from Advanced Spatial Modeling with Stochastic Partial Differential Equations Using R and INLA).
inla.dual.mesh <- function (mesh, ...) {
  if (mesh$manifold == "R2") {
    ce <- t(sapply(1:nrow(mesh$graph$tv), function (i) { colMeans(mesh$loc[mesh$graph$tv[i, ], 1:2]) }))
    library("parallel")
    pls <- mclapply(1:mesh$n, function (i) {
      p <- unique(Reduce("rbind", lapply(1:3, function (k) {
        j <- which(mesh$graph$tv[, k] == i)
        if (length(j) > 0) {
          return(rbind(ce[j, , drop = FALSE],
                       cbind(mesh$loc[mesh$graph$tv[j, k], 1] + mesh$loc[mesh$graph$tv[j, c(2:3, 1)[k]], 1],
                             mesh$loc[mesh$graph$tv[j, k], 2] + mesh$loc[mesh$graph$tv[j, c(2:3, 1)[k]], 2]) / 2))
        } else {
          return(ce[j, , drop = FALSE])
        }
      })))
      j1 <- which(mesh$segm$bnd$idx[, 1] == i)
      j2 <- which(mesh$segm$bnd$idx[, 2] == i)
      if ((length(j1) > 0) | (length(j2) > 0)) {
        p <- unique(rbind(mesh$loc[i, 1:2], p,
                          mesh$loc[mesh$segm$bnd$idx[j1, 1], 1:2] / 2 + mesh$loc[mesh$segm$bnd$idx[j1, 2], 1:2] / 2,
                          mesh$loc[mesh$segm$bnd$idx[j2, 1], 1:2] / 2 + mesh$loc[mesh$segm$bnd$idx[j2, 2], 1:2] / 2))
        yy <- p[, 2] - mean(p[, 2]) / 2 - mesh$loc[i, 2] / 2
        xx <- p[, 1] - mean(p[, 1]) / 2 - mesh$loc[i, 1] / 2
      } else {
        yy <- p[, 2] - mesh$loc[i, 2]
        xx <- p[, 1] - mesh$loc[i, 1]
      }
      Polygon(p[order(atan2(yy, xx)), ])
    })
    return(SpatialPolygons(lapply(1:mesh$n, function (i) { Polygons(list(pls[[i]]), i) })))
  } else { stop("It only works for R2.") }
}

dual_mesh <- inla.dual.mesh(mesh)
Now, we can compute the weights (wgt) based on the dual mesh, such that the regions outside of the domain boundary have weight 0.
USA_poly <- SpatialPolygons(list(Polygons(list(Polygon(USA_coor)), ID = "1")))
wgt <- sapply(1:length(dual_mesh), function (i) {
  if (gIntersects(dual_mesh[i, ], USA_poly)) {
    return(gArea(gIntersection(dual_mesh[i, ], USA_poly)))
  } else {
    return(0)
  }
})
hist(wgt, main = "")
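As a sanity check, the weights are integration weights over the study region, so their sum should match the area of USA_poly (up to numerical error):
sum(wgt)        # Total integration weight
gArea(USA_poly) # Area of the study region; the two should roughly agree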
# Plotting the dual mesh, color-coded based on `wgt`
{
  plot(mesh$loc, asp = 1, col = (wgt == 0) + 1, pch = 19, xlab = "", ylab = "", axes = FALSE, cex = 0.6)
  plot(mesh, add = TRUE)
  plot(dual_mesh, add = TRUE)
  plot(USA$geometry, add = TRUE, border = "green", lwd = 2)
}
As before, we must construct the projection matrices and format the data in the “right” way (i.e., in stacks). For prediction, we assume the same coord_pred as in the previous section.
head(coord_pred)
x y
[1,] 1025.571 2732.56
[2,] 1050.571 2732.56
[3,] 1075.571 2732.56
[4,] 1075.571 2757.56
[5,] 1125.571 2757.56
[6,] 1100.571 2782.56
n_vtx <- mesh$n
n_pts <- nrow(data_USA)
n_pts_pred <- nrow(coord_pred)

# Indices
indxs <- inla.spde.make.index("s", spde$n.spde)

# Augmented data: `0` for the mesh nodes and `1` for the observations
y_pp <- rep(0:1, c(n_vtx, n_pts))
# Exposure vector
e_pp <- c(wgt, rep(0, n_pts))
# Projection matrix (in two parts)
# (1) For the integration points, this is just a diagonal matrix, as these locations are the mesh vertices
imat <- Diagonal(n_vtx, rep(1, n_vtx))
# (2) For the observed points, the projection matrix is defined with `inla.spde.make.A()`
ymat <- inla.spde.make.A(mesh, data_coor)
# (1) + (2)
A_pp <- rbind(imat, ymat)

# Prediction
# Projection matrix for the prediction points
A_pp_p <- inla.spde.make.A(mesh = mesh, loc = coord_pred)
# Create stacks
# Stack for estimation
stk_pp_e <- inla.stack(tag = "est_pp",
                       data = list(y = y_pp, e = e_pp),
                       A = list(1, A_pp),
                       effects = list(alpha_pp = rep(1, n_vtx + n_pts), s = indxs))
# Stack for prediction
stk_pp_p <- inla.stack(tag = "pred_pp",
                       data = list(y = rep(NA, n_pts_pred), e = rep(1, n_pts_pred)),
                       A = list(1, A_pp_p),
                       effects = list(alpha_pp = rep(1, n_pts_pred), s = indxs))
# Full stack
stk_full_pp <- inla.stack(stk_pp_e, stk_pp_p)
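Before fitting, a quick structural check (purely illustrative) confirms that the stacked response has one entry per mesh node, observed point, and prediction location:
# Total stacked rows = mesh nodes + observations + prediction points
length(inla.stack.data(stk_full_pp)$y) == n_vtx + n_pts + n_pts_pred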
Let us fit the model. Note that we are using a Poisson likelihood.
formula_1 <- y ~ 0 + alpha_pp + f(s, model = spde)

model_5_1 <- inla(formula = formula_1,
                  family = "poisson",
                  E = inla.stack.data(stk_full_pp)$e,
                  data = inla.stack.data(stk_full_pp),
                  control.predictor = list(link = 1,
                                           compute = TRUE,
                                           A = inla.stack.A(stk_full_pp)))
summary(model_5_1)
Call:
   c("inla.core(formula = formula, family = family, ...)")
Time used:
Pre = 2.66, Running = 2.41, Post = 0.15, Total = 5.21
Fixed effects:
mean sd 0.025quant 0.5quant 0.975quant mode kld
alpha_pp -9.187 0.334 -9.811 -9.202 -8.476 -9.251 0
Random effects:
Name Model
s SPDE2 model
Model hyperparameters:
mean sd 0.025quant 0.5quant 0.975quant mode
Range for s 1217.356 305.964 749.947 1173.118 1945.13 1079.317
Stdev for s 0.937 0.147 0.692 0.922 1.27 0.886
Marginal log-Likelihood: -9117.83
is computed
Posterior summaries for the linear predictor and the fitted values are computed
(Posterior marginals needs also 'control.compute=list(return.marginals.predictor=TRUE)')
Prediction
Now, let us retrieve the predicted values for the intensity function and plot them.
idx_pp <- inla.stack.index(stk_full_pp, tag = "pred_pp")$data

pred_pp_mm <- as.data.frame(cbind(coord_pred, model_5_1$summary.fitted.values[idx_pp, "mean"]))
pred_pp_ll <- as.data.frame(cbind(coord_pred, model_5_1$summary.fitted.values[idx_pp, "0.025quant"]))
pred_pp_uu <- as.data.frame(cbind(coord_pred, model_5_1$summary.fitted.values[idx_pp, "0.975quant"]))

# Expected number of observations
sum(pred_pp_mm$V3 * (25 ** 2))
[1] 923.9056
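This estimate (the fitted intensity integrated over the 25 km prediction grid) can be compared with the actual number of monitoring stations; if the fitted intensity is reasonable, the two should be of the same order of magnitude.
# Observed number of stations, for comparison with the expected count above
nrow(data_USA)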
r_mm <- pred_pp_mm$V3 %>% range()
r_ll <- pred_pp_ll$V3 %>% range()
r_uu <- pred_pp_uu$V3 %>% range()
r <- c(min(r_mm[1], r_ll[1], r_uu[1]), max(r_mm[2], r_ll[2], r_uu[2]))

pp_pp_mm <- plot_pred_USA(fitted_values = pred_pp_mm, USA = USA, r = r, tt = "Mean", should_round = FALSE)
pp_pp_ll <- plot_pred_USA(fitted_values = pred_pp_ll, USA = USA, r = r, tt = "2.5th", should_round = FALSE)
pp_pp_uu <- plot_pred_USA(fitted_values = pred_pp_uu, USA = USA, r = r, tt = "97.5th", should_round = FALSE)

(pp_pp_ll + pp_pp_mm + pp_pp_uu) + plot_layout(guides = "collect") & theme(legend.position = "right")
Preferential sampling (PS)
At this point, we can implement a more sophisticated model for PM2.5. We will assume that the sampling process (modeled in the previous section) is not independent of the latent field. In our problem, this is equivalent to stating that the locations where the monitoring stations were placed are somehow influenced by the pollution levels at those sites. This is known as preferential sampling (Diggle et al., 2010).
Formally, \begin{align*} y_i &= \mu + \zeta(x_i) + \varepsilon_i, ~\forall i, \\ \xi|\zeta(x) &\sim \text{PPP}(\Lambda(x)) \\ \log(\Lambda(x)) &= \alpha + \cdots + \gamma \cdot \zeta(x), \end{align*} where \gamma \in \mathbb{R} is the “degree of preferentiality”, and all other quantities are defined as before. In particular, \gamma = 0 recovers independent sampling, while \gamma > 0 means that stations tend to be placed where the latent field (and hence the pollution level) is higher.
In INLA, we can fit this model using multiple likelihoods and a copy of the random effects.
PM2.5
To fit the model, we will re-use many objects we already defined in previous sections.
# The following objects will be re-used
# `data_USA`
# `USA`
# `data_coor`
# `USA_coor`
# `mesh`
# `dual_mesh`
# `wgt`
# `coord_pred`
ggplot() +
  geom_sf(data = USA, fill = "white") +
  geom_sf(data = data_USA, aes(fill = mean), color = "black", size = 3, shape = 21) +
  scale_fill_gradientn(name = "PM2.5 level", colors = pal) +
  labs(x = "", y = "", title = "") +
  custom_theme +
  theme(axis.text.x = element_blank(),
        axis.text.y = element_blank(),
        axis.ticks = element_blank())
As the SPDE model may have different priors, we will define it again.
spde <- inla.spde2.pcmatern(mesh = mesh,
                            alpha = 2,
                            prior.range = c(1e3, 0.90), # P(range < 1e3) = 0.90
                            prior.sigma = c(1.0, 0.01)) # P(sigma > 1.0) = 0.01
Similar to before, let us construct the projection matrices and format the data into stacks.
n_vtx <- mesh$n
n_pts <- nrow(data_USA)
n_pts_pred <- nrow(coord_pred)

indxs <- inla.spde.make.index("s", spde$n.spde)
indxv <- inla.spde.make.index("v", spde$n.spde)

y_pp <- rep(0:1, c(n_vtx, n_pts))
e_pp <- c(wgt, rep(0, n_pts))
imat <- Diagonal(n_vtx, rep(1, n_vtx))
ymat <- inla.spde.make.A(mesh, data_coor)
A_pp <- rbind(imat, ymat)
# Prediction
A_pp_p <- inla.spde.make.A(mesh = mesh, loc = coord_pred)
The stacks will have two columns for the data component, one for each likelihood. Besides, notice that we have stacks for the gaussian component (stk_y_X) and for the poisson component (stk_pp_X), both for estimation and prediction. At the end, all of them are combined into one object (stk_full_pp_y).
# Create stacks
# `data` has two columns, one for each likelihood
stk_y_e <- inla.stack(tag = "est_y",
                      data = list(y = cbind(data_USA$mean, NA), e = rep(NA, n_pts)),
                      A = list(1, ymat),
                      effects = list(mu = rep(1, n_pts), s = indxs))
stk_y_p <- inla.stack(tag = "pred_y",
                      data = list(y = cbind(rep(NA, n_pts_pred), NA), e = rep(NA, n_pts_pred)),
                      A = list(1, A_pp_p),
                      effects = list(mu = rep(1, n_pts_pred), s = indxs))
stk_pp_e <- inla.stack(tag = "est_pp",
                       data = list(y = cbind(NA, y_pp), e = e_pp),
                       A = list(1, A_pp),
                       effects = list(alpha_pp = rep(1, n_vtx + n_pts), v = indxv))
stk_pp_p <- inla.stack(tag = "pred_pp",
                       data = list(y = cbind(NA, rep(NA, n_pts_pred)), e = rep(1, n_pts_pred)),
                       A = list(1, A_pp_p),
                       effects = list(alpha_pp = rep(1, n_pts_pred), v = indxv))
# Full stack
stk_full_pp_y <- inla.stack(stk_y_e, stk_y_p, stk_pp_e, stk_pp_p)
Now, let us fit the model.
To do so, notice that we are “copying” a random effect. The copy feature allows for the inclusion of a shared term in several linear predictors. Besides, we are assigning a \text{Normal}(0, 1) prior to \gamma.
re_prior <- list(prior = "gaussian", param = c(0, 1))

# `fixed = FALSE` allows the copy coefficient to be estimated
formula_1 <- y ~ 0 + mu + alpha_pp + f(s, model = spde) + f(v, copy = "s", fixed = FALSE, hyper = list(beta = re_prior))
model_6_1 <- inla(formula = formula_1,
                  family = c("gaussian", "poisson"),
                  E = inla.stack.data(stk_full_pp_y)$e,
                  data = inla.stack.data(stk_full_pp_y),
                  control.predictor = list(link = rep(c(1, 2), c((n_pts + n_pts_pred), (n_vtx + n_pts + n_pts_pred))),
                                           compute = TRUE,
                                           A = inla.stack.A(stk_full_pp_y)))
summary(model_6_1)
Call:
   c("inla.core(formula = formula, family = family, ...)")
Time used:
Pre = 2.63, Running = 5.76, Post = 0.508, Total = 8.9
Fixed effects:
mean sd 0.025quant 0.5quant 0.975quant mode kld
mu 6.653 0.378 5.927 6.643 7.442 6.645 0
alpha_pp -9.278 0.238 -9.736 -9.284 -8.778 -9.283 0
Random effects:
Name Model
s SPDE2 model
v Copy
Model hyperparameters:
mean sd 0.025quant 0.5quant
Precision for the Gaussian observations 0.409 0.026 0.360 0.409
Range for s 971.868 293.364 548.388 922.273
Stdev for s 1.302 0.206 0.955 1.283
Beta for v 0.634 0.066 0.508 0.633
0.975quant mode
Precision for the Gaussian observations 0.462 0.408
Range for s 1689.346 822.169
Stdev for s 1.764 1.238
Beta for v 0.766 0.629
Marginal log-Likelihood: -10982.56
is computed
Posterior summaries for the linear predictor and the fitted values are computed
(Posterior marginals needs also 'control.compute=list(return.marginals.predictor=TRUE)')
ggplot(data.frame(inla.smarginal(model_6_1$marginals.hyperpar$`Beta for v`))) +
geom_line(aes(x, y)) +
labs(x = "", y = "", title = "Posterior of the degree of preferentiality (gamma)") +
custom_theme
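Since \gamma > 0 indicates preferential sampling, we may also summarize this marginal by the posterior probability that \gamma is positive, e.g., with inla.pmarginal():
# Posterior probability of positive preferentiality, P(gamma > 0)
1 - inla.pmarginal(0, model_6_1$marginals.hyperpar$`Beta for v`)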
Prediction
As before, we can finally retrieve the fitted values, both for the response and the intensity processes.
## Response
idx_y <- inla.stack.index(stk_full_pp_y, tag = "pred_y")$data

pred_y_mm <- as.data.frame(cbind(coord_pred, model_6_1$summary.fitted.values[idx_y, "mean"]))
pred_y_ll <- as.data.frame(cbind(coord_pred, model_6_1$summary.fitted.values[idx_y, "0.025quant"]))
pred_y_uu <- as.data.frame(cbind(coord_pred, model_6_1$summary.fitted.values[idx_y, "0.975quant"]))

r_mm <- pred_y_mm$V3 %>% range()
r_ll <- pred_y_ll$V3 %>% range()
r_uu <- pred_y_uu$V3 %>% range()
r <- c(min(r_mm[1], r_ll[1], r_uu[1]), max(r_mm[2], r_ll[2], r_uu[2]))

pp_y_mm <- plot_pred_USA(fitted_values = pred_y_mm, USA = USA, r = r, tt = "Mean")
pp_y_ll <- plot_pred_USA(fitted_values = pred_y_ll, USA = USA, r = r, tt = "2.5th")
pp_y_uu <- plot_pred_USA(fitted_values = pred_y_uu, USA = USA, r = r, tt = "97.5th")

(pp_y_ll + pp_y_mm + pp_y_uu) + plot_layout(guides = "collect") & theme(legend.position = "right")
## Intensity process
idx_pp <- inla.stack.index(stk_full_pp_y, tag = "pred_pp")$data

pred_pp_mm <- as.data.frame(cbind(coord_pred, model_6_1$summary.fitted.values[idx_pp, "mean"]))
pred_pp_ll <- as.data.frame(cbind(coord_pred, model_6_1$summary.fitted.values[idx_pp, "0.025quant"]))
pred_pp_uu <- as.data.frame(cbind(coord_pred, model_6_1$summary.fitted.values[idx_pp, "0.975quant"]))

# Expected number of observations
sum(pred_pp_mm$V3 * (25 ** 2))
[1] 903.7687
r_mm <- pred_pp_mm$V3 %>% range()
r_ll <- pred_pp_ll$V3 %>% range()
r_uu <- pred_pp_uu$V3 %>% range()
r <- c(min(r_mm[1], r_ll[1], r_uu[1]), max(r_mm[2], r_ll[2], r_uu[2]))

pp_pp_mm <- plot_pred_USA(fitted_values = pred_pp_mm, USA = USA, r = r, tt = "Mean", should_round = FALSE)
pp_pp_ll <- plot_pred_USA(fitted_values = pred_pp_ll, USA = USA, r = r, tt = "2.5th", should_round = FALSE)
pp_pp_uu <- plot_pred_USA(fitted_values = pred_pp_uu, USA = USA, r = r, tt = "97.5th", should_round = FALSE)

(pp_pp_ll + pp_pp_mm + pp_pp_uu) + plot_layout(guides = "collect") & theme(legend.position = "right")
One more thing (Spat. varying PS)
Expanding upon the PS model presented in the previous section, it may be desirable to enable the degree of preferentiality to vary across space. This could be especially beneficial when analyzing extensive regions, as the strength of dependence between the latent field and the sampling process is unlikely to remain constant across the spatial domain. This approach was originally proposed by Amaral et al. (2023) with code available here.
In particular, \begin{align*} y_i &= \mu + \zeta(x_i) + \varepsilon_i, \forall i \\ \xi|\zeta(x) &\sim \text{PPP}(\Lambda(x)) \\ \log(\Lambda(x)) &= \alpha + \cdots + \gamma(x) \cdot \zeta(x) \\ \gamma(x) &= \sum_{k = 1}^{K}\beta_k \phi_k(x), \text{ s.t. } \beta_k \overset{\text{i.i.d.}}{\sim} \text{Normal}(0, \sigma^2_{\beta}), \forall k. \end{align*} From the above formulation, notice that \gamma(x) is approximated by a combination of unknown coefficients \{\beta_k\}_k and basis functions \{\phi_k(x)\}_k. All remaining quantities are defined as before.
The type, number, and location of such basis functions are problem-dependent and Amaral et al. (2023) provide guidelines on how to set these aspects of the model.
For this tutorial, we will use radial basis functions built from a compactly supported Wendland function (Wendland, 2015) defined in space.
Radial Basis (Wendland): let h_k = \left(||x - a_k|| / b_k\right), such that b_k \in \mathbb{R} and a_k \in \mathbb{R}^2, \forall k. Under this setting, a_k is also known as a “node point,” and, sometimes, b_k can also be referred to as “bandwidth.” Then, \phi_k(x) = \begin{cases} (1 - h_k)^6 (35h_k^2 + 18h_k + 3) / 3 & h_k \in [0, 1] \\ 0 & \text{otherwise.} \end{cases}
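As a quick visual check of this definition, we can plot \phi as a function of h with base R; the function equals 1 at h = 0, decays smoothly, and vanishes for h > 1.
# Wendland profile phi(h); zero outside [0, 1]
h <- seq(0, 1.2, by = 0.01)
phi <- (1 - h) ^ 6 * (35 * h ^ 2 + 18 * h + 3) / 3 * as.numeric(h <= 1)
plot(h, phi, type = "l", xlab = "h", ylab = "phi(h)")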
PM2.5
Again, to fit the model, we will re-use objects we already defined in previous sections.
# The following objects will be re-used
# `data_USA`
# `USA`
# `data_coor`
# `USA_coor`
# `mesh`
# `dual_mesh`
# `wgt`
# `coord_pred`
ggplot() +
  geom_sf(data = USA, fill = "white") +
  geom_sf(data = data_USA, aes(fill = mean), color = "black", size = 3, shape = 21) +
  scale_fill_gradientn(name = "PM2.5 level", colors = pal) +
  labs(x = "", y = "", title = "") +
  custom_theme +
  theme(axis.text.x = element_blank(),
        axis.text.y = element_blank(),
        axis.ticks = element_blank())
As the SPDE model may have different priors, we will define it again.
spde <- inla.spde2.pcmatern(mesh = mesh,
                            alpha = 2,
                            prior.range = c(1e3, 0.90), # P(range < 1e3) = 0.90
                            prior.sigma = c(1.0, 0.01)) # P(sigma > 1.0) = 0.01
The next step is to pre-compute the basis functions at the “mesh nodes + observed locations” for estimation and at the “prediction locations” for prediction. To do so, we will use the functions basis_functions() and Wendland().
Remark: the user can define other smoothing kernels (following the template of Wendland()) and use them along with basis_functions().
# Generic function
basis_functions <- function (center_pts, loct, mesh = NULL, bandwidth, smoothing_kernel = "Wendland", ...) {
  if (is.null(mesh)) {
    all_pts <- loct
  } else {
    all_pts <- rbind(mesh$loc[, 1:2], loct)
  }
  f <- get(smoothing_kernel)
  b <- list()
  n_basis_functions <- nrow(center_pts)
  for (i in 1:n_basis_functions) {
    b[[i]] <- f(x = all_pts[, 1], y = all_pts[, 2], center_x = center_pts[i, 1], center_y = center_pts[i, 2], bandwidth = bandwidth)
  }
  b
}

# Basis function implementations (e.g., `Wendland`)
Wendland <- function (x, y, center_x, center_y, bandwidth = 0.5, ...) {
  pts <- cbind((x - center_x), (y - center_y))
  d <- apply(X = pts, MARGIN = 1, FUN = function (x) { sqrt(sum(x ** 2)) }) / bandwidth
  k <- (1 - d) ^ 6 * (35 * d ^ 2 + 18 * d + 3) / 3 * as.numeric(d <= 1)
  k <- k / max(k)
  k
}
Thus, for bandwidth = 2500 and center_pts defined as below, we can pre-compute the basis functions for both estimation and prediction.
bandwidth <- 2500
smoothing_kernel <- "Wendland"
center_pts <- as.matrix(rbind(c(-2372, 4346), c(-1984, 5363), c( -918, 5143),
                              c( -518, 3817), c(  263, 4987), c(  677, 4096),
                              c( 1494, 4720), c( -198, 4291), c(-1710, 4513),
                              c(  875, 3602), c(-1202, 3742), c(-1310, 4946),
                              c( -190, 5189), c(  920, 4550), c( -233, 4800)))

bfs <- basis_functions(center_pts = center_pts, loct = data_coor, mesh = mesh, smoothing_kernel = smoothing_kernel, bandwidth = bandwidth)
bfs_pred <- basis_functions(center_pts = center_pts, loct = coord_pred, mesh = NULL, smoothing_kernel = smoothing_kernel, bandwidth = bandwidth)
pps <- list()
for (i in 1:nrow(center_pts)) {
  pps[[i]] <- plot_pred_USA(fitted_values = as.data.frame(cbind(coord_pred, bfs_pred[[i]])), USA = USA, r = c(0, 1), tt = i)
}

((pps[[ 1]] + pps[[ 2]] + pps[[ 3]]) /
 (pps[[ 4]] + pps[[ 5]] + pps[[ 6]]) /
 (pps[[ 7]] + pps[[ 8]] + pps[[ 9]]) /
 (pps[[10]] + pps[[11]] + pps[[12]]) /
 (pps[[13]] + pps[[14]] + pps[[15]])) + plot_layout(guides = "collect") & theme(legend.position = "none")
One more time, let us construct the projection matrices and format the data into stacks. Notice that the projection matrices for all effects in the Poisson model are obtained by multiplying each column of a base matrix by the vector of basis-function values evaluated at the corresponding locations.
n_vtx <- mesh$n
n_pts <- nrow(data_USA)
n_pts_pred <- nrow(coord_pred)

indx <- 1:spde$n.spde

y_pp <- rep(0:1, c(n_vtx, n_pts))
e_pp <- c(wgt, rep(0, n_pts))
imat <- Diagonal(n_vtx, rep(1, n_vtx))
ymat <- inla.spde.make.A(mesh, data_coor)

# Estimation
A_pp_base <- rbind(imat, ymat)
n_basis_functions <- nrow(center_pts)
A_pp <- list()
for (i in 1:n_basis_functions) {
  # Multiply each column by the basis function evaluated at the corresponding locations
  A_pp[[i]] <- A_pp_base * bfs[[i]]
}

# Prediction
A_pp_p_base <- inla.spde.make.A(mesh = mesh, loc = as.matrix(coord_pred))
A_pp_p <- list()
for (i in 1:n_basis_functions) {
  A_pp_p[[i]] <- A_pp_p_base * bfs_pred[[i]]
}
The stacks are defined similarly to before. However, for the Poisson model, note that the effects are built dynamically to reflect the number of basis functions.
# Create stacks
#########################
# Create a list of effects based on the number of basis functions
#########################
fx_effects <- list(rep(1, n_vtx + n_pts))
rd_effects <- list()
for (i in 1:length(A_pp)) { rd_effects <- append(rd_effects, list(indx)) }
effects <- append(fx_effects, rd_effects)
names(effects) <- c("alpha_pp", paste("v_", 1:length(A_pp), sep = ""))

effects_pred <- effects
effects_pred[[1]] <- rep(1, times = n_pts_pred)
#########################
#########################

# Stacks for `y` are similar to before; however, stacks for `pp` have different structures for `A` and `effects`
stk_y_e <- inla.stack(tag = "est_y",
                      data = list(y = cbind(data_USA$mean, NA), e = rep(NA, n_pts)),
                      A = list(1, ymat),
                      effects = list(mu = rep(1, n_pts), s = indx))
stk_y_p <- inla.stack(tag = "pred_y",
                      data = list(y = cbind(rep(NA, n_pts_pred), NA), e = rep(NA, n_pts_pred)),
                      A = list(1, A_pp_p_base),
                      effects = list(mu = rep(1, n_pts_pred), s = indx))
stk_pp_e <- inla.stack(tag = "est_pp",
                       data = list(y = cbind(NA, y_pp), e = e_pp),
                       A = append(1, A_pp),
                       effects = effects)
stk_pp_p <- inla.stack(tag = "pred_pp",
                       data = list(y = cbind(NA, rep(NA, n_pts_pred)), e = rep(1, n_pts_pred)),
                       A = append(1, A_pp_p),
                       effects = effects_pred)
# Full stack
stk_full_pp_y <- inla.stack(stk_y_e, stk_y_p, stk_pp_e, stk_pp_p)
Now, let us fit the model. In this case, although the formula has a similar structure to what we used for the PS model, we also build it dynamically, as it depends on the number of basis functions.
random_effects <- paste(paste("f(v_", 1:length(A_pp), ", copy = \"s\", fixed = FALSE, hyper = list(beta = re_prior))", sep = ""), collapse = " + ")
formula_1 <- paste("y ~ 0 + mu + alpha_pp + f(s, model = spde) + ", random_effects, sep = "")
formula_1 <- eval(parse(text = formula_1))
formula_1
y ~ 0 + mu + alpha_pp + f(s, model = spde) + f(v_1, copy = "s",
fixed = FALSE, hyper = list(beta = re_prior)) + f(v_2, copy = "s",
fixed = FALSE, hyper = list(beta = re_prior)) + f(v_3, copy = "s",
fixed = FALSE, hyper = list(beta = re_prior)) + f(v_4, copy = "s",
fixed = FALSE, hyper = list(beta = re_prior)) + f(v_5, copy = "s",
fixed = FALSE, hyper = list(beta = re_prior)) + f(v_6, copy = "s",
fixed = FALSE, hyper = list(beta = re_prior)) + f(v_7, copy = "s",
fixed = FALSE, hyper = list(beta = re_prior)) + f(v_8, copy = "s",
fixed = FALSE, hyper = list(beta = re_prior)) + f(v_9, copy = "s",
fixed = FALSE, hyper = list(beta = re_prior)) + f(v_10, copy = "s",
fixed = FALSE, hyper = list(beta = re_prior)) + f(v_11, copy = "s",
fixed = FALSE, hyper = list(beta = re_prior)) + f(v_12, copy = "s",
fixed = FALSE, hyper = list(beta = re_prior)) + f(v_13, copy = "s",
fixed = FALSE, hyper = list(beta = re_prior)) + f(v_14, copy = "s",
fixed = FALSE, hyper = list(beta = re_prior)) + f(v_15, copy = "s",
fixed = FALSE, hyper = list(beta = re_prior))
model_7_1 <- inla(formula = formula_1,
                  family = c("gaussian", "poisson"),
                  E = inla.stack.data(stk_full_pp_y)$e,
                  data = inla.stack.data(stk_full_pp_y),
                  control.predictor = list(link = rep(c(1, 2), c((n_pts + n_pts_pred), (n_vtx + n_pts + n_pts_pred))),
                                           compute = TRUE,
                                           A = inla.stack.A(stk_full_pp_y)))

# Alternatively, load a previously saved fit
model_7_1 <- readRDS(file = "models/model_7_1.RDS")
summary(model_7_1)
Call:
   c("inla.core(formula = formula, family = family, ...)")
Time used:
Pre = 5.26, Running = 1269, Post = 3.26, Total = 1278
Fixed effects:
mean sd 0.025quant 0.5quant 0.975quant mode kld
mu 5.453 0.226 4.933 5.471 5.836 5.511 0
alpha_pp -9.820 0.115 -10.057 -9.818 -9.594 -9.818 0
Random effects:
Name Model
s SPDE2 model
v_1 Copy
v_2 Copy
v_3 Copy
v_4 Copy
v_5 Copy
v_6 Copy
v_7 Copy
v_8 Copy
v_9 Copy
v_10 Copy
v_11 Copy
v_12 Copy
v_13 Copy
v_14 Copy
v_15 Copy
Model hyperparameters:
mean sd 0.025quant 0.5quant
Precision for the Gaussian observations 0.482 0.029 0.427 0.481
Range for s 1522.722 430.223 858.456 1462.383
Stdev for s 1.812 0.331 1.256 1.779
Beta for v_1 0.278 0.187 -0.096 0.280
Beta for v_2 0.453 0.427 -0.381 0.450
Beta for v_3 -0.323 1.010 -2.340 -0.313
Beta for v_4 0.819 0.262 0.312 0.817
Beta for v_5 0.958 0.651 -0.322 0.957
Beta for v_6 0.213 0.255 -0.292 0.214
Beta for v_7 1.645 0.208 1.239 1.643
Beta for v_8 -0.839 0.446 -1.733 -0.834
Beta for v_9 0.737 0.445 -0.124 0.732
Beta for v_10 0.310 0.209 -0.096 0.309
Beta for v_11 -0.773 0.290 -1.349 -0.771
Beta for v_12 -0.162 0.643 -1.440 -0.158
Beta for v_13 0.483 1.218 -1.967 0.501
Beta for v_14 -0.777 0.374 -1.514 -0.777
Beta for v_15 0.155 1.079 -1.926 0.140
0.975quant mode
Precision for the Gaussian observations 0.541 0.479
Range for s 2538.049 1346.647
Stdev for s 2.557 1.711
Beta for v_1 0.642 0.288
Beta for v_2 1.302 0.440
Beta for v_3 1.636 -0.271
Beta for v_4 1.343 0.805
Beta for v_5 2.243 0.954
Beta for v_6 0.713 0.218
Beta for v_7 2.059 1.637
Beta for v_8 0.024 -0.812
Beta for v_9 1.627 0.711
Beta for v_10 0.726 0.302
Beta for v_11 -0.207 -0.764
Beta for v_12 1.093 -0.141
Beta for v_13 2.827 0.578
Beta for v_14 -0.042 -0.775
Beta for v_15 2.323 0.077
Marginal log-Likelihood: -10874.97
is computed
Posterior summaries for the linear predictor and the fitted values are computed
(Posterior marginals needs also 'control.compute=list(return.marginals.predictor=TRUE)')
Prediction
Finally, we can retrieve the (1) fitted values, (2) estimated intensity process, and (3) estimated preferentiality surface \hat{\gamma}(x).
# Fitted values and prediction
## Response
idx_y <- inla.stack.index(stk_full_pp_y, tag = "pred_y")$data

pred_y_mm <- as.data.frame(cbind(coord_pred, model_7_1$summary.fitted.values[idx_y, "mean"]))
pred_y_ll <- as.data.frame(cbind(coord_pred, model_7_1$summary.fitted.values[idx_y, "0.025quant"]))
pred_y_uu <- as.data.frame(cbind(coord_pred, model_7_1$summary.fitted.values[idx_y, "0.975quant"]))

r_mm <- pred_y_mm$V3 %>% range()
r_ll <- pred_y_ll$V3 %>% range()
r_uu <- pred_y_uu$V3 %>% range()
r <- c(min(r_mm[1], r_ll[1], r_uu[1]), max(r_mm[2], r_ll[2], r_uu[2]))

pp_y_mm <- plot_pred_USA(fitted_values = pred_y_mm, USA = USA, r = r, tt = "Mean")
pp_y_ll <- plot_pred_USA(fitted_values = pred_y_ll, USA = USA, r = r, tt = "2.5th")
pp_y_uu <- plot_pred_USA(fitted_values = pred_y_uu, USA = USA, r = r, tt = "97.5th")

(pp_y_ll + pp_y_mm + pp_y_uu) + plot_layout(guides = "collect") & theme(legend.position = "right")
## Intensity process
idx_pp <- inla.stack.index(stk_full_pp_y, tag = "pred_pp")$data

pred_pp_mm <- as.data.frame(cbind(coord_pred, model_7_1$summary.fitted.values[idx_pp, "mean"]))
pred_pp_ll <- as.data.frame(cbind(coord_pred, model_7_1$summary.fitted.values[idx_pp, "0.025quant"]))
pred_pp_uu <- as.data.frame(cbind(coord_pred, model_7_1$summary.fitted.values[idx_pp, "0.975quant"]))

# Expected number of observations
sum(pred_pp_mm$V3 * (25 ** 2))
[1] 903.119
r_mm <- pred_pp_mm$V3 %>% range()
r_ll <- pred_pp_ll$V3 %>% range()
r_uu <- pred_pp_uu$V3 %>% range()
r <- c(min(r_mm[1], r_ll[1], r_uu[1]), max(r_mm[2], r_ll[2], r_uu[2]))

pp_pp_mm <- plot_pred_USA(fitted_values = pred_pp_mm, USA = USA, r = r, tt = "Mean", should_round = FALSE)
pp_pp_ll <- plot_pred_USA(fitted_values = pred_pp_ll, USA = USA, r = r, tt = "2.5th", should_round = FALSE)
pp_pp_uu <- plot_pred_USA(fitted_values = pred_pp_uu, USA = USA, r = r, tt = "97.5th", should_round = FALSE)

(pp_pp_ll + pp_pp_mm + pp_pp_uu) + plot_layout(guides = "collect") & theme(legend.position = "right")
The estimated preferentiality surface \hat{\gamma}(x) is determined based on the pre-computed basis functions \{\phi_k(x)\}_k and the posterior mean of \{\beta_k\}_k.
n_basis_functions
[1] 15
n_hyperparameter <- nrow(model_7_1$summary.hyperpar)

coeff <- model_7_1$summary.hyperpar[((n_hyperparameter - n_basis_functions + 1):n_hyperparameter), "mean"]
value <- rep(0, n_pts_pred)
for (i in 1:n_basis_functions) {
  bfs_pred_tmp <- bfs_pred[[i]]
  value <- value + (bfs_pred_tmp * coeff[i])
}

pref_mm <- as.data.frame(cbind(coord_pred, value))

plot_pred_USA(fitted_values = pref_mm, USA = USA, r = range(pref_mm$value), tt = "Preferentiality surface", should_round = FALSE)
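Remark: the surface above uses only the posterior means of \{\beta_k\}_k. As a rough sketch of the associated uncertainty (a shortcut that ignores posterior correlations among the \beta_k), one could combine the marginal standard deviations in the same way:
# Approximate sd surface for gamma(x), ignoring correlations among the beta_k
coeff_sd <- model_7_1$summary.hyperpar[((n_hyperparameter - n_basis_functions + 1):n_hyperparameter), "sd"]
value_var <- rep(0, n_pts_pred)
for (i in 1:n_basis_functions) {
  value_var <- value_var + (bfs_pred[[i]] * coeff_sd[i]) ^ 2
}
pref_sd <- as.data.frame(cbind(coord_pred, sqrt(value_var)))
plot_pred_USA(fitted_values = pref_sd, USA = USA, r = range(pref_sd$V3), tt = "Preferentiality surface (approx. sd)", should_round = FALSE)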