Title: | Graceful 'ggplot'-Based Graphics and Other Functions for GAMs Fitted Using 'mgcv' |
---|---|
Description: | Graceful 'ggplot'-based graphics and utility functions for working with generalized additive models (GAMs) fitted using the 'mgcv' package. Provides a reimplementation of the plot() method for GAMs that 'mgcv' provides, as well as 'tidyverse' compatible representations of estimated smooths. |
Authors: | Gavin L. Simpson [aut, cre, cph] , Henrik Singmann [ctb] |
Maintainer: | Gavin L. Simpson <[email protected]> |
License: | MIT + file LICENSE |
Version: | 0.9.2.9011 |
Built: | 2024-11-21 22:21:59 UTC |
Source: | https://github.com/gavinsimpson/gratia |
Add a confidence interval to an existing object
add_confint(object, coverage = 0.95, ...) ## S3 method for class 'smooth_estimates' add_confint(object, coverage = 0.95, ...) ## S3 method for class 'parametric_effects' add_confint(object, coverage = 0.95, ...) ## Default S3 method: add_confint(object, coverage = 0.95, ...)
add_confint(object, coverage = 0.95, ...) ## S3 method for class 'smooth_estimates' add_confint(object, coverage = 0.95, ...) ## S3 method for class 'parametric_effects' add_confint(object, coverage = 0.95, ...) ## Default S3 method: add_confint(object, coverage = 0.95, ...)
object |
a R object. |
coverage |
numeric; the coverage for the interval. Must be in the range
0 < |
... |
arguments passed to other methods. |
Add a constant to estimated values
add_constant(object, constant = NULL, ...) ## S3 method for class 'smooth_estimates' add_constant(object, constant = NULL, ...) ## S3 method for class 'smooth_samples' add_constant(object, constant = NULL, ...) ## S3 method for class 'mgcv_smooth' add_constant(object, constant = NULL, ...) ## S3 method for class 'parametric_effects' add_constant(object, constant = NULL, ...) ## S3 method for class 'tbl_df' add_constant(object, constant = NULL, column = NULL, ...) ## S3 method for class 'evaluated_parametric_term' add_constant(object, constant = NULL, ...)
add_constant(object, constant = NULL, ...) ## S3 method for class 'smooth_estimates' add_constant(object, constant = NULL, ...) ## S3 method for class 'smooth_samples' add_constant(object, constant = NULL, ...) ## S3 method for class 'mgcv_smooth' add_constant(object, constant = NULL, ...) ## S3 method for class 'parametric_effects' add_constant(object, constant = NULL, ...) ## S3 method for class 'tbl_df' add_constant(object, constant = NULL, column = NULL, ...) ## S3 method for class 'evaluated_parametric_term' add_constant(object, constant = NULL, ...)
object |
a object to add a constant to. |
constant |
the constant to add. |
... |
additional arguments passed to methods. |
column |
character; for the |
Returns object
but with the estimate shifted by the addition of
the supplied constant.
Gavin L. Simpson
Add fitted values from a model to a data frame
add_fitted(data, model, value = ".value", ...)
add_fitted(data, model, value = ".value", ...)
data |
a data frame containing values for the variables used to fit the
model. Passed to |
model |
a fitted model for which a |
value |
character; the name of the variable in which model predictions will be stored. |
... |
additional arguments passed to methods. |
A data frame (tibble) formed from data
and fitted values from
model
.
Adds draws from the posterior distribution of model
to the data object
using one of fitted_samples()
, predicted_samples()
, or
posterior_samples()
.
add_fitted_samples(object, model, n = 1, seed = NULL, ...) add_predicted_samples(object, model, n = 1, seed = NULL, ...) add_posterior_samples(object, model, n = 1, seed = NULL, ...) add_smooth_samples(object, model, n = 1, seed = NULL, select = NULL, ...)
add_fitted_samples(object, model, n = 1, seed = NULL, ...) add_predicted_samples(object, model, n = 1, seed = NULL, ...) add_posterior_samples(object, model, n = 1, seed = NULL, ...) add_smooth_samples(object, model, n = 1, seed = NULL, select = NULL, ...)
object |
a data frame or tibble to which the posterior draws will be added. |
model |
a fitted GAM (or GAM-like) object for which a posterior draw method exists. |
n |
integer; the number of posterior draws to add. |
seed |
numeric; a value to seed the random number generator. |
... |
arguments are passed to the posterior draw function, currently one
of |
select |
character; select which smooth's posterior to draw from. The
default, |
load_mgcv() df <- data_sim("eg1", n = 400, seed = 42) m <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = df, method = "REML") # add fitted samples (posterior draws of the expected value of the response) # note that there are 800 rows in the output: 400 data by `n = 2` samples. df |> add_fitted_samples(m, n = 2, seed = 84) # add posterior draws from smooth s(x2) df |> add_smooth_samples(m, n = 2, seed = 2, select= "s(x2)")
load_mgcv() df <- data_sim("eg1", n = 400, seed = 42) m <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = df, method = "REML") # add fitted samples (posterior draws of the expected value of the response) # note that there are 800 rows in the output: 400 data by `n = 2` samples. df |> add_fitted_samples(m, n = 2, seed = 84) # add posterior draws from smooth s(x2) df |> add_smooth_samples(m, n = 2, seed = 2, select= "s(x2)")
Add fitted values from a GAM to a data frame
## S3 method for class 'gam' add_fitted(data, model, value = ".fitted", type = "response", ...)
## S3 method for class 'gam' add_fitted(data, model, value = ".fitted", type = "response", ...)
data |
a data frame containing values for the variables used to fit the
model. Passed to |
model |
a fitted model for which a |
value |
character; the name of the variable in which model predictions will be stored. |
type |
character; the type of predictions to return. See
|
... |
additional arguments passed to |
A data frame (tibble) formed from data
and predictions from
model
.
load_mgcv() df <- data_sim("eg1", seed = 1) df <- df[, c("y", "x0", "x1", "x2", "x3")] m <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = df, method = "REML") # add fitted values to our data add_fitted(df, m) # with type = "terms" or "iterms" add_fitted(df, m, type = "terms")
load_mgcv() df <- data_sim("eg1", seed = 1) df <- df[, c("y", "x0", "x1", "x2", "x3")] m <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = df, method = "REML") # add fitted values to our data add_fitted(df, m) # with type = "terms" or "iterms" add_fitted(df, m, type = "terms")
Add partial residuals
add_partial_residuals(data, model, ...) ## S3 method for class 'gam' add_partial_residuals(data, model, select = NULL, partial_match = FALSE, ...)
add_partial_residuals(data, model, ...) ## S3 method for class 'gam' add_partial_residuals(data, model, select = NULL, partial_match = FALSE, ...)
data |
a data frame containing values for the variables used to fit the
model. Passed to |
model |
a fitted model for which a |
... |
arguments passed to other methods. |
select |
character, logical, or numeric; which smooths to plot. If
|
partial_match |
logical; should smooths be selected by partial matches
with |
load_mgcv() df <- data_sim("eg1", seed = 1) df <- df[, c("y", "x0", "x1", "x2", "x3")] m <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = df, method = "REML") ## add partial residuals add_partial_residuals(df, m) ## add partial residuals for selected smooths add_partial_residuals(df, m, select = "s(x0)")
load_mgcv() df <- data_sim("eg1", seed = 1) df <- df[, c("y", "x0", "x1", "x2", "x3")] m <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = df, method = "REML") ## add partial residuals add_partial_residuals(df, m) ## add partial residuals for selected smooths add_partial_residuals(df, m, select = "s(x0)")
Add residuals from a model to a data frame
add_residuals(data, model, value = ".residual", ...)
add_residuals(data, model, value = ".residual", ...)
data |
a data frame containing values for the variables used to fit the
model. Passed to |
model |
a fitted model for which a |
value |
character; the name of the variable in which model residuals will be stored. |
... |
additional arguments passed to methods. |
A data frame (tibble) formed from data
and residuals from model
.
Add residuals from a GAM to a data frame
## S3 method for class 'gam' add_residuals(data, model, value = ".residual", type = "deviance", ...)
## S3 method for class 'gam' add_residuals(data, model, value = ".residual", type = "deviance", ...)
data |
a data frame containing values for the variables used to fit the
model. Passed to |
model |
a fitted model for which a |
value |
character; the name of the variable in which model predictions will be stored. |
type |
character; the type of residuals to return. See
|
... |
additional arguments passed to |
A data frame (tibble) formed from data
and residuals from model
.
load_mgcv() df <- data_sim("eg1", seed = 1) df <- df[, c("y", "x0", "x1", "x2", "x3")] m <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = df, method = "REML") ## add_residuals(df, m)
load_mgcv() df <- data_sim("eg1", seed = 1) df <- df[, c("y", "x0", "x1", "x2", "x3")] m <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = df, method = "REML") ## add_residuals(df, m)
Add indicators of significant change after SiZeR
add_sizer(object, type = c("change", "sizer"), ...) ## S3 method for class 'derivatives' add_sizer(object, type = c("change", "sizer"), ...) ## S3 method for class 'smooth_estimates' add_sizer(object, type = c("change", "sizer"), derivatives = NULL, ...)
add_sizer(object, type = c("change", "sizer"), ...) ## S3 method for class 'derivatives' add_sizer(object, type = c("change", "sizer"), ...) ## S3 method for class 'smooth_estimates' add_sizer(object, type = c("change", "sizer"), derivatives = NULL, ...)
object |
an R object. Currently supported methods are for classes
|
type |
character; |
... |
arguments passed to other methods |
derivatives |
an object of class |
load_mgcv() df <- data_sim("eg1", n = 400, dist = "normal", scale = 2, seed = 42) m <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = df, method = "REML") ## first derivatives of all smooths using central finite differences d <- derivatives(m, type = "central") |> add_sizer() # default adds a .change column names(d)
load_mgcv() df <- data_sim("eg1", n = 400, dist = "normal", scale = 2, seed = 42) m <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = df, method = "REML") ## first derivatives of all smooths using central finite differences d <- derivatives(m, type = "central") |> add_sizer() # default adds a .change column names(d)
Model diagnostic plots
appraise(model, ...) ## S3 method for class 'gam' appraise( model, method = c("uniform", "simulate", "normal", "direct"), use_worm = FALSE, n_uniform = 10, n_simulate = 50, seed = NULL, type = c("deviance", "pearson", "response"), n_bins = c("sturges", "scott", "fd"), ncol = NULL, nrow = NULL, guides = "keep", level = 0.9, ci_col = "black", ci_alpha = 0.2, point_col = "black", point_alpha = 1, line_col = "red", ... ) ## S3 method for class 'lm' appraise(model, ...)
appraise(model, ...) ## S3 method for class 'gam' appraise( model, method = c("uniform", "simulate", "normal", "direct"), use_worm = FALSE, n_uniform = 10, n_simulate = 50, seed = NULL, type = c("deviance", "pearson", "response"), n_bins = c("sturges", "scott", "fd"), ncol = NULL, nrow = NULL, guides = "keep", level = 0.9, ci_col = "black", ci_alpha = 0.2, point_col = "black", point_alpha = 1, line_col = "red", ... ) ## S3 method for class 'lm' appraise(model, ...)
model |
a fitted model. Currently models inheriting from class |
... |
arguments passed to |
method |
character; method used to generate theoretical quantiles.
The default is Note that |
use_worm |
logical; should a worm plot be drawn in place of the QQ plot? |
n_uniform |
numeric; number of times to randomize uniform quantiles
in the direct computation method ( |
n_simulate |
numeric; number of data sets to simulate from the estimated
model when using the simulation method ( |
seed |
numeric; the random number seed to use for |
type |
character; type of residuals to use. Only |
n_bins |
character or numeric; either the number of bins or a string indicating how to calculate the number of bins. |
ncol , nrow
|
numeric; the numbers of rows and columns over which to spread the plots. |
guides |
character; one of |
level |
numeric; the coverage level for QQ plot reference intervals.
Must be strictly |
ci_alpha , ci_col
|
colour and transparency used to draw the QQ plot
reference interval when |
point_col , point_alpha
|
colour and transparency used to draw points in
the plots. See |
line_col |
colour specification for the 1:1 line in the QQ plot and the reference line in the residuals vs linear predictor plot. |
The wording used in mgcv::qq.gam()
uses direct in reference to the
simulated residuals method (method = "simulated"
). To avoid confusion,
method = "direct"
is deprecated in favour of method = "uniform"
.
The plots are produced by functions qq_plot()
,
residuals_linpred_plot()
, residuals_hist_plot()
,
and observed_fitted_plot()
.
load_mgcv() ## simulate some data... dat <- data_sim("eg1", n = 400, dist = "normal", scale = 2, seed = 2) mod <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = dat) ## run some basic model checks appraise(mod, point_col = "steelblue", point_alpha = 0.4) ## To change the theme for all panels use the & operator, for example to ## change the ggplot theme for all panels library("ggplot2") appraise(mod, seed = 42, point_col = "steelblue", point_alpha = 0.4, line_col = "black" ) & theme_minimal()
load_mgcv() ## simulate some data... dat <- data_sim("eg1", n = 400, dist = "normal", scale = 2, seed = 2) mod <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = dat) ## run some basic model checks appraise(mod, point_col = "steelblue", point_alpha = 0.4) ## To change the theme for all panels use the & operator, for example to ## change the ggplot theme for all panels library("ggplot2") appraise(mod, seed = 42, point_col = "steelblue", point_alpha = 0.4, line_col = "black" ) & theme_minimal()
Creates a basis expansion from a definition of a smoother using the syntax
of mgcv's smooths via mgcv::s()
., mgcv::te()
, mgcv::ti()
, and
mgcv::t2()
, or from a fitted GAM(M).
basis(object, ...) ## S3 method for class 'gam' basis( object, select = NULL, term = deprecated(), data = NULL, n = 100, n_2d = 50, n_3d = 16, n_4d = 4, partial_match = FALSE, ... ) ## S3 method for class 'scam' basis( object, select = NULL, term = deprecated(), data = NULL, n = 100, n_2d = 50, n_3d = 16, n_4d = 4, partial_match = FALSE, ... ) ## S3 method for class 'gamm' basis( object, select = NULL, term = deprecated(), data = NULL, n = 100, n_2d = 50, n_3d = 16, n_4d = 4, partial_match = FALSE, ... ) ## S3 method for class 'list' basis( object, select = NULL, term = deprecated(), data = NULL, n = 100, n_2d = 50, n_3d = 16, n_4d = 4, partial_match = FALSE, ... ) ## Default S3 method: basis( object, data, knots = NULL, constraints = FALSE, at = NULL, diagonalize = FALSE, ... )
basis(object, ...) ## S3 method for class 'gam' basis( object, select = NULL, term = deprecated(), data = NULL, n = 100, n_2d = 50, n_3d = 16, n_4d = 4, partial_match = FALSE, ... ) ## S3 method for class 'scam' basis( object, select = NULL, term = deprecated(), data = NULL, n = 100, n_2d = 50, n_3d = 16, n_4d = 4, partial_match = FALSE, ... ) ## S3 method for class 'gamm' basis( object, select = NULL, term = deprecated(), data = NULL, n = 100, n_2d = 50, n_3d = 16, n_4d = 4, partial_match = FALSE, ... ) ## S3 method for class 'list' basis( object, select = NULL, term = deprecated(), data = NULL, n = 100, n_2d = 50, n_3d = 16, n_4d = 4, partial_match = FALSE, ... ) ## Default S3 method: basis( object, data, knots = NULL, constraints = FALSE, at = NULL, diagonalize = FALSE, ... )
object |
a smooth specification, the result of a call to one of
|
... |
other arguments passed to |
select |
character; select smooths in a fitted model |
term |
|
data |
a data frame containing the variables used in |
n |
numeric; the number of points over the range of the covariate at which to evaluate the smooth. |
n_2d |
numeric; the number of new observations for each dimension of a
bivariate smooth. Not currently used; |
n_3d |
numeric; the number of new observations to generate for the third dimension of a 3D smooth. |
n_4d |
numeric; the number of new observations to generate for the
dimensions higher than 2 (!) of a kD smooth (k >= 4). For example, if
the smooth is a 4D smooth, each of dimensions 3 and 4 will get |
partial_match |
logical; in the case of character |
knots |
a list or data frame with named components containing
knots locations. Names must match the covariates for which the basis
is required. See |
constraints |
logical; should identifiability constraints be applied to
the smooth basis. See argument |
at |
a data frame containing values of the smooth covariate(s) at which the basis should be evaluated. |
diagonalize |
logical; if |
A tibble.
Gavin L. Simpson
load_mgcv() df <- data_sim("eg4", n = 400, seed = 42) bf <- basis(s(x0), data = df) bf <- basis(s(x2, by = fac, bs = "bs"), data = df, constraints = TRUE)
load_mgcv() df <- data_sim("eg4", n = 400, seed = 42) bf <- basis(s(x0), data = df) bf <- basis(s(x2, by = fac, bs = "bs"), data = df, constraints = TRUE)
Extract basis dimension of a smooth
basis_size(object, ...) ## S3 method for class 'mgcv.smooth' basis_size(object, ...) ## S3 method for class 'gam' basis_size(object, ...) ## S3 method for class 'gamm' basis_size(object, ...)
basis_size(object, ...) ## S3 method for class 'mgcv.smooth' basis_size(object, ...) ## S3 method for class 'gam' basis_size(object, ...) ## S3 method for class 'gamm' basis_size(object, ...)
object |
A fitted GAM(M). Currently |
... |
Arguments passed to other methods. |
load_mgcv() df <- data_sim("eg1", n = 200, seed = 1) m <- bam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = df) basis_size(m)
load_mgcv() df <- data_sim("eg1", n = 200, seed = 1) m <- bam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = df) basis_size(m)
Data generated from a hypothetical study of bird movement along a migration
corridor, sampled throughout the year. This dataset consists of simulated
sample records of numbers of observed locations of 100 tagged individuals
each from six species of bird, at ten locations along a latitudinal gradient,
with one observation taken every four weeks. Counts were simulated randomly
for each species in each location and week by creating a species-specific
migration curve that gave the probability of finding an individual of a
given species in a given location, then simulated the distribution of
individuals across sites using a multinomial distribution, and subsampling
that using a binomial distribution to simulation observation error (i.e.
not every bird present at a location would be detected). The data set
(bird_move
) consists of the variables count
, latitude
, week
and
species
.
A data frame
Pedersen EJ, Miller DL, Simpson GL, Ross N. 2018. Hierarchical generalized additive models: an introduction with mgcv. PeerJ Preprints 6:e27320v1 doi:10.7287/peerj.preprints.27320v1.
boundary(x, ...) ## S3 method for class 'soap.film' boundary(x, ...) ## S3 method for class 'gam' boundary(x, select, ...)
boundary(x, ...) ## S3 method for class 'soap.film' boundary(x, ...) ## S3 method for class 'gam' boundary(x, select, ...)
x |
an R object. Currently only objects that inherit from classes
|
... |
arguments passed to other methods. |
select |
character; the label of the soap film smooth from which to extract the boundary. |
A list of lists or data frames specifying the loops that define the boundary of the soap film smooth.
Given a vector indexing the smooths of a GAM, returns a logical vector selecting the requested smooths.
check_user_select_smooths( smooths, select = NULL, partial_match = FALSE, model_name = NULL )
check_user_select_smooths( smooths, select = NULL, partial_match = FALSE, model_name = NULL )
smooths |
character; a vector of smooth labels. |
select |
numeric, logical, or character vector of selected smooths. |
partial_match |
logical; in the case of character |
model_name |
character; a model name that will be used in error messages. |
A logical vector the same length as length(smooths)
indicating
which smooths have been selected.
Gavin L. Simpson
scam
model.Extract coefficients from a fitted scam
model.
## S3 method for class 'scam' coef(object, parametrized = TRUE, ...)
## S3 method for class 'scam' coef(object, parametrized = TRUE, ...)
object |
a model object fitted by |
parametrized |
logical; extract parametrized coefficients, which respect the linear inequality constraints of the model. |
... |
other arguments. |
Compare smooths across models
compare_smooths( model, ..., select = NULL, smooths = deprecated(), n = 100, data = NULL, unconditional = FALSE, overall_uncertainty = TRUE, partial_match = FALSE )
compare_smooths( model, ..., select = NULL, smooths = deprecated(), n = 100, data = NULL, unconditional = FALSE, overall_uncertainty = TRUE, partial_match = FALSE )
model |
Primary model for comparison. |
... |
Additional models to compare smooths against those of |
select |
character; select which smooths to compare. The default
( |
smooths |
|
n |
numeric; the number of points over the range of the covariate at which to evaluate the smooth. |
data |
a data frame of covariate values at which to evaluate the smooth. |
unconditional |
logical; should confidence intervals include the
uncertainty due to smoothness selection? If |
overall_uncertainty |
logical; should the uncertainty in the model constant term be included in the standard error of the evaluate values of the smooth? |
partial_match |
logical; should smooths be selected by partial matches
with |
load_mgcv() dat <- data_sim("eg1", seed = 2) ## models to compare smooths across - artificially create differences m1 <- gam(y ~ s(x0, k = 5) + s(x1, k = 5) + s(x2, k = 5) + s(x3, k = 5), data = dat, method = "REML" ) m2 <- gam(y ~ s(x0, bs = "ts") + s(x1, bs = "ts") + s(x2, bs = "ts") + s(x3, bs = "ts"), data = dat, method = "REML") ## build comparisons comp <- compare_smooths(m1, m2) comp ## notice that the result is a nested tibble draw(comp)
load_mgcv() dat <- data_sim("eg1", seed = 2) ## models to compare smooths across - artificially create differences m1 <- gam(y ~ s(x0, k = 5) + s(x1, k = 5) + s(x2, k = 5) + s(x3, k = 5), data = dat, method = "REML" ) m2 <- gam(y ~ s(x0, bs = "ts") + s(x1, bs = "ts") + s(x2, bs = "ts") + s(x3, bs = "ts"), data = dat, method = "REML") ## build comparisons comp <- compare_smooths(m1, m2) comp ## notice that the result is a nested tibble draw(comp)
Calculates point-wise confidence or simultaneous intervals for the first derivatives of smooth terms in a fitted GAM.
## S3 method for class 'fderiv' confint( object, parm, level = 0.95, type = c("confidence", "simultaneous"), nsim = 10000, ncores = 1L, ... )
## S3 method for class 'fderiv' confint( object, parm, level = 0.95, type = c("confidence", "simultaneous"), nsim = 10000, ncores = 1L, ... )
object |
an object of class |
parm |
which parameters (smooth terms) are to be given intervals as a vector of terms. If missing, all parameters are considered. |
level |
numeric, |
type |
character; the type of interval to compute. One of |
nsim |
integer; the number of simulations used in computing the simultaneous intervals. |
ncores |
number of cores for generating random variables from a
multivariate normal distribution. Passed to |
... |
additional arguments for methods |
a data frame with components:
term
; factor indicating to which term each row relates,
lower
; lower limit of the confidence or simultaneous interval,
est
; estimated derivative
upper
; upper limit of the confidence or simultaneous interval.
Gavin L. Simpson
load_mgcv() dat <- data_sim("eg1", n = 1000, dist = "normal", scale = 2, seed = 2) mod <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = dat, method = "REML") # new data to evaluate the derivatives at, say over the middle 50% of range # of each covariate middle <- function(x, n = 25, coverage = 0.5) { v <- (1 - coverage) / 2 q <- quantile(x, prob = c(0 + v, 1 - v), type = 8) seq(q[1], q[2], length = n) } new_data <- sapply(dat[c("x0", "x1", "x2", "x3")], middle) new_data <- data.frame(new_data) ## first derivatives of all smooths... fd <- fderiv(mod, newdata = new_data) ## point-wise interval ci <- confint(fd, type = "confidence") ci ## simultaneous interval for smooth term of x2 x2_sint <- confint(fd, parm = "x2", type = "simultaneous", nsim = 10000, ncores = 2 ) x2_sint
load_mgcv() dat <- data_sim("eg1", n = 1000, dist = "normal", scale = 2, seed = 2) mod <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = dat, method = "REML") # new data to evaluate the derivatives at, say over the middle 50% of range # of each covariate middle <- function(x, n = 25, coverage = 0.5) { v <- (1 - coverage) / 2 q <- quantile(x, prob = c(0 + v, 1 - v), type = 8) seq(q[1], q[2], length = n) } new_data <- sapply(dat[c("x0", "x1", "x2", "x3")], middle) new_data <- data.frame(new_data) ## first derivatives of all smooths... fd <- fderiv(mod, newdata = new_data) ## point-wise interval ci <- confint(fd, type = "confidence") ci ## simultaneous interval for smooth term of x2 x2_sint <- confint(fd, parm = "x2", type = "simultaneous", nsim = 10000, ncores = 2 ) x2_sint
Calculates point-wise confidence or simultaneous intervals for the smooth terms of a fitted GAM.
## S3 method for class 'gam' confint( object, parm, level = 0.95, data = newdata, n = 100, type = c("confidence", "simultaneous"), nsim = 10000, shift = FALSE, transform = FALSE, unconditional = FALSE, ncores = 1, partial_match = FALSE, ..., newdata = NULL ) ## S3 method for class 'gamm' confint(object, ...) ## S3 method for class 'list' confint(object, ...)
## S3 method for class 'gam' confint( object, parm, level = 0.95, data = newdata, n = 100, type = c("confidence", "simultaneous"), nsim = 10000, shift = FALSE, transform = FALSE, unconditional = FALSE, ncores = 1, partial_match = FALSE, ..., newdata = NULL ) ## S3 method for class 'gamm' confint(object, ...) ## S3 method for class 'list' confint(object, ...)
object |
an object of class |
parm |
which parameters (smooth terms) are to be given intervals as a vector of terms. If missing, all parameters are considered, although this is not currently implemented. |
level |
numeric, |
data |
data frame; new values of the covariates used in the model fit. The selected smooth(s) wil be evaluated at the supplied values. |
n |
numeric; the number of points to evaluate smooths at. |
type |
character; the type of interval to compute. One of |
nsim |
integer; the number of simulations used in computing the simultaneous intervals. |
shift |
logical; should the constant term be add to the smooth? |
transform |
logical; should the smooth be evaluated on a transformed scale? For generalised models, this involves applying the inverse of the link function used to fit the model. Alternatively, the name of, or an actual, function can be supplied to transform the smooth and it's confidence interval. |
unconditional |
logical; if |
ncores |
number of cores for generating random variables from a
multivariate normal distribution. Passed to |
partial_match |
logical; should matching |
... |
additional arguments for methods |
newdata |
DEPRECATED! data frame; containing new values of the covariates used in the model fit. The selected smooth(s) wil be evaluated at the supplied values. |
a tibble with components:
.smooth
; character indicating to which term each row relates,
.type
; the type of smooth,
.by
the name of the by variable if a by smooth, NA
otherwise,
one or more vectors of values at which the smooth was evaluated, named as per the variables in the smooth,
zero or more variables containing values of the by variable,
.estimate
; estimated value of the smooth,
.se
; standard error of the estimated value of the smooth,
.crit
; critical value for the 100 * level
% confidence interval.
.lower_ci
; lower limit of the confidence or simultaneous interval,
.upper_ci
; upper limit of the confidence or simultaneous interval,
Gavin L. Simpson
load_mgcv() dat <- data_sim("eg1", n = 1000, dist = "normal", scale = 2, seed = 2) mod <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = dat, method = "REML") # new data to evaluate the smooths at, say over the middle 50% of range # of each covariate middle <- function(x, n = 50, coverage = 0.5) { v <- (1 - coverage) / 2 q <- quantile(x, prob = c(0 + v, 1 - v), type = 8) seq(q[1], q[2], length = n) } new_data <- sapply(dat[c("x0", "x1", "x2", "x3")], middle) new_data <- data.frame(new_data) ## point-wise interval for smooth of x2 ci <- confint(mod, parm = "s(x2)", type = "confidence", data = new_data) ci
load_mgcv() dat <- data_sim("eg1", n = 1000, dist = "normal", scale = 2, seed = 2) mod <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = dat, method = "REML") # new data to evaluate the smooths at, say over the middle 50% of range # of each covariate middle <- function(x, n = 50, coverage = 0.5) { v <- (1 - coverage) / 2 q <- quantile(x, prob = c(0 + v, 1 - v), type = 8) seq(q[1], q[2], length = n) } new_data <- sapply(dat[c("x0", "x1", "x2", "x3")], middle) new_data <- data.frame(new_data) ## point-wise interval for smooth of x2 ci <- confint(mod, parm = "s(x2)", type = "confidence", data = new_data) ci
All combinations of factor levels plus typical values of continuous variables
data_combos(object, ...) ## S3 method for class 'gam' data_combos( object, vars = everything(), complete = TRUE, envir = environment(formula(object)), data = NULL, ... )
data_combos(object, ...) ## S3 method for class 'gam' data_combos( object, vars = everything(), complete = TRUE, envir = environment(formula(object)), data = NULL, ... )
object |
a fitted model object. |
... |
arguments passed to methods. |
vars |
terms to include or exclude from the returned object. Uses tidyselect principles. |
complete |
logical; should all combinations of factor levels be
returned? If |
envir |
the environment within which to recreate the data used to fit
|
data |
an optional data frame of data used to fit the mdoel if reconstruction of the data from the model doesn't work. |
A tidy reimplementation of the functions implemented in mgcv::gamSim()
that can be used to fit GAMs. An new feature is that the sampling
distribution can be applied to all the example types.
data_sim( model = "eg1", n = 400, scale = NULL, theta = 3, power = 1.5, dist = c("normal", "poisson", "binary", "negbin", "tweedie", "gamma", "ocat", "ordered categorical"), n_cat = 4, cuts = c(-1, 0, 5), seed = NULL, gfam_families = c("binary", "tweedie", "normal") )
data_sim( model = "eg1", n = 400, scale = NULL, theta = 3, power = 1.5, dist = c("normal", "poisson", "binary", "negbin", "tweedie", "gamma", "ocat", "ordered categorical"), n_cat = 4, cuts = c(-1, 0, 5), seed = NULL, gfam_families = c("binary", "tweedie", "normal") )
model |
character; either |
n |
numeric; the number of observations to simulate. |
scale |
numeric; the level of noise to use. |
theta |
numeric; the dispersion parameter |
power |
numeric; the Tweedie power parameter. |
dist |
character; a sampling distribution for the response
variable. |
n_cat |
integer; the number of categories for categorical response.
Currently only used for |
cuts |
numeric; vector of cut points on the latent variable, excluding
the end points |
seed |
numeric; the seed for the random number generator. Passed to
|
gfam_families |
character; a vector of distributions to use in
generating data with grouped families for use with |
data_sim()
can simulate data from several underlying models of
known true functions. The available options currently are:
"eg1"
: a four term additive true model. This is the classic Gu & Wahba
four univariate term test model. See gw_functions
for more details of
the underlying four functions.
"eg2"
: a bivariate smooth true model.
"eg3"
: an example containing a continuous by smooth (varying
coefficient) true model. The model is where the function
is
.
"eg4"
: a factor by smooth true model. The true model contains a factor
with 3 levels, where the response for the nth level follows the nth
Gu & Wabha function (for ).
"eg5"
: an additive plus factor true model. The response is a linear
combination of the Gu & Wabha functions 2, 3, 4 (the latter is a null
function) plus a factor term with four levels.
"eg6"
: an additive plus random effect term true model.
´"eg7": a version of the model in
"eg1"', but where the covariates are
correlated.
"gwf2"
: a model where the response is Gu & Wabha's
plus noise.
"lwf6"
: a model where the response is Luo & Wabha's "example 6"
function plus noise.
"gfam"
: simulates data for use with GAMs with
family = gfam(families)
. See example in mgcv::gfam()
. If this model
is specified then dist
is ignored and gfam_families
is used to
specify which distributions are included in the simulated data. Can be a
vector of any of the families allowed by dist
. For
"ocat" %in% gfam_families
(or "ordered categorical"
), 4 classes are
assumed, which can't be changed. Link functions used are "identity"
for "normal"
, "logit"
for "binary"
, "ocat"
, and
"ordered categorical"
, and "exp"
elsewhere.
The random component providing noise or sampling variation can follow one
of the distributions, specified via argument dist
"normal"
: Gaussian,
"poisson"
: Poisson,
"binary"
: Bernoulli,
"negbin"
: Negative binomial,
"tweedie"
: Tweedie,
"gamma"
: gamma , and
"ordered categorical"
: ordered categorical
Other arguments provide the parameters for the distribution.
Gu, C., Wahba, G., (1993). Smoothing Spline ANOVA with Component-Wise Bayesian "Confidence Intervals." J. Comput. Graph. Stat. 2, 97–117.
Luo, Z., Wahba, G., (1997). Hybrid adaptive splines. J. Am. Stat. Assoc. 92, 107–116.
data_sim("eg1", n = 100, seed = 1) # an ordered categorical response data_sim("eg1", n = 100, dist = "ocat", n_cat = 4, cuts = c(-1, 0, 5))
data_sim("eg1", n = 100, seed = 1) # an ordered categorical response data_sim("eg1", n = 100, dist = "ocat", n_cat = 4, cuts = c(-1, 0, 5))
Prepare a data slice through model covariates
data_slice(object, ...) ## Default S3 method: data_slice(object, ...) ## S3 method for class 'data.frame' data_slice(object, ...) ## S3 method for class 'gam' data_slice(object, ..., data = NULL, envir = NULL) ## S3 method for class 'gamm' data_slice(object, ...) ## S3 method for class 'list' data_slice(object, ...) ## S3 method for class 'scam' data_slice(object, ...)
data_slice(object, ...) ## Default S3 method: data_slice(object, ...) ## S3 method for class 'data.frame' data_slice(object, ...) ## S3 method for class 'gam' data_slice(object, ..., data = NULL, envir = NULL) ## S3 method for class 'gamm' data_slice(object, ...) ## S3 method for class 'list' data_slice(object, ...) ## S3 method for class 'scam' data_slice(object, ...)
object |
an R model object. |
... |
< |
data |
an alternative data frame of values containing all the variables
needed to fit the model. If |
envir |
the environment within which to recreate the data used to fit
|
A data slice is the data set that results where one (or more covariates) is varied systematically over some or all of its (their) range or at a specified subset of values of interest, while any remaining covariates in the model are held at fixed, representative values. This is known as a reference grid in package emmeans and a data grid in the marginaleffects package.
For GAMs, any covariates not specified via ...
will take representative
values determined from the data used to fit the model as follows:
for numeric covariates, the value in the fitting data that is closest to the median value is used,
for factor covariates, the modal (most frequently observed) level is used,
or the first level (sorted as per the vector returned by base::levels()
if several levels are observed the same number of times.
These values are already computed when calling gam()
or bam()
for example
and can be found in the var.summary
component of the fitted model. Function
typical_values()
will extract these values for you if you are interested.
Convenience functions evenly()
, ref_level()
, and level()
are provided
to help users specify data slices. ref_level()
, and level()
also ensure
that factor covariates have the correct levels, as needed by
mgcv::predict.gam()
for example.
For an extended discussion of data_slice()
and further examples, see
vignette("data-slices", package = "gratia")
.
The convenience functions evenly()
, ref_level()
, and level()
.
typical_values()
for extracting the representative values used for
covariates in the model but not named in the slice.
load_mgcv() # simulate some Gaussian data df <- data_sim("eg1", n = 50, seed = 2) # fit a GAM with 1 smooth and 1 linear term m <- gam(y ~ s(x2, k = 7) + x1, data = df, method = "REML") # Want to predict over f(x2) while holding `x1` at some value. # Default will use the observation closest to the median for unspecified # variables. ds <- data_slice(m, x2 = evenly(x2, n = 50)) ds # for full control, specify the values you want ds <- data_slice(m, x2 = evenly(x2, n = 50), x1 = 0.3) # or provide an expression (function call) which will be evaluated in the # data frame passed to `data` or `model.frame(object)` ds <- data_slice(m, x2 = evenly(x2, n = 50), x1 = mean(x1))
load_mgcv() # simulate some Gaussian data df <- data_sim("eg1", n = 50, seed = 2) # fit a GAM with 1 smooth and 1 linear term m <- gam(y ~ s(x2, k = 7) + x1, data = df, method = "REML") # Want to predict over f(x2) while holding `x1` at some value. # Default will use the observation closest to the median for unspecified # variables. ds <- data_slice(m, x2 = evenly(x2, n = 50)) ds # for full control, specify the values you want ds <- data_slice(m, x2 = evenly(x2, n = 50), x1 = 0.3) # or provide an expression (function call) which will be evaluated in the # data frame passed to `data` or `model.frame(object)` ds <- data_slice(m, x2 = evenly(x2, n = 50), x1 = mean(x1))
Posterior expectations of derivatives from an estimated model
derivative_samples(object, ...) ## Default S3 method: derivative_samples(object, ...) ## S3 method for class 'gamm' derivative_samples(object, ...) ## S3 method for class 'gam' derivative_samples( object, focal = NULL, data = NULL, order = 1L, type = c("forward", "backward", "central"), scale = c("response", "linear_predictor"), method = c("gaussian", "mh", "inla", "user"), n = 100, eps = 1e-07, n_sim = 10000, level = lifecycle::deprecated(), seed = NULL, envir = environment(formula(object)), draws = NULL, mvn_method = c("mvnfast", "mgcv"), ... )
derivative_samples(object, ...) ## Default S3 method: derivative_samples(object, ...) ## S3 method for class 'gamm' derivative_samples(object, ...) ## S3 method for class 'gam' derivative_samples( object, focal = NULL, data = NULL, order = 1L, type = c("forward", "backward", "central"), scale = c("response", "linear_predictor"), method = c("gaussian", "mh", "inla", "user"), n = 100, eps = 1e-07, n_sim = 10000, level = lifecycle::deprecated(), seed = NULL, envir = environment(formula(object)), draws = NULL, mvn_method = c("mvnfast", "mgcv"), ... )
object |
an R object to compute derivatives for |
... |
arguments passed to other methods and on to |
focal |
character; name of the focal variable. The response derivative
of the response with respect to this variable will be returned.
All other variables involved in the model will be held at constant values.
This can be missing if supplying |
data |
a data frame containing the values of the model covariates at which to evaluate the first derivatives of the smooths. If supplied, all but one variable must be held at a constant value. |
order |
numeric; the order of derivative. |
type |
character; the type of finite difference used. One of
|
scale |
character; should the derivative be estimated on the response
or the linear predictor (link) scale? One of |
method |
character; which method should be used to draw samples from
the posterior distribution. |
n |
numeric; the number of points to evaluate the derivative at (if
|
eps |
numeric; the finite difference. |
n_sim |
integer; the number of simulations used in computing the simultaneous intervals. |
level |
|
seed |
numeric; a random seed for the simulations. |
envir |
the environment within which to recreate the data used to fit
|
draws |
matrix; user supplied posterior draws to be used when
|
mvn_method |
character; one of |
A tibble, currently with the following variables:
.derivative
: the estimated partial derivative,
additional columns containing the covariate values at which the derivative was evaluated.
Gavin L. Simpson
load_mgcv() df <- data_sim("eg1", dist = "negbin", scale = 0.25, seed = 42) # fit the GAM (note: for execution time reasons using bam()) m <- bam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = df, family = nb(), method = "fREML") # data slice through data along x2 - all other covariates will be set to # typical values (value closest to median) ds <- data_slice(m, x2 = evenly(x2, n = 200)) # samples from posterior of derivatives fd_samp <- derivative_samples(m, data = ds, type = "central", focal = "x2", eps = 0.01, seed = 21, n_sim = 100 ) # plot the first 20 posterior draws if (requireNamespace("ggplot2") && requireNamespace("dplyr")) { library("ggplot2") fd_samp |> dplyr::filter(.draw <= 20) |> ggplot(aes(x = x2, y = .derivative, group = .draw)) + geom_line(alpha = 0.5) }
load_mgcv() df <- data_sim("eg1", dist = "negbin", scale = 0.25, seed = 42) # fit the GAM (note: for execution time reasons using bam()) m <- bam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = df, family = nb(), method = "fREML") # data slice through data along x2 - all other covariates will be set to # typical values (value closest to median) ds <- data_slice(m, x2 = evenly(x2, n = 200)) # samples from posterior of derivatives fd_samp <- derivative_samples(m, data = ds, type = "central", focal = "x2", eps = 0.01, seed = 21, n_sim = 100 ) # plot the first 20 posterior draws if (requireNamespace("ggplot2") && requireNamespace("dplyr")) { library("ggplot2") fd_samp |> dplyr::filter(.draw <= 20) |> ggplot(aes(x = x2, y = .derivative, group = .draw)) + geom_line(alpha = 0.5) }
Derivatives of estimated smooths via finite differences
derivatives(object, ...) ## Default S3 method: derivatives(object, ...) ## S3 method for class 'gamm' derivatives(object, ...) ## S3 method for class 'gam' derivatives( object, select = NULL, term = deprecated(), data = newdata, order = 1L, type = c("forward", "backward", "central"), n = 100, eps = 1e-07, interval = c("confidence", "simultaneous"), n_sim = 10000, level = 0.95, unconditional = FALSE, frequentist = FALSE, offset = NULL, ncores = 1, partial_match = FALSE, ..., newdata = NULL )
derivatives(object, ...) ## Default S3 method: derivatives(object, ...) ## S3 method for class 'gamm' derivatives(object, ...) ## S3 method for class 'gam' derivatives( object, select = NULL, term = deprecated(), data = newdata, order = 1L, type = c("forward", "backward", "central"), n = 100, eps = 1e-07, interval = c("confidence", "simultaneous"), n_sim = 10000, level = 0.95, unconditional = FALSE, frequentist = FALSE, offset = NULL, ncores = 1, partial_match = FALSE, ..., newdata = NULL )
object |
an R object to compute derivatives for. |
... |
arguments passed to other methods. |
select |
character; select which smooth's posterior to draw from.
The default ( |
term |
|
data |
a data frame containing the values of the model covariates at which to evaluate the first derivatives of the smooths. |
order |
numeric; the order of derivative. |
type |
character; the type of finite difference used. One of
|
n |
numeric; the number of points to evaluate the derivative at. |
eps |
numeric; the finite difference. |
interval |
character; the type of interval to compute. One of
|
n_sim |
integer; the number of simulations used in computing the simultaneous intervals. |
level |
numeric; |
unconditional |
logical; use smoothness selection-corrected Bayesian covariance matrix? |
frequentist |
logical; use the frequentist covariance matrix? |
offset |
numeric; a value to use for any offset term |
ncores |
number of cores for generating random variables from a
multivariate normal distribution. Passed to |
partial_match |
logical; should smooths be selected by partial matches
with |
newdata |
Deprecated: use |
A tibble, currently with the following variables:
smooth
: the smooth each row refers to,
var
: the name of the variable involved in the smooth,
data
: values of var
at which the derivative was evaluated,
derivative
: the estimated derivative,
se
: the standard error of the estimated derivative,
crit
: the critical value such that derivative
± (crit * se)
gives
the upper and lower bounds of the requested confidence or simultaneous
interval (given level
),
lower
: the lower bound of the confidence or simultaneous interval,
upper
: the upper bound of the confidence or simultaneous interval.
derivatives()
will ignore any random effect smooths it encounters in
object
.
Gavin L. Simpson
load_mgcv() dat <- data_sim("eg1", n = 400, dist = "normal", scale = 2, seed = 42) mod <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = dat, method = "REML") ## first derivatives of all smooths using central finite differences derivatives(mod, type = "central") ## derivatives for a selected smooth derivatives(mod, type = "central", select = "s(x1)") ## or via a partial match derivatives(mod, type = "central", select = "x1", partial_match = TRUE)
load_mgcv() dat <- data_sim("eg1", n = 400, dist = "normal", scale = 2, seed = 42) mod <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = dat, method = "REML") ## first derivatives of all smooths using central finite differences derivatives(mod, type = "central") ## derivatives for a selected smooth derivatives(mod, type = "central", select = "s(x1)") ## or via a partial match derivatives(mod, type = "central", select = "x1", partial_match = TRUE)
Estimates pairwise differences (comparisons) between factor smooth
interactions (smooths with a factor by
argument) for pairs of groups
defined by the factor. The group means can be optionally included in the
difference.
difference_smooths(model, ...) ## S3 method for class 'gam' difference_smooths( model, select = NULL, smooth = deprecated(), n = 100, ci_level = 0.95, data = NULL, group_means = FALSE, partial_match = TRUE, unconditional = FALSE, frequentist = FALSE, ... )
difference_smooths(model, ...) ## S3 method for class 'gam' difference_smooths( model, select = NULL, smooth = deprecated(), n = 100, ci_level = 0.95, data = NULL, group_means = FALSE, partial_match = TRUE, unconditional = FALSE, frequentist = FALSE, ... )
model |
A fitted model. |
... |
arguments passed to other methods. Not currently used. |
select |
character, logical, or numeric; which smooths to plot. If
|
smooth |
|
n |
numeric; the number of points at which to evaluate the difference between pairs of smooths. |
ci_level |
numeric between 0 and 1; the coverage of credible interval. |
data |
data frame of locations at which to evaluate the difference between smooths. |
group_means |
logical; should the group means be included in the difference? |
partial_match |
logical; should |
unconditional |
logical; account for smoothness selection in the model? |
frequentist |
logical; use the frequentist covariance matrix? |
load_mgcv() df <- data_sim("eg4", seed = 42) m <- gam(y ~ fac + s(x2, by = fac) + s(x0), data = df, method = "REML") sm_dif <- difference_smooths(m, select = "s(x2)") sm_dif draw(sm_dif) # include the groups means for `fac` in the difference sm_dif2 <- difference_smooths(m, select = "s(x2)", group_means = TRUE) draw(sm_dif2)
load_mgcv() df <- data_sim("eg4", seed = 42) m <- gam(y ~ fac + s(x2, by = fac) + s(x0), data = df, method = "REML") sm_dif <- difference_smooths(m, select = "s(x2)") sm_dif draw(sm_dif) # include the groups means for `fac` in the difference sm_dif2 <- difference_smooths(m, select = "s(x2)", group_means = TRUE) draw(sm_dif2)
dispersion(model, ...) ## S3 method for class 'gam' dispersion(model, ...) ## S3 method for class 'glm' dispersion(model, ...)
dispersion(model, ...) ## S3 method for class 'gam' dispersion(model, ...) ## S3 method for class 'glm' dispersion(model, ...)
model |
a fitted model. |
... |
arguments passed to other methods. |
ggplot2
Generic plotting via ggplot2
draw(object, ...)
draw(object, ...)
object |
and R object to plot. |
... |
arguments passed to other methods. |
Generic function for plotting of R objects that uses the ggplot2
package.
A ggplot2::ggplot()
object.
Gavin L. Simpson
Plots basis functions using ggplot2
## S3 method for class 'basis' draw( object, legend = FALSE, labeller = NULL, ylab = NULL, title = NULL, subtitle = NULL, caption = NULL, ncol = NULL, nrow = NULL, angle = NULL, guides = "keep", contour = FALSE, n_contour = 10, contour_col = "black", ... )
## S3 method for class 'basis' draw( object, legend = FALSE, labeller = NULL, ylab = NULL, title = NULL, subtitle = NULL, caption = NULL, ncol = NULL, nrow = NULL, angle = NULL, guides = "keep", contour = FALSE, n_contour = 10, contour_col = "black", ... )
object |
an object, the result of a call to |
legend |
logical; should a legend by drawn to indicate basis functions? |
labeller |
a labeller function with which to label facets. The default
is to use |
ylab |
character or expression; the label for the y axis. If not
supplied, a suitable label will be generated from |
title |
character or expression; the title for the plot. See
|
subtitle |
character or expression; the subtitle for the plot. See
|
caption |
character or expression; the plot caption. See
|
ncol , nrow
|
numeric; the numbers of rows and columns over which to spread the plots |
angle |
numeric; the angle at which the x axis tick labels are to be
drawn passed to the |
guides |
character; one of |
contour |
logical; should contours be draw on the plot using
|
n_contour |
numeric; the number of contour bins. Will result in
|
contour_col |
colour specification for contour lines. |
... |
arguments passed to other methods. Not used by this method. |
A patchwork
object.
Gavin L. Simpson
load_mgcv() df <- data_sim("eg1", n = 400, seed = 42) m <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = df, method = "REML") bf <- basis(m) draw(bf) bf <- basis(m, "s(x2)") draw(bf)
load_mgcv() df <- data_sim("eg1", n = 400, seed = 42) m <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = df, method = "REML") bf <- basis(m) draw(bf) bf <- basis(m, "s(x2)") draw(bf)
Plot comparisons of smooths
## S3 method for class 'compare_smooths' draw(object, ncol = NULL, nrow = NULL, guides = "collect", ...)
## S3 method for class 'compare_smooths' draw(object, ncol = NULL, nrow = NULL, guides = "collect", ...)
object |
of class |
ncol , nrow
|
numeric; the numbers of rows and columns over which to spread the plots |
guides |
character; one of |
... |
additional arguments passed to |
Plot derivatives of smooths
## S3 method for class 'derivatives' draw( object, select = NULL, scales = c("free", "fixed"), add_change = FALSE, change_type = c("change", "sizer"), alpha = 0.2, change_col = "black", decrease_col = "#56B4E9", increase_col = "#E69F00", lwd_change = 1.5, ncol = NULL, nrow = NULL, guides = "keep", angle = NULL, ... ) ## S3 method for class 'partial_derivatives' draw( object, select = NULL, scales = c("free", "fixed"), alpha = 0.2, ncol = NULL, nrow = NULL, guides = "keep", angle = NULL, ... )
## S3 method for class 'derivatives' draw( object, select = NULL, scales = c("free", "fixed"), add_change = FALSE, change_type = c("change", "sizer"), alpha = 0.2, change_col = "black", decrease_col = "#56B4E9", increase_col = "#E69F00", lwd_change = 1.5, ncol = NULL, nrow = NULL, guides = "keep", angle = NULL, ... ) ## S3 method for class 'partial_derivatives' draw( object, select = NULL, scales = c("free", "fixed"), alpha = 0.2, ncol = NULL, nrow = NULL, guides = "keep", angle = NULL, ... )
object |
a fitted GAM, the result of a call to |
select |
character, logical, or numeric; which smooths to plot. If
|
scales |
character; should all univariate smooths be plotted with the
same y-axis scale? If Currently does not affect the y-axis scale of plots of the parametric terms. |
add_change |
logical; should the periods of significant change be highlighted on the plot? |
change_type |
character; the type of change to indicate. If |
alpha |
numeric; alpha transparency for confidence or simultaneous interval. |
change_col , decrease_col , increase_col
|
colour specifications to use for
indicating periods of change. |
lwd_change |
numeric; the |
ncol , nrow
|
numeric; the numbers of rows and columns over which to spread the plots |
guides |
character; one of |
angle |
numeric; the angle at which the x axis tick labels are to be
drawn passed to the |
... |
additional arguments passed to |
load_mgcv() dat <- data_sim("eg1", n = 800, dist = "normal", scale = 2, seed = 42) mod <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = dat, method = "REML") ## first derivative of all smooths df <- derivatives(mod, type = "central") draw(df) ## fixed axis scales draw(df, scales = "fixed")
load_mgcv() dat <- data_sim("eg1", n = 800, dist = "normal", scale = 2, seed = 42) mod <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = dat, method = "REML") ## first derivative of all smooths df <- derivatives(mod, type = "central") draw(df) ## fixed axis scales draw(df, scales = "fixed")
Plot differences of smooths
## S3 method for class 'difference_smooth' draw( object, select = NULL, rug = FALSE, ref_line = FALSE, contour = FALSE, contour_col = "black", n_contour = NULL, ci_alpha = 0.2, ci_col = "black", smooth_col = "black", line_col = "red", scales = c("free", "fixed"), ncol = NULL, nrow = NULL, guides = "keep", xlab = NULL, ylab = NULL, title = NULL, subtitle = NULL, caption = NULL, angle = NULL, ... )
## S3 method for class 'difference_smooth' draw( object, select = NULL, rug = FALSE, ref_line = FALSE, contour = FALSE, contour_col = "black", n_contour = NULL, ci_alpha = 0.2, ci_col = "black", smooth_col = "black", line_col = "red", scales = c("free", "fixed"), ncol = NULL, nrow = NULL, guides = "keep", xlab = NULL, ylab = NULL, title = NULL, subtitle = NULL, caption = NULL, angle = NULL, ... )
object |
a fitted GAM, the result of a call to |
select |
character, logical, or numeric; which smooths to plot. If
|
rug |
logical; |
ref_line |
logical; |
contour |
logical; should contour lines be added to smooth surfaces? |
contour_col |
colour specification for contour lines. |
n_contour |
numeric; the number of contour bins. Will result in
|
ci_alpha |
numeric; alpha transparency for confidence or simultaneous interval. |
ci_col |
colour specification for the confidence/credible intervals band. Affects the fill of the interval. |
smooth_col |
colour specification for the the smooth or difference line. |
line_col |
colour specification for drawing reference lines |
scales |
character; should all univariate smooths be plotted with the
same y-axis scale? If Currently does not affect the y-axis scale of plots of the parametric terms. |
ncol , nrow
|
numeric; the numbers of rows and columns over which to spread the plots |
guides |
character; one of |
xlab , ylab , title , subtitle , caption
|
character; labels with which to annotate plots |
angle |
numeric; the angle at which the x axis tick labels are to be
drawn passed to the |
... |
additional arguments passed to |
load_mgcv() # simulate some data; a factor smooth example df <- data_sim("eg4", seed = 42) # fit GAM m <- gam(y ~ fac + s(x2, by = fac) + s(x0), data = df, method = "REML") # calculate the differences between pairs of smooths the f_j(x2) term diffs <- difference_smooths(m, select = "s(x2)") draw(diffs)
load_mgcv() # simulate some data; a factor smooth example df <- data_sim("eg4", seed = 42) # fit GAM m <- gam(y ~ fac + s(x2, by = fac) + s(x0), data = df, method = "REML") # calculate the differences between pairs of smooths the f_j(x2) term diffs <- difference_smooths(m, select = "s(x2)") draw(diffs)
Plots estimated univariate and bivariate smooths using ggplot2.
## S3 method for class 'evaluated_parametric_term' draw( object, ci_level = 0.95, constant = NULL, fun = NULL, xlab, ylab, title = NULL, subtitle = NULL, caption = NULL, rug = TRUE, position = "identity", response_range = NULL, ... )
## S3 method for class 'evaluated_parametric_term' draw( object, ci_level = 0.95, constant = NULL, fun = NULL, xlab, ylab, title = NULL, subtitle = NULL, caption = NULL, rug = TRUE, position = "identity", response_range = NULL, ... )
object |
an object, the result of a call to
|
ci_level |
numeric between 0 and 1; the coverage of credible interval. |
constant |
numeric; a constant to add to the estimated values of the
smooth. |
fun |
function; a function that will be applied to the estimated values
and confidence interval before plotting. Can be a function or the name of a
function. Function |
xlab |
character or expression; the label for the x axis. If not
supplied, a suitable label will be generated from |
ylab |
character or expression; the label for the y axis. If not
supplied, a suitable label will be generated from |
title |
character or expression; the title for the plot. See
|
subtitle |
character or expression; the subtitle for the plot. See
|
caption |
character or expression; the plot caption. See
|
rug |
For |
position |
Position adjustment, either as a string, or the result of a call to a position adjustment function. |
response_range |
numeric; a vector of two values giving the range of
response data for the guide. Used to fix plots to a common scale/range.
Ignored if |
... |
arguments passed to other methods. |
A ggplot2::ggplot()
object.
Gavin L. Simpson
Plots estimated smooths from a fitted GAM model in a similar way to
mgcv::plot.gam()
but instead of using base graphics, ggplot2::ggplot()
is used instead.
## S3 method for class 'gam' draw( object, data = NULL, select = NULL, parametric = FALSE, terms = NULL, residuals = FALSE, scales = c("free", "fixed"), ci_level = 0.95, n = 100, n_3d = 16, n_4d = 4, unconditional = FALSE, overall_uncertainty = TRUE, constant = NULL, fun = NULL, dist = 0.1, rug = TRUE, contour = TRUE, grouped_by = FALSE, ci_alpha = 0.2, ci_col = "black", smooth_col = "black", resid_col = "steelblue3", contour_col = "black", n_contour = NULL, partial_match = FALSE, discrete_colour = NULL, discrete_fill = NULL, continuous_colour = NULL, continuous_fill = NULL, position = "identity", angle = NULL, ncol = NULL, nrow = NULL, guides = "keep", widths = NULL, heights = NULL, crs = NULL, default_crs = NULL, lims_method = "cross", wrap = TRUE, caption = TRUE, envir = environment(formula(object)), ... )
## S3 method for class 'gam' draw( object, data = NULL, select = NULL, parametric = FALSE, terms = NULL, residuals = FALSE, scales = c("free", "fixed"), ci_level = 0.95, n = 100, n_3d = 16, n_4d = 4, unconditional = FALSE, overall_uncertainty = TRUE, constant = NULL, fun = NULL, dist = 0.1, rug = TRUE, contour = TRUE, grouped_by = FALSE, ci_alpha = 0.2, ci_col = "black", smooth_col = "black", resid_col = "steelblue3", contour_col = "black", n_contour = NULL, partial_match = FALSE, discrete_colour = NULL, discrete_fill = NULL, continuous_colour = NULL, continuous_fill = NULL, position = "identity", angle = NULL, ncol = NULL, nrow = NULL, guides = "keep", widths = NULL, heights = NULL, crs = NULL, default_crs = NULL, lims_method = "cross", wrap = TRUE, caption = TRUE, envir = environment(formula(object)), ... )
object |
a fitted GAM, the result of a call to |
data |
an optional data frame that is used to supply the data at which the smooths will be evaluated and plotted. This is usually not needed, but is an option if you need fine control over exactly what data are used for plotting. |
select |
character, logical, or numeric; which smooths to plot. If
|
parametric |
logical; plot parametric terms also? Note that |
terms |
character; which model parametric terms should be drawn? The
Default of |
residuals |
logical; should partial residuals for a smooth be drawn? Ignored for anything but a simple univariate smooth. |
scales |
character; should all univariate smooths be plotted with the
same y-axis scale? If Currently does not affect the y-axis scale of plots of the parametric terms. |
ci_level |
numeric between 0 and 1; the coverage of credible interval. |
n |
numeric; the number of points over the range of the covariate at which to evaluate the smooth. |
n_3d |
numeric; the number of new observations to generate for the third dimension of a 3D smooth. |
n_4d |
numeric; the number of new observations to generate for the
dimensions higher than 2 (!) of a kD smooth (k >= 4). For example, if
the smooth is a 4D smooth, each of dimensions 3 and 4 will get |
unconditional |
logical; should confidence intervals include the
uncertainty due to smoothness selection? If |
overall_uncertainty |
logical; should the uncertainty in the model constant term be included in the standard error of the evaluate values of the smooth? |
constant |
numeric; a constant to add to the estimated values of the
smooth. |
fun |
function; a function that will be applied to the estimated values
and confidence interval before plotting. Can be a function or the name of a
function. Function |
dist |
numeric; if greater than 0, this is used to determine when
a location is too far from data to be plotted when plotting 2-D smooths.
The data are scaled into the unit square before deciding what to exclude,
and |
rug |
logical; draw a rug plot at the bottom of each plot for 1-D smooths or plot locations of data for higher dimensions. |
contour |
logical; should contours be draw on the plot using
|
grouped_by |
logical; should factor by smooths be drawn as one panel
per level of the factor ( |
ci_alpha |
numeric; alpha transparency for confidence or simultaneous interval. |
ci_col |
colour specification for the confidence/credible intervals band. Affects the fill of the interval. |
smooth_col |
colour specification for the smooth line. |
resid_col |
colour specification for the partial residuals. |
contour_col |
colour specification for contour lines. |
n_contour |
numeric; the number of contour bins. Will result in
|
partial_match |
logical; should smooths be selected by partial matches
with |
discrete_colour |
a suitable colour scale to be used when plotting discrete variables. |
discrete_fill |
a suitable fill scale to be used when plotting discrete variables. |
continuous_colour |
a suitable colour scale to be used when plotting continuous variables. |
continuous_fill |
a suitable fill scale to be used when plotting continuous variables. |
position |
Position adjustment, either as a string, or the result of a call to a position adjustment function. |
angle |
numeric; the angle at which the x axis tick labels are to be
drawn passed to the |
ncol , nrow
|
numeric; the numbers of rows and columns over which to spread the plots |
guides |
character; one of |
widths , heights
|
The relative widths and heights of each column and
row in the grid. Will get repeated to match the dimensions of the grid. If
there is more than 1 plot and |
crs |
the coordinate reference system (CRS) to use for the plot. All
data will be projected into this CRS. See |
default_crs |
the coordinate reference system (CRS) to use for the
non-sf layers in the plot. If left at the default |
lims_method |
character; affects how the axis limits are determined. See
|
wrap |
logical; wrap plots as a patchwork? If |
caption |
logical; show the smooth type in the caption of each plot? |
envir |
an environment to look up the data within. |
... |
additional arguments passed to |
The object returned is created by patchwork::wrap_plots()
.
Internally, plots of each smooth are created using ggplot2::ggplot()
and composed into a single plot using patchwork::wrap_plots()
. As a
result, it is not possible to use +
to add to the plots in the way one
might typically work with ggplot()
plots. Instead, use the &
operator;
see the examples.
Gavin L. Simpson
load_mgcv() # simulate some data df1 <- data_sim("eg1", n = 400, dist = "normal", scale = 2, seed = 2) # fit GAM m1 <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = df1, method = "REML") # plot all smooths draw(m1) # can add partial residuals draw(m1, residuals = TRUE) df2 <- data_sim("eg2", n = 1000, dist = "normal", scale = 1, seed = 2) m2 <- gam(y ~ s(x, z, k = 40), data = df2, method = "REML") draw(m2, contour = FALSE, n = 50) # See https://gavinsimpson.github.io/gratia/articles/custom-plotting.html # for more examples and for details on how to modify the theme of all the # plots produced by draw(). To modify all panels, for example to change the # theme, use the & operator
load_mgcv() # simulate some data df1 <- data_sim("eg1", n = 400, dist = "normal", scale = 2, seed = 2) # fit GAM m1 <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = df1, method = "REML") # plot all smooths draw(m1) # can add partial residuals draw(m1, residuals = TRUE) df2 <- data_sim("eg2", n = 1000, dist = "normal", scale = 1, seed = 2) m2 <- gam(y ~ s(x, z, k = 40), data = df2, method = "REML") draw(m2, contour = FALSE, n = 50) # See https://gavinsimpson.github.io/gratia/articles/custom-plotting.html # for more examples and for details on how to modify the theme of all the # plots produced by draw(). To modify all panels, for example to change the # theme, use the & operator
GJRM::gamlss
Provides a draw()
method for GAMLSS (distributional GAMs) fitted
by GJRM::gamlss()
.
## S3 method for class 'gamlss' draw( object, scales = c("free", "fixed"), ncol = NULL, nrow = NULL, guides = "keep", widths = NULL, heights = NULL, ... )
## S3 method for class 'gamlss' draw( object, scales = c("free", "fixed"), ncol = NULL, nrow = NULL, guides = "keep", widths = NULL, heights = NULL, ... )
object |
a model, fitted by |
scales |
character; should all univariate smooths be plotted with the
same y-axis scale? If Currently does not affect the y-axis scale of plots of the parametric terms. |
ncol , nrow
|
numeric; the numbers of rows and columns over which to spread the plots |
guides |
character; one of |
widths , heights
|
The relative widths and heights of each column and
row in the grid. Will get repeated to match the dimensions of the grid. If
there is more than 1 plot and |
... |
arguments passed to |
Plots of smooths are not labelled with the linear predictor to which they belong.
if (require("GJRM", quietly = TRUE)) { # follow example from ?GJRM::gamlss load_mgcv() suppressPackageStartupMessages(library("GJRM")) set.seed(0) n <- 100 x1 <- round(runif(n)) x2 <- runif(n) x3 <- runif(n) f1 <- function(x) cos(pi * 2 * x) + sin(pi * x) y1 <- -1.55 + 2 * x1 + f1(x2) + rnorm(n) dataSim <- data.frame(y1, x1, x2, x3) eq_mu <- y1 ~ x1 + s(x2) eq_s <- ~ s(x3, k = 6) fl <- list(eq_mu, eq_s) m <- gamlss(fl, data = dataSim) draw(m) }
if (require("GJRM", quietly = TRUE)) { # follow example from ?GJRM::gamlss load_mgcv() suppressPackageStartupMessages(library("GJRM")) set.seed(0) n <- 100 x1 <- round(runif(n)) x2 <- runif(n) x3 <- runif(n) f1 <- function(x) cos(pi * 2 * x) + sin(pi * x) y1 <- -1.55 + 2 * x1 + f1(x2) + rnorm(n) dataSim <- data.frame(y1, x1, x2, x3) eq_mu <- y1 ~ x1 + s(x2) eq_s <- ~ s(x3, k = 6) fl <- list(eq_mu, eq_s) m <- gamlss(fl, data = dataSim) draw(m) }
Plots basis functions using ggplot2
## S3 method for class 'mgcv_smooth' draw( object, legend = FALSE, use_facets = TRUE, labeller = NULL, xlab, ylab, title = NULL, subtitle = NULL, caption = NULL, angle = NULL, ... )
## S3 method for class 'mgcv_smooth' draw( object, legend = FALSE, use_facets = TRUE, labeller = NULL, xlab, ylab, title = NULL, subtitle = NULL, caption = NULL, angle = NULL, ... )
object |
an object, the result of a call to |
legend |
logical; should a legend by drawn to indicate basis functions? |
use_facets |
logical; for factor by smooths, use facets to show the
basis functions for each level of the factor? If |
labeller |
a labeller function with which to label facets. The default
is to use |
xlab |
character or expression; the label for the x axis. If not
supplied, a suitable label will be generated from |
ylab |
character or expression; the label for the y axis. If not
supplied, a suitable label will be generated from |
title |
character or expression; the title for the plot. See
|
subtitle |
character or expression; the subtitle for the plot. See
|
caption |
character or expression; the plot caption. See
|
angle |
numeric; the angle at which the x axis tick labels are to be
drawn passed to the |
... |
arguments passed to other methods. Not used by this method. |
A ggplot2::ggplot()
object.
Gavin L. Simpson
load_mgcv() df <- data_sim("eg4", n = 400, seed = 42) bf <- basis(s(x0), data = df) draw(bf) bf <- basis(s(x2, by = fac, bs = "bs"), data = df) draw(bf)
load_mgcv() df <- data_sim("eg4", n = 400, seed = 42) bf <- basis(s(x0), data = df) draw(bf) bf <- basis(s(x2, by = fac, bs = "bs"), data = df) draw(bf)
Plot concurvity measures
## S3 method for class 'pairwise_concurvity' draw( object, title = "Smooth-wise concurvity", subtitle = NULL, caption = NULL, x_lab = "Term", y_lab = "With", fill_lab = "Concurvity", continuous_colour = NULL, ... ) ## S3 method for class 'overall_concurvity' draw( object, title = "Overall concurvity", subtitle = NULL, caption = NULL, y_lab = "Concurvity", x_lab = NULL, bar_col = "steelblue", bar_fill = "steelblue", ... )
## S3 method for class 'pairwise_concurvity' draw( object, title = "Smooth-wise concurvity", subtitle = NULL, caption = NULL, x_lab = "Term", y_lab = "With", fill_lab = "Concurvity", continuous_colour = NULL, ... ) ## S3 method for class 'overall_concurvity' draw( object, title = "Overall concurvity", subtitle = NULL, caption = NULL, y_lab = "Concurvity", x_lab = NULL, bar_col = "steelblue", bar_fill = "steelblue", ... )
object |
An object inheriting from class |
title |
character; the plot title. |
subtitle |
character; the plot subtitle. |
caption |
character; the plot caption |
x_lab |
character; the label for the x axis. |
y_lab |
character; the label for the y axis. |
fill_lab |
character; the label to use for the fill guide. |
continuous_colour |
function; continuous colour (fill) scale to use. |
... |
arguments passed to other methods. |
bar_col |
colour specification for the bar colour. |
bar_fill |
colour specification for the bar fill |
Plot estimated effects for model parametric terms
## S3 method for class 'parametric_effects' draw( object, scales = c("free", "fixed"), ci_level = 0.95, ci_col = "black", ci_alpha = 0.2, line_col = "black", constant = NULL, fun = NULL, rug = TRUE, position = "identity", angle = NULL, ..., ncol = NULL, nrow = NULL, guides = "keep" )
## S3 method for class 'parametric_effects' draw( object, scales = c("free", "fixed"), ci_level = 0.95, ci_col = "black", ci_alpha = 0.2, line_col = "black", constant = NULL, fun = NULL, rug = TRUE, position = "identity", angle = NULL, ..., ncol = NULL, nrow = NULL, guides = "keep" )
object |
a fitted GAM, the result of a call to |
scales |
character; should all univariate smooths be plotted with the
same y-axis scale? If Currently does not affect the y-axis scale of plots of the parametric terms. |
ci_level |
numeric between 0 and 1; the coverage of credible interval. |
ci_col |
colour specification for the confidence/credible intervals band. Affects the fill of the interval. |
ci_alpha |
numeric; alpha transparency for confidence or simultaneous interval. |
line_col |
colour specification used for regression lines of linear continuous terms. |
constant |
numeric; a constant to add to the estimated values of the
smooth. |
fun |
function; a function that will be applied to the estimated values
and confidence interval before plotting. Can be a function or the name of a
function. Function |
rug |
logical; draw a rug plot at the bottom of each plot for 1-D smooths or plot locations of data for higher dimensions. |
position |
Position adjustment, either as a string, or the result of a call to a position adjustment function. |
angle |
numeric; the angle at which the x axis tick labels are to be
drawn passed to the |
... |
additional arguments passed to |
ncol , nrow
|
numeric; the numbers of rows and columns over which to spread the plots |
guides |
character; one of |
ggplot
Displays the penalty matrices of smooths as a heatmap using ggplot
## S3 method for class 'penalty_df' draw( object, normalize = FALSE, as_matrix = TRUE, continuous_fill = NULL, xlab = NULL, ylab = NULL, title = NULL, subtitle = NULL, caption = NULL, ncol = NULL, nrow = NULL, guides = "keep", ... )
## S3 method for class 'penalty_df' draw( object, normalize = FALSE, as_matrix = TRUE, continuous_fill = NULL, xlab = NULL, ylab = NULL, title = NULL, subtitle = NULL, caption = NULL, ncol = NULL, nrow = NULL, guides = "keep", ... )
object |
a fitted GAM, the result of a call to |
normalize |
logical; normalize the penalty to the range -1, 1? |
as_matrix |
logical; how should the plotted penalty matrix be oriented?
If |
continuous_fill |
a suitable fill scale to be used when plotting continuous variables. |
xlab |
character or expression; the label for the x axis. If not supplied, no axis label will be drawn. May be a vector, one per penalty. |
ylab |
character or expression; the label for the y axis. If not supplied, no axis label will be drawn. May be a vector, one per penalty. |
title |
character or expression; the title for the plot. See
|
subtitle |
character or expression; the subtitle for the plot. See
|
caption |
character or expression; the plot caption. See
|
ncol , nrow
|
numeric; the numbers of rows and columns over which to spread the plots. |
guides |
character; one of |
... |
additional arguments passed to |
load_mgcv() dat <- data_sim("eg4", n = 400, seed = 42) m <- gam(y ~ s(x0) + s(x1, bs = "cr") + s(x2, bs = "bs", by = fac), data = dat, method = "REML" ) ## produce a multi-panel plot of all penalties draw(penalty(m)) # for a specific smooth draw(penalty(m, select = "s(x2):fac1"))
load_mgcv() dat <- data_sim("eg4", n = 400, seed = 42) m <- gam(y ~ s(x0) + s(x1, bs = "cr") + s(x2, bs = "bs", by = fac), data = dat, method = "REML" ) ## produce a multi-panel plot of all penalties draw(penalty(m)) # for a specific smooth draw(penalty(m, select = "s(x2):fac1"))
A rootogram is a model diagnostic tool that assesses the goodness of fit of
a statistical model. The observed values of the response are compared with
those expected from the fitted model. For discrete, count responses, the
frequency of each count (0, 1, 2, etc) in the observed data and expected
from the conditional distribution of the response implied by the model are
compared. For continuous variables, the observed and expected frequencies
are obtained by grouping the data into bins. The rootogram is drawn using
ggplot2::ggplot()
graphics. The design closely follows Kleiber & Zeileis
(2016).
## S3 method for class 'rootogram' draw( object, type = c("hanging", "standing", "suspended"), sqrt = TRUE, ref_line = TRUE, warn_limits = TRUE, fitted_colour = "steelblue", bar_colour = NA, bar_fill = "grey", ref_line_colour = "black", warn_line_colour = "black", ylab = NULL, xlab = NULL, ... )
## S3 method for class 'rootogram' draw( object, type = c("hanging", "standing", "suspended"), sqrt = TRUE, ref_line = TRUE, warn_limits = TRUE, fitted_colour = "steelblue", bar_colour = NA, bar_fill = "grey", ref_line_colour = "black", warn_line_colour = "black", ylab = NULL, xlab = NULL, ... )
object |
and R object to plot. |
type |
character; the type of rootogram to draw. |
sqrt |
logical; show the observed and fitted frequencies |
ref_line |
logical; draw a reference line at zero? |
warn_limits |
logical; draw Tukey's warning limit lines at +/- 1? |
fitted_colour , bar_colour , bar_fill , ref_line_colour , warn_line_colour
|
colours used to draw the respective element of the rootogram. |
xlab , ylab
|
character; labels for the x and y axis of the rootogram.
May be missing ( |
... |
arguments passed to other methods. |
A 'ggplot' object.
Kleiber, C., Zeileis, A., (2016) Visualizing Count Data Regressions Using Rootograms. Am. Stat. 70, 296–303. doi:10.1080/00031305.2016.1173590
rootogram()
to compute the data for the rootogram.
load_mgcv() df <- data_sim("eg1", n = 1000, dist = "poisson", scale = 0.1, seed = 6) # A poisson example m <- gam(y ~ s(x0, bs = "cr") + s(x1, bs = "cr") + s(x2, bs = "cr") + s(x3, bs = "cr"), family = poisson(), data = df, method = "REML") rg <- rootogram(m) # plot the rootogram draw(rg) # change the type of rootogram draw(rg, type = "suspended")
load_mgcv() df <- data_sim("eg1", n = 1000, dist = "poisson", scale = 0.1, seed = 6) # A poisson example m <- gam(y ~ s(x0, bs = "cr") + s(x1, bs = "cr") + s(x2, bs = "cr") + s(x3, bs = "cr"), family = poisson(), data = df, method = "REML") rg <- rootogram(m) # plot the rootogram draw(rg) # change the type of rootogram draw(rg, type = "suspended")
smooth_estimates()
Plot the result of a call to smooth_estimates()
## S3 method for class 'smooth_estimates' draw( object, constant = NULL, fun = NULL, contour = TRUE, grouped_by = FALSE, contour_col = "black", n_contour = NULL, ci_alpha = 0.2, ci_col = "black", smooth_col = "black", resid_col = "steelblue3", decrease_col = "#56B4E9", increase_col = "#E69F00", change_lwd = 1.75, partial_match = FALSE, discrete_colour = NULL, discrete_fill = NULL, continuous_colour = NULL, continuous_fill = NULL, angle = NULL, ylim = NULL, crs = NULL, default_crs = NULL, lims_method = "cross", caption = TRUE, ... )
## S3 method for class 'smooth_estimates' draw( object, constant = NULL, fun = NULL, contour = TRUE, grouped_by = FALSE, contour_col = "black", n_contour = NULL, ci_alpha = 0.2, ci_col = "black", smooth_col = "black", resid_col = "steelblue3", decrease_col = "#56B4E9", increase_col = "#E69F00", change_lwd = 1.75, partial_match = FALSE, discrete_colour = NULL, discrete_fill = NULL, continuous_colour = NULL, continuous_fill = NULL, angle = NULL, ylim = NULL, crs = NULL, default_crs = NULL, lims_method = "cross", caption = TRUE, ... )
object |
a fitted GAM, the result of a call to |
constant |
numeric; a constant to add to the estimated values of the
smooth. |
fun |
function; a function that will be applied to the estimated values
and confidence interval before plotting. Can be a function or the name of a
function. Function |
contour |
logical; should contours be draw on the plot using
|
grouped_by |
logical; should factor by smooths be drawn as one panel
per level of the factor ( |
contour_col |
colour specification for contour lines. |
n_contour |
numeric; the number of contour bins. Will result in
|
ci_alpha |
numeric; alpha transparency for confidence or simultaneous interval. |
ci_col |
colour specification for the confidence/credible intervals band. Affects the fill of the interval. |
smooth_col |
colour specification for the smooth line. |
resid_col |
colour specification for the partial residuals. |
decrease_col , increase_col
|
colour specifications to use for
indicating periods of change. |
change_lwd |
numeric; the value to set the |
partial_match |
logical; should smooths be selected by partial matches
with |
discrete_colour |
a suitable colour scale to be used when plotting discrete variables. |
discrete_fill |
a suitable fill scale to be used when plotting discrete variables. |
continuous_colour |
a suitable colour scale to be used when plotting continuous variables. |
continuous_fill |
a suitable fill scale to be used when plotting continuous variables. |
angle |
numeric; the angle at which the x axis tick labels are to be
drawn passed to the |
ylim |
numeric; vector of y axis limits to use all all panels drawn. |
crs |
the coordinate reference system (CRS) to use for the plot. All
data will be projected into this CRS. See |
default_crs |
the coordinate reference system (CRS) to use for the
non-sf layers in the plot. If left at the default |
lims_method |
character; affects how the axis limits are determined. See
|
caption |
logical; show the smooth type in the caption of each plot? |
... |
additional arguments passed to |
load_mgcv() # example data df <- data_sim("eg1", seed = 21) # fit GAM m <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = df, method = "REML") # plot all of the estimated smooths sm <- smooth_estimates(m) draw(sm) # evaluate smooth of `x2` sm <- smooth_estimates(m, select = "s(x2)") # plot it draw(sm) # customising some plot elements draw(sm, ci_col = "steelblue", smooth_col = "forestgreen", ci_alpha = 0.3) # Add a constant to the plotted smooth draw(sm, constant = coef(m)[1]) # Adding change indicators to smooths based on derivatives of the smooth d <- derivatives(m, n = 100) # n to match smooth_estimates() smooth_estimates(m) |> add_sizer(derivatives = d, type = "sizer") |> draw()
load_mgcv() # example data df <- data_sim("eg1", seed = 21) # fit GAM m <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = df, method = "REML") # plot all of the estimated smooths sm <- smooth_estimates(m) draw(sm) # evaluate smooth of `x2` sm <- smooth_estimates(m, select = "s(x2)") # plot it draw(sm) # customising some plot elements draw(sm, ci_col = "steelblue", smooth_col = "forestgreen", ci_alpha = 0.3) # Add a constant to the plotted smooth draw(sm, constant = coef(m)[1]) # Adding change indicators to smooths based on derivatives of the smooth d <- derivatives(m, n = 100) # n to match smooth_estimates() smooth_estimates(m) |> add_sizer(derivatives = d, type = "sizer") |> draw()
Plot posterior smooths
## S3 method for class 'smooth_samples' draw( object, select = NULL, n_samples = NULL, seed = NULL, xlab = NULL, ylab = NULL, title = NULL, subtitle = NULL, caption = NULL, alpha = 1, colour = "black", contour = FALSE, contour_col = "black", n_contour = NULL, scales = c("free", "fixed"), rug = TRUE, partial_match = FALSE, angle = NULL, ncol = NULL, nrow = NULL, guides = "keep", ... )
## S3 method for class 'smooth_samples' draw( object, select = NULL, n_samples = NULL, seed = NULL, xlab = NULL, ylab = NULL, title = NULL, subtitle = NULL, caption = NULL, alpha = 1, colour = "black", contour = FALSE, contour_col = "black", n_contour = NULL, scales = c("free", "fixed"), rug = TRUE, partial_match = FALSE, angle = NULL, ncol = NULL, nrow = NULL, guides = "keep", ... )
object |
a fitted GAM, the result of a call to |
select |
character, logical, or numeric; which smooths to plot. If
|
n_samples |
numeric; if not |
seed |
numeric; random seed to be used to if sampling draws. |
xlab |
character or expression; the label for the x axis. If not
supplied, a suitable label will be generated from |
ylab |
character or expression; the label for the y axis. If not
supplied, a suitable label will be generated from |
title |
character or expression; the title for the plot. See
|
subtitle |
character or expression; the subtitle for the plot. See
|
caption |
character or expression; the plot caption. See
|
alpha |
numeric; alpha transparency for confidence or simultaneous interval. |
colour |
The colour to use to draw the posterior smooths. Passed to
|
contour |
logical; should contour lines be added to smooth surfaces? |
contour_col |
colour specification for contour lines. |
n_contour |
numeric; the number of contour bins. Will result in
|
scales |
character; should all univariate smooths be plotted with the
same y-axis scale? If Currently does not affect the y-axis scale of plots of the parametric terms. |
rug |
logical; draw a rug plot at the bottom of each plot for 1-D smooths or plot locations of data for higher dimensions. |
partial_match |
logical; should smooths be selected by partial matches
with |
angle |
numeric; the angle at which the x axis tick labels are to be
drawn passed to the |
ncol , nrow
|
numeric; the numbers of rows and columns over which to spread the plots |
guides |
character; one of |
... |
arguments to be passed to |
Gavin L. Simpson
load_mgcv() dat1 <- data_sim("eg1", n = 400, dist = "normal", scale = 1, seed = 1) ## a single smooth GAM m1 <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = dat1, method = "REML") ## posterior smooths from m1 sm1 <- smooth_samples(m1, n = 15, seed = 23478) ## plot draw(sm1, alpha = 0.7) ## plot only 5 randomly smapled draws draw(sm1, n_samples = 5, alpha = 0.7) ## A factor-by smooth example dat2 <- data_sim("eg4", n = 400, dist = "normal", scale = 1, seed = 1) ## a multi-smooth GAM with a factor-by smooth m2 <- gam(y ~ fac + s(x2, by = fac) + s(x0), data = dat2, method = "REML") ## posterior smooths from m1 sm2 <- smooth_samples(m2, n = 15, seed = 23478) ## plot, this time selecting only the factor-by smooth draw(sm2, select = "s(x2)", partial_match = TRUE, alpha = 0.7) ## A 2D smooth example dat3 <- data_sim("eg2", n = 400, dist = "normal", scale = 1, seed = 1) ## fit a 2D smooth m3 <- gam(y ~ te(x, z), data = dat3, method = "REML") ## get samples sm3 <- smooth_samples(m3, n = 10) ## plot just 6 of the draws, with contour line overlays draw(sm3, n_samples = 6, contour = TRUE, seed = 42)
load_mgcv() dat1 <- data_sim("eg1", n = 400, dist = "normal", scale = 1, seed = 1) ## a single smooth GAM m1 <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = dat1, method = "REML") ## posterior smooths from m1 sm1 <- smooth_samples(m1, n = 15, seed = 23478) ## plot draw(sm1, alpha = 0.7) ## plot only 5 randomly smapled draws draw(sm1, n_samples = 5, alpha = 0.7) ## A factor-by smooth example dat2 <- data_sim("eg4", n = 400, dist = "normal", scale = 1, seed = 1) ## a multi-smooth GAM with a factor-by smooth m2 <- gam(y ~ fac + s(x2, by = fac) + s(x0), data = dat2, method = "REML") ## posterior smooths from m1 sm2 <- smooth_samples(m2, n = 15, seed = 23478) ## plot, this time selecting only the factor-by smooth draw(sm2, select = "s(x2)", partial_match = TRUE, alpha = 0.7) ## A 2D smooth example dat3 <- data_sim("eg2", n = 400, dist = "normal", scale = 1, seed = 1) ## fit a 2D smooth m3 <- gam(y ~ te(x, z), data = dat3, method = "REML") ## get samples sm3 <- smooth_samples(m3, n = 10) ## plot just 6 of the draws, with contour line overlays draw(sm3, n_samples = 6, contour = TRUE, seed = 42)
Extracts the effective degrees of freedom (EDF) for model smooth terms or overall EDF for fitted GAMs
edf(object, ...) ## S3 method for class 'gam' edf( object, select = NULL, smooth = deprecated(), type = c("default", "unconditional", "alternative"), partial_match = FALSE, ... ) model_edf(object, ..., type = c("default", "unconditional", "alternative"))
edf(object, ...) ## S3 method for class 'gam' edf( object, select = NULL, smooth = deprecated(), type = c("default", "unconditional", "alternative"), partial_match = FALSE, ... ) model_edf(object, ..., type = c("default", "unconditional", "alternative"))
Multiple formulations for the effective degrees of freedom are
available. The additional uncertainty due to selection of smoothness
parameters can be taken into account when computing the EDF of smooths.
This form of the EDF is available with type = "unconditional"
.
Wood (2017; pp. 252) describes an alternative EDF for the model
where
is the matrix trace and
is a matrix
mapping un-penalized coefficient estimates to the penalized coefficient
estimates. The trace of
is effectively the average
shrinkage of the coefficients multipled by the number of coefficients
(Wood, 2017). Smooth-specific EDFs then are obtained by summing up the
relevent elements of
.
load_mgcv() df <- data_sim("eg1", n = 400, seed = 42) m <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = df, method = "REML") # extract the EDFs for all smooths edf(m) # or selected smooths edf(m, select = c("s(x0)", "s(x2)")) # accounting for smoothness parameter uncertainty edf(m, type = "unconditional") # over EDF of the model, including the intercept model_edf(m) # can get model EDF for multiple models m2 <- gam(y ~ s(x0) + s(x1) + s(x3), data = df, method = "REML") model_edf(m, m2)
load_mgcv() df <- data_sim("eg1", n = 400, seed = 42) m <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = df, method = "REML") # extract the EDFs for all smooths edf(m) # or selected smooths edf(m, select = c("s(x0)", "s(x2)")) # accounting for smoothness parameter uncertainty edf(m, type = "unconditional") # over EDF of the model, including the intercept model_edf(m) # can get model EDF for multiple models m2 <- gam(y ~ s(x0) + s(x1) + s(x3), data = df, method = "REML") model_edf(m, m2)
S3 methods to evaluate individual smooths
eval_smooth(smooth, ...) ## S3 method for class 'mgcv.smooth' eval_smooth( smooth, model, n = 100, n_3d = NULL, n_4d = NULL, data = NULL, unconditional = FALSE, overall_uncertainty = TRUE, dist = NULL, ... ) ## S3 method for class 'soap.film' eval_smooth( smooth, model, n = 100, n_3d = NULL, n_4d = NULL, data = NULL, unconditional = FALSE, overall_uncertainty = TRUE, ... ) ## S3 method for class 'scam_smooth' eval_smooth( smooth, model, n = 100, n_3d = NULL, n_4d = NULL, data = NULL, unconditional = FALSE, overall_uncertainty = TRUE, dist = NULL, ... ) ## S3 method for class 'fs.interaction' eval_smooth( smooth, model, n = 100, data = NULL, unconditional = FALSE, overall_uncertainty = TRUE, ... ) ## S3 method for class 'sz.interaction' eval_smooth( smooth, model, n = 100, data = NULL, unconditional = FALSE, overall_uncertainty = TRUE, ... ) ## S3 method for class 'random.effect' eval_smooth( smooth, model, n = 100, data = NULL, unconditional = FALSE, overall_uncertainty = TRUE, ... ) ## S3 method for class 'mrf.smooth' eval_smooth( smooth, model, n = 100, data = NULL, unconditional = FALSE, overall_uncertainty = TRUE, ... ) ## S3 method for class 't2.smooth' eval_smooth( smooth, model, n = 100, n_3d = NULL, n_4d = NULL, data = NULL, unconditional = FALSE, overall_uncertainty = TRUE, dist = NULL, ... ) ## S3 method for class 'tensor.smooth' eval_smooth( smooth, model, n = 100, n_3d = NULL, n_4d = NULL, data = NULL, unconditional = FALSE, overall_uncertainty = TRUE, dist = NULL, ... )
eval_smooth(smooth, ...) ## S3 method for class 'mgcv.smooth' eval_smooth( smooth, model, n = 100, n_3d = NULL, n_4d = NULL, data = NULL, unconditional = FALSE, overall_uncertainty = TRUE, dist = NULL, ... ) ## S3 method for class 'soap.film' eval_smooth( smooth, model, n = 100, n_3d = NULL, n_4d = NULL, data = NULL, unconditional = FALSE, overall_uncertainty = TRUE, ... ) ## S3 method for class 'scam_smooth' eval_smooth( smooth, model, n = 100, n_3d = NULL, n_4d = NULL, data = NULL, unconditional = FALSE, overall_uncertainty = TRUE, dist = NULL, ... ) ## S3 method for class 'fs.interaction' eval_smooth( smooth, model, n = 100, data = NULL, unconditional = FALSE, overall_uncertainty = TRUE, ... ) ## S3 method for class 'sz.interaction' eval_smooth( smooth, model, n = 100, data = NULL, unconditional = FALSE, overall_uncertainty = TRUE, ... ) ## S3 method for class 'random.effect' eval_smooth( smooth, model, n = 100, data = NULL, unconditional = FALSE, overall_uncertainty = TRUE, ... ) ## S3 method for class 'mrf.smooth' eval_smooth( smooth, model, n = 100, data = NULL, unconditional = FALSE, overall_uncertainty = TRUE, ... ) ## S3 method for class 't2.smooth' eval_smooth( smooth, model, n = 100, n_3d = NULL, n_4d = NULL, data = NULL, unconditional = FALSE, overall_uncertainty = TRUE, dist = NULL, ... ) ## S3 method for class 'tensor.smooth' eval_smooth( smooth, model, n = 100, n_3d = NULL, n_4d = NULL, data = NULL, unconditional = FALSE, overall_uncertainty = TRUE, dist = NULL, ... )
smooth |
currently an object that inherits from class |
... |
arguments passed to other methods |
model |
a fitted model; currently only |
n |
numeric; the number of points over the range of the covariate at which to evaluate the smooth. |
n_3d , n_4d
|
numeric; the number of points over the range of last
covariate in a 3D or 4D smooth. The default is |
data |
an optional data frame of values to evaluate |
unconditional |
logical; should confidence intervals include the
uncertainty due to smoothness selection? If |
overall_uncertainty |
logical; should the uncertainty in the model constant term be included in the standard error of the evaluate values of the smooth? |
dist |
numeric; if greater than 0, this is used to determine when
a location is too far from data to be plotted when plotting 2-D smooths.
The data are scaled into the unit square before deciding what to exclude,
and |
Returns values of parametric model terms
at values of factor terms and over a grid of covariate values for linear
parametric terms. This function is now deprecated in favour of
parametric_effects()
.
evaluate_parametric_term(object, ...) ## S3 method for class 'gam' evaluate_parametric_term(object, term, unconditional = FALSE, ...)
evaluate_parametric_term(object, ...) ## S3 method for class 'gam' evaluate_parametric_term(object, term, unconditional = FALSE, ...)
object |
an object of class |
... |
arguments passed to other methods. |
term |
character; which parametric term whose effects are evaluated |
unconditional |
logical; should confidence intervals include the
uncertainty due to smoothness selection? If |
Evaluate a smooth at a grid of evenly spaced value over the range of the covariate associated with the smooth. Alternatively, a set of points at which the smooth should be evaluated can be supplied.
evaluate_smooth(object, ...)
evaluate_smooth(object, ...)
object |
an object of class |
... |
arguments passed to other methods. |
evaluate_smooth()
is deprecated in
favour of smooth_estimates()
, which provides a cleaner way to evaluate a
smooth over a range of covariate values. smooth_estimates()
can handle a
much wider range of models than evaluate_smooth()
is capable of and
smooth_estimates()
is much easier to extend to handle new smooth types.
Most code that uses evaluate_smooth()
should work simply by changing the
function call to smooth_estimates()
. However, there are some differences:
the newdata
argument becomes data
A data frame, which is of class "evaluated_1d_smooth"
or
evaluated_2d_smooth
, which inherit from classes "evaluated_smooth"
and "data.frame"
.
For a continuous vector x
, evenly
and seq_min_max()
create a sequence of n
evenly-spaced values over the range lower
– upper
. By default, lower
is defined as min(x)
and upper
as
max(x)
, excluding NA
s. For a factor x
, the function returns
levels(x)
.
evenly(x, n = 100, by = NULL, lower = NULL, upper = NULL) seq_min_max(x, n, by = NULL, lower = NULL, upper = NULL)
evenly(x, n = 100, by = NULL, lower = NULL, upper = NULL) seq_min_max(x, n, by = NULL, lower = NULL, upper = NULL)
x |
numeric; vector over which evenly-spaced values are returned |
n |
numeric; the number of evenly-spaced values to return. A default of
|
by |
numeric; the increment of the sequence. If specified, argument |
lower |
numeric; the lower bound of the interval. |
upper |
numeric; the upper bound of the interval. |
A numeric vector of length n
.
See base::seq()
for details of the behaviour of evenly()
when
using by
.
x <- rnorm(10) n <- 10L # 10 values evenly over the range of `x` evenly(x, n = n) # evenly spaced values, incrementing by 0.2 evenly(x, by = 0.2) # evenly spaced values, incrementing by 0.2, starting at -2 evenly(x, by = 0.2, lower = -2)
x <- rnorm(10) n <- 10L # 10 values evenly over the range of `x` evenly(x, n = n) # evenly spaced values, incrementing by 0.2 evenly(x, by = 0.2) # evenly spaced values, incrementing by 0.2, starting at -2 evenly(x, by = 0.2, lower = -2)
All combinations of factor levels
factor_combos(object, ...) ## S3 method for class 'gam' factor_combos(object, vars = everything(), complete = TRUE, ...)
factor_combos(object, ...) ## S3 method for class 'gam' factor_combos(object, vars = everything(), complete = TRUE, ...)
object |
a fitted model object. |
... |
arguments passed to methods. |
vars |
terms to include or exclude from the returned object. Uses tidyselect principles. |
complete |
logical; should all combinations of factor levels be
returned? If |
Extracts the name of the family used to fit the supplied model.
family_name(object, ...)
family_name(object, ...)
object |
an R object. |
... |
arguments passed to other methods. |
A character vector containing the family name.
Extracts the type of family in a consistent way
family_type(object, ...) ## S3 method for class 'family' family_type(object, ...) ## Default S3 method: family_type(object, ...)
family_type(object, ...) ## S3 method for class 'family' family_type(object, ...) ## Default S3 method: family_type(object, ...)
object |
an R object. Currently |
... |
arguments passed to other methods. |
Provides a stats::family()
method for a range of GAM objects.
## S3 method for class 'gam' family(object, ...) ## S3 method for class 'gamm' family(object, ...) ## S3 method for class 'bam' family(object, ...) ## S3 method for class 'list' family(object, ...)
## S3 method for class 'gam' family(object, ...) ## S3 method for class 'gamm' family(object, ...) ## S3 method for class 'bam' family(object, ...) ## S3 method for class 'list' family(object, ...)
object |
a fitted model. Models fitted by |
... |
arguments passed to other methods. |
Expectations (fitted values) of the response drawn from the posterior distribution of fitted model using a Gaussian approximation to the posterior or a simple Metropolis Hastings sampler.
fitted_samples(model, ...) ## S3 method for class 'gam' fitted_samples( model, n = 1, data = newdata, seed = NULL, scale = c("response", "linear_predictor"), method = c("gaussian", "mh", "inla", "user"), n_cores = 1, burnin = 1000, thin = 1, t_df = 40, rw_scale = 0.25, freq = FALSE, unconditional = FALSE, draws = NULL, mvn_method = c("mvnfast", "mgcv"), ..., newdata = NULL, ncores = NULL )
fitted_samples(model, ...) ## S3 method for class 'gam' fitted_samples( model, n = 1, data = newdata, seed = NULL, scale = c("response", "linear_predictor"), method = c("gaussian", "mh", "inla", "user"), n_cores = 1, burnin = 1000, thin = 1, t_df = 40, rw_scale = 0.25, freq = FALSE, unconditional = FALSE, draws = NULL, mvn_method = c("mvnfast", "mgcv"), ..., newdata = NULL, ncores = NULL )
model |
a fitted model of the supported types |
... |
arguments passed to other methods. For |
n |
numeric; the number of posterior samples to return. |
data |
data frame; new observations at which the posterior draws
from the model should be evaluated. If not supplied, the data used to fit
the model will be used for |
seed |
numeric; a random seed for the simulations. |
scale |
character; what scale should the fitted values be returned on?
|
method |
character; which method should be used to draw samples from
the posterior distribution. |
n_cores |
number of cores for generating random variables from a
multivariate normal distribution. Passed to |
burnin |
numeric; number of samples to discard as the burnin draws.
Only used with |
thin |
numeric; the number of samples to skip when taking |
t_df |
numeric; degrees of freedom for t distribution proposals. Only
used with |
rw_scale |
numeric; Factor by which to scale posterior covariance
matrix when generating random walk proposals. Negative or non finite to
skip the random walk step. Only used with |
freq |
logical; |
unconditional |
logical; if |
draws |
matrix; user supplied posterior draws to be used when
|
mvn_method |
character; one of |
newdata |
Deprecated: use |
ncores |
Deprecated; use |
A tibble (data frame) with 3 columns containing the posterior predicted values in long format. The columns are
row
(integer) the row of data
that each posterior draw relates to,
draw
(integer) an index, in range 1:n
, indicating which draw each row
relates to,
response
(numeric) the predicted response for the indicated row of
data
.
Models with offset terms supplied via the offset
argument to
mgcv::gam()
etc. are ignored by mgcv::predict.gam()
. As such, this
kind of offset term is also ignored by posterior_samples()
. Offset terms
that are included in the model formula supplied to mgcv::gam()
etc are
not ignored and the posterior samples produced will reflect those offset
term values. This has the side effect of requiring any new data values
provided to posterior_samples()
via the data
argument must include the
offset variable.
Gavin L. Simpson
Wood, S.N., (2020). Simplified integrated nested Laplace approximation. Biometrika 107, 223–230. doi:10.1093/biomet/asz044
load_mgcv() dat <- data_sim("eg1", n = 1000, dist = "normal", scale = 2, seed = 2) m1 <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = dat, method = "REML") fs <- fitted_samples(m1, n = 5, seed = 42) fs # can generate own set of draws and use them drws <- generate_draws(m1, n = 2, seed = 24) fs2 <- fitted_samples(m1, method = "user", draws = drws) fs2
load_mgcv() dat <- data_sim("eg1", n = 1000, dist = "normal", scale = 2, seed = 2) m1 <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = dat, method = "REML") fs <- fitted_samples(m1, n = 5, seed = 42) fs # can generate own set of draws and use them drws <- generate_draws(m1, n = 2, seed = 24) fs2 <- fitted_samples(m1, method = "user", draws = drws) fs2
Generate fitted values from a estimated GAM
fitted_values(object, ...) ## S3 method for class 'gam' fitted_values( object, data = NULL, scale = c("response", "link", "linear predictor"), ci_level = 0.95, ... ) ## S3 method for class 'gamm' fitted_values(object, ...) ## S3 method for class 'scam' fitted_values(object, ...)
fitted_values(object, ...) ## S3 method for class 'gam' fitted_values( object, data = NULL, scale = c("response", "link", "linear predictor"), ci_level = 0.95, ... ) ## S3 method for class 'gamm' fitted_values(object, ...) ## S3 method for class 'scam' fitted_values(object, ...)
object |
a fitted model. Currently only models fitted by |
... |
arguments passed to |
data |
optional data frame of covariate values for which fitted values are to be returned. |
scale |
character; what scale should the fitted values be returned on?
|
ci_level |
numeric; a value between 0 and 1 indicating the coverage of the credible interval. |
A tibble (data frame) whose first m columns contain either the data
used to fit the model (if data
was NULL
), or the variables supplied to
data
. Four further columns are added:
fitted
: the fitted values on the specified scale,
se
: the standard error of the fitted values (always on the link scale),
lower
, upper
: the limits of the credible interval on the fitted values,
on the specified scale.
Models fitted with certain families will include additional variables
mgcv::ocat()
models: when scale = "repsonse"
, the returned object will
contain a row
column and a category
column, which indicate to which row
of the data
each row of the returned object belongs. Additionally, there
will be nrow(data) * n_categories
rows in the returned object; each row
is the predicted probability for a single category of the response.
For most families, regardless of the scale on which the fitted values
are returned, the se
component of the returned object is on the link
(linear predictor) scale, not the response scale. An exception is the
mgcv::ocat()
family, for which the se
is on the response scale if
scale = "response"
.
load_mgcv() sim_df <- data_sim("eg1", n = 400, dist = "normal", scale = 2, seed = 2) m <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = sim_df, method = "REML") fv <- fitted_values(m) fv
load_mgcv() sim_df <- data_sim("eg1", n = 400, dist = "normal", scale = 2, seed = 2) m <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = sim_df, method = "REML") fv <- fitted_values(m) fv
Identifies which variable, if any, is the model offset, and fixed the name
such that offset(foo(var))
is converted to var
, and possibly sets the
values of that variable to offset_val
.
fix_offset(model, newdata, offset_val = NULL)
fix_offset(model, newdata, offset_val = NULL)
model |
a fitted GAM. |
newdata |
data frame; new values at which to predict at. |
offset_val |
numeric, optional; if provided, then the offset variable
in |
The original newdata
is returned with fixed names and possibly
modified offset variable.
Gavin L. Simpson
load_mgcv() df <- data_sim("eg1", n = 400, dist = "normal", seed = 2) m <- gam(y ~ s(x0) + s(x1) + offset(x2), data = df, method = "REML") names(model.frame(m)) names(fix_offset(m, model.frame(m), offset_val = 1L))
load_mgcv() df <- data_sim("eg1", n = 400, dist = "normal", seed = 2) m <- gam(y ~ s(x0) + s(x1) + offset(x2), data = df, method = "REML") names(model.frame(m)) names(fix_offset(m, model.frame(m), offset_val = 1L))
Extract fixed effects estimates
object |
a fitted GAM |
... |
arguments passed to other methods |
Extract fixed effects estimates from a fitted GAM
## S3 method for class 'gam' fixef(object, ...) ## S3 method for class 'gamm' fixef(object, ...) ## S3 method for class 'lm' fixef(object, ...) ## S3 method for class 'glm' fixef(object, ...) fixed_effects(object, ...) ## Default S3 method: fixed_effects(object, ...)
## S3 method for class 'gam' fixef(object, ...) ## S3 method for class 'gamm' fixef(object, ...) ## S3 method for class 'lm' fixef(object, ...) ## S3 method for class 'glm' fixef(object, ...) fixed_effects(object, ...) ## Default S3 method: fixed_effects(object, ...)
object |
a fitted GAM |
... |
arguments passed to other methods |
load_mgcv() # run example if lme4 is available if (require("lme4")) { data(sleepstudy, package = "lme4") m <- gam( Reaction ~ Days + s(Subject, bs = "re") + s(Days, Subject, bs = "re"), data = sleepstudy, method = "REML" ) fixef(m) }
load_mgcv() # run example if lme4 is available if (require("lme4")) { data(sleepstudy, package = "lme4") m <- gam( Reaction ~ Days + s(Subject, bs = "re") + s(Days, Subject, bs = "re"), data = sleepstudy, method = "REML" ) fixef(m) }
Posterior samples using a simple Metropolis Hastings sampler
gaussian_draws(model, ...) ## S3 method for class 'gam' gaussian_draws( model, n, n_cores = 1L, index = NULL, frequentist = FALSE, unconditional = FALSE, mvn_method = "mvnfast", ... ) ## S3 method for class 'scam' gaussian_draws( model, n, n_cores = 1L, index = NULL, frequentist = FALSE, parametrized = TRUE, mvn_method = "mvnfast", ... )
gaussian_draws(model, ...) ## S3 method for class 'gam' gaussian_draws( model, n, n_cores = 1L, index = NULL, frequentist = FALSE, unconditional = FALSE, mvn_method = "mvnfast", ... ) ## S3 method for class 'scam' gaussian_draws( model, n, n_cores = 1L, index = NULL, frequentist = FALSE, parametrized = TRUE, mvn_method = "mvnfast", ... )
model |
a fitted R model. Currently only models fitted by |
... |
arguments passed to methods. |
n |
numeric; the number of posterior draws to take. |
n_cores |
integer; number of CPU cores to use when generating
multivariate normal distributed random values. Only used if
|
index |
numeric; vector of indices of coefficients to use. Can be used
to subset the mean vector and covariance matrix extracted from |
frequentist |
logical; if |
unconditional |
logical; if |
mvn_method |
character; one of |
parametrized |
logical; use parametrized coefficients and covariance
matrix, which respect the linear inequality constraints of the model. Only
for |
Extract an factor-by smooth by name
get_by_smooth(object, term, level)
get_by_smooth(object, term, level)
object |
a fitted GAM model object. |
term |
character; the name of a smooth term to extract. |
level |
character; which level of the factor to exrtact the smooth for. |
A single smooth object, or a list of smooths if several match the named term.
Extract an mgcv smooth by name
get_smooth(object, term)
get_smooth(object, term)
object |
a fitted GAM model object. |
term |
character; the name of a smooth term to extract |
A single smooth object, or a list of smooths if several match the named term.
Extract an mgcv smooth given its position in the model object
get_smooths_by_id(object, id) ## S3 method for class 'gam' get_smooths_by_id(object, id) ## S3 method for class 'scam' get_smooths_by_id(object, id) ## S3 method for class 'gamm' get_smooths_by_id(object, id) ## S3 method for class 'gamm4' get_smooths_by_id(object, id) ## S3 method for class 'list' get_smooths_by_id(object, id)
get_smooths_by_id(object, id) ## S3 method for class 'gam' get_smooths_by_id(object, id) ## S3 method for class 'scam' get_smooths_by_id(object, id) ## S3 method for class 'gamm' get_smooths_by_id(object, id) ## S3 method for class 'gamm4' get_smooths_by_id(object, id) ## S3 method for class 'list' get_smooths_by_id(object, id)
object |
a fitted GAM model object. |
id |
numeric; the position of the smooth in the model object. |
A subset of the data from the carData::GSSvocab
dataset from the
carData
package, containing observations from 2016 only.
A data frame with 1858 rows and 3 variables:
vocab
: numeric; the number of words out of 10 correct on a vocabulary
test.
nativeBorn
: factor; Was the respondent born in the US? A factor with
levels no
and yes
.
ageGroup
: factor; grouped age of the respondent with levels 18-29
30-39
, 40-49
, 50-59
, and 60+
.
Gu and Wabha test functions
gw_f0(x, ...) gw_f1(x, ...) gw_f2(x, ...) gw_f3(x, ...)
gw_f0(x, ...) gw_f1(x, ...) gw_f2(x, ...) gw_f3(x, ...)
x |
numeric; vector of points to evaluate the function at, on interval (0,1) |
... |
arguments passed to other methods, ignored. |
x <- seq(0, 1, length = 6) gw_f0(x) gw_f1(x) gw_f2(x) gw_f3(x) # should be constant 0
x <- seq(0, 1, length = 6) gw_f0(x) gw_f1(x) gw_f2(x) gw_f3(x) # should be constant 0
Are additional parameters available for a GAM?
has_theta(object)
has_theta(object)
object |
an R object, either a |
A logical; TRUE
if additional parameters available, FALSE
otherwise.
load_mgcv() df <- data_sim("eg1", dist = "poisson", seed = 42, scale = 1 / 5) m <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = df, method = "REML", family = nb() ) has_theta(m) p <- theta(m)
load_mgcv() df <- data_sim("eg1", dist = "poisson", seed = 42, scale = 1 / 5) m <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = df, method = "REML", family = nb() ) has_theta(m) p <- theta(m)
Functions to check if a smooth is a by-variable one and to test of the type of by-variable smooth is a factor-smooth or a continous-smooth interaction.
is_by_smooth(smooth) is_factor_by_smooth(smooth) is_continuous_by_smooth(smooth) by_variable(smooth) by_level(smooth)
is_by_smooth(smooth) is_factor_by_smooth(smooth) is_continuous_by_smooth(smooth) by_variable(smooth) by_level(smooth)
smooth |
an object of class |
A logical vector.
Gavin L. Simpson
Given the name (a term label) of a term in a model, identify if the term is a
factor term or numeric. This is useful when considering interactions, where
terms like fac1:fac2
or num1:fac1
may be requested by the user. Only for
terms of the type fac1:fac2
will this function return TRUE
.
is_factor_term(object, term, ...) ## S3 method for class 'terms' is_factor_term(object, term, ...) ## S3 method for class 'gam' is_factor_term(object, term, ...) ## S3 method for class 'bam' is_factor_term(object, term, ...) ## S3 method for class 'gamm' is_factor_term(object, term, ...) ## S3 method for class 'list' is_factor_term(object, term, ...)
is_factor_term(object, term, ...) ## S3 method for class 'terms' is_factor_term(object, term, ...) ## S3 method for class 'gam' is_factor_term(object, term, ...) ## S3 method for class 'bam' is_factor_term(object, term, ...) ## S3 method for class 'gamm' is_factor_term(object, term, ...) ## S3 method for class 'list' is_factor_term(object, term, ...)
object |
an R object on which method dispatch is performed |
term |
character; the name of a model term, in the sense of
|
... |
arguments passed to other methods. |
A logical: TRUE
if and only if all variables involved in the term
are factors, otherwise FALSE
.
Check if objects are smooths or are a particular type of smooth
is_mgcv_smooth(smooth) stop_if_not_mgcv_smooth(smooth) check_is_mgcv_smooth(smooth) is_mrf_smooth(smooth)
is_mgcv_smooth(smooth) stop_if_not_mgcv_smooth(smooth) check_is_mgcv_smooth(smooth) is_mrf_smooth(smooth)
smooth |
an R object, typically a list |
Check if a smooth inherits from class "mgcv.smooth"
.
stop_if_not_mgcv_smooth()
is a wrapper around is_mgcv_smooth()
, useful
when programming for checking if the supplied object is one of mgcv's
smooths, and throwing a consistent error if not.
check_is_mgcv_smooth()
is similar to stop_if_not_mgcv_smooth()
but
returns the result of is_mgcv_smooth()
invisibly.
Given a character vector of model terms, checks to see which, if any, is the model offset.
is_offset(terms)
is_offset(terms)
terms |
character vector of model terms. |
A logical vector of the same length as terms
.
Gavin L. Simpson
load_mgcv() df <- data_sim("eg1", n = 400, dist = "normal") m <- gam(y ~ s(x0) + s(x1) + offset(x0), data = df, method = "REML") nm <- names(model.frame(m)) nm is_offset(nm)
load_mgcv() df <- data_sim("eg1", n = 400, dist = "normal") m <- gam(y ~ s(x0) + s(x1) + offset(x0), data = df, method = "REML") nm <- names(model.frame(m)) nm is_offset(nm)
Returns the link or its inverse from an estimated model, and provides a simple way to extract these functions from complex models with multiple links, such as location scale models.
link(object, ...) ## S3 method for class 'family' link(object, parameter = NULL, which_eta = NULL, ...) ## S3 method for class 'gam' link(object, parameter = NULL, which_eta = NULL, ...) ## S3 method for class 'bam' link(object, parameter = NULL, which_eta = NULL, ...) ## S3 method for class 'gamm' link(object, ...) ## S3 method for class 'glm' link(object, ...) ## S3 method for class 'list' link(object, ...) inv_link(object, ...) ## S3 method for class 'family' inv_link(object, parameter = NULL, which_eta = NULL, ...) ## S3 method for class 'gam' inv_link(object, parameter = NULL, which_eta = NULL, ...) ## S3 method for class 'bam' inv_link(object, parameter = NULL, which_eta = NULL, ...) ## S3 method for class 'gamm' inv_link(object, ...) ## S3 method for class 'list' inv_link(object, ...) ## S3 method for class 'glm' inv_link(object, ...) extract_link(family, ...) ## S3 method for class 'family' extract_link(family, inverse = FALSE, ...) ## S3 method for class 'general.family' extract_link(family, parameter, inverse = FALSE, which_eta = NULL, ...)
link(object, ...) ## S3 method for class 'family' link(object, parameter = NULL, which_eta = NULL, ...) ## S3 method for class 'gam' link(object, parameter = NULL, which_eta = NULL, ...) ## S3 method for class 'bam' link(object, parameter = NULL, which_eta = NULL, ...) ## S3 method for class 'gamm' link(object, ...) ## S3 method for class 'glm' link(object, ...) ## S3 method for class 'list' link(object, ...) inv_link(object, ...) ## S3 method for class 'family' inv_link(object, parameter = NULL, which_eta = NULL, ...) ## S3 method for class 'gam' inv_link(object, parameter = NULL, which_eta = NULL, ...) ## S3 method for class 'bam' inv_link(object, parameter = NULL, which_eta = NULL, ...) ## S3 method for class 'gamm' inv_link(object, ...) ## S3 method for class 'list' inv_link(object, ...) ## S3 method for class 'glm' inv_link(object, ...) extract_link(family, ...) ## S3 method for class 'family' extract_link(family, inverse = FALSE, ...) ## S3 method for class 'general.family' extract_link(family, parameter, inverse = FALSE, which_eta = NULL, ...)
object |
a family object or a fitted model from which to extract the
family object. Models fitted by |
... |
arguments passed to other methods. |
parameter |
character; which parameter of the distribution. Usually
|
which_eta |
numeric; the linear predictor to extract for families
|
family |
a family object, the result of a call to |
inverse |
logical; return the inverse of the link function? |
Gavin L. Simpson
load_mgcv() link(gaussian()) link(nb()) inv_link(nb()) dat <- data_sim("eg1", seed = 4234) mod <- gam(list(y ~ s(x0) + s(x1) + s(x2) + s(x3), ~1), data = dat, family = gaulss ) link(mod, parameter = "scale") inv_link(mod, parameter = "scale") ## Works with `family` objects too link(shash(), parameter = "skewness")
load_mgcv() link(gaussian()) link(nb()) inv_link(nb()) dat <- data_sim("eg1", seed = 4234) mod <- gam(list(y ~ s(x0) + s(x1) + s(x2) + s(x3), ~1), data = dat, family = gaulss ) link(mod, parameter = "scale") inv_link(mod, parameter = "scale") ## Works with `family` objects too link(shash(), parameter = "skewness")
Simple function that loads the mgcv package whilst suppressing the startup messages that it prints to the console.
load_mgcv()
load_mgcv()
Returns a logical vectors invisibly, indicating whether the package was loaded or not.
lp_matrix()
is a wrapper to predict(..., type = "lpmatrix")
for returning
the linear predictor matrix for the model training data (when data = NULL
),
or user-specified data values supplied via data
.
lp_matrix(model, ...) ## S3 method for class 'gam' lp_matrix(model, data = NULL, ...)
lp_matrix(model, ...) ## S3 method for class 'gam' lp_matrix(model, data = NULL, ...)
model |
a fitted model |
... |
arguments passed to other methods and |
data |
a data frame of values at which to return the linear prediction matrix. |
The linear prediction matrix is a matrix that maps values
of parameters
to values on the linear
predictor of the model
.
is the model matrix where spline
covariates have been replaced by the values of the basis functions evaluated
at the respective covariates. Parametric covariates are also included.
The linear prediction matrix is returned as a matrix. The object
returned is of class "lp_matrix"
, which inherits from classes "matrix"
and "array"
. The special class allows the printing of the matrix to be
controlled, which we do by printing the matrix as a tibble.
load_mgcv() df <- data_sim("eg1", seed = 1) m <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = df) # linear prediction matrix for observed data xp <- lp_matrix(m) ## IGNORE_RDIFF_BEGIN xp ## IGNORE_RDIFF_END # the object `xp` *is* a matrix class(xp) # but we print like a tibble to avoid spamming the R console # linear predictor matrix for new data set ds <- data_slice(m, x2 = evenly(x2)) xp <- lp_matrix(m, data = ds) ## IGNORE_RDIFF_BEGIN xp ## IGNORE_RDIFF_END
load_mgcv() df <- data_sim("eg1", seed = 1) m <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = df) # linear prediction matrix for observed data xp <- lp_matrix(m) ## IGNORE_RDIFF_BEGIN xp ## IGNORE_RDIFF_END # the object `xp` *is* a matrix class(xp) # but we print like a tibble to avoid spamming the R console # linear predictor matrix for new data set ds <- data_slice(m, x2 = evenly(x2)) xp <- lp_matrix(m, data = ds) ## IGNORE_RDIFF_BEGIN xp ## IGNORE_RDIFF_END
Posterior samples using a Gaussian approximation to the posterior distribution
mh_draws(model, ...) ## S3 method for class 'gam' mh_draws( model, n, burnin = 1000, thin = 1, t_df = 40, rw_scale = 0.25, index = NULL, ... )
mh_draws(model, ...) ## S3 method for class 'gam' mh_draws( model, n, burnin = 1000, thin = 1, t_df = 40, rw_scale = 0.25, index = NULL, ... )
model |
a fitted R model. Currently only models fitted by |
... |
arguments passed to methods. |
n |
numeric; the number of posterior draws to take. |
burnin |
numeric; the length of any initial burn in period to discard.
See |
thin |
numeric; retain only |
t_df |
numeric; degrees of freedom for static multivariate t proposal.
See |
rw_scale |
numeric; factor by which to scale posterior covariance
matrix when generating random walk proposals. See |
index |
numeric; vector of indices of coefficients to use. Can be used
to subset the mean vector and covariance matrix extracted from |
Concurvity of an estimated GAM
model_concurvity(model, ...) ## S3 method for class 'gam' model_concurvity( model, terms = everything(), type = c("all", "estimate", "observed", "worst"), pairwise = FALSE, ... ) concrvity( model, terms = everything(), type = c("all", "estimate", "observed", "worst"), pairwise = FALSE, ... )
model_concurvity(model, ...) ## S3 method for class 'gam' model_concurvity( model, terms = everything(), type = c("all", "estimate", "observed", "worst"), pairwise = FALSE, ... ) concrvity( model, terms = everything(), type = c("all", "estimate", "observed", "worst"), pairwise = FALSE, ... )
model |
a fitted GAM. Currently only objects of class |
... |
arguents passed to other methods. |
terms |
currently ignored |
type |
character; |
pairwise |
logical; extract pairwise concurvity of model terms? |
## simulate data with concurvity... library("tibble") load_mgcv() set.seed(8) n <- 200 df <- tibble( t = sort(runif(n)), x = gw_f2(t) + rnorm(n) * 3, y = sin(4 * pi * t) + exp(x / 20) + rnorm(n) * 0.3 ) ## fit model m <- gam(y ~ s(t, k = 15) + s(x, k = 15), data = df, method = "REML") ## overall concurvity o_conc <- concrvity(m) draw(o_conc) ## pairwise concurvity p_conc <- concrvity(m, pairwise = TRUE) draw(p_conc)
## simulate data with concurvity... library("tibble") load_mgcv() set.seed(8) n <- 200 df <- tibble( t = sort(runif(n)), x = gw_f2(t) + rnorm(n) * 3, y = sin(4 * pi * t) + exp(x / 20) + rnorm(n) * 0.3 ) ## fit model m <- gam(y ~ s(t, k = 15) + s(x, k = 15), data = df, method = "REML") ## overall concurvity o_conc <- concrvity(m) draw(o_conc) ## pairwise concurvity p_conc <- concrvity(m, pairwise = TRUE) draw(p_conc)
Extracts the model constant term(s), the model intercept, from a fitted model object.
model_constant(model, ...) ## S3 method for class 'gam' model_constant(model, lp = NULL, ...) ## S3 method for class 'gamlss' model_constant(model, ...)
model_constant(model, ...) ## S3 method for class 'gam' model_constant(model, lp = NULL, ...) ## S3 method for class 'gamlss' model_constant(model, ...)
model |
a fitted model for which a |
... |
arguments passed to other methods. |
lp |
numeric; which linear predictors to extract constant terms for. |
load_mgcv() # simulate a small example df <- data_sim("eg1", seed = 42) # fit the GAM m <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = df, method = "REML") # extract the estimate of the constant term model_constant(m) # same as coef(m)[1L] coef(m)[1L]
load_mgcv() # simulate a small example df <- data_sim("eg1", seed = 42) # fit the GAM m <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = df, method = "REML") # extract the estimate of the constant term model_constant(m) # same as coef(m)[1L] coef(m)[1L]
List the variables involved in a model fitted with a formula
model_vars(model, ...) ## S3 method for class 'gam' model_vars(model, ...) ## Default S3 method: model_vars(model, ...) ## S3 method for class 'bam' model_vars(model, ...) ## S3 method for class 'gamm' model_vars(model, ...) ## S3 method for class 'gamm4' model_vars(model, ...) ## S3 method for class 'list' model_vars(model, ...)
model_vars(model, ...) ## S3 method for class 'gam' model_vars(model, ...) ## Default S3 method: model_vars(model, ...) ## S3 method for class 'bam' model_vars(model, ...) ## S3 method for class 'gamm' model_vars(model, ...) ## S3 method for class 'gamm4' model_vars(model, ...) ## S3 method for class 'list' model_vars(model, ...)
model |
a fitted model object with a |
... |
Arguments passed to other methods. Currently ignored. |
load_mgcv() # simulate some Gaussian data df <- data_sim("eg1", n = 50, seed = 2) # fit a GAM with 1 smooth and 1 linear term m1 <- gam(y ~ s(x2, k = 7) + x1, data = df, method = "REML") model_vars(m1) # fit a lm with two linear terms m2 <- lm(y ~ x2 + x1, data = df) model_vars(m2)
load_mgcv() # simulate some Gaussian data df <- data_sim("eg1", n = 50, seed = 2) # fit a GAM with 1 smooth and 1 linear term m1 <- gam(y ~ s(x2, k = 7) + x1, data = df, method = "REML") model_vars(m1) # fit a lm with two linear terms m2 <- lm(y ~ x2 + x1, data = df) model_vars(m2)
Extracts the number of linear predictors from the fitted model.
n_eta(model, ...) ## S3 method for class 'gam' n_eta(model, ...)
n_eta(model, ...) ## S3 method for class 'gam' n_eta(model, ...)
model |
a fitted model. Currently, only models inheriting from class
|
... |
arguments passed to methods. |
An integer vector of length 1 containing the number of linear predictors in the model.
How many smooths in a fitted model
n_smooths(object) ## Default S3 method: n_smooths(object) ## S3 method for class 'gam' n_smooths(object) ## S3 method for class 'gamm' n_smooths(object) ## S3 method for class 'bam' n_smooths(object)
n_smooths(object) ## Default S3 method: n_smooths(object) ## S3 method for class 'gam' n_smooths(object) ## S3 method for class 'gamm' n_smooths(object) ## S3 method for class 'bam' n_smooths(object)
object |
a fitted GAM or related model. Typically the result of a call
to |
Negative binomial parameter theta
nb_theta(model) ## S3 method for class 'gam' nb_theta(model)
nb_theta(model) ## S3 method for class 'gam' nb_theta(model)
model |
a fitted model. |
A numeric vector of length 1 containing the estimated value of theta.
nb_theta(gam)
: Method for class "gam"
load_mgcv() df <- data_sim("eg1", n = 500, dist = "poisson", scale = 0.1, seed = 6) m <- gam(y ~ s(x0, bs = "cr") + s(x1, bs = "cr") + s(x2, bs = "cr") + s(x3, bs = "cr"), family = nb, data = df, method = "REML") ## IGNORE_RDIFF_BEGIN nb_theta(m) ## IGNORE_RDIFF_END
load_mgcv() df <- data_sim("eg1", n = 500, dist = "poisson", scale = 0.1, seed = 6) m <- gam(y ~ s(x0, bs = "cr") + s(x1, bs = "cr") + s(x2, bs = "cr") + s(x3, bs = "cr"), family = nb, data = df, method = "REML") ## IGNORE_RDIFF_BEGIN nb_theta(m) ## IGNORE_RDIFF_END
Extract the null deviance of a fitted model
null_deviance(model, ...) ## Default S3 method: null_deviance(model, ...)
null_deviance(model, ...) ## Default S3 method: null_deviance(model, ...)
model |
a fitted model |
... |
arguments passed to other methods |
Plot of fitted against observed response values
observed_fitted_plot( model, ylab = NULL, xlab = NULL, title = NULL, subtitle = NULL, caption = NULL, point_col = "black", point_alpha = 1 )
observed_fitted_plot( model, ylab = NULL, xlab = NULL, title = NULL, subtitle = NULL, caption = NULL, point_col = "black", point_alpha = 1 )
model |
a fitted model. Currently only class |
ylab |
character or expression; the label for the y axis. If not supplied, a suitable label will be generated. |
xlab |
character or expression; the label for the y axis. If not supplied, a suitable label will be generated. |
title |
character or expression; the title for the plot. See
|
subtitle |
character or expression; the subtitle for the plot. See
|
caption |
character or expression; the plot caption. See
|
point_col |
colour used to draw points in the plots. See
|
point_alpha |
numeric; alpha transparency for points in plots. |
Provides an overview of a model and the terms in that model
overview(model, ...) ## S3 method for class 'gam' overview( model, parametric = TRUE, random_effects = TRUE, dispersion = NULL, frequentist = FALSE, accuracy = 0.001, stars = FALSE, ... )
overview(model, ...) ## S3 method for class 'gam' overview( model, parametric = TRUE, random_effects = TRUE, dispersion = NULL, frequentist = FALSE, accuracy = 0.001, stars = FALSE, ... )
model |
a fitted model object to overview. |
... |
arguments passed to other methods. |
parametric |
logical; include the model parametric terms in the overview? |
random_effects |
tests of fully penalized smooth terms (those with a
zero-dimensional null space, e.g. random effects) are computationally
expensive and for large data sets producing these p values can take a
very long time. If |
dispersion |
numeric; a known value for the dispersion parameter. The
default |
frequentist |
logical; by default the Bayesian estimated covariance
matrix of the parameter estimates is used to calculate p values for
parametric terms. If |
accuracy |
numeric; accuracy with which to report p values, with p
values below this value displayed as |
stars |
logical; should significance stars be added to the output? |
load_mgcv() df <- data_sim(n = 400, seed = 2) m <- gam(y ~ x3 + s(x0) + s(x1, bs = "bs") + s(x2, bs = "ts"), data = df, method = "REML" ) overview(m)
load_mgcv() df <- data_sim(n = 400, seed = 2) m <- gam(y ~ x3 + s(x0) + s(x1, bs = "bs") + s(x2, bs = "ts"), data = df, method = "REML" ) overview(m)
Estimated values for parametric model terms
parametric_effects(object, ...) ## S3 method for class 'gam' parametric_effects( object, terms = NULL, data = NULL, unconditional = FALSE, unnest = TRUE, ci_level = 0.95, envir = environment(formula(object)), transform = FALSE, ... )
parametric_effects(object, ...) ## S3 method for class 'gam' parametric_effects( object, terms = NULL, data = NULL, unconditional = FALSE, unnest = TRUE, ci_level = 0.95, envir = environment(formula(object)), transform = FALSE, ... )
object |
a fitted model object. |
... |
arguments passed to other methods. |
terms |
character; which model parametric terms should be drawn? The
Default of |
data |
a optional data frame that may or may not be used? FIXME! |
unconditional |
logical; should confidence intervals include the
uncertainty due to smoothness selection? If |
unnest |
logical; unnest the parametric effect objects? |
ci_level |
numeric; the coverage required for the confidence interval. Currently ignored. |
envir |
an environment to look up the data within. |
transform |
logical; if |
Names of any parametric terms in a GAM
parametric_terms(model, ...) ## Default S3 method: parametric_terms(model, ...) ## S3 method for class 'gam' parametric_terms(model, ...)
parametric_terms(model, ...) ## Default S3 method: parametric_terms(model, ...) ## S3 method for class 'gam' parametric_terms(model, ...)
model |
a fitted model. |
... |
arguments passed to other methods. |
Partial derivatives of estimated multivariate smooths via finite differences
partial_derivatives(object, ...) ## Default S3 method: partial_derivatives(object, ...) ## S3 method for class 'gamm' partial_derivatives(object, ...) ## S3 method for class 'gam' partial_derivatives( object, select = NULL, term = deprecated(), focal = NULL, data = newdata, order = 1L, type = c("forward", "backward", "central"), n = 100, eps = 1e-07, interval = c("confidence", "simultaneous"), n_sim = 10000, level = 0.95, unconditional = FALSE, frequentist = FALSE, offset = NULL, ncores = 1, partial_match = FALSE, seed = NULL, ..., newdata = NULL )
partial_derivatives(object, ...) ## Default S3 method: partial_derivatives(object, ...) ## S3 method for class 'gamm' partial_derivatives(object, ...) ## S3 method for class 'gam' partial_derivatives( object, select = NULL, term = deprecated(), focal = NULL, data = newdata, order = 1L, type = c("forward", "backward", "central"), n = 100, eps = 1e-07, interval = c("confidence", "simultaneous"), n_sim = 10000, level = 0.95, unconditional = FALSE, frequentist = FALSE, offset = NULL, ncores = 1, partial_match = FALSE, seed = NULL, ..., newdata = NULL )
object |
an R object to compute derivatives for. |
... |
arguments passed to other methods. |
select |
character; vector of one or more smooth terms for which
derivatives are required. If missing, derivatives for all smooth terms
will be returned. Can be a partial match to a smooth term; see argument
|
term |
|
focal |
character; name of the focal variable. The partial derivative
of the estimated smooth with respect to this variable will be returned.
All other variables involved in the smooth will be held at constant. This
can be missing if supplying |
data |
a data frame containing the values of the model covariates at which to evaluate the first derivatives of the smooths. If supplied, all but one variable must be held at a constant value. |
order |
numeric; the order of derivative. |
type |
character; the type of finite difference used. One of
|
n |
numeric; the number of points to evaluate the derivative at. |
eps |
numeric; the finite difference. |
interval |
character; the type of interval to compute. One of
|
n_sim |
integer; the number of simulations used in computing the simultaneous intervals. |
level |
numeric; |
unconditional |
logical; use smoothness selection-corrected Bayesian covariance matrix? |
frequentist |
logical; use the frequentist covariance matrix? |
offset |
numeric; a value to use for any offset term |
ncores |
number of cores for generating random variables from a
multivariate normal distribution. Passed to |
partial_match |
logical; should smooths be selected by partial matches
with |
seed |
numeric; RNG seed to use. |
newdata |
Deprecated: use |
A tibble, currently with the following variables:
.smooth
: the smooth each row refers to,
.partial_deriv
: the estimated partial derivative,
.se
: the standard error of the estimated partial derivative,
.crit
: the critical value such that derivative
± (crit * se)
gives
the upper and lower bounds of the requested confidence or simultaneous
interval (given level
),
.lower_ci
: the lower bound of the confidence or simultaneous interval,
.upper_ci
: the upper bound of the confidence or simultaneous interval.
partial_derivatives()
will ignore any random effect smooths it
encounters in object
.
Gavin L. Simpson
library("ggplot2") library("patchwork") load_mgcv() df <- data_sim("eg2", n = 2000, dist = "normal", scale = 0.5, seed = 42) # fit the GAM (note: for execution time reasons, k is set articifially low) m <- gam(y ~ te(x, z, k = c(5, 5)), data = df, method = "REML") # data slice through te(x,z) holding z == 0.4 ds <- data_slice(m, x = evenly(x, n = 100), z = 0.4) # evaluate te(x,z) at values of x & z sm <- smooth_estimates(m, select = "te(x,z)", data = ds) |> add_confint() # partial derivatives pd_x <- partial_derivatives(m, data = ds, type = "central", focal = "x") # draw te(x,z) p1 <- draw(m, rug = FALSE) & geom_hline(yintercept = 0.4, linewidth = 1) p1 # draw te(x,z) along slice cap <- expression(z == 0.4) p2 <- sm |> ggplot(aes(x = x, y = .estimate)) + geom_ribbon(aes(ymin = .lower_ci, ymax = .upper_ci), alpha = 0.2) + geom_line() + labs( x = "x", y = "Partial effect", title = "te(x,z)", caption = cap ) p2 # draw partial derivs p3 <- pd_x |> draw() + labs(caption = cap) p3 # draw all three panels p1 + p2 + p3 + plot_layout(ncol = 3)
library("ggplot2") library("patchwork") load_mgcv() df <- data_sim("eg2", n = 2000, dist = "normal", scale = 0.5, seed = 42) # fit the GAM (note: for execution time reasons, k is set articifially low) m <- gam(y ~ te(x, z, k = c(5, 5)), data = df, method = "REML") # data slice through te(x,z) holding z == 0.4 ds <- data_slice(m, x = evenly(x, n = 100), z = 0.4) # evaluate te(x,z) at values of x & z sm <- smooth_estimates(m, select = "te(x,z)", data = ds) |> add_confint() # partial derivatives pd_x <- partial_derivatives(m, data = ds, type = "central", focal = "x") # draw te(x,z) p1 <- draw(m, rug = FALSE) & geom_hline(yintercept = 0.4, linewidth = 1) p1 # draw te(x,z) along slice cap <- expression(z == 0.4) p2 <- sm |> ggplot(aes(x = x, y = .estimate)) + geom_ribbon(aes(ymin = .lower_ci, ymax = .upper_ci), alpha = 0.2) + geom_line() + labs( x = "x", y = "Partial effect", title = "te(x,z)", caption = cap ) p2 # draw partial derivs p3 <- pd_x |> draw() + labs(caption = cap) p3 # draw all three panels p1 + p2 + p3 + plot_layout(ncol = 3)
Partial residuals
partial_residuals(object, ...) ## S3 method for class 'gam' partial_residuals(object, select = NULL, partial_match = FALSE, ...)
partial_residuals(object, ...) ## S3 method for class 'gam' partial_residuals(object, select = NULL, partial_match = FALSE, ...)
object |
an R object, typically a model. Currently only objects of
class |
... |
arguments passed to other methods. |
select |
character, logical, or numeric; which smooths to plot. If
|
partial_match |
logical; should smooths be selected by partial matches
with |
## load mgcv load_mgcv() ## example data - Gu & Wabha four term model df <- data_sim("eg1", n = 400, seed = 42) ## fit the model m <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = df, method = "REML") ## extract partial residuals partial_residuals(m) ## and for a select term partial_residuals(m, select = "s(x2)") ## or with partial matching partial_residuals(m, select = "x", partial_match = TRUE) # returns all
## load mgcv load_mgcv() ## example data - Gu & Wabha four term model df <- data_sim("eg1", n = 400, seed = 42) ## fit the model m <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = df, method = "REML") ## extract partial residuals partial_residuals(m) ## and for a select term partial_residuals(m, select = "s(x2)") ## or with partial matching partial_residuals(m, select = "x", partial_match = TRUE) # returns all
Extract and tidy penalty matrices
penalty(object, ...) ## Default S3 method: penalty( object, rescale = FALSE, data, knots = NULL, constraints = FALSE, diagonalize = FALSE, ... ) ## S3 method for class 'gam' penalty( object, select = NULL, smooth = deprecated(), rescale = FALSE, partial_match = FALSE, ... ) ## S3 method for class 'mgcv.smooth' penalty(object, rescale = FALSE, ...) ## S3 method for class 'tensor.smooth' penalty(object, margins = FALSE, ...) ## S3 method for class 't2.smooth' penalty(object, margins = FALSE, ...) ## S3 method for class 're.smooth.spec' penalty(object, data, ...)
penalty(object, ...) ## Default S3 method: penalty( object, rescale = FALSE, data, knots = NULL, constraints = FALSE, diagonalize = FALSE, ... ) ## S3 method for class 'gam' penalty( object, select = NULL, smooth = deprecated(), rescale = FALSE, partial_match = FALSE, ... ) ## S3 method for class 'mgcv.smooth' penalty(object, rescale = FALSE, ...) ## S3 method for class 'tensor.smooth' penalty(object, margins = FALSE, ...) ## S3 method for class 't2.smooth' penalty(object, margins = FALSE, ...) ## S3 method for class 're.smooth.spec' penalty(object, data, ...)
object |
a fitted GAM or a smooth. |
... |
additional arguments passed to methods. |
rescale |
logical; by default, mgcv will scale the penalty matrix for
better performance in |
data |
data frame; a data frame of values for terms mentioned in the smooth specification. |
knots |
a list or data frame with named components containing
knots locations. Names must match the covariates for which the basis
is required. See |
constraints |
logical; should identifiability constraints be applied to
the smooth basis. See argument |
diagonalize |
logical; if |
select |
character, logical, or numeric; which smooths to extract
penalties for. If |
smooth |
|
partial_match |
logical; should smooths be selected by partial matches
with |
margins |
logical; extract the penalty matrices for the tensor product or the marginal smooths of the tensor product? |
A 'tibble' (data frame) of class penalty_df
inheriting from
tbl_df
, with the following components:
.smooth
- character; the label mgcv uses to refer to the smooth,
.type
- character; the type of smooth,
.penalty
- character; the label for the specific penalty. Some smooths
have multiple penalty matrices, so the penalty
component identifies the
particular penalty matrix and uses the labelling that mgcv uses
internally,
.row
- character; a label of the form fn
where n
is an integer for
the n
th basis function, referencing the columns of the penalty matrix,
.col
- character; a label of the form fn
where n
is an integer for
the n
th basis function, referencing the columns of the penalty matrix,
.value
- double; the value of the penalty matrix for the combination of
row
and col
,
The print()
method uses base::zapsmall()
to turn very small numbers
into 0s for display purposes only; the underlying values of the penalty
matrix or matrices are not changed.
For smooths that are subject to an eigendecomposition (e.g. the default
thin plate regression splines, bs = "tp"
), the signs of the eigenvectors
are not defined and as such you can expect differences across systems in
the penalties for such smooths that are system-, OS-, and CPU architecture-
specific.
Gavin L. Simpson
load_mgcv() dat <- data_sim("eg4", n = 400, seed = 42) m <- gam( y ~ s(x0, bs = "cr") + s(x1, bs = "cr") + s(x2, by = fac, bs = "cr"), data = dat, method = "REML" ) # penalties for all smooths penalty(m) # for a specific smooth penalty(m, select = "s(x2):fac1")
load_mgcv() dat <- data_sim("eg4", n = 400, seed = 42) m <- gam( y ~ s(x0, bs = "cr") + s(x1, bs = "cr") + s(x2, by = fac, bs = "cr"), data = dat, method = "REML" ) # penalties for all smooths penalty(m) # for a specific smooth penalty(m, select = "s(x2):fac1")
Low-level Functions to generate draws from the posterior distribution of model coefficients
Generate posterior draws from a fitted model
post_draws(model, ...) ## Default S3 method: post_draws( model, n, method = c("gaussian", "mh", "inla", "user"), mu = NULL, sigma = NULL, n_cores = 1L, burnin = 1000, thin = 1, t_df = 40, rw_scale = 0.25, index = NULL, frequentist = FALSE, unconditional = FALSE, parametrized = TRUE, mvn_method = c("mvnfast", "mgcv"), draws = NULL, seed = NULL, ... ) generate_draws(model, ...) ## S3 method for class 'gam' generate_draws( model, n, method = c("gaussian", "mh", "inla"), mu = NULL, sigma = NULL, n_cores = 1L, burnin = 1000, thin = 1, t_df = 40, rw_scale = 0.25, index = NULL, frequentist = FALSE, unconditional = FALSE, mvn_method = c("mvnfast", "mgcv"), seed = NULL, ... )
post_draws(model, ...) ## Default S3 method: post_draws( model, n, method = c("gaussian", "mh", "inla", "user"), mu = NULL, sigma = NULL, n_cores = 1L, burnin = 1000, thin = 1, t_df = 40, rw_scale = 0.25, index = NULL, frequentist = FALSE, unconditional = FALSE, parametrized = TRUE, mvn_method = c("mvnfast", "mgcv"), draws = NULL, seed = NULL, ... ) generate_draws(model, ...) ## S3 method for class 'gam' generate_draws( model, n, method = c("gaussian", "mh", "inla"), mu = NULL, sigma = NULL, n_cores = 1L, burnin = 1000, thin = 1, t_df = 40, rw_scale = 0.25, index = NULL, frequentist = FALSE, unconditional = FALSE, mvn_method = c("mvnfast", "mgcv"), seed = NULL, ... )
model |
a fitted R model. Currently only models fitted by |
... |
arguments passed to methods. |
n |
numeric; the number of posterior draws to take. |
method |
character; which algorithm to use to sample from the posterior.
Currently implemented methods are: |
mu |
numeric; user-supplied mean vector (vector of model coefficients). Currently ignored. |
sigma |
matrix; user-supplied covariance matrix for |
n_cores |
integer; number of CPU cores to use when generating
multivariate normal distributed random values. Only used if
|
burnin |
numeric; the length of any initial burn in period to discard.
See |
thin |
numeric; retain only |
t_df |
numeric; degrees of freedom for static multivariate t proposal.
See |
rw_scale |
numeric; factor by which to scale posterior covariance
matrix when generating random walk proposals. See |
index |
numeric; vector of indices of coefficients to use. Can be used
to subset the mean vector and covariance matrix extracted from |
frequentist |
logical; if |
unconditional |
logical; if |
parametrized |
logical; use parametrized coefficients and covariance
matrix, which respect the linear inequality constraints of the model. Only
for |
mvn_method |
character; one of |
draws |
matrix; user supplied posterior draws to be used when
|
seed |
numeric; the random seed to use. If |
Draw samples from the posterior distribution of an estimated model
posterior_samples(model, ...) ## S3 method for class 'gam' posterior_samples( model, n = 1, data = newdata, seed = NULL, method = c("gaussian", "mh", "inla", "user"), n_cores = 1, burnin = 1000, thin = 1, t_df = 40, rw_scale = 0.25, freq = FALSE, unconditional = FALSE, weights = NULL, draws = NULL, mvn_method = c("mvnfast", "mgcv"), ..., newdata = NULL, ncores = NULL )
posterior_samples(model, ...) ## S3 method for class 'gam' posterior_samples( model, n = 1, data = newdata, seed = NULL, method = c("gaussian", "mh", "inla", "user"), n_cores = 1, burnin = 1000, thin = 1, t_df = 40, rw_scale = 0.25, freq = FALSE, unconditional = FALSE, weights = NULL, draws = NULL, mvn_method = c("mvnfast", "mgcv"), ..., newdata = NULL, ncores = NULL )
model |
a fitted model of the supported types |
... |
arguments passed to other methods. For |
n |
numeric; the number of posterior samples to return. |
data |
data frame; new observations at which the posterior draws
from the model should be evaluated. If not supplied, the data used to fit
the model will be used for |
seed |
numeric; a random seed for the simulations. |
method |
character; which method should be used to draw samples from
the posterior distribution. |
n_cores |
number of cores for generating random variables from a
multivariate normal distribution. Passed to |
burnin |
numeric; number of samples to discard as the burnin draws.
Only used with |
thin |
numeric; the number of samples to skip when taking |
t_df |
numeric; degrees of freedom for t distribution proposals. Only
used with |
rw_scale |
numeric; Factor by which to scale posterior covariance
matrix when generating random walk proposals. Negative or non finite to
skip the random walk step. Only used with |
freq |
logical; |
unconditional |
logical; if |
weights |
numeric; a vector of prior weights. If |
draws |
matrix; user supplied posterior draws to be used when
|
mvn_method |
character; one of |
newdata |
Deprecated: use |
ncores |
Deprecated; use |
A tibble (data frame) with 3 columns containing the posterior predicted values in long format. The columns are
row
(integer) the row of data
that each posterior draw relates to,
draw
(integer) an index, in range 1:n
, indicating which draw each row
relates to,
response
(numeric) the predicted response for the indicated row of
data
.
Models with offset terms supplied via the offset
argument to
mgcv::gam()
etc. are ignored by mgcv::predict.gam()
. As such, this
kind of offset term is also ignored by posterior_samples()
. Offset terms
that are included in the model formula supplied to mgcv::gam()
etc are
not ignored and the posterior samples produced will reflect those offset
term values. This has the side effect of requiring any new data values
provided to posterior_samples()
via the data
argument must include the
offset variable.
Gavin L. Simpson
Wood, S.N., (2020). Simplified integrated nested Laplace approximation. Biometrika 107, 223–230. doi:10.1093/biomet/asz044
Predicted values of the response (new response data) are drawn from the
fitted model, created via simulate()
(e.g. simulate.gam()
) and returned
in a tidy, long, format. These predicted values do not include the
uncertainty in the estimated model; they are simply draws from the
conditional distribution of the response.
predicted_samples(model, ...) ## S3 method for class 'gam' predicted_samples( model, n = 1, data = newdata, seed = NULL, weights = NULL, ..., newdata = NULL )
predicted_samples(model, ...) ## S3 method for class 'gam' predicted_samples( model, n = 1, data = newdata, seed = NULL, weights = NULL, ..., newdata = NULL )
model |
a fitted model of the supported types |
... |
arguments passed to other methods. For |
n |
numeric; the number of posterior samples to return. |
data |
data frame; new observations at which the posterior draws
from the model should be evaluated. If not supplied, the data used to fit
the model will be used for |
seed |
numeric; a random seed for the simulations. |
weights |
numeric; a vector of prior weights. If |
newdata |
Deprecated: use |
A tibble (data frame) with 3 columns containing the posterior predicted values in long format. The columns are
row
(integer) the row of data
that each posterior draw relates to,
draw
(integer) an index, in range 1:n
, indicating which draw each row
relates to,
response
(numeric) the predicted response for the indicated row of
data
.
Gavin L. Simpson
load_mgcv() dat <- data_sim("eg1", n = 1000, dist = "normal", scale = 2, seed = 2) m <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = dat, method = "REML") predicted_samples(m, n = 5, seed = 42) ## Can pass arguments to predict.gam() newd <- data.frame( x0 = runif(10), x1 = runif(10), x2 = runif(10), x3 = runif(10) ) ## Exclude s(x2) predicted_samples(m, n = 5, newd, exclude = "s(x2)", seed = 25) ## Exclude s(x1) predicted_samples(m, n = 5, newd, exclude = "s(x1)", seed = 25) ## Select which terms --- result should be the same as previous ## but note that we have to include any parametric terms, including the ## constant term predicted_samples(m, n = 5, newd, seed = 25, terms = c("Intercept", "s(x0)", "s(x2)", "s(x3)") )
load_mgcv() dat <- data_sim("eg1", n = 1000, dist = "normal", scale = 2, seed = 2) m <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = dat, method = "REML") predicted_samples(m, n = 5, seed = 42) ## Can pass arguments to predict.gam() newd <- data.frame( x0 = runif(10), x1 = runif(10), x2 = runif(10), x3 = runif(10) ) ## Exclude s(x2) predicted_samples(m, n = 5, newd, exclude = "s(x2)", seed = 25) ## Exclude s(x1) predicted_samples(m, n = 5, newd, exclude = "s(x1)", seed = 25) ## Select which terms --- result should be the same as previous ## but note that we have to include any parametric terms, including the ## constant term predicted_samples(m, n = 5, newd, seed = 25, terms = c("Intercept", "s(x0)", "s(x2)", "s(x3)") )
Quantile-quantile plots (QQ-plots) for GAMs using the reference quantiles of Augustin et al (2012).
qq_plot(model, ...) ## Default S3 method: qq_plot(model, ...) ## S3 method for class 'gam' qq_plot( model, method = c("uniform", "simulate", "normal", "direct"), type = c("deviance", "response", "pearson"), n_uniform = 10, n_simulate = 50, seed = NULL, level = 0.9, ylab = NULL, xlab = NULL, title = NULL, subtitle = NULL, caption = NULL, ci_col = "black", ci_alpha = 0.2, point_col = "black", point_alpha = 1, line_col = "red", ... ) ## S3 method for class 'glm' qq_plot(model, ...) ## S3 method for class 'lm' qq_plot(model, ...)
qq_plot(model, ...) ## Default S3 method: qq_plot(model, ...) ## S3 method for class 'gam' qq_plot( model, method = c("uniform", "simulate", "normal", "direct"), type = c("deviance", "response", "pearson"), n_uniform = 10, n_simulate = 50, seed = NULL, level = 0.9, ylab = NULL, xlab = NULL, title = NULL, subtitle = NULL, caption = NULL, ci_col = "black", ci_alpha = 0.2, point_col = "black", point_alpha = 1, line_col = "red", ... ) ## S3 method for class 'glm' qq_plot(model, ...) ## S3 method for class 'lm' qq_plot(model, ...)
model |
a fitted model. Currently models inheriting from class |
... |
arguments passed ot other methods. |
method |
character; method used to generate theoretical quantiles.
The default is Note that |
type |
character; type of residuals to use. Only |
n_uniform |
numeric; number of times to randomize uniform quantiles
in the direct computation method ( |
n_simulate |
numeric; number of data sets to simulate from the estimated
model when using the simulation method ( |
seed |
numeric; the random number seed to use for |
level |
numeric; the coverage level for reference intervals. Must be
strictly |
ylab |
character or expression; the label for the y axis. If not supplied, a suitable label will be generated. |
xlab |
character or expression; the label for the y axis. If not supplied, a suitable label will be generated. |
title |
character or expression; the title for the plot. See
|
subtitle |
character or expression; the subtitle for the plot. See
|
caption |
character or expression; the plot caption. See
|
ci_col |
fill colour for the reference interval when
|
ci_alpha |
alpha transparency for the reference
interval when |
point_col |
colour of points on the QQ plot. |
point_alpha |
alpha transparency of points on the QQ plot. |
line_col |
colour used to draw the reference line. |
The wording used in mgcv::qq.gam()
uses direct in reference to the
simulated residuals method (method = "simulated"
). To avoid confusion,
method = "direct"
is deprecated in favour of method = "uniform"
.
The underlying methodology used when method
is "simulate"
or "uniform"
is described in Augustin et al (2012):
Augustin, N.H., Sauleau, E.-A., Wood, S.N., (2012) On quantile quantile plots for generalized linear models. Computational Statatistics and Data Analysis 56, 2404-2409 doi:10.1016/j.csda.2012.01.026.
mgcv::qq.gam for more details on the methods used.
load_mgcv() ## simulate binomial data... dat <- data_sim("eg1", n = 200, dist = "binary", scale = .33, seed = 0) p <- binomial()$linkinv(dat$f) # binomial p n <- sample(c(1, 3), 200, replace = TRUE) # binomial n dat <- transform(dat, y = rbinom(n, n, p), n = n) m <- gam(y / n ~ s(x0) + s(x1) + s(x2) + s(x3), family = binomial, data = dat, weights = n, method = "REML" ) ## Q-Q plot; default using direct randomization of uniform quantiles qq_plot(m) ## Alternatively use simulate new data from the model, which ## allows construction of reference intervals for the Q-Q plot qq_plot(m, method = "simulate", seed = 42, point_col = "steelblue", point_alpha = 0.4 ) ## ... or use the usual normality assumption qq_plot(m, method = "normal")
load_mgcv() ## simulate binomial data... dat <- data_sim("eg1", n = 200, dist = "binary", scale = .33, seed = 0) p <- binomial()$linkinv(dat$f) # binomial p n <- sample(c(1, 3), 200, replace = TRUE) # binomial n dat <- transform(dat, y = rbinom(n, n, p), n = n) m <- gam(y / n ~ s(x0) + s(x1) + s(x2) + s(x3), family = binomial, data = dat, weights = n, method = "REML" ) ## Q-Q plot; default using direct randomization of uniform quantiles qq_plot(m) ## Alternatively use simulate new data from the model, which ## allows construction of reference intervals for the Q-Q plot qq_plot(m, method = "simulate", seed = 42, point_col = "steelblue", point_alpha = 0.4 ) ## ... or use the usual normality assumption qq_plot(m, method = "normal")
Extracts the reference or a specific level the supplied factor, returning it as a factor with the same levels as the one supplied.
ref_level(fct) level(fct, level)
ref_level(fct) level(fct, level)
fct |
factor; the factor from which the reference or specific level will be extracted. |
level |
character; the specific level to extract in the case of
|
A length 1 factor with the same levels as the supplied factor fct
.
f <- factor(sample(letters[1:5], 100, replace = TRUE)) # the reference level ref_level(f) # a specific level level(f, level = "b") # note that the levels will always match the input factor identical(levels(f), levels(ref_level(f))) identical(levels(f), levels(level(f, "c")))
f <- factor(sample(letters[1:5], 100, replace = TRUE)) # the reference level ref_level(f) # a specific level level(f, level = "b") # note that the levels will always match the input factor identical(levels(f), levels(ref_level(f))) identical(levels(f), levels(level(f, "c")))
A set of reference objects for testing data_sim()
.
A named list of simulated data sets created by data_sim()
.
Function to repeat the first level of a factor n times and return this vector as a factor with the original levels intact
rep_first_factor_value(f, n)
rep_first_factor_value(f, n)
f |
a factor |
n |
numeric; the number of times to repeat the first level of |
A factor of length n
with the levels of f
, but whose elements
are all the first level of f
.
Histogram of model residuals
residuals_hist_plot( model, type = c("deviance", "pearson", "response"), n_bins = c("sturges", "scott", "fd"), ylab = NULL, xlab = NULL, title = NULL, subtitle = NULL, caption = NULL )
residuals_hist_plot( model, type = c("deviance", "pearson", "response"), n_bins = c("sturges", "scott", "fd"), ylab = NULL, xlab = NULL, title = NULL, subtitle = NULL, caption = NULL )
model |
a fitted model. Currently only class |
type |
character; type of residuals to use. Only |
n_bins |
character or numeric; either the number of bins or a string indicating how to calculate the number of bins. |
ylab |
character or expression; the label for the y axis. If not supplied, a suitable label will be generated. |
xlab |
character or expression; the label for the y axis. If not supplied, a suitable label will be generated. |
title |
character or expression; the title for the plot. See
|
subtitle |
character or expression; the subtitle for the plot. See
|
caption |
character or expression; the plot caption. See
|
Plot of residuals versus linear predictor values
residuals_linpred_plot( model, type = c("deviance", "pearson", "response"), ylab = NULL, xlab = NULL, title = NULL, subtitle = NULL, caption = NULL, point_col = "black", point_alpha = 1, line_col = "red" )
residuals_linpred_plot( model, type = c("deviance", "pearson", "response"), ylab = NULL, xlab = NULL, title = NULL, subtitle = NULL, caption = NULL, point_col = "black", point_alpha = 1, line_col = "red" )
model |
a fitted model. Currently only class |
type |
character; type of residuals to use. Only |
ylab |
character or expression; the label for the y axis. If not supplied, a suitable label will be generated. |
xlab |
character or expression; the label for the y axis. If not supplied, a suitable label will be generated. |
title |
character or expression; the title for the plot. See
|
subtitle |
character or expression; the subtitle for the plot. See
|
caption |
character or expression; the plot caption. See
|
point_col |
colour used to draw points in the plots. See
|
point_alpha |
numeric; alpha transparency for points in plots. |
line_col |
colour specification for 1:1 line. |
Derivatives on the response scale from an estimated GAM
response_derivatives(object, ...) ## Default S3 method: response_derivatives(object, ...) ## S3 method for class 'gamm' response_derivatives(object, ...) ## S3 method for class 'gam' response_derivatives( object, focal = NULL, data = NULL, order = 1L, type = c("forward", "backward", "central"), scale = c("response", "linear_predictor"), method = c("gaussian", "mh", "inla", "user"), n = 100, eps = 1e-07, n_sim = 10000, level = 0.95, seed = NULL, mvn_method = c("mvnfast", "mgcv"), ... )
response_derivatives(object, ...) ## Default S3 method: response_derivatives(object, ...) ## S3 method for class 'gamm' response_derivatives(object, ...) ## S3 method for class 'gam' response_derivatives( object, focal = NULL, data = NULL, order = 1L, type = c("forward", "backward", "central"), scale = c("response", "linear_predictor"), method = c("gaussian", "mh", "inla", "user"), n = 100, eps = 1e-07, n_sim = 10000, level = 0.95, seed = NULL, mvn_method = c("mvnfast", "mgcv"), ... )
object |
an R object to compute derivatives for. |
... |
arguments passed to other methods and on to |
focal |
character; name of the focal variable. The response derivative
of the response with respect to this variable will be returned.
All other variables involved in the model will be held at constant values.
This can be missing if supplying |
data |
a data frame containing the values of the model covariates at which to evaluate the first derivatives of the smooths. If supplied, all but one variable must be held at a constant value. |
order |
numeric; the order of derivative. |
type |
character; the type of finite difference used. One of
|
scale |
character; should the derivative be estimated on the response
or the linear predictor (link) scale? One of |
method |
character; which method should be used to draw samples from
the posterior distribution. |
n |
numeric; the number of points to evaluate the derivative at (if
|
eps |
numeric; the finite difference. |
n_sim |
integer; the number of simulations used in computing the simultaneous intervals. |
level |
numeric; |
seed |
numeric; a random seed for the simulations. |
mvn_method |
character; one of |
A tibble, currently with the following variables:
.row
: integer, indexing the row of data
each row in the output
represents
.focal
: the name of the variable for which the partial derivative was
evaluated,
.derivative
: the estimated partial derivative,
.lower_ci
: the lower bound of the confidence or simultaneous interval,
.upper_ci
: the upper bound of the confidence or simultaneous interval,
additional columns containing the covariate values at which the derivative was evaluated.
Gavin L. Simpson
library("ggplot2") library("patchwork") load_mgcv() df <- data_sim("eg1", dist = "negbin", scale = 0.25, seed = 42) # fit the GAM (note: for execution time reasons using bam()) m <- bam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = df, family = nb(), method = "fREML" ) # data slice through data along x2 - all other covariates will be set to # typical values (value closest to median) ds <- data_slice(m, x2 = evenly(x2, n = 100)) # fitted values along x2 fv <- fitted_values(m, data = ds) # response derivatives - ideally n_sim = >10000 y_d <- response_derivatives(m, data = ds, type = "central", focal = "x2", eps = 0.01, seed = 21, n_sim = 1000 ) # draw fitted values along x2 p1 <- fv |> ggplot(aes(x = x2, y = .fitted)) + geom_ribbon(aes(ymin = .lower_ci, ymax = .upper_ci, y = NULL), alpha = 0.2 ) + geom_line() + labs( title = "Estimated count as a function of x2", y = "Estimated count" ) # draw response derivatives p2 <- y_d |> ggplot(aes(x = x2, y = .derivative)) + geom_ribbon(aes(ymin = .lower_ci, ymax = .upper_ci), alpha = 0.2) + geom_line() + labs( title = "Estimated 1st derivative of estimated count", y = "First derivative" ) # draw both panels p1 + p2 + plot_layout(nrow = 2)
library("ggplot2") library("patchwork") load_mgcv() df <- data_sim("eg1", dist = "negbin", scale = 0.25, seed = 42) # fit the GAM (note: for execution time reasons using bam()) m <- bam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = df, family = nb(), method = "fREML" ) # data slice through data along x2 - all other covariates will be set to # typical values (value closest to median) ds <- data_slice(m, x2 = evenly(x2, n = 100)) # fitted values along x2 fv <- fitted_values(m, data = ds) # response derivatives - ideally n_sim = >10000 y_d <- response_derivatives(m, data = ds, type = "central", focal = "x2", eps = 0.01, seed = 21, n_sim = 1000 ) # draw fitted values along x2 p1 <- fv |> ggplot(aes(x = x2, y = .fitted)) + geom_ribbon(aes(ymin = .lower_ci, ymax = .upper_ci, y = NULL), alpha = 0.2 ) + geom_line() + labs( title = "Estimated count as a function of x2", y = "Estimated count" ) # draw response derivatives p2 <- y_d |> ggplot(aes(x = x2, y = .derivative)) + geom_ribbon(aes(ymin = .lower_ci, ymax = .upper_ci), alpha = 0.2) + geom_line() + labs( title = "Estimated 1st derivative of estimated count", y = "First derivative" ) # draw both panels p1 + p2 + plot_layout(nrow = 2)
A rootogram is a model diagnostic tool that assesses the goodness of fit of
a statistical model. The observed values of the response are compared with
those expected from the fitted model. For discrete, count responses, the
frequency of each count (0, 1, 2, etc) in the observed data and expected
from the conditional distribution of the response implied by the model are
compared. For continuous variables, the observed and expected frequencies
are obtained by grouping the data into bins. The rootogram is drawn using
ggplot2::ggplot()
graphics. The design closely follows Kleiber & Zeileis
(2016).
rootogram(object, ...) ## S3 method for class 'gam' rootogram(object, max_count = NULL, breaks = "Sturges", ...)
rootogram(object, ...) ## S3 method for class 'gam' rootogram(object, max_count = NULL, breaks = "Sturges", ...)
object |
an R object |
... |
arguments passed to other methods |
max_count |
integer; the largest count to consider |
breaks |
for continuous responses, how to group the response. Can be
anything that is acceptable as the |
Kleiber, C., Zeileis, A., (2016) Visualizing Count Data Regressions Using Rootograms. Am. Stat. 70, 296–303. doi:10.1080/00031305.2016.1173590
load_mgcv() df <- data_sim("eg1", n = 1000, dist = "poisson", scale = 0.1, seed = 6) # A poisson example m <- gam(y ~ s(x0, bs = "cr") + s(x1, bs = "cr") + s(x2, bs = "cr") + s(x3, bs = "cr"), family = poisson(), data = df, method = "REML") rg <- rootogram(m) rg draw(rg) # plot the rootogram # A Gaussian example df <- data_sim("eg1", dist = "normal", seed = 2) m <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = df, method = "REML") draw(rootogram(m, breaks = "FD"), type = "suspended")
load_mgcv() df <- data_sim("eg1", n = 1000, dist = "poisson", scale = 0.1, seed = 6) # A poisson example m <- gam(y ~ s(x0, bs = "cr") + s(x1, bs = "cr") + s(x2, bs = "cr") + s(x3, bs = "cr"), family = poisson(), data = df, method = "REML") rg <- rootogram(m) rg draw(rg) # plot the rootogram # A Gaussian example df <- data_sim("eg1", dist = "normal", seed = 2) m <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = df, method = "REML") draw(rootogram(m, breaks = "FD"), type = "suspended")
Creates a sequence of n
evenly-spaced values over the range
min(x)
– max(x)
, where the minimum and maximum are adjusted such that
they are always contained within the range of x
when x
may be shifted
forwards or backwards by an amount related to eps
. This is particularly
useful in computing derivatives via finite differences where without this
adjustment we may be predicting for values outside the range of the data
and hence the conmstraints of the penalty.
seq_min_max_eps(x, n, order, type = c("forward", "backward", "central"), eps)
seq_min_max_eps(x, n, order, type = c("forward", "backward", "central"), eps)
x |
numeric; vector over which evenly-spaced values are returned |
n |
numeric; the number of evenly-spaced values to return |
order |
integer; the order of derivative. Either |
type |
character; the type of finite difference used. One of
|
eps |
numeric; the finite difference |
A numeric vector of length n
.
eps
Shift numeric values in a data frame by an amount eps
shift_values(df, h, i, FUN = `+`, focal = NULL)
shift_values(df, h, i, FUN = `+`, focal = NULL)
df |
a data frame or tibble. |
h |
numeric; the amount to shift values in |
i |
logical; a vector indexing columns of |
FUN |
function; a function to applut the shift. Typically |
focal |
character; the focal variable when computing partial
derivatives. This allows shifting only the focal variable by |
Simulations from the posterior distribution of a fitted GAM model involve computing predicted values for the observation data for which simulated data are required, then generating random draws from the probability distribution used when fitting the model.
## S3 method for class 'gam' simulate( object, nsim = 1, seed = NULL, data = newdata, weights = NULL, ..., newdata = NULL ) ## S3 method for class 'gamm' simulate( object, nsim = 1, seed = NULL, data = newdata, weights = NULL, ..., newdata = NULL ) ## S3 method for class 'scam' simulate( object, nsim = 1, seed = NULL, data = newdata, weights = NULL, ..., newdata = NULL )
## S3 method for class 'gam' simulate( object, nsim = 1, seed = NULL, data = newdata, weights = NULL, ..., newdata = NULL ) ## S3 method for class 'gamm' simulate( object, nsim = 1, seed = NULL, data = newdata, weights = NULL, ..., newdata = NULL ) ## S3 method for class 'scam' simulate( object, nsim = 1, seed = NULL, data = newdata, weights = NULL, ..., newdata = NULL )
object |
a fitted GAM, typically the result of a call to mgcv::gam'
or |
nsim |
numeric; the number of posterior simulations to return. |
seed |
numeric; a random seed for the simulations. |
data |
data frame; new observations at which the posterior draws
from the model should be evaluated. If not supplied, the data used to fit
the model will be used for |
weights |
numeric; a vector of prior weights. If |
... |
arguments passed to methods. |
newdata |
Deprecated. Use |
For simulate.gam()
to function, the family
component of the fitted
model must contain, or be updateable to contain, the required random
number generator. See mgcv::fix.family.rd()
.
(Currently) A matrix with nsim
columns.
Gavin L. Simpson
load_mgcv() dat <- data_sim("eg1", n = 400, dist = "normal", scale = 2, seed = 2) m1 <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = dat, method = "REML") sims <- simulate(m1, nsim = 5, seed = 42) head(sims)
load_mgcv() dat <- data_sim("eg1", n = 400, dist = "normal", scale = 2, seed = 2) m1 <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = dat, method = "REML") sims <- simulate(m1, nsim = 5, seed = 42) head(sims)
A dataset containing lead-210 based age depth measurements for the SMALL1 core from Small Water.
A data frame with 12 rows and 7 variables.
The variables are as follows:
Depth
Drymass
Date
Age
Error
SedAccRate
SedPerCentChange
Simpson, G.L. (Unpublished data).
Returns a vector of indices of the parametric terms that represent the supplied smooth. Useful for extracting model coefficients and columns of their covariance matrix.
smooth_coef_indices(smooth)
smooth_coef_indices(smooth)
smooth |
an object that inherits from class |
A numeric vector of indices.
Gavin L. Simpson
smooth_coefs()
for extracting the coefficients for a particular
smooth.
Returns a vector of model coefficients of the parametric terms that represent the supplied smooth.
smooth_coefs(object, ...) ## S3 method for class 'gam' smooth_coefs(object, select, term = deprecated(), ...) ## S3 method for class 'bam' smooth_coefs(object, select, term = deprecated(), ...) ## S3 method for class 'gamm' smooth_coefs(object, select, term = deprecated(), ...) ## S3 method for class 'gamm4' smooth_coefs(object, select, term = deprecated(), ...) ## S3 method for class 'list' smooth_coefs(object, select, term = deprecated(), ...) ## S3 method for class 'mgcv.smooth' smooth_coefs(object, model, ...) ## S3 method for class 'scam' smooth_coefs(object, select, term = deprecated(), ...)
smooth_coefs(object, ...) ## S3 method for class 'gam' smooth_coefs(object, select, term = deprecated(), ...) ## S3 method for class 'bam' smooth_coefs(object, select, term = deprecated(), ...) ## S3 method for class 'gamm' smooth_coefs(object, select, term = deprecated(), ...) ## S3 method for class 'gamm4' smooth_coefs(object, select, term = deprecated(), ...) ## S3 method for class 'list' smooth_coefs(object, select, term = deprecated(), ...) ## S3 method for class 'mgcv.smooth' smooth_coefs(object, model, ...) ## S3 method for class 'scam' smooth_coefs(object, select, term = deprecated(), ...)
object |
a fitted GAM(M) object, or, for the |
... |
arguments passed to other methods. |
select |
character; the label of the smooth whose coefficients will be returned. |
term |
|
model |
a fitted GAM(M) object. |
A numeric vector of model coefficients.
Gavin L. Simpson
smooth_coef_indices()
for extracting the indices of the
coefficients for a particular smooth.
load_mgcv() df <- data_sim("eg1", seed = 2) m <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = df, method = "REML") ## IGNORE_RDIFF_BEGIN smooth_coefs(m, select = "s(x2)") ## IGNORE_RDIFF_END
load_mgcv() df <- data_sim("eg1", seed = 2) m <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = df, method = "REML") ## IGNORE_RDIFF_BEGIN smooth_coefs(m, select = "s(x2)") ## IGNORE_RDIFF_END
Generate regular data over the covariates of a smooth
smooth_data( model, id, n = 100, n_2d = NULL, n_3d = NULL, n_4d = NULL, offset = NULL, include_all = FALSE, var_order = NULL )
smooth_data( model, id, n = 100, n_2d = NULL, n_3d = NULL, n_4d = NULL, offset = NULL, include_all = FALSE, var_order = NULL )
model |
a fitted model |
id |
the number ID of the smooth within |
n |
numeric; the number of new observations to generate. |
n_2d |
numeric; the number of new observations to generate for the second dimension of a 2D smooth. Currently ignored. |
n_3d |
numeric; the number of new observations to generate for the third dimension of a 3D smooth. |
n_4d |
numeric; the number of new observations to generate for the
dimensions higher than 2 (!) of a kD smooth (k >= 4). For example, if
the smooth is a 4D smooth, each of dimensions 3 and 4 will get |
offset |
numeric; value of the model offset to use. |
include_all |
logical; include all covariates involved in the smooth?
if |
var_order |
character; the order in which the terms in the smooth should be processed. Only useful for tensor products with at least one 2d marginal smooth. |
load_mgcv() df <- data_sim("eg1", seed = 42) m <- bam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = df) # generate data over range of x1 for smooth s(x1) smooth_data(m, id = 2) # generate data over range of x1 for smooth s(x1), with typical value for # other covariates in the model smooth_data(m, id = 2, include_all = TRUE)
load_mgcv() df <- data_sim("eg1", seed = 42) m <- bam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = df) # generate data over range of x1 for smooth s(x1) smooth_data(m, id = 2) # generate data over range of x1 for smooth s(x1), with typical value for # other covariates in the model smooth_data(m, id = 2, include_all = TRUE)
Extracts the dimension of an estimated smooth.
smooth_dim(object) ## S3 method for class 'gam' smooth_dim(object) ## S3 method for class 'gamm' smooth_dim(object) ## S3 method for class 'mgcv.smooth' smooth_dim(object)
smooth_dim(object) ## S3 method for class 'gam' smooth_dim(object) ## S3 method for class 'gamm' smooth_dim(object) ## S3 method for class 'mgcv.smooth' smooth_dim(object)
object |
an R object. See Details for list of supported objects. |
This is a generic function with methods for objects of class
"gam"
, "gamm"
, and "mgcv.smooth"
.
A numeric vector of dimensions for each smooth.
Gavin L. Simpson
Evaluate a smooth at a grid of evenly spaced value over the range of the
covariate associated with the smooth. Alternatively, a set of points at which
the smooth should be evaluated can be supplied. smooth_estimates()
is a new
implementation of evaluate_smooth()
, and replaces that function, which has
been removed from the package.
smooth_estimates(object, ...) ## S3 method for class 'gam' smooth_estimates( object, select = NULL, smooth = deprecated(), n = 100, n_3d = 16, n_4d = 4, data = NULL, unconditional = FALSE, overall_uncertainty = TRUE, dist = NULL, unnest = TRUE, partial_match = FALSE, ... )
smooth_estimates(object, ...) ## S3 method for class 'gam' smooth_estimates( object, select = NULL, smooth = deprecated(), n = 100, n_3d = 16, n_4d = 4, data = NULL, unconditional = FALSE, overall_uncertainty = TRUE, dist = NULL, unnest = TRUE, partial_match = FALSE, ... )
object |
an object of class |
... |
arguments passed to other methods. |
select |
character; select which smooth's posterior to draw from.
The default ( |
smooth |
|
n |
numeric; the number of points over the range of the covariate at which to evaluate the smooth. |
n_3d , n_4d
|
numeric; the number of points over the range of last
covariate in a 3D or 4D smooth. The default is |
data |
a data frame of covariate values at which to evaluate the smooth. |
unconditional |
logical; should confidence intervals include the
uncertainty due to smoothness selection? If |
overall_uncertainty |
logical; should the uncertainty in the model constant term be included in the standard error of the evaluate values of the smooth? |
dist |
numeric; if greater than 0, this is used to determine when
a location is too far from data to be plotted when plotting 2-D smooths.
The data are scaled into the unit square before deciding what to exclude,
and |
unnest |
logical; unnest the smooth objects? |
partial_match |
logical; in the case of character |
A data frame (tibble), which is of class "smooth_estimates"
.
load_mgcv() dat <- data_sim("eg1", n = 400, dist = "normal", scale = 2, seed = 2) m1 <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = dat, method = "REML") ## evaluate all smooths smooth_estimates(m1) ## or selected smooths smooth_estimates(m1, select = c("s(x0)", "s(x1)"))
load_mgcv() dat <- data_sim("eg1", n = 400, dist = "normal", scale = 2, seed = 2) m1 <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = dat, method = "REML") ## evaluate all smooths smooth_estimates(m1) ## or selected smooths smooth_estimates(m1, select = c("s(x0)", "s(x1)"))
The label 'mgcv' uses for smooths is useful in many contexts, including
selecting smooths or labelling plots. smooth_label()
extracts this label
from an 'mgcv' smooth object, i.e. an object that inherits from class
"mgcv.smooth"
. These would typically be found in the $smooth
component of
a GAM fitted by mgcv::gam()
or mgcv::bam()
, or related functions.
smooth_label(object, ...) ## S3 method for class 'gam' smooth_label(object, id, ...) ## S3 method for class 'mgcv.smooth' smooth_label(object, ...)
smooth_label(object, ...) ## S3 method for class 'gam' smooth_label(object, id, ...) ## S3 method for class 'mgcv.smooth' smooth_label(object, ...)
object |
an R object. Currently, methods for class |
... |
arguments passed to other methods. |
id |
numeric; the indices of the smooths whose labels are to be extracted. If missing, labels for all smooths in the model are returned. |
A character vector.
load_mgcv() df <- data_sim("gwf2", n = 100) m <- gam(y ~ s(x), data = df, method = "REML") # extract the smooth sm <- get_smooths_by_id(m, id = 1)[[1]] # extract the label smooth_label(sm) # or directly on the fitted GAM smooth_label(m$smooth[[1]]) # or extract labels by idex/position smooth_label(m, id = 1)
load_mgcv() df <- data_sim("gwf2", n = 100) m <- gam(y ~ s(x), data = df, method = "REML") # extract the smooth sm <- get_smooths_by_id(m, id = 1)[[1]] # extract the label smooth_label(sm) # or directly on the fitted GAM smooth_label(m$smooth[[1]]) # or extract labels by idex/position smooth_label(m, id = 1)
Returns draws from the posterior distributions of smooth functions in a GAM. Useful, for example, for visualising the uncertainty in individual estimated functions.
smooth_samples(model, ...) ## S3 method for class 'gam' smooth_samples( model, select = NULL, term = deprecated(), n = 1, data = newdata, method = c("gaussian", "mh", "inla", "user"), seed = NULL, freq = FALSE, unconditional = FALSE, n_cores = 1L, n_vals = 200, burnin = 1000, thin = 1, t_df = 40, rw_scale = 0.25, rng_per_smooth = FALSE, draws = NULL, partial_match = NULL, mvn_method = c("mvnfast", "mgcv"), ..., newdata = NULL, ncores = NULL )
smooth_samples(model, ...) ## S3 method for class 'gam' smooth_samples( model, select = NULL, term = deprecated(), n = 1, data = newdata, method = c("gaussian", "mh", "inla", "user"), seed = NULL, freq = FALSE, unconditional = FALSE, n_cores = 1L, n_vals = 200, burnin = 1000, thin = 1, t_df = 40, rw_scale = 0.25, rng_per_smooth = FALSE, draws = NULL, partial_match = NULL, mvn_method = c("mvnfast", "mgcv"), ..., newdata = NULL, ncores = NULL )
model |
a fitted model of the supported types |
... |
arguments passed to other methods. For |
select |
character; select which smooth's posterior to draw from.
The default ( |
term |
|
n |
numeric; the number of posterior samples to return. |
data |
data frame; new observations at which the posterior draws
from the model should be evaluated. If not supplied, the data used to fit
the model will be used for |
method |
character; which method should be used to draw samples from
the posterior distribution. |
seed |
numeric; a random seed for the simulations. |
freq |
logical; |
unconditional |
logical; if |
n_cores |
number of cores for generating random variables from a
multivariate normal distribution. Passed to |
n_vals |
numeric; how many locations to evaluate the smooth at if
|
burnin |
numeric; number of samples to discard as the burnin draws.
Only used with |
thin |
numeric; the number of samples to skip when taking |
t_df |
numeric; degrees of freedom for t distribution proposals. Only
used with |
rw_scale |
numeric; Factor by which to scale posterior covariance
matrix when generating random walk proposals. Negative or non finite to
skip the random walk step. Only used with |
rng_per_smooth |
logical; if TRUE, the behaviour of gratia version 0.8.1 or earlier is used, whereby a separate call the the random number generator (RNG) is performed for each smooth. If FALSE, a single call to the RNG is performed for all model parameters |
draws |
matrix; user supplied posterior draws to be used when
|
partial_match |
logical; should smooths be selected by partial matches
with |
mvn_method |
character; one of |
newdata |
Deprecated: use |
ncores |
Deprecated; use |
A tibble with additional classes "smooth_samples"
and
'"posterior_samples".
For the "gam"
method, the columns currently returned (not in this order)
are:
.smooth
; character vector. Indicates the smooth function for that
particular draw,
.term
; character vector. Similar to smooth
, but will contain the
full label for the smooth, to differentiate factor-by smooths for
example.
.by
; character vector. If the smooth involves a by
term, the
by variable will be named here, NA_character_
otherwise.
.row
; integer. A vector of values seq_len(n_vals)
, repeated if
n > 1L
. Indexes the row in data
for that particular draw.
.draw
; integer. A vector of integer values indexing the particular
posterior draw that each row belongs to.
.value
; numeric. The value of smooth function for this posterior draw
and covariate combination.
xxx
; numeric. A series of one or more columns containing data required
for the smooth, named as per the variables involved in the respective
smooth.
Additional columns will be present in the case of factor by smooths,
which will contain the level for the factor named in by_variable
for
that particular posterior draw.
The set of variables returned and their order in the tibble is subject to change in future versions. Don't rely on position.
Gavin L. Simpson
load_mgcv() dat <- data_sim("eg1", n = 400, seed = 2) m1 <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = dat, method = "REML") sms <- smooth_samples(m1, select = "s(x0)", n = 5, seed = 42) sms ## A factor by example (with a spurious covariate x0) dat <- data_sim("eg4", n = 1000, seed = 2) ## fit model... m2 <- gam(y ~ fac + s(x2, by = fac) + s(x0), data = dat) sms <- smooth_samples(m2, n = 5, seed = 42) draw(sms)
load_mgcv() dat <- data_sim("eg1", n = 400, seed = 2) m1 <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = dat, method = "REML") sms <- smooth_samples(m1, select = "s(x0)", n = 5, seed = 42) sms ## A factor by example (with a spurious covariate x0) dat <- data_sim("eg4", n = 1000, seed = 2) ## fit model... m2 <- gam(y ~ fac + s(x2, by = fac) + s(x0), data = dat) sms <- smooth_samples(m2, n = 5, seed = 42) draw(sms)
smooth_terms(object, ...)
smooth_terms(object, ...)
object |
an R object the result of a call to |
... |
arguments passed to other methods. Currently unused. |
Determine the type of smooth and return it n a human readable form
smooth_type(smooth) ## Default S3 method: smooth_type(smooth) ## S3 method for class 'tprs.smooth' smooth_type(smooth) ## S3 method for class 'ts.smooth' smooth_type(smooth) ## S3 method for class 'cr.smooth' smooth_type(smooth) ## S3 method for class 'cs.smooth' smooth_type(smooth) ## S3 method for class 'cyclic.smooth' smooth_type(smooth) ## S3 method for class 'pspline.smooth' smooth_type(smooth) ## S3 method for class 'cpspline.smooth' smooth_type(smooth) ## S3 method for class 'Bspline.smooth' smooth_type(smooth) ## S3 method for class 'duchon.spline' smooth_type(smooth) ## S3 method for class 'fs.interaction' smooth_type(smooth) ## S3 method for class 'sz.interaction' smooth_type(smooth) ## S3 method for class 'gp.smooth' smooth_type(smooth) ## S3 method for class 'mrf.smooth' smooth_type(smooth) ## S3 method for class 'random.effect' smooth_type(smooth) ## S3 method for class 'sw' smooth_type(smooth) ## S3 method for class 'sf' smooth_type(smooth) ## S3 method for class 'soap.film' smooth_type(smooth) ## S3 method for class 't2.smooth' smooth_type(smooth) ## S3 method for class 'sos.smooth' smooth_type(smooth) ## S3 method for class 'tensor.smooth' smooth_type(smooth) ## S3 method for class 'mpi.smooth' smooth_type(smooth) ## S3 method for class 'mpd.smooth' smooth_type(smooth) ## S3 method for class 'cx.smooth' smooth_type(smooth) ## S3 method for class 'cv.smooth' smooth_type(smooth) ## S3 method for class 'micx.smooth' smooth_type(smooth) ## S3 method for class 'micv.smooth' smooth_type(smooth) ## S3 method for class 'mdcx.smooth' smooth_type(smooth) ## S3 method for class 'mdcv.smooth' smooth_type(smooth) ## S3 method for class 'miso.smooth' smooth_type(smooth) ## S3 method for class 'mifo.smooth' smooth_type(smooth)
smooth_type(smooth) ## Default S3 method: smooth_type(smooth) ## S3 method for class 'tprs.smooth' smooth_type(smooth) ## S3 method for class 'ts.smooth' smooth_type(smooth) ## S3 method for class 'cr.smooth' smooth_type(smooth) ## S3 method for class 'cs.smooth' smooth_type(smooth) ## S3 method for class 'cyclic.smooth' smooth_type(smooth) ## S3 method for class 'pspline.smooth' smooth_type(smooth) ## S3 method for class 'cpspline.smooth' smooth_type(smooth) ## S3 method for class 'Bspline.smooth' smooth_type(smooth) ## S3 method for class 'duchon.spline' smooth_type(smooth) ## S3 method for class 'fs.interaction' smooth_type(smooth) ## S3 method for class 'sz.interaction' smooth_type(smooth) ## S3 method for class 'gp.smooth' smooth_type(smooth) ## S3 method for class 'mrf.smooth' smooth_type(smooth) ## S3 method for class 'random.effect' smooth_type(smooth) ## S3 method for class 'sw' smooth_type(smooth) ## S3 method for class 'sf' smooth_type(smooth) ## S3 method for class 'soap.film' smooth_type(smooth) ## S3 method for class 't2.smooth' smooth_type(smooth) ## S3 method for class 'sos.smooth' smooth_type(smooth) ## S3 method for class 'tensor.smooth' smooth_type(smooth) ## S3 method for class 'mpi.smooth' smooth_type(smooth) ## S3 method for class 'mpd.smooth' smooth_type(smooth) ## S3 method for class 'cx.smooth' smooth_type(smooth) ## S3 method for class 'cv.smooth' smooth_type(smooth) ## S3 method for class 'micx.smooth' smooth_type(smooth) ## S3 method for class 'micv.smooth' smooth_type(smooth) ## S3 method for class 'mdcx.smooth' smooth_type(smooth) ## S3 method for class 'mdcv.smooth' smooth_type(smooth) ## S3 method for class 'miso.smooth' smooth_type(smooth) ## S3 method for class 'mifo.smooth' smooth_type(smooth)
smooth |
an object inheriting from class |
Names of smooths in a GAM
smooths(object) ## Default S3 method: smooths(object) ## S3 method for class 'gamm' smooths(object)
smooths(object) ## Default S3 method: smooths(object) ## S3 method for class 'gamm' smooths(object)
object |
a fitted GAM or related model. Typically the result of a call
to |
Evaluate a spline at provided covariate values
spline_values( smooth, data, model, unconditional, overall_uncertainty = TRUE, frequentist = FALSE )
spline_values( smooth, data, model, unconditional, overall_uncertainty = TRUE, frequentist = FALSE )
smooth |
currently an object that inherits from class |
data |
a data frame of values to evaluate |
model |
a fitted model; currently only |
unconditional |
logical; should confidence intervals include the
uncertainty due to smoothness selection? If |
overall_uncertainty |
logical; should the uncertainty in the model constant term be included in the standard error of the evaluate values of the smooth? |
frequentist |
logical; use the frequentist covariance matrix? |
Extract names of all variables needed to fit a GAM or a smooth
term_names(object, ...) ## S3 method for class 'gam' term_names(object, ...) ## S3 method for class 'mgcv.smooth' term_names(object, ...) ## S3 method for class 'gamm' term_names(object, ...)
term_names(object, ...) ## S3 method for class 'gam' term_names(object, ...) ## S3 method for class 'mgcv.smooth' term_names(object, ...) ## S3 method for class 'gamm' term_names(object, ...)
object |
a fitted GAM object (inheriting from class |
... |
arguments passed to other methods. Not currently used. |
A vector of variable names required for terms in the model
Given the name (a term label) of a term in a model, returns the names of the variables involved in the term.
term_variables(object, term, ...) ## S3 method for class 'terms' term_variables(object, term, ...) ## S3 method for class 'gam' term_variables(object, term, ...) ## S3 method for class 'bam' term_variables(object, term, ...)
term_variables(object, term, ...) ## S3 method for class 'terms' term_variables(object, term, ...) ## S3 method for class 'gam' term_variables(object, term, ...) ## S3 method for class 'bam' term_variables(object, term, ...)
object |
an R object on which method dispatch is performed |
term |
character; the name of a model term, in the sense of
|
... |
arguments passed to other methods. |
A character vector of variable names.
General extractor for additional parameters in mgcv models
theta(object, ...) ## S3 method for class 'gam' theta(object, transform = TRUE, ...)
theta(object, ...) ## S3 method for class 'gam' theta(object, transform = TRUE, ...)
object |
a fitted model |
... |
arguments passed to other methods. |
transform |
logical; transform to the natural scale of the parameter |
Returns a numeric vector of additional parameters
load_mgcv() df <- data_sim("eg1", dist = "poisson", seed = 42, scale = 1 / 5) m <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = df, method = "REML", family = nb() ) p <- theta(m)
load_mgcv() df <- data_sim("eg1", dist = "poisson", seed = 42, scale = 1 / 5) m <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = df, method = "REML", family = nb() ) p <- theta(m)
Takes an object of class mgcv.smooth
and returns a tidy representation
of the basis.
tidy_basis(smooth, data = NULL, at = NULL, coefs = NULL, p_ident = NULL)
tidy_basis(smooth, data = NULL, at = NULL, coefs = NULL, p_ident = NULL)
smooth |
a smooth object of or inheriting from class |
data |
a data frame containing the variables used in |
at |
a data frame containing values of the smooth covariate(s) at which the basis should be evaluated. |
coefs |
numeric; an optional vector of coefficients for the smooth |
p_ident |
logical vector; only used for handling |
A tibble.
Gavin L. Simpson
load_mgcv() df <- data_sim("eg1", n = 400, seed = 42) # fit model m <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = df, method = "REML") # tidy representaition of a basis for a smooth definition # extract the smooth sm <- get_smooth(m, "s(x2)") # get the tidy basis - need to pass where we want it to be evaluated bf <- tidy_basis(sm, at = df) # can weight the basis by the model coefficients for this smooth bf <- tidy_basis(sm, at = df, coefs = smooth_coefs(sm, model = m))
load_mgcv() df <- data_sim("eg1", n = 400, seed = 42) # fit model m <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = df, method = "REML") # tidy representaition of a basis for a smooth definition # extract the smooth sm <- get_smooth(m, "s(x2)") # get the tidy basis - need to pass where we want it to be evaluated bf <- tidy_basis(sm, at = df) # can weight the basis by the model coefficients for this smooth bf <- tidy_basis(sm, at = df, coefs = smooth_coefs(sm, model = m))
NA
Given a vector i
indexing the elements of x
, sets the selected elements
of x
to NA
.
to_na(x, i)
to_na(x, i)
x |
vector of values |
i |
vector of values used to subset |
Returns x
with possibly some elements set to NA
Identifies pairs of covariate values that lie too far from the original data.
The function is currently a basic wrapper around mgcv::exclude.too.far()
.
too_far(x, y, ref_1, ref_2, dist = NULL)
too_far(x, y, ref_1, ref_2, dist = NULL)
x , y
|
numeric; vector of values of the covariates to compare with the observed data |
ref_1 , ref_2
|
numeric; vectors of covariate values that represent the
reference against which |
dist |
if supplied, a numeric vector of length 1 representing the
distance from the data beyond which an observation is excluded. For
example, you want to exclude values that lie further from an observation
than 10% of the range of the observed data, use |
Returns a logical vector of the same length as x1
.
NA
if the lie too far from a reference set of valuesSet rows of data to NA
if the lie too far from a reference set of values
too_far_to_na(smooth, input, reference, cols, dist = NULL)
too_far_to_na(smooth, input, reference, cols, dist = NULL)
smooth |
an mgcv smooth object |
input |
data frame containing the input observations and the columns to
be set to |
reference |
data frame containing the reference values |
cols |
character vector of columns whose elements will be set to |
dist |
numeric, the distance from the reference set beyond which
elements of |
Transform estimated values and confidence intervals by applying a function
transform_fun(object, fun = NULL, ...) ## S3 method for class 'smooth_estimates' transform_fun(object, fun = NULL, constant = NULL, ...) ## S3 method for class 'smooth_samples' transform_fun(object, fun = NULL, constant = NULL, ...) ## S3 method for class 'mgcv_smooth' transform_fun(object, fun = NULL, constant = NULL, ...) ## S3 method for class 'evaluated_parametric_term' transform_fun(object, fun = NULL, constant = NULL, ...) ## S3 method for class 'parametric_effects' transform_fun(object, fun = NULL, constant = NULL, ...) ## S3 method for class 'tbl_df' transform_fun(object, fun = NULL, column = NULL, constant = NULL, ...)
transform_fun(object, fun = NULL, ...) ## S3 method for class 'smooth_estimates' transform_fun(object, fun = NULL, constant = NULL, ...) ## S3 method for class 'smooth_samples' transform_fun(object, fun = NULL, constant = NULL, ...) ## S3 method for class 'mgcv_smooth' transform_fun(object, fun = NULL, constant = NULL, ...) ## S3 method for class 'evaluated_parametric_term' transform_fun(object, fun = NULL, constant = NULL, ...) ## S3 method for class 'parametric_effects' transform_fun(object, fun = NULL, constant = NULL, ...) ## S3 method for class 'tbl_df' transform_fun(object, fun = NULL, column = NULL, constant = NULL, ...)
object |
an object to apply the transform function to. |
fun |
the function to apply. |
... |
additional arguments passed to methods. |
constant |
numeric; a constant to apply before transformation. |
column |
character; for the |
Returns object
but with the estimate and upper and lower values
of the confidence interval transformed via the function.
Gavin L. Simpson
Typical values of model covariates
typical_values(object, ...) ## S3 method for class 'gam' typical_values( object, vars = everything(), envir = environment(formula(object)), data = NULL, ... ) ## S3 method for class 'data.frame' typical_values(object, vars = everything(), ...)
typical_values(object, ...) ## S3 method for class 'gam' typical_values( object, vars = everything(), envir = environment(formula(object)), data = NULL, ... ) ## S3 method for class 'data.frame' typical_values(object, vars = everything(), ...)
object |
a fitted GAM(M) model. |
... |
arguments passed to other methods. |
vars |
terms to include or exclude from the returned object. Uses tidyselect principles. |
envir |
the environment within which to recreate the data used to fit
|
data |
an optional data frame of data used to fit the model if reconstruction of the data from the model doesn't work. |
Handle user-supplied posterior draws
user_draws(model, draws, ...) ## S3 method for class 'gam' user_draws(model, draws, index = NULL, ...)
user_draws(model, draws, ...) ## S3 method for class 'gam' user_draws(model, draws, index = NULL, ...)
model |
a fitted R model. Currently only models fitted by |
draws |
matrix; user supplied posterior draws to be used when
|
... |
arguments passed to methods. |
index |
a vector to index (subset) the columns of |
The supplied draws
must be a matrix (currently), with 1 column per
model coefficient, and 1 row per posterior draw. The "gam"
method has
argument index
, which can be used to subset (select) coefficients
(columns) of draws
. index
can be any valid way of selecting (indexing)
columns of a matrix. index
is useful if you have a set of posterior draws
for the entire model (say from mgcv::gam.mh()
) and you wish to use those
draws for an individual smooth, via smooth_samples()
.
A wrapper to mgcv::gam.vcomp()
which returns the smoothing parameters
expressed as variance components.
variance_comp(object, ...) ## S3 method for class 'gam' variance_comp(object, rescale = TRUE, coverage = 0.95, ...)
variance_comp(object, ...) ## S3 method for class 'gam' variance_comp(object, rescale = TRUE, coverage = 0.95, ...)
object |
an R object. Currently only models fitted by |
... |
arguments passed to other methods |
rescale |
logical; for numerical stability reasons the penalty matrices
of smooths are rescaled before fitting. If |
coverage |
numeric; a value between 0 and 1 indicating the (approximate) coverage of the confidence interval that is returned. |
This function is a wrapper to mgcv::gam.vcomp()
which performs
three additional services
it suppresses the annoying text output that mgcv::gam.vcomp()
prints to
the terminal,
returns the variance of each smooth as well as the standard deviation, and
returns the variance components as a tibble.
Returns names of variables from a smooth label
vars_from_label(label)
vars_from_label(label)
label |
character; a length 1 character vector containing the label of a smooth. |
vars_from_label("s(x1)") vars_from_label("t2(x1,x2,x3)")
vars_from_label("s(x1)") vars_from_label("t2(x1,x2,x3)")
Identify a smooth term by its label
which_smooths(object, ...) ## Default S3 method: which_smooths(object, ...) ## S3 method for class 'gam' which_smooths(object, terms, ...) ## S3 method for class 'bam' which_smooths(object, terms, ...) ## S3 method for class 'gamm' which_smooths(object, terms, ...)
which_smooths(object, ...) ## Default S3 method: which_smooths(object, ...) ## S3 method for class 'gam' which_smooths(object, terms, ...) ## S3 method for class 'bam' which_smooths(object, terms, ...) ## S3 method for class 'gamm' which_smooths(object, terms, ...)
object |
a fitted GAM. |
... |
arguments passed to other methods. |
terms |
character; one or more (partial) term labels with which to identify required smooths. |
Worm plot of model residuals
worm_plot(model, ...) ## S3 method for class 'gam' worm_plot( model, method = c("uniform", "simulate", "normal", "direct"), type = c("deviance", "response", "pearson"), n_uniform = 10, n_simulate = 50, level = 0.9, ylab = NULL, xlab = NULL, title = NULL, subtitle = NULL, caption = NULL, ci_col = "black", ci_alpha = 0.2, point_col = "black", point_alpha = 1, line_col = "red", ... ) ## S3 method for class 'glm' worm_plot(model, ...) ## S3 method for class 'lm' worm_plot(model, ...)
worm_plot(model, ...) ## S3 method for class 'gam' worm_plot( model, method = c("uniform", "simulate", "normal", "direct"), type = c("deviance", "response", "pearson"), n_uniform = 10, n_simulate = 50, level = 0.9, ylab = NULL, xlab = NULL, title = NULL, subtitle = NULL, caption = NULL, ci_col = "black", ci_alpha = 0.2, point_col = "black", point_alpha = 1, line_col = "red", ... ) ## S3 method for class 'glm' worm_plot(model, ...) ## S3 method for class 'lm' worm_plot(model, ...)
model |
a fitted model. Currently models inheriting from class |
... |
arguments passed ot other methods. |
method |
character; method used to generate theoretical quantiles.
The default is Note that |
type |
character; type of residuals to use. Only |
n_uniform |
numeric; number of times to randomize uniform quantiles
in the direct computation method ( |
n_simulate |
numeric; number of data sets to simulate from the estimated
model when using the simulation method ( |
level |
numeric; the coverage level for reference intervals. Must be
strictly |
ylab |
character or expression; the label for the y axis. If not supplied, a suitable label will be generated. |
xlab |
character or expression; the label for the y axis. If not supplied, a suitable label will be generated. |
title |
character or expression; the title for the plot. See
|
subtitle |
character or expression; the subtitle for the plot. See
|
caption |
character or expression; the plot caption. See
|
ci_col |
fill colour for the reference interval when
|
ci_alpha |
alpha transparency for the reference
interval when |
point_col |
colour of points on the QQ plot. |
point_alpha |
alpha transparency of points on the QQ plot. |
line_col |
colour used to draw the reference line. |
The wording used in mgcv::qq.gam()
uses direct in reference to the
simulated residuals method (method = "simulated"
). To avoid confusion,
method = "direct"
is deprecated in favour of method = "uniform"
.
load_mgcv() ## simulate binomial data... dat <- data_sim("eg1", n = 200, dist = "binary", scale = .33, seed = 0) p <- binomial()$linkinv(dat$f) # binomial p n <- sample(c(1, 3), 200, replace = TRUE) # binomial n dat <- transform(dat, y = rbinom(n, n, p), n = n) m <- gam(y / n ~ s(x0) + s(x1) + s(x2) + s(x3), family = binomial, data = dat, weights = n, method = "REML" ) ## Worm plot; default using direct randomization of uniform quantiles ## Note no reference bands are drawn with this method. worm_plot(m) ## Alternatively use simulate new data from the model, which ## allows construction of reference intervals for the Q-Q plot worm_plot(m, method = "simulate", point_col = "steelblue", point_alpha = 0.4 ) ## ... or use the usual normality assumption worm_plot(m, method = "normal")
load_mgcv() ## simulate binomial data... dat <- data_sim("eg1", n = 200, dist = "binary", scale = .33, seed = 0) p <- binomial()$linkinv(dat$f) # binomial p n <- sample(c(1, 3), 200, replace = TRUE) # binomial n dat <- transform(dat, y = rbinom(n, n, p), n = n) m <- gam(y / n ~ s(x0) + s(x1) + s(x2) + s(x3), family = binomial, data = dat, weights = n, method = "REML" ) ## Worm plot; default using direct randomization of uniform quantiles ## Note no reference bands are drawn with this method. worm_plot(m) ## Alternatively use simulate new data from the model, which ## allows construction of reference intervals for the Q-Q plot worm_plot(m, method = "simulate", point_col = "steelblue", point_alpha = 0.4 ) ## ... or use the usual normality assumption worm_plot(m, method = "normal")
The Madison lake zooplankton data are from a long-term study in seasonal dynamics of zooplankton, collected by the Richard Lathrop. The data were collected from a chain of lakes in Wisconsin (Mendota, Monona, Kegnonsa, and Waubesa) approximately bi-weekly from 1976 to 1994. They consist of samples of the zooplankton communities, taken from the deepest point of each lake via vertical tow. The data are provided by the Wisconsin Department of Natural Resources and their collection and processing are fully described in Lathrop (2000).
A data frame
Each record consists of counts of a given zooplankton taxon taken from a subsample from a single vertical net tow, which was then scaled to account for the relative volume of subsample versus the whole net sample and the area of the net tow and rounded to the nearest 1000 to give estimated population density per m2 for each taxon at each point in time in each sampled lake.
Pedersen EJ, Miller DL, Simpson GL, Ross N. 2018. Hierarchical generalized additive models: an introduction with mgcv. PeerJ Preprints 6:e27320v1 doi:10.7287/peerj.preprints.27320v1.
Lathrop RC. (2000). Madison Wisonsin Lakes Zooplankton 1976–1994. Environmental Data Initiative.