parametric_effects()
slightly escaped the great renaming that happened for
0.9.0. Columns type
and term
did not gain a prefix .
. This is now
rectified and these two columns are now .type
and .term
.
Plots of random effects are now labelled with their smooth label. Previously,
the title was taken from the variable involved in the smooth, but this doesn't
work for terms like s(subject, continuous_var, bs = "re")
for random slopes,
which previously would have the title "subject"
. Now such terms will have
title "s(subject,continuous_var)"
. Simple random intercept terms,
s(subject, bs = "re")
, are now titled "s(subject)"
. #287
The vignettes
custom-plotting.Rmd
and posterior-simulation.Rmd
were moved to vignettes/articles
and thus are no longer available as package
vignettes. Instead, they are accessible as Articles through the package
website: https://gavinsimpson.github.io/gratia/
fitted_samples()
now works for gam()
models with multiple linear
predictors, but currently only the location parameter is supported. The
parameter is indicated through a new variable .parameter
in the returned
object.
partial_residuals()
was computing partial residuals from the deviance
residuals. For compatibility with mgcv::plot.gam()
, partial residuals are
now computed from the working residuals. Reported by @wStockhausen #273
appraise()
was not passing the ci_col
argument on to qq_plot()
and
worm_plot()
. Reported by Sate Ahmed.
It was not possible to pass mvn_method
on to posterior sampling functions from user facing
functions fitted_samples()
, posterior_samples()
, smooth_samples()
,
derivative_samples()
, and response_derivatives()
. Reported by @stefgehrig
#279
fitted_values()
works again for quantile GAMs fitted by qgam()
.
confint.gam()
was not applying shift
to the estimate and upper and lower
interval. #280 reported by @TIMAVID & @rbentham
parametric_effects()
and draw.parametric_effects()
would forget about the
levels of factors (intentionally), but this would lead to problems with
ordered factors where the ordering of levels was not preserved. Now,
parametric_effects()
returns a named list of factor levels as attribute
"factor_levels"
containing the required information and the order of levels
is preserved when plotting. #284 Reported by @mhpob
parametric_effects()
would fail if there were parametric terms in the model
but they were all interaction terms (which we don't currently handle). #282
Many functions now return objects with different named variables. In order to
avoid clashes with variable names used in user's models or data, a period
(.
) is now being used as a prefix for generated variable names. The
functions whose returned variable names have changed are: smooth_estimates()
,
fitted_values()
, fitted_samples()
, posterior_samples()
, derivatives()
,
partial_derivatives()
, and derivative_samples()
. In addition,
add_confint()
also adds newly-named variables.
1. `est` is now `.estimate`,
2. `lower` and `upper` are now `.lower_ci` and `.upper_ci`,
3. `draw` and `row` are now `.draw` and `.row` respectively,
4. `fitted`, `se`, `crit` are now `.fitted`, `.se`, `.crit`, respectively
5. `smooth`, `by`, and `type` in `smooth_estimates()` are now `.smooth`,
`.by`, `.type`, respectively.
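As a hedged illustration of the renaming, the sketch below fits a small GAM to simulated data and prints the column names of two returned objects. The simulated data, the model formula, and the object names (`df`, `m`, `sm`, `fv`) are assumptions made for this example; only the function names and the dot-prefixed column names come from the notes above.

```r
library("mgcv")
library("gratia")

# Simulated data and model are illustrative assumptions
df <- data_sim("eg1", n = 200, seed = 42)
m <- gam(y ~ s(x0) + s(x1), data = df, method = "REML")

# Smooth estimates, with a credible interval added by add_confint()
sm <- smooth_estimates(m) |> add_confint()
names(sm)

# Fitted values and their interval
fv <- fitted_values(m)
names(fv)
```

Code written against the old names (for example `sm$est`) needs to be updated to the prefixed names (for example `sm$.estimate`).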
derivatives()
and partial_derivatives()
now work more like
smooth_estimates()
; in place of the var
and data
columns, gratia now
stores the data variables at which the derivatives were evaluated as columns
in the object with their actual variable names.
The way spline-on-the-sphere (SOS) smooths (bs = "sos"
) are plotted has
changed to use ggplot2::coord_sf()
instead of the previously-used
ggplot2::coord_map()
. This change has been made as a result of
coord_map()
being soft-deprecated ("superseded") for a few minor versions of
ggplot2 by now already, and changes to the guides system in version 3.5.0 of
ggplot2.
The axes on plots created with coord_map()
never really worked
correctly and changing the angle of the tick labels never worked. As
coord_map()
is superseded, it didn't receive the updates to the guides
system and, as a side effect of these changes, the code that plotted SOS smooths
was producing a warning with the release of ggplot2 version 3.5.0.
The projection settings used to draw SOS smooths were previously controlled via
arguments projection
and orientation
. These arguments do not affect
ggplot2::coord_sf()
. Instead, the projection used is controlled through the new
argument crs
, which takes a PROJ string detailing the projection to use or
an integer that refers to a known coordinate reference system (CRS). The
default projection used is +proj=ortho +lat_0=20 +lon_0=XX
where XX
is the
mean of the longitude coordinates of the data points.
evaluate_smooth()
was deprecated in gratia version 0.7.0. This function and
all its methods have been removed from the package. Use smooth_estimates()
instead.
The following functions were deprecated in version 0.9.0 of gratia. They will eventually be removed from the package as part of a clean up ahead of an eventual 1.0.0 release. These functions will become defunct by version 0.11.0 or 1.0.0, whichever is released first.
evaluate_parametric_term()
has been deprecated. Use parametric_effects()
instead.
datagen()
has been deprecated. It never really did what it was originally
designed to do, and has been replaced by data_slice()
.
To make functions in the package more consistent, the arguments select
,
term
, and smooth
are all used for the same thing and hence the latter two
have been deprecated in favour of select
. If a deprecated argument is used, a
warning will be issued but the value assigned to the argument will be assigned
to select
and the function will continue.
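A minimal sketch of selecting a single smooth with `select` follows. The simulated data, the model, and the object names are assumptions for illustration only.

```r
library("mgcv")
library("gratia")

# Illustrative data and model (assumptions for this example)
df <- data_sim("eg1", n = 200, seed = 1)
m <- gam(y ~ s(x0) + s(x1), data = df, method = "REML")

# Evaluate only the smooth of x1, identified by its label
sm_x1 <- smooth_estimates(m, select = "s(x1)")
unique(sm_x1$.smooth)
```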
smooth_samples()
now uses a single call to the RNG to generate draws from
the posterior of smooths. Previous to version 0.9.0, smooth_samples()
would
do a separate call to mvnfast::rmvn()
for each smooth. As a result, the
result of a call to smooth_samples()
on a model with multiple smooths will
now produce different results to those generated previously. To regain the
old behaviour, add rng_per_smooth = TRUE
to the smooth_samples()
call.
Note, however, that using per-smooth RNG calls with method = "mh"
will be
very inefficient as, with that method, posterior draws for all coefficients
in the model are sampled at once. So, only use rng_per_smooth = TRUE
with
method = "gaussian"
.
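The two sampling modes can be compared with a sketch like the one below. The data, model, seed, and object names are assumptions for illustration; the `rng_per_smooth` and `method` arguments are those described above.

```r
library("mgcv")
library("gratia")

# Illustrative data and model (assumptions for this example)
df <- data_sim("eg1", n = 200, seed = 2)
m <- gam(y ~ s(x0) + s(x1), data = df, method = "REML")

# Current default: one RNG call covering all smooths
ss_joint <- smooth_samples(m, n = 5, seed = 42)

# Pre-0.9.0 behaviour: a separate RNG call for each smooth
ss_split <- smooth_samples(m, n = 5, seed = 42, method = "gaussian",
                           rng_per_smooth = TRUE)
```

Both objects have the same layout; with more than one smooth in the model, the drawn values are expected to differ even though the seed is the same.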
The output of smooth_estimates()
and its draw()
method have changed for
tensor product smooths that involve one or more 2D marginal smooths. Now,
if no covariate values are supplied via the data
argument,
smooth_estimates()
identifies if one of the marginals is a 2d surface and
allows the covariates involved in that surface to vary fastest, ahead of terms
in other marginals. This change has been made as it provides a better default
when nothing is provided to data
.
This also affects draw.gam()
.
fitted_values()
now has some level of support for location, scale, shape
families. Supported families are mgcv::gaulss()
, mgcv::gammals()
,
mgcv::gumbls()
, mgcv::gevlss()
, mgcv::shash()
, mgcv::twlss()
, and
mgcv::ziplss()
.
gratia now requires dplyr versions >= 1.1.0 and tidyselect >= 1.2.0.
A new vignette Posterior Simulation is available, which describes how to do posterior simulation from fitted GAMs using {gratia}.
Soap film smooths using basis bs = "so"
are now handled by draw()
,
smooth_estimates()
etc. #8
response_derivatives()
is a new function for computing derivatives of the
response with respect to a (continuous) focal variable. First or second
order derivatives can be computed using forward, backward, or central
finite differences. The uncertainty in the estimated derivative is determined
using posterior sampling via fitted_samples()
, and hence can be derived
from a Gaussian approximation to the posterior or using a Metropolis Hastings
sampler (see below.)
derivative_samples()
is the workhorse function behind
response_derivatives()
, which computes and returns posterior draws of the
derivatives of any additive combination of model terms. Requested by
@jonathanmellor #237
data_sim()
can now simulate response data from gamma, Tweedie and ordered
categorical distributions.
data_sim()
gains two new example models "gwf2"
, simulating data only from
Gu & Wahba's f2 function, and "lwf6"
, example function 6 from Luo & Wahba
(1997 JASA 92(437), 107-116).
data_sim()
can also simulate data for use with GAMs fitted using
family = gfam()
for grouped families where different types of data in
the response are handled. #266 and part of #265
fitted_samples()
and smooth_samples()
can now use the Metropolis Hastings
sampler from mgcv::gam.mh()
, instead of a Gaussian approximation, to sample
from the posterior distribution of the model or specific smooths
respectively.
posterior_samples()
is a new function in the family of fitted_samples()
and smooth_samples()
. posterior_samples()
returns draws from the
posterior distribution of the response, combining the uncertainty in the
estimated expected value of the response and the dispersion of the response
distribution. The difference between posterior_samples()
and
predicted_samples()
is that the latter only includes variation due to
drawing samples from the conditional distribution of the response (the
uncertainty in the expected values is ignored), while the former includes
both sources of uncertainty.
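The distinction can be explored with a sketch such as the following, which draws from both functions for the same model. The data, model, number of draws, and object names are assumptions for illustration.

```r
library("mgcv")
library("gratia")

# Illustrative data and model (assumptions for this example)
df <- data_sim("eg1", n = 200, seed = 3)
m <- gam(y ~ s(x0) + s(x1), data = df, method = "REML")

# Draws that include uncertainty in the expected value and the
# dispersion of the response distribution
ps <- posterior_samples(m, n = 20, seed = 10)

# Draws from the conditional distribution of the response only
pr <- predicted_samples(m, n = 20, seed = 10)
```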
fitted_samples()
can now use a matrix of user-supplied posterior draws.
Related to #120
add_fitted_samples()
, add_predicted_samples()
, add_posterior_samples()
,
and add_smooth_samples()
are new utility functions that add the respective
draws from the posterior distribution to an existing data object for the
covariate values in that object: obj |> add_posterior_samples(model)
. #50
basis_size()
is a new function to extract the basis dimension (number of
basis functions) for smooths. Methods are available for objects that inherit
from classes "gam"
, "gamm"
, and "mgcv.smooth"
(for individual smooths).
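A short sketch of extracting basis dimensions from a fitted model; the data, model, and object names are assumptions for illustration.

```r
library("mgcv")
library("gratia")

# Illustrative data and model (assumptions for this example)
df <- data_sim("eg1", n = 200, seed = 4)
m <- gam(y ~ s(x0) + s(x1, k = 15), data = df, method = "REML")

# One value per smooth in the model
bs_m <- basis_size(m)
bs_m
```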
data_slice()
gains a method for data frames and tibbles.
typical_values()
gains a method for data frames and tibbles.
fitted_values()
now works with models fitted using the mgcv::ocat()
family. The predicted probability for each category is returned, alongside a
Wald interval created using the standard error (SE) of the estimated
probability. The SE and estimated probabilities are transformed to the logit
(linear predictor) scale, a Wald credible interval is formed, which is then
back-transformed to the response (probability) scale.
fitted_values()
now works for GAMMs fitted using mgcv::gamm()
. Fitted
(predicted) values only use the GAM part of the model, and thus exclude the
random effects.
link()
and inv_link()
work for models fitted using the cnorm()
family.
A worm plot can now be drawn in place of the QQ plot with appraise()
via
new argument use_worm = TRUE
. #62
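For example, assuming a model fitted to simulated data (the data, model, and object names below are illustrative assumptions):

```r
library("mgcv")
library("gratia")

# Illustrative data and model (assumptions for this example)
df <- data_sim("eg1", n = 200, seed = 5)
m <- gam(y ~ s(x0) + s(x1), data = df, method = "REML")

# Model diagnostics with a worm plot in place of the QQ plot
p <- appraise(m, use_worm = TRUE)
p
```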
smooths()
now works for models fitted with mgcv::gamm()
.
overview()
now returns the basis dimension for each smooth and gains an
argument stars
which, if TRUE
, adds significance stars to the output and prints a
legend in the tibble footer. Part of wish of @noamross #214
New add_constant()
and transform_fun()
methods for smooth_samples()
.
evenly()
gains arguments lower
and upper
to modify the lower and / or
upper bound of the interval over which evenly spaced values will be generated.
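A brief sketch of the new bounds, using a made-up numeric vector `x` (an assumption for illustration):

```r
library("gratia")

# Hypothetical covariate values
x <- c(2.3, 4.1, 7.8, 9.6)

# Default: evenly spaced values over the observed range of x
s_default <- evenly(x, n = 5)

# Override both ends of the interval
s_bounds <- evenly(x, n = 5, lower = 0, upper = 10)
s_bounds
```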
add_sizer()
is a new function to add information on whether the derivative
of a smooth is significantly changing (where the credible interval excludes
0). Currently, methods for derivatives()
and smooth_estimates()
objects
are implemented. Part of request of @asanders11 #117
draw.derivatives()
gains arguments add_change
and change_type
to allow
derivatives of smooths to be plotted with indicators where the credible
interval on the derivative excludes 0. Options allow for periods of decrease
or increase to be differentiated via change_type = "sizer"
instead of the
default change_type = "change"
, which emphasises either type of change in
the same way. Part of wish of @asanders11 #117
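The following sketch shows one way these pieces might be combined. The data, model, chosen smooth, and object names are assumptions for illustration.

```r
library("mgcv")
library("gratia")

# Illustrative data and model (assumptions for this example)
df <- data_sim("eg1", n = 300, seed = 6)
m <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = df, method = "REML")

# First derivatives of one smooth
d <- derivatives(m, select = "s(x2)", type = "central")

# Add the change information to the derivatives object itself
d_sizer <- add_sizer(d, type = "sizer")

# Or request the indicators directly when plotting
p <- draw(d, add_change = TRUE, change_type = "sizer")
p
```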
draw.gam()
can now group factor by smooths for a given factor into a single
panel, rather than plotting the smooths for each level in separate panels.
This is achieved via new argument grouped_by
. Requested by @RPanczak #89
draw.smooth_estimates()
can now also group factor by smooths for a given
factor into a single panel.
The underlying plotting code used by draw_smooth_estimates()
for most
univariate smooths can now add change indicators to the plots of smooths if
those change indicators are added to the object created by
smooth_estimates()
using add_sizer()
. See the example in
?draw.smooth_estimates
.
smooth_estimates()
can, when evaluating a 3D or 4D tensor product smooth,
identify if one or more 2D smooths is a marginal of the tensor product. If
users do not provide covariate values at which to evaluate the smooths,
smooth_estimates()
will focus on the 2D marginal smooth (or the first if
more than one is involved in the tensor product), instead of following the
ordering of the terms in the definition of the tensor product. #191
For example, in te(z, x, y, bs = c("cr", "ds"), d = c(1, 2))
, the second
marginal smooth is a 2D Duchon spline of covariates x
and y
. Previously,
smooth_estimates()
would have generated n
values each for z
and x
and
n_3d
values for y
, and then evaluated the tensor product at all
combinations of those generated values. This would ignore the structure
implicit in the tensor product, where we are likely to want to know how the
surface estimated by the Duchon spline of x
and y
smoothly varies with
z
. Previously smooth_estimates()
would generate surfaces of z
and x
,
varying by y
. Now, smooth_estimates()
correctly identifies that one of the
marginal smooths of the tensor product is a 2D surface and will focus on that
surface varying with the other terms in the tensor product.
This improved behaviour is needed because in some bam()
models it is not
always possible to do the obvious thing and reorder the smooths when defining
the tensor product to be te(x, y, z, bs = c("ds", "cr"), d = c(2, 1))
. When
discrete = TRUE
is used with bam()
the terms in the tensor product may
get rearranged during model setup for maximum efficiency (see Details in
?mgcv::bam
).
Additionally, draw.gam()
now also works the same way.
New function null_deviance()
that extracts the null deviance of a fitted
model.
draw()
, smooth_estimates()
, fitted_values()
, data_slice()
, and
smooth_samples()
now all work for models fitted with scam::scam()
.
Where it matters, current support extends only to univariate smooths.
generate_draws()
is a new low-level function for generating posterior draws
from fitted model coefficients. generate_draws()
is an S3 generic function so
is extensible by users. It currently provides a simple interface to a simple
Gaussian approximation sampler (gaussian_draws()
) and the simple Metropolis
Hastings sampler (mh_draws()
) available via mgcv::gam.mh()
. #211
smooth_label()
is a new function for extracting the labels 'mgcv' creates for
smooths from the smooth object itself.
penalty()
has a default method that works with s()
, te()
, t2()
, and
ti()
, which create a smooth specification.
transform_fun()
gains argument constant
to allow for the addition of a
constant value to objects (e.g. the estimate and confidence interval). This
enables a single obj |> transform_fun(fun = exp, constant = 5)
instead of
separate calls to add_constant()
and then transform_fun()
. Part of the
discussion of #79
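A hedged sketch comparing the one-step and two-step forms is shown below. The data, model, and object names are assumptions, and the model intercept is used as the constant purely for illustration.

```r
library("mgcv")
library("gratia")

# Illustrative data and model (assumptions for this example)
df <- data_sim("eg1", n = 200, seed = 7)
m <- gam(y ~ s(x0) + s(x1), data = df, method = "REML")

sm <- smooth_estimates(m, select = "s(x1)")
b0 <- as.numeric(coef(m)[1]) # model intercept, used as the constant

# One step: add the constant, then apply exp()
one_step <- sm |> transform_fun(fun = exp, constant = b0)

# Equivalent two-step form
two_step <- sm |>
  add_constant(constant = b0) |>
  transform_fun(fun = exp)
```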
model_constant()
is a new function that simply extracts the first
coefficient from the estimated model.
link()
, inv_link()
, and related family functions for the ocat()
family weren't
correctly identifying the family name and as a result would throw an error
even when passed an object of the correct family.
link()
and inv_link()
now work correctly for the betar()
family in a
fitted GAM.
The print()
method for lp_matrix()
now converts the matrix to a data frame
before conversion to a tibble. This makes more sense as it results in more
typical behaviour as the columns of the printed object are doubles.
Constrained factor smooths (bs = "sz"
) where the factor is not the first
variable mentioned in the smooth (i.e. s(x, f, bs = "sz")
for continuous
x
and factor f
) are now plottable with draw()
. #208
parametric_effects()
was unable to handle special parametric terms like
poly(x)
or log(x)
in formulas. Reported by @fhui28 #212
parametric_effects()
now works better for location, scale, shape models.
Reported by @pboesu #45
parametric_effects()
now works when there are missing values in one or more
variables used in a fitted GAM. #219
response_derivatives()
was incorrectly using .data
with tidyselect
selectors.
typical_values()
could not handle logical variables in a GAM fit as mgcv
stores these as numerics in the var.summary
. This affected evenly()
and
data_slice()
. #222
parametric_effects()
would fail when two or more ordered factors were in
the model. Reported by @dsmi31 #221
Continuous by smooths were being evaluated with the median value of the by
variable instead of a value of 1. #224
fitted_samples()
(and hence posterior_samples()
) now handles models with
offset terms in the formula. Offset terms supplied via the offset
argument
are ignored by mgcv:::predict.gam()
and hence are ignored also by gratia
.
Reported by @jonathonmellor #231 #233
smooth_estimates()
would fail on a "fs"
smooth when a multivariate base
smoother was used and the factor was not the last variable specified in the
definition of the smooth: s(x1, x2, f, bs = "fs", xt = list(bs = "ds"))
would work, but s(f, x1, x2, bs = "fs", xt = list(bs = "ds"))
(or any
ordering of variables that places the factor not last) would emit an obscure
error. The ordering of the terms involved in the smooth now doesn't matter.
Reported by @chrisaak #249.
draw.gam()
would fail when plotting a multivariate base smoother used in an
"sz"
smooth. Now, this use case is identified and a message printed
indicating that (currently) gratia doesn't know how to plot such a smooth.
Reported by @chrisaak #249.
draw.gam()
would fail when plotting a multivariate base smoother used in an
"fs"
smooth. Now, this use case is identified and a message printed
indicating that (currently) gratia doesn't know how to plot such a smooth.
Reported by @chrisaak #249.
derivative_samples()
would fail with order = 2
and was only computing
forward finite differences, regardless of type
for order = 1
. Partly
reported by @samlipworth #251.
The draw()
method for penalty()
was normalizing the penalty to the range
0--1, not the claimed and documented -1--1 with argument normalize = TRUE
.
This is now fixed.
smooth_samples()
was failing when data
was supplied that contained more
variables than were used in the smooth that was being sampled. Hence this
would generally fail unless a single smooth was being sampled from or the model
contained only a single smooth. The function was never intended to retain all the
variables in data
but was written in such a way that it would fail when
relocating the data columns to the end of the posterior sampling object. #255
draw.gam()
and draw.smooth_estimates()
would fail when plotting a
univariate tensor product smooth (e.g. te(x)
, ti(x)
, or t2(x)
). Reported
by @wStockhausen #260
plot.smooth()
was not printing the factor level in subtitles for ordered
factor by smooths.
smooth_samples()
now returns objects with variables involved in smooths
that have their correct name. Previously variables were named .x1
, .x2
,
etc. Fixing #126 and improving compatibility with compare_smooths()
and
smooth_estimates()
allowed the variables to be named correctly.
gratia now depends on version 1.8-41 or later of the mgcv package.
draw.gam()
can now handle tensor products that include a marginal random
effect smooth. Beware plotting such smooths if there are many levels,
however, as a separate surface plot will be produced for each level.
Additional fixes for changes in dplyr 1.1.0.
smooth_samples()
now works when sampling from posteriors of multiple smooths
with different dimension. #126 reported by @Aariq
{gratia} now depends on R version 4.1 or later.
A new vignette "Data slices" is supplied with {gratia}.
Functions in {gratia} have been harmonised to use an argument named data
instead
of newdata
for passing new data at which to evaluate features of smooths. A
message will be printed if newdata
is used from now on. Existing code does
not need to be changed as data
takes its value from newdata
.
Note that due to the way ...
is handled in R, if your R script uses the
data
argument, and is run with versions of gratia prior to 0.8.0 (when
released; 0.7.3.8 if using the development version) the user-supplied data
will be silently ignored. As such, scripts using data
should check that the
installed version of gratia is >= 0.8 and package developers should update
to depend on versions >= 0.8 by using gratia (>= 0.8)
in DESCRIPTION
.
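In a script, the version check described above could be sketched as follows. The object name `gratia_ok` and the error message are illustrative assumptions.

```r
# Guard for scripts that pass `data` to gratia functions
gratia_ok <- utils::packageVersion("gratia") >= "0.8.0"

if (!gratia_ok) {
  stop("gratia >= 0.8.0 is required because this script uses the `data` argument")
}
```

Failing early avoids the silent fallback in which user-supplied data would be ignored by an older installation.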
The order of the plots of smooths has changed in draw.gam()
so that they
again match the order in which smooths were specified in the model formula.
See Bug Fixes below for more detail or #154.
Added basic support for GAMLSS (distributional GAMs) fitted with the
gamlss()
function from package GJRM. Support is currently restricted to a
draw()
method.
difference_smooths()
can now include the group means in the difference,
which many users expected. To include the group means use group_means = TRUE
in the function call, e.g.
difference_smooths(model, smooth = "s(x)", group_means = TRUE)
. Note: this
function still differs from plot_diff()
in package itsadug, which
essentially computes differences of model predictions. The main practical
difference is that other effects beyond the factor by smooth, including random
effects, may be included with plot_diff()
.
This implements the main wish of #108 (@dinga92) and #143 (@mbolyanatz) despite my protestations that this was complicated in some cases (it isn't; the complexity just cancels out.)
data_slice()
has been totally revised. Now, the user provides the values for
the variables they want in the slice and any variables in the model that are
not specified will be held at typical values (i.e. the value of the
observation that is closest to the median for numeric variables, or the modal
factor level.)
Data slices are now produced by passing name
= value
pairs for the
variables and their values that you want to appear in the slice. For example
m <- gam(y ~ s(x1) + x2 + fac)
data_slice(m, x1 = evenly(x1, n = 100), x2 = mean(x2))
The value
in the pair can be an expression that will be looked up
(evaluated) in the data
argument or the model frame of the fitted model
(the default). In the above example, the resulting slice will be a data frame
of 100 observations, comprising x1
, which is a vector of 100 values spread
evenly over the range of x1
, a constant value of the mean of x2
for the
x2
variable, and a constant factor level, the modal class of fac
, for the
fac
variable of the model.
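A self-contained version of this example might look like the sketch below. The simulated data frame `df`, its generating formula, and the seed are assumptions made so that the code runs; the `data_slice()` call is the one described above.

```r
library("mgcv")
library("gratia")

# Hypothetical data with a smooth covariate, a linear covariate and a factor
set.seed(1)
n <- 300
df <- data.frame(
  x1 = runif(n),
  x2 = rnorm(n),
  fac = factor(sample(c("a", "b", "c"), n, replace = TRUE))
)
df$y <- sin(2 * pi * df$x1) + 0.5 * df$x2 + as.numeric(df$fac) +
  rnorm(n, sd = 0.3)

m <- gam(y ~ s(x1) + x2 + fac, data = df, method = "REML")

# x1 varies over 100 evenly spaced values, x2 is held at its mean,
# and fac, which is not specified, is held at a typical value
ds <- data_slice(m, x1 = evenly(x1, n = 100), x2 = mean(x2))
ds
```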
partial_derivatives()
is a new function for computing partial derivatives
of multivariate smooths (e.g. s(x,z)
, te(x,z)
) with respect to one of
the margins of the smooth. Multivariate smooths of any dimension are handled,
but only one of the dimensions is allowed to vary. Partial derivatives are
estimated using the method of finite differences, with forward, backward,
and central finite differences available. Requested by @noamross #101
overview()
provides a simple overview of model terms for fitted GAMs.
The new bs = "sz"
basis that was released with mgcv version 1.8-41 is
now supported in smooth_estimates()
, draw.gam()
, and
draw.smooth_estimates()
and this basis has its own unique plotting method.
#202
basis()
now has a method for fitted GAM(M)s which can extract the estimated
basis from the model and plot it, using the estimated coefficients for the
smooth to weight the basis. #137
There is also a new draw.basis()
method for plotting the results of a call
to basis()
. This method can now also handle bivariate bases.
tidy_basis()
is a lower level function that does the heavy lifting in
basis()
, and is now exported. tidy_basis()
returns a tidy representation
of a basis supplied as an object inheriting from class "mgcv.smooth"
. These
objects are returned in the $smooth
component of a fitted GAM(M) model.
lp_matrix()
is a new utility function to quickly return the linear predictor
matrix for an estimated model. It is a wrapper to
predict(..., type = "lpmatrix")
.
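As a rough illustration, the sketch below extracts the matrix both ways and compares dimensions. The data, model, and object names are assumptions for this example.

```r
library("mgcv")
library("gratia")

# Illustrative data and model (assumptions for this example)
df <- data_sim("eg1", n = 200, seed = 8)
m <- gam(y ~ s(x0) + s(x1), data = df, method = "REML")

# Linear predictor matrix via gratia
Xp <- lp_matrix(m)

# The equivalent direct call to mgcv's predict method
Xp_mgcv <- predict(m, type = "lpmatrix")

identical(dim(Xp), dim(Xp_mgcv))
```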
evenly()
is a synonym for seq_min_max()
and is preferred going forward.
It gains argument by
to produce sequences over a covariate that increment in
units of by
.
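A small sketch of the increment-based form, using a made-up vector `x` (an assumption for illustration):

```r
library("gratia")

# Hypothetical covariate values spanning 0 to 2
x <- c(0, 0.7, 1.3, 2)

# Sequence over the range of x in steps of 0.5
s_by <- evenly(x, by = 0.5)
s_by
```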
ref_level()
and level()
are new utility functions for extracting the
reference or a specific level of a factor respectively. These will be most
useful when specifying covariate values to condition on in a data slice.
model_vars()
is a new, public facing way of returning a vector of variables
that are used in a model.
difference_smooths()
will now use the user-supplied data as points at
which to evaluate a pair of smooths. Also note that the argument newdata
has
been renamed data
. #175
The draw()
method for difference_smooths()
now uses better labels for
plot titles to avoid long labels with even modest factor levels.
derivatives()
now works for factor-smooth interaction ("fs"
) smooths.
draw()
methods now allow the angle of tick labels on the x axis of plots to
be rotated using argument angle
. Requested by @tamas-ferenci #87
draw.gam()
and related functions (draw.parametric_effects()
,
draw.smooth_estimates()
) now add the basis to the plot using a caption.
#155
smooth_coefs()
is a new utility function for extracting the coefficients
for a particular smooth from a fitted model. smooth_coef_indices()
is an
associated function that returns the indices (positions) in the vector of
model coefficients (returned by coef(gam_model)
) of those coefficients that
pertain to the stated smooth.
draw.gam()
now better handles patchworks of plots where one or more of
those plots has fixed aspect ratios. #190
draw.posterior_smooths()
now plots posterior samples with a fixed aspect ratio
if the smooth is isotropic. #148
derivatives()
now ignores random effect smooths (for which derivatives
don't make sense anyway). #168
confint.gam(..., method = "simultaneous")
now works with factor by smooths
where parm
is passed the full name of a specific smooth s(x)faclevel
.
The order of plots produced by gratia::draw.gam()
again matches the order
in which the smooths entered the model formula. Recent changes to the
internals of gratia::draw.gam()
when the switch to smooth_estimates()
was
undertaken led to a change in behaviour resulting from the use of
dplyr::group_split()
, and its coercion internally of a character vector to
a factor. This factor is now created explicitly, and the levels set to the
correct order. #154
Setting the dist
argument to set response or smooth values to NA
if they
lay too far from the support of the data in multivariate smooths would
lead to an incorrect scale for the response guide. This is now fixed. #193
Argument fun
to draw.gam()
was not being applied to any parametric terms.
Reported by @grasshoppermouse #195
draw.gam()
was adding the uncertainty for all linear predictors to smooths
when overall_uncertainty = TRUE
was used. Now draw.gam()
only includes the
uncertainty for those linear predictors in which a smooth takes part. #158
partial_derivatives()
works when provided with a single data point at
which to evaluate the derivative. #199
transform_fun.smooth_estimates()
was addressing the wrong variable names
when trying to transform the confidence interval. #201
data_slice()
doesn't fail with an error when used with a model that contains
an offset term. #198
confint.gam()
no longer uses evaluate_smooth()
, which is soft deprecated.
#167
qq_plot()
and worm_plot()
could compute the wrong deviance residuals used
to generate the theoretical quantiles for some of the more exotic families
(distributions) available in mgcv. This also affected appraise()
but only
for the QQ plot; the residuals shown in the other plots and the deviance
residuals shown on the y-axis of the QQ plot were correct. Only the
generation of the reference intervals/quantiles was affected.
confint.fderiv()
and confint.gam()
now return their results as a tibble
instead of a common-or-garden data frame. The latter mostly already did this.
Examples for confint.fderiv()
and confint.gam()
were reworked, in part to
remove some inconsistent output in the examples when run on M1 macs.
compare_smooths()
failed when passed non-standard model "names" like
compare_smooths(m_gam, m_gamm$gam)
or compare_smooths(l[[1]], l[[2]])
even if the evaluated objects were valid GAM(M) models. Reported by Andrew
Irwin #150
draw.gam()
and draw.smooth_estimates()
can now handle splines on the
sphere (s(lat, long, bs = "sos")
) with special plotting methods using
ggplot2::coord_map()
to handle the projection to spherical coordinates. An
orthographic projection is used by default, with an essentially arbitrary
(and northern hemisphere-centric) default for the orientation of the view.
fitted_values()
ensures that data
(and hence the returned object) is a
tibble rather than a common or garden data frame.
draw.posterior_smooths()
was redundantly plotting duplicate data in the rug
plot. Now only the unique set of covariate values are used for drawing the
rug.
data_sim()
was not passing the scale
argument in the bivariate example
setting ("eg2"
).
draw()
methods for gamm()
and gamm4::gamm4()
fits were not passing
arguments on to draw.gam()
.
draw.smooth_estimates()
would produce a subtitle with data for a continuous
by smooth as if it were a factor by smooth. Now the subtitle only contains the
name of the continuous by variable.
Due to an issue with the size of the package source tarball, which wasn't discovered until after submission to CRAN, 0.7.1 was never released.
draw.gam()
and draw.smooth_estimates()
: {gratia} can now handle smooths
of 3 or 4 covariates when plotting. For smooths of 3 covariates, the third
covariate is handled with ggplot2::facet_wrap()
and a set (default n
= 16)
of small multiples is drawn, each a 2d surface evaluated at the specified
value of the third covariate. For smooths of 4 covariates,
ggplot2::facet_grid()
is used to draw the small multiples, with the default
producing 4 rows by 4 columns of plots at the specific values of the third
and fourth covariates. The number of small multiples produced is controlled
by new arguments n_3d
(default n_3d = 16
) and n_4d
(default
n_4d = 4
, yielding n_4d * n_4d
= 16 facets) respectively.
This only affects plotting; smooth_estimates()
has been able to handle
smooths of any number of covariates for a while.
When handling higher-dimensional smooths, actually drawing the plots on the
default device can be slow, especially with the default value of n = 100
(which for 3D or 4D smooths would result in 160,000 data points being
plotted). As such it is recommended that you reduce n
to a smaller value:
n = 50
is a reasonable compromise of resolution and speed.
model_concurvity()
returns concurvity measures from mgcv::concurvity()
for estimated GAMs in a tidy format. The synonym concrvity()
is also
provided. A draw()
method is provided which produces a bar plot or a heatmap
of the concurvity values depending on whether the overall concurvity of each
smooth or the pairwise concurvity of each smooth in the model is requested.
draw.gam()
gains argument resid_col = "steelblue3"
that allows the colour
of the partial residuals (if plotted) to be changed.
model_edf()
was not using the type
argument. As a result it only ever
returned the default EDF type.
add_constant()
methods weren't applying the constant to all the required
variables.
draw.gam()
, draw.parametric_effects()
now actually work for a model with
only parametric effects. #142 Reported by @Nelson-Gon
parametric_effects()
would fail for a model with only parametric terms
because predict.gam()
returns empty arrays when passed
exclude = character(0)
.
draw.gam()
now uses smooth_estimates()
internally and consequently uses
its draw()
method and underlying plotting code. This has simplified the code
compared to evaluate_smooth()
and its methods, which will allow for future
development and addition of features more easily than if evaluate_smooth()
had been retained.
Similarly, evaluate_parametric_term()
is now deprecated in favour of
parametric_effects()
, which is also used internally by draw.gam()
if
parametric terms are present in the model (and parametric = TRUE
).
While a lot of code has been reused so differences between plots as a result of this change should be minimal, some corner cases may have been missed. File an Issue if you notice something that has changed that you think shouldn't.
draw.gam()
now plots 2D isotropic smooths (TPRS and Duchon splines) with
equally-scaled x and y coordinates using coord_equal(ratio = 1)
. Alignment
of these plots will be a little different now when plotting models with
multiple smooths. See Issue #81.
From version 0.7.0, the following functions are considered deprecated and their use is discouraged:
fderiv() is soft-deprecated in favour of derivatives(),
evaluate_smooth() is soft-deprecated in favour of smooth_estimates(),
evaluate_parametric_term() is soft-deprecated in favour of parametric_effects().
The first call to one of these functions will generate a warning, pointing to
the newer, alternative, function. It is safe to ignore these warnings, but
these deprecated functions will no longer receive updates and are thus at risk
of being removed from the package at some future date. The newer alternatives
can handle more types of models and smooths, especially so in the case of
smooth_estimates()
.
fitted_values()
provides a tidy wrapper around predict.gam()
for
generating fitted values from the model. New covariate values can be provided
via argument data
. A credible interval on the fitted values is returned, and
values can be on the link (linear predictor) or response scale.
Note that this function returns expected values of the response. Hence, "fitted values" is used instead of "predictions" in the case of new covariate values to differentiate these values from the case of generating new response values from a fitted model.
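The idea of fitted values with an interval on either scale can be sketched with predict.gam() alone. This is a minimal illustration under stated assumptions (a simulated Poisson model, a normal-quantile interval formed on the link scale and then back-transformed); it is not the code used inside fitted_values().

```r
library(mgcv)

set.seed(1)
df   <- data.frame(x = runif(200))
df$y <- rpois(200, lambda = exp(1 + sin(2 * pi * df$x)))
m    <- gam(y ~ s(x), data = df, family = poisson(), method = "REML")

# New covariate values at which expected values are wanted
new_df <- data.frame(x = seq(0, 1, length.out = 5))

# Predict on the link scale with standard errors
p    <- predict(m, newdata = new_df, type = "link", se.fit = TRUE)
fit  <- as.numeric(p$fit)
se   <- as.numeric(p$se.fit)
crit <- qnorm(0.975)  # 95% interval

# Back-transform the estimate and interval to the response scale
ilink <- family(m)$linkinv
out <- data.frame(new_df,
                  fitted = ilink(fit),
                  lower  = ilink(fit - crit * se),
                  upper  = ilink(fit + crit * se))
out
```

Forming the interval on the link scale before back-transforming keeps the limits inside the valid range of the response.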
rootogram()
and its draw()
method produce rootograms as diagnostic plots
for fitted models. Currently only for models fitted with poisson()
,
nb()
, negbin()
, gaussian()
families.
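The quantities behind a rootogram can be sketched in base R: observed frequencies of each count are compared with the frequencies expected under the fitted model, both on the square-root scale. The Poisson GLM and simulated data below are assumptions for illustration, not the rootogram() implementation.

```r
set.seed(5)
x <- runif(300)
y <- rpois(300, lambda = exp(0.5 + x))
m <- glm(y ~ x, family = poisson())

mu     <- fitted(m)   # fitted mean for each observation
counts <- 0:max(y)    # count values to tabulate

# Observed frequency of each count value
observed <- as.vector(table(factor(y, levels = counts)))

# Expected frequency: sum over observations of P(Y = k | mu_i)
expected <- vapply(counts,
                   function(k) sum(dpois(k, lambda = mu)),
                   numeric(1))

roots <- data.frame(count         = counts,
                    sqrt_observed = sqrt(observed),
                    sqrt_expected = sqrt(expected))
head(roots)
```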
New helper functions typical_values()
, factor_combos()
and
data_combos()
for quickly creating data sets for producing predictions from
fitted models where some covariates are fixed at some typical or
representative values.
typical_values()
is a new helper function to return typical values for the
covariates of a fitted model. It returns the value of the observation
closest to the median for numerical covariates or the modal level of a
factor while preserving the levels of that factor. typical_values()
is
useful in preparing data slices or scenarios for which fitted values from
the estimated model are required.
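The two rules described above can be sketched in base R. The helper names closest_to_median() and modal_level() are hypothetical and exist only for this illustration; they are not gratia functions.

```r
# Hypothetical helper: the observed value closest to the median
closest_to_median <- function(x) {
  x[which.min(abs(x - median(x)))]
}

# Hypothetical helper: the modal level, keeping all levels of the factor
modal_level <- function(f) {
  tab <- table(f)
  factor(names(tab)[which.max(tab)], levels = levels(f))
}

x <- c(5, 1, 9, 3, 20)
f <- factor(c("a", "b", "b", "c"))

x_typical <- closest_to_median(x)
f_typical <- modal_level(f)

x_typical
f_typical
```

Returning an observed value rather than the median itself means the typical value is always one that occurs in the data.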
factor_combos()
extracts and returns the combinations of levels of factors
found in data used to fit a model. Unlike typical_values()
,
factor_combos()
returns all the combinations of factor levels observed
in the data, not just the modal level. Optionally, all combinations of
factor levels can be returned, not just those in the observed data.
data_combos()
combines the factor data returned by factor_combos()
plus
the typical values of numerical covariates. This is useful if you want to
generate predictions from the model for each combination of factor terms
while holding any continuous covariates at their median values.
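The difference between observed and all possible factor combinations, and the way a typical numeric value is attached to them, can be sketched in base R. The data frame and column names here are made up for illustration and the code does not call factor_combos() or data_combos().

```r
df <- data.frame(
  f1 = factor(c("a", "a", "b", "b", "b")),
  f2 = factor(c("u", "v", "u", "u", "u")),
  x  = c(0.1, 0.4, 0.5, 0.7, 0.9)
)

# Combinations of factor levels that actually occur in the data
observed <- unique(df[c("f1", "f2")])

# Every possible combination, observed or not
all_combos <- expand.grid(f1 = levels(df$f1), f2 = levels(df$f2))

# Typical value of the numeric covariate: observation closest to the median
x_typical <- df$x[which.min(abs(df$x - median(df$x)))]

# One scenario row per observed combination, numeric covariate held fixed
scenario <- cbind(observed, x = x_typical)
scenario
```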
nb_theta()
is a new extractor function that returns the theta parameter of
a fitted negative binomial GAM (families nb()
or negbin()
). Additionally,
theta()
and has_theta()
provide additional functionality. theta()
is an
experimental function for extracting any additional parameters from the model
or family. has_theta()
is useful for checking if any additional parameters
are available from the family or model.
edf()
extracts the effective degrees of freedom (EDF) of a fitted model or a
specific smooth in the model. Various forms for the EDF can be extracted.
model_edf()
returns the EDF of the overall model. If supplied with multiple
models, the EDFs of each model are returned for comparison.
draw.gam()
can now show a "rug" plot on a bivariate smooth by drawing small
points with high transparency over the smooth surface at the data coordinates.
In addition, the rugs on plots of factor by smooths now show the locations of covariate values for the specific level of the factor and not over all levels. This better reflects what data were used to estimate the smooth, even though the basis for each smooth was set up using all of the covariate locations.
draw.gam()
and draw.smooth_estimates()
now allow some aspects of the plot
to be changed: the fill (but not colour) and alpha attributes of the credible
interval, and the line colour for the smooth can now be specified using
arguments ci_col
, ci_alpha
, and smooth_col
respectively.
Partial residuals can now be plotted on factor by smooths. To allow this, the partial residuals are filtered so that only residuals associated with a particular level's smooth are drawn on the plot of the smooth.
smooth_estimates()
uses check_user_select_smooths()
to handle
user-specified selection of smooth terms. As such it is more flexible than
previously, and allows for easier selection of smooths to evaluate.
fixef()
is now imported (and re-exported) from the nlme package, with
methods for models fitted with gam()
and gamm()
, to extract fixed effects
estimates from fitted models. fixed_effects()
is an alias for fixef()
.
The draw()
method for smooth_samples()
can now handle 2D smooths.
Additionally, the number of posterior draws to plot can now be specified when
plotting using new argument n_samples
, which will result in n_samples
draws being selected at random from the set of draws for plotting. New
argument seed
allows the selection of draws to be repeatable.
smooth_estimates()
was not filtering user-supplied data for the by level of
the specific smooth when used with by factor smooths. This would result in the
smooth being evaluated at all rows of the user-supplied data, and therefore
would result in nrow(user_data) * nlevels(by_variable)
rows in the returned
object instead of nrow(user_data)
rows.
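The row counts described above can be reproduced with a toy data frame. This is only a sketch of the filtering logic, with made-up data and no model involved.

```r
user_data <- data.frame(
  x = seq(0, 1, length.out = 6),
  f = factor(rep(c("a", "b", "c"), times = 2))
)

# Old behaviour: every level's smooth evaluated at every row
unfiltered <- do.call(rbind, lapply(levels(user_data$f), function(lev) {
  cbind(user_data, by_level = lev)
}))

# Fixed behaviour: each level's smooth evaluated only at rows of that level
filtered <- do.call(rbind, lapply(levels(user_data$f), function(lev) {
  rows <- user_data[user_data$f == lev, , drop = FALSE]
  cbind(rows, by_level = lev)
}))

c(unfiltered = nrow(unfiltered), filtered = nrow(filtered))
```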
The add_confint()
method for smooth_estimates()
had the upper and lower
intervals reversed. #107 Reported by @Aariq
draw.gam()
and smooth_estimates()
were both ignoring the dist
argument
that allows covariate values that lie too far from the support of the data to
be excluded when returning estimated values from the smooth and plotting it.
#111 Reported by @Aariq
smooth_samples()
with a factor by GAM would return samples for the first
factor level only. Reported by @rroyaute in discussion of #121
smooth_samples()
would fail if the model contained random effect "smooths".
These are now ignored with a message when running smooth_samples()
.
Reported by @isabellaghement in #121
link()
, inv_link()
were failing on models fitted with family = scat()
.
Reported by @Aariq #130
The {cowplot} package has been replaced by the {patchwork} package for
producing multi-panel figures in draw()
and appraise()
. This shouldn't
affect any code that used {gratia} only, but if you passed additional
arguments to cowplot::plot_grid()
or used the align
or axis
arguments of
draw()
and appraise()
, you'll need to adapt code accordingly.
Typically, you can simply delete the align
or axis
arguments and
{patchwork} will just work and align plots nicely. Any arguments passed via
...
to cowplot::plot_grid()
will just be ignored by
patchwork::wrap_plots()
unless those passed arguments match any of the
arguments of patchwork::wrap_plots()
.
The {patchwork} package is now used for multi-panel figures. As such, {gratia} no longer Imports from the {cowplot} package.
Worm plot diagnostic plots are available via new function worm_plot()
. Worm
plots are detrended Q-Q plots, where deviations from the Q-Q reference line are
emphasized as deviations around the line occupy the full height of the plot.
worm_plot()
methods are available for models of classes "gam"
, "glm"
,
and "lm"
. (#62)
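The detrending idea can be sketched in base R by subtracting the theoretical quantiles from the sample quantiles. This assumes a standard normal reference and plain simulated residuals; worm_plot() works from model residuals and its own reference methods, which are not reproduced here.

```r
set.seed(42)
r <- rnorm(100)  # stand-in for model residuals

# Theoretical (x) and sample (y) quantiles, without plotting
qq <- qqnorm(r, plot.it = FALSE)

# Detrend: plot the deviation from the 1:1 line instead of the raw quantile
worm <- data.frame(theoretical = qq$x,
                   deviation   = qq$y - qq$x)

plot(deviation ~ theoretical, data = worm)
abline(h = 0, lty = 2)  # the reference line is now horizontal
```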
Smooths can now be compared across models using compare_smooths()
, and
comparisons visualised with the associated draw()
method. (#85 @dill)
This feature is a bit experimental; the returned object uses nested lists and may change in the future if users find this confusing.
The reference line in qq_plot()
with method = "normal"
was previously
drawn as a line with intercept 0 and slope 1, to match the other methods. This
was inconsistent with stats::qqline()
which draws the line through the 1st
and 3rd quartiles. qq_plot()
with method = "normal"
now uses this robust
reference line. Reference lines for the other methods remain drawn with slope
1 and intercept 0.
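A line through the first and third quartiles can be computed by hand as below, which is how stats::qqline() defines its default line. The simulated residuals are an assumption for illustration.

```r
set.seed(8)
r <- rnorm(200, mean = 1, sd = 2)  # stand-in for residuals

probs <- c(0.25, 0.75)
y_q   <- quantile(r, probs, names = FALSE)  # sample quartiles
x_q   <- qnorm(probs)                       # theoretical quartiles

# Line passing through the two quartile pairs
slope     <- diff(y_q) / diff(x_q)
intercept <- y_q[1] - slope * x_q[1]

c(intercept = intercept, slope = slope)
```

Because the line depends only on the quartiles, a few extreme residuals in the tails do not move it.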
qq_plot()
with method = "normal"
now draws a point-wise reference band
using the standard error of the order statistic.
The draw()
method for penalty()
now plots the penalty matrix heatmaps in a
more-logical orientation, to match how the matrices might be written down or
printed to the R console.
link()
, and inv_link()
now work for models fitted with the gumbls()
and
shash()
families. (#84)
extract_link()
is a lower level utility function related to link()
and
inv_link()
, and is now exported.
The default method name for generating reference quantiles in qq_plot()
was
changed from "direct"
to "uniform"
, to avoid confusion with the
mgcv::qq.gam()
help page description of the methods. Accordingly using
method = "direct"
is deprecated and a message to this effect is displayed if
used.
The way smooths/terms are selected in derivatives()
has been switched to use
the same mechanism as draw.gam()
's select
argument. To get a partial match
to term
, you now need to also specify partial_match = TRUE
in the call to
derivatives()
.
transform_fun()
had a copy paste bug in the definition of the then generic.
(#96 @Aariq)
derivatives()
with user-supplied newdata
would fail for factor by smooths
with interval = "simultaneous"
and would introduce rows with derivative == 0
with interval = "confidence"
because it didn't subset the rows of newdata
for the specific level of the by factor when computing derivatives.
(#102 @sambweber)
evaluate_smooth()
can now handle random effect smooths defined using an
ordered factor. (#99 @StefanoMezzini)
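smooth_estimates() can now handle
multivariate smooths, e.g. s(x, z, a),
tensor product smooths (te(), t2(), & ti()), e.g. te(x, z, a),
factor-smooth interactions, e.g. s(x, f, bs = "fs"),
random effect smooths, e.g. s(f, bs = "re").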
penalty()
provides a tidy representation of the penalty matrices of
smooths. The tidy representation is most suitable for plotting with
ggplot()
.
A draw()
method is provided, which represents the penalty matrix as a
heatmap.
The newdata argument to smooth_estimates() has been changed to data, as
was originally intended.
Partial residuals for models can be computed with partial_residuals(). The
partial residuals are the weighted residuals of the model added to the
contribution of each smooth term (as returned by predict(model, type = "terms")).
Also, new function add_partial_residuals()
can be used to add the partial
residuals to data frames.
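The construction can be sketched with mgcv alone. The example assumes a Gaussian model with identity link, where the weights are all 1 so the working residuals need no further weighting; it is an illustration of the idea rather than the code in partial_residuals().

```r
library(mgcv)

set.seed(2)
df <- gamSim(1, n = 200, verbose = FALSE)
m  <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = df, method = "REML")

# Contribution of each term to the linear predictor, one column per term
terms_mat   <- predict(m, type = "terms")
smooth_cols <- grep("^s\\(", colnames(terms_mat), value = TRUE)

# Residuals of the model (Gaussian, identity link: weights are 1)
res <- residuals(m, type = "working")

# Partial residuals: add the residuals to each smooth's contribution
p_resid <- terms_mat[, smooth_cols, drop = FALSE] + res
head(p_resid)
```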
Users can now control to some extent what colour or fill scales are used when
plotting smooths in those draw()
methods that use them. This is most useful
to change the fill scale when plotting 2D smooths, or to change the discrete
colour scale used when plotting random factor smooths (bs = "fs"
).
The user can pass scales via arguments discrete_colour
and
continuous_fill
.
The effects of certain smooths can be excluded from data simulated from a
model using simulate.gam()
and predicted_samples()
by passing exclude
or
terms
on to predict.gam()
. This allows for excluding random effects, for
example, from model predicted values that are then used to simulate new data
from the conditional distribution. See the example in predicted_samples()
.
Wish of #74 (@hgoldspiel)
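The effect of exclude can be seen directly with predict.gam(). The sketch below uses simulated data with a random intercept and then simulates Gaussian responses from the predictions without the random effect; the data and the manual rnorm() step are assumptions for illustration, not the internals of simulate.gam() or predicted_samples().

```r
library(mgcv)

set.seed(3)
n_subj   <- 10
df       <- data.frame(x = runif(200),
                       f = factor(rep(seq_len(n_subj), each = 20)))
subj_eff <- rnorm(n_subj, sd = 1)
df$y     <- sin(2 * pi * df$x) + subj_eff[as.integer(df$f)] +
  rnorm(200, sd = 0.3)

m <- gam(y ~ s(x) + s(f, bs = "re"), data = df, method = "REML")

# Predictions including and excluding the random effect smooth
mu_all   <- predict(m)
mu_no_re <- predict(m, exclude = "s(f)")

# Simulate new responses around the predictions without the random effect
sim <- rnorm(length(mu_no_re), mean = mu_no_re, sd = sqrt(m$sig2))
head(cbind(mu_all, mu_no_re, sim))
```

The term is named by its smooth label, here "s(f)", as it appears in summary(m).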
draw.gam()
and related functions gain arguments constant
and fun
to
allow for user-defined constants and transformations of smooth estimates and
confidence intervals to be applied.
Part of wish of #79.
confint.gam()
now works for 2D smooths also.
smooth_estimates()
is an early version of code to replace (or more likely
supersede) evaluate_smooth()
. smooth_estimates()
can currently only handle
1D smooths of the standard types.
The meaning of parm
in confint.gam
has changed. This argument now requires
a smooth label to match a smooth. A vector of labels can be provided, but
partial matching against a smooth label only works with a single parm
value.
The default behaviour remains unchanged however; if parm
is NULL
then all
smooths are evaluated and returned with confidence intervals.
data_class()
is no longer exported; it was only ever intended to be an
internal function.
confint.gam()
was failing on a tensor product smooth due to matching issues.
Reported by @tamas-ferenci
#88
This also fixes #80 (@butellyn) which was a related issue with selecting a specific smooth.
The vdiffr package is now used conditionally in package tests. Reported by Brian Ripley #93
draw.gam()
with scales = "fixed"
now applies to all terms that can be
plotted, including 2d smooths.
Reported by @StefanoMezzini #73
dplyr::combine()
was deprecated. Switch to vctrs::vec_c()
.
draw.gam()
with scales = "fixed"
wasn't using fixed scales where 2d
smooths were in the model.
Reported by @StefanoMezzini #73
draw.gam()
can include partial residuals when drawing univariate smooths.
Use residuals = TRUE
to add partial residuals to each univariate smooth that
is drawn. This feature is not available for smooths of more than one variable,
by smooths, or factor-smooth interactions (bs = "fs"
).
The coverage of credible and confidence intervals drawn by draw.gam()
can be
specified via argument ci_level
. The default is arbitrarily 0.95
for no
other reason than (rough) compatibility with plot.gam()
.
This change has had the effect of making the intervals slightly narrower than in previous versions of gratia. Previously, intervals were drawn at ± 2 × the standard error; the default intervals are now drawn at ± ~1.96 × the standard error.
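The size of the change is easy to check. The sketch assumes the critical value is the standard normal quantile for the requested coverage, which matches the ~1.96 quoted above; the estimate and standard error are made-up numbers.

```r
ci_level <- 0.95
crit     <- qnorm((1 + ci_level) / 2)  # two-sided normal critical value

est <- 0.8  # illustrative estimate
se  <- 0.1  # illustrative standard error

old_interval <- est + c(-2, 2) * se          # previous: +/- 2 SE
new_interval <- est + c(-1, 1) * crit * se   # now: +/- ~1.96 SE

crit
rbind(old = old_interval, new = new_interval)
```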
New function difference_smooths()
for computing differences between factor
smooth interactions. Methods available for gam()
, bam()
, gamm()
and
gamm4::gamm4()
. Also has a draw()
method, which can handle differences of
1D and 2D smooths currently (handling 3D and 4D smooths is planned).
New functions add_fitted()
and add_residuals()
to add fitted values
(expectations) and model residuals to an existing data frame. Currently
methods available for objects fitted by gam()
and bam()
.
data_sim()
is a tidy reimplementation of mgcv::gamSim()
with the added
ability to use sampling distributions other than the Gaussian for all models
implemented. Currently Gaussian, Poisson, and Bernoulli sampling distributions
are available.
smooth_samples()
can handle continuous by variable smooths such as in
varying coefficient models.
link()
and inv_link()
now work for all families available in mgcv,
including the location, scale, shape families, and the more specialised
families described in ?mgcv::family.mgcv
.
evaluate_smooth()
, data_slice()
, family()
, link()
, inv_link()
methods for models fitted using gamm4()
from the gamm4 package.
data_slice()
can generate data for a 1-d slice (a single variable varying).
The colour of the points, reference lines, and simulation band in appraise()
can now be specified via arguments
point_col, point_alpha, ci_col, ci_alpha, and line_col.
These are passed on to qq_plot()
, observed_fitted_plot()
,
residuals_linpred_plot()
, and residuals_hist_plot()
, which also now take
the new arguments where applicable.
Added utility functions is_factor_term()
and term_variables()
for working
with models. is_factor_term()
identifies whether the named term is a factor using
information from the terms()
object of the fitted model. term_variables()
returns a character vector of variable names that are involved in a model
term. These are strictly for working with parametric terms in models.
appraise()
now works for models fitted by glm()
and lm()
, as do the
underlying functions it calls, especially qq_plot().
appraise()
also works for models fitted with family gaulss()
. Further
location scale models and models fitted with extended family functions will
be supported in upcoming releases.
datagen()
is now an internal function and is no longer exported. Use
data_slice()
instead.
evaluate_parametric_term()
is now much stricter and can only evaluate main
effect terms, i.e. those whose order, as stored in the terms
object of the
model is 1
.
The draw()
method for derivatives()
was not getting the x-axis label for
factor by smooths correctly, and instead was using NA
for the second and
subsequent levels of the factor.
The datagen()
method for class "gam"
couldn't possibly have worked for
anything but the simplest models and would fail even with simple factor by
smooths. These issues have been fixed, but the behaviour of datagen()
has
changed, and the function is now not intended for use by users.
Fixed an issue where model terms of the form factor1:factor2
were
incorrectly identified as being numeric parametric terms.
#68
New functions link()
and inv_link()
to access the link function and its
inverse from fitted models and family functions.
Methods for classes: "glm"
, "gam"
, "bam"
, "gamm"
currently.
#58
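The link function and its inverse that these helpers expose are stored on the family object of a fitted model. The sketch below reaches them through base R for a Poisson GLM; it shows the underlying objects rather than calling link() or inv_link(), and the data are simulated for illustration.

```r
set.seed(7)
x <- runif(100)
y <- rpois(100, lambda = exp(1 + x))
m <- glm(y ~ x, family = poisson())

fam <- family(m)

# Link function: response scale -> linear predictor scale
eta <- fam$linkfun(fitted(m))

# Inverse link: linear predictor scale -> response scale
mu <- fam$linkinv(eta)

head(cbind(eta = eta, mu = mu))
```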
Adds explicit family()
methods for objects of classes "gam"
, "bam"
, and
"gamm"
.
derivatives()
now handles non-numeric variables when creating shifted data for finite
differences. Fixes a problem with stringsAsFactors = FALSE
default in
R-devel. #64
gratia now uses the mvnfast package for random draws from a multivariate
normal distribution (mvnfast::rmvn()
). Contributed by Henrik Singmann
(@singmann) #28
New function basis()
for generating tidy representations of basis expansions
from an mgcv-like definition of a smooth, e.g. s()
, te()
, ti()
, or
t2()
. The basic smooth types also have a simple draw()
method for plotting
the basis. basis()
is a simple wrapper around mgcv::smoothCon()
with some
post processing of the basis model matrix into a tidy format.
#42
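The kind of post-processing described can be sketched with mgcv::smoothCon() directly. The long-format column names (x, bf, value) are assumptions chosen for this illustration and may differ from what basis() returns.

```r
library(mgcv)

df <- data.frame(x = seq(0, 1, length.out = 50))

# Build the basis for a smooth specification; smoothCon() returns a list
sm <- smoothCon(s(x, k = 5), data = df)[[1]]
X  <- sm$X  # basis model matrix: one column per basis function

# Reshape to long format: one row per (x value, basis function) pair
tidy_basis <- data.frame(
  x     = rep(df$x, times = ncol(X)),
  bf    = factor(rep(seq_len(ncol(X)), each = nrow(X))),
  value = as.vector(X)
)
head(tidy_basis)
```

The long format has one row per evaluation point and basis function, which is the shape ggplot2 expects for drawing one line per basis function.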
New function smooth_samples()
to draw samples of entire smooth functions
from their posterior distribution. Also has a draw()
method for plotting the
posterior samples.
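Posterior draws of a single smooth can be sketched with mgcv alone, assuming the usual Gaussian approximation to the posterior of the coefficients. The model, the evaluation grid and the use of mgcv::rmvn() are choices made for this illustration and are not the internals of smooth_samples().

```r
library(mgcv)

set.seed(4)
df <- gamSim(1, n = 200, verbose = FALSE)
m  <- gam(y ~ s(x0) + s(x1), data = df, method = "REML")

# Grid over x0; x1 is held fixed and does not affect the s(x0) columns
new_df <- data.frame(x0 = seq(0, 1, length.out = 25), x1 = 0.5)
Xp     <- predict(m, newdata = new_df, type = "lpmatrix")

# Draw coefficient vectors from the approximate posterior
betas <- rmvn(20, coef(m), vcov(m))  # 20 draws, one per row

# Columns of the coefficient vector belonging to the first smooth, s(x0)
sm  <- m$smooth[[1]]
idx <- sm$first.para:sm$last.para

# Each column of draws is one posterior sample of the function s(x0)
draws <- Xp[, idx] %*% t(betas[, idx])
matplot(new_df$x0, draws, type = "l", lty = 1, col = "grey40")
```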
draw.gam()
would produce empty plots between the panels for the parametric
terms if there were 2 or more parametric terms in a model. Reported by
@sklayn #39.
derivatives()
now works with factor by smooths, including ordered factor by
smooths. The function also now works correctly for complex models with
multiple covariates/smooths.
#47
derivatives()
also now handles 'fs'
smooths. Reported by
@tomand-uio #57.
evaluate_parametric_term()
and hence draw.gam()
would fail on a ziplss()
model because i) gratia didn't handle parametric terms in models with
multiple linear predictors correctly, and ii) gratia didn't convert to the
naming convention of mgcv for terms in higher linear predictors. Reported by
@pboesu #45