Compute partial dependence functions (i.e., marginal effects) for various model fitting objects.
partial(object, ...)

# S3 method for default
partial(object, pred.var, pred.grid, pred.fun = NULL,
  grid.resolution = NULL, ice = FALSE, center = FALSE, quantiles = FALSE,
  probs = 1:9/10, trim.outliers = FALSE,
  type = c("auto", "regression", "classification"), inv.link = NULL,
  which.class = 1L, prob = FALSE, recursive = TRUE, plot = FALSE,
  plot.engine = c("lattice", "ggplot2"), smooth = FALSE, rug = FALSE,
  chull = FALSE, levelplot = TRUE, contour = FALSE, contour.color = "white",
  palette = c("viridis", "magma", "inferno", "plasma", "cividis"), alpha = 1,
  train, cats = NULL, check.class = TRUE, progress = "none", parallel = FALSE,
  paropts = NULL, ...)
object  A fitted model object of appropriate class (e.g., 

...  Additional optional arguments to be passed onto

pred.var  Character string giving the names of the predictor variables of interest. For reasons of computation/interpretation, this should include no more than three variables. 
pred.grid  Data frame containing the joint values of interest for the
variables listed in 
pred.fun  Optional prediction function that requires two arguments:

grid.resolution  Integer giving the number of equally spaced points to
use for the continuous variables listed in 
ice  Logical indicating whether or not to compute individual
conditional expectation (ICE) curves. Default is FALSE.
center  Logical indicating whether or not to produce centered ICE
curves (cICE curves). Only used when 
quantiles  Logical indicating whether or not to use the sample
quantiles of the continuous predictors listed in 
probs  Numeric vector of probabilities with values in [0,1]. (Values up
to 2e-14 outside that range are accepted and moved to the nearby endpoint.)
Default is 1:9/10.
trim.outliers  Logical indicating whether or not to trim off outliers
from the continuous predictors listed in 
type  Character string specifying the type of supervised learning.
Current options are "auto", "regression", and "classification".
inv.link  Function specifying the transformation to be applied to the
predictions before the partial dependence function is computed
(experimental). Default is NULL.
which.class  Integer specifying which column of the matrix of predicted
probabilities to use as the "focus" class. Default is to use the first class.
Only used for classification problems (i.e., when

prob  Logical indicating whether or not partial dependence for
classification problems should be returned on the probability scale, rather
than the centered logit. If 
recursive  Logical indicating whether or not to use the weighted tree
traversal method described in Friedman (2001). This only applies to objects
that inherit from class 
plot  Logical indicating whether to return a data frame containing the
partial dependence values ( 
plot.engine  Character string specifying which plotting engine to use
whenever 
smooth  Logical indicating whether or not to overlay a LOESS smooth.
Default is FALSE.
rug  Logical indicating whether or not to include a rug display on the
predictor axes. The tick marks indicate the min/max and deciles of the
predictor distributions. This helps reduce the risk of interpreting the
partial dependence plot outside the region of the data (i.e., extrapolating).
Only used when 
chull  Logical indicating whether or not to restrict the values of the
first two variables in 
levelplot  Logical indicating whether or not to use a false color level
plot ( 
contour  Logical indicating whether or not to add contour lines to the
level plot. Only used when 
contour.color  Character string specifying the color to use for the
contour lines when 
palette  Character string indicating the colormap option to use. Five options are available: "viridis" (the default), "magma", "inferno", "plasma", and "cividis". 
alpha  Numeric value in 
train  An optional data frame, matrix, or sparse matrix containing the
original training data. This may be required depending on the class of

cats  Character string indicating which columns of 
check.class  Logical indicating whether or not to make sure each column
in 
progress  Character string giving the name of the progress bar to use
while constructing the partial dependence function. See

parallel  Logical indicating whether or not to run 
paropts  List containing additional options to be passed onto

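The grid-related arguments above (pred.var, pred.grid, grid.resolution, quantiles, probs) are easier to follow with a small worked computation. The following is a minimal base R sketch of brute-force partial dependence for a single predictor. It is not the implementation used by partial; the helper pd_brute, the lm fit, and the use of the built-in mtcars data are assumptions made only for illustration.

```r
# Illustrative sketch only: brute-force partial dependence in base R.
# `pd_brute` is a hypothetical helper, not part of any package.
fit <- lm(mpg ~ wt + hp + disp, data = mtcars)

pd_brute <- function(model, train, var, grid) {
  yhat <- vapply(grid, function(v) {
    newdata <- train
    newdata[[var]] <- v                       # hold the predictor of interest fixed
    mean(predict(model, newdata = newdata))   # average over the other predictors
  }, numeric(1))
  out <- data.frame(grid, yhat)
  names(out)[1] <- var
  out
}

# Equally spaced grid (similar in spirit to grid.resolution = 10)
grid_equal <- seq(min(mtcars$wt), max(mtcars$wt), length.out = 10)

# Quantile-based grid (similar in spirit to quantiles = TRUE, probs = 1:9/10)
grid_quant <- unname(quantile(mtcars$wt, probs = 1:9/10))

pd_equal <- pd_brute(fit, mtcars, "wt", grid_equal)
pd_quant <- pd_brute(fit, mtcars, "wt", grid_quant)
head(pd_equal)
```

Because the model in this sketch is linear, the resulting curve is a straight line whose slope equals the fitted coefficient for wt, which makes the output easy to check by hand.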
By default, partial returns an object of class c("data.frame", "partial"). If ice = TRUE and center = FALSE then an object of class c("data.frame", "ice") is returned. If ice = TRUE and center = TRUE then an object of class c("data.frame", "cice") is returned. These three classes determine the behavior of the plotPartial function which is automatically called whenever plot = TRUE. Specifically, when plot = TRUE, a "trellis" object is returned (see lattice for details); the "trellis" object will also include an additional attribute, "partial.data", containing the data displayed in the plot.
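To tell which of the three result types is in hand, the class names described above can be tested with inherits. The sketch below mimics the class vector on a hand-made data frame rather than calling partial; the function describe_pd and the toy values are hypothetical and serve only to show the check.

```r
# Illustrative sketch only: a hand-made object carrying the class vector
# described above. `describe_pd` is a hypothetical helper.
pd <- data.frame(x = 1:3, yhat = c(0.2, 0.5, 0.9))
class(pd) <- c("data.frame", "partial")

describe_pd <- function(x) {
  if (inherits(x, "cice")) {
    "centered ICE curves"
  } else if (inherits(x, "ice")) {
    "ICE curves"
  } else if (inherits(x, "partial")) {
    "partial dependence"
  } else {
    "unknown"
  }
}

describe_pd(pd)
is.data.frame(pd)   # the object still behaves as an ordinary data frame
```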
In some cases it is difficult for partial to extract the original training data from object. In these cases an error message is displayed requesting the user to supply the training data via the train argument in the call to partial. In most cases where partial can extract the required training data from object, it is taken from the same environment in which partial is called. Therefore, it is important to not change the training data used to construct object before calling partial. This problem is completely avoided when the training data are passed to the train argument in the call to partial.
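The hazard described above can be reproduced in base R without partial. Many fitted model objects store only the name of their data argument in the call, so looking that name up later returns whatever the name currently refers to. The sketch below assumes an lm fit on the built-in mtcars data; how partial itself locates the training data may differ in detail.

```r
# Illustrative sketch only: why changing the training data after fitting is risky.
dat <- mtcars
fit <- lm(mpg ~ wt, data = dat)

# The call stores the *name* of the data argument, not a copy of the data
fit$call$data

# Modify the data after the model has been fit
dat <- dat[dat$cyl == 4, ]

# Looking the stored name up again now finds the modified data
recovered <- eval(fit$call$data)
nrow(recovered)          # rows in the modified data
nrow(model.frame(fit))   # rows actually used to fit the model
```

The two row counts disagree, so any computation that averages predictions over the recovered data would silently use the wrong observations. Supplying the data explicitly through train sidesteps the lookup.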
It is recommended to call partial with plot = FALSE and store the results. This allows for more flexible plotting, and the user will not have to waste time calling partial again if the default plot is not sufficient.
It is possible to retrieve the last printed "trellis" object, such as those produced by plotPartial, using trellis.last.object().
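As a small illustration of retrieving the last printed "trellis" object, the sketch below uses a plain lattice scatterplot in place of a plotPartial display. The off-screen pdf(NULL) device and the mtcars plot are assumptions chosen so the code runs non-interactively.

```r
# Illustrative sketch only: retrieving the last printed lattice object.
library(lattice)

pdf(NULL)                               # off-screen device; no file is written
p <- xyplot(mpg ~ wt, data = mtcars)
print(p)                                # lattice objects are drawn when printed

last <- trellis.last.object()           # the object that was just printed
class(last)
invisible(dev.off())
```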
If ice = TRUE or the prediction function given to pred.fun returns a prediction for each observation in newdata, then the result will be a curve for each observation. These are called individual conditional expectation (ICE) curves; see Goldstein et al. (2015) and ice for details.
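The relationship between ICE curves, centered ICE curves, and the partial dependence function can be sketched in a few lines of base R. This is not how partial computes them internally; the lm fit with an interaction term, the five-point grid, and centering at the first grid value are assumptions made for illustration.

```r
# Illustrative sketch only: ICE, centered ICE, and partial dependence by hand.
fit <- lm(mpg ~ wt * hp, data = mtcars)   # interaction so the curves are not parallel
grid <- seq(min(mtcars$wt), max(mtcars$wt), length.out = 5)

# ICE: one curve per observation (rows = observations, columns = grid values)
ice <- sapply(grid, function(v) {
  newdata <- mtcars
  newdata$wt <- v                         # set wt to the grid value for every row
  predict(fit, newdata = newdata)
})

# Partial dependence is the pointwise average of the ICE curves
pd <- colMeans(ice)

# Centered ICE: subtract each curve's value at the first grid point
cice <- ice - ice[, 1]

dim(ice)
```

Centering makes every curve start at zero, which can make differences in shape between observations easier to see than in the raw ICE curves.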
Friedman, J. H. Greedy function approximation: A gradient boosting machine. Annals of Statistics, 29: 1189-1232, 2001.
Goldstein, A., Kapelner, A., Bleich, J., and Pitkin, E. Peeking inside the black box: Visualizing statistical learning with plots of individual conditional expectation. Journal of Computational and Graphical Statistics, 24(1): 44-65, 2015.
# NOT RUN {
#
# Regression example (requires randomForest package to run)
#

# Fit a random forest to the boston housing data
library(pdp)
library(randomForest)
data(boston)  # load the boston housing data
set.seed(101)  # for reproducibility
boston.rf <- randomForest(cmedv ~ ., data = boston)

# Using randomForest's partialPlot function
partialPlot(boston.rf, pred.data = boston, x.var = "lstat")

# Using pdp's partial function
head(partial(boston.rf, pred.var = "lstat"))  # returns a data frame
partial(boston.rf, pred.var = "lstat", plot = TRUE, rug = TRUE)

# The partial function allows for multiple predictors
partial(boston.rf, pred.var = c("lstat", "rm"), grid.resolution = 40,
        plot = TRUE, chull = TRUE, progress = "text")

# The plotPartial function offers more flexible plotting
pd <- partial(boston.rf, pred.var = c("lstat", "rm"), grid.resolution = 40)
plotPartial(pd, levelplot = FALSE, zlab = "cmedv", drape = TRUE,
            colorkey = FALSE, screen = list(z = 20, x = 60))

# The autoplot function can be used to produce graphics based on ggplot2
library(ggplot2)
autoplot(pd, contour = TRUE, legend.title = "Partial\ndependence")

#
# Individual conditional expectation (ICE) curves
#

# Use partial to obtain ICE/cICE curves
rm.ice <- partial(boston.rf, pred.var = "rm", ice = TRUE)
plotPartial(rm.ice, rug = TRUE, train = boston, alpha = 0.2)
autoplot(rm.ice, center = TRUE, alpha = 0.2, rug = TRUE, train = boston)

#
# Classification example (requires randomForest package to run)
#

# Fit a random forest to the Pima Indians diabetes data
data(pima)  # load the Pima Indians diabetes data
set.seed(102)  # for reproducibility
pima.rf <- randomForest(diabetes ~ ., data = pima, na.action = na.omit)

# Partial dependence of positive test result on glucose (default logit scale)
partial(pima.rf, pred.var = "glucose", plot = TRUE, chull = TRUE,
        progress = "text")

# Partial dependence of positive test result on glucose (probability scale)
partial(pima.rf, pred.var = "glucose", prob = TRUE, plot = TRUE,
        chull = TRUE, progress = "text")
# }