`vignettes/pdp-classification.Rmd`

```
# Load required packages
library(kernlab) # for fitting support vector machines
library(pdp) # for partial dependence plots
library(vip) # for variable importance plots
```

Traditionally, for classification problems, partial dependence functions are on a scale similar to the logit; see, for example, Hastie, Tibshirani, and Friedman (2009, 369–70). Suppose the response is categorical with \(K\) levels; then for each class we compute \[
f_k(x) = \log\left[p_k(x)\right] - \frac{1}{K}\sum_{l = 1}^K\log\left[p_l(x)\right], \quad k = 1, 2, \dots, K,
\] where \(p_k(x)\) is the predicted probability for the \(k\)-th class. Plotting \(f_k(x)\) helps us understand how the log-odds for the \(k\)-th class depend on different subsets of the predictor variables. The same goes for plotting *individual conditional expectation* (ICE) curves (Goldstein et al. 2015).
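To make the formula concrete, here is a minimal base-R sketch that applies the class-centering to a matrix of predicted probabilities (the helper name `centered_logit()` is ours, not part of **pdp**):

```
# Class-centered logit: f_k(x) = log p_k(x) - (1/K) * sum_l log p_l(x)
centered_logit <- function(prob) {
  log_p <- log(prob)  # rows = observations, columns = classes
  sweep(log_p, 1, rowMeans(log_p), FUN = "-")
}

# Toy probabilities for two observations and K = 3 classes
p <- rbind(c(0.8, 0.1, 0.1),
           c(0.2, 0.5, 0.3))
f <- centered_logit(p)
rowSums(f)  # each row sums to zero by construction
```

Note that the \(f_k(x)\) for a given observation always sum to zero, which is why this is sometimes described as a class-centered logit scale.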

To illustrate, we consider Edgar Anderson's iris data from the **datasets** package. The `iris` data frame contains the sepal length, sepal width, petal length, and petal width (in centimeters) for 50 flowers from each of three species of iris: setosa, versicolor, and virginica (i.e., \(K = 3\)). In the code chunk below, we fit a support vector machine (SVM) with a Gaussian radial basis function kernel to the `iris` data using the `ksvm()` function in the **kernlab** package (Karatzoglou, Smola, and Hornik 2018); the tuning parameters were determined using 5-fold cross-validation. **Note that `partial()` has to be able to extract the predicted probabilities for each class**, so it is necessary to set `prob.model = TRUE` in the call to `ksvm()`.

```
# Fit an SVM to Edgar Anderson's iris data
iris_svm <- ksvm(Species ~ ., data = iris, kernel = "rbfdot",
                 kpar = list(sigma = 0.709), C = 0.5, prob.model = TRUE)
```
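As a quick sanity check (not part of the original workflow), we can verify that the fitted model returns a matrix of class probabilities, which is what `partial()` relies on internally:

```
# Verify that the fit returns per-class probabilities (uses the fit above)
probs <- predict(iris_svm, newdata = head(iris), type = "probabilities")
dim(probs)  # one row per observation, one column per species
```

Each row of `probs` sums to one; had `prob.model = TRUE` been omitted from the call to `ksvm()`, class probabilities would not be available.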

Before constructing ICE curves or partial dependence plots (PDPs), it is good practice to investigate variable importance. In the code chunk below, we use the **vip** package (Greenwell and Boehmke, n.d.) to construct a PDP-based variable importance plot (VIP); see `?vip::vi` for details. The VIP (**Figure 1**) suggests that `Petal.Width` and `Petal.Length` are the most important features.

```
# Variable importance plot
vip(
  object = iris_svm,
  method = "pdp",
  feature_names = colnames(iris_svm@xmatrix[[1]]),
  train = iris
)
#> Warning: Setting `method = "pdp"` is experimental, use at your own risk!
```

Next we’ll construct ICE curves (black curves) and PDPs (red curves) for all four features. The results are displayed in **Figure 2** below.

```
# Construct ICE curves and PDPs for each feature and store the results in a list
library(gridExtra)  # for grid.arrange()
features <- names(subset(iris, select = -Species))
pdps <- list()
for (feature in features) {
  pdps[[feature]] <- partial(iris_svm, pred.var = feature, ice = TRUE,
                             plot = TRUE, rug = TRUE, train = iris, alpha = 0.1)
}
grid.arrange(grobs = pdps, ncol = 2)  # display plots in a grid
```

It is also possible to obtain PDPs for classification problems on the raw probability scale by setting `prob = TRUE` in the call to `partial()`. An example is given below and the results are displayed in **Figure 3**.

```
partial(iris_svm, pred.var = c("Petal.Width", "Sepal.Width"), plot = TRUE,
        chull = TRUE, train = iris, prob = TRUE, plot.engine = "ggplot2",
        palette = "magma")
```
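The probability scale and the default class-centered logit scale carry the same information: applying a softmax to the centered logits recovers the original probabilities. A quick base-R check (not from the vignette):

```
# Round trip: probabilities -> class-centered logits -> probabilities
p <- c(0.7, 0.2, 0.1)            # predicted probabilities for K = 3 classes
f <- log(p) - mean(log(p))       # class-centered logits
p_back <- exp(f) / sum(exp(f))   # softmax undoes the centering
all.equal(p, p_back)
```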

By default, `partial()` will use the first class as the *focus* class. To plot the PDP for a particular class, use the `which.class` argument. To illustrate, we'll plot the partial dependence of `Species` on both `Petal.Width` and `Petal.Length` for each of the three classes. The results are displayed in **Figure 4** below.

```
# Compute partial dependence for each of the three classes
pd <- NULL
for (i in 1:3) {
  tmp <- partial(iris_svm, pred.var = c("Petal.Width", "Petal.Length"),
                 which.class = i, grid.resolution = 10, train = iris)
  pd <- rbind(pd, cbind(tmp, Species = levels(iris$Species)[i]))
}

# Figure 4
library(ggplot2)
ggplot(pd, aes(x = Petal.Width, y = Petal.Length, z = yhat, fill = yhat)) +
  geom_tile() +
  geom_contour(color = "white", alpha = 0.5) +
  scale_fill_distiller(name = "Class-centered\nlogit", palette = "Spectral") +
  theme_bw() +
  facet_grid(~ Species)
```

Goldstein, Alex, Adam Kapelner, Justin Bleich, and Emil Pitkin. 2015. “Peeking Inside the Black Box: Visualizing Statistical Learning with Plots of Individual Conditional Expectation.” *Journal of Computational and Graphical Statistics* 24 (1): 44–65. https://doi.org/10.1080/10618600.2014.907095.

Greenwell, Brandon, and Brad Boehmke. n.d. *Vip: Variable Importance Plots*. https://koalaverse.github.io/vip/index.html.

Hastie, Trevor, Robert Tibshirani, and Jerome Friedman. 2009. *The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Second Edition*. Springer Series in Statistics. Springer-Verlag.

Karatzoglou, Alexandros, Alex Smola, and Kurt Hornik. 2018. *Kernlab: Kernel-Based Machine Learning Lab*. https://CRAN.R-project.org/package=kernlab.