In this tutorial, we will perform a local polynomial regression with the loess function and plot the resulting curved regression line in base R and in ggplot2. We will also look at an alternative in ggplot2: fitting and plotting the regression line in a single step with the geom_smooth function.
Load the ggplot2 package
library(ggplot2)
Load the dataset
data("iris")
Variable assignment
dataset = iris
x_column = "Sepal.Length"
y_column = "Petal.Length"
Order the data frame
First, we need to order the data frame by the x values so that the local polynomial regression line is drawn from left to right:
dataset = dataset[order(dataset[, x_column]),]
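The ordering matters mostly for the base R plot, because lines() connects points in the order they are given, whereas geom_line sorts by x on its own. As a minimal sketch of what order() does, here is the same indexing applied to made-up toy vectors (x and y below are hypothetical and not part of the iris data):

```r
# Hypothetical toy vectors, used only to illustrate the ordering step
x <- c(3, 1, 2)
y <- c(30, 10, 20)

# order() returns the positions that would sort x in increasing order
idx <- order(x)

# Indexing both vectors with idx keeps each (x, y) pair together
x_sorted <- x[idx]
y_sorted <- y[idx]

print(data.frame(x = x_sorted, y = y_sorted))
```

Indexing the rows of a data frame with order(), as done above, works the same way.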
Local polynomial regression
Then, we can perform the local polynomial regression using the loess function:
loess_results = loess(dataset[, y_column] ~ dataset[, x_column])
Here is the result of the call:
Call:
loess(formula = dataset[, y_column] ~ dataset[, x_column])
Number of Observations: 150
Equivalent Number of Parameters: 4.74
Residual Standard Error: 0.7917
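The call above relies on the loess defaults (span = 0.75 and degree = 2). To get a feel for how the span changes the fit, the following sketch refits the model with a few span values and compares the equivalent number of parameters and the residual standard error. The span values chosen here are arbitrary illustrations, and the formula uses the column names directly instead of the x_column and y_column variables.

```r
data("iris")

# Illustrative span values (an assumption; 0.75 is the loess default)
spans <- c(0.5, 0.75, 1)

# Fit one local quadratic regression per span value
fits <- lapply(spans, function(s) {
  loess(Petal.Length ~ Sepal.Length, data = iris, span = s, degree = 2)
})

# enp: equivalent number of parameters, s: residual standard error
enp <- sapply(fits, function(f) f$enp)
rse <- sapply(fits, function(f) f$s)

print(data.frame(span = spans, enp = round(enp, 2), rse = round(rse, 4)))
```

A smaller span uses fewer neighbouring points for each local fit, which gives a more flexible curve and a larger equivalent number of parameters.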
Extract the fitted y values
To add the regression line to the plots, we need to extract the fitted y values from the loess results:
dataset[, "loess"] = loess_results$fitted
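The fitted values are only available at the observed x values. If a curve on an evenly spaced grid or a pointwise confidence band is wanted, one option is predict() with se = TRUE. The sketch below assumes the model is refitted with column names and a data argument so that predict() can match the new data; the names fit, grid, lower and upper are illustrative, and the 95% band is built from a t quantile as a simple approximation.

```r
data("iris")

# Refit with column names so that predict() can use newdata
fit <- loess(Petal.Length ~ Sepal.Length, data = iris)

# Evenly spaced x values covering the observed range
grid <- data.frame(
  Sepal.Length = seq(min(iris$Sepal.Length), max(iris$Sepal.Length),
                     length.out = 100)
)

# se = TRUE returns a list with fit, se.fit, residual.scale and df
pred <- predict(fit, newdata = grid, se = TRUE)

# Approximate 95% pointwise band (assumption: t quantile with pred$df)
crit <- qt(0.975, pred$df)
grid$fit <- pred$fit
grid$lower <- pred$fit - crit * pred$se.fit
grid$upper <- pred$fit + crit * pred$se.fit

head(grid)
```

The columns of grid can then be drawn with lines() in base R or with geom_line and geom_ribbon in ggplot2.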
Line color
In addition, we will use the following line color:
lcol = palette()[4]
Add a local polynomial regression line in R
plot(dataset[, x_column], dataset[, y_column], pch = 18,
     xlab = x_column, ylab = y_column)
lines(dataset[, x_column], dataset[, "loess"], col = lcol, lwd = 3)
Add a local polynomial regression line in ggplot2
# .data[[name]] selects a column by the name stored in a string
ggplot(dataset, aes(x = .data[[x_column]], y = .data[[y_column]])) +
  geom_point(shape = 18) +
  geom_line(aes(y = .data[["loess"]]), col = lcol, lwd = 1) +
  labs(x = x_column, y = y_column)
Perform the local polynomial regression in ggplot2
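It is also possible (and easier) to perform the local polynomial regression directly in ggplot2 with the geom_smooth function. With method = "loess", geom_smooth fits the same kind of model with the same default span, but it evaluates the curve on an evenly spaced grid of x values rather than at the observed points, so the plot is practically the same as with the previous method.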
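# .data[[name]] selects a column by the name stored in a string
ggplot(dataset, aes(x = .data[[x_column]], y = .data[[y_column]])) +
  geom_point(shape = 18) +
  geom_smooth(method = "loess", se = FALSE, col = lcol, lwd = 1) +
  labs(x = x_column, y = y_column)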
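One way to check how closely geom_smooth matches a separate loess fit is to pull the smoothed values out of the plot with layer_data() and compare them with predict() at the same x positions. This is a minimal sketch that assumes the smooth is the second layer of the plot; the names p, smooth, manual and max_diff are illustrative.

```r
library(ggplot2)
data("iris")

p <- ggplot(iris, aes(x = Sepal.Length, y = Petal.Length)) +
  geom_point(shape = 18) +
  geom_smooth(method = "loess", formula = y ~ x, se = FALSE)

# Layer 2 is the smooth; its x and y columns hold the drawn curve
smooth <- layer_data(p, 2)

# Separate loess fit with default settings, evaluated at the same x values
fit <- loess(Petal.Length ~ Sepal.Length, data = iris)
manual <- predict(fit, newdata = data.frame(Sepal.Length = smooth$x))

# Largest absolute difference between the two curves
max_diff <- max(abs(smooth$y - manual))
print(max_diff)
```

A difference close to zero indicates that both approaches draw the same curve. Passing a span argument to geom_smooth would require the same span in the loess call for the comparison to hold.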
Conclusion
In conclusion, local polynomial regression is easy to perform in R with the loess function, and the fitted curve can be plotted with base R or ggplot2 functions.