How to add linear regression lines in R

We will see how to perform a linear regression with the lm function and how to plot the resulting straight regression line in base R and ggplot2, using the abline and geom_abline functions, respectively. Please read my article about adding straight lines in base R and ggplot2 if you are interested in the different plotting options.

Scatter plots with a linear regression line produced with base R above and ggplot2 below

Load the ggplot2 package

library(ggplot2)

Load the dataset

data("iris")

Variable assignment

dataset = iris
x_column = "Sepal.Length"
y_column = "Petal.Length"

Linear regression

First, we fit a linear model with the lm function, using Petal.Length as the response and Sepal.Length as the predictor:

lm_results = lm(dataset[, y_column] ~ dataset[, x_column])

Here is the result of the linear regression:

Call:
lm(formula = dataset[, y_column] ~ dataset[, x_column])

Coefficients:
        (Intercept)  dataset[, x_column]  
             -7.101                1.858

Therefore, we can describe the regression line with an intercept of -7.101 and a slope of 1.858.
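The coefficient labels in the output above are long because the formula indexes the data frame directly. As a minimal sketch of an alternative (not the approach used in the rest of this article), the formula can be built from the column names with reformulate from the stats package and the data frame passed through the data argument of lm. The names formula_from_names and lm_named are chosen here for illustration only.

```r
# Column names, as in the article
data("iris")
x_column <- "Sepal.Length"
y_column <- "Petal.Length"

# Build the formula Petal.Length ~ Sepal.Length from the two strings
formula_from_names <- reformulate(x_column, response = y_column)

# Fit the same model, letting lm() look up the columns in the data frame
lm_named <- lm(formula_from_names, data = iris)

# The coefficients are now labelled "(Intercept)" and "Sepal.Length"
print(coef(lm_named))
```

The fitted values are the same as before; only the labels change, which can make the printed model easier to read.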

Extract the linear regression coefficients

intercept = coef(lm_results)[[1]]
slope = coef(lm_results)[[2]]
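To illustrate what these two numbers mean, here is a small sketch that uses them in the line equation y = intercept + slope * x and compares the result with predict. The model is refitted with an explicit formula so that predict can take new data, and the sepal length of 6 is a hypothetical value picked only for the example.

```r
data("iris")
fit <- lm(Petal.Length ~ Sepal.Length, data = iris)
intercept <- coef(fit)[[1]]
slope <- coef(fit)[[2]]

# Hypothetical sepal length, chosen only for illustration
new_sepal_length <- 6

# Prediction by hand from the line equation y = intercept + slope * x
manual_prediction <- intercept + slope * new_sepal_length

# The same prediction from predict(); the two values should agree
model_prediction <- predict(fit, newdata = data.frame(Sepal.Length = new_sepal_length))
print(c(manual = manual_prediction, predict = unname(model_prediction)))
```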

Line color

Now, I will use the following color for the regression line:

acol = palette()[4]

If you are interested in plot colors, I have written a complete post about color palettes in R.

Add a linear regression line in base R

plot(dataset[, x_column], dataset[, y_column], pch = 18,
     xlab = x_column, ylab = y_column)
abline(a = intercept, b = slope, col = acol, lwd = 3)
Scatter plot with a linear regression line produced with base R
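As a side note, abline can also take a fitted model through its reg argument, in which case it reads the intercept and slope itself. The sketch below assumes a simple regression with one predictor and refits the model with an explicit formula so that the block runs on its own.

```r
data("iris")
fit <- lm(Petal.Length ~ Sepal.Length, data = iris)

plot(iris$Sepal.Length, iris$Petal.Length, pch = 18,
     xlab = "Sepal.Length", ylab = "Petal.Length")

# abline() extracts the two coefficients from the fitted model
abline(reg = fit, col = palette()[4], lwd = 3)
```

Extracting the coefficients by hand, as done above, remains useful when the same values are needed elsewhere, for example in geom_abline.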

Add a linear regression line in ggplot2

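# aes_string() is deprecated since ggplot2 3.0.0; the .data pronoun looks up
# columns by name instead
ggplot(dataset, aes(x = .data[[x_column]], y = .data[[y_column]])) +
  geom_point(shape = 18) +
  geom_abline(intercept = intercept, slope = slope, col = acol, lwd = 1) +
  labs(x = x_column, y = y_column)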
Scatter plot with a linear regression line produced with ggplot2

Perform the linear regression in ggplot2

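It is also possible to perform the linear regression directly in ggplot2 with the geom_smooth function. With method = "lm", geom_smooth fits the same linear model, so it gives the same intercept and slope. Unlike geom_abline, which draws the line across the whole panel, geom_smooth draws it only over the range of the x values in the data.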

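# aes_string() is deprecated since ggplot2 3.0.0; the .data pronoun looks up
# columns by name instead
ggplot(dataset, aes(x = .data[[x_column]], y = .data[[y_column]])) +
  geom_point(shape = 18) +
  geom_smooth(method = "lm", se = FALSE, col = acol, lwd = 1) +
  labs(x = x_column, y = y_column)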
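One way to check that geom_smooth and lm agree is to pull the computed line out of the plot with layer_data from ggplot2 and compare its slope with the lm coefficient. This is only a sketch: the names smooth_line, slope_smooth and slope_lm are introduced here for illustration, and the formula is passed explicitly to avoid the message that geom_smooth prints otherwise.

```r
library(ggplot2)
data("iris")

p <- ggplot(iris, aes(x = Sepal.Length, y = Petal.Length)) +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE)

# layer_data() returns the points that geom_smooth() computed for its line
smooth_line <- layer_data(p, 1)

# Slope recovered from the first and last point of the drawn segment
n <- nrow(smooth_line)
slope_smooth <- (smooth_line$y[n] - smooth_line$y[1]) /
  (smooth_line$x[n] - smooth_line$x[1])

# Slope from lm() for comparison
slope_lm <- coef(lm(Petal.Length ~ Sepal.Length, data = iris))[[2]]
print(all.equal(slope_smooth, slope_lm))

# The segment spans only the observed sepal lengths
print(range(smooth_line$x))
print(range(iris$Sepal.Length))
```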

Conclusion

To summarize, R is widely used for statistical analysis, and a linear regression is easy to compute with the lm function and to plot with either base R graphics or the ggplot2 package.
