We will see how to perform a linear regression using the lm function and how to plot the resulting straight regression line in base R and ggplot2, using the abline and geom_abline functions, respectively. Please read my article about adding straight lines in base R and ggplot2 if you are interested in the different plotting options.
Load the ggplot2 package
library(ggplot2)
Load the dataset
data("iris")
Variables assignment
dataset = iris
x_column = "Sepal.Length"
y_column = "Petal.Length"
Linear regression
First, we can fit a linear model in R with the lm function:
lm_results = lm(dataset[, y_column] ~ dataset[, x_column])
Here is the result of the linear regression:
Call:
lm(formula = dataset[, y_column] ~ dataset[, x_column])
Coefficients:
(Intercept) dataset[, x_column]
-7.101 1.858
Therefore, we can describe the regression line with an intercept of -7.101 and a slope of 1.858.
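In other words, the fitted line is Petal.Length = intercept + slope * Sepal.Length. As a minimal sketch of what that means in practice, the block below predicts the petal length for a sepal length of 6 cm, once by hand and once with predict(). Two things here are assumptions for illustration and not part of the original code: the model is refitted with named columns and a data argument (predict() needs this to match new data to the predictor), and the new_flower data frame is a made-up observation.

```r
# Refit with named columns so that predict() can match new data to the
# predictor (an assumption: the article's call indexes the columns directly).
fit <- lm(Petal.Length ~ Sepal.Length, data = iris)

# Hypothetical new observation, chosen for illustration only
new_flower <- data.frame(Sepal.Length = 6)

# Prediction by hand: intercept + slope * x
by_hand <- coef(fit)[["(Intercept)"]] +
  coef(fit)[["Sepal.Length"]] * new_flower$Sepal.Length

# Same prediction through predict()
by_predict <- unname(predict(fit, newdata = new_flower))

print(c(by_hand = by_hand, by_predict = by_predict))
```

Both values should agree at roughly 4.05, which matches -7.101 + 1.858 * 6 up to rounding of the printed coefficients.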
Extract the linear regression coefficients
intercept = coef(lm_results)[[1]]
slope = coef(lm_results)[[2]]
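For a simple regression with one predictor, these two coefficients can also be computed directly: the slope is cov(x, y) / var(x) and the intercept is mean(y) - slope * mean(x). The following sketch checks this against lm on the same iris columns. The names slope_manual and intercept_manual are introduced here for illustration only.

```r
x <- iris$Sepal.Length
y <- iris$Petal.Length

# Closed-form least-squares estimates for a single predictor
slope_manual <- cov(x, y) / var(x)
intercept_manual <- mean(y) - slope_manual * mean(x)

# Coefficients estimated by lm, in the order (intercept, slope)
fit <- lm(y ~ x)
lm_coefs <- unname(coef(fit))

# Compare both approaches up to numerical tolerance
print(all.equal(lm_coefs, c(intercept_manual, slope_manual)))
```

The comparison should report that both approaches agree, since lm solves the same least-squares problem.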
Line color
Now, I will use the following color for the regression line:
acol = palette()[4]
If you are interested in plot colors, I have written a complete post about color palettes in R.
Add a linear regression line in base R
plot(dataset[, x_column], dataset[, y_column], pch = 18,
xlab = x_column, ylab = y_column)
abline(a = intercept, b = slope, col = acol, lwd = 3)
Add a linear regression line in ggplot2
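# aes_string() is deprecated in recent ggplot2 versions; the .data pronoun
# looks up the columns by name instead
ggplot(dataset, aes(x = .data[[x_column]], y = .data[[y_column]])) +
geom_point(shape = 18) +
geom_abline(intercept = intercept, slope = slope, col = acol, lwd = 1) +
labs(x = x_column, y = y_column)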
Perform the linear regression in ggplot2
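Moreover, it is also possible to perform the linear regression directly in ggplot2 using the geom_smooth function. With method = "lm", geom_smooth fits the same linear model, so it gives the same intercept and slope. However, the line is drawn only over the range of the observed x values, whereas geom_abline extends across the whole plot panel.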
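# aes_string() is deprecated in recent ggplot2 versions; the .data pronoun
# looks up the columns by name instead
ggplot(dataset, aes(x = .data[[x_column]], y = .data[[y_column]])) +
geom_point(shape = 18) +
geom_smooth(method = "lm", se = FALSE, col = acol, lwd = 1) +
labs(x = x_column, y = y_column)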
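A similar finite line can be drawn in base R. This is only a sketch of one possible approach, not something geom_smooth does internally: it evaluates the fitted line at the smallest and largest observed x values and connects the two points with lines() instead of abline(). The names x_range and y_range are introduced here for illustration.

```r
x <- iris$Sepal.Length
y <- iris$Petal.Length
fit <- lm(y ~ x)

# End points of the segment: smallest and largest observed x values
x_range <- range(x)
y_range <- coef(fit)[[1]] + coef(fit)[[2]] * x_range

plot(x, y, pch = 18, xlab = "Sepal.Length", ylab = "Petal.Length")
# lines() connects the two end points, so the line stops at the data range
lines(x_range, y_range, col = palette()[4], lwd = 3)
```

Because the segment is built from only two points, it stops at the edges of the data, whereas abline() would extend it to the plot borders.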
Conclusion
To summarize, R is a widely used software environment for statistical analysis. A linear regression is easy to compute with the lm function and to plot using either base R graphics or the ggplot2 package.