To use the ggplot2 graphics system, you need a data frame in long format. How to obtain such a data frame is discussed in the Data Wrangling module. Here we show the syntax for constructing a ggplot2 graphic once you have an appropriate data frame.
library("tidyverse")
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr 1.1.4.9000 ✔ readr 2.1.5
✔ forcats 1.0.0 ✔ stringr 1.5.1
✔ ggplot2 3.5.1 ✔ tibble 3.2.1
✔ lubridate 1.9.3 ✔ tidyr 1.3.1
✔ purrr 1.0.2
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
Suppose we want to make a plot of the data in the warpbreaks data set. You can find out what information is in the warpbreaks data set by looking at its help file with ?warpbreaks. For now, we need to know the names of the variables in the data set. It will also be helpful to know what type each variable is. We find this information with the following commands.
# Investigate warpbreaks data set
names(warpbreaks)
[1] "breaks" "wool" "tension"
head(warpbreaks)
breaks wool tension
1 26 A L
2 30 A L
3 54 A L
4 25 A L
5 70 A L
6 52 A L
summary(warpbreaks)
breaks wool tension
Min. :10.00 A:27 L:18
1st Qu.:18.25 B:27 M:18
Median :26.00 H:18
Mean :28.15
3rd Qu.:34.00
Max. :70.00
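The summary above hints at the variable types: quartiles are reported for breaks, while wool and tension are reported as counts per level. As a small sketch using base R only, the class of each column can also be checked directly. The object name var_types is introduced here for illustration.

```r
# Report the class of each column in the built-in warpbreaks data set
var_types <- sapply(warpbreaks, class)
print(var_types)
```

A numeric column such as breaks can be placed on a continuous axis, while factor columns such as wool and tension define groups.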
Suppose we would like to make a plot of the number of breaks (y-axis) vs the tension. To do so, we can use the code below.
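The code for this first plot does not appear in the text here. A minimal sketch, assuming it follows the same pattern as the later examples in this section (a ggplot() call plus geom_point() with breaks mapped to y and tension mapped to x), might look like this. The object name p_basic is introduced only for illustration.

```r
library(ggplot2)

# Basic scatterplot: breaks on the y-axis, tension on the x-axis
p_basic <- ggplot(data = warpbreaks) +
  geom_point(mapping = aes(y = breaks, x = tension))

# Printing the object draws the plot
p_basic
```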
The points are aligned in vertical lines and may therefore be covering each other up. We will use jittering to add a little randomness to the position of the points so that fewer of them overlap.
# Warpbreaks scatterplot with jitter
ggplot(data = warpbreaks) +
  geom_point(mapping = aes(y = breaks, x = tension),
             position = position_jitter()) # <POSITION>
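By default, position_jitter() moves points both horizontally and vertically, which slightly changes the displayed number of breaks. As a sketch of one possible refinement, the width, height, and seed arguments of position_jitter() can be set so that points move only sideways and the plot is reproducible. The specific values 0.2 and 1 are arbitrary choices made for this illustration, as is the object name p_jitter.

```r
library(ggplot2)

# Jitter only horizontally (height = 0) so the y values stay exact;
# seed makes the random offsets reproducible
p_jitter <- ggplot(data = warpbreaks) +
  geom_point(
    mapping  = aes(y = breaks, x = tension),
    position = position_jitter(width = 0.2, height = 0, seed = 1)
  )

p_jitter
```

Because tension is categorical, moving points sideways within a category does not change how the plot is read, whereas vertical jitter would misreport the counts.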
13.4 Stat
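The stat argument sets the statistical transformation applied to the data before it is drawn. Some types of plots depend on it, but in a scatterplot it is rarely changed from the default. The example below uses stat = "sum", which counts the observations at each location and maps that count to point size.
# Warpbreaks scatterplot with point size as the sum
ggplot(data = warpbreaks) +
  geom_point(mapping = aes(y = breaks, x = tension),
             stat = "sum") # <STAT>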
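ggplot2 also provides geom_count(), which uses the "sum" stat by default. The sketch below builds such a plot and then uses layer_data() to look at the counts the stat computed. The object names p_count and counts are introduced only for illustration.

```r
library(ggplot2)

# geom_count() draws points with the "sum" stat as its default
p_count <- ggplot(data = warpbreaks) +
  geom_count(mapping = aes(y = breaks, x = tension))

# layer_data() returns the data after the stat has been applied;
# the column n holds the number of observations at each location
counts <- layer_data(p_count)
head(counts[, c("x", "y", "n")])
```

Inspecting the computed data this way can help confirm what a stat is doing before relying on the picture.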
13.5 Geom
Let’s switch the type of plot to a boxplot.
# Example boxplot
ggplot(data = warpbreaks) +
  geom_boxplot( # <GEOM_FUNCTION>
    mapping = aes(y = breaks, x = tension)
  )
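A boxplot replaces the individual points with a summary of each group. As a rough check of what is being drawn, the sketch below compares the middle line of each box with the group medians computed in base R. The object names p_box and medians are introduced only for illustration.

```r
library(ggplot2)

p_box <- ggplot(data = warpbreaks) +
  geom_boxplot(mapping = aes(y = breaks, x = tension))

# Median number of breaks for each tension level, computed directly
medians <- tapply(warpbreaks$breaks, warpbreaks$tension, median)
print(medians)

# The column middle in the layer data is the line drawn inside each box
print(layer_data(p_box)$middle)
```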
Here is another way to specify a scatterplot with jitter.
# Example jitter <GEOM>
ggplot(data = warpbreaks) +
  geom_jitter( # <GEOM_FUNCTION>
    mapping = aes(y = breaks, x = tension)
  )
13.5.1 Smoothers
Recall the original plot:
# One layer
ggplot(data = mtcars) +
  geom_point(mapping = aes(x = disp, y = mpg))
We can add layers to this plot by adding more geom function calls with +.
# Two layers
ggplot(data = mtcars) +
  geom_point( # <GEOM_FUNCTION>
    mapping = aes(x = disp, y = mpg)) +
  geom_smooth( # <GEOM_FUNCTION>
    mapping = aes(x = disp, y = mpg))
`geom_smooth()` using method = 'loess' and formula = 'y ~ x'
Since we are using the same mappings in both calls, we can include them in the original ggplot call instead.
# Two layers
ggplot(data = mtcars,
       mapping = aes(x = disp, y = mpg)) +
  geom_point() +  # <GEOM_FUNCTION>
  geom_smooth()   # <GEOM_FUNCTION>
`geom_smooth()` using method = 'loess' and formula = 'y ~ x'
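The same idea applies to the warpbreaks plots from earlier. As a sketch, the mapping can be given once in ggplot() and shared by a boxplot layer and a jittered point layer. The jitter width of 0.2 and the choice to hide the boxplot's own outlier points are assumptions made for this illustration, as is the object name p_layers.

```r
library(ggplot2)

# Mapping is specified once and inherited by both layers
p_layers <- ggplot(data = warpbreaks,
                   mapping = aes(x = tension, y = breaks)) +
  geom_boxplot(outlier.shape = NA) +      # hide outliers so they are not drawn twice
  geom_jitter(width = 0.2, height = 0)    # raw data, jittered sideways only

p_layers
```

Layers are drawn in the order they are added, so the points appear on top of the boxes.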
Let’s add color to represent another variable.
# Color
ggplot(data = mtcars,
       mapping = aes(x = disp, y = mpg, color = factor(vs))) + # <MAPPING>
  geom_point() +
  geom_smooth()
`geom_smooth()` using method = 'loess' and formula = 'y ~ x'
To draw linear regression lines instead of the default loess curves, set method = "lm" in geom_smooth().
# Regression layer
ggplot(data = mtcars,
       mapping = aes(x = disp, y = mpg, color = factor(vs))) +
  geom_point() +
  geom_smooth(method = "lm") # <GEOM_FUNCTION> option
`geom_smooth()` using formula = 'y ~ x'
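Because color is mapped to factor(vs), the smoother is fit separately within each group, using the formula 'y ~ x' reported in the message. A sketch of the corresponding fits in base R, assuming a simple lm() of mpg on disp within each level of vs, is given below. The object names fits and coefs are introduced only for illustration.

```r
# Fit mpg ~ disp separately for each level of vs,
# mirroring the grouping created by the color mapping
fits <- lapply(split(mtcars, mtcars$vs),
               function(d) lm(mpg ~ disp, data = d))

# One column of coefficients (intercept and slope) per group
coefs <- sapply(fits, coef)
print(coefs)
```

Comparing the slopes in coefs with the lines on the plot is one way to check that the plot shows what you expect.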
13.6 Mappings
13.6.1 Axes
# x and y axes
ggplot(data = mtcars) +
  geom_point(mapping = aes(x = disp, # <MAPPINGS>
                           y = mpg)
  )
# switched
ggplot(data = mtcars) +
  geom_point(mapping = aes(x = hp, # new
                           y = mpg))
Error in `geom_point()`:
! Problem while computing aesthetics.
ℹ Error occurred in the 1st layer.
Caused by error in `scale_f()`:
! A continuous variable cannot be mapped to the shape aesthetic.
ℹ Choose a different aesthetic or use `scale_shape_binned()`.
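The error above is what ggplot2 reports when a continuous variable is mapped to the shape aesthetic. The call that produced it is not shown here, so the sketch below is only an assumption about how such an error can arise: it maps wt, a continuous variable in mtcars, to shape. It then shows one possible workaround, binning wt into three groups with cut_number(). The object names p_bad, msg, and p_ok are introduced only for illustration.

```r
library(ggplot2)

# Assumption: a continuous variable (wt) is mapped to shape
p_bad <- ggplot(data = mtcars) +
  geom_point(mapping = aes(x = disp, y = mpg, shape = wt))

# The error is raised when the plot is built, so capture it with tryCatch()
msg <- tryCatch(
  {
    ggplot_build(p_bad)
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)

# One workaround: convert the continuous variable to a discrete one first
p_ok <- ggplot(data = mtcars) +
  geom_point(mapping = aes(x = disp, y = mpg, shape = cut_number(wt, 3)))

p_ok
```

Shapes have no natural ordering or continuum, which is why ggplot2 accepts only discrete (or binned) variables for this aesthetic.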
13.6.4 Colors and Shapes
# colors and shapes
ggplot(data = mtcars) +
  geom_point(mapping = aes(x = disp, y = mpg, color = wt, shape = factor(cyl)))