Lecture 2
May 16, 2025
I have office hours today! 1:00-3:00 PM in Old Chemistry 203/203B.
We will start grading your ae repositories next week - make sure you have them ready to go.
First ‘real’ lab is on Monday; the topic will be data visualization (what we are starting today).
Last time:
We introduced you to the course toolkit.
You cloned your ae repositories and started making some updates in your Quarto documents.
You committed and pushed your changes back.
Today:
We will introduce data visualization.
You will pull to get today’s application exercise file.
You will work on the new application exercise on data visualization, commit your changes, and push them.
ae-01-meet-the-penguins
Go to RStudio, confirm that you’re in the ae project, and open the document ae-01-meet-the-penguins.qmd.
The environment used by Quarto when rendering starts EMPTY - it does not see what you see in your environment.
Functions that open a popup (like View()) will not work when you render a document. Either comment them out (with #) or delete them before rendering!
Make sure you commit and then PUSH! Just committing is not enough!
How can you create something like this???
The ggplot2 package has the plotting functions you need!
ggplot2 is a part of the tidyverse package - when you load tidyverse, you also load ggplot2
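A quick way to convince yourself of this is to load tidyverse and check which packages are attached. This is only a sketch; the variable name `ggplot2_attached` is made up for illustration.

```r
# Loading tidyverse attaches its core packages, ggplot2 among them.
library(tidyverse)

# search() lists everything currently attached to the session.
ggplot2_attached <- "package:ggplot2" %in% search()
ggplot2_attached
```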
What are some steps you can take to visualize a data set?
What do you want on the x-axis?
What do you want on the y-axis?
Map bill_length_mm to the x aesthetic
Map body_mass_g to the y aesthetic
It’s common practice in R to omit the names of the first two arguments of a function.
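As a sketch of what omitting argument names looks like: the calls below should all describe the same plot. It assumes the `penguins` data frame comes from the palmerpenguins package (the slides do not say where it is loaded from), and the object names `p_named`, `p_short`, and `p_shorter` are made up for illustration.

```r
library(ggplot2)

# Assumption: penguins is the data frame from the palmerpenguins package.
penguins <- palmerpenguins::penguins

# Every argument named explicitly
p_named <- ggplot(data = penguins, mapping = aes(x = bill_length_mm, y = body_mass_g))

# Names of the first two arguments of ggplot() omitted
p_short <- ggplot(penguins, aes(x = bill_length_mm, y = body_mass_g))

# Names of the first two arguments of aes() omitted as well
p_shorter <- ggplot(penguins, aes(bill_length_mm, body_mass_g))
```

Arguments are matched by position when names are left out, so the order matters: data first, then the mapping.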
with a geom
geom_point() resulted in the following warning:
Warning: Removed 2 rows containing missing values or values outside the scale range (`geom_point()`).
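The warning is about rows that cannot be drawn because a plotted variable is missing. A minimal sketch for checking this yourself, assuming `penguins` comes from the palmerpenguins package; `n_missing` and `p_quiet` are names made up for this example.

```r
library(ggplot2)

# Assumption: penguins is the data frame from the palmerpenguins package.
penguins <- palmerpenguins::penguins

# Count rows where either plotted variable is NA
n_missing <- sum(is.na(penguins$bill_length_mm) | is.na(penguins$body_mass_g))
n_missing

# na.rm = TRUE drops those rows without the warning
p_quiet <- ggplot(penguins, aes(x = bill_length_mm, y = body_mass_g)) +
  geom_point(na.rm = TRUE)
```

The warning is harmless here, but it is worth reading: it tells you how many observations are not shown.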
Map species to the color aesthetic
ggplot(penguins, mapping = aes(x = bill_length_mm, y = body_mass_g, color = species)) +
  geom_point()
What exactly are aesthetics? They map from a variable to a plot feature.
x and y axes
color, shape, size of points
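To illustrate, here is a sketch that maps several variables to several plot features at once. It assumes `penguins` comes from the palmerpenguins package; the choice of `island` for shape and `flipper_length_mm` for size is only for illustration and is not from the lecture.

```r
library(ggplot2)

# Assumption: penguins is the data frame from the palmerpenguins package.
penguins <- palmerpenguins::penguins

# Each aesthetic ties one variable to one visual feature of the points
p_aes <- ggplot(
  penguins,
  aes(
    x = bill_length_mm,        # position on the x axis
    y = body_mass_g,           # position on the y axis
    color = species,           # point color
    shape = island,            # point shape
    size = flipper_length_mm   # point size
  )
) +
  geom_point()
```

Mapping this many variables at once usually makes a plot harder to read; the point is only to show that each aesthetic takes a variable.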
with another geom
ggplot(penguins, mapping = aes(x = bill_length_mm, y = body_mass_g, color = species)) +
  geom_point() +
  geom_smooth()
`geom_smooth()` using method = 'loess' and formula = 'y ~ x'
Warning: Removed 2 rows containing non-finite outside the scale range
(`stat_smooth()`).
Warning: Removed 2 rows containing missing values or values outside the scale
range (`geom_point()`).
geom_smooth() resulted in the following message:
`geom_smooth()` using method = 'loess' and formula = 'y ~ x'
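The message only reports the defaults that geom_smooth() picked. A sketch of stating those choices explicitly, which should keep the message from appearing; it assumes `penguins` comes from the palmerpenguins package, and `p_smooth` is a made-up name.

```r
library(ggplot2)

# Assumption: penguins is the data frame from the palmerpenguins package.
penguins <- palmerpenguins::penguins

# Spell out the smoothing method and formula instead of relying on defaults
p_smooth <- ggplot(penguins, aes(x = bill_length_mm, y = body_mass_g, color = species)) +
  geom_point() +
  geom_smooth(method = "loess", formula = y ~ x)
```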
Use facet_wrap to make sub-plots
We can facet by other variables!
Which plot do you think made it easier to compare between penguin species?
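For reference, a sketch of both faceting ideas: one sub-plot per species, then one per island. It assumes `penguins` comes from the palmerpenguins package; `island` is an assumed example of "other variables", and the object names are made up.

```r
library(ggplot2)

# Assumption: penguins is the data frame from the palmerpenguins package.
penguins <- palmerpenguins::penguins

# One panel per species
p_by_species <- ggplot(penguins, aes(x = bill_length_mm, y = body_mass_g)) +
  geom_point() +
  facet_wrap(~ species)

# Facet by a different variable, keeping species as color
p_by_island <- ggplot(penguins, aes(x = bill_length_mm, y = body_mass_g, color = species)) +
  geom_point() +
  facet_wrap(~ island)
```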
With a scale_color_ function
With another scale_color_ function
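The slides do not record which scale functions were shown, so the two below are only examples of the scale_color_ family in ggplot2. The sketch assumes `penguins` comes from the palmerpenguins package.

```r
library(ggplot2)

# Assumption: penguins is the data frame from the palmerpenguins package.
penguins <- palmerpenguins::penguins

base_plot <- ggplot(penguins, aes(x = bill_length_mm, y = body_mass_g, color = species)) +
  geom_point()

# A discrete viridis palette
p_viridis <- base_plot + scale_color_viridis_d()

# A ColorBrewer palette; "Dark2" is one of several qualitative palettes
p_brewer <- base_plot + scale_color_brewer(palette = "Dark2")
```

Saving the shared part as `base_plot` makes it easy to try several scales on the same plot.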
With a theme_ function
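Which themes were shown on the slides is not recorded here, so theme_minimal() and theme_bw() below are just two of the complete themes that ship with ggplot2. The sketch assumes `penguins` comes from the palmerpenguins package.

```r
library(ggplot2)

# Assumption: penguins is the data frame from the palmerpenguins package.
penguins <- palmerpenguins::penguins

base_plot <- ggplot(penguins, aes(x = bill_length_mm, y = body_mass_g, color = species)) +
  geom_point()

# A theme changes the non-data look of the plot (background, grid lines, fonts)
p_minimal <- base_plot + theme_minimal()
p_bw <- base_plot + theme_bw()
```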
With the labs() function
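A sketch of labs() in use. The label text is made up for illustration, and `penguins` is assumed to come from the palmerpenguins package.

```r
library(ggplot2)

# Assumption: penguins is the data frame from the palmerpenguins package.
penguins <- palmerpenguins::penguins

# labs() sets the title, axis labels, and legend titles
p_labeled <- ggplot(penguins, aes(x = bill_length_mm, y = body_mass_g, color = species)) +
  geom_point() +
  labs(
    title = "Body mass and bill length of penguins",
    x = "Bill length (mm)",
    y = "Body mass (g)",
    color = "Species"
  )
```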
with alpha
with se = FALSE
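Both options can be sketched in one plot. The alpha value of 0.5 is an arbitrary choice for illustration, and `penguins` is assumed to come from the palmerpenguins package.

```r
library(ggplot2)

# Assumption: penguins is the data frame from the palmerpenguins package.
penguins <- palmerpenguins::penguins

p_tuned <- ggplot(penguins, aes(x = bill_length_mm, y = body_mass_g, color = species)) +
  geom_point(alpha = 0.5) +   # semi-transparent points, so overlaps stay visible
  geom_smooth(se = FALSE)     # smooth curve without the uncertainty band
```

Here alpha is set outside aes(), so it applies the same transparency to every point instead of mapping a variable.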
You aren’t!!!
We built a plot layer-by-layer
What if we want to use our own data?
read_csv("data_file.csv") (assuming the data is in CSV format)
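A self-contained sketch of reading a CSV file. Since no real data file is available here, it first writes a tiny made-up file to a temporary location; the column names and values are hypothetical.

```r
library(readr)

# Write a small, made-up CSV file so the example runs on its own
csv_path <- tempfile(fileext = ".csv")
writeLines(c("title,year", "Movie A,1999", "Movie B,2010"), csv_path)

# read_csv() returns a tibble; show_col_types = FALSE hides the column summary
movies <- read_csv(csv_path, show_col_types = FALSE)
movies
```

With your own data, replace `csv_path` with the path to your file, given relative to your project folder.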
ae-02-bechdel-dataviz
We will be looking at data on movies and the Bechdel test.
Go to the ae project in RStudio.
Plots start with ggplot(), and layers are added with +s.
What are some other types of plots you can make?
How can you talk about the information conveyed by plots?