Lecture 1
May 15, 2025
If you have not yet finished the Getting to Know You
survey, please do so ASAP!
Make your appointments in the Testing Center now!
Any questions about the syllabus?
Course operation
Doing data science
By the end of the course, you will be able to…
How do we make sure a data analysis is “reproducible”?
Short-term goals:
Long-term goals:
- Scriptability \(\rightarrow\) R
- Literate programming (code, narrative, output in one place) \(\rightarrow\) Quarto
- Version control \(\rightarrow\) Git / GitHub
Option 1:
Sit back and enjoy the show!
Option 2:
Go to your container and launch RStudio.
Packages are installed with the install.packages() function and loaded with the library() function, once per session.
Data frames: like the spreadsheets of R
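A minimal sketch of both ideas: installing and loading a package, then looking at a data frame. It uses the datasets package and its mtcars data frame, which ship with R, as stand-ins. Both are assumptions for illustration, not necessarily the package or data used in class.

```r
# Install once per machine. Commented out here because the package
# name is hypothetical and datasets already ships with R.
# install.packages("some_package")

# Load once per session
library(datasets)

# mtcars is a built-in data frame: rows are observations, columns are variables
head(mtcars)    # first six rows
dim(mtcars)     # number of rows, then number of columns
names(mtcars)   # column names
```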
Use ? to get help with objects (like data frames and functions).
Use $ to access columns.
Note: Generally, you need to use the $ to tell R where to find that column.
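As a rough illustration of ? and $, again assuming the built-in mtcars data frame as a stand-in for the class data:

```r
# Help pages for a data frame and for a function
?mtcars
?mean

# Access the hp column of mtcars with $
mtcars$hp
mean(mtcars$hp)

# Without the data frame and $, R does not know where to look.
# In a fresh session there is no object called hp, so this is an error;
# tryCatch() captures the error message instead of stopping the script.
result <- tryCatch(hp, error = function(e) conditionMessage(e))
result
```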
Use <- or the equals sign = to save objects.
Note: Check your environment pane for the saved object!
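A small sketch of saving objects, assuming the built-in mtcars data frame; the object names avg_mpg and n_cars are made up for this example.

```r
# Save a result with the assignment arrow
avg_mpg <- mean(mtcars$mpg)

# The equals sign also works for assignment
n_cars = nrow(mtcars)

# Typing an object's name prints its value
avg_mpg
n_cars
```

After running these lines in RStudio, both avg_mpg and n_cars should appear in the environment pane.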
Note: If you have trouble understanding what a message is saying, there is a high chance someone has explained the message online.
Packages: Fundamental units of reproducible R code, including reusable R functions, the documentation that describes how to use them, and sample data [1]
As of 27 August 2024, there are 21,168 R packages available on CRAN (the Comprehensive R Archive Network) [2]
We’re going to work with a small (but important) subset of these!
Option 1:
Sit back and enjoy the show!
Option 2:
Go to RStudio and open the document ae-01-meet-the-penguins.qmd.
GitHub is the home for your Git-based projects on the internet – like Dropbox but much, much better
We will use GitHub as a platform for web hosting and collaboration (and as our course management system!)
with human readable messages
Option 1:
Sit back and enjoy the show!
Option 2:
Go to the course GitHub organization and clone the ae-your_github_name repo to your container.
Find your application exercise repo, which will always be named using the convention assignment_title-your_github_name.
Click on the green “Code” button, make sure SSH is selected, and copy the repo URL.
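To check that the URL you copied is the SSH one, it can help to know its general shape. The sketch below builds an example from the naming convention; the organization name course-org is hypothetical, and your_github_name is a placeholder for your own username.

```r
# Hypothetical values: replace with the real organization and your username
org             <- "course-org"
assignment      <- "ae"
github_username <- "your_github_name"

# Naming convention: assignment_title-your_github_name
repo_name <- paste(assignment, github_username, sep = "-")

# GitHub SSH URLs have the form git@github.com:OWNER/REPO.git
ssh_url <- sprintf("git@github.com:%s/%s.git", org, repo_name)

repo_name
ssh_url

# TRUE for an SSH URL; an HTTPS URL would start with "https://" instead
startsWith(ssh_url, "git@github.com:")
```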
Once we made changes to our Quarto document, we:
- went to the Git pane in RStudio
- staged our changes by clicking the checkboxes next to the relevant files
- committed our changes with an informative commit message
- pushed our changes to our application exercise repos
- confirmed on GitHub that we could see our changes pushed from RStudio