Project
Introduction
TL;DR: Ask a question you’re curious about and answer it with a dataset of your choice. This is your project in a nutshell.
The project for this course will consist of an analysis of a dataset of your own choosing. The dataset may already exist, or you may collect your own data using a survey or by conducting an experiment. You can choose the data based on your team’s interests or based on work in other courses or research projects. The goal of this project is for you to demonstrate proficiency in the techniques we have covered in this course (and beyond, if you like) and apply them to a novel dataset in a meaningful way.
The goal is not to do an exhaustive data analysis, i.e., do not calculate every statistic and procedure you have learned for every variable. Rather, show me that you are proficient at asking meaningful questions and answering them with the results of data analysis, that you are proficient in using R, and that you are proficient at interpreting and presenting the results. Focus on methods that help you begin to answer your research questions. Also, critique your own methods and provide suggestions for improving your analysis. In that critique, discuss issues pertaining to the reliability and validity of your data and the appropriateness of your statistical analysis.
The project is very open-ended. You should create one or more compelling visualizations of your data in R. You may choose freely among the tools and packages we learned in the course, but you are required to stick to them. You do not need to visualize all of the data at once. A single high-quality visualization will receive a much higher grade than a large number of poor-quality visualizations. Also pay attention to your presentation. Neatness, coherence, and clarity will count. All analyses must be done in RStudio, using R, and all components of the project must be reproducible (with the exception of the slide deck).
You will work on the project with your lab teams.
The five milestones for the final project are:
- Milestone 1 - Working collaboratively
- Milestone 2 - Proposals, with three dataset ideas
- Milestone 3 - Improvement and progress
- Milestone 4 - Peer review, on another team’s project
- Milestone 5 - Presentation with slides and a reproducible project write-up of your analysis, with a draft along the way
You will not be submitting anything on Gradescope for the project. You will submit these deliverables on GitHub, and feedback will be provided as GitHub issues that you need to engage with and close. Together, the documents in your GitHub repo will make up a webpage for your project. To create the webpage, go to the Build tab in RStudio and click Render Website.
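If you prefer the console to the Build tab, the same render step can be sketched in R. This is a minimal sketch, not part of the official course instructions: it assumes the project is a Quarto website (with a `_quarto.yml` file in its root) and that the `quarto` R package and the Quarto command-line tool are installed. The helper name `render_project_site` is hypothetical.

```r
# Hypothetical helper: render the project website from the R console.
# Assumption: the project is a Quarto website, so its root contains _quarto.yml.
# Assumption: the quarto R package and the Quarto CLI are installed.
render_project_site <- function(project_dir = ".") {
  config <- file.path(project_dir, "_quarto.yml")

  # Fail early with a readable message if this is not the project root.
  if (!file.exists(config)) {
    stop("No _quarto.yml found in '", project_dir, "'. Is this the project root?")
  }

  # Render every document in the project, as the Build tab button would.
  quarto::quarto_render(project_dir)
}

# Usage, run from the project root:
# render_project_site()
```

The early check is there so that running the helper from the wrong working directory produces a clear error instead of rendering nothing.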
Milestone 1 - Working collaboratively
For the first milestone of your project, you’ll practice a collaborative Git workflow with your team members. Each team member who takes part in the collaborative working activity will get 5 points towards their project.
Milestone 2 - Proposal
There are two main purposes of the project proposal:
- To help you think about the project early, so you can get a head start on finding data, reading relevant literature, thinking about the questions you wish to answer, etc.
- To ensure that the data you wish to analyze, methods you plan to use, and the scope of your analysis are feasible and will allow you to be successful for this project.
Milestone 3 - Improvement and progress
We want to see that you have made concrete progress on the proposal selected in the previous milestone.
Milestone 4 - Peer review
Critically reviewing others’ work is a crucial part of the scientific process, and STA 199 is no exception. You will be assigned two teams to review. This feedback is intended to help you create a high-quality final project, as well as give you experience reading and constructively critiquing the work of others.
Milestone 5 - Write-up and presentation
This is the final project deadline: all goals and deliverables should be completed.
Other
More information on grading, teamwork, and more coming soon!