We will send you the file you'll use, stl_lead.csv (a comma-separated value file), in Slack.
Before you move on, read more about the data here.
Create a new version-controlled R project named stl-lead-yourinitials (for example, mine would be stl-lead-ah). Remember: there are multiple ways to set up a version-controlled project, either through RStudio or by starting with a new repo on GitHub and then cloning it.
Add the subfolders data, docs and figs.
Put stl_lead.csv in the data folder of your project.
Create stl_lead_inequity.qmd in the docs folder.

In your .qmd:
Attach the tidyverse and janitor packages in a new code chunk.
Read in the stl_lead.csv data as stl_lead, and use janitor::clean_names to convert all variable names to lower snake case.
Do some basic exploration of the dataset (e.g. using summary, data visualizations and summary statistics).
In a new code chunk, create a new data frame from stl_lead called stl_lead_prop that has one additional column, prop_white, containing the percent of each census tract identifying as white (variable white in the dataset divided by variable totalPop, times 100). Note that after janitor::clean_names, totalPop is named total_pop. You may need to do some Googling. Hint: dplyr::mutate(new_col = col_a / col_b) will create a new column new_col that contains the value of col_a / col_b.
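
The name cleaning and the mutate hint can be tried out on a small stand-in before touching the real file. The sketch below is only an illustration: the three rows are invented (they are not values from stl_lead.csv), stl_lead_raw is a hypothetical name, and the commented read_csv path is an assumption about where your working directory sits relative to the data folder.

```r
library(tidyverse)
library(janitor)

# With the real file you would read it in instead, for example
# (the path is an assumption and depends on your working directory):
# stl_lead <- read_csv("data/stl_lead.csv") %>% clean_names()

# Toy stand-in with invented values, using the raw column names from the text
stl_lead_raw <- tibble(
  totalPop = c(1000, 2500, 400),
  white = c(250, 2000, 100),
  pctElevated = c(12.5, 3.1, 20.0)
)

# clean_names() converts the column names to lower snake case
stl_lead <- stl_lead_raw %>% clean_names()
names(stl_lead)

# Quick look at the ranges of each variable
summary(stl_lead)

# Add prop_white: white residents as a percent of the tract's total population
stl_lead_prop <- stl_lead %>%
  mutate(prop_white = (white / total_pop) * 100)

stl_lead_prop
```

Keeping the result in a new object (stl_lead_prop) rather than overwriting stl_lead leaves the cleaned original available for further exploration.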
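Create a scatterplot of the elevated lead percentage for each census tract (pctElevated, which is pct_elevated after janitor::clean_names) versus the percent of each census tract identifying as white.
Customize the plot by updating several aesthetics (e.g. opacity (see alpha = ), color, etc.).
Store the scatterplot as stl_lead_plot.
Save the scatterplot to figs, with dimensions of (6" x 5") (width x height).

Create a histogram of only the pctElevated column in the data frame (remember, this will only take one variable; the frequency is calculated for you by geom_histogram).
Save the histogram to the figs folder.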
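
As a rough sketch of the two plotting tasks, the code below builds a scatterplot and a histogram with ggplot2 and saves each with ggsave. Several things here are assumptions rather than part of the assignment: the toy rows are invented, the output file names and the stl_lead_hist and fig_dir names are hypothetical, the .png format is just one option, and the relative "figs" path assumes the working directory is the project root (from inside docs you may need a different path).

```r
library(tidyverse)

# Toy stand-in with invented values; use your own stl_lead_prop instead
stl_lead_prop <- tibble(
  pct_elevated = c(12.5, 3.1, 20.0, 8.4, 15.2),
  prop_white = c(25, 80, 10, 55, 30)
)

# Scatterplot: percent elevated versus percent white, with a few aesthetics set
stl_lead_plot <- ggplot(stl_lead_prop, aes(x = prop_white, y = pct_elevated)) +
  geom_point(alpha = 0.5, size = 3, color = "purple") +
  labs(x = "Percent white", y = "Percent elevated")

# Histogram: only one variable is mapped; geom_histogram computes the counts
stl_lead_hist <- ggplot(stl_lead_prop, aes(x = pct_elevated)) +
  geom_histogram(bins = 10, fill = "orange", color = "black")

# Hypothetical output location; create it if it does not exist yet
fig_dir <- "figs"
dir.create(fig_dir, showWarnings = FALSE)

# width and height are in inches by default: 6 wide by 5 tall
ggsave(file.path(fig_dir, "stl_lead_plot.png"), plot = stl_lead_plot,
       width = 6, height = 5)
ggsave(file.path(fig_dir, "stl_lead_hist.png"), plot = stl_lead_hist,
       width = 6, height = 5)
```

Passing the stored plot object to ggsave explicitly avoids relying on whichever plot happened to be displayed last.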