library(tidyverse)
library(palmerpenguins)
library(lubridate)
library(kableExtra)
library(janitor)

Part 0: Setup

  • Create a new version-controlled R Project called eds221-m2021-day9-interactive from Terminal / git bash
  • Install the kableExtra package
  • Create a new R Markdown document in the project
  • Attach the following packages in the setup chunk:
    • tidyverse
    • kableExtra

1. Making a reproducible example ({reprex})

Making a minimum viable example is often the best way to troubleshoot problematic code when you can’t figure out a solution quickly – and is definitely the best way to share an example of something you’re struggling with so you’re most likely to get help. If people can’t run or play with your code, it’s much less likely they’ll be able to offer a solution.

You probably already have {reprex} (part of the tidyverse). Copy code to clipboard and run reprex() to make one!

Some guidelines:

  • Ruthlessly simplify
  • Consider using a subset of data (possibly w/ datapasta)
  • Include library calls (e.g. library(janitor) in the reprex)
  • Make the minimum viable example of the thing that’s not working

See more:

2. A few new wrangling tools (dplyr::across(), janitor::get_dupes())

janitor::get_dupes() to check for duplicates

dupes <- get_dupes(starwars) # Across all variables (exact match across all columns?)
## No variable names specified - using all columns.
## No duplicate combinations found of: name, height, mass, hair_color, skin_color, eye_color, birth_year, sex, gender, ... and 5 other variables
# Check for duplicate values in the `2000` column
dupes_2 <- starwars %>% 
  get_dupes(homeworld)

# Check for duplicates in the homeworld and species column
dupes_3 <- starwars %>% 
  get_dupes(homeworld, species)

dplyr::across() - operations across columns

Mutate across multiple columns:

starwars %>% 
  mutate(across(where(is.character), tolower))

You can use it within group_by() + summarize():

starwars %>% 
  group_by(homeworld) %>% 
  summarise(across(where(is.numeric), mean, na.rm = TRUE), count = n())

Another example:

mtcars %>% 
  group_by(cyl) %>% 
  summarize(across(drat:qsec, mean))

3. Tables with {kable} and {kableExtra}

We can produce finalized tables in R Markdown in a number of ways - see a bunch of them in David Keyes’ post How to make beautiful tables in R.

We’ll just use one tool: kable + kableExtra to make nice html tables.

Try it out of the box (knit to see the table):

penguins %>% 
  group_by(species, island) %>% 
  summarize(mean_mass = mean(body_mass_g, na.rm = TRUE)) %>% 
  kable(col.names = c("Species", "Island", "Body mass (g)")) %>% 
  kable_styling(bootstrap_options = "striped", full_width = FALSE)
## `summarise()` has grouped output by 'species'. You can override using the
## `.groups` argument.
Species Island Body mass (g)
Adelie Biscoe 3709.659
Adelie Dream 3688.393
Adelie Torgersen 3706.373
Chinstrap Dream 3733.088
Gentoo Biscoe 5076.016

4. A package for your ggplot theme

Creating a package

  1. In RStudio, create a new R Project that is an R Package (New Project > New directory > R Package). Make your package name something specific to you, e.g. themes_yourname (like themes_teddy). Make it a version controlled repo by running usethis::use_git() and usethis::use_github() (or w/ CLI).

  2. Check out the existing infrastructure for your R package, which currently contains a single function hello(), which prints “Hello, world!” Check out the R/hello.R file to see where that function is created.

  3. In the Build tab, click Install and Restart. You should see that the package is automatically attached in the Console.

  4. Run the hello() function in the Console, to see that it works.

  5. Create a new R script (.R). Copy and paste the following placeholder code into your R script. Then update the colors (in quotations) to colors that R will recognize (or hexidecimals) and change the function name to theme_yourname (e.g. theme_allison). You can also add other options within theme() to further customize your theme.

my_theme <- function() {
  theme(
    panel.background = element_rect(fill = "yellow"),
    panel.grid.major.x = element_line(colour = "purple", linetype = 3, size = 0.5),
    panel.grid.minor.x = element_blank(),
    panel.grid.major.y =  element_line(colour = "cyan4", linetype = 3, size = 0.5),
    axis.text = element_text(colour = "red"),
    axis.title = element_text(colour = "orange")
  )
}

Save the R script in to the R/ folder (with the file name matching the function name).

  1. Put your cursor anywhere in the function code within your R script. In the top menu of RStudio, select Code > Insert Roxygen skeleton. The information added is important - it specifies the params (arguments of the function) and more that will appear in the documentation, which we’ll create next. Save the .R file, which now contains your function and the Roxygen information.

  2. Document the function by running devtools::document() in the Console. This will create a new .Rd file in the man/ folder, containing important documentation information about your function.

  3. Press Build > Install and Restart. In the Console, run ?function_name, replacing function_name there instead. It will bring up the documentation, and let you know your function exists! Now go ahead and try to use your function by making a graph in the Console.

library(tidyverse)

ggplot(data = msleep, aes(x = sleep_total, y = sleep_rem)) +
  geom_point() 
  + THEME_NAME()
  1. Once you confirm it’s working, push your changes back to your repo on GitHub. Share your repo “username/reponame” with your neighbor so they can install your package from GitHub (recall: using remotes::install_github("username/reponame")). Install your neighbor’s theming package, check which functions exist by running help(package = "packagename") in the Console, then make a ggplot graph that uses your neighbor’s ggplot theme. Done!