  • Projects
  • Quarto customization
  • Creating vectors and sequences
  • Our first ggplot2 graph
  • Functions continued

1. Projects

…one small step for a programmer, one giant leap for reproducibility

  • In RStudio, Session > New Session (make this a frequent habit)
  • File > New Quarto project
  • For now, New directory (but version control coming up soon…)
  • Give your project a name
  • Pick where to put it (this’ll create a folder on your computer)
  • Create project

Discussion - what does this do? Where does it live on your computer? What does it contain?

  • Make a new Quarto doc (File > New File > New Quarto document) in your project, then follow along (adding notes in markdown cells) with the rest of the session

2. Exponents and logs in R

  • log() = natural log
  • log10() = log base 10
  • exp() = natural exponential

Let’s try some!
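A few calls to try in the console. The input values here are arbitrary illustrations rather than values from the session:

```r
# Natural log and natural exponential undo each other
log(exp(1))        # 1
exp(log(5))        # 5 (up to floating-point rounding)

# Log base 10
log10(1000)        # 3

# log() also accepts an optional base argument
log(8, base = 2)   # 3
```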

3. Making sequences in R

Sometimes we’ll want to create sequences of values that we can plug into a function to see how an output value changes over a range of inputs.

We can make a sequence of values, stored as a vector in R, using the seq() function. The general structure looks like this:

seq(from = start_value, to = end_value, by = increment)

For example, to create a sequence from 2 to 18 by increments of 0.3, I would use:

seq(from = 2, to = 18, by = 0.3)
##  [1]  2.0  2.3  2.6  2.9  3.2  3.5  3.8  4.1  4.4  4.7  5.0  5.3  5.6  5.9  6.2
## [16]  6.5  6.8  7.1  7.4  7.7  8.0  8.3  8.6  8.9  9.2  9.5  9.8 10.1 10.4 10.7
## [31] 11.0 11.3 11.6 11.9 12.2 12.5 12.8 13.1 13.4 13.7 14.0 14.3 14.6 14.9 15.2
## [46] 15.5 15.8 16.1 16.4 16.7 17.0 17.3 17.6 17.9

Note that the above sequence ends at 17.9 (the last complete increment). Another option is to specify the length of the output vector instead - like “I want to have 30 values between 2 and 18, evenly spaced.” To do that, use the length.out argument within the seq() function (the shorter length = also works, because R partially matches argument names).

seq(from = 2, to = 18, length.out = 30)
##  [1]  2.000000  2.551724  3.103448  3.655172  4.206897  4.758621  5.310345
##  [8]  5.862069  6.413793  6.965517  7.517241  8.068966  8.620690  9.172414
## [15]  9.724138 10.275862 10.827586 11.379310 11.931034 12.482759 13.034483
## [22] 13.586207 14.137931 14.689655 15.241379 15.793103 16.344828 16.896552
## [29] 17.448276 18.000000
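To compare the two approaches side by side, here is a small sketch that stores both sequences and inspects them. The object names by_seq and len_seq are made up for this illustration, and the full argument name length.out is used (length = matches it partially):

```r
# Same endpoints, two ways of controlling the spacing
by_seq  <- seq(from = 2, to = 18, by = 0.3)
len_seq <- seq(from = 2, to = 18, length.out = 30)

length(by_seq)     # 54 values
tail(by_seq, 1)    # 17.9: the last full step that fits before 18

length(len_seq)    # 30 values
tail(len_seq, 1)   # 18: the endpoint is always included

# With length.out, the spacing is (to - from) / (length.out - 1)
(18 - 2) / (30 - 1)   # about 0.5517
```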

4. Make the logistic growth function…function

We’ll write a LOT of functions in R and Python (especially in EDS 221). For now, we can use the nice Cmd + Option + X shortcut to create a function for us.

Let’s make a function of the logistic growth equation. Recall, the expression for population size at any time t following logistic growth is given by:

\[N_t=\frac{K}{1+\left[\frac{K-N_0}{N_0}\right]e^{-rt}}\]

Let’s write it out. When in doubt, parentheses! Keep in mind that you may want to make your argument names something a bit more descriptive. Always ask: What will make future me least likely to mess this up? What would make these function arguments clearest to my collaborators?

pop_logistic <- function(capacity, init_pop, rate, time_yr) {
  capacity / (1 + ((capacity - init_pop) / init_pop) * exp(-rate * time_yr))
}
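One quick way to gain confidence in a new function is to test inputs where you already know the answer. The sketch below repeats the function definition so it runs on its own, and uses made-up values (capacity 100, initial population 10, rate 0.5) purely for illustration:

```r
pop_logistic <- function(capacity, init_pop, rate, time_yr) {
  capacity / (1 + ((capacity - init_pop) / init_pop) * exp(-rate * time_yr))
}

# At time 0, exp(0) = 1, so the result should equal the initial population
pop_logistic(capacity = 100, init_pop = 10, rate = 0.5, time_yr = 0)      # 10

# Far in the future, exp(-rate * time) approaches 0,
# so the result should approach the carrying capacity
pop_logistic(capacity = 100, init_pop = 10, rate = 0.5, time_yr = 1000)   # 100
```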

Logistic population - one time

Let’s say that for a population of chipmunks in one region, the carrying capacity is 2,580 individuals, the exponential growth rate is 0.32 yr⁻¹, and time is in years. If the initial population is 230 individuals, what is the estimated population size at time = 2.4 years?

pop_logistic(capacity = 2580, init_pop = 230, rate = 0.32, time_yr = 2.4)
## [1] 449.4572
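As a check on that output, substituting the chipmunk values into the logistic growth expression gives the same answer (intermediate values are rounded here, so the last digits differ slightly from R's):

\[N_{2.4}=\frac{2580}{1+\left[\frac{2580-230}{230}\right]e^{-0.32\times 2.4}}\approx\frac{2580}{1+10.217\times 0.4639}\approx\frac{2580}{5.740}\approx 449.5\]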

Logistic population - a lot of times

Now let’s say we want to predict (then plot) the estimated population over a bunch of different times. Based on what we’ve learned today, how do you expect we might do that? A sequence of values as the time input!

Let’s make a sequence of times (0 to 20 years, in half-year increments), then use that vector as our time input in the logistic growth model.

# First, create the vector (a sequence of values)
time_vec <- seq(from = 0, to = 20, by = 0.5)

# Then, use that as your time input in the model:
pop_logistic(capacity = 2580, init_pop = 230, rate = 0.32, time_yr = time_vec)
##  [1]  230.0000  265.7962  306.4370  352.3458  403.9105  461.4584  525.2265
##  [8]  595.3303  671.7323  754.2132  842.3511  935.5115 1032.8508 1133.3382
## [15] 1235.7931 1338.9376 1441.4593 1542.0771 1639.6038 1733.0001 1821.4121
## [22] 1904.1943 1980.9141 2051.3424 2115.4329 2173.2940 2225.1574 2271.3465
## [29] 2312.2467 2348.2800 2379.8838 2407.4939 2431.5322 2452.3984 2470.4641
## [36] 2486.0700 2499.5249 2511.1059 2521.0597 2529.6041 2536.9311

We want to plot those estimated population sizes - but we didn’t store the vector of outputs! Remember - if you want to store an output, use the assignment operator (<-) in R, then check that the object exists in your environment.

chipmunk_pop <- pop_logistic(capacity = 2580, init_pop = 230, rate = 0.32, time_yr = time_vec)

# Then we can call chipmunk_pop:
chipmunk_pop
##  [1]  230.0000  265.7962  306.4370  352.3458  403.9105  461.4584  525.2265
##  [8]  595.3303  671.7323  754.2132  842.3511  935.5115 1032.8508 1133.3382
## [15] 1235.7931 1338.9376 1441.4593 1542.0771 1639.6038 1733.0001 1821.4121
## [22] 1904.1943 1980.9141 2051.3424 2115.4329 2173.2940 2225.1574 2271.3465
## [29] 2312.2467 2348.2800 2379.8838 2407.4939 2431.5322 2452.3984 2470.4641
## [36] 2486.0700 2499.5249 2511.1059 2521.0597 2529.6041 2536.9311
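Why does passing a whole vector of times work? The arithmetic and exp() inside the function operate element by element, so we get one output per input time. The sketch below repeats the definitions so it runs on its own, then compares the vectorized result with evaluating one time at a time. The name one_at_a_time is made up for this illustration:

```r
pop_logistic <- function(capacity, init_pop, rate, time_yr) {
  capacity / (1 + ((capacity - init_pop) / init_pop) * exp(-rate * time_yr))
}

time_vec <- seq(from = 0, to = 20, by = 0.5)
chipmunk_pop <- pop_logistic(capacity = 2580, init_pop = 230, rate = 0.32, time_yr = time_vec)

# One output value per input time
length(time_vec)       # 41
length(chipmunk_pop)   # 41

# Evaluating each time separately gives the same values
one_at_a_time <- sapply(time_vec, function(t) {
  pop_logistic(capacity = 2580, init_pop = 230, rate = 0.32, time_yr = t)
})
all.equal(chipmunk_pop, one_at_a_time)   # TRUE
```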

5. Make a plot!

You will learn a lot more about data visualization throughout MEDS. But let’s make a first little rough one just for fun using the ggplot2 package, which is part of the tidyverse (more on this in EDS 221).

Attach the tidyverse in the setup chunk of your Quarto document using library(tidyverse). Note: you should already have the package installed on your computer - if not, you’ll need to do that first.

Let’s first combine our time sequence (time_vec) and predicted populations (chipmunk_pop) into a single data frame - a table of data where different vectors (we’ll think of these as variables moving forward) are stored in columns.

chipmunk_df <- data.frame(time_vec, chipmunk_pop)

# ALWAYS look:
head(chipmunk_df)
##   time_vec chipmunk_pop
## 1      0.0     230.0000
## 2      0.5     265.7962
## 3      1.0     306.4370
## 4      1.5     352.3458
## 5      2.0     403.9105
## 6      2.5     461.4584

Now follow along as Allison raves about the grammar of graphics to make a basic ggplot graph:

ggplot(data = chipmunk_df, aes(x = time_vec, y = chipmunk_pop)) +
  geom_point()
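Because ggplot2 builds graphs in layers joined with +, the basic graph is easy to extend. The sketch below is one possible variation, not the version made in the session: it adds a line layer and axis labels, and the label text is an assumption. It attaches ggplot2 directly (library(tidyverse) attaches it too) and recreates the data frame so it runs on its own.

```r
library(ggplot2)

# Recreate the data frame from the steps above
time_vec <- seq(from = 0, to = 20, by = 0.5)
chipmunk_pop <- 2580 / (1 + ((2580 - 230) / 230) * exp(-0.32 * time_vec))
chipmunk_df <- data.frame(time_vec, chipmunk_pop)

# Same mapping as before, with a line layer and axis labels added
chipmunk_plot <- ggplot(data = chipmunk_df, aes(x = time_vec, y = chipmunk_pop)) +
  geom_point() +
  geom_line() +
  labs(x = "Time (years)", y = "Estimated population (individuals)")

# Print the stored plot
chipmunk_plot
```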

6. No precious objects or outputs!

  • Save your .qmd, which lives in your project.
  • Close your whole project (File > Close project)
  • Restart your R session & check environment
  • Find wherever your project lives on your computer
  • Open the .Rproj file (NOT the .qmd on its own - don’t orphan your project files)
  • Check for clues you’re in your project
  • In the Files tab of RStudio, click on the .qmd you saved
  • Use Cmd + Option + R to run all code in your .qmd
  • Check to see that all objects and outputs are automatically reproduced
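A few console calls can help with the "check environment" and "check for clues" steps above. This is a rough sketch, assuming your RStudio settings do not restore a saved workspace on startup:

```r
# Right after restarting R, the global environment should be empty
ls()

# The working directory should be your project folder
getwd()

# The project's .qmd and .Rproj files should show up here
list.files(pattern = "\\.(qmd|Rproj)$")
```

If ls() still lists objects after a restart, they were probably restored from a saved workspace rather than recreated by your code.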

End interactive session 1b