ggplot2 graph
Discussion - what does this do? Where does it live on your computer? What does it contain?
log() = natural log
log10() = log base 10
exp() = natural exponential
Let’s try some!
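A few quick calls to try in the Console. The input values here are arbitrary, chosen only for illustration:

```r
# The natural log and the natural exponential undo each other
log(exp(2))     # 2

# Log base 10 of 1000
log10(1000)     # 3

# e raised to the power 0
exp(0)          # 1
```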
Sometimes we’ll want to create sequences of values that we can plug into a function to see how an output value changes over a range of inputs.
We can make a sequence of values, stored as a vector in R, using the seq() function. The general structure looks like this:
seq(from = start_value, to = end_value, by = increment)
For example, to create a sequence from 2 to 18 by increments of 0.3, I would use:
seq(from = 2, to = 18, by = 0.3)
## [1] 2.0 2.3 2.6 2.9 3.2 3.5 3.8 4.1 4.4 4.7 5.0 5.3 5.6 5.9 6.2
## [16] 6.5 6.8 7.1 7.4 7.7 8.0 8.3 8.6 8.9 9.2 9.5 9.8 10.1 10.4 10.7
## [31] 11.0 11.3 11.6 11.9 12.2 12.5 12.8 13.1 13.4 13.7 14.0 14.3 14.6 14.9 15.2
## [46] 15.5 15.8 16.1 16.4 16.7 17.0 17.3 17.6 17.9
Note that the above sequence ends at 17.9, the last value that can be reached in complete increments of 0.3 without exceeding 18. Another option is to specify the length of the output vector instead, like “I want to have 30 values between 2 and 18, evenly spaced.” To do that, use the length = argument (short for length.out =, the argument’s full name) within the seq() function.
seq(from = 2, to = 18, length = 30)
## [1] 2.000000 2.551724 3.103448 3.655172 4.206897 4.758621 5.310345
## [8] 5.862069 6.413793 6.965517 7.517241 8.068966 8.620690 9.172414
## [15] 9.724138 10.275862 10.827586 11.379310 11.931034 12.482759 13.034483
## [22] 13.586207 14.137931 14.689655 15.241379 15.793103 16.344828 16.896552
## [29] 17.448276 18.000000
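One way to compare the two approaches is to store each sequence and check its length and last value. This is a small sketch; the object names by_seq and len_seq are made up for this example.

```r
# Same start and end, but one fixes the increment and the other fixes the length
by_seq  <- seq(from = 2, to = 18, by = 0.3)
len_seq <- seq(from = 2, to = 18, length.out = 30)

length(by_seq)   # 54 values, spaced 0.3 apart
length(len_seq)  # 30 values, evenly spaced

max(by_seq)      # 17.9 (stops before overshooting 18)
max(len_seq)     # 18 (always lands on the end value)
```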
We’ll write a LOT of functions in R and Python (especially in EDS 221). For now, we can use the nice Cmd + Option + X shortcut to create a function for us.
Let’s make a function of the logistic growth equation. Recall, the expression for population size at any time t following logistic growth is given by:
\[N_t=\frac{K}{1+[\frac{K-N_0}{N_0}]e^{-rt}}\]
Let’s write it out. When in doubt, parentheses! Keep in mind that you may want to make your argument names something a bit more descriptive. Always ask: What will make future me least likely to mess this up? What would make these function arguments clearest to my collaborators?
pop_logistic <- function(capacity, init_pop, rate, time_yr) {
  capacity / (1 + ((capacity - init_pop) / init_pop) * exp(-rate * time_yr))
}
Let’s say that for a population of chipmunks in one region, the carrying capacity is 2,580 individuals, the exponential growth rate is 0.32 (yr⁻¹), and time is in years. If the initial population is 230 individuals, what is the estimated population size at time = 2.4 years?
pop_logistic(capacity = 2580, init_pop = 230, rate = 0.32, time_yr = 2.4)
## [1] 449.4572
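It can be reassuring to sanity-check a new function at inputs where we already know roughly what the answer should be. The sketch below repeats the function definition so it runs on its own; the time values 0 and 100 are chosen only for this check.

```r
pop_logistic <- function(capacity, init_pop, rate, time_yr) {
  capacity / (1 + ((capacity - init_pop) / init_pop) * exp(-rate * time_yr))
}

# At time 0, the model should return the initial population
pop_at_start <- pop_logistic(capacity = 2580, init_pop = 230, rate = 0.32, time_yr = 0)
pop_at_start   # 230

# Far in the future, the population should level off very close to carrying capacity
pop_far_future <- pop_logistic(capacity = 2580, init_pop = 230, rate = 0.32, time_yr = 100)
pop_far_future
```

If either check fails, a misplaced parenthesis in the function body is the first thing to look for.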
Now let’s say we want to predict (then plot) the estimated population over a bunch of different times. Based on what we’ve learned today, how do you expect we might do that? A sequence of values as the time input!
Let’s make a sequence of times (0 to 20 years, by 1/2 year increments), then use that vector as our time input in the logistic growth model.
# First, create the vector (a sequence of values)
time_vec <- seq(from = 0, to = 20, by = 0.5)
# Then, use that as your time input in the model:
pop_logistic(capacity = 2580, init_pop = 230, rate = 0.32, time_yr = time_vec)
## [1] 230.0000 265.7962 306.4370 352.3458 403.9105 461.4584 525.2265
## [8] 595.3303 671.7323 754.2132 842.3511 935.5115 1032.8508 1133.3382
## [15] 1235.7931 1338.9376 1441.4593 1542.0771 1639.6038 1733.0001 1821.4121
## [22] 1904.1943 1980.9141 2051.3424 2115.4329 2173.2940 2225.1574 2271.3465
## [29] 2312.2467 2348.2800 2379.8838 2407.4939 2431.5322 2452.3984 2470.4641
## [36] 2486.0700 2499.5249 2511.1059 2521.0597 2529.6041 2536.9311
We want to plot those estimated population sizes - but we didn’t store the vector of outputs! Remember - if you want to store an output, use the assignment operator (<-) in R, and check that the object exists in your environment.
chipmunk_pop <- pop_logistic(capacity = 2580, init_pop = 230, rate = 0.32, time_yr = time_vec)
# Then we can call chipmunk_pop:
chipmunk_pop
## [1] 230.0000 265.7962 306.4370 352.3458 403.9105 461.4584 525.2265
## [8] 595.3303 671.7323 754.2132 842.3511 935.5115 1032.8508 1133.3382
## [15] 1235.7931 1338.9376 1441.4593 1542.0771 1639.6038 1733.0001 1821.4121
## [22] 1904.1943 1980.9141 2051.3424 2115.4329 2173.2940 2225.1574 2271.3465
## [29] 2312.2467 2348.2800 2379.8838 2407.4939 2431.5322 2452.3984 2470.4641
## [36] 2486.0700 2499.5249 2511.1059 2521.0597 2529.6041 2536.9311
You will learn a lot more about data visualization throughout MEDS. But let’s make a first rough graph just for fun using the ggplot2 package, which is part of the tidyverse (more on this in EDS 221).
Attach the tidyverse in the setup chunk of your Quarto document using library(tidyverse). Note: you should already have the package installed on your computer - if not, you’ll need to do that first.
Let’s first combine our time sequence (time_vec) and predicted populations (chipmunk_pop) into a single data frame - a table of data where different vectors (we’ll think of these as variables moving forward) are stored in columns.
chipmunk_df <- data.frame(time_vec, chipmunk_pop)
# ALWAYS look:
head(chipmunk_df)
## time_vec chipmunk_pop
## 1 0.0 230.0000
## 2 0.5 265.7962
## 3 1.0 306.4370
## 4 1.5 352.3458
## 5 2.0 403.9105
## 6 2.5 461.4584
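By default, data.frame() uses the vector names as column names. If you would rather choose the column names yourself, you can supply them as name = vector pairs. The sketch below is one possible variation, not a required step; the names chipmunk_named_df, time_yr and pop_size are made up for this example.

```r
# Recreate the inputs so this example runs on its own
pop_logistic <- function(capacity, init_pop, rate, time_yr) {
  capacity / (1 + ((capacity - init_pop) / init_pop) * exp(-rate * time_yr))
}
time_vec <- seq(from = 0, to = 20, by = 0.5)
chipmunk_pop <- pop_logistic(capacity = 2580, init_pop = 230, rate = 0.32, time_yr = time_vec)

# Supply column names explicitly (these names are hypothetical)
chipmunk_named_df <- data.frame(time_yr = time_vec, pop_size = chipmunk_pop)

names(chipmunk_named_df)  # "time_yr" "pop_size"
nrow(chipmunk_named_df)   # 41 rows, one per time step
```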
Now follow along as Allison raves about the grammar of graphics to make a basic ggplot graph:
ggplot(data = chipmunk_df, aes(x = time_vec, y = chipmunk_pop)) +
  geom_point()
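As an optional next step, here is a sketch of how the same graph might be extended with a connecting line and clearer axis labels. The label text and the object name chipmunk_plot are assumptions for this example, and it attaches ggplot2 directly (rather than the whole tidyverse) so that it runs on its own.

```r
library(ggplot2)

# Recreate the data frame so this example runs on its own
pop_logistic <- function(capacity, init_pop, rate, time_yr) {
  capacity / (1 + ((capacity - init_pop) / init_pop) * exp(-rate * time_yr))
}
time_vec <- seq(from = 0, to = 20, by = 0.5)
chipmunk_pop <- pop_logistic(capacity = 2580, init_pop = 230, rate = 0.32, time_yr = time_vec)
chipmunk_df <- data.frame(time_vec, chipmunk_pop)

# Layers are added with +: points, then a line, then axis labels
chipmunk_plot <- ggplot(data = chipmunk_df, aes(x = time_vec, y = chipmunk_pop)) +
  geom_point() +
  geom_line() +
  labs(x = "Time (years)", y = "Estimated population (individuals)")

# Print the stored plot to display it
chipmunk_plot
```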