eds221-day2-comp
r-data-types
py-data-types
dogs <- c("teddy", "khora", "waffle", "banjo")
typeof(dogs)
## [1] "character"
class(dogs)
## [1] "character"
weights <- c(50, 55, 25, 35)
typeof(weights) # Hmmm what is different about this and the line below?
## [1] "double"
class(weights)
## [1] "numeric"
dog_age <- c(5L, 6L, 1L, 7L)
typeof(dog_age)
## [1] "integer"
class(dog_age)
## [1] "integer"
# Check with a logical:
is.numeric(dog_age)
## [1] TRUE
R coerces every element of a vector to a single type, following a hierarchy: the broadest type present wins (if any element is a character, the entire vector becomes character).
dog_info <- c("teddy", 50, 5L)
dog_info
## [1] "teddy" "50" "5"
typeof(dog_info)
## [1] "character"
class(dog_info)
## [1] "character"
is.character(dog_info)
## [1] TRUE
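For comparison, a plain Python list does not coerce its elements the way an R vector does; each item keeps its own type. A minimal sketch of that contrast, where `dog_info_as_text` is a made-up name for illustration and the explicit `str()` conversion only mimics what R does automatically:

```python
# A Python list can hold mixed types without coercing them
dog_info = ["teddy", 50, 5]
types = [type(item).__name__ for item in dog_info]
print(types)  # ['str', 'int', 'int']

# To mimic R's coercion to character, convert each item explicitly
dog_info_as_text = [str(item) for item in dog_info]
print(dog_info_as_text)  # ['teddy', '50', '5']
```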
dog_food <- c(teddy = "purina", khora = "alpo", waffle = "fancy feast", banjo = "blue diamond")
dog_food
## teddy khora waffle banjo
## "purina" "alpo" "fancy feast" "blue diamond"
class(dog_food)
## [1] "character"
typeof(dog_food)
## [1] "character"
Use [] with the position or name to access elements of a vector.
dog_food[2]
## khora
## "alpo"
dog_food["khora"]
## khora
## "alpo"
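A rough Python counterpart to a named vector is a dictionary. This is a sketch rather than an exact equivalent: a dict is looked up by name, and getting an item by position means going through a list of its values (the names `by_name` and `by_position` are illustrative).

```python
dog_food = {"teddy": "purina", "khora": "alpo",
            "waffle": "fancy feast", "banjo": "blue diamond"}

# Look up by name, like dog_food["khora"] in R
by_name = dog_food["khora"]
print(by_name)  # alpo

# Dicts keep insertion order (Python 3.7+), so the second value sits at index 1
by_position = list(dog_food.values())[1]
print(by_position)  # alpo
```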
Or we can specify a range of values within a vector using [:]. The first element in an R vector is at position 1. This is an important distinction: in Python, the first element is at position 0 (zero-indexing).
# Create a vector of car colors observed
cars <- c("red", "orange", "white", "blue", "green", "silver", "black")
# Access just the 5th element
cars[5]
## [1] "green"
# Access elements 2 through 4
cars[2:4]
## [1] "orange" "white" "blue"
i <- 4
cars[i]
## [1] "blue"
i <- seq(1, 3)
cars[i]
## [1] "red" "orange" "white"
cars[3] <- "BURRITOS!"
cars
## [1] "red" "orange" "BURRITOS!" "blue" "green" "silver"
## [7] "black"
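To make the indexing difference concrete, here is the same car example sketched with a Python list. Note two shifts from R: positions start at 0, and the end of a slice is not included.

```python
cars = ["red", "orange", "white", "blue", "green", "silver", "black"]

# The 5th element sits at index 4 because Python counts from 0
fifth = cars[4]
print(fifth)  # green

# Elements 2 through 4: start at index 1, stop before index 4
second_to_fourth = cars[1:4]
print(second_to_fourth)  # ['orange', 'white', 'blue']
```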
(…we did some of this in EDS 212 too!)
fish_size <- matrix(c(0.8, 1.2, 0.4, 0.9), ncol = 2, nrow = 2, byrow = FALSE)
fish_size
## [,1] [,2]
## [1,] 0.8 0.4
## [2,] 1.2 0.9
typeof(fish_size) # Returns the storage type of the values
## [1] "double"
class(fish_size) # Returns matrix / array
## [1] "matrix" "array"
What happens if we try to combine multiple data types into a matrix?
dog_walk <- matrix(c("teddy", 5, "khora", 10), ncol = 2, nrow = 2, byrow = FALSE)
dog_walk
## [,1] [,2]
## [1,] "teddy" "khora"
## [2,] "5" "10"
class(dog_walk)
## [1] "matrix" "array"
typeof(dog_walk)
## [1] "character"
# Hmmmmmm once again back to the broadest category of data type in the hierarchy
Index using [row, column].
whale_travel <- matrix(data = c(31.8, 1348, 46.9, 1587), nrow = 2, ncol = 2, byrow = TRUE)
# Take a look
whale_travel
## [,1] [,2]
## [1,] 31.8 1348
## [2,] 46.9 1587
# Access the value 1348
whale_travel[1,2] # Row 1, column 2
## [1] 1348
# Access the value 46.9
whale_travel[2,1]
## [1] 46.9
If you leave the row or column index blank (keeping the comma), R returns every value along that dimension. For example, to get everything in row 2:
whale_travel[2,]
## [1] 46.9 1587.0
Or, to access everything in column 1:
whale_travel[, 1]
## [1] 31.8 46.9
What happens if I give a matrix only one index? R treats the matrix as a single vector filled column by column and returns the value at that position. For example:
whale_travel[3]
## [1] 1348
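The column-by-column counting can be sketched in Python. This is an illustration of the arithmetic only, not how R is implemented: `column_major_lookup` is a hypothetical helper, and the matrix is stored as a list of rows.

```python
# whale_travel as a list of rows (a stand-in for the R matrix)
whale_travel = [[31.8, 1348], [46.9, 1587]]

def column_major_lookup(matrix, position):
    """Return the value at a 1-based position, counting down each column first."""
    n_rows = len(matrix)
    zero_based = position - 1
    row = zero_based % n_rows    # how far down the current column
    col = zero_based // n_rows   # how many full columns were passed
    return matrix[row][col]

third = column_major_lookup(whale_travel, 3)
print(third)  # 1348
```

Position 3 lands on row 1, column 2 because positions 1 and 2 are used up by the first column.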
urchins <- list("blue", c(1, 2, 3), c("a cat", "a dog"), 5L)
urchins
## [[1]]
## [1] "blue"
##
## [[2]]
## [1] 1 2 3
##
## [[3]]
## [1] "a cat" "a dog"
##
## [[4]]
## [1] 5
Important: a single [] returns a list. [[]] returns the item STORED in the list.
urchins[[2]]
## [1] 1 2 3
# Compare that to:
urchins[2]
## [[1]]
## [1] 1 2 3
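Python lists make a similar distinction, which may help as a loose analogy: indexing returns the stored item, while a one-element slice returns a list containing it.

```python
urchins = ["blue", [1, 2, 3], ["a cat", "a dog"], 5]

# Indexing returns the stored item itself (like [[ ]] in R)
item = urchins[1]
print(item)  # [1, 2, 3]

# A one-element slice returns a list that contains the item (like [ ] in R)
sub_list = urchins[1:2]
print(sub_list)  # [[1, 2, 3]]
```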
tacos <- list(topping = c("onion", "cilantro", "guacamole"), filling = c("beans", "meat", "veggie"), price = c(6.75, 8.25, 9.50))
# The whole thing
tacos
## $topping
## [1] "onion" "cilantro" "guacamole"
##
## $filling
## [1] "beans" "meat" "veggie"
##
## $price
## [1] 6.75 8.25 9.50
# Just get one piece of it:
tacos[[2]]
## [1] "beans" "meat" "veggie"
#...or, the same thing:
tacos$filling
## [1] "beans" "meat" "veggie"
A data frame is a list containing vectors of the same length, where each column is a variable stored in a vector. Let’s make one:
fruit <- data.frame(type = c("apple", "banana", "peach"),
                    mass = c(130, 195, 150))
# Look at it
fruit
## type mass
## 1 apple 130
## 2 banana 195
## 3 peach 150
# Check the class
class(fruit)
## [1] "data.frame"
Use [row#, col#], or name the column and then give the element number (for example, fruit$mass[2]).
fruit[1,2]
## [1] 130
fruit[3,1]
## [1] "peach"
fruit[2,1] <- "pineapple"
fruit
## type mass
## 1 apple 130
## 2 pineapple 195
## 3 peach 150
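The idea that a data frame is a set of equal-length columns can be sketched in plain Python with a dictionary of lists. This is a simplified stand-in, not a real data frame, and `get_cell` is a hypothetical helper that takes a 1-based row number to match R.

```python
# One list per column, standing in for the fruit data frame
fruit = {"type": ["apple", "banana", "peach"], "mass": [130, 195, 150]}

# Every column has the same length, as a data frame requires
column_lengths = {len(values) for values in fruit.values()}
same_length = len(column_lengths) == 1
print(same_length)  # True

def get_cell(table, row, column_name):
    """Return one cell, using a 1-based row number as R does."""
    return table[column_name][row - 1]

print(get_cell(fruit, 1, "mass"))  # 130
print(get_cell(fruit, 3, "type"))  # peach

# Same update as fruit[2,1] <- "pineapple"
fruit["type"][1] = "pineapple"
```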
import numpy as np
import pandas as pd
teddy = [1,2,8]
teddy_vec = np.array(teddy)
teddy_vec
## array([1, 2, 8])
type(teddy_vec)
## <class 'numpy.ndarray'>
A list is mutable - you can change it directly!
teddy[1] = 1000
# See that the element at index 1 (the second item) is updated directly!
teddy
## [1, 1000, 8]
A tuple is immutable - you’ll get yelled at if you try to change it!
khora = (1, 5, 12)
type(khora)
## <class 'tuple'>
# khora[1] = 16 # Nope: this raises a TypeError.
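To see the error without stopping a script, the assignment can be wrapped in try/except. A minimal sketch, where `changed` and `khora_updated` are illustrative names:

```python
khora = (1, 5, 12)

try:
    khora[1] = 16  # tuples do not support item assignment
    changed = True
except TypeError as error:
    changed = False
    print(f"Nope: {error}")

# To "change" a tuple, build a new one from its pieces instead
khora_updated = khora[:1] + (16,) + khora[2:]
print(khora_updated)  # (1, 16, 12)
```

The original `khora` is left untouched; only the new tuple holds the updated value.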
A more involved list (note: you can also use list() to create lists in Python).
waffle = [["cat", "dog", "penguin"], 2, "a burrito", [1,2,5]]
waffle
## [['cat', 'dog', 'penguin'], 2, 'a burrito', [1, 2, 5]]
# Access an element from the list waffle:
waffle[0] # Indexing returns the stored piece itself (here, the inner list)
## ['cat', 'dog', 'penguin']
We can reassign pieces of a list:
waffle[1] = "AN EXTRAVAGANZA"
waffle
## [['cat', 'dog', 'penguin'], 'AN EXTRAVAGANZA', 'a burrito', [1, 2, 5]]
fox = {'sound': ["screech", "squeal", "bark"], 'age': [2, 6, 10]}
fox['sound']
## ['screech', 'squeal', 'bark']
fox['age']
## [2, 6, 10]
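Because each value in this dictionary is a list, a key lookup can be chained with a list index to reach a single value. A short sketch using the same `fox` dictionary (`second_sound` and `pairs` are illustrative names):

```python
fox = {'sound': ["screech", "squeal", "bark"], 'age': [2, 6, 10]}

# Chain a key lookup with a list index to reach one value
second_sound = fox['sound'][1]
print(second_sound)  # squeal

# Pair up the two lists element by element
pairs = list(zip(fox['sound'], fox['age']))
print(pairs)  # [('screech', 2), ('squeal', 6), ('bark', 10)]

# List the keys in the order they were defined
print(list(fox))  # ['sound', 'age']
```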
cows = {'name': ["moo", "spots", "happy"], 'location': ["pasture", "prairie", "barn"], 'height': [5.7, 5.4, 4.9]}
cows_df = pd.DataFrame(cows)
# Take a look
cows_df
## name location height
## 0 moo pasture 5.7
## 1 spots prairie 5.4
## 2 happy barn 4.9
# Get a column
cows_df['name']
## 0 moo
## 1 spots
## 2 happy
## Name: name, dtype: object
# Get an element using df.at[]
cows_df.at[1, 'name']
## 'spots'