In this chapter we will learn what an object is in R. An object is similar, but not identical, to a variable in other programming languages. A variable is used to store and access data. An object also has properties, known as attributes, that determine how the data it holds can be processed. In R, objects can both store and process data.
Let's start with some warm-up, calculator-style operations. Pay special attention to the operators (+, -, *, /, ^, **, %%, %/%) and what each one does:
Note: Any statement that starts with a # is a comment in the R code. It will not be executed.
# Addition
9+2
## [1] 11
# Subtraction
9-2
## [1] 7
# Multiplication
39*2
## [1] 78
# Division
9/2
## [1] 4.5
# Modulus
9%%2
## [1] 1
# Integer division
9%/%2
## [1] 4
# Exponentiation (type 1)
9^2
## [1] 81
# Exponentiation (type 2)
9**2
## [1] 81
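The modulus and integer-division operators are linked: multiplying the quotient by the divisor and adding the remainder gives back the original number. The sketch below checks this and also shows that exponentiation is evaluated before multiplication. The object names (dividend, divisor, and so on) are illustrative and not part of the examples above.

```r
# Illustrative values; any positive whole numbers would do
dividend <- 9
divisor <- 2

quotient <- dividend %/% divisor    # integer division
remainder <- dividend %% divisor    # modulus

# Rebuild the original number from the quotient and the remainder
rebuilt <- quotient * divisor + remainder
rebuilt == dividend

# Exponentiation is evaluated before multiplication, so this is 2 * (3^2)
precedence_check <- 2 * 3^2
precedence_check
```

When in doubt about the order of evaluation, parentheses make the intent explicit, e.g. (2 * 3)^2.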
It was tiring to type 9 and 2 over and over. Working with numbers like 528.548 and 854.698, or larger ones, makes the typing harder and invites errors. To make things easier, we can take advantage of an object. Everything in R is an object, and its properties depend on the kind of task we assign to it. One such task is to store data. Each object has a name, either stores data or is a function, and has certain attributes. To create an object we use the following command structure:
# Three ways of creating an object
# Example 1
object_name <- DATA
# or
# Example 2
DATA -> object_name
# or
# Example 3, not the recommended usage
object_name = DATA
It is very important to note that object names are case sensitive: a and A are two different objects.
Now with an example:
a <- 2
# print what is stored in a
a
## [1] 2
# another way of doing it
print(a)
## [1] 2
More examples on what can be done with objects:
a <- 7
b <- 3
a
## [1] 7
b
## [1] 3
a*b
## [1] 21
a+b
## [1] 10
a/b
## [1] 2.333333
a**b
## [1] 343
a^b
## [1] 343
# change 'a' to upper case and see what happens
# A^b
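Because names are case sensitive, referring to an object with the wrong capitalisation is an error rather than a silent mistake. The sketch below uses hypothetical names (wing_span and Wing_Span) chosen so they are unlikely to clash with anything in your session; tryCatch() is used only so the error message can be stored and inspected instead of stopping the script.

```r
wing_span <- 7

# exists() reports whether an object with exactly this name is defined
lower_exists <- exists("wing_span")
upper_exists <- exists("Wing_Span")
lower_exists
upper_exists

# Using the undefined name raises an error; tryCatch() captures its message
msg <- tryCatch(
  Wing_Span * 2,
  error = function(e) conditionMessage(e)
)
msg
```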
There are eight main object types in R, excluding functions, which we will deal with separately. I will show examples of each object type before explaining why there is such diversity. Let's dive into each object type.
We work with two types of numbers: integers (\(-x, 0, +x\)) and floating-point numbers, also referred to as doubles, e.g. 3.22.
# integer (the L suffix tells R to store a whole number as an integer)
x <- 22L
# double
y <- 3.141
# Let's convert y to an integer (the fractional part is dropped)
y <- as.integer(y)
x+y
## [1] 25
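A detail that often surprises newcomers: a number typed without a suffix is stored as a double even when it has no decimal part, and as.integer() truncates rather than rounds. The short sketch below checks both points with typeof(); the object names are illustrative.

```r
whole <- 22        # stored as a double, even though it looks like an integer
whole_int <- 22L   # the L suffix asks for an integer

typeof(whole)
## [1] "double"
typeof(whole_int)
## [1] "integer"

# as.integer() drops the fractional part (truncates toward zero)
truncated_pos <- as.integer(3.999)
truncated_neg <- as.integer(-3.999)
truncated_pos
## [1] 3
truncated_neg
## [1] -3

# round() is the function to use when rounding is intended
rounded <- round(3.999)
rounded
## [1] 4
```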
Character objects are how we deal with letters, words, and sentences. We wrap the text in double quotes ("") or single quotes ('') to define them in R:
# Character
x <- "cars"
# Not an integer anymore
x <- "292"
As long as your computer's memory (RAM) permits, you can load multiple sentences, or the content of an entire book, into one object. Two conditions apply: the text must be surrounded by single or double quotes, and any quote of the same kind that appears inside the text must be escaped with a backslash (\").
# Quote from Lord of the Rings
y <- "There is only one Lord of the Ring, only one who can bend it to his will. And he does not share power."
# This assignment is not permitted because the inner double quotes end the string early. Remove the leading # from the statement below to see the error
#y <- "There is only one "Lord" of the Ring, only one who can bend it to his will. And he does not share power."
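There are two common ways to keep quotation marks inside a character object: escape them with a backslash, or wrap the text in the other kind of quote. The sketch below uses a shortened version of the sentence and illustrative object names.

```r
# Option 1: escape the inner double quotes with a backslash
quote_escaped <- "There is only one \"Lord\" of the Ring."

# Option 2: wrap the text in single quotes so double quotes can appear inside
quote_single <- 'There is only one "Lord" of the Ring.'

# Both objects hold exactly the same text
same_text <- identical(quote_escaped, quote_single)
same_text
## [1] TRUE

# print() shows the escape characters; cat() shows the text as a reader sees it
print(quote_escaped)
cat(quote_escaped, "\n")

# nchar() counts the characters in the text (the backslashes are not stored)
n_chars <- nchar(quote_escaped)
n_chars
```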
In programming we often need to know whether a condition has been met. For such situations we use the logical (also referred to as boolean) data type. It allows only two values, TRUE or FALSE, which can also be abbreviated to T or F.
x_is <- TRUE
# or, abbreviated
x_is <- T
# the same works for FALSE
y_is <- FALSE
# or, abbreviated
y_is <- F
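In practice, logical values usually come from a comparison rather than being typed in directly. The sketch below uses a made-up altitude value and illustrative names to show a comparison producing a logical object, followed by the basic logical operators.

```r
altitude <- 35000                   # made-up value for illustration
is_cruising <- altitude > 30000     # a comparison returns TRUE or FALSE
is_cruising
## [1] TRUE
class(is_cruising)
## [1] "logical"

# Logical operators: & (and), | (or), ! (not)
both <- is_cruising & FALSE
either <- is_cruising | FALSE
negated <- !is_cruising
both
## [1] FALSE
either
## [1] TRUE
negated
## [1] FALSE
```

TRUE and FALSE are reserved words that cannot be reassigned, whereas T and F are ordinary objects that can be overwritten, so the full names are the safer choice in scripts.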
If you want to divide your data set into categories, factors will help you do that. Some examples of factors are days of the week, months, and types of cars (SUV, CRV, Sedan). An example is always better:
# in the statement below, we are just defining a character vector (a collection of strings)
plane_types <- c("A320","A330","A340","A380","Cargo","A340","A340","A340","A320")
plane_types
## [1] "A320" "A330" "A340" "A380" "Cargo" "A340" "A340" "A340" "A320"
# let's convert them to factors through a process called coercion
plane_types <- as.factor(plane_types)
# notice the difference
plane_types
## [1] A320 A330 A340 A380 Cargo A340 A340 A340 A320
## Levels: A320 A330 A340 A380 Cargo
A factor object shows its levels, i.e. the distinct categories it contains. Did you notice the difference?
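Once the data is a factor, a few built-in functions summarise the categories. The sketch below rebuilds the same plane_types factor so it can run on its own, then uses levels(), nlevels(), and table(); type_counts is an illustrative name.

```r
plane_types <- factor(c("A320", "A330", "A340", "A380", "Cargo",
                        "A340", "A340", "A340", "A320"))

# The distinct categories
levels(plane_types)
## [1] "A320"  "A330"  "A340"  "A380"  "Cargo"

# How many categories there are
nlevels(plane_types)
## [1] 5

# How many observations fall into each category
type_counts <- table(plane_types)
type_counts
```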
So far, each object we created held a single value of one of the basic types, also referred to as atomic data types. To store multiple values in a single object we can use vectors or lists. We call the built-in R function c() to create a vector:
x <- c(1,2,7,9,10)
y <- c(10,12,20,23,25)
x
## [1] 1 2 7 9 10
y
## [1] 10 12 20 23 25
We can do some neat mathematics with vectors as shown below:
x <- c(1,2,7,9,10)
y <- c(10,12,20,23,25)
x+y
## [1] 11 14 27 32 35
x-y
## [1] -9 -10 -13 -14 -15
x*y
## [1] 10 24 140 207 250
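Arithmetic on vectors works element by element, and the same holds when a vector is combined with a single number or compared against one. A small sketch, reusing the values of x from above with illustrative result names:

```r
x <- c(1, 2, 7, 9, 10)

# A single number is applied to every element
doubled <- x * 2
doubled
## [1]  2  4 14 18 20

# Summary functions reduce the vector to one value
total <- sum(x)
average <- mean(x)
total
## [1] 29
average
## [1] 5.8

# A comparison returns one logical value per element
above_five <- x > 5
above_five
## [1] FALSE FALSE  TRUE  TRUE  TRUE
```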
While a vector is a collection of items of the same data type, a list holds a collection of items of diverse data types. We use the function list() so that each item keeps its own type:
x <- list(1, 4.32, "Car", TRUE)
x
## [[1]]
## [1] 1
##
## [[2]]
## [1] 4.32
##
## [[3]]
## [1] "Car"
##
## [[4]]
## [1] TRUE
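One thing to watch for: c() converts all of its elements to a single common type, whereas list() keeps each element's own type. The sketch below compares the two; mixed_vector and mixed_list are illustrative names.

```r
# c() coerces everything to the most flexible type present (here: character)
mixed_vector <- c(1, 4.32, "Car", TRUE)
mixed_vector
## [1] "1"    "4.32" "Car"  "TRUE"
class(mixed_vector)
## [1] "character"

# list() keeps the type of each element
mixed_list <- list(1, 4.32, "Car", TRUE)
element_classes <- sapply(mixed_list, class)
element_classes
## [1] "numeric"   "numeric"   "character" "logical"

# Double square brackets extract a single element from a list
mixed_list[[3]]
## [1] "Car"
```

This is why wrapping c() in as.list() does not recover the original types: the coercion has already happened inside c().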
It's very convenient to store multiple elements in a list or vector. Each element's position is indexed \(1, 2, 3, \ldots, N\), where \(N\) is the last position; note that indexing starts at 1, not 0. We can access the elements by position:
x <- c(1:30)
x
## [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
## [26] 26 27 28 29 30
# Accessing the element stored at the 19th position
x[19]
## [1] 19
# length() returns the number of elements in the vector
length(x)
## [1] 30
# Getting the elements from position 5 through position 10
x[5:10]
## [1] 5 6 7 8 9 10
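Positions do not have to be a single number or a continuous range. As a brief sketch of other common patterns (the result names are illustrative): a vector of positions picks several elements, negative positions drop elements, and a logical condition keeps only the elements that satisfy it.

```r
x <- 1:30

# Several specific positions at once
picked <- x[c(2, 4, 6)]
picked
## [1] 2 4 6

# Negative positions drop elements: everything except the first 25
remaining <- x[-(1:25)]
remaining
## [1] 26 27 28 29 30

# A logical condition keeps the elements for which it is TRUE
large <- x[x > 27]
large
## [1] 28 29 30

# head() and tail() return the first and last few elements
head(x, 3)
## [1] 1 2 3
tail(x, 3)
## [1] 28 29 30
```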
A matrix is a two-dimensional vector that lets us do some fun math. We use the matrix function to create one:
m <- matrix(1:9, nrow = 3, ncol = 3)
m
##      [,1] [,2] [,3]
## [1,]    1    4    7
## [2,]    2    5    8
## [3,]    3    6    9
# another example of transforming a vector into a matrix
m <- matrix(1:30, nrow = 6, ncol = 5)
m
##      [,1] [,2] [,3] [,4] [,5]
## [1,]    1    7   13   19   25
## [2,]    2    8   14   20   26
## [3,]    3    9   15   21   27
## [4,]    4   10   16   22   28
## [5,]    5   11   17   23   29
## [6,]    6   12   18   24   30
# Getting the first row
m[1,]
## [1] 1 7 13 19 25
# Getting the second column
m[,2]
## [1] 7 8 9 10 11 12
# specific element
m[4,3]
## [1] 16
# element-wise multiplication (each element times itself), not matrix multiplication
m*m
##      [,1] [,2] [,3] [,4] [,5]
## [1,]    1   49  169  361  625
## [2,]    4   64  196  400  676
## [3,]    9   81  225  441  729
## [4,]   16  100  256  484  784
## [5,]   25  121  289  529  841
## [6,]   36  144  324  576  900
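Two points about matrices are worth seeing side by side: matrix() fills values column by column unless byrow = TRUE is given, and the * operator multiplies element by element, while %*% performs true matrix multiplication. The sketch below uses a small 3 by 3 matrix with illustrative names (sq, by_row, and so on).

```r
sq <- matrix(1:9, nrow = 3, ncol = 3)    # filled column by column
dim(sq)
## [1] 3 3

# byrow = TRUE fills row by row instead, which equals the transpose of sq
by_row <- matrix(1:9, nrow = 3, ncol = 3, byrow = TRUE)
transposed <- t(sq)
all(by_row == transposed)
## [1] TRUE

# Element-wise product versus matrix product
elementwise <- sq * sq
matrix_product <- sq %*% sq
elementwise[1, 1]
## [1] 1
matrix_product[1, 1]
## [1] 30
```

Matrix multiplication requires the number of columns of the first matrix to equal the number of rows of the second, which is why the 6 by 5 matrix m cannot be multiplied by itself with %*%.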
A data frame is probably the most useful way to store data when doing data analysis. It is similar to a spreadsheet, or to a pandas DataFrame (if you are familiar with Python). Elements of a data frame can be accessed using square brackets or using $. Let's look at some examples:
x <- 1:20
y <- 21:40
df <- data.frame(x, y)
df
##     x  y
## 1   1 21
## 2   2 22
## 3   3 23
## 4   4 24
## 5   5 25
## 6   6 26
## 7   7 27
## 8   8 28
## 9   9 29
## 10 10 30
## 11 11 31
## 12 12 32
## 13 13 33
## 14 14 34
## 15 15 35
## 16 16 36
## 17 17 37
## 18 18 38
## 19 19 39
## 20 20 40
# this takes the element in the first row and first column
df[1,1]
## [1] 1
# the $ operator returns a whole column by name
df$x
## [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
df$y
## [1] 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40
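Beyond picking out single elements and columns, square brackets and $ can also filter rows and add columns. The sketch below builds its own small data frame (df_example, with made-up values) so it runs independently of the objects above.

```r
df_example <- data.frame(x = 1:5, y = c(10, 12, 20, 23, 25))

# Number of rows and columns
nrow(df_example)
## [1] 5
ncol(df_example)
## [1] 2

# Rows where y is greater than 15 (all columns are kept)
large_y <- df_example[df_example$y > 15, ]
large_y

# A column can also be addressed by name inside the square brackets
second_y <- df_example[2, "y"]
second_y
## [1] 12

# Assigning to a new name with $ adds a column
df_example$total <- df_example$x + df_example$y
df_example$total
## [1] 11 14 27 32 35
```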
Real-world data is often incomplete and has missing values. In R, a missing value is represented by NA, which is a logical constant, while the NULL object represents the absence of any value. For example:
# NULL has no length, so c() drops it, while each NA keeps its position
x <- c(3, NA, 5, NA, 44, NULL)
x
## [1]  3 NA  5 NA 44
We can use the functions is.na() and is.null() to check whether the elements of an object are NA or whether an object is NULL.
# checking for NA
is.na(x)
## [1] FALSE TRUE FALSE TRUE FALSE
# checking for NULL
x <- NULL
is.null(x)
## [1] TRUE
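Missing values matter most when summarising data, because a single NA makes many calculations return NA. As a short sketch (the object names are illustrative), the na.rm argument of functions such as mean() and sum() tells R to leave the missing values out, and length() shows the difference between NA and NULL.

```r
values <- c(3, NA, 5, NA, 44)

# Count the missing values
n_missing <- sum(is.na(values))
n_missing
## [1] 2

# Any NA makes the plain mean NA
mean(values)
## [1] NA

# na.rm = TRUE removes the NA values before computing
clean_mean <- mean(values, na.rm = TRUE)
clean_mean
## [1] 17.33333

# NA is a placeholder that occupies a position; NULL is empty
length(NA)
## [1] 1
length(NULL)
## [1] 0
```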
Coercion is the process of converting an object of one data type into another. It is generally done using built-in R functions.
# 0 is converted to FALSE and 1 (or any other non-zero number) to TRUE
x <- c(0,1)
as.logical(x)
## [1] FALSE TRUE
x <- 23
y <- 38
x+y
## [1] 61
# overwriting x
x <- as.character(x)
# overwriting y
y <- as.character(y)
# Can you add two character objects? Remove the # from the statement below to find out
#x+y
x
## [1] "23"
as.integer(x)
## [1] 23
# the character string "23" cannot be converted to logical, so the result is NA
as.logical(x)
## [1] NA
as.logical(0)
## [1] FALSE
# character vector (named categories, because cat is also the name of a built-in R function)
categories <- c("A","B","C","A","D","B","C")
categories
## [1] "A" "B" "C" "A" "D" "B" "C"
# coercing into factors
as.factor(categories)
## [1] A B C A D B C
## Levels: A B C D
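Coercion does not always succeed, and sometimes it succeeds in an unexpected way. The sketch below shows two cases with illustrative names: text that is not a number becomes NA (R also issues a warning, silenced here with suppressWarnings() only to keep the output tidy), and converting a factor straight to a number returns the internal level codes rather than the labels.

```r
# Text that does not look like a number becomes NA
raw_values <- c("12", "7.5", "abc")
converted <- suppressWarnings(as.numeric(raw_values))
converted
## [1] 12.0  7.5   NA

# A factor stores integer codes for its levels ("10" is level 1, "20" is level 2)
sizes <- factor(c("10", "20", "10"))
codes <- as.integer(sizes)
codes
## [1] 1 2 1

# Going through character first recovers the numbers in the labels
labels_as_numbers <- as.numeric(as.character(sizes))
labels_as_numbers
## [1] 10 20 10
```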
Introduction to R by Dr. Sarath Chandra Dantu
This course material is available under a Creative Commons BY-SA license (CC BY-SA) version 4.0