Introduction

In this chapter we will learn what an object is in R. An object is similar to, but not the same as, a variable in other programming languages. A variable is used to store and access data. An object also has properties, known as attributes, that determine how the data it holds can be processed. In R, objects can both store and process data.

Let's start with some warm-up, calculator-style operations. Pay special attention to the operators (+, -, *, /, ^, **, %%, %/%) and what each one does:

Note: Any line that starts with a # is a comment in the R code and is not executed. Lines that start with ## show the output that R prints.

# Addition
9+2
## [1] 11
# Subtraction
9-2
## [1] 7
# Multiplication
39*2
## [1] 78
# Division
9/2
## [1] 4.5
# Modulus
9%%2
## [1] 1
# Integer division
9%/%2
## [1] 4
# Exponentiation (type 1)
9^2
## [1] 81
# Exponentiation (type 2)
9**2
## [1] 81
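
The relationship between integer division and modulus, and the order in which operators are evaluated, can be checked with a short sketch. The values 17 and 5 are arbitrary and chosen only for illustration:

```r
# Hypothetical values, chosen only for illustration
a <- 17
b <- 5
quotient  <- a %/% b   # integer division
remainder <- a %% b    # modulus
quotient
## [1] 3
remainder
## [1] 2
# The two parts rebuild the original number
quotient * b + remainder
## [1] 17
# ^ is evaluated before the unary minus and before *, so parentheses matter
-2^2
## [1] -4
(-2)^2
## [1] 4
2 * 3^2
## [1] 18
```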

How to create an object

Typing 9 and 2 over and over is tiring, and retyping numbers like 528.548 and 854.698 invites mistakes. To make things easier, we can store values in objects. Everything in R is an object, and its properties depend on the kind of task we assign to it. One such task is to store data. Each object has a name, either stores data or is a function, and has certain attributes. To create an object we use the following command structure.

# Three ways of creating an object 

# Example 1
object_name <- DATA

# or

# Example 2
DATA -> object_name 

# or

# Example 3, not the recommended usage
object_name = DATA

It is very important to note that the object name is case sensitive.

Now with an example:

a <- 2
# print what is stored in a
a 
## [1] 2
# another way of doing it
print(a)
## [1] 2

More examples on what can be done with objects:

a <- 7
b<-3
a
## [1] 7
b
## [1] 3
a*b
## [1] 21
a+b
## [1] 10
a/b
## [1] 2.333333
a**b
## [1] 343
a^b
## [1] 343
# change 'a' to upper case and see what happens
# A^b
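
Case sensitivity can also be checked without triggering an error. This is a small sketch using exists(), which reports whether an object with a given name has been created; the name total_cost is hypothetical:

```r
# total_cost is a hypothetical object name used only for this illustration
total_cost <- 42
exists("total_cost")
## [1] TRUE
# Same letters, different case: R treats this as a different, undefined name
exists("Total_Cost")
## [1] FALSE
```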

Object types

There are eight main object types in R, excluding functions, which we will deal with separately. I will show you examples of each object type before explaining why there is such diversity. Let's dive into each object type.


Numeric

There are two types of numbers that we work with: integers (\(-x, 0, +x\)) and floating-point numbers, also referred to as doubles, e.g. 3.22.

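# whole number (R stores it as a double unless it is written as 22L)
x <- 22
# double
y <- 3.141

# Let's convert y to an integer (the decimal part is dropped, so y becomes 3)
y <- as.integer(y)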
x+y
## [1] 25
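
One detail worth checking for yourself is how R actually stores a number. The sketch below uses typeof() to report the storage type; the object names are only illustrative:

```r
x <- 22
typeof(x)          # a plain number is stored as a double
## [1] "double"
x_int <- 22L       # the L suffix asks for an integer
typeof(x_int)
## [1] "integer"
is.integer(x)
## [1] FALSE
# as.integer() drops the decimal part, it does not round
as.integer(3.99)
## [1] 3
round(3.99)
## [1] 4
```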

Character

This is how we deal with letters, words, and sentences. We use "" or '' to define them in R:

# Character
x <- "cars"
# Not a number anymore: the quotes make it a character
x <- "292"

As long as your RAM permits, you can load multiple sentences or the content of an entire book into one object if it satisfies two conditions: the text is surrounded by single or double quotes, and any quote of that same kind inside the text is escaped with a backslash.

# Quote from Lord of the Rings
y <- "There is only one Lord of the Ring, only one who can bend it to his will. And he does not share power."
# This assignment is not permitted. Check what happens when you remove the # from the statement below
#y <- "There is only one "Lord" of the Ring, only one who can bend it to his will. And he does not share power."
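
If a sentence needs quotation marks inside it, there are two common ways to keep the assignment valid. This is a minimal sketch with a shortened version of the sentence; the object names y1 and y2 are only illustrative:

```r
# Option 1: surround the text with the other kind of quote
y1 <- 'There is only one "Lord" of the Ring.'
# Option 2: escape the inner double quotes with a backslash
y2 <- "There is only one \"Lord\" of the Ring."
# Both assignments store exactly the same text
identical(y1, y2)
## [1] TRUE
# writeLines() shows the text as it is stored, without the escape characters
writeLines(y2)
## There is only one "Lord" of the Ring.
```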

Logical

In programming we often need to know whether a condition has been met, and for such situations we use the logical (also referred to as boolean) data type. It has only two possible values, TRUE or FALSE, which can be abbreviated to T or F.

x_is <- TRUE
# or
x_is <- T
# or
y_is <- FALSE
# or
y_is <- F
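
Logical values usually come from comparisons rather than from typing TRUE or FALSE directly. The sketch below assumes a made-up temperature value to show this:

```r
# temperature is a hypothetical value used only for illustration
temperature <- 31
is_hot <- temperature > 30    # a comparison returns a logical value
is_hot
## [1] TRUE
class(is_hot)
## [1] "logical"
# & means "and", ! means "not"
is_hot & temperature < 40
## [1] TRUE
!is_hot
## [1] FALSE
# In arithmetic, TRUE counts as 1 and FALSE as 0
sum(c(TRUE, FALSE, TRUE))
## [1] 2
```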

Factor

If you want to divide your data set into categories, factors will help you do that. Some examples of factors are days of the week, months, and types of car (SUV, CRV, Sedan). An example is always better:

# in the statement below, we are just defining a vector of characters/strings
plane_types <- c("A320","A330","A340","A380","Cargo","A340","A340","A340","A320")
plane_types
## [1] "A320"  "A330"  "A340"  "A380"  "Cargo" "A340"  "A340"  "A340"  "A320"
# let's convert them to factors through a process called coercion
plane_types <- as.factor(plane_types)
# notice the difference
plane_types
## [1] A320  A330  A340  A380  Cargo A340  A340  A340  A320 
## Levels: A320 A330 A340 A380 Cargo

A factor object shows you its levels, i.e. the categories in the object. Did you notice the difference?
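
A few helper functions make the levels easier to inspect. This is a small sketch that rebuilds the same plane_types factor so that it can be run on its own:

```r
plane_types <- factor(c("A320","A330","A340","A380","Cargo","A340","A340","A340","A320"))
# the distinct categories
levels(plane_types)
## [1] "A320"  "A330"  "A340"  "A380"  "Cargo"
# how many categories there are
nlevels(plane_types)
## [1] 5
# internally each value is stored as the position of its level
as.integer(plane_types)
## [1] 1 2 3 4 5 3 3 3 1
# table() counts how often each category occurs; here we pick out one count
table(plane_types)[["A340"]]
## [1] 4
```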


Vectors and Lists

So far we have mostly worked with objects that store a single value of one of the basic (atomic) data types. To store multiple values in a single object we can use vectors or lists. We call the built-in R function c() to create a vector:

x <- c(1,2,7,9,10)
y <- c(10,12,20,23,25)
x
## [1]  1  2  7  9 10
y
## [1] 10 12 20 23 25

We can do some neat mathematics with vectors as shown below:

x <- c(1,2,7,9,10)
y <- c(10,12,20,23,25)
x+y
## [1] 11 14 27 32 35
x-y
## [1]  -9 -10 -13 -14 -15
x*y
## [1]  10  24 140 207 250
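
The same element-by-element behaviour applies when a vector is combined with a single number or compared against one. A short sketch, reusing the values of x from above:

```r
x <- c(1, 2, 7, 9, 10)
# a single number is applied to every element
x * 2
## [1]  2  4 14 18 20
# summary functions reduce the vector to one value
sum(x)
## [1] 29
mean(x)
## [1] 5.8
# comparisons are also done element by element
x > 5
## [1] FALSE FALSE  TRUE  TRUE  TRUE
```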

While a vector is a collection of items of a single data type, a list holds a collection of items of diverse data types. Note that c() would coerce mixed items to one common type, so we build the list with list() instead.

x <- list(1,4.32,"Car",T)
x
## [[1]]
## [1] 1
## 
## [[2]]
## [1] 4.32
## 
## [[3]]
## [1] "Car"
## 
## [[4]]
## [1] TRUE
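
List items can also be given names, which makes them easier to retrieve. The sketch below uses hypothetical item names (name, price, in_stock) and shows that each item keeps its own data type:

```r
# name, price and in_stock are hypothetical item names
item <- list(name = "Car", price = 4.32, in_stock = TRUE)
class(item$name)
## [1] "character"
class(item$price)
## [1] "numeric"
class(item$in_stock)
## [1] "logical"
# [[ ]] extracts a single item, which can then be used in a calculation
item[["price"]] * 2
## [1] 8.64
length(item)
## [1] 3
```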

Accessing elements in a list/vector

It's very convenient to store multiple elements in a list/vector. Each element’s position in the list/vector is indexed as \(1, 2, 3, \ldots, N\), where \(N\) is the last position. We can access elements by their position:

x <- c(1:30)
x
##  [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
## [26] 26 27 28 29 30
# Accessing the element stored at the 19th position
x[19]
## [1] 19
# We can get the length, i.e. the number of elements in the vector, using the function length()
length(x)
## [1] 30
# getting elements from point A ..... B
x[5:10]
## [1]  5  6  7  8  9 10
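
Square brackets accept more than a single position or a range. A brief sketch of some other common forms, using the same vector of 1 to 30:

```r
x <- 1:30
# several separate positions at once
x[c(1, 10, 30)]
## [1]  1 10 30
# a negative index drops those positions instead of selecting them
x[-(1:25)]
## [1] 26 27 28 29 30
# a logical condition keeps only the elements where it is TRUE
x[x %% 10 == 0]
## [1] 10 20 30
# asking for a position beyond the last element returns NA
x[31]
## [1] NA
```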

Matrix

A matrix is a 2D vector that allows us to do some fun math. We use the matrix() function to create a matrix:

m <- matrix(1:9, nrow = 3, ncol = 3)
m
##      [,1] [,2] [,3]
## [1,]    1    4    7
## [2,]    2    5    8
## [3,]    3    6    9
# another way of transforming a vector into a matrix
m <- matrix(1:30, nrow = 6, ncol = 5)
m
##      [,1] [,2] [,3] [,4] [,5]
## [1,]    1    7   13   19   25
## [2,]    2    8   14   20   26
## [3,]    3    9   15   21   27
## [4,]    4   10   16   22   28
## [5,]    5   11   17   23   29
## [6,]    6   12   18   24   30
# Getting the first row
m[1,]
## [1]  1  7 13 19 25
# Getting the second column
m[,2]
## [1]  7  8  9 10 11 12
# specific element
m[4,3]
## [1] 16
# element-wise multiplication: each element is multiplied by itself
m*m
##      [,1] [,2] [,3] [,4] [,5]
## [1,]    1   49  169  361  625
## [2,]    4   64  196  400  676
## [3,]    9   81  225  441  729
## [4,]   16  100  256  484  784
## [5,]   25  121  289  529  841
## [6,]   36  144  324  576  900
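
Note that * multiplies matrices element by element. For the matrix product from linear algebra, R provides the %*% operator. A small sketch with a 2 x 3 matrix, chosen only to keep the output short:

```r
m <- matrix(1:6, nrow = 2, ncol = 3)
# number of rows and columns
dim(m)
## [1] 2 3
# fill row by row instead of column by column
matrix(1:6, nrow = 2, byrow = TRUE)
##      [,1] [,2] [,3]
## [1,]    1    2    3
## [2,]    4    5    6
# matrix product of m (2 x 3) with its transpose t(m) (3 x 2)
p <- m %*% t(m)
p
##      [,1] [,2]
## [1,]   35   44
## [2,]   44   56
```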

Data frame

A data frame is probably the most useful way to store data when doing data analysis. It is similar to a spreadsheet or a pandas DataFrame (if you are familiar with Python). Elements of a data frame can be accessed by using square brackets or $. Let's look at some examples:

x <- 1:20
y <- 21:40
df<-as.data.frame(cbind(x,y))
df
##     x  y
## 1   1 21
## 2   2 22
## 3   3 23
## 4   4 24
## 5   5 25
## 6   6 26
## 7   7 27
## 8   8 28
## 9   9 29
## 10 10 30
## 11 11 31
## 12 12 32
## 13 13 33
## 14 14 34
## 15 15 35
## 16 16 36
## 17 17 37
## 18 18 38
## 19 19 39
## 20 20 40
#this takes the element in the first row and first column
df[1,1]
## [1] 1
df$x
##  [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
df$y
##  [1] 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40
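
Unlike a matrix, a data frame can hold a different data type in each column. The sketch below builds one directly with data.frame(); the column names and the seat numbers are made up for illustration:

```r
# flights, its column names and its values are hypothetical
flights <- data.frame(
  plane = c("A320", "A330", "A340"),
  seats = c(180, 277, 295),
  cargo = c(FALSE, FALSE, TRUE),
  stringsAsFactors = FALSE      # keep plane as character in all R versions
)
nrow(flights)
## [1] 3
ncol(flights)
## [1] 3
# row 2 of the seats column
flights[2, "seats"]
## [1] 277
# planes with more than 200 seats
flights$plane[flights$seats > 200]
## [1] "A330" "A340"
```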

Missing values (NA/NULL)

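The data provided to us is not always complete and may have missing values. In R, a missing value is represented by NA, which is a logical constant, while NULL represents an empty object. Note in the output that NULL is dropped when it is placed inside a vector. For example: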

x <-c(3,NA,5,NA,44,NULL)
x
## [1]  3 NA  5 NA 44

We can use the functions is.na() and is.null() to check which elements are NA and whether an object is NULL.

# checking for NA
is.na(x)
## [1] FALSE  TRUE FALSE  TRUE FALSE
# checking for NULL
x <- NULL
is.null(x)
## [1] TRUE
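
Missing values also affect calculations. A short sketch of how NA propagates and how the na.rm argument of sum() and mean() skips it:

```r
x <- c(3, NA, 5, NA, 44)
# any calculation that touches an NA returns NA
sum(x)
## [1] NA
# na.rm = TRUE removes the missing values first
sum(x, na.rm = TRUE)
## [1] 52
mean(x, na.rm = TRUE)
## [1] 17.33333
# counting the missing values
sum(is.na(x))
## [1] 2
# NULL has no elements at all
length(NULL)
## [1] 0
```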

Coercion

Coercion is the process of converting an object of one data type into another. It is generally done using built-in R functions.

# 0 is coerced to FALSE and 1 to TRUE
x <- c(0,1)
as.logical(x)
## [1] FALSE  TRUE
x <- 23
y <- 38
x+y
## [1] 61
# overwriting x
x <- as.character(x)
# overwriting y
y <- as.character(y)
# Can you add two characters? Remove the # from the statement below and try
#x+y
x
## [1] "23"
as.integer(x)
## [1] 23
# the character "23" cannot be converted to logical, so the result is NA
as.logical(x)
## [1] NA
as.logical(0)
## [1] FALSE
# character vector (named categories to avoid reusing the name of the built-in cat() function)
categories <- c("A","B","C","A","D","B","C")
categories
## [1] "A" "B" "C" "A" "D" "B" "C"
# coercing into factors
as.factor(categories)
## [1] A B C A D B C
## Levels: A B C D
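
Coercion also happens without being asked for when different data types are combined in one vector. A minimal sketch of this implicit coercion, and of what happens when a conversion is not possible:

```r
# c() converts everything to the most flexible type present, here character
mixed <- c(1, "Car", TRUE)
class(mixed)
## [1] "character"
mixed
## [1] "1"    "Car"  "TRUE"
# with only numbers and logicals, the logicals become 1 and 0
c(1, TRUE, FALSE)
## [1] 1 1 0
# text that looks like a number converts cleanly
as.numeric("3.14") + 1
## [1] 4.14
# text that does not look like a number becomes NA (R also raises a warning,
# which is silenced here with suppressWarnings())
suppressWarnings(as.numeric("abc"))
## [1] NA
```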
 

Introduction to R by Dr. Sarath Chandra Dantu

This course material is available under a Creative Commons BY-SA license (CC BY-SA) version 4.0