Reading and Writing Data
Examples
How do I read a csv file called grades.csv into a data.frame?
A single dot (.) means the current working directory. So, if we were in /home/john/projects, that would be our current working directory, and ./grades.csv would be the same as /home/john/projects/grades.csv. In this case, ./grades.csv is the relative path and /home/john/projects/grades.csv is the absolute path.
dat <- read.csv("./grades.csv")
head(dat)
  grade      year
1   100    junior
2    99 sophomore
3    75 sophomore
4    74 sophomore
5    44    senior
6    69    junior
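To make the relative-path idea concrete, here is a minimal sketch. It assumes a throwaway directory created with tempfile() and a two-row grades.csv written only for illustration; neither is part of the example data above.

```r
# Hypothetical setup: a temporary directory containing a small grades.csv
tmp_dir <- tempfile("grades_demo")
dir.create(tmp_dir)
old_wd <- setwd(tmp_dir)  # setwd() returns the previous working directory
write.csv(data.frame(grade = c(100, 99), year = c("junior", "sophomore")),
          "grades.csv", row.names = FALSE)

relative_path <- "./grades.csv"                    # relative to getwd()
absolute_path <- file.path(getwd(), "grades.csv")  # full path to the same file

# Both paths resolve to the same file on disk
same_file <- normalizePath(relative_path) == normalizePath(absolute_path)

dat_rel <- read.csv(relative_path)
dat_abs <- read.csv(absolute_path)

setwd(old_wd)  # restore the original working directory
```

If read.csv cannot find a file given by a relative path, calling getwd() is usually the quickest way to check which directory R is resolving that path against.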
How do I read a csv file called grades.csv into a data.frame using the function fread?
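Note: The fread function is part of the data.table package. It typically reads large files much faster than read.csv, so you are strongly encouraged to use it for large datasets in R. fread returns a data.table by default, which is why the example below wraps the call in data.frame().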
library(data.table)
dat <- data.frame(fread("./grades.csv"))
head(dat)
  grade      year
1   100    junior
2    99 sophomore
3    75 sophomore
4    74 sophomore
5    44    senior
6    69    junior
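As a small sketch of what fread returns, the following assumes the data.table package is installed and uses a temporary two-row file written only for illustration. It shows an alternative to wrapping the call in data.frame(): the data.table argument of fread.

```r
library(data.table)  # assumes the data.table package is installed

# Hypothetical sample file, created only for this sketch
csv_path <- tempfile(fileext = ".csv")
writeLines(c("grade,year", "100,junior", "99,sophomore"), csv_path)

dt <- fread(csv_path)                      # a data.table (which is also a data.frame)
df <- fread(csv_path, data.table = FALSE)  # a plain data.frame

class(dt)
class(df)
```

Either approach gives a data.frame in the end; passing data.table = FALSE simply skips the conversion step.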
How do I read a csv file called grades_semi.csv where the data is separated by semicolons (;) into a data.frame?
CSV stands for comma-separated values. So, read.csv has comma (,) as the default value for the sep parameter. To read a file that uses a different separator, pass that separator as sep.
dat <- read.csv("./grades_semi.csv", sep=";")
head(dat)
  grade      year
1   100    junior
2    99 sophomore
3    75 sophomore
4    74 sophomore
5    44    senior
6    69    junior
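The sketch below illustrates what happens when sep is left at its default for a semicolon-separated file. The file is a hypothetical two-row sample written to a temporary path. It also shows read.csv2, a base R variant of read.csv whose defaults are sep = ";" and dec = ",".

```r
# Hypothetical semicolon-separated sample, created only for this sketch
semi_path <- tempfile(fileext = ".csv")
writeLines(c("grade;year", "100;junior", "99;sophomore"), semi_path)

wrong      <- read.csv(semi_path)             # default sep = ",": each line becomes one field
right      <- read.csv(semi_path, sep = ";")  # two columns: grade and year
also_right <- read.csv2(semi_path)            # defaults to sep = ";" and dec = ","

ncol(wrong)
ncol(right)
names(also_right)
```

Keep in mind that read.csv2 also treats the comma as the decimal mark, so it is only a drop-in replacement when the numbers in the file either have no decimals or use decimal commas.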
How do I prevent R from reading in strings as factors when using a function like read.csv?
In R 4.0 and later, strings are not read in as factors by default, so you don't need to worry about this.
For versions of R older than 4.0, set stringsAsFactors=FALSE.
dat <- read.csv("./grades.csv", stringsAsFactors=FALSE)
head(dat)
  grade      year
1   100    junior
2    99 sophomore
3    75 sophomore
4    74 sophomore
5    44    senior
6    69    junior
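Because head() prints factors and character columns the same way, the difference is easier to see by checking the column class. This is a minimal sketch using a hypothetical two-row file. Setting stringsAsFactors explicitly makes the result the same on any R version.

```r
# Hypothetical sample file, created only for this sketch
csv_path <- tempfile(fileext = ".csv")
writeLines(c("grade,year", "100,junior", "99,sophomore"), csv_path)

as_factor <- read.csv(csv_path, stringsAsFactors = TRUE)
as_string <- read.csv(csv_path, stringsAsFactors = FALSE)

class(as_factor$year)  # factor
class(as_string$year)  # character
```

Spelling out FALSE rather than F is a little safer, since F is an ordinary variable that can be reassigned while FALSE is a reserved word.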
How do I specify the type of one or more columns when reading in a csv file?
Use the colClasses argument, a named vector that maps column names to types.
dat <- read.csv("./grades.csv", colClasses=c("grade"="character", "year"="factor"))
str(dat)
'data.frame': 10 obs. of 2 variables:
$ grade: chr "100" "99" "75" "74" ...
$ year : Factor w/ 4 levels "freshman","junior",..: 2 4 4 4 3 2 2 3 1 2
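When colClasses is a named vector, it does not have to cover every column; columns left out are still guessed by read.csv. The sketch below assumes a hypothetical two-row file in which one grade is written with a leading zero, purely to show what forcing the character type preserves.

```r
# Hypothetical sample file; the value "099" is made up for illustration
csv_path <- tempfile(fileext = ".csv")
writeLines(c("grade,year", "100,junior", "099,sophomore"), csv_path)

guessed <- read.csv(csv_path)                                     # types are inferred
typed   <- read.csv(csv_path, colClasses = c(grade = "character"))  # only grade is forced

class(guessed$grade)  # integer, so the leading zero is dropped
typed$grade           # character, so "099" is kept as written
```

Forcing a column to character is a common choice for identifiers such as zip codes or student IDs, where the digits are labels rather than quantities.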
Given a list of csv files with the same columns, how can I read in and combine them into one data.frame?
# We want to read in grades.csv, grades2.csv, and grades3.csv
# into a single dataframe.
list_of_files <- c("grades.csv", "grades2.csv", "grades3.csv")
results <- data.frame()
for (file in list_of_files) {
  dat <- read.csv(file)
  results <- rbind(results, dat)
}
dim(results)
[1] 32 2
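The loop above works, but it copies the growing data.frame on every iteration. An alternative sketch, not the method used above, reads all the files into a list with lapply and binds them once with do.call. The three files here are hypothetical two-row samples written to a temporary directory so the example runs on its own.

```r
# Hypothetical setup: three small csv files with the same columns
tmp_dir <- tempfile("grades_demo")
dir.create(tmp_dir)
list_of_files <- file.path(tmp_dir, c("grades.csv", "grades2.csv", "grades3.csv"))
for (i in seq_along(list_of_files)) {
  sample_dat <- data.frame(grade = c(90, 80) + i, year = c("junior", "senior"))
  write.csv(sample_dat, list_of_files[i], row.names = FALSE)
}

# Read every file into a list of data.frames, then bind the rows in one call
list_of_dats <- lapply(list_of_files, read.csv)
results <- do.call(rbind, list_of_dats)

dim(results)
```

As with the loop, this relies on every file having the same column names; rbind stops with an error if they differ.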