read_delim()

The readr::read_delim() function allows you to read in delimited rectangular data from a file to a tibble.

We will only cover readr::read_delim(), but the full list of read functions is linked below:

Tidyverse reference page

Delimiters

Although there are specific functions for some delimiters, such as readr::read_csv() and readr::read_tsv(), this page shows how to read in data with any delimiter using readr::read_delim() and its delim argument.
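As a minimal sketch of how the general function relates to a delimiter-specific one, the code below reads the same small file with readr::read_delim() and readr::read_csv() and compares the results. The file contents and the names csv_path, via_delim, and via_csv are made up for illustration and are not part of the workshop data.

```r
# Write a tiny comma-delimited file to a temporary location (hypothetical data).
csv_path <- tempfile(fileext = ".csv")
writeLines(c("id,common_name", "1,oak", "2,ash"), csv_path)

# The general function with an explicit delimiter ...
via_delim <- readr::read_delim(csv_path, delim = ",", show_col_types = FALSE)
# ... and the comma-specific function.
via_csv <- readr::read_csv(csv_path, show_col_types = FALSE)

# Compare column names, then compare the columns one by one.
identical(names(via_delim), names(via_csv))
all(mapply(identical, via_delim, via_csv))
```

Both comparisons should agree for a simple file like this one, which is why a single function with delim is enough for the rest of this page.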

The most common types of delimiters for rectangular data files are:

  • Commas (,): Files with commas as delimiters are known as CSV files (Comma Separated Values) and commonly end with the suffix .csv.
  • Tabs (\t): Files with tabs as delimiters are known as TSV files (Tab Separated Values) and commonly end with the suffix .tsv.
  • Spaces ( ): Spaces are used as delimiters for many files but are generally not recommended in data science.

Additionally, non-standard delimiters can be specified. The tilde (~), colon (:), semicolon (;), and pipe (|) are commonly used as non-standard delimiters.

CSV

We’ll read in the file all_plant_details.csv into R as a tibble.

CSV file contents

Before reading in the CSV (comma separated values) file, print its first five lines with the base R function readLines(). This shows the raw contents of the file, including the delimiters, which in this case are commas (,). This step is for demonstration only and can be skipped in your own analyses if you already know the delimiter of your file.

readLines("https://neof-workshops.github.io/Tidyverse/data/all_plant_details.csv", n=5)
[1] "id,common_name,seeds,drought_tolerant,salt_tolerant,thorny,invasive,tropical,indoor,flowers,cones,fruits,edible_fruit,leaf,edible_leaf,cuisine,medicinal,poisonous_to_humans,poisonous_to_pets,sunlight_part_sun_part_shade,sunlight_full_shade,sunlight_deep_shade,sunlight_part_shade,sunlight_full_sun_only_if_soil_kept_moist,sunlight_full_sun,sunlight_filtered_shade,care_level_encoded,maintenance_encoded,watering_encoded,growth_rate_encoded,cycle_perennial,cycle_herbaceous_perennial,cycle_annual"
[2] "425,flowering-maple,0,1,0,1,0,1,1,1,0,0,0,1,0,0,1,0,0,0,0,0,1,0,1,0,2,0,2,0,1,0,0"                                                                                                                                                                                                                                                                                                                                                                                                                              
[3] "426,flowering-maple,0,1,0,0,0,0,1,1,0,0,0,1,0,0,0,0,0,0,0,0,1,0,1,0,1,0,1,0,1,0,0"                                                                                                                                                                                                                                                                                                                                                                                                                              
[4] "427,flowering-maple,0,1,0,0,0,1,1,1,0,0,0,1,0,0,1,0,0,0,0,0,1,0,1,0,1,0,1,0,1,0,0"                                                                                                                                                                                                                                                                                                                                                                                                                              
[5] "428,flowering-maple,0,1,1,0,0,1,1,1,0,0,0,1,0,0,1,0,0,0,0,0,0,0,1,0,2,1,1,0,1,0,0"                                                                                                                                                                                                                                                                                                                                                                                                                              
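If you do not know the delimiter of a file, one rough approach is to count how often each candidate character appears in the header line. The sketch below does this with base R on a small made-up file; the file contents and the names path, header, candidates, and counts are hypothetical. It is only a heuristic, since values that contain a candidate character can skew the counts.

```r
# Hypothetical example file: pipe-delimited, written to a temporary location.
path <- tempfile(fileext = ".txt")
writeLines(c("id|common_name|seeds", "425|flowering-maple|0"), path)

# Read only the header line.
header <- readLines(path, n = 1)

# Candidate delimiters to check.
candidates <- c(comma = ",", tab = "\t", space = " ", pipe = "|")

# Count how often each candidate occurs in the header line.
counts <- vapply(
    candidates,
    function(d) lengths(regmatches(header, gregexpr(d, header, fixed = TRUE))),
    integer(1))
counts

# The most frequent candidate is a reasonable first guess for delim.
names(which.max(counts))
```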

Read in CSV file

Read in the CSV file with readr::read_delim() setting the delimiter to commas with delim = ",".

Note: You can ignore the output lines from “Column specification” down to the two lines beginning with ℹ. They are discussed in the column types page.

readr::read_delim(
    file = "https://neof-workshops.github.io/Tidyverse/data/all_plant_details.csv",
    delim = ",") |>
    # Keep only the first 5 rows of the tibble
    dplyr::slice(1:5)
Rows: 155 Columns: 33
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr  (1): common_name
dbl (32): id, seeds, drought_tolerant, salt_tolerant, thorny, invasive, trop...

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# A tibble: 5 × 33
     id common_name     seeds drought_tolerant salt_tolerant thorny invasive
  <dbl> <chr>           <dbl>            <dbl>         <dbl>  <dbl>    <dbl>
1   425 flowering-maple     0                1             0      1        0
2   426 flowering-maple     0                1             0      0        0
3   427 flowering-maple     0                1             0      0        0
4   428 flowering-maple     0                1             1      0        0
5   434 Jacob's coat        0                0             0      0        0
# ℹ 26 more variables: tropical <dbl>, indoor <dbl>, flowers <dbl>,
#   cones <dbl>, fruits <dbl>, edible_fruit <dbl>, leaf <dbl>,
#   edible_leaf <dbl>, cuisine <dbl>, medicinal <dbl>,
#   poisonous_to_humans <dbl>, poisonous_to_pets <dbl>,
#   sunlight_part_sun_part_shade <dbl>, sunlight_full_shade <dbl>,
#   sunlight_deep_shade <dbl>, sunlight_part_shade <dbl>,
#   sunlight_full_sun_only_if_soil_kept_moist <dbl>, sunlight_full_sun <dbl>, …

TSV

We’ll read in the file plant_detail_slice.tsv into R as a tibble. This tab delimited file was created with the readr::write_delim() function and only contains the header line plus the first 5 rows of the all_plant_details.csv file.
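For context, a tab delimited file like this could be produced along the following lines. This is only a sketch: the small plants tibble is a made-up stand-in for the real data, and tsv_path is a temporary file rather than the workshop file.

```r
# Hypothetical stand-in for the first rows of the plant data.
plants <- tibble::tibble(
    id = c(425, 426),
    common_name = c("flowering-maple", "flowering-maple"),
    seeds = c(0, 0))

# Write the tibble as a tab delimited file in a temporary location.
tsv_path <- tempfile(fileext = ".tsv")
readr::write_delim(plants, file = tsv_path, delim = "\t")

# Inspect the raw lines: fields are separated by \t.
tsv_lines <- readLines(tsv_path)
tsv_lines
```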

TSV file contents

View the file contents before reading it as a tibble.

If you were to open the file in a text editor it would most likely display the \t characters as tab spaces.

readLines("https://neof-workshops.github.io/Tidyverse/data/plant_detail_slice.tsv")
[1] "id\tcommon_name\tseeds\tdrought_tolerant\tsalt_tolerant\tthorny\tinvasive\ttropical\tindoor\tflowers\tcones\tfruits\tedible_fruit\tleaf\tedible_leaf\tcuisine\tmedicinal\tpoisonous_to_humans\tpoisonous_to_pets\tsunlight_part_sun_part_shade\tsunlight_full_shade\tsunlight_deep_shade\tsunlight_part_shade\tsunlight_full_sun_only_if_soil_kept_moist\tsunlight_full_sun\tsunlight_filtered_shade\tcare_level_encoded\tmaintenance_encoded\twatering_encoded\tgrowth_rate_encoded\tcycle_perennial\tcycle_herbaceous_perennial\tcycle_annual"
[2] "425\tflowering-maple\t0\t1\t0\t1\t0\t1\t1\t1\t0\t0\t0\t1\t0\t0\t1\t0\t0\t0\t0\t0\t1\t0\t1\t0\t2\t0\t2\t0\t1\t0\t0"                                                                                                                                                                                                                                                                                                                                                                                                                              
[3] "426\tflowering-maple\t0\t1\t0\t0\t0\t0\t1\t1\t0\t0\t0\t1\t0\t0\t0\t0\t0\t0\t0\t0\t1\t0\t1\t0\t1\t0\t1\t0\t1\t0\t0"                                                                                                                                                                                                                                                                                                                                                                                                                              
[4] "427\tflowering-maple\t0\t1\t0\t0\t0\t1\t1\t1\t0\t0\t0\t1\t0\t0\t1\t0\t0\t0\t0\t0\t1\t0\t1\t0\t1\t0\t1\t0\t1\t0\t0"                                                                                                                                                                                                                                                                                                                                                                                                                              
[5] "428\tflowering-maple\t0\t1\t1\t0\t0\t1\t1\t1\t0\t0\t0\t1\t0\t0\t1\t0\t0\t0\t0\t0\t0\t0\t1\t0\t2\t1\t1\t0\t1\t0\t0"                                                                                                                                                                                                                                                                                                                                                                                                                              
[6] "434\tJacob's coat\t0\t0\t0\t0\t0\t0\t1\t1\t0\t0\t0\t1\t0\t0\t0\t0\t0\t0\t0\t0\t1\t0\t1\t0\t2\t0\t2\t0\t1\t0\t0"                                                                                                                                                                                                                                                                                                                                                                                                                                 

Read in TSV file

Read in the TSV file with readr::read_delim() setting the delimiter to tabs with delim = "\t".

Additionally, we’ll specify the option show_col_types = FALSE to quiet the column types message (more info covered in the column types page).

readr::read_delim(
    file = "https://neof-workshops.github.io/Tidyverse/data/plant_detail_slice.tsv",
    delim = "\t", 
    show_col_types = FALSE)
# A tibble: 5 × 33
     id common_name     seeds drought_tolerant salt_tolerant thorny invasive
  <dbl> <chr>           <dbl>            <dbl>         <dbl>  <dbl>    <dbl>
1   425 flowering-maple     0                1             0      1        0
2   426 flowering-maple     0                1             0      0        0
3   427 flowering-maple     0                1             0      0        0
4   428 flowering-maple     0                1             1      0        0
5   434 Jacob's coat        0                0             0      0        0
# ℹ 26 more variables: tropical <dbl>, indoor <dbl>, flowers <dbl>,
#   cones <dbl>, fruits <dbl>, edible_fruit <dbl>, leaf <dbl>,
#   edible_leaf <dbl>, cuisine <dbl>, medicinal <dbl>,
#   poisonous_to_humans <dbl>, poisonous_to_pets <dbl>,
#   sunlight_part_sun_part_shade <dbl>, sunlight_full_shade <dbl>,
#   sunlight_deep_shade <dbl>, sunlight_part_shade <dbl>,
#   sunlight_full_sun_only_if_soil_kept_moist <dbl>, sunlight_full_sun <dbl>, …

Space

We’ll read in the file plant_detail_slice.txt into R as a tibble. This space delimited file was created with the readr::write_delim() function and only contains the header line plus the first 5 rows of the all_plant_details.csv file.

Space delimited file contents

View the file contents before reading it as a tibble.

Notice that the 6th line contains \"Jacob's coat\". Because this value contains a space, it is wrapped in double quotes to indicate that Jacob’s coat is a single field. readLines() displays these double quotes escaped as \".

readLines("https://neof-workshops.github.io/Tidyverse/data/plant_detail_slice.txt")
[1] "id common_name seeds drought_tolerant salt_tolerant thorny invasive tropical indoor flowers cones fruits edible_fruit leaf edible_leaf cuisine medicinal poisonous_to_humans poisonous_to_pets sunlight_part_sun_part_shade sunlight_full_shade sunlight_deep_shade sunlight_part_shade sunlight_full_sun_only_if_soil_kept_moist sunlight_full_sun sunlight_filtered_shade care_level_encoded maintenance_encoded watering_encoded growth_rate_encoded cycle_perennial cycle_herbaceous_perennial cycle_annual"
[2] "425 flowering-maple 0 1 0 1 0 1 1 1 0 0 0 1 0 0 1 0 0 0 0 0 1 0 1 0 2 0 2 0 1 0 0"                                                                                                                                                                                                                                                                                                                                                                                                                              
[3] "426 flowering-maple 0 1 0 0 0 0 1 1 0 0 0 1 0 0 0 0 0 0 0 0 1 0 1 0 1 0 1 0 1 0 0"                                                                                                                                                                                                                                                                                                                                                                                                                              
[4] "427 flowering-maple 0 1 0 0 0 1 1 1 0 0 0 1 0 0 1 0 0 0 0 0 1 0 1 0 1 0 1 0 1 0 0"                                                                                                                                                                                                                                                                                                                                                                                                                              
[5] "428 flowering-maple 0 1 1 0 0 1 1 1 0 0 0 1 0 0 1 0 0 0 0 0 0 0 1 0 2 1 1 0 1 0 0"                                                                                                                                                                                                                                                                                                                                                                                                                              
[6] "434 \"Jacob's coat\" 0 0 0 0 0 0 1 1 0 0 0 1 0 0 0 0 0 0 0 0 1 0 1 0 2 0 2 0 1 0 0"                                                                                                                                                                                                                                                                                                                                                                                                                             
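The quoting shown above can be reproduced on a small scale. The sketch below writes a made-up two-row tibble with a space delimiter, prints the raw lines, and reads the file back; the names plants, txt_path, space_lines, and round_trip are hypothetical. It relies on the default quoting behaviour of readr::write_delim(), which may differ between readr versions.

```r
# Hypothetical data: one value contains a space.
plants <- tibble::tibble(
    id = c(428, 434),
    common_name = c("flowering-maple", "Jacob's coat"))

# Write with a space delimiter; the value containing a space is quoted.
txt_path <- tempfile(fileext = ".txt")
readr::write_delim(plants, file = txt_path, delim = " ")
space_lines <- readLines(txt_path)
space_lines

# Read the file back: the quotes are removed and the field stays in one piece.
round_trip <- readr::read_delim(txt_path, delim = " ", show_col_types = FALSE)
round_trip$common_name
```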

Read in space delimited file

Read in the space delimited file with readr::read_delim() setting the delimiter to spaces with delim = " ".

Additionally, we’ll specify the option show_col_types = FALSE to quiet the column types message (more info covered in the column types page).

readr::read_delim(
    file = "https://neof-workshops.github.io/Tidyverse/data/plant_detail_slice.txt",
    delim = " ", 
    show_col_types = FALSE)
# A tibble: 5 × 33
     id common_name     seeds drought_tolerant salt_tolerant thorny invasive
  <dbl> <chr>           <dbl>            <dbl>         <dbl>  <dbl>    <dbl>
1   425 flowering-maple     0                1             0      1        0
2   426 flowering-maple     0                1             0      0        0
3   427 flowering-maple     0                1             0      0        0
4   428 flowering-maple     0                1             1      0        0
5   434 Jacob's coat        0                0             0      0        0
# ℹ 26 more variables: tropical <dbl>, indoor <dbl>, flowers <dbl>,
#   cones <dbl>, fruits <dbl>, edible_fruit <dbl>, leaf <dbl>,
#   edible_leaf <dbl>, cuisine <dbl>, medicinal <dbl>,
#   poisonous_to_humans <dbl>, poisonous_to_pets <dbl>,
#   sunlight_part_sun_part_shade <dbl>, sunlight_full_shade <dbl>,
#   sunlight_deep_shade <dbl>, sunlight_part_shade <dbl>,
#   sunlight_full_sun_only_if_soil_kept_moist <dbl>, sunlight_full_sun <dbl>, …

Non-standard delimiter

You can use many other characters as delimiters when reading files. This can be useful if values in the data contain the three common delimiters (comma, tab, and space).

The most common non-standard delimiters are:

  • Tilde (~)
  • Colon (:)
  • Semi-colon (;)
  • Pipe (|)
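To illustrate why a non-standard delimiter can help, the sketch below reads a small made-up file in which one value contains both a comma and a space. Because the fields are separated by pipes, that value needs no quoting. The file contents and the names pipe_path and pipe_data are hypothetical.

```r
# Hypothetical pipe-delimited file; the second field of row 1 contains ", ".
pipe_path <- tempfile(fileext = ".text")
writeLines(
    c("id|common_name|seeds",
      "1|oak, English|0",
      "2|ash|1"),
    pipe_path)

# Read with the pipe as the delimiter.
pipe_data <- readr::read_delim(pipe_path, delim = "|", show_col_types = FALSE)

# The comma and space stay inside a single field.
pipe_data$common_name
```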

We’ll read in the file plant_detail_slice.pipe_delimit.text into R as a tibble. This pipe (|) delimited file was created with the readr::write_delim() function and only contains the header line plus the first 5 rows of the all_plant_details.csv file.

Non-standard delimited file contents

View the file contents before reading it as a tibble.

readLines("https://neof-workshops.github.io/Tidyverse/data/plant_detail_slice.pipe_delimit.text")
[1] "id|common_name|seeds|drought_tolerant|salt_tolerant|thorny|invasive|tropical|indoor|flowers|cones|fruits|edible_fruit|leaf|edible_leaf|cuisine|medicinal|poisonous_to_humans|poisonous_to_pets|sunlight_part_sun_part_shade|sunlight_full_shade|sunlight_deep_shade|sunlight_part_shade|sunlight_full_sun_only_if_soil_kept_moist|sunlight_full_sun|sunlight_filtered_shade|care_level_encoded|maintenance_encoded|watering_encoded|growth_rate_encoded|cycle_perennial|cycle_herbaceous_perennial|cycle_annual"
[2] "425|flowering-maple|0|1|0|1|0|1|1|1|0|0|0|1|0|0|1|0|0|0|0|0|1|0|1|0|2|0|2|0|1|0|0"                                                                                                                                                                                                                                                                                                                                                                                                                              
[3] "426|flowering-maple|0|1|0|0|0|0|1|1|0|0|0|1|0|0|0|0|0|0|0|0|1|0|1|0|1|0|1|0|1|0|0"                                                                                                                                                                                                                                                                                                                                                                                                                              
[4] "427|flowering-maple|0|1|0|0|0|1|1|1|0|0|0|1|0|0|1|0|0|0|0|0|1|0|1|0|1|0|1|0|1|0|0"                                                                                                                                                                                                                                                                                                                                                                                                                              
[5] "428|flowering-maple|0|1|1|0|0|1|1|1|0|0|0|1|0|0|1|0|0|0|0|0|0|0|1|0|2|1|1|0|1|0|0"                                                                                                                                                                                                                                                                                                                                                                                                                              
[6] "434|Jacob's coat|0|0|0|0|0|0|1|1|0|0|0|1|0|0|0|0|0|0|0|0|1|0|1|0|2|0|2|0|1|0|0"                                                                                                                                                                                                                                                                                                                                                                                                                                 

Read in non-standard delimited file

Read in the pipe (|) delimited file with readr::read_delim() setting the delimiter to pipes with delim = "|".

Additionally, we’ll specify the option show_col_types = FALSE to quiet the column types message (more info covered in the column types page).

readr::read_delim(
    file = "https://neof-workshops.github.io/Tidyverse/data/plant_detail_slice.pipe_delimit.text",
    delim = "|", 
    show_col_types = FALSE)
# A tibble: 5 × 33
     id common_name     seeds drought_tolerant salt_tolerant thorny invasive
  <dbl> <chr>           <dbl>            <dbl>         <dbl>  <dbl>    <dbl>
1   425 flowering-maple     0                1             0      1        0
2   426 flowering-maple     0                1             0      0        0
3   427 flowering-maple     0                1             0      0        0
4   428 flowering-maple     0                1             1      0        0
5   434 Jacob's coat        0                0             0      0        0
# ℹ 26 more variables: tropical <dbl>, indoor <dbl>, flowers <dbl>,
#   cones <dbl>, fruits <dbl>, edible_fruit <dbl>, leaf <dbl>,
#   edible_leaf <dbl>, cuisine <dbl>, medicinal <dbl>,
#   poisonous_to_humans <dbl>, poisonous_to_pets <dbl>,
#   sunlight_part_sun_part_shade <dbl>, sunlight_full_shade <dbl>,
#   sunlight_deep_shade <dbl>, sunlight_part_shade <dbl>,
#   sunlight_full_sun_only_if_soil_kept_moist <dbl>, sunlight_full_sun <dbl>, …