Unite

The unite() function pastes multiple character (string) columns together into a single column. This can be useful when combining several metadata columns into one for analysis, statistics, or plotting.

Tidyverse reference page

Dataset

For demonstration we’ll load the crop_and_soil_tbl data from the mgrtibbles package (hyperlink includes install instructions).

#Load package
library("mgrtibbles")
#crop_and_soil_tbl tibble for demonstration
mgrtibbles::crop_and_soil_tbl |>
    #Select all but the fourth column
    dplyr::select(-4)
# A tibble: 8,000 × 8
   Soil_type Crop_type Fertiliser Humidity Moisture Nitrogen Potassium
   <chr>     <chr>     <chr>         <dbl>    <dbl>    <dbl>     <dbl>
 1 Sandy     Maize     Urea             52       38       37         0
 2 Loamy     Sugarcane DAP              52       45       12         0
 3 Black     Cotton    14-35-14         65       62        7         9
 4 Red       Tobacco   28-28            62       34       22         0
 5 Clayey    Paddy     Urea             54       46       35         0
 6 Sandy     Barley    17-17-17         52       35       12        10
 7 Red       Cotton    20-20            50       64        9         0
 8 Loamy     Wheat     Urea             64       50       41         0
 9 Sandy     Millets   28-28            60       42       21         0
10 Black     Oil seeds 14-35-14         58       33        9         7
# ℹ 7,990 more rows
# ℹ 1 more variable: Phosphorous <dbl>

Unite columns

Unite two columns into one. By default, the columns being united are removed from the output.

The two arguments used below are:

  • The name of the new united column (Crop_Fertiliser).
  • The columns to unite (Crop_type:Fertiliser).
crop_and_soil_tbl |>
    #Select all but the fourth column
    dplyr::select(-4) |>
    #Unite the Crop and fertiliser columns
    tidyr::unite("Crop_Fertiliser", Crop_type:Fertiliser)
# A tibble: 8,000 × 7
   Soil_type Crop_Fertiliser    Humidity Moisture Nitrogen Potassium Phosphorous
   <chr>     <chr>                 <dbl>    <dbl>    <dbl>     <dbl>       <dbl>
 1 Sandy     Maize_Urea               52       38       37         0           0
 2 Loamy     Sugarcane_DAP            52       45       12         0          36
 3 Black     Cotton_14-35-14          65       62        7         9          30
 4 Red       Tobacco_28-28            62       34       22         0          20
 5 Clayey    Paddy_Urea               54       46       35         0           0
 6 Sandy     Barley_17-17-17          52       35       12        10          13
 7 Red       Cotton_20-20             50       64        9         0          10
 8 Loamy     Wheat_Urea               64       50       41         0           0
 9 Sandy     Millets_28-28            60       42       21         0          18
10 Black     Oil seeds_14-35-14       58       33        9         7          30
# ℹ 7,990 more rows
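
The default behaviour shown above can also be checked on a small hand-made tibble. The sketch below is only illustrative: toy_tbl and its values are hypothetical stand-ins for crop_and_soil_tbl, and it assumes the tibble and tidyr packages are installed. It suggests that the strings are joined with an underscore by default and that the united column takes the place of the columns it was built from.

```r
# Hypothetical toy tibble standing in for crop_and_soil_tbl
toy_tbl <- tibble::tibble(
    Soil_type = c("Sandy", "Loamy"),
    Crop_type = c("Maize", "Wheat"),
    Fertiliser = c("Urea", "DAP"),
    Humidity = c(52, 64)
)

# Unite Crop_type and Fertiliser with the default settings
united_tbl <- toy_tbl |>
    tidyr::unite("Crop_Fertiliser", Crop_type:Fertiliser)

# Values are joined with the default separator "_"
united_tbl$Crop_Fertiliser

# The original Crop_type and Fertiliser columns are no longer present
names(united_tbl)
```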

Further unite options

With tidyr::unite() you can:

  • Choose specific columns and their order in the united column with a character vector of the column names.
  • Choose the delimiter/separator for the strings in the united column (sep=).
  • Retain the original columns with the option remove=FALSE.
crop_and_soil_tbl |>
    #Select all but the fourth column
    dplyr::select(-4) |>
    #Unite columns
    tidyr::unite(
        #United column name
        "Crop.Soil.Fertiliser",
        #Columns to unite
        c("Crop_type","Soil_type","Fertiliser"), 
        #Separator for strings in united column
        sep=".", 
        #Do not remove the original columns to be united
        remove=FALSE
        )
# A tibble: 8,000 × 9
   Crop.Soil.Fertiliser     Soil_type Crop_type Fertiliser Humidity Moisture
   <chr>                    <chr>     <chr>     <chr>         <dbl>    <dbl>
 1 Maize.Sandy.Urea         Sandy     Maize     Urea             52       38
 2 Sugarcane.Loamy.DAP      Loamy     Sugarcane DAP              52       45
 3 Cotton.Black.14-35-14    Black     Cotton    14-35-14         65       62
 4 Tobacco.Red.28-28        Red       Tobacco   28-28            62       34
 5 Paddy.Clayey.Urea        Clayey    Paddy     Urea             54       46
 6 Barley.Sandy.17-17-17    Sandy     Barley    17-17-17         52       35
 7 Cotton.Red.20-20         Red       Cotton    20-20            50       64
 8 Wheat.Loamy.Urea         Loamy     Wheat     Urea             64       50
 9 Millets.Sandy.28-28      Sandy     Millets   28-28            60       42
10 Oil seeds.Black.14-35-14 Black     Oil seeds 14-35-14         58       33
# ℹ 7,990 more rows
# ℹ 3 more variables: Nitrogen <dbl>, Potassium <dbl>, Phosphorous <dbl>
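
As a quick check of these options, the same call can be run on a small hand-made tibble. This is a minimal sketch under stated assumptions: toy_tbl and its values are hypothetical, and the tibble and tidyr packages are assumed to be installed. It illustrates that the order of names in the character vector sets the order of the strings in the united column, and that remove=FALSE keeps the original columns.

```r
# Hypothetical toy tibble standing in for crop_and_soil_tbl
toy_tbl <- tibble::tibble(
    Soil_type = c("Sandy", "Loamy"),
    Crop_type = c("Maize", "Wheat"),
    Fertiliser = c("Urea", "DAP")
)

united_tbl <- toy_tbl |>
    tidyr::unite(
        #United column name
        "Crop.Soil.Fertiliser",
        #Columns to unite, in the order they should appear in the string
        c("Crop_type", "Soil_type", "Fertiliser"),
        #Separator for strings in united column
        sep = ".",
        #Keep the original columns
        remove = FALSE
    )

# Crop comes first in the string even though Soil_type is the first column
united_tbl$Crop.Soil.Fertiliser

# The three original columns are still present alongside the united column
names(united_tbl)
```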