The function dplyr::select() allows you to select columns/variables from a tibble. There are many different ways to do this with various helper functions.
When you subset a tibble with dplyr::select(), the result is always a tibble, even if only one column is selected.
For demonstration we’ll load the mammal_sleep_tbl data from the mgrtibbles package (hyperlink includes install instructions). For easier viewing we’ll subset it so it only has 5 rows.
#Load package
library("mgrtibbles")
#mammal_sleep_tbl tibble for demonstration
mammal_sleep_tbl <- mgrtibbles::mammal_sleep_tbl |> dplyr::slice(1:5)
mammal_sleep_tbl
# A tibble: 5 × 11
species body_wt brain_wt non_dreaming dreaming total_sleep life_span gestation
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 Africa… 6654 5.71 NA NA 3.3 38.6 645
2 Africa… 1 0.0066 6.3 2 8.3 4.5 42
3 Arctic… 3.38 0.0445 NA NA 12.5 14 60
4 Arctic… 0.92 0.0057 NA NA 16.5 NA 25
5 Asiane… 2547 4.60 2.1 1.8 3.9 69 624
# ℹ 3 more variables: predation <fct>, exposure <fct>, danger <fct>
One column
Select one column
mammal_sleep_tbl |> dplyr::select(dreaming)
# A tibble: 5 × 1
dreaming
<dbl>
1 NA
2 2
3 NA
4 NA
5 1.8
Consecutive range of columns
Select a consecutive range of columns using the first and last column names of the range.
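A minimal sketch of range selection follows. So that it runs without the mgrtibbles package, it uses `sleep_demo`, a hypothetical stand-in tibble built by hand from a few columns of the printed output above (values as displayed, so possibly rounded).

```r
library(dplyr)

# Hypothetical stand-in for mammal_sleep_tbl: values copied from the
# printed output above (as displayed, so possibly rounded).
sleep_demo <- tibble::tibble(
  species      = c("Africanelephant", "Africangiantpouchedrat", "ArcticFox",
                   "Arcticgroundsquirrel", "Asianelephant"),
  body_wt      = c(6654, 1, 3.38, 0.92, 2547),
  brain_wt     = c(5.71, 0.0066, 0.0445, 0.0057, 4.60),
  non_dreaming = c(NA, 6.3, NA, NA, 2.1),
  dreaming     = c(NA, 2, NA, NA, 1.8),
  total_sleep  = c(3.3, 8.3, 12.5, 16.5, 3.9)
)

# first:last keeps both endpoints and every column between them
range_tbl <- sleep_demo |> dplyr::select(body_wt:dreaming)
names(range_tbl)
# "body_wt" "brain_wt" "non_dreaming" "dreaming"
```

Given the column order shown in the printout, the same call on the demonstration data would be mammal_sleep_tbl |> dplyr::select(body_wt:dreaming).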
If you include variables/column names that do not exist in the tibble then an error will occur and the command will not run.
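The failure can be seen in a small sketch. `sleep_demo` is a hypothetical stand-in tibble with values copied from the printed output above, and `genus` is a column name that does not exist in it. The tryCatch() wrapper is only there so the example can report the error instead of stopping.

```r
library(dplyr)

# Hypothetical two-column stand-in; values copied from the printed output above.
sleep_demo <- tibble::tibble(
  species = c("Africanelephant", "Africangiantpouchedrat", "ArcticFox"),
  body_wt = c(6654, 1, 3.38)
)

# "genus" is not a column, so select() raises an error;
# tryCatch() captures the condition so the script can carry on.
result <- tryCatch(
  sleep_demo |> dplyr::select(species, genus),
  error = function(e) e
)

inherits(result, "error")  # TRUE: the selection failed
```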
Any of
The any_of() function takes a character vector of column names and selects those that exist, silently ignoring the rest. This means you can include variable/column names that aren’t in the tibble and the function will still run without an error.
#Vector of column names
vars <- c("genus", "species", "brain_wt", "total_sleep", "total_awake", "life_span")
#Select columns from vector
mammal_sleep_tbl |> dplyr::select(any_of(vars))
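To see which names survive, here is a sketch using the same vector of names on `sleep_demo`, a hypothetical stand-in tibble holding a few columns copied from the printed output above. Only the names that exist in the data come back.

```r
library(dplyr)

# Hypothetical stand-in; values copied from the printed output above.
sleep_demo <- tibble::tibble(
  species     = c("Africanelephant", "Africangiantpouchedrat", "ArcticFox"),
  body_wt     = c(6654, 1, 3.38),
  brain_wt    = c(5.71, 0.0066, 0.0445),
  total_sleep = c(3.3, 8.3, 12.5),
  life_span   = c(38.6, 4.5, 14)
)

# "genus" and "total_awake" are not columns of sleep_demo
vars <- c("genus", "species", "brain_wt", "total_sleep", "total_awake", "life_span")

# any_of() keeps the matches and drops the unknown names without an error
kept <- sleep_demo |> dplyr::select(any_of(vars))
names(kept)
# "species" "brain_wt" "total_sleep" "life_span"
```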
This is especially useful for removing variables: if you try to remove a variable that has already been removed, the function will still work. To unselect variables, place a - in front of the any_of() function.
#Vector of column names
vars <- c("genus", "species", "brain_wt", "total_sleep", "total_awake", "life_span")
#Unselect columns from vector
mammal_sleep_tbl |> dplyr::select(-any_of(vars))
# A tibble: 5 × 7
body_wt non_dreaming dreaming gestation predation exposure danger
<dbl> <dbl> <dbl> <dbl> <fct> <fct> <fct>
1 6654 NA NA 645 3 5 3
2 1 6.3 2 42 3 1 3
3 3.38 NA NA 60 1 1 1
4 0.92 NA NA 25 5 2 3
5 2547 2.1 1.8 624 3 5 4
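The point about repeated removal can be checked with a short sketch. `sleep_demo` and `drop_vars` are hypothetical names, and the values are copied from the printed output above. Running the same removal twice leaves the same columns.

```r
library(dplyr)

# Hypothetical stand-in; values copied from the printed output above.
sleep_demo <- tibble::tibble(
  species     = c("Africanelephant", "Africangiantpouchedrat", "ArcticFox"),
  body_wt     = c(6654, 1, 3.38),
  total_sleep = c(3.3, 8.3, 12.5)
)

# "genus" never existed in the data; any_of() simply ignores it
drop_vars <- c("genus", "species")

once <- sleep_demo |> dplyr::select(-any_of(drop_vars))
# Running the same removal again is harmless: nothing is left to match
twice <- once |> dplyr::select(-any_of(drop_vars))

names(once)
# "body_wt" "total_sleep"
identical(names(once), names(twice))
# TRUE
```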
Numeric indexes
Numeric indexes can be used for column selection.
Select the first column.
mammal_sleep_tbl |> dplyr::select(1)
# A tibble: 5 × 1
species
<chr>
1 Africanelephant
2 Africangiantpouchedrat
3 ArcticFox
4 Arcticgroundsquirrel
5 Asianelephant
Select columns 3 to 5.
mammal_sleep_tbl |> dplyr::select(3:5)
# A tibble: 5 × 3
brain_wt non_dreaming dreaming
<dbl> <dbl> <dbl>
1 5.71 NA NA
2 0.0066 6.3 2
3 0.0445 NA NA
4 0.0057 NA NA
5 4.60 2.1 1.8
Select columns 4, 7, and 2, in that order.
mammal_sleep_tbl |> dplyr::select(c(4,7,2))
# A tibble: 5 × 3
non_dreaming life_span body_wt
<dbl> <dbl> <dbl>
1 NA 38.6 6654
2 6.3 4.5 1
3 NA 14 3.38
4 NA NA 0.92
5 2.1 69 2547
Select all but the 6th column.
mammal_sleep_tbl |> dplyr::select(-6)
# A tibble: 5 × 10
species body_wt brain_wt non_dreaming dreaming life_span gestation predation
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <fct>
1 Africane… 6654 5.71 NA NA 38.6 645 3
2 Africang… 1 0.0066 6.3 2 4.5 42 3
3 ArcticFox 3.38 0.0445 NA NA 14 60 1
4 Arcticgr… 0.92 0.0057 NA NA NA 25 5
5 Asianele… 2547 4.60 2.1 1.8 69 624 3
# ℹ 2 more variables: exposure <fct>, danger <fct>
Last column
Select the last column with the last_col() helper function.
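A minimal sketch of last_col() follows, using `sleep_demo`, a hypothetical stand-in tibble with values copied from the printed output above. The `offset` argument is part of last_col() but is not covered in the text above; it is included here only as an optional extra.

```r
library(dplyr)

# Hypothetical stand-in; values copied from the printed output above.
sleep_demo <- tibble::tibble(
  species   = c("Africanelephant", "Africangiantpouchedrat", "ArcticFox"),
  life_span = c(38.6, 4.5, 14),
  gestation = c(645, 42, 60)
)

# last_col() picks whichever column is currently last
last_tbl <- sleep_demo |> dplyr::select(last_col())
names(last_tbl)
# "gestation"

# offset counts back from the end: offset = 1 is the second-to-last column
second_last_tbl <- sleep_demo |> dplyr::select(last_col(offset = 1))
names(second_last_tbl)
# "life_span"
```

On the demonstration data, mammal_sleep_tbl |> dplyr::select(last_col()) should return the danger column, assuming the column order shown in the first printout.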