Subsetting

There are several ways to subset a tibble. This section shows each method with examples.

Tidyverse’s tibble subsetting webpage

Load data

The examples in this section use the pie_crab data from the lterdatasampler package (includes install instructions).

Load package

library("lterdatasampler")

Create a tibble from the first 10 rows and first 7 columns of the pie_crab data.

pie_crab_tbl <- tibble::as_tibble(lterdatasampler::pie_crab[1:10,1:7])
pie_crab_tbl
# A tibble: 10 × 7
   date       latitude site   size air_temp air_temp_sd water_temp
   <date>        <dbl> <chr> <dbl>    <dbl>       <dbl>      <dbl>
 1 2016-07-24       30 GTM    12.4     21.8        6.39       24.5
 2 2016-07-24       30 GTM    14.2     21.8        6.39       24.5
 3 2016-07-24       30 GTM    14.5     21.8        6.39       24.5
 4 2016-07-24       30 GTM    12.9     21.8        6.39       24.5
 5 2016-07-24       30 GTM    12.4     21.8        6.39       24.5
 6 2016-07-24       30 GTM    13.0     21.8        6.39       24.5
 7 2016-07-24       30 GTM    10.3     21.8        6.39       24.5
 8 2016-07-24       30 GTM    11.2     21.8        6.39       24.5
 9 2016-07-24       30 GTM    12.7     21.8        6.39       24.5
10 2016-07-24       30 GTM    14.6     21.8        6.39       24.5

One column with $

Subsetting a column using $ and a column name will produce a vector.

pie_crab_tbl$site
 [1] "GTM" "GTM" "GTM" "GTM" "GTM" "GTM" "GTM" "GTM" "GTM" "GTM"
str(pie_crab_tbl$site)
 chr [1:10] "GTM" "GTM" "GTM" "GTM" "GTM" "GTM" "GTM" "GTM" "GTM" "GTM"
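As a minimal sketch of the same idea, the column can also be pulled out with [[ ]], either by name or by position, and the result is the same plain vector. The small crabs tibble below is a hypothetical stand-in for pie_crab_tbl, so the example runs without lterdatasampler installed.

```r
library(tibble)

# Hypothetical stand-in for the first rows of pie_crab_tbl
crabs <- tibble(
  site = c("GTM", "GTM", "GTM"),
  size = c(12.43, 14.18, 14.52)
)

by_dollar   <- crabs$site       # extract by name with $
by_name     <- crabs[["site"]]  # extract by name with [[ ]]
by_position <- crabs[[1]]       # extract by position with [[ ]]

# All three are the same character vector, not a tibble
identical(by_dollar, by_name)
identical(by_dollar, by_position)
is.character(by_dollar)
```

The [[ ]] form is convenient when the column name is stored in a variable, since $ only accepts a literal name.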

Rows and columns with [,]

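Subsetting a tibble with [] produces a tibble by default, no matter how many rows or columns are selected.

This is different from a data.frame, where [] may return a single value (a length-one vector), a vector, or a data.frame, depending on what is selected.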

Load data.frame

Create a data.frame from the pie_crab_tbl.

pie_crab_df <- as.data.frame(pie_crab_tbl)
pie_crab_df
         date latitude site  size air_temp air_temp_sd water_temp
1  2016-07-24       30  GTM 12.43   21.792       6.391     24.502
2  2016-07-24       30  GTM 14.18   21.792       6.391     24.502
3  2016-07-24       30  GTM 14.52   21.792       6.391     24.502
4  2016-07-24       30  GTM 12.94   21.792       6.391     24.502
5  2016-07-24       30  GTM 12.45   21.792       6.391     24.502
6  2016-07-24       30  GTM 12.99   21.792       6.391     24.502
7  2016-07-24       30  GTM 10.32   21.792       6.391     24.502
8  2016-07-24       30  GTM 11.19   21.792       6.391     24.502
9  2016-07-24       30  GTM 12.68   21.792       6.391     24.502
10 2016-07-24       30  GTM 14.55   21.792       6.391     24.502

Extracting a value

Subsetting one value from a data.frame leads to a scalar (a length-one vector)

pie_crab_df[1,1]
[1] "2016-07-24"
str(pie_crab_df[1,1])
 Date[1:1], format: "2016-07-24"

Subsetting one value from a tibble leads to a tibble

pie_crab_tbl[1,1]
# A tibble: 1 × 1
  date      
  <date>    
1 2016-07-24
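When the bare value is needed from a tibble rather than a 1 × 1 tibble, one option is to extract the column as a vector first and then index into it. The sketch below assumes a small hypothetical crab_tbl in place of pie_crab_tbl.

```r
library(tibble)

# Hypothetical stand-in for pie_crab_tbl
crab_tbl <- tibble(
  site = c("GTM", "GTM"),
  size = c(12.43, 14.18)
)

cell_tbl  <- crab_tbl[1, "size"]     # still a 1 x 1 tibble
cell_val  <- crab_tbl$size[1]        # length-one numeric vector
cell_val2 <- crab_tbl[["size"]][1]   # same value via [[ ]]

is_tibble(cell_tbl)
is.numeric(cell_val)
identical(cell_val, cell_val2)
```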

Extracting a row

Subsetting a row from a data.frame leads to a data.frame

pie_crab_df[1,]
        date latitude site  size air_temp air_temp_sd water_temp
1 2016-07-24       30  GTM 12.43   21.792       6.391     24.502
str(pie_crab_df[1,])
'data.frame':   1 obs. of  7 variables:
 $ date       : Date, format: "2016-07-24"
 $ latitude   : num 30
 $ site       : chr "GTM"
 $ size       : num 12.4
 $ air_temp   : num 21.8
 $ air_temp_sd: num 6.39
 $ water_temp : num 24.5

Subsetting a row from a tibble leads to a tibble

pie_crab_tbl[1,]
# A tibble: 1 × 7
  date       latitude site   size air_temp air_temp_sd water_temp
  <date>        <dbl> <chr> <dbl>    <dbl>       <dbl>      <dbl>
1 2016-07-24       30 GTM    12.4     21.8        6.39       24.5

Extracting a column

Subsetting a column from a data.frame leads to a vector

pie_crab_df[,1]
 [1] "2016-07-24" "2016-07-24" "2016-07-24" "2016-07-24" "2016-07-24"
 [6] "2016-07-24" "2016-07-24" "2016-07-24" "2016-07-24" "2016-07-24"
str(pie_crab_df[,1])
 Date[1:10], format: "2016-07-24" "2016-07-24" "2016-07-24" "2016-07-24" "2016-07-24" ...

Subsetting a column from a tibble leads to a tibble

pie_crab_tbl[,1]
# A tibble: 10 × 1
   date      
   <date>    
 1 2016-07-24
 2 2016-07-24
 3 2016-07-24
 4 2016-07-24
 5 2016-07-24
 6 2016-07-24
 7 2016-07-24
 8 2016-07-24
 9 2016-07-24
10 2016-07-24
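Both behaviours can be switched with the drop argument of []. The sketch below uses small hypothetical objects (crab_df and crab_tbl) rather than the pie_crab data: drop = FALSE keeps a data.frame column as a data.frame, and drop = TRUE turns a single tibble column into a vector.

```r
library(tibble)

# Hypothetical stand-ins for pie_crab_df and pie_crab_tbl
crab_df  <- data.frame(site = c("GTM", "GTM"), size = c(12.43, 14.18))
crab_tbl <- as_tibble(crab_df)

df_vec  <- crab_df[, "size"]                 # data.frame default: vector
df_df   <- crab_df[, "size", drop = FALSE]   # stays a data.frame

tbl_tbl <- crab_tbl[, "size"]                # tibble default: tibble
tbl_vec <- crab_tbl[, "size", drop = TRUE]   # becomes a vector

is.data.frame(df_df)
is_tibble(tbl_tbl)
identical(df_vec, tbl_vec)
```

Setting drop explicitly can make code behave the same way whether it receives a data.frame or a tibble.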

Extracting a combination of rows and columns

Subsetting rows and columns together from a data.frame leads to a data.frame

pie_crab_df[1:3,c("date","air_temp","size")]
        date air_temp  size
1 2016-07-24   21.792 12.43
2 2016-07-24   21.792 14.18
3 2016-07-24   21.792 14.52
str(pie_crab_df[1:3,c("date","air_temp","size")])
'data.frame':   3 obs. of  3 variables:
 $ date    : Date, format: "2016-07-24" "2016-07-24" ...
 $ air_temp: num  21.8 21.8 21.8
 $ size    : num  12.4 14.2 14.5

Subsetting rows and columns together from a tibble leads to a tibble

pie_crab_tbl[1:3,c("date","air_temp","size")]
# A tibble: 3 × 3
  date       air_temp  size
  <date>        <dbl> <dbl>
1 2016-07-24     21.8  12.4
2 2016-07-24     21.8  14.2
3 2016-07-24     21.8  14.5

Dplyr

There are also many ways to subset a tibble with dplyr.

dplyr page
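As a rough illustration, a few of the dplyr verbs commonly used for subsetting are sketched below. The crab_tbl object is a hypothetical stand-in for pie_crab_tbl, and the size > 14 cutoff is chosen only for the example.

```r
library(tibble)
library(dplyr)

# Hypothetical stand-in for pie_crab_tbl
crab_tbl <- tibble(
  site     = c("GTM", "GTM", "GTM"),
  size     = c(12.43, 14.18, 14.52),
  air_temp = c(21.792, 21.792, 21.792)
)

cols  <- select(crab_tbl, site, size)  # keep columns by name (tibble)
rows  <- slice(crab_tbl, 1:2)          # keep rows by position (tibble)
large <- filter(crab_tbl, size > 14)   # keep rows by condition (tibble)
sizes <- pull(crab_tbl, size)          # extract one column (vector)
```

select(), slice(), and filter() return tibbles, much like [], while pull() returns a vector, much like $.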