The summarise() function summarises each group of a grouped tibble. It returns a new tibble of the summarised information, with one row per group.
Various helper functions can be used inside summarise() to get specific information, including:
n(): Count the number of rows in each group.
mean(): Calculate the mean of a column.
median(): Calculate the median of a column.
sd(): Calculate the standard deviation.
IQR(): Calculate the interquartile range (upper quartile - lower quartile).
first(): Extract the first value.
last(): Extract the last value.
nth(): Extract the specified nth value.
Tidyverse reference page
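As a quick illustration of the helpers listed above, here is a minimal sketch using mean() and median() inside summarise(). The tibble toy_tbl and its columns group and value are made up for this example and are not part of mgrtibbles.

```r
# Made-up data for illustration only (not from mgrtibbles)
toy_tbl <- tibble::tibble(
  group = c("a", "a", "a", "b", "b"),
  value = c(1, 2, 9, 4, 6)
)

toy_summary <- toy_tbl |>
  #Group by the made-up group column
  dplyr::group_by(group) |>
  #One row per group: a has mean 4 and median 2, b has mean 5 and median 5
  dplyr::summarise(
    n = dplyr::n(),
    mean_value = mean(value),
    median_value = median(value)
  )

toy_summary
```

Group a shows why both statistics can be worth reporting: the single large value (9) pulls the mean up to 4 while the median stays at 2.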
Dataset
For demonstration we’ll load the amphibian_div_tbl data from the mgrtibbles package (hyperlink includes install instructions).
We’ll remove any rows containing NAs, as NAs cause mean() and other calculations to return NA.
#Load package
library("mgrtibbles")
#amphibian_div_tbl tibble for demonstration
amphibian_div_tbl <- mgrtibbles::amphibian_div_tbl |> na.omit()
#View tibble
amphibian_div_tbl
# A tibble: 65 × 15
Species IUCN.Red.List.Status iucn_2cat Order Family Genus
<chr> <fct> <fct> <chr> <chr> <chr>
1 Acris crepitans Least Concern (LC) LC Anura Hylidae Acris
2 Acris gryllus Least Concern (LC) LC Anura Hylidae Acris
3 Ambystoma jeffersonianum Least Concern (LC) LC Caudata Ambyst… Amby…
4 Ambystoma macrodactylum Least Concern (LC) LC Caudata Ambyst… Amby…
5 Ambystoma maculatum Least Concern (LC) LC Caudata Ambyst… Amby…
6 Ambystoma texanum Least Concern (LC) LC Caudata Ambyst… Amby…
7 Ambystoma tigrinum Least Concern (LC) LC Caudata Ambyst… Amby…
8 Amphiuma means Least Concern (LC) LC Caudata Amphiu… Amph…
9 Amphiuma tridactylum Least Concern (LC) LC Caudata Amphiu… Amph…
10 Anaxyrus boreas Least Concern (LC) LC Anura Bufoni… Anax…
# ℹ 55 more rows
# ℹ 9 more variables: Age_at_maturity_min_y <dbl>, Age_at_maturity_max_y <dbl>,
# Body_size_mm <dbl>, Longevity_max_y <dbl>, Litter_size_min_n <dbl>,
# Litter_size_max_n <dbl>, Offspring_size_min_mm <dbl>,
# Offspring_size_max_mm <fct>, Development <chr>
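The effect of NAs on calculations, which motivates the na.omit() call above, can be seen with a small made-up vector. The sketch below also shows the na.rm = TRUE argument of mean(); that is an alternative this page does not use, included only for comparison. The names x and toy_df are hypothetical.

```r
# Made-up vector containing one missing value
x <- c(10, NA, 30)

mean(x)                # NA: the missing value propagates to the result
mean(x, na.rm = TRUE)  # 20: the NA is dropped before averaging

# na.omit() removes every row that has an NA in any column
toy_df <- data.frame(id = 1:3, size = c(10, NA, 30))
nrow(na.omit(toy_df))  # 2
```

Note that na.omit() drops whole rows, so a row with an NA in one column is also lost from summaries of every other column.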
Summarise
Called with no arguments, summarise() returns a tibble containing one row for each unique group value.
amphibian_div_tbl |>
  dplyr::group_by(IUCN.Red.List.Status) |>
  dplyr::summarise()
# A tibble: 5 × 1
IUCN.Red.List.Status
<fct>
1 Least Concern (LC)
2 Near Threatened (NT)
3 Vulnerable (VU)
4 Endangered (EN)
5 Least Concern (LC) - Provisional
Count
The number of rows in each group can be added with n().
Notice that the new column’s name is specified before the = sign. The result is the same as that produced by the count() function.
amphibian_div_tbl |>
  dplyr::group_by(IUCN.Red.List.Status) |>
  dplyr::summarise(n = dplyr::n())
# A tibble: 5 × 2
IUCN.Red.List.Status n
<fct> <int>
1 Least Concern (LC) 55
2 Near Threatened (NT) 5
3 Vulnerable (VU) 3
4 Endangered (EN) 1
5 Least Concern (LC) - Provisional 1
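The equivalence with count() mentioned above can be checked on a small example. This is a minimal sketch using a made-up tibble (toy_tbl with a status column), not the amphibian data.

```r
# Made-up data for illustration only
toy_tbl <- tibble::tibble(status = c("LC", "LC", "NT", "LC", "VU"))

#Count rows per group with group_by() and summarise()
via_summarise <- toy_tbl |>
  dplyr::group_by(status) |>
  dplyr::summarise(n = dplyr::n())

#Count rows per group with count()
via_count <- toy_tbl |>
  dplyr::count(status)

#Both approaches give the same groups and the same counts
identical(via_summarise$status, via_count$status)  # TRUE
identical(via_summarise$n, via_count$n)            # TRUE
```

count() is the shorter option when only counts are needed. summarise() is the better fit when the counts sit alongside other summary columns.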
Standard deviation and IQR
The sd() and IQR() functions calculate the standard deviation and the interquartile range (upper quartile - lower quartile).
Note that if a group contains only one value, sd() returns NA and IQR() returns 0.
amphibian_div_tbl |>
  #Group by IUCN.Red.List.Status
  dplyr::group_by(IUCN.Red.List.Status) |>
  #Summarise
  dplyr::summarise(
    n = dplyr::n(),
    sd_body_size = sd(Body_size_mm),
    iqr_body_size = IQR(Body_size_mm)
  )
# A tibble: 5 × 4
IUCN.Red.List.Status n sd_body_size iqr_body_size
<fct> <int> <dbl> <dbl>
1 Least Concern (LC) 55 200. 94.5
2 Near Threatened (NT) 5 51.7 83
3 Vulnerable (VU) 3 71.1 63.5
4 Endangered (EN) 1 NA 0
5 Least Concern (LC) - Provisional 1 NA 0
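The single-value behaviour seen in the last two rows can be reproduced with base R alone. The sketch below uses made-up numbers (x and single are hypothetical names) and also checks that IQR() matches the upper quartile minus the lower quartile.

```r
# Made-up values for illustration only
x <- c(1, 2, 3, 4, 5, 6, 7, 8)

#IQR by hand: upper quartile minus lower quartile
quartiles <- quantile(x, probs = c(0.25, 0.75))
iqr_by_hand <- unname(quartiles[2] - quartiles[1])
iqr_by_hand  # 3.5
IQR(x)       # 3.5

#A "group" with a single value
single <- 87
sd(single)   # NA: the sample standard deviation divides by n - 1, which is 0
IQR(single)  # 0: both quartiles equal the single value
```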
First, last and nth values
The first, last, and nth values can be extracted with the functions first(), last(), and nth().
Note that nth() returns NA for the groups that contain only one value, as they have no second value. first() and last() still work for these groups because the single value is both the first and the last value.
amphibian_div_tbl |>
  #Group by IUCN.Red.List.Status
  dplyr::group_by(IUCN.Red.List.Status) |>
  #Summarise
  dplyr::summarise(
    n = dplyr::n(),
    first_body_size = dplyr::first(Body_size_mm),
    last_body_size = dplyr::last(Body_size_mm),
    second_body_size = dplyr::nth(Body_size_mm, 2)
  )
# A tibble: 5 × 5
IUCN.Red.List.Status n first_body_size last_body_size second_body_size
<fct> <int> <dbl> <dbl> <dbl>
1 Least Concern (LC) 55 38 197 33
2 Near Threatened (NT) 5 51 76 111
3 Vulnerable (VU) 3 51 170 178
4 Endangered (EN) 1 87 87 NA
5 Least Concern (LC) - Pr… 1 48 48 NA
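The NA returned by nth() for single-value groups can be explored outside summarise(). This is a minimal sketch on made-up vectors (single and many are hypothetical names). It also shows the default argument of nth(), which this page does not otherwise use, as one possible way to substitute a value when the requested position does not exist.

```r
# Made-up values for illustration only
single <- c(87)
many <- c(10, 20, 30)

#No second value exists, so NA is returned
dplyr::nth(single, 2)               # NA

#A fallback value can be supplied through the default argument
dplyr::nth(single, 2, default = 0)  # 0

#Negative positions count from the end, so -1 matches last()
dplyr::nth(many, -1) == dplyr::last(many)  # TRUE
```

Whether a fallback such as 0 is appropriate depends on the analysis. Leaving the NA in place is often the safer choice, as it keeps missing positions visible.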