Tidyr

The purpose of tidyr is to create tidy data. Tidy data has three main features:
- Each variable is a column; each column is a variable.
- Each observation is a row; each row is an observation.
- Each value is a cell; each cell is a single value.
Tidy data works smoothly with the rest of the tidyverse, so you spend less time manipulating data and fighting with tools, and more time and effort analysing the data.
This website aims to quickly cover the most commonly used tidyr functions and their uses. There are many more tidyr functions than those covered here; please check the link below for the full list.
Sections
The sections for tidyr are summarised below.
Pivoting
Certain analyses and tools require data to be in a specific structure. Two rectangular/table data structures are wide and long. The two pivot functions transform data between these two types.
- pivot_longer(): Lengthens data, transforming it from wide to long. Many tidyverse packages are built with long data in mind, especially ggplot2.
- pivot_wider(): Widens data, transforming it from long to wide.
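As a minimal sketch of the two pivot functions, the example below lengthens a small wide table and then widens it again. The table and its column names (site, y2021, y2022, year, count) are made up for illustration.

```r
library(tidyr)

# Hypothetical wide data: one row per site, one column per year
wide <- tibble::tibble(
  site  = c("A", "B"),
  y2021 = c(10, 20),
  y2022 = c(15, 25)
)

# Wide to long: the two year columns become a "year" column
# (holding the old column names) and a "count" column (holding the values)
long <- pivot_longer(
  wide,
  cols = c(y2021, y2022),
  names_to = "year",
  values_to = "count"
)

# Long to wide: spread the "year" values back out into columns
wide_again <- pivot_wider(long, names_from = year, values_from = count)
```

Here long has one row per site and year combination, which is the shape ggplot2 generally expects, and wide_again has the same layout as the original table.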
Character vectors
A column may contain multiple pieces of character/string data or you may want to create a column by uniting multiple character/string columns. The below functions can be used for these purposes.
- separate_wider_delim(): Splits a string column into multiple columns by a delimiter.
- unite(): Combines/unites multiple string columns into one column.
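The following sketch shows one way these two functions might be used together. The data and the column names (name, first, last, full_name) are hypothetical, and separate_wider_delim() requires tidyr 1.3.0 or later.

```r
library(tidyr)

# Hypothetical data: first and last name stored in a single column
people <- tibble::tibble(name = c("Ada_Lovelace", "Alan_Turing"))

# Split "name" at the underscore into two new columns
split_names <- separate_wider_delim(
  people,
  name,
  delim = "_",
  names = c("first", "last")
)

# Combine the two columns into one, separated by a space
joined <- unite(split_names, "full_name", first, last, sep = " ")
```

By default both functions remove their input columns, so split_names contains only first and last, and joined contains only full_name.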
Missing values
It is common to encounter NAs in tabular data. Dropping rows that contain NAs or replacing NA values are common approaches to deal with them.
- drop_na(): Remove rows with NAs.
- replace_na(): Replace NA values.
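A short sketch of both approaches is given below. The table, its columns (id, score, grade) and the replacement values are invented for illustration.

```r
library(tidyr)

# Hypothetical data with missing values in two columns
scores <- tibble::tibble(
  id    = 1:3,
  score = c(90, NA, 75),
  grade = c("A", NA, NA)
)

# Remove every row that has an NA in any column
complete_rows <- drop_na(scores)

# Remove only the rows where "score" is NA
scored <- drop_na(scores, score)

# Replace NAs with a chosen value for each column
filled <- replace_na(scores, list(score = 0, grade = "unknown"))
```

Which approach suits a dataset depends on the analysis: dropping rows discards information, while a replacement value such as 0 may distort summaries like the mean.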