2.4.3 Content
This practical is broken up into the following broad sections.
- Raw data: We will first link to a dataset that we have downloaded for this tutorial. We will take a quick look at what the sequence files look like and briefly discuss the origin of the samples.
- Trimming data: This entails preprocessing our data to ensure that it is of good quality.
- Host removal: When sequencing the genomic content of host's microbiota (bacteriome, archaeome, mycobiome, and more) it is likely you will also sequence the host's genome. This step shows a method of removing possible host contamination.
- Taxonomic profiling: We will analyse the dataset to determine the species abundance in each sample. Following this, we will visualise the data and compare the samples.
- Functional profiling: We will analyse the dataset to determine the pathway abundance and completeness in each sample. Following this, we will visualise the data and compare the samples.
- Metagenome assembly: Here, we will move away from just analysing the reads directly and will assemble the metagenome into contigs. Prior to this, we will 'stitch' the reads together to ensure we get the best assembly possible.
- Binning: This step attempts to seperate each assembled genomes into bins. These genome assemblies are called Metagenome-assembled Genomes (MAGs).
- Functional annotation: We will take our MAGs, predict genes and then functionally annotate them with MetaCyc.