A Mamba installs

A.1 Mamba installation and environment

Mamba is a reimplementation of conda. It is a great tool for installing bioinformatic packages including R packages.

Mamba github: https://github.com/mamba-org/mamba

The best way to use Mamba is to install Miniforge. It has both Conda and Mamba commands.

Miniforge installation: https://github.com/conda-forge/miniforge

Mamba guide: https://mamba.readthedocs.io/en/latest/user_guide/mamba.html

To create the mamba environment shotgun_meta run the below commands in your bash. You will need to have installed mamba first.

#shotgun_meta
mamba create -n shotgun_meta
mamba activate shotgun_meta
#Install packages
mamba install -c bioconda fastqc trim-galore multiqc bowtie2 kraken2 \
krona bracken lefse flash megahit quast metabat2 prokka bbmap bakta circos
#Update krona taxonomy database
ktUpdateTaxonomy.sh
#Install GenoVi via pip
pip install genovi

You will need to install the Bakta database as well. This requires you have a directory to store the database. In the below example you will need a directory with the path ~/databases/bakta_db, of course feel free to use a different location you have.

bakta_db download --output ~/databases/bakta --type full

Install MinPath via git. I suggest creating a directory called "git_installs" in your home directory and running this code there.

#git_installs directory
mkdir ~/git_installs
cd ~/git_installs
#MinPath
git clone https://github.com/mgtools/MinPath

To run MinPath in your own machines you will need to use the full path of the python file. If you installed these in your "~/git_installs" examples are below.

#MinPath help page
python ~/git_installs/Minpath/MinPath.py -h

To create the mamba environment cocopye run the below commands in your bash. You will need to have installed mamba first.

#checkm
mamba create -n cocopye
mamba activate checkm2
mamba install bioconda::checkm2=1.1.0
#Download required databases
checkm2 database --download

To create the mamba environment biobakery3.9 run the below commands in your bash. You will need to have installed mamba first.

#biobakery3
mamba create -n biobakery3.9
mamba activate biobakery3.9
#Add required channels
conda config --add channels defaults
conda config --add channels bioconda
conda config --add channels conda-forge
conda config --add channels biobakery
#Install huamnn3.9
mamba install -c biobakery humann=3.9
#Install hclust2 and matplotlib
mamba install bioncda::hclust2=1.0.0 conda-forge::matplotlib=3.7.3

Next you will need to install the HUMAnN and MetaPhlAn databases.

#Ensure you have biobakery3 mamba environment activated
#Update HUMAnN databases
humann_databases --download chocophlan full /path/to/databases --update-config yes
humann_databases --download uniref uniref90_diamond /path/to/databases --update-config yes
humann_databases --download utility_mapping full /path/to/databases --update-config yes
#Install MetaPhlAn databases
metaphlan --install --index mpa_vOct22_CHOCOPhlAnSGB_202403

Installing the MetaPhlAn databases may not work with the command above. If not follow the below instructions.

Once the databses are all setup you can test HUMAnN.

humann_test

B Next steps

Below are some good links to start with before carrying out your own projects.

D Obtaining Read Data

The following commands can be used to obtain the sequence data used in this practical, directly from the EBI metagenomics site. It is worth noting that these are the full set of data, not like the miniaturised version you have used in the tutorial.

wget -O K1_R1.fastq.gz ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR505/ERR505102/ERR505102_1.fastq.gz
wget -O K1_R2.fastq.gz ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR505/ERR505102/ERR505102_2.fastq.gz
wget -O K2_R1.fastq.gz ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR505/ERR505104/ERR505104_1.fastq.gz
wget -O K2_R2.fastq.gz ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR505/ERR505104/ERR505104_2.fastq.gz
wget -O K3_R1.fastq.gz ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR505/ERR505095/ERR505095_1.fastq.gz
wget -O K3_R2.fastq.gz ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR505/ERR505095/ERR505095_2.fastq.gz
wget -O K4_R1.fastq.gz ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR505/ERR505096/ERR505096_1.fastq.gz
wget -O K4_R2.fastq.gz ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR505/ERR505096/ERR505096_2.fastq.gz
wget -O K5_R1.fastq.gz ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR505/ERR505097/ERR505097_1.fastq.gz
wget -O K5_R2.fastq.gz ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR505/ERR505097/ERR505097_2.fastq.gz
wget -O K6_R1.fastq.gz ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR505/ERR505098/ERR505098_1.fastq.gz
wget -O K6_R2.fastq.gz ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR505/ERR505098/ERR505098_2.fastq.gz
wget -O K7_R1.fastq.gz ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR505/ERR505099/ERR505099_1.fastq.gz
wget -O K7_R2.fastq.gz ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR505/ERR505099/ERR505099_2.fastq.gz
wget -O K8_R1.fastq.gz ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR505/ERR505100/ERR505100_1.fastq.gz
wget -O K8_R2.fastq.gz ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR505/ERR505100/ERR505100_2.fastq.gz
wget -O K9_R1.fastq.gz ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR505/ERR505101/ERR505101_1.fastq.gz
wget -O K9_R2.fastq.gz ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR505/ERR505101/ERR505101_2.fastq.gz
wget -O K10_R1.fastq.gz ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR505/ERR505103/ERR505103_1.fastq.gz
wget -O K10_R2.fastq.gz ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR505/ERR505103/ERR505103_2.fastq.gz
wget -O K11_R1.fastq.gz ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR505/ERR505105/ERR505105_1.fastq.gz
wget -O K11_R2.fastq.gz ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR505/ERR505105/ERR505105_2.fastq.gz
wget -O K12_R1.fastq.gz ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR505/ERR505106/ERR505106_1.fastq.gz
wget -O K12_R2.fastq.gz ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR505/ERR505106/ERR505106_2.fastq.gz
wget -O W1_R1.fastq.gz ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR505/ERR505090/ERR505090_1.fastq.gz
wget -O W1_R2.fastq.gz ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR505/ERR505090/ERR505090_2.fastq.gz
wget -O W2_R1.fastq.gz ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR505/ERR505088/ERR505088_1.fastq.gz
wget -O W2_R2.fastq.gz ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR505/ERR505088/ERR505088_2.fastq.gz
wget -O W3_R1.fastq.gz ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR505/ERR505084/ERR505084_1.fastq.gz
wget -O W3_R2.fastq.gz ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR505/ERR505084/ERR505084_2.fastq.gz
wget -O W4_R1.fastq.gz ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR505/ERR505085/ERR505085_1.fastq.gz
wget -O W4_R2.fastq.gz ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR505/ERR505085/ERR505085_2.fastq.gz
wget -O W5_R1.fastq.gz ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR505/ERR505086/ERR505086_1.fastq.gz
wget -O W5_R2.fastq.gz ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR505/ERR505086/ERR505086_2.fastq.gz
wget -O W6_R1.fastq.gz ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR505/ERR505087/ERR505087_1.fastq.gz
wget -O W6_R2.fastq.gz ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR505/ERR505087/ERR505087_2.fastq.gz
wget -O W7_R1.fastq.gz ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR505/ERR505089/ERR505089_1.fastq.gz
wget -O W7_R2.fastq.gz ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR505/ERR505089/ERR505089_2.fastq.gz
wget -O W8_R1.fastq.gz ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR505/ERR505091/ERR505091_1.fastq.gz
wget -O W8_R2.fastq.gz ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR505/ERR505091/ERR505091_2.fastq.gz
wget -O W9_R1.fastq.gz ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR505/ERR505092/ERR505092_1.fastq.gz
wget -O W9_R2.fastq.gz ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR505/ERR505092/ERR505092_2.fastq.gz
wget -O W10_R1.fastq.gz ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR505/ERR505093/ERR505093_1.fastq.gz
wget -O W10_R2.fastq.gz ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR505/ERR505093/ERR505093_2.fastq.gz
wget -O W11_R1.fastq.gz ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR505/ERR505094/ERR505094_1.fastq.gz
wget -O W11_R2.fastq.gz ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR505/ERR505094/ERR505094_2.fastq.gz
wget -O W12_R1.fastq.gz ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR505/ERR505083/ERR505083_1.fastq.gz
wget -O W12_R2.fastq.gz ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR505/ERR505083/ERR505083_2.fastq.gz