Chapter 3 Exporting data

It is useful to export our ASV sequences and a table of counts from DADA2/R, as we might want to visualise or analyse these using other software.

Our ASV sequences and counts per sample are stored in the object seqtab.nochim. The ASVs are not named, so first let's name them (ASV_1, ASV_2, etc.).

# The column names of `seqtab.nochim` are the ASV sequences,
# so extract these and assign them to `mifish_seqs`
mifish_seqs <- colnames(seqtab.nochim)

# Make a new variable for ASV names, `mifish_headers`,
# with length equal to the number of ASVs
mifish_headers <- vector(mode="character", length=ncol(seqtab.nochim))

# Fill the vector with names formatted for a fasta header (>ASV_1, >ASV_2, etc.)
for (i in seq_len(ncol(seqtab.nochim))) {
  mifish_headers[i] <- paste(">ASV", i, sep="_")
}
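
The loop above can also be written as a single vectorised call. The following is a minimal sketch of that alternative, not a replacement for the workflow above. It uses a small made-up matrix, `toy_seqtab`, as a hypothetical stand-in for `seqtab.nochim` so that it runs on its own; the names `toy_seqs` and `toy_headers` are likewise illustrative.

```r
# Toy stand-in for `seqtab.nochim` (hypothetical data): 2 samples x 3 ASVs,
# with the ASV sequences as column names
toy_seqtab <- matrix(
  c(10L, 0L, 5L, 3L, 7L, 2L),
  nrow = 2,
  dimnames = list(c("sampleA", "sampleB"), c("ACGT", "GGCA", "TTAG"))
)

# Extract the sequences from the column names
toy_seqs <- colnames(toy_seqtab)

# paste0() is vectorised, so no loop or pre-allocated vector is needed;
# seq_along() also gives an empty result if there are no ASVs
toy_headers <- paste0(">ASV_", seq_along(toy_seqs))

print(toy_headers)
```

Either approach gives one header per ASV, in the same order as the columns of the sequence table.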

3.1 Fasta file

Now that we have our sequences and names as variables, we can interleave them (header, sequence, header, sequence, and so on) and write a fasta file.

mifish_fasta <- c(rbind(mifish_headers, mifish_seqs))
write(mifish_fasta, "MiFish_ASVs.fa")

You should now have this fasta file in your working directory on the server.
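
To see why `c(rbind(...))` produces alternating header and sequence lines, and to check the file after writing it, a small sketch like the one below may help. It uses two made-up ASVs and a temporary file rather than the real `MiFish_ASVs.fa`; all `toy_` names are hypothetical.

```r
# Hypothetical toy headers and sequences
toy_headers <- c(">ASV_1", ">ASV_2")
toy_seqs <- c("ACGT", "GGCA")

# rbind() stacks headers over sequences as a 2-row matrix; c() then reads
# that matrix column by column, giving header, sequence, header, sequence
toy_fasta <- c(rbind(toy_headers, toy_seqs))

# Write to a temporary file and read it back to check the layout
fasta_path <- tempfile(fileext = ".fa")
writeLines(toy_fasta, fasta_path)
fasta_lines <- readLines(fasta_path)

# Each record is two lines, so header lines should make up half the file
n_headers <- sum(startsWith(fasta_lines, ">"))
print(fasta_lines)
print(n_headers)
```

The same check can be run on the real file by pointing `readLines()` at `"MiFish_ASVs.fa"`; the number of header lines should equal the number of columns in `seqtab.nochim`.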

3.2 Sequence count matrix

Next, make a table of sequence counts for each ASV in each sample.

# First transpose `seqtab.nochim`, so that ASVs are rows and samples are columns, and assign this to the variable `mifish_tab`
mifish_tab <- t(seqtab.nochim)

# Name each row with the ASV name, omitting the '>' used in the fasta file
row.names(mifish_tab) <- sub(">", "", mifish_headers)
# `col.names=NA` adds a blank header cell above the row names so the columns line up
write.table(mifish_tab, "MiFish_ASV_counts.tsv", sep="\t", quote=FALSE, col.names=NA)

You should now have an ASV by sample matrix with sequence counts in a tab-separated value (tsv) file in your working directory.
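
If you want to confirm that the table can be read back correctly, for example before passing it to other software, something like the following sketch could be used. It writes a made-up count matrix (`toy_tab`, hypothetical) to a temporary file with the same `write.table()` settings and reads it back with `read.delim()`.

```r
# Hypothetical ASV-by-sample count matrix: 3 ASVs x 2 samples
toy_tab <- matrix(
  c(10L, 5L, 7L, 0L, 3L, 2L),
  nrow = 3,
  dimnames = list(c("ASV_1", "ASV_2", "ASV_3"), c("sample-A", "sample-B"))
)

# Write it in the same way as the real count table
counts_path <- tempfile(fileext = ".tsv")
write.table(toy_tab, counts_path, sep = "\t", quote = FALSE, col.names = NA)

# Read it back: the first column holds the ASV names, and
# check.names = FALSE stops R altering sample names such as "sample-A"
toy_counts <- read.delim(counts_path, row.names = 1, check.names = FALSE)

# Total reads per sample, a quick sanity check on the exported values
sample_totals <- colSums(toy_counts)
print(toy_counts)
print(sample_totals)
```

Setting `check.names = FALSE` matters mainly when sample names contain characters such as hyphens, which R would otherwise convert to dots on import.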

3.3 Table of taxon names

Lastly, if we've used DADA2 to assign taxonomy, we can make a table of taxon names for each ASV.

# Replace the row names in `taxa` with the ASV names,
# omitting the '>' used for the fasta file.
# This assumes the rows of `taxa` are in the same order as the columns of `seqtab.nochim`.
rownames(taxa) <- gsub(pattern=">", replacement="", x=mifish_headers)
write.table(taxa, "MiFish_ASV_taxonomy.tsv", sep="\t", quote=FALSE, col.names=NA)

You should now have a tsv file of taxonomic assignments for each ASV in your working directory.
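
Because the ASV names are assigned to `taxa` by position, it can be worth checking that the rows of the taxonomy table are in the same order as the sequences before renaming them. The sketch below shows one way to do this. It assumes the taxonomy table still has the ASV sequences as its row names, and it uses a made-up table (`toy_taxa`, hypothetical) in place of the real `taxa` object.

```r
# Hypothetical sequences and a matching toy taxonomy table,
# with the sequences as row names
toy_seqs <- c("ACGT", "GGCA")
toy_taxa <- matrix(
  c("Eukaryota", "Eukaryota", "Chordata", "Chordata"),
  nrow = 2,
  dimnames = list(toy_seqs, c("Kingdom", "Phylum"))
)
toy_headers <- paste0(">ASV_", seq_along(toy_seqs))

# Stop with an error if the taxonomy rows are not in the same order
# as the sequences; otherwise the ASV names would be attached to the wrong rows
stopifnot(identical(rownames(toy_taxa), toy_seqs))

# Safe to rename: drop the '>' and use the ASV names as row names
rownames(toy_taxa) <- sub(">", "", toy_headers, fixed = TRUE)
print(toy_taxa)
```

With the real objects, the equivalent check would compare `rownames(taxa)` with `mifish_seqs` before the row names are replaced.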