This is a quick post, mostly to document some useful code. When conducting bioinformatic analyses using Qiime, one of the last steps is to cluster sequences into OTUs (operational taxonomic units) and assign taxonomy to them. You can then make an OTU table which contains all your OTUs and their associated taxonomy. Bam, easy!
But what if you haven’t used, or don’t want to use, Qiime to cluster OTUs? For example, I often use Vsearch to cluster my OTUs and then use Qiime only to assign taxonomy via the RDP classifier. This leaves two separate files, so we need some way of uniting our already constructed OTU table with our taxonomy file.
Here is a short piece of R code which will do the job nicely.
# First, let's read in our OTU table
otuTable <- read.delim("archOtuTab.txt", header = TRUE)

# Now read in the taxonomy file generated by the RDP classifier in Qiime
taxa <- read.table("archaeaCentroids_tax_assignments.txt", header = FALSE,
                   col.names = c("OTUId", "taxonomy", "confidence"),
                   stringsAsFactors = FALSE, sep = "\t")

# Merge the OTU table with the taxonomy. Order of OTUs/taxonomy is
# not important, R will correctly match OTUs with their taxonomy string.
# Note that taxa[, -3] drops the confidence column, so it will not appear
# in the final OTU table, as it is often not that useful...
otuTable <- merge(otuTable, taxa[, -3], by = "OTUId")

# Now we just need to write our completed OTU table to a file
write.table(otuTable, "archaeaOtuTableTaxonomy.txt", sep = "\t",
            quote = FALSE, row.names = FALSE)
Done! A simple way of uniting taxonomy information with a standalone OTU table without using Qiime.
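If you want to convince yourself that the merge really does match on OTU id rather than row order, here is a minimal sketch using toy data. The OTU ids, sample names, counts, and taxonomy strings below are all made up for illustration and are not from my real files. It also shows one thing worth knowing: by default `merge()` drops any OTU that has no entry in the taxonomy file, while `all.x = TRUE` keeps it with `NA` as its taxonomy.

```r
# Toy OTU table (hypothetical ids and counts, for illustration only)
otuTable <- data.frame(
  OTUId = c("OTU_1", "OTU_2", "OTU_3"),
  sampleA = c(10, 0, 5),
  sampleB = c(3, 7, 0),
  stringsAsFactors = FALSE
)

# Toy taxonomy table: rows in a different order, and no entry for OTU_3
taxa <- data.frame(
  OTUId = c("OTU_2", "OTU_1"),
  taxonomy = c("k__Archaea; p__Crenarchaeota", "k__Archaea; p__Euryarchaeota"),
  confidence = c(0.98, 0.87),
  stringsAsFactors = FALSE
)

# Default merge: rows are matched on OTUId regardless of order,
# but OTUs without a taxonomy entry (OTU_3 here) are dropped
merged <- merge(otuTable, taxa[, -3], by = "OTUId")

# all.x = TRUE keeps every OTU and fills missing taxonomy with NA
mergedAll <- merge(otuTable, taxa[, -3], by = "OTUId", all.x = TRUE)

print(merged)
print(mergedAll)
```

Comparing `nrow(otuTable)` before and after the merge is a quick way to spot OTUs that went missing because they had no taxonomy assignment.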