Barking up the wrong tree? Leaf your troubles behind with dendextend

I was recently asked by one of my PhD supervisors to help out on a paper by doing some metagenomic analyses. My mission was essentially to perform some taxonomic analyses of metagenomes and show how a metagenome generated in our lab related to these.

So, naturally, I said yes, carried out the necessary analyses and proceeded to design a figure to show the result. I figured a dendrogram would be a nice way of showing compositional similarity between the community we studied and other communities. First of all, I set about creating a distance matrix and plotting the dendrogram like so:

library(vegan)
data(dune)  # load some dummy data
commDist <- vegdist(dune, "jaccard")  # calculate the community similarity distance matrix
plot(hclust(commDist))

plot of chunk unnamed-chunk-1

There you go. It’s easy to generate a basic dendrogram using any sort of distance matrix. So I then started tidying it up to make it a bit more presentable.

siteLabels <- paste("Sample ", rownames(dune), sep = "")  # create some sample labels
plot(hclust(commDist), labels = siteLabels, main = "", sub = "", xlab = "", ylab = "Jaccard Similarity", lwd = 2, cex = 1.2)

plot of chunk unnamed-chunk-2

Much nicer! We’ve gotten rid of the nonsense at the bottom, given a more informative y axis label and generally made things ‘nicer’. Except, when I saw the first draft, I saw my supervisor had rotated the plot so that the leaf tips pointed to the left. For some reason this just looked wrong to my eyes and so I opted to replot the dendrogram, in a more sensible orientation.

Note that to do this, we have to change our approach slightly. For some stupid reason, when we want to plot a dendrogram horizontally, we have to coerce the hclust output to dendrogram class. Additionally, the ‘plot’ method for a dendrogram class object no longer accepts the labels parameter, so we must rename the rows of our dataframe and recalculate the distance matrix instead.

rownames(dune) <- siteLabels  # rename rows of dataframe with our sample labels created earlier

commDist <- vegdist(dune, "jaccard")  # recalculate the distance matrix so that it features our new sample names

par(mar = c(5, 1, 1, 5))  # adjust margins to make room for tip labels

# replot the dendrogram, note that we can now remove the "main =" and "sub =" arguments
# also remember to switch the x and y labels! 
plot(as.dendrogram(hclust(commDist)), xlab = "Jaccard Similarity", lwd = 2, cex = 1.2, horiz = T)

plot of chunk unnamed-chunk-3

Perfect! Or at least I thought so. Turns out my supervisor liked the improved orientation but didn’t like the way the branches all ended at the same point. That should be easy to correct right? Wrong!

I spent near enough an entire day trying to get this plot perfect without success. I went as far as exploring the ggdendro package which allows plotting of dendrograms in a ggplot-esque manner.

Then I stumbled on a solution, enter dendextend! A quick peek of the package manual reveals some really awesome capabilities, I really urge you to take a look as some of the figures you can create are amazing.
For my humble needs, this package solved all my problems easily, and in a couple of lines I’d created exactly the figure my supervisor wanted.

ifelse("dendextend" %in% rownames(installed.packages()) == T, library(dendextend), install.packages(dendextend)) 
## [1] "dendextend"
# uncomment following line if this is the fist time you've installed this package!
# library(dendextend)
dend1 <- as.dendrogram(hclust(commDist))
wellHung <- hang.dendrogram(dend1)  # the cheeky variable names are absolutely essential!
plot_horiz.dendrogram(wellHung, side = F, xlab = "Jaccard Dissimilarity")

plot of chunk unnamed-chunk-4

There you go, an awesome package which save me from wasting too many more days of work!

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s