Recently, the newest version of the popular ggplot2 graphics package was announced, and it has some nifty mapping features that I was keen to try out. Continue reading
After my post yesterday, documenting a faster parallelised version of the rarecurve function (quickRareCurve), I realised it’d be good to show a real-world example using it on a reasonably large OTU table, to prove that it is indeed quicker than the original function. So, here we go. Continue reading
When beginning analyses on microbial community data, it is often helpful to compute rarefaction curves. A rarefaction curve tells you about the rate at which new species/OTUs are detected as you increase the number of individuals/sequences sampled. It does this by taking random subsamples of increasing size, from 1 up to the size of your sample, and computing the number of species present in each subsample. Ideally, you want your rarefaction curves to be relatively flat at the full sample size, as this indicates that additional sampling would be unlikely to yield further species.
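The subsampling idea described above can be sketched in a few lines. This is only a minimal illustration in plain Python, not how any particular package implements it; the function name `rarefaction_curve`, the `step` and `seed` parameters, and the example counts are all made up for the purpose.

```python
import random


def rarefaction_curve(counts, step=1, seed=0):
    """Return (depth, richness) pairs for a single sample.

    counts: number of sequences observed for each OTU (hypothetical input).
    step:   gap between successive subsample sizes.
    seed:   seed for the random number generator, for reproducibility.
    """
    rng = random.Random(seed)
    # Expand the counts into one label per sequence, e.g. [3, 1] -> [0, 0, 0, 1]
    pool = [otu for otu, n in enumerate(counts) for _ in range(n)]
    curve = []
    for depth in range(1, len(pool) + 1, step):
        # Draw `depth` sequences without replacement and count distinct OTUs
        subsample = rng.sample(pool, depth)
        curve.append((depth, len(set(subsample))))
    # Always finish at the full sample size, where every observed OTU is present
    if curve and curve[-1][0] != len(pool):
        curve.append((len(pool), len(set(pool))))
    return curve


# Made-up sample: 7 OTUs, 100 sequences in total
counts = [50, 30, 10, 5, 3, 1, 1]
for depth, richness in rarefaction_curve(counts, step=10):
    print(depth, richness)
```

Each depth is drawn independently here, so the curve from a single run can wobble; averaging the richness over several random draws per depth would smooth it out.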
The vegan package in R has a nice function for computing rarefaction curves for species by site abundance tables. However, for microbial datasets this function is often prohibitively slow. Continue reading
I was recently asked via Twitter to share some code on how I implemented the facet_zoom() function, from the newly released ggforce package, to zoom in on a particular region of a map. I’ve no idea if this is the “best” way of creating these sorts of maps, but anyway, here goes…
Let me start off by stating that I have enormous respect for Rob Edgar (creator of USEARCH, UPARSE etc.). His contributions to the field of bioinformatics, and indirectly to the fields of molecular and microbial ecology, have been huge; you only need to look at his citation rates to see that! So this post is not intended as a criticism of him or his work in any way.
That said, I’ve recently been thinking about the deluge of new algorithms for picking Operational Taxonomic Units (OTUs) from molecular sequence datasets and wondering where and how USEARCH, UCLUST et al. will fit in. Continue reading
If you use R to analyse and plot your data, then you’ve probably heard of and used the ggplot2 package, written by Hadley Wickham. ggplot2 is a highly flexible plotting package allowing you to create just about any kind of plot you can think of, and customise just about any aspect of your plot. ggplot2 is also known for its somewhat strange choice of default options (at least, they seem strange to me!). Therefore, it can seem like a lot of work to go from a basic plot to something that is approaching publication quality. Continue reading
This is a quick post, mostly to document some useful code. When conducting bioinformatic analyses using Qiime, one of the last steps is to cluster sequences into OTUs (operational taxonomic units) and assign taxonomy to them. You can then make an OTU table which contains all your OTUs and their associated taxonomy. Bam, easy!
But what to do if you haven’t/don’t want to use Qiime to cluster OTUs? Continue reading
As a microbial ecologist, part of my job is to try and assign taxonomy to all of the microbial critters living in the habitats I study. For archaeal or bacterial 16S rRNA gene sequences this is relatively easy. The Ribosomal Database Project have a naive Bayesian classifier which is trained on a large curated database of archaeal and bacterial 16S rRNA gene sequences. Best of all, it is implemented in the popular bioinformatics pipeline Qiime, making it nice and easy to apply to your own data.
But what if you are dealing with fungal ITS sequences instead? Continue reading
I was recently asked by one of my PhD supervisors to help out on a paper by doing some metagenomic analyses. My mission was essentially to perform some taxonomic analyses of metagenomes and show how a metagenome generated in our lab related to these.
So, naturally, I said yes, carried out the necessary analyses and proceeded to design a figure to show the result. I figured a dendrogram would be a nice way of showing compositional similarity between the community we studied and other communities. Continue reading
Being both an R nerd and a dad, I find there are disappointingly few opportunities to combine these (hobbies, responsibilities, not sure what the right word is here…). However, a new concept occurred to me the other day… Continue reading