# Charlop-Powers et al. 2014
Global Biogeography of Bacterial Secondary Metabolism

Results + Discussion


similarity metric of OTUs

Jaccard Distance

OTU set A = $\mathcal{A}$
OTU set B = $\mathcal{B}$

Jaccard index:

$$ J(\mathcal{A}, \mathcal{B}) = \frac{|\mathcal{A} \cap \mathcal{B}|}{|\mathcal{A} \cup \mathcal{B}|} $$

has the nice property that:

$$ 0 \leq J(\mathcal{A},\mathcal{B}) \leq 1 $$

the distance is then:

$$ d_{J} = 1 - J(\mathcal{A}, \mathcal{B}) $$
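A minimal sketch of the distance above, assuming each site is represented as a plain set of OTU identifiers (presence/absence only). The function name and the example OTU labels are made up for illustration, and returning 0 for two empty sets is my own convention, not something from the paper.

```python
def jaccard_distance(otus_a, otus_b):
    """Jaccard distance d_J = 1 - |A ∩ B| / |A ∪ B| between two OTU sets."""
    a, b = set(otus_a), set(otus_b)
    union = a | b
    if not union:
        # Assumption: two empty samples are treated as identical (distance 0).
        return 0.0
    return 1.0 - len(a & b) / len(union)


# Hypothetical OTU labels for two sample sites.
site_a = {"otu1", "otu2", "otu3", "otu4"}
site_b = {"otu3", "otu4", "otu5"}

# 2 shared OTUs out of 5 total, so J = 0.4 and d_J = 0.6.
d = jaccard_distance(site_a, site_b)
print(round(d, 3))
```

Because only presence/absence is used, abundance differences between sites do not affect this distance.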

Effect of Geographic Distance

Estimates of Diversity

Chao1 metric


“The typical way these estimators operate is by using the number of rare species that are found in a sample as a way of calculating how likely it is there are more undiscovered species.”

The estimate of the true number of species is the observed number of species plus the squared number of singletons ($F_{1}$, OTUs seen exactly once) divided by twice the number of doubletons ($F_{2}$, OTUs seen exactly twice):

$$ S_{1} = S_{obs} + \frac{F_{1}^{2}}{2F_{2}} $$

The rationale is that as long as singletons are still being observed, there are probably still unobserved species in the community being sampled.
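A rough sketch of the Chao1 calculation, assuming the input is a list of per-OTU read counts for a single sample. The function name and the example counts are hypothetical. The fallback used when there are no doubletons is the commonly used bias-corrected form, which I am assuming here; I have not checked which variant the paper used.

```python
def chao1(otu_counts):
    """Chao1 richness estimate: S_obs + F1^2 / (2 * F2)."""
    counts = [c for c in otu_counts if c > 0]
    s_obs = len(counts)                    # observed number of OTUs
    f1 = sum(1 for c in counts if c == 1)  # singletons
    f2 = sum(1 for c in counts if c == 2)  # doubletons
    if f2 == 0:
        # Assumption: use the bias-corrected form to avoid dividing by zero.
        return s_obs + f1 * (f1 - 1) / 2.0
    return s_obs + f1 ** 2 / (2.0 * f2)


# Hypothetical sample: 8 observed OTUs, 3 singletons, 2 doubletons.
counts = [10, 5, 1, 1, 1, 2, 2, 3]
print(chao1(counts))  # 8 + 9 / 4 = 10.25
```

The estimate rises quickly with the number of singletons, which matches the intuition that many rare OTUs imply many still unseen.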

scaffold hotspots

This is interesting, and I wish they did more

Particular soil samples can be shown to be enriched for OTUs (domains) that map precisely back to specific scaffold clusters (ansamycins, glycopeptides, etc.)

It seems stupid to me to map the hotspots to geographic latitudes. If they performed the same study but moved every sample site 100 m in another direction, they would get a different set of hotspots. I think the distribution of hotspots was essentially random. That doesn’t take away from the idea that hotspots exist; I just don’t think it’s predictive.

Diversity hotspots

I can’t say whether I think it’s true or not that different environments are better or worse for exploring biosynthetic diversity. Neither this paper nor their previous ones have convinced me of this.

Atlantic forest and desert samples contain more biosynthetic diversity. I speculate that this claim would fall apart if they sampled more broadly. It’s certainly true of this set of samples, but I don’t think it’s a recipe for finding diversity; it just happened to be true for lack of more samples.