2.2.4. Alpha diversity

Estimators of within-community (alpha) diversity have been proposed and refined for decades (Whittaker, 1972; Magurran, 2004). For NGS surveys of bacterial symbionts, three measures of alpha diversity are commonly used: rarefaction curves, species richness estimators (often in conjunction with rarefaction curves), and community diversity indices. Surveys of bee-associated bacteria commonly report all three measures, but the abundance of 16S amplicon sequences can be a poor predictor of relative bacterial abundance (Amend et al., 2010). Estimates of within- and between-community diversity that rely on 16S amplicon sequence abundance should therefore be interpreted with caution. A method that accounts for 16S gene copy number when estimating bacterial abundance has been developed (Kembel et al., 2012) and may help improve the accuracy of bacterial diversity measurements based on 16S amplicons.
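
The idea behind copy-number correction can be illustrated with a minimal Python sketch. It is not the procedure of Kembel et al. (2012); it only shows the basic arithmetic of dividing each taxon's read count by its 16S copy number before computing relative abundance. The function name, taxon labels, read counts and copy numbers below are all hypothetical.

```python
def copy_number_corrected_abundance(read_counts, copy_numbers):
    """Return relative abundances after dividing reads by 16S copy number.

    read_counts  -- dict mapping taxon -> number of 16S reads (hypothetical data)
    copy_numbers -- dict mapping taxon -> 16S gene copies per genome (assumed known)
    """
    # Reads per gene copy approximate the number of genomes (cells) sampled.
    corrected = {
        taxon: read_counts[taxon] / copy_numbers[taxon] for taxon in read_counts
    }
    total = sum(corrected.values())
    # Rescale so that the corrected abundances sum to 1.
    return {taxon: value / total for taxon, value in corrected.items()}


# Hypothetical example: equal read counts, unequal copy numbers.
reads = {"taxon_A": 400, "taxon_B": 400}
copies = {"taxon_A": 4, "taxon_B": 1}
abundance = copy_number_corrected_abundance(reads, copies)
print(abundance)
```

In this made-up case the two taxa contribute the same number of reads, but the taxon with four gene copies per genome is estimated to be four times less abundant than the taxon with one copy.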

  1. Species richness estimators estimate the total number of species present in a community. The Chao1 index is commonly used, and is based upon the number of rare classes (i.e. OTUs) found in a sample (Chao, 1984):
    \[ S_{est} = S_{obs} + \frac{f_1^2}{2 f_2} \]
    where $S_{est}$ is the estimated number of species, $S_{obs}$ is the observed number of species, $f_1$ is the number of singleton taxa (taxa represented by a single read in that community), and $f_2$ is the number of doubleton taxa (taxa represented by exactly two reads). If a sample contains many singletons, it is likely that more undetected OTUs exist, and the Chao1 index will estimate greater species richness than it would for a sample without rare OTUs. Besides the Chao1 estimator, mothur includes several other species richness estimators and a wrapped version of CatchAll, which calculates 12 different estimators and proposes a best estimate of species richness (Bunge et al., 2012). QIIME also includes the Chao1 estimator along with several other species richness estimators.
  2. Rarefaction curves are used to determine whether sampling depth was sufficient to accurately characterize the bacterial community being studied. To build rarefaction curves, each community is randomly subsampled without replacement at a series of increasing subsample sizes, and the average number of OTUs at each size is plotted against the size of the subsample (Gotelli and Colwell, 2001). The point at which the number of OTUs no longer increases with further sampling (i.e. where the curve reaches a plateau) indicates that enough sequences have been sampled to accurately characterize the community. Both mothur and QIIME calculate rarefaction for observed and estimated species richness. QIIME additionally creates graphs of rarefaction curves, while mothur outputs results that can be imported into graphing software.
  3. Community diversity indices combine species richness and the evenness of species abundances into a single value. Communities that are numerically dominated by one or a few species exhibit low evenness, while communities where abundance is distributed equally amongst species exhibit high evenness (Gotelli, 2008). Two of the most widely used indices are the Shannon (or Shannon-Wiener) index (Shannon, 1948) and Simpson's index (Simpson, 1949). A recommended index that is not sensitive to sample size is the Probability of an Interspecific Encounter (PIE [Hurlbert, 1971]):
    \[ PIE = \frac{N}{N - 1} \left( 1 - \sum_{i=1}^{S} p_i^2 \right) \]
    where $N$ is the sample size (the total number of individuals in the sample), $p_i$ is the proportion of the sample that is made up of individuals of species $i$, and $S$ is the number of species in the sample. PIE is bounded between 0 (a community composed of a single species) and 1 (a community composed of an infinite number of equally abundant species), but is not currently included in either mothur or QIIME. Both mothur and QIIME include multiple community diversity indices.
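
The three measures described above can be sketched in a few lines of Python. This is an illustration only, not the mothur or QIIME implementation. It assumes the classic form of Chao1, S_obs + f1^2 / (2 * f2), and PIE = N / (N - 1) * (1 - sum of p_i^2). The function names, the fallback used when there are no doubletons, and the OTU counts are assumptions made for this example.

```python
import random


def chao1(counts):
    """Classic Chao1 estimate: S_obs + f1^2 / (2 * f2)."""
    counts = [c for c in counts if c > 0]
    s_obs = len(counts)
    f1 = sum(1 for c in counts if c == 1)  # singleton OTUs
    f2 = sum(1 for c in counts if c == 2)  # doubleton OTUs
    if f2 == 0:
        # Assumption: use the bias-corrected form when f2 = 0,
        # which avoids division by zero.
        return s_obs + f1 * (f1 - 1) / 2
    return s_obs + f1 ** 2 / (2 * f2)


def pie(counts):
    """Probability of an Interspecific Encounter (Hurlbert, 1971)."""
    counts = [c for c in counts if c > 0]
    n = sum(counts)
    if n < 2:
        raise ValueError("PIE requires at least two reads")
    return n / (n - 1) * (1 - sum((c / n) ** 2 for c in counts))


def rarefaction_curve(counts, depths, n_iter=100, seed=0):
    """Mean number of OTUs observed at each subsampling depth."""
    rng = random.Random(seed)
    # Expand the count vector into one OTU label per read.
    reads = [otu for otu, c in enumerate(counts) for _ in range(c)]
    curve = []
    for depth in depths:
        if depth > len(reads):
            raise ValueError("depth exceeds the number of reads")
        total = 0
        for _ in range(n_iter):
            # random.sample draws without replacement.
            total += len(set(rng.sample(reads, depth)))
        curve.append(total / n_iter)
    return curve


# Hypothetical OTU read counts for one community.
otu_counts = [50, 30, 10, 5, 2, 2, 1, 1, 1]
print(chao1(otu_counts))  # 9 observed + 3^2 / (2 * 2) = 11.25
print(pie(otu_counts))
print(rarefaction_curve(otu_counts, depths=[1, 10, 50, 102]))
```

With these made-up counts, nine OTUs are observed, three of them singletons and two doubletons, so Chao1 estimates about two additional undetected OTUs. The rarefaction curve reaches the observed richness only at the full depth of 102 reads.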