oaks at Sedgwick Research Preserve, Santa Barbara County, California, USA


Dispersal Diversity : Statistics and Tests

This is a collection of R functions to calculate dispersal diversity statistics and to compare diversity statistics, as described in Scofield et al. 2012 American Naturalist 180: 719-732.

The most current versions of all files can be found below and here: https://github.com/douglasgscofield/dispersal

Input requirements

All functions take as input a simple data structure: a table of site (rows) by source (columns) counts. Though we originally developed the diversity tests to understand seed dispersal in plant populations, the tests themselves should be useful for biodiversity data or any other diversity-like data that can be expressed with this same data structure.
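As a minimal sketch of this data structure, the following base R code cross-tabulates raw records into a site-by-source table of counts. The data frame seeds, its column names, and the pool and tree labels are hypothetical, invented for illustration. Whether the functions expect a table, matrix, or data.frame object is not stated here, so a conversion such as as.matrix() may be needed.

```r
# Hypothetical raw records: one row per sampled seed, giving the site
# (seed pool) where it was found and its inferred source (maternal tree).
seeds <- data.frame(
  site   = c("pool1", "pool1", "pool1", "pool2", "pool2", "pool3", "pool3", "pool3"),
  source = c("treeA", "treeA", "treeB", "treeB", "treeC", "treeA", "treeC", "treeC")
)

# Cross-tabulate into a site (rows) by source (columns) table of counts.
tab <- table(seeds$site, seeds$source)
print(tab)

# Row sums are the per-site sample sizes; column sums are per-source totals.
n.site   <- rowSums(tab)
n.source <- colSums(tab)
```

The same layout works for biodiversity data if rows are read as sampling sites and columns as species.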

Getting started

The pmiDiversity.R and diversityTest.R source files are required for performing diversity tests. If all that is desired are PMI (Grivet et al. 2005, Scofield et al. 2010, Scofield et al. 2011) and diversity (Scofield et al. 2012) statistics (qgg, αg, etc.), the source file pmiDiversity.R contains the pmiDiversity() function, which provides these and can be used on its own.

Put all the source files in the same directory and, within your R session, load them with source(), for example source("diversityTest.R").


Additional source files are provided to perform other tasks. plotPairwiseMatrix.R is available for plotting pairwise divergence/overlap matrices. More information is available below. This file requires the pmiDiversity.R source file to be available within the same directory. Load it with source("plotPairwiseMatrix.R").


gammaAccum.R is available for collecting and plotting γ diversity accumulation information. More information is available below. This file requires the pmiDiversity.R source file to be available within the same directory. Load it with source("gammaAccum.R").


These statistical tools were developed in collaboration with Peter Smouse (Rutgers University) and Victoria Sork (UCLA) and were funded by U.S. National Science Foundation awards NSF-DEB-0514956 and NSF-DEB-0516529.


pmiDiversity.R defines the R function pmiDiversity(), which takes a site-by-source table and produces statistics for Probability of Maternal Identity, or PMI (Grivet et al. 2005, Scofield et al. 2010, Scofield et al. 2011), and dispersal diversity (Scofield et al. 2012). Three sets of PMI and diversity statistics are calculated:

  • qgg-based, known to be biased (Grivet et al. 2005)

  • rgg-based, unbiased but poor performers at low sample sizes (Grivet et al. 2005, Scofield et al. 2012)

  • q*gg-based, which apply the transformation developed by Nielsen et al. (2003) to be unbiased and seem to perform well (Scofield et al. 2010, Scofield et al. 2011, Scofield et al. 2012).
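The three estimators can be sketched for a single site in base R. This is an illustration under stated assumptions, not the code in pmiDiversity.R: the counts are hypothetical, the variable names are invented, and the q*gg line uses the form of the Nielsen et al. (2003) estimator as recalled here, which should be checked against that paper. The diversity value for each estimator is taken as the reciprocal of the corresponding PMI.

```r
# Counts of seeds per source at a single site (hypothetical data).
counts <- c(treeA = 5, treeB = 3, treeC = 2)
n <- sum(counts)

# q_gg: probability that two seeds drawn WITH replacement share a source.
q.gg <- sum((counts / n)^2)

# r_gg: probability that two seeds drawn WITHOUT replacement share a source.
r.gg <- sum(counts * (counts - 1)) / (n * (n - 1))

# q*_gg: Nielsen et al. (2003) transformation of q_gg (assumed form).
q.star.gg <- ((n + 1) * (n - 2) * q.gg + (3 - n)) / (n - 1)^2

# Effective number of sources (alpha diversity) as the reciprocal of each PMI.
alpha <- 1 / c(q = q.gg, r = r.gg, q.star = q.star.gg)
print(round(alpha, 3))
```

For these counts the q*gg value falls between rgg and qgg, so the three diversity estimates differ even for the same sample.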


diversityTest.R defines several R functions which, like pmiDiversity(), take one or more site-by-source tables and test diversity statistics within and among them. See Scofield et al. (2012) for methodological details. The file pmiDiversity.R (see above) is required to be in the same directory, as it provides functions used here.

alphaDiversityTest(tab) : Test for differences in α diversity among sites within a single dataset

alphaContrastTest(tab.a, tab.b) : Test whether there is a difference in α diversity between two datasets

alphaContrastTest.3(tab.a, tab.b, tab.c) : Test whether there is a difference in α diversity among three datasets

plotAlphaTest(result) : Plot the list returned from alphaDiversityTest() or alphaContrastTest() for evaluation

pairwiseMeanTest(tab) : Test whether mean pairwise divergence/overlap among sites is different from the null expectation

plotPairwiseMeanTest() : Plot the list returned from the above test for evaluation

gammaContrastTest(tab.a, tab.b) : Test whether there is a difference in γ diversity between two datasets

gammaContrastTest.3(tab.a, tab.b, tab.c) : Test whether there is a difference in γ diversity among three datasets


plotPairwiseMatrix.R provides a function for plotting pairwise diversity matrices as returned by the pmiDiversity() function, examples of which can be seen in Figure 4A-C of Scofield et al. (2012).

plotPairwiseMatrix() : Create a visual plot of pairwise divergence or overlap values as calculated by pmiDiversity()

For example, with tab defined as above, plot the divergence matrix based on rgg calculations, labelling the axes "Seed Pool", using the following code:

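pmiD = pmiDiversity(tab)
plotPairwiseMatrix(...,  # arguments selecting the rgg divergence matrix from pmiD are missing from this copy
                   axis.label="Seed Pool")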


gammaAccum.R provides functions for calculating γ accumulation across sites and plotting the result, examples of which can be seen in Figure 4D-F of Scofield et al. (2012). The file pmiDiversity.R (see above) is required to be in the same directory, as it provides functions used here.

A typical workflow using these functions would be:

rga.result = runGammaAccum(tab)
plotGammaAccum(rga.result)


runGammaAccum() performs a γ diversity accumulation on the site-by-source data in tab. The result is returned as a list, which may be passed to plotGammaAccum() for plotting. Several arguments control the method of accumulation and the value of γ calculated. Only the defaults have been tested; the other options were developed while exploring the data and must be considered experimental.

tab : Site-by-source table, same format as that passed to pmiDiversity()

gamma.method : Calculate γ using "r" (default), "q.nielsen" or "q" method (see paper)

resample.method : "permute" (default) or "bootstrap"; whether to resample sites without ("permute") or with ("bootstrap") replacement

accum.method : "random" (default) or "proximity". If proximity is used, then distance.file must be supplied

distance.file : A file or data.frame containing three columns of data, with the header/column names being pool, X, and Y, containing the spatial locations of the seed pools named in the row names of tab; only used with accum.method="proximity"
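To make the resampling and accumulation options concrete, here is a simplified base R sketch of random-order γ accumulation. It is not the implementation in gammaAccum.R: the helper names gammaR() and accumOnce() are hypothetical, the data are invented, γ is assumed to be the reciprocal of the rgg-style statistic computed on pooled counts, and "proximity" accumulation is not covered.

```r
set.seed(1)  # for a reproducible illustration only

# Hypothetical site-by-source counts: 4 seed pools, 3 sources.
tab <- matrix(c(4, 1, 0,
                0, 3, 2,
                2, 2, 1,
                0, 0, 5),
              nrow = 4, byrow = TRUE,
              dimnames = list(paste0("pool", 1:4), c("treeA", "treeB", "treeC")))

# r-based gamma for pooled counts: reciprocal of the probability that two
# seeds drawn without replacement from the pooled sample share a source.
gammaR <- function(counts) {
  n <- sum(counts)
  1 / (sum(counts * (counts - 1)) / (n * (n - 1)))
}

# One accumulation curve: draw sites in random order (without replacement
# for "permute", with replacement for "bootstrap"), pool them cumulatively,
# and compute gamma after each site is added.
accumOnce <- function(tab, resample.method = c("permute", "bootstrap")) {
  resample.method <- match.arg(resample.method)
  idx <- sample(nrow(tab), replace = (resample.method == "bootstrap"))
  pooled <- apply(tab[idx, , drop = FALSE], 2, cumsum)
  apply(pooled, 1, gammaR)
}

# Average many random curves to get the mean gamma at each number of sites.
curves <- replicate(200, accumOnce(tab, "permute"))
mean.gamma <- rowMeans(curves)
print(round(mean.gamma, 2))
```

With "permute", every curve ends at the same value because all sites are pooled by the final step; with "bootstrap", the endpoint varies because some sites may be drawn more than once and others not at all.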


plotGammaAccum() creates a visual plot of γ accumulation results from runGammaAccum().

Additional functions

The following functions typically won't be used separately; use runGammaAccum() instead.

gammaAccum() : Workhorse function for γ accumulation

gammaAccumStats() : Extracts stats from the result of gammaAccum()

runGammaAccumSimple() : Wrapper that runs and then returns stats from gammaAccum()

References


Scofield, D. G., P. E. Smouse, J. Karubian, and V. L. Sork. 2012. Use of α, β, and γ diversity measures to characterize seed dispersal by animals. American Naturalist 180: 719-732.

Scofield, D. G., V. R. Alfaro, V. L. Sork, D. Grivet, E. Martinez, J. Papp, A. R. Pluess et al. 2011. Foraging patterns of acorn woodpeckers (Melanerpes formicivorus) on valley oak (Quercus lobata Née) in two California oak savanna-woodlands. Oecologia 166: 187-196.

Scofield, D. G., V. L. Sork, and P. E. Smouse. 2010. Influence of acorn woodpecker social behaviour on transport of coast live oak (Quercus agrifolia) acorns in a southern California oak savanna. Journal of Ecology 98: 561-572.

Grivet, D., P. E. Smouse, and V. L. Sork. 2005. A novel approach to an old problem: tracking dispersed seeds. Molecular Ecology 14: 3585-3595.

Nielsen, R., D. R. Tarpy, and H. K. Reeve. 2003. Estimating effective paternity number in social insects and the effective number of alleles in a population. Molecular Ecology 12: 3157-3164.